Comparing fb395e29dc..58fd26c433 - mesa

fran/mesa

Author	SHA1	Message	Date
Brian Ho	58fd26c433	turnip: Fix vkCmdCopyQueryPoolResults with available flag Previously, calling vkCmdCopyQueryPoolResults with the VK_QUERY_RESULT_WITH_AVAILABILITY_BIT flag set the query result field in the buffer to 0 if unavailable and the query result if available. This was a misunderstanding of the Vulkan spec, and this commit corrects the behavior to emit a separate available result in addition to the query result. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3560> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3560>	2020-01-30 20:30:46 +00:00
Brian Ho	1a3e2a7fa8	turnip: Fix vkGetQueryPoolResults with available flag Previously, calling vkGetQueryPoolResults with the VK_QUERY_RESULT_WITH_AVAILABILITY_BIT flag set the query result field in *pData to 0 if unavailable and the query result if available. This was a misunderstanding of the Vulkan spec, and this commit corrects the behavior to write a separate available result in addition to the query result. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3560>	2020-01-30 20:30:46 +00:00
Brian Ho	1c3319cf81	turnip: Free event->bo on vkDestroyEvent Fixes a leak from freeing event but not event->bo. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3639> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3639>	2020-01-30 18:50:06 +00:00
Kenneth Graunke	594cb30356	loader: Fix leak of kernel driver name This is strdup'd, it needs to be freed. CID: 1458032 Fixes: f93bb2fb102 ("loader: Check if the kernel driver is i915 before loading iris") Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3630> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3630>	2020-01-30 10:08:17 -08:00
Jan Zielinski	f09c466732	docs: Update SWR tessellation support Update features.txt to reflect ARB_tessellation_shader support in SWR Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3636> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3636>	2020-01-30 11:18:15 +00:00
Kenneth Graunke	bdba744d70	i965: Use brw_batch_references in tex_busy check If the batch references the buffer, we will have to flush the batch immediately before mapping it, at which point it will be busy. (This bug has existed for a long time...even going back to BLT-era...) Fixes: `779923194c` ("i965/tex_image: Use meta for instead of the blitter PBO TexImage and GetTexImage") Fixes: `d5d4ba9139` ("i965/tex_subimage: use meta instead of the blitter for PBO TexSubImage") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3616> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3616>	2020-01-30 10:01:21 +00:00
Christian Gmeiner	d3fa18a1fa	etnaviv: drm-shim: add GC400 These are the ETNAVIV_PARAM's returned from a GC400 found on a STM32MP157C-DK2 Discovery Board running mainline kernel. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3195> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3195>	2020-01-30 04:05:39 +00:00
Qiang Yu	c5e4d28724	lima: add noheap debug option Disable using heap buffer when set. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3264> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3264>	2020-01-30 03:39:21 +00:00
Qiang Yu	b220aec628	lima: create heap buffer with new interface if available The newly added heap buffer create interface can create a large enough buffer whose backing memory can increase dynamically as needed. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3264>	2020-01-30 03:39:21 +00:00
Qiang Yu	92465cc999	lima: sync lima_drm.h with kernel Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3264>	2020-01-30 03:39:21 +00:00
Icenowy Zheng	cd30c4d719	lima: fix lima_set_vertex_buffers() When setting the vertex buffers, lima calls util_set_vertex_buffers_mask() to reference and copy buffers. That function adds start_slot to dst internally, so lima should not offset the destination address again. This was discovered when comparing with other drivers, and fixed by removing the extra offset in lima_set_vertex_buffers(). This fixes draws that get translated in u_vbuf, because u_vbuf adds extra vertex buffers when translating. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3620> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3620>	2020-01-30 07:51:35 +08:00
Jonathan Marek	1c5d84fcae	turnip: hook up cmdbuffer event set/wait Gets some basic tests under "dEQP-VK.synchronization.event" passing Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3123> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3123>	2020-01-29 23:13:43 +00:00
Christian Gmeiner	5b5b762475	etnaviv: drop default state for PE_STENCIL_CONFIG_EXT2 It gets emitted when needed. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3631> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3631>	2020-01-29 23:31:04 +01:00
Daniel Schürmann	d78e0de772	docs: add new features for RADV/ACO. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3627> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3627>	2020-01-29 22:05:37 +00:00
Samuel Pitoiset	3a3b16a395	radv: refactor physical device properties Based on ANV. This removes a bunch of duplicated code for properties. Fixes: `1b8d99e288` ("radv: bump conformance version to 1.2.0.0") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3626> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3626>	2020-01-29 21:44:56 +00:00
Rob Clark	5b9fe18485	freedreno: remove flush-queue Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>	2020-01-29 21:19:41 +00:00
Rob Clark	b3b1fa5e2b	freedreno: add gmem_lock The gmem state is split out now, so it does not require synchronization. But gmem rendering still accesses vsc state from the context. TODO maybe there is a better way? For gen's that don't do vsc resizing, this is probably easier.. but for a6xx there isn't really a great position for more fine grained locking. Maybe it doesn't matter since in practice the lock shouldn't be contended. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>	2020-01-29 21:19:41 +00:00
Rob Clark	91f9bb99c5	freedreno: add gmem state cache Which also has the benefit of getting rid of fd_context::gmem. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>	2020-01-29 21:19:41 +00:00
Rob Clark	712f8802ee	freedreno: get GMEM state from batch Prep work to reduce churn in next patch. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>	2020-01-29 21:19:41 +00:00
Rob Clark	4bcc3a0923	freedreno/a2xx: constify gmem state Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>	2020-01-29 21:19:41 +00:00
Rob Clark	5d442144ae	freedreno/a3xx: constify gmem state Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>	2020-01-29 21:19:41 +00:00
Rob Clark	7236d6dd4c	freedreno/a4xx: constify gmem state Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>	2020-01-29 21:19:41 +00:00
Rob Clark	2d2f4a55eb	freedreno/a5xx: constify gmem state Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>	2020-01-29 21:19:41 +00:00
Rob Clark	637ca78ee2	freedreno/a6xx: constify gmem state Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>	2020-01-29 21:19:41 +00:00
Rob Clark	82a64af907	freedreno: constify fd_vsc_pipe Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>	2020-01-29 21:19:41 +00:00
Rob Clark	cbae9f34e9	freedreno: constify fd_tile In a following patch, when we cache the gmem state, we will want to treat the gmem state as immutable. So start converting things to const to make this more clear.. fd_tile is a good place to start. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>	2020-01-29 21:19:41 +00:00
Rob Clark	c7ab8874d0	freedreno: consolidate GMEM state The tile and vsc_pipe arrays are really part of the GMEM configuration. So pull these out of fd_context and into fd_gmem_stateobj. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>	2020-01-29 21:19:41 +00:00
Rob Clark	62c10b395e	freedreno: extract vsc pipe bo from GMEM state Prep work for reorganizing GMEM state and extracting out of fd_context. The vsc pipe bo was the one thing that doesn't change with GMEM/tile config. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>	2020-01-29 21:19:41 +00:00
Alejandro Piñeiro	d5c32db076	turnip: remove unused descriptor state dirty It was only ever initialized to zero, and not even updated as descriptor sets are bound. As far as I understand, setting the bit TU_CMD_DIRTY_DESCRIPTOR_SET on tu_cmd_state.dirty is used instead. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3624> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3624>	2020-01-29 20:52:52 +00:00
Timur Kristóf	e73f604b21	aco: Fix the meaning of is_atomic. Previously, is_atomic really meant "is not atomic", contrary to its name. This commit fixes it to mean what one would think it means. Fixes: `69bed1c918` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3618> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3618>	2020-01-29 20:32:31 +00:00
Kenneth Graunke	ba148813d7	iris: Support multiple chained batches. There was never much point in artificially limiting chaining to two batches - we can trivially support arbitrary length chains. Currently, we should only ever have 1 or 2, but this may change. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3613> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3613>	2020-01-29 19:53:22 +00:00
Kenneth Graunke	94f9c5fff6	iris: Make iris_emit_default_l3_config pull devinfo from the batch No need to pass it, we can just use batch->screen->devinfo. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3613>	2020-01-29 19:53:22 +00:00
Kenneth Graunke	afcb6625e3	iris: Drop 'engine' from iris_batch. For the moment, everything is I915_EXEC_RENDER, so this isn't necessary. But even should that change, I don't think we want to handle multiple engines in this manner. Nowadays, we have batch->name (IRIS_BATCH_RENDER, IRIS_BATCH_COMPUTE, possibly an IRIS_BATCH_BLIT for blorp batches someday), which describes the functional usage of the batch. We can simply check that and select an engine for that class of work (assuming there ever is more than one). Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3613>	2020-01-29 19:53:22 +00:00
Eric Anholt	06b13dfed2	tu: Fix binning address setup after pack macros change. This fixes a regression in "vkcube -m headless" rendering, but upsettingly none of my CTS tests I've been using. Fixes: `59f29fc845` ("turnip: Convert the rest of tu_cmd_buffer.c over to the new pack macros.") Caught-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3609> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3609>	2020-01-29 19:30:09 +00:00
Brian Ho	3d5bdea2cf	turnip: Enable occlusionQueryPrecise This commit enables the occlusionQueryPrecise feature. No additional work is required as occlusion queries are already implemented to track exact sample counts. Also enables a number of extra tests on the Vulkan CTS. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3605> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3605>	2020-01-29 19:05:23 +00:00
Daniel Schürmann	6f718edced	aco: simplify gathering of MIMG address components This patch has a slight effect on pipelinedb: Totals from affected shaders: SGPRS: 23616 -> 21504 (-8.94 %) VGPRS: 15088 -> 14444 (-4.27 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 662660 -> 664600 (0.29 %) bytes LDS: 49 -> 49 (0.00 %) blocks Max Waves: 3079 -> 3204 (4.06 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>	2020-01-29 18:45:23 +00:00
Daniel Schürmann	901f06e9ad	aco: simplify adjust_sample_index_using_fmask() & get_image_coords() Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>	2020-01-29 18:45:23 +00:00
Daniel Schürmann	99d032f3cd	aco: fix register allocation with multiple live-range splits This patch fixes register allocation if multiple live-range splits occur to the same variable within one instruction. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>	2020-01-29 18:45:23 +00:00
Daniel Schürmann	71440ba0f5	aco: reorder VMEM operands in ACO IR For all VMEM instructions, the resource constant is now in operands[0]. For MIMG instructions, the sampler shares operands[1] with write data in case this instruction writes memory. Moving the VADDR to be the last operand for MIMG is the first step to support Navi NSA encoding. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>	2020-01-29 18:45:23 +00:00
Caio Marcelo de Oliveira Filho	8548fe19f0	nir: Make nir_deref_path_init skip trivial casts In a NIR generated using SPIR-V initializers to variables, copy propagation can end up transforming vec1 32 ssa_33 = deref_var &@1 (shared mat2x4) vec1 32 ssa_35 = mov ssa_33 vec1 32 ssa_7 = deref_cast (mat2x4 *)ssa_35 (shared mat2x4) /* ptr_stride=0 */ into vec1 32 ssa_33 = deref_var &@1 (shared mat2x4) vec1 32 ssa_7 = deref_cast (mat2x4 *)ssa_33 (shared mat2x4) /* ptr_stride=0 */ Before the optimization, the "head" of a path of deref that uses ssa_7 will be the cast. After, it will be the variable in ssa_33. Since the types are the same, this is a trivial cast that would be picked up by nir_opt_deref. If we need to compare such deref-chain after optimization with another deref-chain for the same variable, the compare function will get confused by the cast in the middle. One alternative would be to add nir_opt_deref to places that compare derefs, but that might not scale well, so skip the trivial casts when generating the paths instead. Motivated by the discussion in https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047#note_383660. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3420> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3420>	2020-01-29 18:25:36 +00:00
Rhys Perry	db19e96c8c	aco: fix exec mask consistency issues There seems to be more, these are just the ones found in Detroit: Become Human shaders. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Rhys Perry	c7d0514168	aco: parallelcopy exec mask before s_wqm It can be used later and we want any uses to not be fixed to exec, so its definition can't be fixed to exec because of how exec masks interact with register demand calculation. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Rhys Perry	517fc3abc4	aco: fill reg_demand with sensible information in add_coupling_code() process_block() will use this to determine the register demand before the current instruction. Previously, it was filled with zeroes which could result in process_block() only using the register demand after the current instruction. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Rhys Perry	26d2511bcb	aco: improve assertion at the end of spiller Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Rhys Perry	5ea23ba659	aco: set exec_potentially_empty after continues/breaks in nested IFs Fixes: `93c8ebfa` ('aco: Initial commit of independent AMD compiler') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Rhys Perry	4e83e05e62	aco: error when block has no logical preds but VGPRs are live at the start This would have caught the liveness error fixed in the previous commit. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Rhys Perry	d282a292ec	aco: don't always add logical edges from continue_break blocks to headers Otherwise, code like this will be broken: loop { if (...) { break; } else { break; } } The continue_or_break block doesn't have any logical predecessors but it's a logical predecessor of the header block. This liveness error breaks the spiller in init_live_in_vars() (under "keep variables spilled on all incoming paths") and eventually creates garbage reloads. Fixes: `93c8ebfa` ('aco: Initial commit of independent AMD compiler') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Rhys Perry	dba71de5c6	aco: only create parallelcopy to restore exec at loop exit if needed The operand isn't fixed to exec, which can mess up the spiller. This also adds a new situation where a phi is needed. Fixes dEQP-VK.ssbo.layout.random.descriptor_indexing.2 and an assertion when compiling a Detroit: Become Human shader. Fixes: `93c8ebfa` ('aco: Initial commit of independent AMD compiler') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Rhys Perry	4537b97410	aco: don't update demand in add_coupling_code() for loop headers We don't need to update it since it won't be used later. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Rhys Perry	521525fc0a	aco: don't consider loop header blocks branch blocks in add_coupling_code Loops without continues create header blocks with only 1 predecessor. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Rhys Perry	590c26beab	aco: fix target calculation when vgpr spilling introduces sgpr spilling A shader might require vgpr spilling but not require sgpr spilling. In that case, the spiller lowers the sgpr target by 5 which could mean sgpr spilling is then required. Then the vgpr target has to be lowered to make space for the linear vgprs. Previously, space wasn't made for the linear vgprs. Found while testing the spiller on the pipeline-db with a lowered limit. Fixes: `a7ff1bb5b9` ('aco: simplify calculation of target register pressure when spilling') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Samuel Pitoiset	a61eff8330	radv/gfx10: re-enable NGG GS Now that NGG GS queries are implemented, it should be safe enough to enable NGG GS by default. It can be disabled with RADV_DEBUG=nongg if necessary. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380>	2020-01-29 17:40:51 +01:00
Samuel Pitoiset	e4752dafed	radv/gfx10: implement NGG GS queries The number of generated primitives is only counted by the hardware if GS uses the legacy path. For NGG GS, we need to accumulate that value in the NGG GS itself. To achieve that, we use a plain GDS atomic operation. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380>	2020-01-29 17:40:48 +01:00
Samuel Pitoiset	3c1f657f35	radv/gfx10: add a separate flag for creating a GDS OA buffer For implementing NGG GS queries, we decided to use GDS but GDS OA is only required for NGG streamout. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380>	2020-01-29 17:40:46 +01:00
Michel Dänzer	ca6a22305b	winsys/amdgpu: Close KMS handles for other DRM file descriptions When a BO or amdgpu_screen_winsys is destroyed. Should fix leaking such BOs in other DRM file descriptions. v2: * Pass the correct file descriptor to drmIoctl (Pierre-Eric Pelloux-Prayer) * Use _mesa_hash_table_remove v3: * Close handles in amdgpu_winsys_unref as well v4: * Adapt to amdgpu_winsys::sws_list_lock. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2270 Fixes: `11a3679e3a` "winsys/amdgpu: Make KMS handles valid for original DRM file descriptor" Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3582> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3582>	2020-01-29 15:51:01 +00:00
Michel Dänzer	9f2bed49d4	winsys/amdgpu: Re-use amdgpu_screen_winsys when possible Namely, if os_same_file_description determined that the DRM file descriptor references the same file description. v2: * Adapt to amdgpu_winsys::sws_list_lock. v3: * Fix comparison of amdgpu_screen_winsys file descriptions, see https://gitlab.freedesktop.org/mesa/mesa/issues/2413 . * Lock amdgpu_winsys::sws_list_lock for traversing the sws_list in amdgpu_winsys_create. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3582>	2020-01-29 15:51:01 +00:00
Jason Ekstrand	f21b40d0bf	anv: Rename a variable The name "desc" shadows another variable. Name it "desc_data" like all of the other descriptor data variables in this file.	2020-01-29 09:43:42 -06:00
Jason Ekstrand	e3f1a08c56	anv/block_pool: Ensure allocations have contiguous maps Because softpin block pools are made up of a set of BOs with different maps, it was possible for a single state to end up straddling blocks. To fix this, we pass a contiguous size to anv_block_pool_grow and it ensures that the next allocation in the pool will have at least that size. We also add an assert in anv_block_pool_map to ensure we always get contiguous maps. Prior to the changes to anv_block_pool_grow, the unit tests failed with this assert. With this patch, the tests pass. This was causing problems on Gen12 where we allocate the pages for the AUX table from the dynamic state pool. The first chunk, which gets allocated very early in the pool's history, is 1MB which was enough that it was getting multiple BOs. This caused the gen_aux_map code to write outside of the map and overwrite the instruction state pool buffer which led to GPU hangs. Fixes: `731c4adcf9` "anv/allocator: Add support for non-userptr" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2020-01-29 09:43:42 -06:00
Jason Ekstrand	ee4cdef9ae	anv: Re-use one old BT block in reset_batch_bo_chain We intentionally throw away all but one BT block but then we set cmd_buffer->bt_block to ANV_STATE_NULL instead of the one we hung on to. This causes the command buffer to immediately re-emit STATE_BASE_ADDRESS the first time a BT is needed for no good reason. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2020-01-29 09:43:42 -06:00
Jason Ekstrand	a2e9dd51b3	anv: Set actual state pool sizes when we have softpin Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2020-01-29 09:43:42 -06:00
Rhys Perry	1f72857739	nir/algebraic: add some half packing optimizations pipeline-db (ACO): Totals from affected shaders: SGPRS: 29200 -> 29200 (0.00 %) VGPRS: 17372 -> 17372 (0.00 %) Spilled SGPRs: 105 -> 105 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 1406576 -> 1389256 (-1.23 %) bytes LDS: 83 -> 83 (0.00 %) blocks Max Waves: 3976 -> 3976 (0.00 %) pipeline-db (LLVM): Totals from affected shaders: SGPRS: 21320 -> 21320 (0.00 %) VGPRS: 17056 -> 17036 (-0.12 %) Spilled SGPRs: 22 -> 22 (0.00 %) Spilled VGPRs: 503 -> 487 (-3.18 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 396 -> 396 (0.00 %) dwords per thread Code Size: 1441244 -> 1423292 (-1.25 %) bytes LDS: 463 -> 463 (0.00 %) blocks Max Waves: 3609 -> 3611 (0.06 %) v2: add pattern for ishr Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2271> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2271>	2020-01-29 14:30:33 +00:00
Rhys Perry	5476d18183	nir/algebraic: add patterns for a >> #b << #b Fixes compilation of a Battlefront 2 shader with ACO by removing VGPR spilling. The reassociation makes it worse on LLVM though. pipeline-db (ACO): Totals from affected shaders: SGPRS: 10704 -> 10688 (-0.15 %) VGPRS: 18736 -> 18528 (-1.11 %) Spilled SGPRs: 70 -> 70 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 909696 -> 885796 (-2.63 %) bytes LDS: 225 -> 225 (0.00 %) blocks Max Waves: 1115 -> 1129 (1.26 %) pipeline-db (LLVM): Totals from affected shaders: SGPRS: 8472 -> 8424 (-0.57 %) VGPRS: 14284 -> 14368 (0.59 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 442 -> 503 (13.80 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 268 -> 396 (47.76 %) dwords per thread Code Size: 862568 -> 853028 (-1.11 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 971 -> 964 (-0.72 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2271>	2020-01-29 14:30:33 +00:00
Samuel Pitoiset	6aecc316c0	aco: fix VS input loads with MUBUF on GFX6 Only MTBUF supports vec3. Fixes: `03a0d39366` ("aco: use MUBUF in some situations instead of splitting vertex fetches") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3615> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3615>	2020-01-29 13:58:37 +00:00
Rhys Perry	404818dd28	aco: run p_wqm instructions in WQM If the p_wqm ends up creating copies, these need to be in WQM. Helps (but doesn't completely fix) artifacts in Strange Brigade. The actual issue still exists and is harder to fix. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3273> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3273>	2020-01-29 13:23:03 +00:00
Rhys Perry	2d7386a2d0	aco: ensure predecessors' p_logical_end is in WQM when a p_phi is in WQM We want any copies to be in WQM. I don't know if this fixes any real application, but I can create a vkrunner test that reproduces the issue. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3273>	2020-01-29 13:23:03 +00:00
Icecream95	9be9fd8591	pan/midgard: Fix a liveness info leak Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3566> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3566>	2020-01-29 12:59:32 +00:00
Jonathan Marek	6346490a2e	etnaviv: implement UBOs At the same time, use pre-HALTI2 to use address register for indirect uniform loads, since integers/LOAD instruction isn't always available. Passes all dEQP-GLES3.functional.ubo.* on GC7000L. GC3000 with an extra flush hack passes most of them, but still fails on some of the cases with many loads. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3389> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3389>	2020-01-29 11:47:34 +00:00
Rob Clark	7ff8ce7a3f	freedreno/a6xx: convert blend state to stateobj And move to new register builders while we are at it. Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3565> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3565>	2020-01-29 11:21:47 +00:00
Rob Clark	f066e3afc7	freedreno/a6xx: remove special handling based on MRT format Logicop in particular is supposed to work for integer formats.. but maybe this situation doesn't happen in gles. The only thing that isn't required for integer formats is blending. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3565>	2020-01-29 11:21:47 +00:00
Rob Clark	eb281df1a1	mesa/st: random whitespace cleanup Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3565>	2020-01-29 11:21:47 +00:00
Rob Clark	d0e0141526	freedreno: use PIPE_CAP_RGB_OVERRIDE_DST_ALPHA_BLEND This lets us drop a bunch of special handling for xRGB blend. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3565>	2020-01-29 11:21:47 +00:00
Thomas Hellstrom	9ee3ec348e	gallium/util: Increase the debug_flush map depth Some piglit tests trigger a map depth assert when debug_flush is active. Fix this by increasing the map depth from 16 to 32. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3614> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3614>	2020-01-29 10:56:06 +00:00
Thomas Hellstrom	8830e9f0ca	svga: Avoid discard DMA uploads Newer versions of the device code will make discard DMA uploads sub-optimal. Disable them for guest-backed aware code, where we previously had them conditionally enabled. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3614>	2020-01-29 10:56:06 +00:00
Thomas Hellstrom	8afe12b212	winsys/svga: Enable transhuge pages for buffer objects If the kernel supports it, enable transhuge pages for graphics buffer objects. Except for the syscall itself, this is never expected to cause any negative performance implications. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3614>	2020-01-29 10:56:06 +00:00
Roland Scheidegger	3b3c2daf3a	winsys/svga: use new ioctl for logging Use the new ioctl for logging (rather than duplicating what the kernel is doing). This way it's also independent from the actual guest/host mechanism to do the logging. Signed-off-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3614>	2020-01-29 10:56:06 +00:00
Samuel Pitoiset	f53b4defad	radv: remove the non conformant VK implementation warning on GFX10 It's no longer true. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3597> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3597>	2020-01-29 10:35:15 +00:00
Samuel Pitoiset	1b8d99e288	radv: bump conformance version to 1.2.0.0 https://www.khronos.org/conformance/adopters/conformant-products#submission_472 https://www.khronos.org/conformance/adopters/conformant-products#submission_473 https://www.khronos.org/conformance/adopters/conformant-products#submission_474 Fixes dEQP-VK.api.driver_properties.conformance_version. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3597>	2020-01-29 10:35:15 +00:00
Samuel Pitoiset	401bfe0283	radv: implement VK_AMD_shader_explicit_vertex_parameter Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2402 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Samuel Pitoiset	663d5c1399	radv: gather which input PS variables use an explicit interpolation mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Samuel Pitoiset	3922d95b51	aco: implement VK_AMD_shader_explicit_vertex_parameter Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Samuel Pitoiset	6f4c300919	ac/llvm: implement VK_AMD_shader_explicit_vertex_parameter Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Samuel Pitoiset	531a26d5aa	spirv: implement SPV_AMD_shader_explicit_vertex_parameter Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Samuel Pitoiset	cf6cae832c	nir: lower interp_deref_at_vertex to load_input_vertex This introduces a new NIR intrinsic for loading inputs at a specific vertex index. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Samuel Pitoiset	d29f10a7ca	nir: add nir_intrinsic_interp_deref_at_vertex From the SPV_AMD_shader_explicit_vertex_parameter extension: "Returns the value of the input <interpolant> without any interpolation, i.e. the raw output value of previous shader stage." Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Samuel Pitoiset	687f170311	nir: lower SYSTEM_VALUE_BARYCENTRIC_* to nir_load_barycentric() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Samuel Pitoiset	9021b45b35	nir: add nir_intrinsic_load_barycentric_model Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Samuel Pitoiset	df8dd12e5b	spirv: add support for SpvBuiltInBaryCoord* Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Samuel Pitoiset	61d24080bb	compiler: add new SYSTEM_VALUE_BARYCENTRIC_* Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Samuel Pitoiset	15d53d8294	compiler: add PERSP to the existing barycentric system values We need the LINEAR versions for AMD_shader_explicit_vertex_parameter. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Samuel Pitoiset	5c053cc6ec	spirv: add support for SpvDecorationExplicitInterpAMD Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Samuel Pitoiset	746e9e5d66	compiler: add a new explicit interpolation mode This introduces one more interpolation mode INTERP_MODE_EXPLICIT, which is needed for AMD_shader_explicit_vertex_parameter. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>	2020-01-29 09:49:50 +00:00
Eduardo Lima Mitev	e6b531af66	turnip: Fix issues in tu_compute_pipeline_create() that may lead to crash The shader object is destroyed even if its creation failed. It is also not destroyed if its compilation or upload fails, leading to leaks. Finally, tu_compute_pipeline_create() should set output var pPipeline to VK_NULL_HANDLE if it fails. Avoids crash on dEQP-VK.api.object_management.alloc_callback_fail_multiple.compute_pipeline Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3572> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3572>	2020-01-29 09:25:20 +00:00
Eduardo Lima Mitev	0e11e8ba89	turnip: Remove failed command buffer from pool When an error condition occurs during tu_create_cmd_buffer(), the cmd buffer has already been added to a pool, so the cleanup code should remove it. Fixes a crash (assert in tu_device::tu_bo_finish()) in dEQP tests: dEQP-VK.api.object_management.max_concurrent.command_buffer_primary dEQP-VK.api.object_management.max_concurrent.command_buffer_secondary due to pool attempting to destroy an invalid command buffer. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3572>	2020-01-29 09:25:20 +00:00
Pierre-Eric Pelloux-Prayer	ab54624d0d	radeonsi: stop using the VM_ALWAYS_VALID flag Allocating all the bo as ALWAYS_VALID means they must all fit in memory (vram + gtt) at each command submission. This causes some trouble when the total allocated memory is greater than the available memory. Possible solutions: - being able to tag/untag a bo as ALWAYS_VALID: would require kernel changes - disable VM_ALWAYS_VALID when memory usage is more than a percentage of the available memory - disable VM_ALWAYS_VALID entirely v1 of this patch implemented option 2. v2 (this version) implements option 3. Related issues: - https://gitlab.freedesktop.org/drm/amd/issues/607 - https://gitlab.freedesktop.org/mesa/mesa/issues/1257 It also helps with some piglit tests (-t maxsize -t "max[_-].*size" -t maxuniformblocksize): instead of crashing the machine, the tests fail cleanly. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2190 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3430> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3430>	2020-01-29 09:05:04 +01:00
Samuel Pitoiset	b05ac4b158	radv: enable VK_AMD_shader_fragment_mask on GFX6-GFX7 Works fine. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3603> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3603>	2020-01-29 08:08:27 +01:00
Kenneth Graunke	baf9327fa1	loader: Check if the kernel driver is i915 before loading iris To prevent it from trying to load on say gma500 hardware. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3595> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3595>	2020-01-28 15:35:09 -08:00
Jordan Justen	2969012d03	anv: Emit CS Stall before Instruction Cache flush for gen12 WA Before flushing the instruction cache with a pipe control, we need to use a CS Stall pipe control. Ref: GEN:BUG:1409226450 Rework: Add stall-at-scoreboard (Lionel) Rework: Merge with other anvil pre-invalidate stalls (Lionel) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3457> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3457>	2020-01-28 21:57:17 +00:00
Jordan Justen	da03e07cc2	iris: Emit CS Stall before Instruction Cache flush for gen12 WA Before flushing the instruction cache with a pipe control, we need to use a CS Stall pipe control. Ref: GEN:BUG:1409226450 Rework: Add stall-at-scoreboard (Lionel) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3457>	2020-01-28 21:57:17 +00:00
Erik Faye-Lund	b175effc72	zink: set compareEnable when setting compareOp We need to enable compareEnable for compareOp to be valid, and ANV was recently updated to respect this. So let's update Zink to match. This fixes the shadow-variants of several piglit regressions, like these: spec@arb_shader_texture_lod@execution@tex-miplevel-selection spec@glsl-1.20@execution@tex-miplevel-selection Fixes: `a19cdf989b` ("anv: only use VkSamplerCreateInfo::compareOp if enabled") Reviewed-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3473> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3473>	2020-01-28 21:04:26 +00:00
Eric Anholt	f6e59911e5	ci: Enable -Werror on the meson-i386 build. I find warnings to be very disruptive to my workflow (using emacs's "go to next error" feature), and I periodically have to go clean up other people's drivers to get back to finding my own warnings in the noise. I know I'm not the only one doing something like this. We don't want to enable -Werror by default in builds, since it means that end users will have builds spuriously fail based on what compiler version and opt flags they have compared to what the devs are using. However, it is quite easy to have CI ensure that we at least don't introduce warnings on the compiler version that it uses. For now I've just enabled it on meson-i386 to cover a bunch of Mesa core and get us started on ratcheting up warnings-cleanliness in the tree, without me having to fix up all the drivers at once. Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3539> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3539>	2020-01-28 12:31:07 -08:00
Eric Anholt	527a8c345b	mesa/st: Fix compiler warnings from INTEL_shader_integer_functions. Fixes: `1d165b0548` ("glsl: Add new expressions for INTEL_shader_integer_functions2") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3539>	2020-01-28 12:31:03 -08:00
Eric Anholt	096921c878	iris: Silence warning about AUX_USAGE_MC. It was recently introduced and not added to iris yet it looks like. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3539>	2020-01-28 12:30:48 -08:00
Eric Anholt	05e3ccd8a1	vulkan/wsi: Fix compiler warning when no WSI platforms are enabled. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3539>	2020-01-28 12:30:48 -08:00
Dylan Baker	71c6208200	docs: update news, calendar, and link release notes for 19.3.3 Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3604> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3604>	2020-01-28 11:36:21 -08:00
Dylan Baker	3e49d0efe7	docs: Add SHA 256 sums for 19.3.3 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3604>	2020-01-28 11:35:30 -08:00
Dylan Baker	f9ef115927	docs: Add relnotes for 19.3.3 release Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3604>	2020-01-28 11:35:28 -08:00
Jason Ekstrand	997040e4b8	intel/mi_builder: Force write completion on Gen12+ Otherwise, we have no guarantee that the write actually lands before we move on to other things. Doing this on every SDI is probably a bit harsh but it's safe. We should figure out a good way to avoid this when we can. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3593> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3593>	2020-01-28 18:15:29 +00:00
Jason Ekstrand	06657e1dda	anv: Replace one more aux_surface.isl.size_B check This one was missed in `41bffe0913`. Fixes: `41bffe0913` "anv: Replace aux_surface.isl.size_B checks with..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3593>	2020-01-28 18:15:29 +00:00
Jason Ekstrand	f229579c0a	intel/blorp: Handle bit-casting UNORM and BGRA formats In `f132e0fddf`, I attempted to allow BLORP to do CCS_E copies by using the UNORM formats instead. However, the old BLORP bit-cast code could only handle RGBA formats and asserted on anything other than UINT formats. The reason we didn't catch this is because it only comes up on Gen12 platforms which aren't in our normal CI yet. Fixes: `f132e0fddf` "intel/blorp: Add support for CCS_E copies with..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3593>	2020-01-28 18:15:29 +00:00
Daniel Schürmann	396be00640	aco: fix combine_salu_not_bitwise() when SCC is used Previously, we didn't use the SCC bit, and thus, we didn't care about it. With 'aco: Transform uniform bitwise instructions to 32-bit if possible.' that changed, so that we have to handle it. Fixes: `8a32f57fff` ('aco: Transform uniform bitwise instructions to 32-bit if possible.') Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3598> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3598>	2020-01-28 18:14:02 +01:00
Drew Davenport	0d99ff54cc	radeonsi: Clear uninitialized variable |view| was not initialized leading to flaky test failures in SkQP test unitTest_ES2BlendWithNoTexture. Fixes: `029bfa3d25` "radeonsi: add ability to bind images as image buffers" Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3592> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3592>	2020-01-28 16:29:48 +00:00
Brian Ho	815a603889	anv: Handle unavailable queries in vkCmdCopyQueryPoolResults If VK_QUERY_RESULT_WAIT_BIT is not set, there is currently no special handling of unavailable queries in vkCmdCopyQueryPoolResults, and anv will write an invalid value for the query result. This commit updates vkCmdCopyQueryPoolResults for unavailable queries to return 0 if the VK_QUERY_RESULT_PARTIAL_BIT flag is set and if not, skip writing altogether. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3586> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3586>	2020-01-28 15:17:21 +00:00
Brian Ho	af92ce50a7	anv: Properly fetch partial results in vkGetQueryPoolResults Currently, fetching the partial results (VK_QUERY_RESULT_PARTIAL_BIT) of an unavailable occlusion query via vkGetQueryPoolResults can return invalid values. anv returns slot.end - slot.begin, but in the case of unavailable queries, slot.end is still at the initial value of 0. If slot.begin is non-zero, the occlusion count underflows to a value that is likely outside the acceptable range of the partial result. This commit fixes vkGetQueryPoolResults by always returning 0 if the query is unavailable and the VK_QUERY_RESULT_PARTIAL_BIT is set. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3586>	2020-01-28 15:17:21 +00:00
Rhys Perry	7edcf4a59d	aco: fix rebase error from GS copy shader support Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `f8f7712666` ('aco: implement GS copy shaders') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3601> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3601>	2020-01-28 13:50:53 +00:00
Tapani Pälli	dd9bf7d291	anv/android: make format_supported_with_usage static Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3532> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3532>	2020-01-28 14:46:38 +02:00
Tapani Pälli	104744f4df	anv/android: setup gralloc1 usage from gralloc0 usage manually This cuts away dependency to libgrallocusage. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3532>	2020-01-28 14:46:25 +02:00
Rhys Perry	03a0d39366	aco: use MUBUF in some situations instead of splitting vertex fetches Fixes most of the regressions from splitting vertex fetches in an earlier commit. pipeline-db (Vega): Totals from affected shaders: SGPRS: 0 -> 0 (0.00 %) VGPRS: 0 -> 0 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 0 -> 0 (0.00 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 0 -> 0 (0.00 %) pipeline-db (Navi): Totals from affected shaders: SGPRS: 562696 -> 558344 (-0.77 %) VGPRS: 395596 -> 393752 (-0.47 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 11600912 -> 11311804 (-2.49 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 101839 -> 102372 (0.52 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>	2020-01-28 11:44:52 +00:00
Rhys Perry	21d2799cee	aco: value-number MUBUF instructions We will have to do this when we start creating MUBUF instructions for load_input because NIR might not be able to tell they are identical since it doesn't know whether two vertex attributes have the same offset. No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>	2020-01-28 11:40:22 +00:00
Rhys Perry	d39f5519a1	aco: handle unaligned vertex fetch on GFX10 pipeline-db (Vega): Totals from affected shaders: SGPRS: 0 -> 0 (0.00 %) VGPRS: 0 -> 0 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 0 -> 0 (0.00 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 0 -> 0 (0.00 %) pipeline-db (Navi): Totals from affected shaders: SGPRS: 795000 -> 802368 (0.93 %) VGPRS: 579632 -> 581280 (0.28 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 17208408 -> 17583652 (2.18 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 145731 -> 145279 (-0.31 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>	2020-01-28 11:40:10 +00:00
Rhys Perry	d9e357e35b	aco: skip unused channels at the start when fetching vertices pipeline-db (Vega): Totals from affected shaders: SGPRS: 161320 -> 161224 (-0.06 %) VGPRS: 153968 -> 149408 (-2.96 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 4331496 -> 4331308 (-0.00 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 27814 -> 28594 (2.80 %) pipeline-db (Navi): Totals from affected shaders: SGPRS: 161504 -> 161408 (-0.06 %) VGPRS: 153836 -> 149440 (-2.86 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 4327572 -> 4327604 (0.00 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 27837 -> 28618 (2.81 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>	2020-01-28 11:40:01 +00:00
Rhys Perry	525b107347	aco: rework vertex fetching a bit This will make it easier to skip unused channels at the start and to split unaligned loads on GFX10. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>	2020-01-28 11:39:57 +00:00
Rhys Perry	4363a1f75b	amd/common,radv: move vertex_format_table to ac_shader_util.{h,c} Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>	2020-01-28 11:39:52 +00:00
Jan Zielinski	ab7ac1ffda	gallium/swr: fix tessellation state save/restore Tessellation state should be saved with TCS/TES state when binding new state and restored if old state is set again. Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3596> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3596>	2020-01-28 13:55:47 +01:00
Vasily Khoruzhick	fe5267d322	lima: disable early-z if fragment shader uses discard We have to disable early-z if fragment shader uses discard, otherwise we'll get misrendering. Reported-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3570> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3570>	2020-01-27 22:35:43 -08:00
Vasily Khoruzhick	650c680545	lima: ppir: always create move and update ld_tex successors for all blocks Always create a mov for ld_tex since we can't rely on ppir_node_has_single_src_succ() if we have multiple blocks. And since ld_tex successor can be in a different block we have to update their ppir_src as well. Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3564> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3564>	2020-01-28 01:45:29 +00:00
Vasily Khoruzhick	4a0f62f1fc	lima: ppir: don't delete root ld_tex nodes without successors in current block We don't clone ld_tex nodes into each block anymore, so ld_tex may have successors in another block. Fixes: `c8554f849e` ("lima/ppir: don't clone texture loads") Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3564>	2020-01-28 01:45:29 +00:00
Rob Clark	63af27bc76	freedreno/drm: fix invalid-cmdstream-size with older kernels A cmdstream of size zero is invalid. But this can appear in various places where we emit a pointer to state. This doesn't show up with newer kernels (newer than v5.0) which use "softpin", but on earlier kernels can result in: [drm:msm_ioctl_gem_submit [msm]] ERROR invalid cmdstream size: 0 Since the pointer value doesn't matter in these cases, the easy solution is just to not emit a cmds table entry in this case. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2805> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2805>	2020-01-28 00:09:34 +00:00
Marek Olšák	0c154d9e2d	Revert "winsys/amdgpu: Re-use amdgpu_screen_winsys when possible" This reverts commit `b60f5cbc15`. This fixes dmesg errors and X freezes: [ 29.543096] amdgpu 0000:0c:00.0: No GEM object associated to handle 0x00000009, can't create framebuffer [ 29.543103] amdgpu 0000:0c:00.0: No GEM object associated to handle 0x00000009, can't create framebuffer	2020-01-27 17:48:42 -05:00
Marek Olšák	ba06c7620f	Revert "winsys/amdgpu: Close KMS handles for other DRM file descriptions" This reverts commit `552028c013`. Required by the next reverted commit.	2020-01-27 17:48:25 -05:00
Jason Ekstrand	993f866d2e	anv: Insert holes for non-existent XFB varyings Thanks to optimizations, it's possible for varyings to get deleted but still leave the variable there for nir_gather_xfb_info to find. If we get into this case, insert a hole. Fixes: `36ee2fd61c` "anv: Implement the basic form of..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3520> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3520>	2020-01-27 20:26:23 +00:00
Jason Ekstrand	68b3bfaa42	intel/genxml: Make SO_DECL::"Hole Flag" a Boolean Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3520>	2020-01-27 20:26:23 +00:00
Sagar Ghuge	a27542c5dd	intel/compiler: Clear accumulator register before EOT v2: (Francisco Jerez) - Drop vec4 changes. - Handle explicit acc0 operand and implicit one. - Make sure instruction is SIMD16, prediction is off and default mask control set to true. v3: (Francisco Jerez) - Clear accumulator only when it's written. - Use BRW_MASK_DISABLE instead of true. - Use correct width for brw_acc_reg(). - Fix last_inst_offset. v4: (Francisco Jerez) - Don't check for last instruction for accumulator write. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3376> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3376>	2020-01-27 19:48:11 +00:00
Alyssa Rosenzweig	480cf7d9bf	pan/midgard: Remove float_bitcast Now unused. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3588> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3588>	2020-01-27 13:37:36 -05:00
Samuel Pitoiset	83e1fa87a7	radv: do not allow sparse resources with multi-planar formats It's unsupported. Fixes some fails or hangs with dEQP-VK.sparse_resources.image_sparse_binding.* Cc: 19.3 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3581> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3581>	2020-01-27 15:47:49 +00:00
Boris Brezillon	24360966ab	panfrost/midgard: Prettify embedded constant prints Until now, embedded constants were printed as all 32-bit integers or floats, but the compiler can pack constants of different types if several instructions with different reg_mode and native type refer to the constant register. Let's implement something smarter so users don't have to do a manual conversion when looking at a trace. Note that 8-bit constants are not decoded yet, as we're not sure how the writemask is encoded in that case. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3536> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3536>	2020-01-27 15:24:54 +00:00
Boris Brezillon	aa973fc14e	panfrost/midgard: Add a condense_writemask() helper This way we can convert an 8-bit writemask (Midgard specific representation) into the more common 1-bit/component representation. 8-bit mode is not supported yet, as we're not sure how the writemask is encoded for this mode. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3536>	2020-01-27 15:24:54 +00:00
Rhys Perry	2dc63d39d3	aco: fix literal application with v_cndmask_b32/v_addc_co_u32/etc No pipeline-db changes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `0be7409069` ('aco: rewrite literal combining') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>	2020-01-27 14:50:37 +00:00
Rhys Perry	827681f921	aco: always add sgprs to sgpr_ids when choosing literals Even if it's a literal, we should add this to sgpr_ids. No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `0be7409069` ('aco: rewrite literal combining') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>	2020-01-27 14:50:37 +00:00
Rhys Perry	92970adb4b	aco: fix operand to scc when selecting SGPR ufind_msb/ifind_msb Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>	2020-01-27 14:50:37 +00:00
Rhys Perry	e6c90e4af9	aco: fix WaR check for >64-bit FLAT/GLOBAL instructions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `5986e0019` ('aco: improve WAR hazard workaround with >64bit stores') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>	2020-01-27 14:50:37 +00:00
Alyssa Rosenzweig	8784062abb	pan/midgard: Handle tag 0x4 as texture Used for barriers which work as texture ops. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>	2020-01-27 13:38:41 +00:00
Alyssa Rosenzweig	5a271df028	pan/midgard: Validate barriers use a barrier tag ...and that non-barriers don't use a barrier tag. It's not clear what the difference means quite yet, though. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>	2020-01-27 13:38:41 +00:00
Alyssa Rosenzweig	c9f4eface3	pan/midgard: Disassemble barrier instructions We don't need to print all the usual texture noise; just the relevant fields and the rest can be guarded to zero. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>	2020-01-27 13:38:41 +00:00
Alyssa Rosenzweig	556964d927	pan/midgard: Record TEXTURE_OP_BARRIER It's 0x0B for whatever reason. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>	2020-01-27 13:38:41 +00:00
Alyssa Rosenzweig	3993969477	pan/decode: Drop MFBD compute shader stuff This is triggering all sorts of failures in pandecode and is only mostly spurious. Let's not overwhelm ourselves with this yet. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3580>	2020-01-27 13:38:41 +00:00
Icecream95	8004874885	panfrost: Don't copy uniforms when the size is zero This fixes a crash when using Gallium HUD with QuakeSpasm when gamma correction shaders (a QuakeSpasm feature, not part of Mesa) are used. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3549> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3549>	2020-01-27 13:23:34 +00:00
Florian Will	951083768b	radv/winsys: set IB flags prior to submit in the sysmem path This fixes missing scene objects in ZUSI 3 + dxvk. Index / vertex buffer upload using thousands of CopyBuffer commands in one huge Vulkan command buffer, mixed with lots of render pass begin/end and draw calls, failed for some of the buffers. radv divides the huge command buffer into 3 IBs, and they had random flags set because the field was uninitialized. Maybe IBs got discarded if they had the PREAMBLE bit set. Signed-off-by: Florian Will <florian.will@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: <mesa-stable@lists.freedesktop.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3577> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3577>	2020-01-27 11:53:22 +01:00
Pierre-Eric Pelloux-Prayer	90312de551	docs: document AMD_DEBUG variable See https://gitlab.freedesktop.org/mesa/mesa/issues/2022 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3492> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3492>	2020-01-27 09:29:10 +01:00
Pierre-Eric Pelloux-Prayer	a803d41248	radeonsi: move AMD_DEBUG tests to AMD_TEST AMD_DEBUG env var is stored in a 64 bits int and has 64 different values. This commit makes some space by moving the test* special values to AMD_TEST. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3492>	2020-01-27 09:29:10 +01:00
Dave Airlie	58ba7b696d	gallivm/nir: add missing break for isub. Pointed out by coverity scan. Fixes: `3adf74f2ef` ("gallivm: pick integer builders for alu instructions.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3571> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3571>	2020-01-27 08:05:27 +10:00
Lionel Landwerlin	8bd92a15cf	isl: add gen12 comment about CCS for linear tiling Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3551> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3551>	2020-01-26 20:46:14 +00:00
Lionel Landwerlin	a3f6db2c4e	isl: drop CCS row pitch requirement for linear surfaces We were applying the row pitch constraint of CCS surfaces to linear surfaces. But CCS is only supported in linear tiling under some conditions (more on that in the following commit). So let's drop that requirement for now. Fixes a bunch of crucible asserts where the byte size of a linear image is expected to be similar to the byte size of a buffer for the same extent in the following category : func.miptree.r8g8b8a8-unorm.aspect-color.view-2d.download-copy-with-draw. v2: Move restriction to isl_calc_tiled_min_row_pitch() v3: Move restriction to isl_calc_row_pitch_alignment() (Jason) v4: Update message (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `07e16221d9` ("isl: Round up some pitches to 512B for Gen12's CCS") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3551>	2020-01-26 20:46:14 +00:00
Lionel Landwerlin	397ff2976b	intel: Implement Gen12 workaround for array textures of size 1 Gen12 does not support RENDER_SURFACE_STATE::SurfaceArray = true && RENDER_SURFACE_STATE::Depth = 0. SurfaceArray can only be set to true if Depth >= 1. We work around this limitation by adding the max(value, 1) snippet in the shaders on the 3 components for texture array sizes. Tested on Gen9 with the following Vulkan CTS tests : dEQP-VK.image.image_size.2d_array.* v2: Drop debug print (Tapani) Switch to GEN:BUG instead of Wa_ v3: Fix dEQP-VK.image.image_size.1d_array.* cases (Lionel) v4: Fix dEQP-VK.glsl.texture_functions.query.texturesize.* cases (Missing tex_op handling) (Lionel) v5: Missing break statement (Lionel) v6: Fixup comment (Tapani) v7: Fixup comment again (Tapani) v8: Don't use sample_dim as index (Jason) Rename pass Simplify control flow Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v7) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3362> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3362>	2020-01-26 22:27:03 +02:00
Jason Ekstrand	4d03e53127	intel/isl: Allow CCS_E on more formats Now that BLORP supports copies on everything except R11G11B10_FLOAT, we should be able to support CCS_E on those formats. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3554> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3554>	2020-01-25 17:48:54 +00:00
Jason Ekstrand	f132e0fddf	intel/blorp: Add support for CCS_E copies with UNORM formats Some of the smaller bit-size formats which support CCS_E don't have a UINT representative in their compression class. However, we should be able to use UNORM just fine and still get bit-exact copies. We just have to do a conversion to/from UNORM when we bitcast. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3554>	2020-01-25 17:48:54 +00:00
Erico Nunes	ae0b8ba5d5	lima/ppir: fix src read mask swizzling The src mask can't be calculated from the dest write_mask. Instead, it must be calculated from the swizzled operators of the src. Otherwise, liveness calculation may report incorrect live components for non-ssa registers. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>	2020-01-25 14:48:55 +01:00
Erico Nunes	ab36523ae7	lima/ppir: split ppir_op_undef into undef and dummy again Those were renamed/merged some time ago but it turns out that ppir_op_undef can't be shared. It was being used for undefined ssa operations and for read-before-write operations that may happen to e.g. uninitialized registers (non-ssa) inside a loop. We really don't want to reserve a register for the undef ssa case, but we must reserve and allocate a register for the uninitialized register case because when it happens inside a loop it may need to hold its value across iterations. This dummy node might be eliminated with a code refactor in ppir in case we are able to emit the write and allocate the ppir_reg before we emit the read. But until such a major refactor happens, we need to keep this code to avoid apparent regressions with the new liveness analysis implementation. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>	2020-01-25 14:48:55 +01:00
Erico Nunes	4ca3de06ec	lima/ppir: fix ssa undef emit The ssa doesn't need to be manually added to block->comp->reg_list. Doing so actually causes other registers to be marked as undef=true later. This patch alone fixes a few deqp tests that have undefs. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>	2020-01-25 14:48:55 +01:00
Erico Nunes	d6b1917c01	lima/ppir: handle write to dead registers in ppir nir can output writes to dead registers when expanding vec4 operations to non-ssa registers. In that case, some components of the vec4 may be assigned but never read. These are also not currently removed by a nir dead code elimination pass as they are not ssa. In order to prevent regalloc from allocating a live register for this operation, an interference must be assigned to it during liveness analysis. This workaround may be removed in the future if the assignments to dead components can be removed earlier in ppir or nir. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3502>	2020-01-25 14:48:02 +01:00
Marek Olšák	eb7cd575da	radeonsi: fix a regression since the addition of si_shader_llvm_vs.c Fixes: `cd5b99c541` - radeonsi: move VS shader code into si_shader_llvm_vs.c Closes: #2416 Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3561> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3561>	2020-01-25 05:59:24 +00:00
Marek Olšák	688d2901b8	radeonsi: make screen available to shader part compilation to fix a crash in is_multi_part_shader. Fixes: `1a0890dcf3` - radeonsi: change prototypes of si_is_multi_part_shader & si_is_merged_shader Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3561>	2020-01-25 05:59:24 +00:00
Jason Ekstrand	07a441d53f	anv: Rework CCS memory handling on TGL-LP The previous way we were attempting to handle AUX tables on TGL-LP was very GL-like. We used the same aux table management code that's shared with iris and we updated the table on image create/destroy. The problem with this is that Vulkan allows multiple VkImage objects to be bound to the same memory location simultaneously and the app can ping-pong back and forth between them in the same command buffer. Because the AUX table contains format-specific data, we cannot support this ping-pong behavior with only CPU updates of the AUX table. The new mechanism switches things around a bit and instead makes the aux data part of the BO. At BO creation time, a bit of space is appended to the end of the BO for AUX data and the AUX table is updated in bulk for the entire BO. The problem here, of course, is that we can't insert the format-specific data into the AUX table at BO create time. Fortunately, Vulkan has a requirement that every TILING_OPTIMAL image must be initialized prior to use by transitioning the image from VK_IMAGE_LAYOUT_UNDEFINED to something else. When doing the above described ping-pong behavior, the app has to do such an initialization transition every time it corrupts the underlying memory of the VkImage by using it as something else. We can hook into this initialization and use it to update the AUX-TT entries from the command streamer. This way the AUX table gets its format information, apps get aliasing support, and everyone is happy. One side-effect of this is that we disallow CCS on shared buffers. We'll need to fix this for modifiers on the scanout path but that's a task for another patch. We should be able to do it with dedicated allocations. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>	2020-01-25 02:18:33 +00:00
Jason Ekstrand	b29cf7daf3	anv: Make anv_vma_alloc/free a lot dumber All they do now is take a size, align, and flags and figure out which heap to allocate in. All of the actual code to deal with the BO is in anv_allocator.c. We want to leave anv_vma_alloc/free in anv_device.c because it deals with API-exposed heaps so it still makes sense to have it there. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>	2020-01-25 02:18:33 +00:00
Jason Ekstrand	fd0f9d1196	anv: Make AUX table invalidate a PIPE_* bit This commit moves it in with all the other cache invalidation operations as if it were done by PIPE_CONTROL even though it's a pair of register writes. This means we only have to write the GFX_AUX_TABLE_BASE_ADDR register once at device initialization instead of every invalidate. Invalidates are now a single LRI instead of two. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>	2020-01-25 02:18:33 +00:00
Jason Ekstrand	658dc9ca50	anv: Add another align_down helper Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>	2020-01-25 02:18:33 +00:00
Jason Ekstrand	64ca8a3272	isl: Add a helper for calculating subimage memory ranges Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>	2020-01-25 02:18:33 +00:00
Jason Ekstrand	4793116036	anv: Delete a redundant calculation We compute the same thing with the same variable name at the top of the function. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>	2020-01-25 02:18:33 +00:00
Jason Ekstrand	a1e9adc9ce	intel/aux-map: Factor out some useful helpers This breaks add_mapping() into three pieces: 1. get_aux_entry() adds AUX-TT pages as needed and returns the L1 entry index, L1 entry address, and L1 entry map. 2. gen_aux_map_format_bits_for_isl_surf() computes the format- specific information that goes in the AUX-TT entry. 3. add_mapping() is a lot dumber function that now just adds the requested mapping with the requested format bits. This lets us break out some additional helpers in the API which we want to use for more direct AUX-TT management in ANV. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>	2020-01-25 02:18:33 +00:00
Jason Ekstrand	bea62ea566	intel/aux-map: Add some #defines Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>	2020-01-25 02:18:33 +00:00
Marek Olšák	0366c8c5b7	radeonsi: expose shader cache stats to the HUD Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>	2020-01-24 20:29:29 -05:00
Marek Olšák	c046551e60	radeonsi: print shader cache stats with AMD_DEBUG=cache_stats Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>	2020-01-24 20:29:29 -05:00
Marek Olšák	2fd3bb23ab	radeonsi: restructure si_shader_cache_load_shader Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>	2020-01-24 20:29:29 -05:00
Marek Olšák	0db74f479b	radeonsi: use the live shader cache Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>	2020-01-24 20:29:29 -05:00
Marek Olšák	4bb919b0b8	gallium/util: add a cache of live shaders for shader CSO deduplication Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>	2020-01-24 20:29:29 -05:00
Marek Olšák	f36f85d958	util/simple_mtx: add a missing include to get ASSERTED Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2929>	2020-01-24 20:29:29 -05:00
Caio Marcelo de Oliveira Filho	6a0dda63dd	intel/compiler: Add names for SHADER_OPCODE_[IU]SUB_SAT Fixes: `58907568ec` ("intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-ops") Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3558> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3558>	2020-01-24 23:52:30 +00:00
Caio Marcelo de Oliveira Filho	c1a2ac2abe	anv: Always initialize target_stencil_layout Pass down stencil data from the subpass attachment like we do elsewhere. Only stencil attachments will make use of it. Fixes warnings like ../src/intel/vulkan/genX_cmd_buffer.c: In function ‘cmd_buffer_begin_subpass’: ../src/intel/vulkan/genX_cmd_buffer.c:4656:41: warning: ‘target_stencil_layout’ may be used uninitialized in this function [-Wmaybe-uninitialized] 4656 \| att_state->current_stencil_layout = target_stencil_layout; \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~ Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3557> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3557>	2020-01-24 14:01:38 -08:00
Jason Ekstrand	41bffe0913	anv: Replace aux_surface.isl.size_B checks with aux_usage checks Now that aux_usage has a unified meaning, aux_usage != NONE if and only if aux_surface.isl.size_B > 0. In most of these cases, the question we're asking is "does this have compression?" and not "have we allocated an aux surface for compression?". Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>	2020-01-24 21:07:26 +00:00
Jason Ekstrand	e693a57232	anv: Rework the meaning of anv_image::planes[]::aux_usage Previously, we set aux_usage=ISL_AUX_USAGE_NONE when we really meant CCS_D. This sort-of made sense before we had anv_layout_to_aux_usage, but it makes less sense now that we have that helper. In our more modern aux tracking model, all aux usage goes through anv_layout_to_* and we're better off making the meaning of anv_image::planes[]::aux_usage be AUX_USAGE_NONE if and only if there is no compression. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>	2020-01-24 21:07:26 +00:00
Samuel Pitoiset	de64719024	radv: print NIR shaders after lowering FS inputs/outputs This is confusing otherwise. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3553> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3553>	2020-01-24 19:54:58 +00:00
Jason Ekstrand	17e225ee1e	intel/isl: Add a hack for the Gen12 A0 texture buffer bug Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>	2020-01-24 19:18:27 +00:00
Jason Ekstrand	4cd23420bd	intel/isl: Plumb devinfo into isl_genX(buffer_fill_state_s) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>	2020-01-24 19:18:27 +00:00
Jason Ekstrand	98aab272a8	intel/disasm: Properly disassemble indirect SENDs Instead of emitting g[a0]UD for the indirect descriptor, emit a0<0>UD. This is more correct because there is no GRF involved. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>	2020-01-24 19:18:27 +00:00
Jason Ekstrand	3b2eafbea9	intel/fs: Don't unnecessarily fall back to indirect sends on Gen12 The instruction encoding for SENDS changed on Gen12 and it now supports embedding the entire extended message descriptor in the instruction if it's an immediate. Stop falling back to doing an indirect SEND just because we had something in [15:12] of ex_desc.ud. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>	2020-01-24 19:18:27 +00:00
Jason Ekstrand	c70a786c77	anv: Improve BTI change cache flushing This commit makes two changes: 1. We set pending_pipe_bits instead of emitting PIPE_CONTROL directly for the flush at the end of cmd_buffer_begin_subpass. 2. Because BLORP ops such as vkCmdClearAttachments may come in the middle of a render pass, we have to also flag the need for a cache flush after the blorp op. Fixes: `185630c6bc` "anv/blorp: Do the gen11 BTI flush" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>	2020-01-24 19:18:26 +00:00
Alyssa Rosenzweig	e39c52787e	panfrost: Fix 32-bit warning for `indices` ../src/gallium/drivers/panfrost/pan_context.c: In function ‘panfrost_draw_vbo’: ../src/gallium/drivers/panfrost/pan_context.c:1551:70: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] ctx->payloads[PIPE_SHADER_FRAGMENT].prefix.indices = (u64) NULL; ^ Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reported-by: Icecream95 <ixn@keemail.me> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>	2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig	58aa2b8cfc	pan/decode: Remove SHORT_SLIDE indirection ../src/panfrost/pandecode/decode.c: In function ‘pandecode_compute_fbd’: ../src/panfrost/pandecode/decode.c:789:35: warning: taking address of packed member of ‘struct mali_compute_fbd’ may result in an unaligned pointer value [-Waddress-of-packed-member] 789 \| pandecode_u32_slide(num, s->unknown ## num, ARRAY_SIZE(s->unknown ## num)) \| ~^~~~~~~~~ ../src/panfrost/pandecode/decode.c:800:9: note: in expansion of macro ‘SHORT_SLIDE’ 800 \| SHORT_SLIDE(1); \| ^~~~~~~~~~~ Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>	2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig	7d52b3a18b	pan/midgard: Remove pack_color define Unused at the moment. ../src/panfrost/midgard/midgard_compile.c:124:29: warning: ‘m_pack_colour’ defined but not used [-Wunused-function] 124 \| static midgard_instruction m_##name(unsigned ssa, unsigned address) { \ \| ^~ ../src/panfrost/midgard/midgard_compile.c:145:22: note: in expansion of macro ‘M_LOAD_STORE’ 145 \| #define M_LOAD(name) M_LOAD_STORE(name, false) \| ^~~~~~~~~~~~ ../src/panfrost/midgard/midgard_compile.c:213:1: note: in expansion of macro ‘M_LOAD’ 213 \| M_LOAD(pack_colour); \| ^~~~~~ Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>	2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig	6c95ea6bd7	pan/decode: Remove last_size Fixes ../src/panfrost/pandecode/decode.c: In function ‘pandecode_jc’: ../src/panfrost/pandecode/decode.c:2859:14: warning: variable ‘last_size’ set but not used [-Wunused-but-set-variable] 2859 \| bool last_size; Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>	2020-01-24 18:53:31 +00:00
Alyssa Rosenzweig	d126515a16	panfrost: Don't use implicit mali_exception_status enum Fixes ../src/panfrost/pandecode/public.h:53:33: warning: ‘enum mali_exception_access’ declared inside parameter list will not be visible outside of this definition or declaration Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3543>	2020-01-24 18:53:31 +00:00
Samuel Pitoiset	4a553212fa	radv: enable ACO support for GFX6 CTS should pass, as well as Crucible and the small number of Piglit tests. List of game benchmarks tested: - Dawn of War 3 - Serious Sam 2017 - Shadow of The Tomb Raider - The Talos Principle - Thrones of Britannia - Total Warhammer 2 - Total War: Three Kingdoms Note that F1 2017 hangs with or without ACO on GFX6 at the moment. My whole pipelinedb (~30 games) doesn't trigger any compiler crashes. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2401 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>	2020-01-24 18:34:27 +00:00
Samuel Pitoiset	d4b4f40595	aco: copy the literal offset of SMEM instructions to a temporary GFX6 only supports up to 8-bit for the literal offset, so make sure it's copied to a temporary SGPR before emitting a SMEM instruction. The optimizer will propagate the literal offset if possible anyways. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>	2020-01-24 18:34:27 +00:00
Samuel Pitoiset	1ac49ba908	aco: fix a hazard with v_interp_* and v_{read,readfirst}lane_* on GFX6 It's required to insert 1 wait state if the dst VGPR of any v_interp_* is followed by a read with v_readfirstlane or v_readlane to fix GPU hangs on GFX6. Note that v_writelane_* is apparently not affected. This hazard isn't documented anywhere but AMD confirmed it. This fixes a GPU hang with the texturemipmapgen Sascha demo on GFX6. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>	2020-01-24 18:34:27 +00:00
Samuel Pitoiset	b9cc50fbce	aco: fix a hardware bug for MRTZ exports on GFX6 GFX6 (except OLAND and HAINAN) has a bug that it only looks at the X writemask component. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>	2020-01-24 18:34:27 +00:00
Brian Ho	f55e215b8c	turnip: Implement vkCmdCopyQueryPoolResults for occlusion queries Use CP_COND_EXEC and CP_COND_WRITE to conditionally copy the results of a query to a buffer based off the query's availability. Fixes: #2238 Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Brian Ho	9a3656b9fd	turnip: Implement vkCmdResetQueryPool Clears the available bit for each requested query on the GPU. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Brian Ho	97fa4cb3dc	turnip: Implement vkGetQueryPoolResults for occlusion queries Implements fetching the results of a query pool with the VK_QUERY_RESULT_WAIT_BIT, VK_QUERY_RESULT_WITH_AVAILABILITY_BIT, and VK_QUERY_RESULT_PARTIAL_BIT flags. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Brian Ho	24b95485dc	turnip: Update query availability on render pass end Unlike on an immediate-mode renderer, Turnip only renders tiles on vkCmdEndRenderPass. As such, we need to track all queries that were active in a given render pass and defer setting the available bit on those queries until after all tiles have rendered. This commit adds a draw_epilogue_cs to tu_cmd_buffer that is executed as an IB at the end of tu_CmdEndRenderPass. We then emit packets to this command stream that update the availability bit of a given query in tu_CmdEndQuery. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Brian Ho	f750dd2ab8	turnip: Implement vkCmdEndQuery for occlusion queries Mostly a translation of freedreno's implementation of glEndQuery for GL_SAMPLES_PASSED query objects with a slight modification to set the availability bit of the query bo (slot->available) if the query was not ended inside a render pass. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Brian Ho	5824a59ee2	turnip: Implement vkCmdBeginQuery for occlusion queries Mostly a translation of freedreno's implementation of glBeginQuery for GL_SAMPLES_PASSED query objects with special logic for handling tiled render passes. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Brian Ho	78dea40b1c	turnip: Implement vkCreateQueryPool for occlusion queries General structure is inspired by anv's implementation in genX_query.c. We define a packed struct that tracks sample count at the beginning of the query and at the end; the result of the occlusion query is then slot->end - slot->begin. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Brian Ho	a155ab93a3	turnip: Update tu_query_pool with turnip-specific fields tu_query_pool was forked from radv_query_pool, but we will need a different set of fields to implement queries in turnip. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3279>	2020-01-24 18:14:01 +00:00
Jason Ekstrand	0aa13245c1	anv: Allow HiZ in read-only depth layouts This improves the performance of Aztec Ruins by 5% on ICL. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>	2020-01-24 17:42:36 +00:00
Jason Ekstrand	bf3a262a80	anv: Add a usage parameter to anv_layout_to_aux_usage Most places we actually know the usage and can provide it. There are two exceptions to this: 1. We pass 0 into get_blorp_surf_for_anv_image when we use ANV_IMAGE_LAYOUT_EXPLICIT_AUX because anv_layout_to_aux_usage is never actually called so it doesn't matter. 2. We pass 0 into anv_layout_to_aux_usage in transition_color_buffer. However, the coming commits which will begin using the usage parameter only care about depth. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>	2020-01-24 17:42:36 +00:00
Jason Ekstrand	f8a4de6316	anv: Use isl_aux_state for HiZ resolves Rather than looking at the aux usage, we look at the isl_aux_state which provides us with more detailed information. This commit adds a couple helpers to isl which let us quickly determine if we have valid depth/hiz on the initial layout and if we need valid depth/hiz for the final layout. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>	2020-01-24 17:42:36 +00:00
Jason Ekstrand	9a1232a745	anv: Add a layout_to_aux_state helper This new helper maps VkImageLayout enums to isl_aux_state enums which are the hardware's concept of image layouts. We can then use the aux state to get the fast clear type and the aux usage. This should yield no functional change in driver behavior. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>	2020-01-24 17:42:36 +00:00
Jason Ekstrand	769d6ba200	anv: Use TRANSFER_SRC_OPTIMAL for depth/stencil MSAA resolves As of `52ad1712ed`, TRANSFER_SRC_OPTIMAL and SHADER_READ_ONLY_OPTIMAL are now identical for depth buffers so there's no reason why we need to use the "wrong" layout. Technically, according to Vulkan, blits and MSAA resolves are transfer ops so we should use the transfer layout now that we can. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>	2020-01-24 17:42:36 +00:00
Jason Ekstrand	71c0f9e76d	intel/blorp: resize src and dst surfaces separately When copying to an RGB surface, we treat it as an R only one of three times the width, which may end up being larger than the maximum size supported by the hardware and so it hits the shrink path. This forced both source and destination surfaces to be shrunk, even though it's not necessary for the former, and may even hit some assertions in some cases, such as the surface being compressed. Fixes several tests under dEQP-VK.api.copy_and_blit.core.image_to_image.dimensions.* Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3422> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3422>	2020-01-24 17:02:40 +00:00
Samuel Pitoiset	918f00eef8	aco: combine MRTZ (depth, stencil, sample mask) exports Instead of emitting up to 3 exports, one for each component (depth, stencil and sample mask). This is needed to fix a hw bug on GFX6. Totals from affected shaders: SGPRS: 34728 -> 35056 (0.94 %) VGPRS: 26440 -> 26476 (0.14 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 1346088 -> 1344180 (-0.14 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 3922 -> 3915 (-0.18 %) Wait states: 0 -> 0 (0.00 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3538> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3538>	2020-01-24 16:42:15 +00:00
Timur Kristóf	c787b8d2a1	aco/gfx10: Fix VcmpxExecWARHazard mitigation. The SOPP instruction shouldn't have a definition, and its block should be set to -1 in order to prevent it from being recognized as a branch. Also fix a typo in the readme. Fixes: `d6dfce02d0` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3552> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3552>	2020-01-24 16:21:08 +00:00
Timur Kristóf	8a32f57fff	aco: Transform uniform bitwise instructions to 32-bit if possible. This allows removing superfluous s_cselect instructions that come from turning booleans into 64-bit vector condition. v2 by Daniel Schürmann: - Make the code massively simpler v3 by Timur Kristóf: - Fix regressions, make it work in wave32 mode - Eliminate extra moves by not always using the SCC definition - Use s_absdiff_i32 for uniform XOR - Skip the transformation for uncommon or invalid instructions Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450>	2020-01-24 14:40:45 +00:00
Martin Fuzzey	d1925fec53	etnaviv: update Android build files etnaviv no longer builds on Android, fix this. Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3447> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3447>	2020-01-24 14:03:28 +00:00
Rhys Perry	b046f55086	aco: use nir_move_copies Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	72e9a23443	radv/aco: use ACO for GS copy shaders Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	f8f7712666	aco: implement GS copy shaders v5: rebase on float_controls changes v7: rebase after shader args MR and load/store vectorizer MR Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	de4ce66f5c	aco: remove needs_instance_id Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	e192e268de	aco: explicitly mark end blocks for exports For GS copy shaders, whether we want to do exports is conditional. By explicitly marking the end blocks, we can mark an IF's then branch as an export block and ensure that's where the assembler inserts null exports. v6: only fixup exports in the end block, like before v8: simplify some code Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	d46a54ecff	radv/aco: allow ACO for GS Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	8bad100f83	aco: implement GS on GFX7-8 GS is the same on GFX6, but GFX6 isn't fully supported yet. v4: fix regclass v7: rebase after shader args MR Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	40bb81c9dd	radv/aco,aco: implement GS on GFX9+ v2: implement GFX10 v3: rebase v7: rebase after shader args MR v8: fix gs_vtx_offset usage on GFX9/GFX10 v8: use unreachable() instead of printing intrinsic v8: rename output_state to ge_output_state v8: fix formatting around nir_foreach_variable() v8: rename some helpers in the scheduler v8: rename p_memory_barrier_all to p_memory_barrier_common v8: fix assertion comparing ctx.stage against vertex_geometry_gs Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	70f63c1988	aco: improve support for s_sendmsg In particular, the messages needed for GS. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Rhys Perry	0da7b3b18b	radv: move gs copy shader creation before other variants ACO lowers output derefs which breaks the shader_info pass used by gs copy shader creation. v3: rebase Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Timur Kristóf	23edcf6490	aco: Make a better guess at which instructions need the VCC hint. Previously, bool_to_vector_condition would always set the VCC hint on its result. This commit improves it by having the optimizer set the VCC hint only when the result really needs to be in the VCC. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451>	2020-01-24 13:14:23 +00:00
Jan Zielinski	83f24b0587	gallium/swr: implementation of tessellation shaders compilation TCS and TES shaders compilation mechanisms in SWR and state management implementation. Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> Acked-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Dave Airlie <airlied@redhat.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3484> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3484>	2020-01-24 11:38:03 +00:00
Bas Nieuwenhuizen	0890482969	radv: Allow DCC & TC-compat HTILE with VK_IMAGE_CREATE_EXTENDED_USAGE_BIT. I misunderstood the flag when initially disabling. But this flag only does something with mutable formats. If we have DCC and mutable formats, the formats are close enough that the allowed usage flags are not meaningfully different nor used during allocation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3424> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3424>	2020-01-24 11:16:39 +00:00
Bas Nieuwenhuizen	1b447bd2e6	radv: Expose VK_KHR_swapchain_mutable_format. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2354 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3425> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3425>	2020-01-24 10:47:07 +00:00
Connor Abbott	b103157a0e	freedreno: Document CP_INDIRECT_BUFFER_CHAIN This will let us use batch chaining instead of growing batches on a5xx and a6xx. Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3537> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3537>	2020-01-24 10:03:08 +00:00
Connor Abbott	f58242b56e	freedreno: Document CP_UNK_A6XX_55 Reviewed-by: Rob Clark <robdclark@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3537>	2020-01-24 10:03:08 +00:00
Connor Abbott	3cf1d6b8db	freedreno: Document CP_COND_REG_EXEC more The vulkan blob uses the RENDER_MODE mode to condition a blit on the render mode in traces of a dEQP triangle test. Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3182> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3182>	2020-01-24 09:23:27 +00:00
Samuel Pitoiset	a31bcf2be6	ac/llvm: fix missing casts in ac_build_readlane() Because ac_build_optimization_barrier() overwrites the original src_type, we have to keep track of it before emitting that barrier. Otherwise, wrong conversions are expected for pointers or small bitsizes. By doing this, we no longer need to do the cast dance in ac_build_readlane_no_opt_barrier(), it was just necessary for ac_build_optimization_barrier(). This fixes a bunch of crashes with subgroups related tests when RADV_DEBUG=checkir is enabled, and it also fixes a compiler crash with The Surge 2. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2395 Fixes: `0f45d4dc2b` ("ac: add ac_build_readlane without optimization barrier") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535>	2020-01-24 07:40:07 +01:00
Jason Ekstrand	8a135ff6e5	anv/apply_pipeline_layout: Initialize the nir_builder before use Fixes: #2410 Fixes: `3c754900b5` "nir: don't emit ishl in _nir_mul_imm() if backend doesn't support bitops" Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3548> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3548>	2020-01-23 19:35:39 -08:00
Kenneth Graunke	adaa3583f5	meson: Prefer 'iris' by default over 'i965'. This changes the default driver for Intel Gen8-11 hardware to be the newer 'iris' driver rather than the older 'i965' driver. To continue using i965, pass -Dprefer-iris=false when building. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3540> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3540>	2020-01-23 15:34:54 -08:00
Adam Jackson	2fc11e8a05	drisw: Cache the depth of the X drawable This is not always ->rgbBits, because there are cases where that could be 32 but we're (legally) bound to a depth-24 pixmap. The important thing to have match here is the actual server-side notion of depth. You can look this up (at modest expense) from the xlib visual info if the fbconfig has a visual. But it might not, so if not, fetch it (at slightly greater expense) from XGetGeometry. Do this at GLX drawable creation so you don't have to do it on the SwapBuffers path. Apparently this fixes glx/glx-swap-singlebuffer, which is unintentional but quite pleasant. Fixes: mesa/mesa#2291 Fixes: `90d58286` ("drisw: Fix and simplify drawable setup") Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3305> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3305>	2020-01-23 23:03:13 +00:00
Eric Anholt	59f29fc845	turnip: Convert the rest of tu_cmd_buffer.c over to the new pack macros. There are only a couple of hard cases left using pkt4, where the register number to write is computed. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>	2020-01-23 22:46:09 +00:00
Eric Anholt	d67100519e	turnip: Convert renderpass setup to the new register packing macros. This gets a lot of the hard code converted over to the new macros, resulting in (I feel) much more readable code with LESS_SHOUTING_ABOUT_THE_REG(). I decided to consistently put the reg on its own line, so that all the register names line up. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>	2020-01-23 22:46:09 +00:00
Eric Anholt	08837ea3d2	turnip: Port krh's packing macros from freedreno to tu. This introduces some minor unpacking of the temporary fd_reg_pair structs to code that previously was packing a whole register field. In the pack wrapper in tu_cs.h, I added some explanatory docs, dropped the relocs handling since we don't need it, and removed the extra regs[] in the __ONE_REG() macro (which was causing gcc's optimizer to fall on its face in my release build). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>	2020-01-23 22:46:09 +00:00
Eric Anholt	d4bc3c93ea	freedreno: Fix OUT_REG() on address regs without a .bo supplied. Sometimes you want to zero out an address by supplying a NULL BO, but without this we would end up only emitting one dword. Increases size of fd6_gmem.o by .8%, though it's not clear to me why (no obvious terrible codegen happening) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>	2020-01-23 22:46:09 +00:00
Eric Anholt	c1327bc283	freedreno: Add some missing a6xx address declarations. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3455>	2020-01-23 22:46:09 +00:00
Ian Romanick	4b7de92e5f	relnotes: Add GL_INTEL_shader_integer_functions2 and VK_INTEL_shader_integer_functions2 Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-23 13:36:14 -08:00
Vasily Khoruzhick	beab31b9bb	lima: use imul for calculations with intrinsic src Its source is supposed to be int, so we have to use integer multiplication; otherwise we'll get an undefined result. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3529> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3529>	2020-01-23 21:16:22 +00:00
Vasily Khoruzhick	3c754900b5	nir: don't emit ishl in _nir_mul_imm() if backend doesn't support bitops Otherwise we'll have to lower it later. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3529>	2020-01-23 21:16:22 +00:00
Icecream95	cf2c5a56a1	pan/decode: Rotate trace files Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3525> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3525>	2020-01-23 20:46:38 +00:00
Icecream95	c1952779d6	pan/decode: Dump to a file The file name is taken from the environment variable PANDECODE_DUMP_FILE, defaulting to pandecode.dump if it is not set. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3525>	2020-01-23 20:46:38 +00:00
Icecream95	be22c0789f	pan/decode: Support dumping to a file Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3525>	2020-01-23 20:46:38 +00:00
Icecream95	20a8957397	pan/bifrost: Support disassembling to a file Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3525>	2020-01-23 20:46:38 +00:00
Icecream95	968f36d1fc	pan/midgard: Support disassembling to a file Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3525>	2020-01-23 20:46:38 +00:00
Icecream95	7b525ba02b	pan/midgard: Fix a memory leak in the disassembler Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3525>	2020-01-23 20:46:38 +00:00
Eric Anholt	fbd9b4ce08	turnip: Fix execution of secondary cmd bufs with nothing in primary. We want to finish off cmd emission in the primary CS and add its entry to the IB, but regardless of whether there had been anything in the primary CS to emit, we still need a reserved CS entry for the loop below. Fixes crashes in dEQP-VK.binding_model.shader_access.secondary_cmd_buf.* and many more in dEQP-VK.renderpass* Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3524> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3524>	2020-01-23 20:27:26 +00:00
Alyssa Rosenzweig	d6d6ef2862	panfrost: Drop mysterious zero=0xFFFF field It doesn't seem to affect any results and it's not at all clear if/why the blob sometimes(?) sets it? So let's clean this up since this solution isn't correct anyway. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3513> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3513>	2020-01-23 19:59:58 +00:00
Icecream95	f8eb4441ae	pan/midgard: Fix bundle dynarray leak Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3496> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3496>	2020-01-23 19:35:09 +00:00
Marek Olšák	43d9bac6f2	radeonsi: separate LLVM compilation from non-LLVM code Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>	2020-01-23 19:10:21 +00:00
Marek Olšák	1a0890dcf3	radeonsi: change prototypes of si_is_multi_part_shader & si_is_merged_shader Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>	2020-01-23 19:10:21 +00:00
Marek Olšák	7ce84b256e	radeonsi: make si_compile_shader return bool Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>	2020-01-23 19:10:21 +00:00
Marek Olšák	be772182e0	radeonsi: make si_compile_llvm return bool Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>	2020-01-23 19:10:21 +00:00
Marek Olšák	bd19d144a1	radeonsi: move more LLVM functions into si_shader_llvm.c Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>	2020-01-23 19:10:21 +00:00
Marek Olšák	9a66f3d3e2	radeonsi: fold si_shader_context_set_ir into si_build_main_function Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>	2020-01-23 19:10:21 +00:00
Marek Olšák	beacb414b9	radeonsi: move si_nir_build_llvm into si_shader_llvm.c Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>	2020-01-23 19:10:21 +00:00
Marek Olšák	1c73d598eb	radeonsi: minor cleanup in si_shader_internal.h Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>	2020-01-23 19:10:21 +00:00
Marek Olšák	ab33ba987a	radeonsi: move si_shader_llvm_build.c content into si_shader_llvm.c Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>	2020-01-23 19:10:21 +00:00
Marek Olšák	cd5b99c541	radeonsi: move VS shader code into si_shader_llvm_vs.c Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>	2020-01-23 19:10:21 +00:00
Marek Olšák	d1c42e2c6a	radeonsi: move non-LLVM code out of si_shader_llvm.c There was also some redundant code in si_shader_nir.c Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>	2020-01-23 19:10:21 +00:00
Marek Olšák	594f085cfa	radeonsi: use ctx->ac. for types and integer constants Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3421>	2020-01-23 19:10:21 +00:00
Jonathan Marek	8aa5d96864	turnip: simplify tu_physical_device_get_format_properties Fixes the "bad VkImageTiling" error when tiling is VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3485> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3485>	2020-01-23 18:34:07 +00:00
Jonathan Marek	b7e22b7a35	vulkan/wsi: remove unused image_get_modifier Signed-off-by: Jonathan Marek <jonathan@marek.ca> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3485>	2020-01-23 18:34:07 +00:00
Jonathan Marek	e8afd40758	turnip: set linear tiling for scanout images Fixes: `210e6887` "vulkan/wsi: Use the interface from the real modifiers extension" Signed-off-by: Jonathan Marek <jonathan@marek.ca> Acked-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3485>	2020-01-23 18:34:07 +00:00
Jonathan Marek	11f6fba1c9	turnip: hook up GetImageDrmFormatModifierPropertiesEXT Fixes: `210e6887` "vulkan/wsi: Use the interface from the real modifiers extension" Signed-off-by: Jonathan Marek <jonathan@marek.ca> Acked-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3485>	2020-01-23 18:34:07 +00:00
Guido Günther	c5334d2943	freedreno/drm: Don't miscalculate timeout The current code overflows (s * 1000000000) for s >= 5 but that is e.g. used in msm_bo_cpu_prep. Signed-off-by: Guido Günther <agx@sigxcpu.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3514> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3514>	2020-01-23 18:07:13 +00:00
Eric Anholt	b327501dbf	turnip: Add support for fine derivatives. This does appear to be the required instruction sequence (dsxpp_1 dst src; dsxpp_1.p dst src) as dropping either instruction fails the testsuite. Fixes dEQP-VK.glsl.derivate.* Reviewed-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3494> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3494>	2020-01-23 17:38:29 +00:00
Eric Anholt	876824908d	freedreno/ir3: Plumb the ir3_shader_variant into legalize. legalize is computing a lot of state that goes in the variant, let's just store it directly instead of passing pointers around. This leaves max_bary in place, which is doing some surprising work (overwriting the original total_in in some cases). Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3494>	2020-01-23 17:38:29 +00:00
Anthony Pesch	f77369086c	util/hash_table: update users to use new optimal integer hash functions Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>	2020-01-23 17:06:57 +00:00
Anthony Pesch	1496cc92f6	util/hash_table: added hash functions for integer types A few hash_table users roll their own integer hash functions which call _mesa_hash_data to perform the hashing which ultimately calls into XXH32 with a dynamic key length. When using small keys with a constant size the hash rate can be greatly improved by inlining XXH32 and providing it a constant key length, see: https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html Additionally, this patch removes calls to _mesa_key_hash_string and makes them instead call _mesa_hash_string directly, matching the new integer hash functions. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>	2020-01-23 17:06:57 +00:00
Anthony Pesch	931388ceca	util/hash_table: replace _mesa_hash_data's fnv1a hash function with xxhash For most key sizes, xxhash outperforms fnv1a's hash rate substantially (bug 2153). In particular, the V3D driver hashes multiple ~200 byte keys as part of the shader cache lookup which can easily eat up 10-20% of the runtime on the Raspberry Pi. Swapping over to xxhash drops this to ~1% of the runtime. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>	2020-01-23 17:06:57 +00:00
Anthony Pesch	032f8807f7	util: move fnv1a hash implementation into its own header Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>	2020-01-23 17:06:57 +00:00
Anthony Pesch	17fac0e32d	util: import xxhash Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>	2020-01-23 17:06:57 +00:00
Michel Dänzer	552028c013	winsys/amdgpu: Close KMS handles for other DRM file descriptions When a BO or amdgpu_screen_winsys is destroyed. Should fix leaking such BOs in other DRM file descriptions. v2: * Pass the correct file descriptor to drmIoctl (Pierre-Eric Pelloux-Prayer) * Use _mesa_hash_table_remove v3: * Close handles in amdgpu_winsys_unref as well v4: * Adapt to amdgpu_winsys::sws_list_lock. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2270 Fixes: `11a3679e3a` "winsys/amdgpu: Make KMS handles valid for original DRM file descriptor" Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>	2020-01-23 17:39:34 +01:00
Michel Dänzer	b60f5cbc15	winsys/amdgpu: Re-use amdgpu_screen_winsys when possible Namely, if os_same_file_description determined that the DRM file descriptor references the same file description. v2: * Adapt to amdgpu_winsys::sws_list_lock. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>	2020-01-23 17:39:34 +01:00
Michel Dänzer	f76cbc7901	util: Add os_same_file_description helper Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>	2020-01-23 17:39:34 +01:00
Michel Dänzer	c6468f66c7	winsys/amdgpu: Only re-export KMS handles for different DRM FDs When the amdgpu_screen_winsys uses the same FD as the amdgpu_winsys (which is always the case for the first amdgpu_screen_winsys), we can just use bo->u.real.kms_handle. v2: * Also only create the kms_handles hash table if the amdgpu_screen_winsys fd is different from the amdgpu_winsys one. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>	2020-01-23 17:39:34 +01:00
Michel Dänzer	24075ac60f	winsys/amdgpu: Keep track of retrieved KMS handles using hash tables The assumption being that KMS handles are only retrieved for relatively few BOs, so hash tables should be efficient both in terms of performance and memory consumption. We use the address of struct amdgpu_winsys_bo as the key and its kms_handle field (the KMS handle valid for the DRM file descriptor passed to amdgpu_device_initialize) as the hash value. v2: * Add comment above amdgpu_screen_winsys::kms_handles (Pierre-Eric Pelloux-Prayer) v3: * Protect kms_handles hash table with amdgpu_winsys::sws_list_lock mutex. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>	2020-01-23 17:24:00 +01:00
Michel Dänzer	f4010a6da9	winsys/amdgpu: Keep a list of amdgpu_screen_winsyses in amdgpu_winsys v2: * Add dedicated mutex for the list. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>	2020-01-23 17:23:32 +01:00
Samuel Pitoiset	8d5203dad2	aco: implement nir_op_f2i64/nir_op_f2u64 on GFX6 V_TRUNC_F64 and V_FLOOR_F64 need to be lowered on GFX6. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:48 +01:00
Samuel Pitoiset	4d92601715	aco: implement 64-bit nir_op_ffloor on GFX6 GFX6 doesn't have V_FLOOR_F64, it needs to be lowered. Loosely based on the AMDGPU LLVM backend. Introduce a new function because it will be useful for some other 64-bit operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:45 +01:00
Samuel Pitoiset	fbd169e421	aco: implement 64-bit nir_op_fround_even on GFX6 GFX6 doesn't have V_RNDNE_F64, it needs to be lowered. Loosely based on the AMDGPU LLVM backend. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:42 +01:00
Samuel Pitoiset	87588801d3	aco: implement 64-bit nir_op_fceil on GFX6 GFX6 doesn't have V_CEIL_F64, it needs to be lowered. Loosely based on the AMDGPU LLVM backend. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:38 +01:00
Samuel Pitoiset	aad5176c58	aco: implement 64-bit nir_op_ftrunc on GFX6 GFX6 doesn't have V_TRUNC_F64, it needs to be lowered. Loosely based on the AMDGPU LLVM backend. Introduce a new function because it will be useful for some other 64-bit operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:34 +01:00
Samuel Pitoiset	36e7a5f5b9	aco: implement nir_intrinsic_global_atomic_* on GFX6 GFX6 doesn't have FLAT instructions, use MUBUF instructions instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:30 +01:00
Samuel Pitoiset	22d8822683	aco: implement nir_intrinsic_load_global on GFX6 GFX6 doesn't have FLAT instructions, use MUBUF instructions instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:27 +01:00
Samuel Pitoiset	d6af7571c2	aco: implement nir_intrinsic_store_global on GFX6 GFX6 doesn't have FLAT instructions, use MUBUF instructions instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:24 +01:00
Samuel Pitoiset	01f0bef71e	aco: fix wrong IR in nir_intrinsic_load_barycentric_at_sample Only GFX6 was affected, my mistake. The total number of SGPR operands should be 4 when we want to create a vec4. Fixes: `dbdf3b3ef9` ("aco: implement nir_intrinsic_load_barycentric_at_sample on GFX6") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3477>	2020-01-23 14:40:21 +01:00
Lionel Landwerlin	d101907de9	anv/iris: warn gen12 3DSTATE_HS restriction This should never happen but better off documenting it in case someone plays with max threads numbers. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3489> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3489>	2020-01-23 15:06:59 +02:00
Krzysztof Raszkowski	bf74a7f092	gallium/swr: add option for static link Set swr-shared to 'false' to link SWR statically into Mesa. Only one swr arch can be specified if swr-shared is set to false. Reviewed-by: Jan Zielinski <jan.zielinski@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3510> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3510>	2020-01-23 12:20:24 +00:00
Samuel Pitoiset	54e54ec3e8	aco: fix printing assembly with CLRXdisasm on GFX6 We thought that CLRXdisasm allowed gfx600 as well as gfx700 but it actually doesn't. Use the family for GFX6 chips instead. Fixes: `0099f85232` ("aco: print assembly with CLRXdisasm for GFX6-GFX7 if found on the system") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3531> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3531>	2020-01-23 11:34:37 +00:00
Pierre Moreau	dda542e912	clover/meson: Define OpenCL header macros Rather than defining the macros any time right before including an OpenCL header, set Meson to define them for the whole clover project. Reviewed-by: Karol Herbst <kherbst@redhat.com> Acked-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3137> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3137>	2020-01-23 11:12:33 +00:00
Pierre Moreau	dd756b704f	clover: Use the dispatch table type from the OpenCL headers Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2243 Reviewed-by: Karol Herbst <kherbst@redhat.com> Acked-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3137>	2020-01-23 11:12:33 +00:00
Pierre Moreau	cd1c661cfc	include/CL: Update OpenCL headers to latest This latest update contains a new header that defines the dispatch table structure in order to avoid OpenCL implementations having to define it themselves. Reviewed-by: Karol Herbst <kherbst@redhat.com> Acked-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3137>	2020-01-23 11:12:33 +00:00
Samuel Pitoiset	12fe19ba3b	radv: advertise VK_AMD_shader_fragment_mask Only for GFX8+ because it's untested on older generations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>	2020-01-23 10:48:02 +00:00
Samuel Pitoiset	e030aef32c	aco: add support for nir_texop_fragment_{mask}_fetch Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>	2020-01-23 10:48:02 +00:00
Samuel Pitoiset	9e477d79b7	ac/nir: add support for nir_texop_fragment_{mask}_fetch Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>	2020-01-23 10:48:02 +00:00
Samuel Pitoiset	84b08971fb	nir/lower_input_attachments: lower nir_texop_fragment_{mask}_fetch These instructions are allowed to fetch from multisampled subpass input attachments. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>	2020-01-23 10:48:02 +00:00
Samuel Pitoiset	76a34f5d3f	spirv: add support for SpvOpFragment{Mask}FetchAMD operations nir_tex_src_ms_index is re-used for the fragment index with nir_texop_fragment_fetch to avoid introducing a new texture source type. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>	2020-01-23 10:48:02 +00:00
Samuel Pitoiset	603e6ba972	nir: add two new texture ops for multisample fragment color/mask fetches This introduces: - nir_texop_fragment_mask_fetch (fetch a fragment mask from a compressed multisampled color surface) - nir_texop_fragment_fetch (fetch a color fragment for a particular sample at corresponding fragment mask index). These two texture operations are necessary for implementing SPV_AMD_shader_fragment_mask. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>	2020-01-23 10:48:02 +00:00
Samuel Pitoiset	dea29b3818	spirv: add SpvCapabilityFragmentMaskAMD This new capability is for SPV_AMD_shader_fragment_mask. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>	2020-01-23 10:48:02 +00:00
Samuel Pitoiset	e60de08547	radv: handle missing implicit subpass dependencies When a subpass doesn't declare an explicit dependency from/to VK_SUBPASS_EXTERNAL, Vulkan says there is an implicit dependency. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3330> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3330>	2020-01-23 11:25:41 +01:00
Samuel Pitoiset	0d2da2a8c0	radv: add explicit external subpass dependencies to meta operations No functional changes because a subpass dependency with dstStageMask set to VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT is a no-op. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3330>	2020-01-23 11:25:38 +01:00
Dave Airlie	48ab21109c	gallivm: fix find lsb the GLSL return value is different than the llvm intrinsic. Fixes arb gpu shader5 tests Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3528> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3528>	2020-01-23 13:48:16 +10:00
Dave Airlie	1e433c398e	gallivm: fix gather offset casting cast texture offsets to 32-bit integers Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3528>	2020-01-23 13:48:16 +10:00
Dave Airlie	fc9d67394d	llvmpipe: fix some integer instruction lowering. We want to lower to shifts for bitfields, and lower ifind_msb. Fixes a bunch of gpu shader5 tests. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3528>	2020-01-23 13:48:16 +10:00
Dave Airlie	6c88c81df9	gallivm: fix gather component handling. Fixes the extended gather test for gpu shader5 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3528>	2020-01-23 13:48:16 +10:00
Eric Anholt	65e432695d	turnip: Add support for uniform texel buffers. Pretty straightforward: Port texture descriptor code from freedreno, fill in alignment limits from closed vk, and tu_cmd_buffer.c was already uploading the texture descriptor. This doesn't implement storage texel buffers (required in the compute pipeline) yet, since those will need an IBO descriptor for the store path. Still, making the load path be connected to the texture descriptor won't hurt. Part of #2237 Fixes dEQP-VK.binding_model.shader_access.primary_cmd_buf.uniform_texel_buffer.* Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3522> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3522>	2020-01-23 02:40:09 +00:00
Kenneth Graunke	8dc0540a17	intel: Fix aux map alignments on 32-bit builds. ALIGN() brilliantly uses uintptr_t, making it unsafe for use with 64-bit GPU addresses in 32-bit builds of the driver. Use align64() instead, which uses uint64_t. Fixes assertion failures when running any 32-bit program on Tigerlake. Fixes: `2e6a7ced4d` ("iris/gen12: Write GFX_AUX_TABLE base address register") Fixes: `0d0290bb3f` ("intel/common: Add surface to aux map translation table support") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3507> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3507>	2020-01-23 02:16:50 +00:00
Matt Turner	4413537c80	util: Remove tmp argument from BITSET_FOREACH_SET macro Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499>	2020-01-23 01:52:43 +00:00
Matt Turner	d3eb2a0951	util: Explain BITSET_FOREACH_SET params __size, in particular, makes this macro rather confusing to understand how to use. Hopefully this comment saves future users the headache. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499>	2020-01-23 01:52:42 +00:00
Vasily Khoruzhick	60f9b45802	lima: implement invalidate_resource() We don't need to resolve invalidated resources, so it should improve performance for applications that are doing this hint. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3476> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3476>	2020-01-23 01:26:23 +00:00
Timothy Arceri	bf830250a7	glsl_to_nir: update interface type properly Since `76ba225184` the member variable types were being redefined but we assigned the old interface type to the variable. In a following patch series we will use the types to check if we are dealing with an interface instance when applying GLSL linking rules. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3468> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3468>	2020-01-23 01:02:25 +00:00
Timothy Arceri	d3a4d1775e	glsl: count uniform components and storage better in nir linking This helps avoid incorrect validation error when linking glsl shaders and avoids assigning uniform storage slots that will never be used. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3468>	2020-01-23 01:02:25 +00:00
Timothy Arceri	e5b3cf433e	glsl: fix check for matrices in blocks when using nir uniform linker We need to strip any arrays before checking the type. Here we just use the uniform type which has already been stripped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3468>	2020-01-23 01:02:25 +00:00
Timothy Arceri	55e4410b34	glsl: remove bogus assert in nir uniform linking I'm not sure why this was first added but it causes an assert on any uniform matrix. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3468>	2020-01-23 01:02:25 +00:00
Ian Romanick	b065d8fb8c	nir/algebraic: Optimize some 64-bit integer comparisons involving zero I noticed that we can do better for these kinds of comparisons while working on the lowering for iadd_sat@64 and isub_sat@64. This eliminated 11 instructions from the fs-addSaturate-int64.shader_test. My hope is that this will improve the run-time of int64 tests on Ice Lake. I have no data to support or refute this. Unsurprisingly, no changes on shader-db. v2: Condition the min and max patterns with nir_lower_minmax64. Suggested by Caio. Very long discussion in the MR. :) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	c57338b924	anv: Enable SPV_INTEL_shader_integer_functions2 and VK_INTEL_shader_integer_functions2 Currently only implemented in the scalar backend, so only enable for Gen8+. If support for the other opcodes is added to the vec4 backend, Gen7 could be supported. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	76970940a6	iris: Enable INTEL_shader_integer_functions2 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	b14e718e68	gallium: Add a cap bit for integer multiplication between 32-bit and 16-bit Driver supports integer multiplication between a 32-bit integer and a 16-bit integer. If the second operand is 32-bits, the upper 16-bits are ignored, and the low 16-bits are possibly sign extended as necessary. Iris will eventually enable this. Not sure about other drivers. v2: Add default value to u_screen.c. Suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	9db20748fd	gallium: Add a cap bit for OpenCL-style extended integer functions Iris will eventually enable this. Looking at the header files, it looks like Midgard could also enable it. Basically, any GPU that fully supports OpenCL can. v2: Add default value to u_screen.c. Suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	4e9079d0c7	i965: Enable INTEL_shader_integer_functions2 on Gen8+ v2: Use new lower_hadd64 and lower_usub_sat64 flags. v3: Enable SPIR-V capability. v4: Move lowering options to COMMON_SCALAR_OPTIONS. Suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	4fcddb55f2	spirv: Add support for IntegerFunctions2INTEL capability Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	aa56934e2a	spirv: Silence a bunch of unused parameter warnings The change to get_uniform_nir_atomic_op makes it look like the other get_*_nir_atomic_op functions. The rest just add UNUSED or ASSERTED to parameters required for some of the interfaces. src/compiler/spirv/spirv_to_nir.c: In function ‘struct_member_decoration_cb’: src/compiler/spirv/spirv_to_nir.c:673:47: warning: unused parameter ‘val’ [-Wunused-parameter] struct vtn_value val, int member, ^~~ src/compiler/spirv/spirv_to_nir.c: In function ‘struct_member_matrix_stride_cb’: src/compiler/spirv/spirv_to_nir.c:778:50: warning: unused parameter ‘val’ [-Wunused-parameter] struct vtn_value val, int member, ^~~ src/compiler/spirv/spirv_to_nir.c: In function ‘type_decoration_cb’: src/compiler/spirv/spirv_to_nir.c:805:61: warning: unused parameter ‘ctx’ [-Wunused-parameter] const struct vtn_decoration dec, void ctx) ^~~ src/compiler/spirv/spirv_to_nir.c: In function ‘spec_constant_decoration_cb’: src/compiler/spirv/spirv_to_nir.c:1359:70: warning: unused parameter ‘v’ [-Wunused-parameter] spec_constant_decoration_cb(struct vtn_builder b, struct vtn_value v, ^ src/compiler/spirv/spirv_to_nir.c: In function ‘handle_workgroup_size_decoration_cb’: src/compiler/spirv/spirv_to_nir.c:1407:43: warning: unused parameter ‘data’ [-Wunused-parameter] void data) ^~~~ src/compiler/spirv/spirv_to_nir.c: In function ‘vtn_handle_function_call’: src/compiler/spirv/spirv_to_nir.c:1806:55: warning: unused parameter ‘opcode’ [-Wunused-parameter] vtn_handle_function_call(struct vtn_builder b, SpvOp opcode, ^~~~~~ src/compiler/spirv/spirv_to_nir.c:1807:54: warning: unused parameter ‘count’ [-Wunused-parameter] const uint32_t w, unsigned count) ^~~~~ src/compiler/spirv/spirv_to_nir.c: In function ‘get_uniform_nir_atomic_op’: src/compiler/spirv/spirv_to_nir.c:2548:47: warning: unused parameter ‘b’ [-Wunused-parameter] get_uniform_nir_atomic_op(struct vtn_builder b, SpvOp opcode) ^ src/compiler/spirv/spirv_to_nir.c: In function ‘vtn_handle_atomics’: src/compiler/spirv/spirv_to_nir.c:2633:48: warning: unused parameter ‘count’ [-Wunused-parameter] const uint32_t w, unsigned count) ^~~~~ src/compiler/spirv/spirv_to_nir.c: In function ‘vtn_handle_barrier’: src/compiler/spirv/spirv_to_nir.c:3197:48: warning: unused parameter ‘count’ [-Wunused-parameter] const uint32_t w, unsigned count) ^~~~~ src/compiler/spirv/spirv_to_nir.c: In function ‘vtn_handle_execution_mode’: src/compiler/spirv/spirv_to_nir.c:3618:68: warning: unused parameter ‘data’ [-Wunused-parameter] const struct vtn_decoration mode, void *data) ^~~~ Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	44471a76e9	nir/spirv: Translate SPIR-V to NIR for new INTEL_shader_integer_functions2 opcodes v2: Rebase on `272e927d0e` ("nir/spirv: initial handling of OpenCL.std extension opcodes") v3: Add missing SpvOpUCountTrailingZerosINTEL case to switch in vtn_handle_body_instruction. Remove stray semicolon in vtn_nir_alu_op_for_spirv_opcode. Use umin instead of umax for SpvOpUCountTrailingZerosINTEL "lowering" in vtn_handle_alu. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	de6c0f8487	intel/fs: Implement support for NIR opcodes for INTEL_shader_integer_functions2 v2: Remove smashing type to D for nir_op_irhadd. Caio noticed it was odd, and removing it fixes an assertion failure in the crucible func.shader.averageRounded.int64_t test (because the source should be W). v3: Emit BRW_OPCODE_MUL directly for nir_op_umul_32x16 and nir_op_imul_32x16. Suggested by Curro. v4: Smash types of MUL instruction generated for nir_op_umul_32x16 and nir_op_imul_32x16. With this change, I get the same assembly now as I did with v2. v5: Remove support for pre-Gen7. The integer multiply path was incorrect, and, since the extension isn't enabled pre-Gen7, there's no way to test it. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	58907568ec	intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-ops v2: Add a big comment explaining the [IU]SUB_SAT lowering. Suggested by Caio. v3: Use get_fpu_lowered_simd_width in get_lowered_simd_width. Suggested by Ken on IRC. v4: Fix a typo in a comment. Noticed by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	74cd0964d6	intel/fs: Don't lower integer multiplies that don't need lowering v2: Move the check to fs_visitor::lower_integer_multiplication. Previously, in the cases where lowering was skipped, the original instruction was removed by fs_visitor::lower_integer_multiplication. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	db649fd582	compiler: Translate GLSL IR to NIR for new INTEL_shader_integer_functions2 expressions v2: Rebase on `272e927d0e` ("nir/spirv: initial handling of OpenCL.std extension opcodes") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	d3d970166c	nir/algebraic: Add lowering for 64-bit iadd_sat and isub_sat v2: Rearranged and expand the comment about the optimizations applied to the lowering. Suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	dcadbd2dd2	nir/algebraic: Add lowering for 64-bit uadd_sat Fixes piglit fs-addsaturate-uint64 and vs-addsaturate-uint64 on Ice Lake. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	1bdfc6d7cb	nir/algebraic: Add lowering for 64-bit usub_sat v2: Rebase on `272e927d0e` ("nir/spirv: initial handling of OpenCL.std extension opcodes") v3: Add a new lower_usub_sat64 flag that only applies to the 64-bit version of the nir_op_usub_sat instruction. v4: Also enable the lowering when nir_lower_iadd64 is set. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v3] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	a483771045	nir/algebraic: Add lowering for 64-bit hadd and rhadd v2: Rebase on `272e927d0e` ("nir/spirv: initial handling of OpenCL.std extension opcodes") v3: Add a new lower_hadd64 flag that only applies to the 64-bit versions of the instructions. v4: Also enable the lowering when nir_lower_iadd64 is set. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v3] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	ea435560ee	nir/algebraic: Add lowering for uabs_usub and uabs_isub v2: Remove some rebase failures noticed by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	21f0d020fe	nir: Add new instructions for INTEL_shader_integer_functions2 uctz isn't added because it will be implemented in the GLSL path and the SPIR-V path using other pre-existing instructions. v2: Avoid signed integer overflow for uabs_isub(0, INT_MIN). Noticed by Caio. v3: Alternate fix for signed integer overflow for abs_sub(0, INT_MIN). I tried the previous method in a small test program with -ftrapv, and it failed. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v1] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	cb518df775	glsl: Add built-in functions for INTEL_shader_integer_functions2 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	5eda9f5832	glsl_types: Add function to get an unsigned base type from a signed type Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	1d165b0548	glsl: Add new expressions for INTEL_shader_integer_functions2 v2: Re-write iadd64_saturate and isub64_saturate to avoid undefined overflow behavior. Also fix copy-and-paste bug in isub64_saturate. Suggested by Caio. v3: Avoid signed integer overflow for abs_sub(0, INT_MIN). Noticed by Caio. v4: Alternate fix for signed integer overflow for abs_sub(0, INT_MIN). I tried the previous method in a small test program with -ftrapv, and it failed. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Ian Romanick	20d34c4ebf	mesa: Extension boilerplate for INTEL_shader_integer_functions2 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>	2020-01-23 00:18:57 +00:00
Matt Turner	88a0523bd2	intel/compiler: Move Gen4/5 rounding to visitor Gen4/5's rounding instructions operate differently than later Gens'. They all return the floor of the input and the "Round-increment" conditional modifier answers whether the result should be incremented by 1.0 to get the appropriate result for the operation (and thus its behavior is determined by the round opcode; e.g., RNDZ vs RNDE). Since this requires a second instruction (a predicated ADD) that consumes the result of the round instruction, the round instruction cannot write its result directly to the (write-only) message registers. By emitting the ADD in the generator, the backend thinks it's safe to store the round's result directly to the message register file. To avoid this, we move the emission of the ADD instruction to the NIR translator so that the backend has the information it needs. I suspect this also fixes code generated for RNDZ.SAT but since Gen4/5 don't support GLSL 1.30 which adds the trunc() function, I couldn't write a piglit test to confirm. My thinking is that if x=-0.5: sat(trunc(-0.5)) = 0.0 But on Gen4/5 where sat(trunc(x)) is implemented as rndz.r.f0 result, x // result = floor(x) // set f0 if increment needed (+f0) add result, result, 1.0 // fixup so result = trunc(x) then putting saturate on both instructions will give the wrong result. floor(-0.5) = -1.0 sat(floor(-0.5)) = 0.0 // +1 increment would be needed since floor(-0.5) != trunc(-0.5) sat(sat(floor(-0.5)) + 1.0) = 1.0 Fixes: `6f394343b1` ("nir/algebraic: i2f(f2i()) -> trunc()") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2355 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3459> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3459>	2020-01-22 23:47:02 +00:00
Samuel Thibault	2fd85105c6	meson: Do not require libdrm for DRI2 on hurd Cc: 19.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3231> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3231>	2020-01-22 23:15:05 +00:00
Samuel Thibault	4f52425159	util: Do not fail to build on unknown pthread_setname_np This is only used for debugging, so it is better to make porting to various systems less hard. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3229> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3229>	2020-01-22 22:39:57 +00:00
Samuel Thibault	e45dc93136	loader: #define PATH_MAX when undefined (eg. Hurd) Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3228> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3228>	2020-01-22 22:10:29 +00:00
Eric Engestrom	d60b8fd3cb	util/atomic: fix return type of p_atomic_add_return() fallback Fixes: `385d13f26d` ("util/atomic: Add a _return variant of p_atomic_add") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3012> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3012>	2020-01-22 21:42:52 +00:00
James Xiong	ac0219cc5b	gallium: dmabuf support for yuv formats that are not natively supported V2 (Kenneth Graunke): added a helper function to check if every format is supported Signed-off-by: James Xiong <james.xiong@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2846> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2846>	2020-01-22 21:18:49 +00:00
Emmanuel Gil Peyrot	5f78524d9b	intel/compiler: Return early if read() failed This was the only warning I could see while compiling Iris. Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2821> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2821>	2020-01-22 20:52:47 +00:00
Alan Coopersmith	8490b7d917	intel/perf: adapt to platforms like Solaris without d_type in struct dirent Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> [Eric: factor out the is_dir_or_link() check and fix a bug in v1] Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> v3: include directory path when lstat'ing files v4: fix inverted check in enumerate_sysfs_metrics() Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2258> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2258>	2020-01-22 20:23:51 +00:00
Eric Engestrom	8f140422ed	llvmpipe: drop LLVM < 3.4 support We don't support < 3.9 anymore, so this code is dead. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2760> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2760>	2020-01-22 11:21:13 -08:00
Eric Engestrom	7d7d1da1ac	egl: drop confusing mincore() error message A user came to me asking how to fix this error, but it's entirely expected that `get_wl_surface_proxy()` on recent enough wayland compositors will always print it. Let's just remove the message altogether, it is basically never useful. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3219> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3219>	2020-01-22 17:55:26 +00:00
Rhys Perry	15a1cc00d3	aco: fix off-by-one error when initializing sgpr_live_in Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2394 Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3511> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3511>	2020-01-22 17:23:30 +00:00
Samuel Pitoiset	bd51538d28	radv: fix double free corruption in radv_alloc_memory() If the driver fails to allocate memory for some reason, it shouldn't free the 'mem' object twice. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2302 Fixes: `825ddfee59` ("radv: Handle device memory alloc failure with normal free.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3508> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3508>	2020-01-22 17:01:16 +00:00
Michel Dänzer	5a6a88f58c	gitlab-ci: Use single if for manual job rules entry I thought multiple ifs would all need to match, but looks like only the last one (or either one?) does. This should prevent a manual pipeline from getting created after merging changes which can't affect the pipeline. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3474> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3474>	2020-01-22 16:42:11 +00:00
Michel Dänzer	2dd0cc60f1	gitlab-ci: Set GIT_STRATEGY to none for the dummy job It doesn't need anything from the Git repository. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3474>	2020-01-22 16:42:11 +00:00
X512	eb40c0adfc	util/u_thread: Fix build under Haiku	2020-01-22 16:21:54 +00:00
Alexander von Gluck IV	49d2a066c2	haiku/hgl: Fix build via header reordering	2020-01-22 16:21:54 +00:00
Rhys Perry	3f96a1ed86	aco: fix operand kill flags when a temporary is used more than once Helps create v_mac_f32 from v_mad_f32(b, a, b) Totals from affected shaders: SGPRS: 35824 -> 35824 (0.00 %) VGPRS: 33460 -> 33456 (-0.01 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 2187264 -> 2180976 (-0.29 %) bytes LDS: 127 -> 127 (0.00 %) blocks Max Waves: 3802 -> 3802 (0.00 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3486> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3486>	2020-01-22 15:55:00 +00:00
Boris Brezillon	5b810c7de3	panfrost/midgard: Add missing lowering passes for type/size conversion ops Replace the manual type/size conversion lowering description by one that's automatically generated and covers all type/size conversions. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>	2020-01-22 15:31:28 +00:00
Boris Brezillon	fcceeaffae	panfrost/midgard: Add 64 bits float <-> int converters The 64 bit converter cases were missing, add them now. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>	2020-01-22 15:31:28 +00:00
Boris Brezillon	fe5fbadd46	panfrost/midgard: Fix mir_print_instruction() for branch instructions Branch instructions should not be treated as regular ALUs. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>	2020-01-22 15:31:28 +00:00
Boris Brezillon	e1f9e8d60b	panfrost/midgard: Add f2f64 support So we can convert floats into doubles. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>	2020-01-22 15:31:28 +00:00
Boris Brezillon	f53a0799c7	panfrost/midgard: Factorize f2f and u2u handling Those size conversion operations work the same way apart from f2f using an fmov op code and u2u using an imov. Let's handle them in the same case block to avoid code duplication. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>	2020-01-22 15:31:28 +00:00
Boris Brezillon	6548d01b3d	panfrost/midgard: Make sure promote_fmov() only promotes 32-bit imovs mir_constant_float() assumes we're dealing with 32-bit integers/floats, which is only the case if reg_mode is equal to midgard_reg_mode_32. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>	2020-01-22 15:31:28 +00:00
Boris Brezillon	9566f26ed4	panfrost/midgard: Rework mir_adjust_constants() to make it type/size agnostic Right now, constant combining is not supported in 16 bit mode, and 64 bit mode is simply ignored. Let's rework the function to make it type/bit-size agnostic. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>	2020-01-22 15:31:28 +00:00
Boris Brezillon	15c92d158c	panfrost/midgard: Use a union to manipulate embedded constants Each instruction bundle can contain up to 16 constant bytes. The meaning of those bytes is instruction dependent: it depends on the instruction native type (int, uint or float) and the instruction reg_mode (8, 16, 32 or 64 bit). Those different layouts can be exposed as a union to facilitate constants manipulation. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>	2020-01-22 15:31:28 +00:00
Lionel Landwerlin	63461cb7e1	anv: ensure prog params are initialized with 0s As a result of `9baa33cef0` our backend compiler leaves params pretty much untouched. So in order to avoid storing uninitialized values in the shader cache blobs, just 0 out this array. I've considered not even allocating this array which works on gen8+ but the vec4 backend still makes a copy of this array and so it crashes on memcpy on HSW. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `9baa33cef0` ("anv: Rework push constant handling") Reported-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3516> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3516>	2020-01-22 16:47:55 +02:00
Alyssa Rosenzweig	4936120230	panfrost: Fix crash in compute variant allocation Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Fixes: `d8a3501f1b` ("panfrost: Dynamically allocate shader variants") Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3515> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3515>	2020-01-22 13:48:24 +00:00
Guido Günther	d817f2c696	etnaviv: drm: Don't miscalculate timeout The current code overflows (s * 1000000000) for s >= 5 but that is e.g. used in etna_bo_cpu_prep. Signed-off-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3509> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3509>	2020-01-22 13:22:47 +00:00
Alexander van der Grinten	047162d99c	egl: Fix _eglPointerIsDereferencable w/o mincore() On platforms without mincore(), _eglPointerIsDereferencable() currently just checks whether p != NULL. This is not sufficient: In the Wayland platform code (i.e., in get_wl_surface_proxy()), _eglPointerIsDereferencable() is called on the version field of `struct wl_egl_window` which is 3 on current versions of Wayland. This causes a segfault when trying to dereference p. Fix this behavior by assuming that the first page of the process is never dereferencable. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3103> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3103>	2020-01-22 12:55:05 +00:00
Tapani Pälli	39e7492d33	egl/android: fix buffer_count for applications setting max count The problem with the previous solution was that it did not take into account that some applications may set a max count for buffers. Therefore we need to query both min and max and clamp our setting based on that. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2373 Fixes: `be08e6a449` ("egl/android: Restrict minimum triple buffering for android color_buffers") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3480> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3480>	2020-01-22 10:37:04 +00:00
Timur Kristóf	1c9ecb2123	aco: Fix signedness compare warning. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483>	2020-01-22 11:09:17 +01:00
Timur Kristóf	533a20dbd5	aco: Fix maybe-uninitialized warnings. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483>	2020-01-22 11:09:14 +01:00
Timur Kristóf	6fb3df2786	aco: Fix -Wstringop-overflow warnings in aco_span. GCC does not understand how aco_span works. This patch fixes it by casting the aco_span's this pointer to uintptr_t rather than to a char pointer, effectively telling GCC not to try to figure it out. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3483>	2020-01-22 11:09:10 +01:00
Timur Kristóf	75e5720e1a	radeon: Fix multiple definition error with radeon_debug Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3488> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3488>	2020-01-22 09:36:28 +01:00
Timur Kristóf	8e22df3aec	gallium: Fix a couple of multiple definition warnings. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3488>	2020-01-22 09:36:25 +01:00
Timur Kristóf	a134ac5ee9	r600: Move get_pic_param to radeon_vce.c Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3488>	2020-01-22 09:36:23 +01:00
Timur Kristóf	b7f9759809	radeon: Move si_get_pic_param to radeon_vce.c Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3488>	2020-01-22 09:36:16 +01:00
Timur Kristóf	e45ea781f8	intel/compiler: Fix array bounds warning on GCC 10. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-01-22 08:35:18 +01:00
Eric Anholt	3abfde13be	turnip: Add support for non-zero (still constant) UBO buffer indices. This was actually all ready to go at this point, and just needed to increment by the value. Fixes dEQP-VK.binding_model.shader_access.primary_cmd_buf.uniform_buffer.* Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3504> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3504>	2020-01-22 02:13:38 +00:00
Jonathan Marek	5f791df0d0	turnip: fix array/matrix varyings Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3109> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3109>	2020-01-21 20:36:08 -05:00
Jonathan Marek	c171765223	turnip: remove tu_sort_variables_by_location nir_assign_io_var_locations already does sorting. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3109>	2020-01-21 20:36:08 -05:00
Jonathan Marek	1736447f27	freedreno/ir3: allow inputs with the same location turnip can have multiple inputs with the same location, and different location_frac. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3109>	2020-01-21 20:36:08 -05:00
Matt Turner	17c9ec94f5	gitlab-ci: Skip ext_timer_query/time-elapsed This test's result is unpredictable, so it may occasionally pass when we expect it to fail, thus causing the CI pipeline to fail. Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3498> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3498>	2020-01-22 00:53:48 +00:00
Matt Turner	68cfc65ccb	intel/compiler: Test compaction on Gen <= 12 With the previous commits we can now enable the unit test on Gen <= 12. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:21 +00:00
Matt Turner	22462ba242	intel/compiler: Validate fuzzed instructions ... before giving them to the instruction compactor. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:21 +00:00
Matt Turner	72cf63cfc6	intel/compiler: Add unit tests for new EU validation checks Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:21 +00:00
Matt Turner	5f4eacaeda	intel/compiler: Validate some instruction word encodings Specifically, execution size, register file, and register type. I did not add validation for vertical stride and width because I don't believe it's possible to have an otherwise valid instruction with an invalid vertical stride or width, due to all of the other regioning restrictions. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:21 +00:00
Matt Turner	0fc490cdee	intel/compiler: Factor out brw_validate_instruction() In order to fuzz test instructions, we first need to do some sanity checking. Factoring out this function gives us an easy way to validate a single instruction. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:21 +00:00
Matt Turner	40f0ade68e	intel/compiler: Handle invalid compacted immediates 16-bit immediates need to be replicated through the 32-bit immediate field, so we should never see one that isn't. This does happen however in the fuzzer unit test, so returning false allows the fuzzer to reject this case. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:21 +00:00
Matt Turner	205cb8a139	intel/compiler: Handle invalid inputs to brw_reg_type_to_*() Necessary to handle these cases when we test fuzzed instructions. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:21 +00:00
Matt Turner	741cf9a104	intel/compiler: Split hw_type tables Previously we were sharing tables between generations that were nearly identical (i.e., Gen8 3-src adds HF support) and used a small bit of code to handle the differences. This is kind of a mess if you want to reject 64-bit types on platforms that don't support 64-bit types, so split the tables, allowing each generation's table to list exactly what it supports. Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:21 +00:00
Matt Turner	0b70d46f7a	intel/compiler: Add INVALID_{,HW_}REG_TYPE macros Since the enum brw_reg_type is packed, comparisons with -1 don't work directly, necessitating the cast. Add a macro to avoid this confusion. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:20 +00:00
Matt Turner	ab7c25b9aa	intel/compiler: Add NF in some more places Necessary to handle these cases when we test fuzzed instructions. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:20 +00:00
Matt Turner	8634286c5d	intel/compiler: Limit compaction unit tests to specific gens Two of the tests emit instructions with MRF destinations, and MRFs aren't present on Gen7+. I think we were just lucky that this didn't cause a problem earlier since we were running the tests on Gen7-9. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:20 +00:00
Matt Turner	713c123bfa	intel/compiler: Don't disassemble align1 3-src operands on Gen < 10 Since the platforms don't support align1 3-src instructions, the contents of these operands are not going to be meaningful. Just don't print them to avoid hitting some assertions in brw_inst functions. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:20 +00:00
Matt Turner	49c21802cb	intel/compiler: Split has_64bit_types into float/int Gen7 has 64-bit floats but not 64-bit ints. Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:20 +00:00
Matt Turner	bb47aa2124	intel/compiler: Extract GEN_* macros into separate file Will be used by the instruction compaction unit test. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:20 +00:00
Matt Turner	c69f3ece61	intel/compiler: Use ARRAY_SIZE() Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>	2020-01-22 00:19:20 +00:00
Caio Marcelo de Oliveira Filho	45164fc8c5	intel/fs: Don't emit control barrier if only one thread is used When there's only one hardware thread (i.e. the dispatch width is greater than or equal to the workgroup size), there's no need to use a barrier to ensure all the invocations reach the same point in the shader, because they are already running in lock-step. Results for SKL running Iris for shader-db tests with compute shaders total sends in shared programs: 18361 -> 18339 (-0.12%) sends in affected programs: 904 -> 882 (-2.43%) helped: 9 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 2.44 x̃: 2 helped stats (rel) min: 0.84% max: 21.43% x̄: 7.82% x̃: 2.67% 95% mean confidence interval for sends value: -3.31 -1.58 95% mean confidence interval for sends %-change: -14.67% -0.97% Sends are helped. Shaders from Aztec Ruins, Car Chase, Manhattan and DeusEx are helped. Results for ICL and TGL are similar to SKL. Results for BDW are similar to SKL except for DeusEx shader that has a workgroup size 16 but in BDW picks the SIMD8. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3226> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3226>	2020-01-21 23:41:35 +00:00
Caio Marcelo de Oliveira Filho	4f431e870c	intel/fs: Don't emit fence for shared memory if only one thread is used When there's only one hardware thread (i.e. the dispatch width is greater than or equal to the workgroup size), there's no need to synchronize shared memory access (SLM) since all the requests from a single thread are already synchronized. In such a case, we just add a scheduling fence. To be able to identify that case for all platforms, move the handling of platforms prior to Gen11 (which don't have a separate SLM fence) after the optimization. Results for SKL running Iris for shader-db tests with compute shaders total sends in shared programs: 18395 -> 18361 (-0.18%) sends in affected programs: 938 -> 904 (-3.62%) helped: 9 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 3.78 x̃: 4 helped stats (rel) min: 1.56% max: 26.32% x̄: 10.33% x̃: 2.60% 95% mean confidence interval for sends value: -4.85 -2.71 95% mean confidence interval for sends %-change: -19.12% -1.54% Sends are helped. Shaders from Aztec Ruins, Car Chase, Manhattan and DeusEx are helped. Results for ICL and TGL are similar to SKL. Results for BDW are similar to SKL except for DeusEx shader that has a workgroup size 16 but in BDW picks the SIMD8. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3226>	2020-01-21 23:41:35 +00:00
Caio Marcelo de Oliveira Filho	ff5b74ef32	intel/fs: Add workgroup_size() helper Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3226>	2020-01-21 23:41:35 +00:00
Caio Marcelo de Oliveira Filho	18e72ee210	intel/fs: Add FS_OPCODE_SCHEDULING_FENCE Like a SHADER_OPCODE_MEMORY_FENCE but doesn't generate any assembly code. Will be used when the compiler shouldn't reorder certain instructions but there's no need to generate code for the HW to do it -- as the ordering will be guaranteed by other means. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3226>	2020-01-21 23:41:35 +00:00
Dongwon Kim	9d964da19f	gallium: check all planes' pipe formats in case of multi-samplers Current code only checks whether first plane's format is supported in case YUV format sampling is done by sampling each plane separately. It would be safer to check other planes' as well. Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2863> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2863>	2020-01-21 23:04:33 +00:00
Kenneth Graunke	d3a0d3a80b	anv: Drop some workarounds that are no longer necessary These workarounds are no longer required by 10th Gen hardware. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3495> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3495>	2020-01-21 13:58:42 -08:00
Kenneth Graunke	311cab27e2	iris: Drop some workarounds which are no longer necessary These workarounds are no longer required by 10th Gen hardware. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3495>	2020-01-21 13:58:40 -08:00
Eric Anholt	d1166a3b3a	turnip: Disable UBWC on images used as storage images. The closed GL driver doesn't use UBWC on any storage images. It does tile mostly (skipping tiling on writeonly images, it seems), but for freedreno we've been enabling tiling in all cases and it's fine. We do need to disable UBWC, as tests fail otherwise and just plugging in the equivalent UBWC regs like we were setting up a texture isn't enough. Fixes dEQP-VK.image.atomic_operations.* Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3433> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3433>	2020-01-21 19:29:59 +00:00
Eric Anholt	e5ce365cde	turnip: Add limited support for storage images. So far this doesn't handle the texture state-based storage image access loads, and doesn't support descriptor arrays (same as SSBOs). The texture side is more tricky, since we have another remapping table to work around. This is enough to get some of dEQP-VK.image.atomic_operations.* working. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3433>	2020-01-21 19:29:59 +00:00
Eric Anholt	85e424c591	turnip: Refactor the intrinsic lowering. Too many things in one function, split them out based on the intrinsic. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3433>	2020-01-21 19:29:59 +00:00
Eric Anholt	3ac662e8df	turnip: Fix some whitespace around binary operators. Conforms to mesa style and the rest of turnip. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3433>	2020-01-21 19:29:59 +00:00
Eric Anholt	6c10af95c7	radeonsi: Drop PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS. Now that we don't expose TGSI, we can stop exposing the flag. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3493> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3493>	2020-01-21 19:04:22 +00:00
Eric Anholt	609a67461d	r300: Remove a bunch of default handling of pipe caps. u_screen will return 0 for all of these, which means that this is one less driver to see in git grep when I'm checking who exposes a cap. The exception is the texel/gather offsets and stream output components, which will not be exposed since we don't expose the corresponding GLSL version. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3493>	2020-01-21 19:04:22 +00:00
Eric Anholt	e7e034e1de	r600: Remove a bunch of default handling of pipe caps. u_screen will return 0 for all of these, which means that this is one less driver to see in git grep when I'm checking who exposes a cap. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3493>	2020-01-21 19:04:22 +00:00
Eric Anholt	3e1dd99adc	radeonsi: Remove a bunch of default handling of pipe caps. u_screen will return 0 for all of these, which means that this is one less driver to see in git grep when I'm checking who exposes a cap. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3493>	2020-01-21 19:04:22 +00:00
Lionel Landwerlin	e618951322	anv: don't report error with other vendor DRM devices Enumeration should just skip unsupported DRM devices. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `34c8621c3b` ("anv: Allow enumerating multiple physical devices") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2386 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3481> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3481>	2020-01-21 18:36:26 +00:00
Eric Anholt	fb6fca0037	freedreno: Stop scattered remapping of SSBOs/images to IBOs. Just make it be all SSBOs then all storage images. The remapping table was there to make it so that the big gap present from gallium's atomic lowering would get cleaned up, but that's no longer the case. The table has made it very hard to support Vulkan storage images, so it's time for it to go. This does mean that an SSBO/IBO that is only loaded (or size-queried) will now occupy a slot in the table where it wouldn't before. This seems like a minor cost compared to being able to drop this much logic. With the remapping table gone, SSBO array handling for turnip just falls out. Fixes many array cases of dEQP-VK.binding_model.shader_access.primary_cmd_buf.storage_buffer.* Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jonathan Marek <jonathan@marek.ca> (turnip) Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>	2020-01-21 10:06:23 -08:00
Eric Anholt	7558b5da13	compiler: Add a note about how num_ssbos works in the program info. These numbers are always confusing, and it's particularly so for this field where it has a different meaning in different info structs. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>	2020-01-21 10:06:23 -08:00
Eric Anholt	d0975bfc4a	nir: Drop the ssbo_offset to atomic lowering. The arguments passed in were: - prog->info.num_ssbos - prog->nir->info.num_ssbos - arbitrary values for standalone compilers The num_ssbos should match between the prog's info and prog->nir's info until this lowering happens. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>	2020-01-21 10:06:23 -08:00
Eric Anholt	d5a3971457	gallium: Pack the atomic counters just above the SSBOs. We carve out half the SSBO space for atomics, and we were just binding them way up there. freedreno was then using a remapping table to map the sparse buffer index back down, since space in the descriptor array is a shared resource that may limit parallelism. That remapping table generated inside of the ir3 compiler is getting thoroughly in the way of implementing vulkan descriptor sets. We will be able to get rid of freedreno's remapping table, and hopefully save shared resources on other hardware, by packing the atomics tightly above the SSBOs (like i965 does). We already rebind the shader buffers on program change if either the old or new program has SSBOs or ABOs, so this doesn't necessarily increase the program state change cost (the only cost increase I can come up with is if you're using the same atomic counter without rebinding it across changes of programs with varying SSBO counts, meaning it would now bounce around index space). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>	2020-01-21 10:06:23 -08:00
Eric Anholt	10dc4ac4c5	mesa: Make atomic lowering put atomics above SSBOs. Gallium arbitrarily (it seems) put atomics below SSBOs, resulting in a bunch of extra index management, and surprising shader code when you would see your SSBOs up at index 16. It makes a lot more sense to see atomics converted to SSBOs appear as magic high numbers. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>	2020-01-21 10:06:23 -08:00
Eric Anholt	2dc2055157	turnip: Refactor linkage state setup. As I touch this for descriptor set reworks, I don't want to have to update it twice. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>	2020-01-21 10:06:23 -08:00
Timur Kristóf	28eb481bc2	nouveau/nvc0: add extern keyword to nvc0_miptree_vtbl. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2020-01-21 17:36:36 +01:00
Tapani Pälli	5fede43fe0	anv: initialize clear_color_is_zero_one Fixes following valgrind warning: ==12508== Conditional jump or move depends on uninitialised value(s) ==12508== at 0x2CCD8B79: cmd_buffer_begin_subpass (genX_cmd_buffer.c:4599) ==12508== by 0x2CCDA72B: gen9_CmdBeginRenderPass (genX_cmd_buffer.c:5275) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3487> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3487>	2020-01-21 17:47:30 +02:00
Boris Brezillon	9134f22df2	panfrost/midgard: Print the actual source register for store operations Store operations use r26/r27 but have word->reg set to 0 or 1 (base is r26). Let's take this base offset into account in print_load_store_instr(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3482> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3482>	2020-01-21 14:57:12 +00:00
Alyssa Rosenzweig	14b37ebd44	panfrost: Add pandecode entries for ASTC/ETC formats Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>	2020-01-21 08:35:23 -05:00
Icecream95	31bd3b5279	panfrost: Add ASTC texture formats Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>	2020-01-21 08:35:23 -05:00
Icecream95	960fe9daea	panfrost: Add ETC1/ETC2 texture formats Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>	2020-01-21 08:35:23 -05:00
Alyssa Rosenzweig	2091d311c9	panfrost: Rework linear<--->tiled conversions There's a lot going on here (it's a ton of commits squashed together since otherwise this would be impossible to review...) 1. We have a fast path for linear->tiled for whole (aligned) tiles, but we have to use a slow path for unaligned accesses. We can get a pretty major win for partial updates by using this slow path simply on the borders of the update region, and then hit the fast path for the tile-aligned interior. This does require some shuffling. 2. Mark the LUTs constant, which allows the compiler to inline them, which pairs well with loop unrolling (eliminating the memory accesses and just becoming some immediates.. which are not as immediate on aarch64 as I'd like..) 3. Add fast path for bpp1/2/8/16. These use the same algorithm and we have native types for them, so may as well get the fast path. 4. Drop generic path for bpp != 1/2/8/16, since these formats are generally awful and there's no way to tile them efficiently and honestly there's not a good reason to either. Lima doesn't support any of these formats; Panfrost can make the opinionated choice to make them linear. 5. Specialize the unaligned routines. They don't have to be fully generic, they just can't assume alignment. So now they should be nearly as fast as the aligned versions (which get some extra tricks to be even faster but the difference might be negligible on some workloads). 6. Specialize also for the size of the tile, to allow 4x4 tiling as well as 16x16 tiling. This allows compressed textures to be efficiently tiled with the same routines (so we add support for tiling ASTC/ETC textures while we're at it) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> #lima on Mali400 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>	2020-01-21 08:35:19 -05:00
Alyssa Rosenzweig	f2d876b2b2	panfrost,lima: De-Galliumize tiling routines There's an implicit dependence on Gallium here that will add more complexity than needed when testing/optimizing out of driver as well as potentially Vulkanizing. We don't need a full pipe_box, just the x/y/w/h properties directly. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> #lima on Mali400 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>	2020-01-21 08:35:16 -05:00
Alyssa Rosenzweig	0ca7ab1c97	panfrost: Compile tiling routines with -O3 These are major hot spots for panfrost and lima; better let the compiler do its thing even on debug builds. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> #lima on Mali400 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>	2020-01-21 08:35:01 -05:00
Bas Nieuwenhuizen	bd4380c63c	radv: Remove syncobj_handle variable in header. I strongly suspect it was supposed to be a typedef. However, used nowhere, we should remove it. Fixes: `eaa56eab6d` "radv: initial support for shared semaphores (v2)" Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2385 Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3479> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3479>	2020-01-21 12:28:00 +00:00
Neil Armstrong	dc594c95dd	gitlab-ci/lava: add pipeline information in the lava job name In order to have more information in the LAVA jobs list, add the current pipeline URL and commit ref name in the LAVA job name. Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2337> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2337>	2020-01-21 11:29:36 +00:00
Jan Zielinski	a24b3b228a	gallium/gallivm: enable linking lp_bld_printf function with C++ code To enable linking functions declared in lp_bld_printf.h file with C++, we need to add appropriate macros to the header. Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3470> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3470>	2020-01-21 11:00:18 +00:00
Danylo Piliaiev	3f9a6011a6	iris: Fix value of out-of-bounds accesses for vertex attributes Having VERTEX_BUFFER_STATE.BufferSize greater than the size of a bound vertex buffer allows shader to read uninitialized vertex attributes from BO, instead of allowing hardware to return zeroes on out-of-bounds access. OpenGL spec "6.4 Effects of Accessing Outside Buffer Bounds" says: "Robust buffer access can be enabled by creating a context with robust access enabled through the window system binding APIs. When enabled, any command unable to generate a GL error as described above, such as buffer object accesses from the active program, will not read or modify memory outside of the data store of the buffer object and will not result in GL interruption or termination. Out-of-bounds reads may return values from within the buffer object or zero values." Fixes three webgl tests: conformance/rendering/out-of-bounds-array-buffers.html conformance2/rendering/out-of-bounds-index-buffers-after-copying.html conformance2/rendering/element-index-uint.html See #1996 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3427> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3427>	2020-01-21 09:52:40 +00:00
Vasily Khoruzhick	e470116aac	ci: Re-enable CI for lima on mali450 Amend fails and skips lists based on lists from Andreas Baierl, shard mali400 job across two devices since it takes close to 10min and rename jobs to lima-mali400-test and lima-mali450-test. Also don't set MESA_GLES_VERSION_OVERRIDE=3.0 for lima since we don't support GLES 3.0 and lower DEQP_PARALLEL to 3 for jobs on H3. Keep mali400 jobs disabled atm since they take too much time to complete and we also get some inexplicable failures in dEQP-GLES2.functional.default_vertex_attrib.* Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3163> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3163>	2020-01-21 09:33:57 +00:00
Vasily Khoruzhick	5e5b5348f6	ci: lava: pass CI_NODE_INDEX and CI_NODE_TOTAL to lava jobs deqp-runner.sh uses it to determine whether we split job across multiple devices and if we do what's the node index. With this change we now can set 'parallel: N' in job description if we want to split the job. Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3163>	2020-01-21 09:33:57 +00:00
Hyunjun Ko	26d93a7495	turnip: fix invalid VK_ERROR_OUT_OF_POOL_MEMORY When VK_DESCRIPTOR_TYPE_SAMPLER is provided, it doesn't need to be counted as a buffer count. Otherwise it leads to mismatch of allocated buffer size, hitting VK_ERROR_OUT_OF_POOL_MEMORY finally. Fixes: `c39afe68f0` Also fixes amber tests: ./tests/cases/address_modes_float.amber ./tests/cases/address_modes_int.amber ./tests/cases/magfilter_linear.amber ./tests/cases/magfilter_nearest.amber Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2020-01-21 10:29:16 +01:00
Jan Vesely	87e1f8eca5	clover: Initialize Asm Parsers Fixes piglits that use AMDGCN inline assembly: program@execute@calls program@execute@amdgcn-mubuf-negative-vaddr CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2020-01-21 01:39:08 +00:00
Jason Ekstrand	34c8621c3b	anv: Allow enumerating multiple physical devices Instead of having a single physical device in anv_instance, have a linked list of them. What we have now works today because our GPUs are built into the CPU and so you're guaranteed to only ever have one of them. One day, that will change and we want ANV to be ready. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>	2020-01-20 22:08:52 +00:00
Jason Ekstrand	e963e151d8	anv: Re-arrange physical_device_init This commit simply moves fetching the device info and checking if ANV supports the device a bit higher up. This way we fail earlier and it'll make error checking easier in the next commit. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>	2020-01-20 22:08:52 +00:00
Jason Ekstrand	3ecfba388a	anv: Drop separate chipset_id fields This already exists in gen_device_info. There's no reason to keep duplicate copies. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>	2020-01-20 22:08:52 +00:00
Jason Ekstrand	02044be23f	anv: Move the physical device dispatch table to anv_instance We don't actually have genX versions of any physical device level commands so we don't need the trampoline versions and we don't need to have a separate table per physical device. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>	2020-01-20 22:08:52 +00:00
Jason Ekstrand	78ff747408	anv: Drop the instance pointer from anv_device There are very few times when we actually want to fetch the instance from the anv_device. We can put up with a bit of pain there in exchange for strongly discouraging people from doing this in general. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>	2020-01-20 22:08:52 +00:00
Jason Ekstrand	f0519c9cf9	anv: Stop allocating WSI event fences off the instance Fixes: `16eb390834` "anv: add VK_EXT_display_control to anv driver [v5]" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>	2020-01-20 22:08:52 +00:00
Jason Ekstrand	1ec84bd208	anv: Take a device in anv_perf_warn Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>	2020-01-20 22:08:52 +00:00
Jason Ekstrand	cb6ea77045	anv: Take an anv_device in vk_errorf Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>	2020-01-20 22:08:52 +00:00
Jason Ekstrand	70e8064e13	anv: Add an anv_physical_device field to anv_device Having to always pull the physical device from the instance has been annoying for almost as long as the driver has existed. It also won't work in a world where we ever have more than one physical device. This commit adds a new field called "physical" to anv_device and switches every location where we use device->instance->physicalDevice to use the new field instead. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>	2020-01-20 22:08:52 +00:00
Marek Olšák	735a3ba007	radeonsi/gfx10: enable GS fast launch for triangles and strips with NGG culling Only non-indexed triangle lists and strips are supported. This increases performance if there is something to cull. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	c377f45c18	radeonsi/gfx10: rewrite late alloc computation - Use conservative late alloc when the number of CUs <= 6. - Move the late alloc GS register to the GS shader state, so that it can be tuned for NGG culling. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	4e4b2d13f0	ac: add helper ac_build_triangle_strip_indices_to_triangle Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	8db00a51f8	radeonsi/gfx10: implement NGG culling for 4x wave32 subgroups Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	aa2d846604	radeonsi/gfx10: move GE_PC_ALLOC setting to shader states The value is not changed. I just use a different way to compute it. The value will vary with NGG culling. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	41fef6fc09	radeonsi/gfx10: don't initialize VGPRs not used by NGG passthrough v2: TES doesn't use the GS PrimitiveID Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	943d131e7d	radeonsi/gfx10: merge main and pos/param export IF blocks into one if possible Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	a966729c84	radeonsi/gfx10: export primitives at the beginning of VS/TES This decreases VGPR usage and will allow us to merge some IF blocks in shaders. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	5a0fcf11f0	radeonsi/gfx10: move s_sendmsg gs_alloc_req to the beginning of shaders This will allow us to merge some IF blocks in shaders. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	cf9f8d1ea2	radeonsi/gfx10: correct VS PrimitiveID implementation for NGG We didn't use the correct LDS pointer, though it probably doesn't matter, because I think that nothing else is using LDS here. This commit makes it consistent with all other esgs_ring use. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	b2326a7549	radeonsi/gfx10: update comments and remove invalid TODOs Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	0f45d4dc2b	ac: add ac_build_readlane without optimization barrier Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	77393cf39b	ac: add prefix bitcount functions Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 16:16:11 -05:00
Marek Olšák	679b6244e1	radeonsi: turn an assertion into return in si_nir_store_output_tcs Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 15:40:13 -05:00
Marek Olšák	27cc7703d3	radeonsi: fix doubles and int64 Fixes: `57bd73e229` - radeonsi: remove llvm_type_is_64bit Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 15:40:10 -05:00
Marek Olšák	df34fa14bb	radeonsi: don't invoke decompression inside internal launch_grid Decompress resources properly but don't do it inside launch_grid to prevent recursion. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Cc: 19.3 <mesa-stable@lists.freedesktop.org>	2020-01-20 15:40:08 -05:00
Marek Olšák	58c929be0d	radeonsi: clean up how internal compute dispatches are handled Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Cc: 19.3 <mesa-stable@lists.freedesktop.org>	2020-01-20 15:40:07 -05:00
Marek Olšák	d69483270e	Revert "radeonsi: unbind image before compute clear" This reverts commit `3a527eda7c`. It's incorrect. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-20 15:40:05 -05:00
Samuel Pitoiset	dbdf3b3ef9	aco: implement nir_intrinsic_load_barycentric_at_sample on GFX6 GFX6 doesn't have FLAT instructions which means we have to emit a 64-bit MUBUF load. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>	2020-01-20 16:24:55 +00:00
Samuel Pitoiset	9e2fde84fc	aco: add new addr64 bit to MUBUF instructions on GFX6-GFX7 According to the different ISA docs (and to LLVM), this bit seems to only exist on GFX6-GFX7. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>	2020-01-20 16:24:55 +00:00
Samuel Pitoiset	fe9157a700	aco: do not use the vec3 variant for loads on GFX6 GFX6 only supports vec3 with load/store format. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>	2020-01-20 16:24:55 +00:00
Samuel Pitoiset	1b5bb204d9	aco: do not use the vec3 variant for stores on GFX6 GFX6 only supports vec3 with load/store format. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>	2020-01-20 16:24:55 +00:00
Samuel Pitoiset	b8abfafe86	aco: fix constant folding of SMRD instructions on GFX6 SMRD instructions have an 8-bit dword offset on SI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>	2020-01-20 16:24:55 +00:00
Jason Ekstrand	dd92179a72	anv: Canonicalize buffer formats for image/buffer copies Some formats, in particular YCbCr formats and ASTC have additional restrictions. We already whack ASTC formats to RGBA32_UINT because the hardware doesn't allow LINEAR with ASTC. However, we need to fix YCbCr formats as well because they come with alignment restrictions that we can't guarantee are satisfied. We're using blorp_copy to do the copies so we may as well just stomp formats for everything. Fixes: `b24b93d584` "anv: enable VK_KHR_sampler_ycbcr_conversion" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3460> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3460>	2020-01-20 16:08:17 +00:00
Jason Ekstrand	14c6e665f7	anv/blorp: Rename buffer image stride parameters The new names fit better with the Vulkan names and don't pretend to be an actual image extent. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3460>	2020-01-20 16:08:17 +00:00
Daniel Stone	cf5fccb0d9	Revert "gallium: add st_context_iface::flush_resource to call FLUSH_VERTICES" This reverts commit `bec9c90b5e`. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3472> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3472>	2020-01-20 12:33:29 +00:00
Daniel Stone	32d45733ae	Revert "st/dri: do FLUSH_VERTICES before calling flush_resource" This reverts commit `3ba16d36c9`. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3472>	2020-01-20 12:33:22 +00:00
Rhys Perry	29bfe18abd	aco: fix fall-through test in try_remove_simple_block() with back-edges `3bca0af2` enhanced empty block determination which exposed this bug and created an infinite loop in a Guild Wars 2 shader. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `3bca0af25d` ('aco: ignore parallelcopies to the same register on jump threading') Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2364 Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3452> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3452>	2020-01-20 11:51:45 +00:00
Krzysztof Raszkowski	afb75e71e0	docs/GL4: update gallium/swr features Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2020-01-20 11:37:16 +00:00
Rhys Perry	e151398de6	aco: fix stack buffer overflow in apply_sgprs() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `cef7879719` ('aco: rewrite apply_sgprs()') Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2361 Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442>	2020-01-20 11:13:11 +00:00
Tapani Pälli	9b2ccd6a0e	anv: add assert for isl_mod_info in choose_isl_tiling_flags CID: 1457859 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3469> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3469>	2020-01-20 12:12:29 +02:00
Tapani Pälli	8eebdd594b	anv: fix assert in GetImageDrmFormatModifierPropertiesEXT CID: 1457861 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3469>	2020-01-20 12:11:43 +02:00
Tapani Pälli	31feae1c21	isl/gen12: add reminder comment about missing WA with 3D surfaces Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3441> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3441>	2020-01-20 08:06:19 +02:00
Icecream95	d8a3501f1b	panfrost: Dynamically allocate shader variants This fixes a crash in LZDoom where over 16 shader variants are needed for a few shaders in some maps, and should also save a few kilobytes of RAM as most of the time only one or two variants of the 8 previously allocated are actually needed. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-18 11:47:34 -05:00
Alyssa Rosenzweig	bef716b56c	panfrost: Expose some functionality with dEQP flag These features are stable enough that they don't need to be hidden. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3464> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3464>	2020-01-18 14:57:52 +00:00
Alyssa Rosenzweig	4af8d5b064	pan/midgard: Fix recursive csel scheduling Corner case causing invalid scheduling on shaders with nested csels, i.e. GLSL code resembling: (foo ? bool1 : bool2) ? x : y By explicitly disallowing csels this is fixed. Fixes INSTR_INVALID_ENC on a glamor shader (noticeable with slowdown and visual corruption when scrolling "too far" on GTK apps). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3463> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3463>	2020-01-18 14:40:05 +00:00
Alyssa Rosenzweig	564a782ff7	panfrost: Identify un/pack colour opcodes We still need to identify formats in the disassembler, but this will at least get the opcode name clear. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3462> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3462>	2020-01-18 14:18:48 +00:00
Alyssa Rosenzweig	13c32e5fed	pan/midgard: Bytemasks should round up, not round down Otherwise we'll lose components in DCE. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3462>	2020-01-18 14:18:48 +00:00
Icecream95	5e8386c606	panfrost: Compact the bo_access readers array Previously, the array bo_access->readers was only cleared when there were no unsignaled fences, which in some situations never happened. That resulted in the array having thousands of NULL pointers, but only a handful of active readers. With this patch, all the unsignaled readers are moved to the front of the array, effectively building a new array only containing the active readers in-place. This results in the readers array usually only having a couple of elements. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3419> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3419>	2020-01-18 13:58:43 +00:00
Erik Faye-Lund	c0ba9000d2	zink: support arrays of samplers Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>	2020-01-18 10:45:38 +00:00
Erik Faye-Lund	a9023ec566	zink: support sampling non-float textures Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>	2020-01-18 10:45:38 +00:00
Erik Faye-Lund	3e1acff560	zink: store image-type per texture Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>	2020-01-18 10:45:38 +00:00
Erik Faye-Lund	5fc1562a72	zink: avoid incorrect vector-construction Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>	2020-01-18 10:45:38 +00:00
Erik Faye-Lund	8112240d29	zink: support offset-variants of texturing Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>	2020-01-18 10:45:38 +00:00
Erik Faye-Lund	f1a5bcdc16	zink: implement nir_texop_txs Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3275>	2020-01-18 10:45:38 +00:00
Erik Faye-Lund	7ee94d1b21	docs: fixup indentation The most canonical indentation-style here is two spaces, which is what the standard boilerplate in all documents use. So let's normalize to that. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>	2020-01-18 11:39:32 +01:00
Erik Faye-Lund	2ef989473a	docs: remove pointless, stray newline Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>	2020-01-18 11:39:29 +01:00
Erik Faye-Lund	199572b65b	docs: use [1] instead of asterisk for footnote While we're at it, make it a link as well. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>	2020-01-18 11:39:25 +01:00
Erik Faye-Lund	063a28642e	docs: remove trailing newlines Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>	2020-01-18 11:39:22 +01:00
Erik Faye-Lund	9954120b38	docs: remove leading spaces There's no good reason to have leading space in these pre-formatted blocks. It looks strange, so let's get rid of it. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>	2020-01-18 11:39:18 +01:00
Erik Faye-Lund	c871862744	docs: remove trailing header This header has been there since the document was added, but contains nothing. So let's get rid of it. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>	2020-01-18 11:39:14 +01:00
Erik Faye-Lund	37daddd3e4	docs: use figure/figcaption instead of tables Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>	2020-01-18 11:39:07 +01:00
Erik Faye-Lund	f5983a6eed	docs: do not use definition-list for sub-topics The dl-tag isn't a neat tool for defining sub-headings, it's a semantic tool for defining definitions and their meaning. Let's use normal sub-headings instead. To make the last few paragraphs stand out from the above, let's add a sub-heading for those as well. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3443>	2020-01-18 11:38:56 +01:00
Rob Clark	95187083c4	freedreno/a6xx: add PROG_FB_RAST stateobj For the handful of registers that depend on the union of program/ framebuffer/rasterizer state. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>	2020-01-17 15:43:51 -08:00
Rob Clark	6dc9b292d0	freedreno/a6xx: move dynamic program state to streaming stateobj Move the program state which we can't pre-bake to a streaming state object, rather than emitting directly in the draw cmdstream. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>	2020-01-17 15:43:51 -08:00
Rob Clark	d2fd6469c3	freedreno/a6xx: drop a few more per-draw registers Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>	2020-01-17 15:43:51 -08:00
Rob Clark	4d8f42c851	freedreno/a6xx: separate rast stateobj for prim restart This lets us move PC_PRIMITIVE_CNTL into the rasterizer stateobj, rather than unconditionally emitting it directly in the cmdstream on every draw. This also starts adding some tracking about previous draw state, so that following patches can limit some of the register writes we currently emit on every draw. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>	2020-01-17 15:43:51 -08:00
Rob Clark	0e063b3079	freedreno/a6xx: cleanup rasterizer state All but one of the reg values is only used in the stateobj, so we can inline the register value setup and stateobj construction. While we are at it, switch over to the new register builders. Prep work for next patch. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>	2020-01-17 15:43:51 -08:00
Rob Clark	fba7e6f896	freedreno/a6xx: limit scratch/debug markers to debug builds The overhead does seem to matter when you have a high enough # of draw calls that affect few bins/pixels, because these writes would happen unconditionally (i.e. not part of a state-group). Possibly we could keep these if we moved them into a state-group so the register writes would be no-ops on bins with no geometry. OTOH I usually end up adding in a WFI when using the scratch reg values to track down a crash. (So add a WFI to mitigate the annoyance of needing to use a debug build to get scratch regs to locate the position of a crash/hang in the cmdstream.) Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>	2020-01-17 15:43:51 -08:00
Jordan Justen	5d7381c645	iris: Fix some indentation in iris_init_render_context Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2020-01-17 15:28:07 -08:00
C Stout	c1104e4cee	util/vector: Fix u_vector_foreach when head rolls over Also add unit tests for u_vector. Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3453> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3453>	2020-01-17 22:21:00 +00:00
Francisco Jerez	b54b67e067	intel/fs: Switch to standard vector layout for barycentrics at optimization time. This involves permuting the registers of barycentric vectors to have the standard X[0-n] Y[0-n] layout at NIR translation time. Barycentrics are converted to the format expected by the PLN instruction in the lower_barycentrics() pass run after the optimization loop. Main reason is correctness of SIMD32 fragment shaders. The shuffle_from_pln_layout() and shuffle_to_pln_layout() helpers used during NIR translation are busted for SIMD32. This leads to serious corruption at present with INTEL_DEBUG=do32, especially on Gen11+ where these helpers are hit more frequently due to the lack of a hardware PLN instruction. Of course one could have chosen to fix those helpers instead, but there is another far more subtle issue that was reported during review of the SIMD32 fragment shader codegen changes: The SIMD splitting pass currently handles SIMD32 barycentric vectors as if they had the standard X[0-n] Y[0-n] layout, even though they are interleaved for the PLN instruction, which causes incorrect execution masks to be applied to the MOVs unzipping barycentric vectors in cases where a LINTERP instruction occurs under non-uniform control flow. I'm not aware of any conformance regressions due to the latter issue at present, but for our peace of mind let's move the conversion to the PLN layout into the lower_barycentrics() pass run after lower_simd_width(). This leads to the following shader-db improvements (including SIMD32 shaders) in combination with the previous back-end preparation changes -- Without them (especially the copy propagation changes) this would lead to a massive number of regressions. 
On ICL: total instructions in shared programs: 20662316 -> 20466903 (-0.95%) instructions in affected programs: 10538474 -> 10343061 (-1.85%) helped: 68775 HURT: 6 total spills in shared programs: 8938 -> 8748 (-2.13%) spills in affected programs: 376 -> 186 (-50.53%) helped: 9 HURT: 5 total fills in shared programs: 8965 -> 8663 (-3.37%) fills in affected programs: 965 -> 663 (-31.30%) helped: 9 HURT: 6 LOST: 146 GAINED: 43 On SKL: total instructions in shared programs: 18725867 -> 18614912 (-0.59%) instructions in affected programs: 3876590 -> 3765635 (-2.86%) helped: 27492 HURT: 2 LOST: 191 GAINED: 417 On SNB: total instructions in shared programs: 14573613 -> 13980646 (-4.07%) instructions in affected programs: 5199074 -> 4606107 (-11.41%) helped: 29998 HURT: 0 LOST: 21 GAINED: 30 Results are somewhat less impressive but still significant without SIMD32 fragment shaders enabled. On ICL: total instructions in shared programs: 16148728 -> 16061659 (-0.54%) instructions in affected programs: 6114788 -> 6027719 (-1.42%) helped: 42046 HURT: 6 total spills in shared programs: 8218 -> 8028 (-2.31%) spills in affected programs: 376 -> 186 (-50.53%) helped: 9 HURT: 5 total fills in shared programs: 8953 -> 8651 (-3.37%) fills in affected programs: 965 -> 663 (-31.30%) helped: 9 HURT: 6 LOST: 0 GAINED: 3 On SKL: total instructions in shared programs: 14927994 -> 14926738 (-0.01%) instructions in affected programs: 168850 -> 167594 (-0.74%) helped: 711 HURT: 2 On SNB: total instructions in shared programs: 10770538 -> 10734403 (-0.34%) instructions in affected programs: 2702172 -> 2666037 (-1.34%) helped: 17818 HURT: 0 All of the hurt shaders are either spilling slightly more or emitting additional NOP instructions due to the SIMD16 POW workaround for Gen8-9 combined with differences in scheduling. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-17 13:23:12 -08:00
Francisco Jerez	79bd252d6e	intel/fs: Introduce barycentric layout lowering pass. The goal is to represent barycentrics with the standard vector layout during optimization and particularly SIMD lowering. Instead of emitting the barycentric layout conversions at NIR translation time, do it later as a lowering pass. For the moment this is only applied to PI messages, but we'll give the same treatment to LINTERP instructions too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-17 13:22:59 -08:00
Francisco Jerez	44d7d66adc	intel/fs: Split fetch_payload_reg() into separate helper for barycentrics. We're about to change the layout of barycentric vectors, which will involve permuting the GRFs of barycentrics fetched from the thread payload. Make room for this in a function separate from the generic fetch_payload_reg(), since the permutation will only be applicable to barycentric vectors. This allows simplifying fetch_payload_reg(), since there was no need for handling multiple-component payload registers except for barycentrics. This causes some minor shader-db noise due to the new helper emitting a LOAD_PAYLOAD instruction unconditionally, but it will be cleaned up shortly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-17 13:22:51 -08:00
Francisco Jerez	9c9e80103c	intel/fs/gen6: Use SEL instead of bashing thread payload for unlit centroid workaround. This prevents regressions on SNB due to the redundant MOVs lying around in cases where fetch_payload_reg() returns a VGRF (currently only in SIMD32 but soon in pretty much all cases). The MOVs can't be register-coalesced due to their source being a FIXED_GRF, and they can't be copy-propagated either due to the unlit centroid workaround partial writes. They can be copy-propagated just fine into a SEL instruction though. On SNB this prevents the following shader-db regressions (including SIMD32 programs) in combination with the interpolation rework part of this series: total instructions in shared programs: 13996898 -> 14001982 (0.04%) instructions in affected programs: 197461 -> 202545 (2.57%) helped: 0 HURT: 1251 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-17 13:22:39 -08:00
Francisco Jerez	0dd18d70ae	intel/fs/gen6: Generalize aligned_pairs_class to SIMD16 aligned barycentrics. This is mainly meant to avoid shader-db regressions on SNB as we start using VGRFs for barycentrics more frequently. Currently the aligned_pairs_class is only useful in SIMD8 mode, because in SIMD16 mode barycentric vectors are typically 4 GRFs. This is not a problem on Gen4-5, because on those platforms all VGRF allocations are pair-aligned in SIMD16 mode. However on Gen6 we end up using either the fast or the slow path of LINTERP rather non-deterministically based on the behavior of the register allocator. Fix it by repurposing aligned_pairs_class to hold PLN-aligned registers of whatever the natural size of a barycentric vector is in the current dispatch width. On SNB this prevents the following shader-db regressions (including SIMD32 programs) in combination with the interpolation rework part of this series: total instructions in shared programs: 13983257 -> 14527274 (3.89%) instructions in affected programs: 1766255 -> 2310272 (30.80%) helped: 0 HURT: 11608 LOST: 26 GAINED: 13 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-17 13:22:34 -08:00
Francisco Jerez	0db4455c1f	intel/fs/gen6: Constrain barycentric source of LINTERP during bank conflict mitigation. This avoids regressions on SNB due to the bank conflict mitigation pass moving a VGRF-allocated barycentric vector to a misaligned location, which would prevent the PLN instruction from being used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-17 13:22:29 -08:00
Francisco Jerez	369aef851d	intel/fs/gen4-6: Allocate registers from aligned_pairs_class based on LINTERP use. Previously we would hardcode fs_visitor::delta_xy barycentrics to be allocated from aligned_pairs_class on hardware with PLN source alignment restrictions (pre-Gen7). Instead allocate any registers consumed by LINTERP from aligned_pairs_class, even if some barycentric vector had ended up in a temporary. On SNB this prevents the following shader-db regressions (including SIMD32 programs) in combination with the interpolation rework part of this series: total instructions in shared programs: 13983257 -> 14527274 (3.89%) instructions in affected programs: 1766255 -> 2310272 (30.80%) helped: 0 HURT: 11608 LOST: 26 GAINED: 13 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-17 13:22:20 -08:00
Francisco Jerez	54b1b71e73	intel/fs: Allow limited copy propagation of a LOAD_PAYLOAD into another. This is particularly useful in cases where register coalesce is unlikely to succeed because the LOAD_PAYLOAD isn't a plain copy -- E.g. when a LOAD_PAYLOAD is shuffling the contents of a barycentric vector in order to transform it into the PLN layout. This prevents the following shader-db regressions (including SIMD32 programs) in combination with the interpolation rework part of this series. On SKL: total instructions in shared programs: 18596672 -> 18976097 (2.04%) instructions in affected programs: 7937041 -> 8316466 (4.78%) helped: 39 HURT: 67427 LOST: 466 GAINED: 220 On SNB: total instructions in shared programs: 13993866 -> 14202963 (1.49%) instructions in affected programs: 7611309 -> 7820406 (2.75%) helped: 624 HURT: 52943 LOST: 6 GAINED: 18 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-17 13:22:09 -08:00
Francisco Jerez	8eb4f2092a	intel/fs: Add support for copy-propagating a block of multiple FIXED_GRFs. In cases where a LOAD_PAYLOAD instruction copies a single block of sequential GRF registers into the destination (see is_identity_payload()), splitting the block copy into a number of ACP entries (one for each LOAD_PAYLOAD source) is undesirable, because that prevents copy propagation into any instructions which read multiple components at once with the same source (the barycentric source of the LINTERP instruction is going to be the overwhelmingly most common example). Technically it would also be possible to do this for VGRF sources, but there is little benefit from that since register coalesce already covers many of those cases -- There is no way for a block of FIXED_GRFs to be coalesced into a VGRF though. This prevents the following shader-db regressions (including SIMD32 programs) in combination with the interpolation rework part of this series. On SKL: total instructions in shared programs: 18595160 -> 18828562 (1.26%) instructions in affected programs: 13374946 -> 13608348 (1.75%) helped: 7 HURT: 108977 total spills in shared programs: 9116 -> 9106 (-0.11%) spills in affected programs: 404 -> 394 (-2.48%) helped: 7 HURT: 9 total fills in shared programs: 8994 -> 9176 (2.02%) fills in affected programs: 898 -> 1080 (20.27%) helped: 7 HURT: 9 LOST: 469 GAINED: 220 On SNB: total instructions in shared programs: 13996898 -> 14096222 (0.71%) instructions in affected programs: 8088546 -> 8187870 (1.23%) helped: 2 HURT: 66520 total spills in shared programs: 2985 -> 2961 (-0.80%) spills in affected programs: 632 -> 608 (-3.80%) helped: 2 HURT: 0 total fills in shared programs: 3144 -> 3128 (-0.51%) fills in affected programs: 1515 -> 1499 (-1.06%) helped: 2 HURT: 0 LOST: 0 GAINED: 4 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-17 13:21:41 -08:00
Francisco Jerez	e328fbd9f8	intel/fs: Add partial support for copy-propagating FIXED_GRFs. This will be useful for eliminating redundant copies from the FS thread payload, particularly in SIMD32 programs. For the moment we only allow FIXED_GRFs with identity strides in order to avoid dealing with composing the arbitrary bidimensional strides that FIXED_GRF regions potentially have, which are rarely used at the IR level anyway. This enables the following commit allowing block-propagation of FIXED_GRF LOAD_PAYLOAD copies, and prevents the following shader-db regressions (including SIMD32 programs) in combination with the interpolation rework part of this series. On ICL: total instructions in shared programs: 20484665 -> 20529650 (0.22%) instructions in affected programs: 6031235 -> 6076220 (0.75%) helped: 5 HURT: 42073 total spills in shared programs: 8748 -> 8925 (2.02%) spills in affected programs: 186 -> 363 (95.16%) helped: 5 HURT: 9 total fills in shared programs: 8663 -> 8960 (3.43%) fills in affected programs: 647 -> 944 (45.90%) helped: 5 HURT: 9 On SKL: total instructions in shared programs: 18937442 -> 19128162 (1.01%) instructions in affected programs: 8378187 -> 8568907 (2.28%) helped: 39 HURT: 68176 LOST: 1 GAINED: 4 On SNB: total instructions in shared programs: 14094685 -> 14243499 (1.06%) instructions in affected programs: 7751062 -> 7899876 (1.92%) helped: 623 HURT: 53586 LOST: 7 GAINED: 25 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-17 13:21:33 -08:00
Francisco Jerez	5153d06d92	intel/fs: Extend copy propagation dataflow analysis to copies with FIXED_GRF source. This involves indexing the ACP tables used internally by fs_copy_prop_dataflow::setup_initial_values() by reg_space() instead of register number. Both are nearly equivalent for virtual GRFs (barring the single bit of entropy lost in the hash), and this makes handling FIXED_GRFs straightforward. Because we're only going to support FIXED_GRFs for the source of a copy, this change is only strictly necessary during the second pass that checks for source interference, but we also apply the same change to the first pass for consistency. Note that this shouldn't change the behavior of the copy propagation pass until we start inserting FIXED_GRF entries into the ACP. Even then FIXED_GRF writes are extremely rare so this change will hardly ever have an effect, but they aren't completely non-existing so we need to handle them for correctness. No functional nor shader-db changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-17 13:21:27 -08:00
Francisco Jerez	ab0d1b3b3d	intel/fs: Rework fs_inst::is_copy_payload() into multiple classification helpers. This reworks the current fs_inst::is_copy_payload() method into a number of classification helpers with well-defined semantics. This will be useful later on in order to optimize LOAD_PAYLOAD instructions more aggressively in cases where we can determine it's safe to do so. The closest equivalent of the present fs_inst::is_copy_payload() method is the is_coalescing_payload() helper introduced here. No functional nor shader-db changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-17 13:21:19 -08:00
Francisco Jerez	1873202f44	intel/fs: Generalize fs_reg::is_contiguous() to register files other than VGRF. No functional nor shader-db changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-17 13:20:59 -08:00
Francisco Jerez	d9a57c85cc	intel/fs: Try to vectorize header setup in lower_load_payload(). In cases where LOAD_PAYLOAD is provided a pair of contiguous registers as header sources, try to use a single SIMD16 instruction in order to initialize them. This is unlikely to affect the overall cycle count of the shader, since the compressed instruction has twice the issue time, except due to the reduced pressure on the instruction cache. Main motivation is avoiding instruction-count regressions in combination with the following copy propagation improvements, which will allow the SIMD16 g0-1 header setup emitted for framebuffer writes to be copy-propagated into its LOAD_PAYLOAD, leading to the emission of two SIMD8 MOV instructions instead of a single SIMD16 MOV. Reverting this commit on top of the copy propagation changes would lead to the following shader-db regressions on SKL and other platforms: total instructions in shared programs: 14926738 -> 14935415 (0.06%) instructions in affected programs: 1892445 -> 1901122 (0.46%) helped: 0 HURT: 8676 Without the following copy propagation changes this doesn't have any effect on shader-db on Gen7+, because we would typically set up the FB write header with a separate SIMD16 MOV that isn't currently copy-propagated into the LOAD_PAYLOAD, so the individual SIMD8 MOVs result of LOAD_PAYLOAD lowering would get register-coalesced away under normal circumstances. However that wasn't the case for MRF LOAD_PAYLOAD destinations on Gen6 and earlier, because register coalesce only kicks in for GRFs, leaving a number of redundant SIMD8 MOVs lying around. On SNB this leads to the following shader-db improvements: total instructions in shared programs: 10770538 -> 10734681 (-0.33%) instructions in affected programs: 2700655 -> 2664798 (-1.33%) helped: 17791 HURT: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-17 13:20:46 -08:00
Marek Olšák	3ba16d36c9	st/dri: do FLUSH_VERTICES before calling flush_resource	2020-01-17 15:04:35 -05:00
Marek Olšák	bec9c90b5e	gallium: add st_context_iface::flush_resource to call FLUSH_VERTICES	2020-01-17 15:04:35 -05:00
Lionel Landwerlin	ddb80f9276	anv: enable VK_KHR_swapchain_mutable_format Enable new tests in dEQP-VK.image.swapchain_mutable.* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>	2020-01-17 18:27:29 +00:00
Jason Ekstrand	4bdf8547f4	vulkan/wsi: Implement VK_KHR_swapchain_mutable_format This is only the core WSI code for the extension. It adds the image format list and the flags to vkCreateImage as well as handling things properly in the modifier queries. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>	2020-01-17 18:27:29 +00:00
Jason Ekstrand	a218f13278	vulkan/wsi: Filter modifiers with ImageFormatProperties Just because a modifier is returned for the given format, that doesn't mean it works with all usages and flags. We need to filter the list by calling vkGetPhysicalDeviceImageFormatProperties2. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>	2020-01-17 18:27:29 +00:00
Jason Ekstrand	210e68874b	vulkan/wsi: Use the interface from the real modifiers extension The anv implementation still isn't quite complete, but we can at least start using the structs from the real extension. v2: Fix circular pNext list (Lionel) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>	2020-01-17 18:27:29 +00:00
Jason Ekstrand	c78926b84d	vulkan/wsi: Move the ImageCreateInfo higher up Future changes will be easier if we can modify it based on whether or not we're using modifiers. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>	2020-01-17 18:27:29 +00:00
Jason Ekstrand	6790397346	anv: Support modifiers in GetImageFormatProperties2 Images with modifiers come with restrictions: 1. They have to be simple 2D images right now 2. They need to have a sensible format (not compressed, multi-plane, or non-power-of-two) 3. If a CCS modifier is being requested, they have to actually support CCS_E and be CCS-compatible with any other formats the client may wish to use for image views. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>	2020-01-17 18:27:29 +00:00
Jason Ekstrand	44f5a92c0b	anv: Drop some VK_IMAGE_TILING_OPTIMAL checks The DRM format modifiers extension adds a TILING_DRM_FORMAT_MODIFIER which will be used for modifiers so we can no longer use OPTIMAL to indicate tiled inside the driver. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>	2020-01-17 18:27:29 +00:00
Samuel Pitoiset	0099f85232	aco: print assembly with CLRXdisasm for GFX6-GFX7 if found on the system LLVM only supports GFX8+. Using CLRXdisasm works most of the time, so it's useful to add support for it. Original patch by Daniel Schürmann. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3439> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3439>	2020-01-17 17:41:32 +00:00
Andres Rodriguez	51de5d5ac6	vulkan/wsi: disable the hardware cursor Ensure the hardware cursor is disabled when we set the mode for a VkDisplayKHR object. The extension doesn't expose any mechanisms to program the hardware cursor, so we need to ensure it is hidden. Currently, it seems like X is responsible for disabling the cursor before handing over the lease. But that seems a little frail, and we should be disabling the cursor ourselves so it works correctly independently of how the lease was prepared for us. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1922> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1922>	2020-01-17 17:15:52 +00:00
Krzysztof Raszkowski	ad820d5aca	gallium/swr: Disable showing detected arch message. When the swr driver is in use, it prints a detected-architecture message to std::err. This can be harmful when swr is used in multi-node environments. The message can be enabled by setting the env var SWR_PRINT_INFO to 1. Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2020-01-17 16:41:53 +00:00
Samuel Pitoiset	b9b393f0ce	aco: fix emitting slc for MUBUF instructions on GFX6-GFX7 Same as GFX10, only GFX8/GFX9 moved that bit near the opcode. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3437> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3437>	2020-01-17 16:56:04 +01:00
Boris Brezillon	6af63c939b	panfrost/midgard: Fix swizzle for store instructions The current logic considers that the nir_intrinsic_component(store_intr) encodes the source components start, but it actually encodes the destination one. Source component offset adjustment is taken care of in install_registers_instr(), when offset_swizzle() is called. This fixes dEQP-GLES2.functional.shaders.random.all_features.fragment.45 when PAN_MESA_DEBUG=deqp (looks like exposing GLES3 features has an impact on the varyings layout). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3429> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3429>	2020-01-17 12:54:31 +00:00
Erik Faye-Lund	be95c816a7	docs: do not double-close link tag Fixes: `f8148d0cc1` "docs: remove mailing list as way of submitting patches" Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>	2020-01-17 13:19:16 +01:00
Erik Faye-Lund	b009a7644b	docs: remove double-closed definition-list Fixes: `bc17ac5866` "docs: add documentation for building with meson" Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>	2020-01-17 13:19:10 +01:00
Erik Faye-Lund	b387f68f49	docs: move paragraph closing tag The pre-tag right before is a block-level tag, which means it implicitly terminates the paragraph. So there's no paragraph to close after this. Instead, move the paragraph-closing before the pre-tag, to explicitly close the paragraph. Fixes: `41b3eb08d9` "docs: update meson docs for windows" Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>	2020-01-17 13:19:03 +01:00
Erik Faye-Lund	a370cfd96e	docs: use code-tags instead of pre-tags Similar to the previous two commits, it seems more appropriate to use code-tags here than pre-tag. Fixes: `9af6c38def` "docs: Add use of Closes: tag for closing gitlab issues" Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>	2020-01-17 13:18:52 +01:00
Erik Faye-Lund	1de361e56b	docs: use code-tags instead of pre-tags Similar to the previous commit, code-tags seems more appropriate than pre-tags here. So let's change it. Fixes: `ca0c1e69ca` "docs: update releasing process to use new scripts and gitlab" Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>	2020-01-17 13:18:48 +01:00
Erik Faye-Lund	36e0275275	docs: use code-tag instead of pre-tag It's unlikely the author meant to use <pre>-here, as that starts a whole new block. Instead, the inline code-tag seems more appropriate here. Fixes: `41b3eb08d9` "docs: update meson docs for windows" Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>	2020-01-17 13:18:42 +01:00
Erik Faye-Lund	f0677086a1	docs: open paragraph before closing it Fixes: `44c5e634a5` "docs: update meson docs for windows" Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>	2020-01-17 13:18:36 +01:00
Erik Faye-Lund	a0d25c4d87	docs: fix paragraphs Paragraphs are terminated by pre-tags, so the latter one closes a new, empty one. Let's split the paragraph in two around the pre-tag instead. Fixes: `c0dfe8c6df` "docs: do not use div for line-breaking" Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>	2020-01-17 13:18:25 +01:00
Erik Faye-Lund	750d664226	docs: fix typo in html tag name Fixes: `5d11a828e1` "docs: update install docs for meson" Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>	2020-01-17 13:18:06 +01:00
Pierre-Eric Pelloux-Prayer	5b1c4e1b75	util: call bind_sampler_states before setting sampler_views Fixes the following valgrind error: Invalid read of size 16 at 0x28F458A1: si_set_sampler_view_desc (in radeonsi_drv_video.so) by 0x28F4657E: si_set_sampler_views (in radeonsi_drv_video.so) by 0x28D62BF5: util_compute_blit (in radeonsi_drv_video.so) by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so) by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so) by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0) Address 0x18142a10 is 0 bytes inside a block of size 48 free'd at 0x48369AB: free (vg_replace_malloc.c:540) by 0x28D62D51: util_compute_blit (in radeonsi_drv_video.so) by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so) by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so) by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0) Block was alloc'd at at 0x4837B65: calloc (vg_replace_malloc.c:762) by 0x28EFB2EC: si_create_sampler_state (in radeonsi_drv_video.so) by 0x28D62C30: util_compute_blit (in radeonsi_drv_video.so) by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so) by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so) by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0) Fixes: `69430d7e59` ("va: use a compute shader for the blit") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2321 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3428> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3428>	2020-01-17 10:14:57 +01:00
Eric Anholt	d55573aac6	nir: Fix printing of ~0 .locations. I kept wondering what "429" meant in variable declarations, when it was just a truncated ~0 snprintf. Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3423> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3423>	2020-01-16 23:29:10 +00:00
Eric Engestrom	65641e0c7a	meson: use github URL for wraps instead of completely unreliable wrapdb Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3391> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3391>	2020-01-16 23:06:43 +00:00
Dylan Baker	d7cef7c67b	docs: Update release calendar for 20.0 Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3417> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3417>	2020-01-16 22:41:55 +00:00
Andreas Baierl	2ebfc6db16	lima: Fix alpha blending Introduce separate helper functions to set the blendfactor bits. Lima uses bits 0-2 for the type, bit 3 sets the inverted function and bit 4 is set if alpha is used. alpha_src_factor and alpha_dst_factor don't need the alpha bit, so they are masked with 0xf. There is only room for 4 bits anyway. If alpha_src_factor is PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE, we need to change it to PIPE_BLENDFACTOR_ONE first. This is exactly what the blob does and we pass all dEQP-GLES2.functional.fragment_ops.blend.* tests now. Better than the blob btw... Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411>	2020-01-16 16:43:41 +00:00
Daniel Schürmann	3bca0af25d	aco: ignore parallelcopies to the same register on jump threading The more conservative lowering to CSSA inserts unnecessary parallelcopies which might get coalesced and can be ignored on jump threading. v2: outline is_empty_block() check. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385>	2020-01-16 16:01:59 +01:00
Daniel Schürmann	427e5eeb02	aco: handle phi affinities transitively through parallelcopies This can coalesce most unnecessarily inserted parallelcopies from lowering to CSSA. v2: refactor loop a bit to make it more efficient and readable. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385>	2020-01-16 16:01:59 +01:00
Daniel Schürmann	d098024c40	aco: rework lower_to_cssa() This patch changes lower_to_cssa to be much more conservative about assumptions which phi operands might interfere. Previously, this pass wasn't exhaustive and could miss some corner cases. v2: remove optimizations to find better insertion points as it's hard to guarantee that they are always correct and have overall no benefit. Fixes: `0b8216b2cd` ('aco: Lower to CSSA') Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3385>	2020-01-16 16:01:59 +01:00
Samuel Pitoiset	300f8dec76	aco: implement stream output with vec3 on GFX6 GFX6 doesn't support vec3. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>	2020-01-16 14:06:06 +00:00
Samuel Pitoiset	a445cb35bd	aco: do not combine additions of DS instructions on GFX6 The offset field doesn't work as expected on GFX6. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>	2020-01-16 14:06:06 +00:00
Samuel Pitoiset	923005bf54	aco: do not select 96-bit/128-bit variants for ds_read/ds_write on GFX6 Only GFX7 and later support large ds_read/ds_write. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>	2020-01-16 14:06:06 +00:00
Lionel Landwerlin	44ffeb4fee	intel/perf: report query split for mdapi Also forgotten in the initial implementation. v2: Report begin timestamp scaled by the timestamp frequency (Windows behavior) v3: Rename split to disjoint to match GL terminology (Tapani) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Acked-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>	2020-01-16 15:29:40 +02:00
Lionel Landwerlin	3bb8a4bfec	intel/perf: expose timestamp begin for mdapi This was forgotten in the initial implementation. v2: ensure the value is written for both GL & Vulkan queries Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Acked-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>	2020-01-16 15:29:28 +02:00
Tapani Pälli	630cbb45ac	anv: set depth stall enabled when depth flush enabled on gen12 This implements HW workaround #1409600907 for anv driver. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3378> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3378>	2020-01-16 14:05:54 +02:00
Tapani Pälli	3cec148455	iris: set depth stall enabled when depth flush enabled on gen12 This implements HW workaround #1409600907 for iris driver. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3378>	2020-01-16 14:05:54 +02:00
Lionel Landwerlin	308efbf2f3	anv: implement another workaround for non pipelined states Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>	2020-01-16 11:51:30 +02:00
Lionel Landwerlin	9eca823cce	iris: implement another workaround for non pipelined states v2: add comment (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>	2020-01-16 11:51:22 +02:00
Lionel Landwerlin	e6e5cbac04	iris: handle new PIPE_CONTROL field Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>	2020-01-16 11:48:11 +02:00
Lionel Landwerlin	31f0af5568	genxml: add new Gen11+ PIPE_CONTROL field PIPE_CONTROL gained a new field in its first DWORD on Gen11. We had no use for it so far, but we start using it on Gen12. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>	2020-01-16 11:48:04 +02:00
Kenneth Graunke	e3405f177b	st/mesa: Allocate full miplevels if MaxLevel is explicitly set Some applications explicitly call glTex[ture]Parameteri[v] to set GL_TEXTURE_MAX_LEVEL and GL_TEXTURE_BASE_LEVEL before uploading any texture data. Core Mesa initializes MaxLevel to 1000, so if it isn't that, we know they've set it. (We check for < TEXTURE_MAX_LEVELS to avoid hardcoding that value, however.) If MaxLevel - BaseLevel > 0, then the app is trying to tell us that this texture is going to have multiple miplevels. In that case, go ahead and allocate the space for it. Avoids many resource_copy_region calls at texture finalization time in the Civilization VI benchmark. Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3401> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3401>	2020-01-16 00:06:54 -08:00
Samuel Pitoiset	68abc07317	aco: fix emitting SMEM instructions with no operands on GFX6-GFX7 Like s_memtime. Fixes dEQP-VK.glsl.shader_clock.* on GFX6. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3407> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3407>	2020-01-16 08:18:18 +01:00
Vasily Khoruzhick	e5226cff75	lima: fix handling of reverse depth range Looks like we need to handle cases when near > far and near == far. In the first case we just need to swap near and far, and in the second we need to subtract epsilon from near if it's not zero. Fixes 10 tests in dEQP-GLES2.functional.depth_range.* Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3400> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3400>	2020-01-16 01:57:05 +00:00
Ilia Mirkin	784b84d308	nvc0: disable xfb's which don't have a stride No stride / no attributes means that nothing is being written to the buffer. However it might still prevent primitives from being written out to the other buffers. Disabling it entirely seems to fix it. Fixes GTF-GL45.gtf30.GL3Tests.transform_feedback.transform_feedback_overflow Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2020-01-15 19:53:18 -05:00
Erico Nunes	9bf210ba98	lima/ppir: implement full liveness analysis for regalloc The existing liveness analysis in ppir still ultimately relies on a single continuous live_in and live_out range per register and was observed to be the bottleneck for register allocation on complicated examples with several control flow blocks. The use of live_in and live_out ranges was fine before ppir got control flow, but now it ends up creating unnecessary interferences as live_in and live_out ranges may span across entire blocks after blocks get placed sequentially. This new liveness analysis implementation generates a set of live variables at each program point; before and after each instruction and beginning and end of each block. This is a global analysis and propagates the sets of live registers across blocks independently of their sequence. The resulting sets optimally represent all variables that cannot share a register at each program point, so can be directly translated as interferences to the register allocator. Special care has to be taken with non-ssa registers. In order to properly define their live range, their alive components also need to be tracked. Therefore ppir can't use simple bitsets to keep track of live registers. The algorithm uses an auxiliary set data structure to keep track of the live registers. The initial implementation used only trivial arrays, however regalloc execution time was then prohibitive (>1minute on Cortex-A53) on extreme benchmarks with hundreds of instructions, hundreds of registers and several spilling iterations, mostly due to the n^2 complexity to generate the interferences from the live sets. Since the live registers set are only a very sparse subset of all registers at each instruction, iterating only over this subset allows it to run very fast again (a couple of seconds for the same benchmark). Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3358> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3358>	2020-01-15 22:55:31 +00:00
Erico Nunes	7e2765fded	lima/ppir: remove orphan load node after cloning There are some cases in shaders using control flow where the varying load is cloned to every block, and then the original node is left orphan. This is not harmful for program execution, but it complicates analysis for register allocation as there is now a case of writing to a register that is never read. While ppir doesn't have a dead code elimination pass for its own optimizations and it is not hard to detect when we cloned the last load, let's remove it early. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3358>	2020-01-15 22:55:31 +00:00
Kristian H. Kristensen	a3a73d116c	iris: Print warning and return *out = NULL when fd to syncobj fails Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-15 14:47:46 -08:00
Kristian H. Kristensen	1ac138694b	iris: Advertise PIPE_CAP_NATIVE_FENCE_FD Enables EGL_ANDROID_native_fence_sync. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-15 14:47:46 -08:00
Kenneth Graunke	e9f9a944d3	iris: Fix export of fences that have already completed. After flushing batches, iris_fence_flush() asks the kernel whether each batch's last_syncpt has already signalled or not. (The idea is that either the compute or render batch may not have actually had any work queued up, so last_syncpt there might have been signalled a long time ago.) If it's already completed, we don't bother to record it. A strange corner is the case of repeated flushes. For example, we might flush for some reason, and hit a glFlush(), and hit SwapBuffers. It's possible for all the batches to have been flushed previously, -and- for them to have actually completed. In this case, we'll see that there are no syncobj's to wait on, and record fence->count == 0. This works fine internally - fence_finish can see count == 0 and realize that it doesn't need to wait, for example. But when working with native FDs, we may be asked to export a fence with count == 0. So we need an actual synchronization primitive we can hand off. Because all of the relevant batches had been signalled when creating the fence, we want the new dummy fence to be signalled as well. So we just make a signalled syncobj and export it. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2020-01-15 14:47:46 -08:00
Robert Foss	6b9fce5d9e	android: Fix whitespace issue Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-15 22:30:17 +00:00
Robert Foss	62adb6522b	panfrost: Prefix schedule_program to prevent collision Currently the schedule_program implementation being used is picked at compile time, which on the Android platform means that the bifrost compiler & scheduler is used for all targets, including midgard based hardware. This commit disambiguates between the two schedule_program functions. Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-15 22:30:17 +00:00
Marek Olšák	c4daf2b485	radeonsi: merge si_compile_llvm and si_llvm_compile functions Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>	2020-01-15 21:54:55 +00:00
Marek Olšák	68586bdd21	radeonsi: remove useless #includes Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>	2020-01-15 21:54:55 +00:00
Marek Olšák	30b14ba67e	radeonsi: move code for shader resources into si_shader_llvm_resources.c Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>	2020-01-15 21:54:55 +00:00
Marek Olšák	da2c12af4b	radeonsi: move geometry shader code into si_shader_llvm_gs.c Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>	2020-01-15 21:54:55 +00:00
Marek Olšák	57bd73e229	radeonsi: remove llvm_type_is_64bit Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>	2020-01-15 21:54:55 +00:00
Marek Olšák	194449a405	radeonsi: move tessellation shader code into si_shader_llvm_tess.c Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>	2020-01-15 21:54:55 +00:00
Marek Olšák	d7c86b106c	radeonsi: move si_insert_input_* functions Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>	2020-01-15 21:54:55 +00:00
Marek Olšák	8ff8e68e42	radeonsi: work around an LLVM crash when using llvm.amdgcn.icmp.i64.i1 Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3338> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3338>	2020-01-15 20:17:23 +00:00
Marek Olšák	af3fbb410c	radeonsi: fix si_build_wrapper_function for compute-based primitive culling Fixes: `3b143369a5` "ac/nir, radv, radeonsi: Switch to using ac_shader_args" Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3338>	2020-01-15 20:17:23 +00:00
Marek Olšák	6d4993c942	radeonsi/gfx10: separate code for determining the number of vertices for NGG Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-15 15:06:34 -05:00
Marek Olšák	7a25521f92	radeonsi/gfx10: separate code for getting edgeflags from the gs_invocation_id VGPR Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-15 15:06:33 -05:00
Marek Olšák	cf65c6f0d2	radeonsi: move VS_STATE.LS_OUT_PATCH_SIZE a few bits higher to make space there Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-15 15:06:31 -05:00
Marek Olšák	34ef0c5083	radeonsi: make si_insert_input_* functions non-static Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-15 15:06:29 -05:00
Marek Olšák	eeb4a11c11	ac/cull: don't read Position.Z if it's not needed for culling It could be NULL. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-15 15:06:20 -05:00
Marek Olšák	8070402a30	radeonsi: separate code computing info for small primitive culling Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-15 14:59:11 -05:00
Kenneth Graunke	0a1c47074b	intel/compiler: Fix illegal mutation in get_nir_image_intrinsic_image get_nir_image_intrinsic_image() was incorrectly mutating the value held by the register which holds the intrinsic's first source (image index). If this happened to be the register for an SSA def which is also used elsewhere in the program, this meant that we would clobber that value in subsequent uses. Note that this only affects i965, because neither anv nor iris use the binding table start sections, so nothing is ever added here. Fixes KHR-GL46.compute_shader.resources-max on i965 with Eric Anholt's MR !3240 applied. That MR reorders SSBOs and ABOs, so that test uses image 0 and SSBO 0, causing this code to brilliantly add binding table index 45 to both the image (correct) and the SSBO (bzzt, wrong!). Fixes: `09f1de97a7` ("anv,i965: Lower away image derefs in the driver") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3404> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3404>	2020-01-15 19:25:35 +00:00
Rob Clark	b706a157c5	gitlab-ci: fix missing caselist.css/xsl My best guess is that this was broken by `d62dd8b0` Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3413> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3413>	2020-01-15 19:03:56 +00:00
Jason Ekstrand	af6c2f4193	relnotes: Add Vulkan 1.2	2020-01-15 09:25:51 -06:00
Samuel Pitoiset	7f5462e349	radv: enable Vulkan 1.2 This bumps the Vulkan version to 1.2.128. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	68d6bead78	radv: implement Vulkan 1.2 features and properties Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	b3033198a8	radv: implement Vulkan 1.1 features and properties Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	a09ab76828	radv: update VK_KHR_timeline_semaphore for Vulkan 1.2 Promoted to Vulkan 1.2 with the KHR suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	fab0aa9182	radv: update VK_KHR_uniform_buffer_standard_layout for Vulkan 1.2 Promoted to Vulkan 1.2 with the KHR suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	3ff8d12458	radv: update VK_KHR_shader_subgroup_extended_types for Vulkan 1.2 Promoted to Vulkan 1.2 with the KHR suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	af25c8d57b	radv: update VK_KHR_shader_float_controls for Vulkan 1.2 Promoted to Vulkan 1.2 with the KHR suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	5335bb6c39	radv: update VK_KHR_shader_float16_int8 for Vulkan 1.2 Promoted to Vulkan 1.2 with the KHR suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	a73d01b1db	radv: update VK_KHR_shader_atomic_int64 for Vulkan 1.2 Promoted to Vulkan 1.2 with the KHR suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	83d1773a57	radv: update VK_KHR_imageless_framebuffer for Vulkan 1.2 Promoted to Vulkan 1.2 with the KHR suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	b3bdb4e6ff	radv: update VK_KHR_image_format_list for Vulkan 1.2 Promoted to Vulkan 1.2 with the KHR suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	a80229941f	radv: update VK_KHR_driver_properties for Vulkan 1.2 Promoted to Vulkan 1.2 with the KHR suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	af883bf3dc	radv: update VK_KHR_draw_indirect_count for Vulkan 1.2 Promoted to Vulkan 1.2 with the KHR suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	b537be4368	radv: update VK_KHR_depth_stencil_resolve for Vulkan 1.2 Promoted to Vulkan 1.2 with the KHR suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	5993f13b27	radv: update VK_KHR_create_renderpass2 for Vulkan 1.2 Promoted to Vulkan 1.2 with the KHR suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	b2be00fbc1	radv: update VK_KHR_buffer_device_address for Vulkan 1.2 Promoted to Vulkan 1.2 with the KHR suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	0eb26aae1c	radv: update VK_KHR_8bit_storage for Vulkan 1.2 Promoted to Vulkan 1.2 with the KHR suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	b4eed4e548	radv: update VK_EXT_scalar_block_layout for Vulkan 1.2 Promoted to Vulkan 1.2 with the EXT suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	efdf9d8969	radv: update VK_EXT_sampler_filter_minmax for Vulkan 1.2 Promoted to Vulkan 1.2 with the EXT suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	65e215e6f3	radv: update VK_EXT_host_query_reset for Vulkan 1.2 Promoted to Vulkan 1.2 with the EXT suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Samuel Pitoiset	95ec0c050b	radv: update VK_EXT_descriptor_indexing for Vulkan 1.2 Promoted to Vulkan 1.2 with the EXT suffix omitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-15 08:42:25 -06:00
Iván Briano	4ef3f7e3d3	anv: Enable Vulkan 1.2 support Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-01-15 08:34:57 -06:00
Jason Ekstrand	c616627f63	anv: Implement the new core version property queries Vulkan 1.2 introduces some new structures to get the properties and features of a device from extensions that were promoted to core in 1.1 and 1.2. This commit implements the new property queries and makes all of the corresponding extension queries map to them. Reviewed-by: Iván Briano <ivan.briano@intel.com>	2020-01-15 08:34:57 -06:00
Jason Ekstrand	a47152c622	anv: Implement the new core version feature queries Vulkan 1.2 introduces some new structures to get the properties and features of a device from extensions that were promoted to core in 1.1 and 1.2. This commit implements the new feature queries and makes all of the corresponding extension queries map to them. Reviewed-by: Iván Briano <ivan.briano@intel.com>	2020-01-15 08:34:57 -06:00
Jason Ekstrand	721666e52a	anv,nir: Lower quad_broadcast with dynamic index in NIR This is required for the subgroupBroadcastDynamicId feature that was added in Vulkan 1.2. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-15 08:34:57 -06:00
Jason Ekstrand	7e3e2ce702	anv: Bump the patch version to 131	2020-01-15 08:34:57 -06:00
Samuel Pitoiset	f33a68af63	vulkan/overlay: Fix for Vulkan 1.2 v2 (Jason Ekstrand): - Add duplicate hooks for both the 1.2 and KHR versions of vkCmdDraw[Indexed]IndirectCount. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2020-01-15 08:34:57 -06:00
Jason Ekstrand	75755e0eba	turnip: Pretend to support Vulkan 1.2 It doesn't really support any Vulkan properly yet so why not claim 1.2? This was an easier way of fixing the build than trying to roll it forward to a later version of ANV's entrypoint generator scripts.	2020-01-15 08:34:57 -06:00
Jason Ekstrand	ac0c7ad2c2	vulkan: Update the XML and headers to 1.2.131 Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2020-01-15 08:07:04 -06:00
Michel Dänzer	8775b742ea	gitlab-ci: Stop using manual jobs for merge requests They were causing trouble with Marge Bot: The project settings require that the pipeline succeeds before a merge request (MR) can be merged, otherwise Marge doesn't wait for the pipeline to succeed before merging an MR assigned to her. But Marge can't start manual jobs, so she would always time out waiting for pipelines with manual jobs. To avoid this, use these rules: * Run the pipeline by default for MRs and main project branches changing any files affecting it. * For other MRs, run a single dummy job which always succeeds. * Don't run any jobs for main project branch changes (e.g. from an MR having been merged) not affecting the pipeline. * Allow jobs to be started manually on branches of forked projects, as before. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3361> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3361>	2020-01-15 10:31:01 +00:00
Pierre-Eric Pelloux-Prayer	7b0b085c94	radeonsi: drop the negation from fmask_is_not_identity This change eases code reading ("fmask_is_identity = true" is clearer than "fmask_is_not_identity = false"). Initialization is not changed so fmask_is_identity is false when a texture is created. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>	2020-01-15 10:10:15 +00:00
Pierre-Eric Pelloux-Prayer	3a527eda7c	radeonsi: unbind image before compute clear The image is not used, and unbinding it avoids infinite recursion when called from si_compute_expand_fmask. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>	2020-01-15 10:10:15 +00:00
Pierre-Eric Pelloux-Prayer	c2df5389bb	radeonsi: make sure fmask expand is done if needed Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2248 Fixes: `095a58204d` ("radeonsi: expand FMASK before MSAA image stores are used") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>	2020-01-15 10:10:15 +00:00
Pierre-Eric Pelloux-Prayer	b5e748b49b	radeonsi: fix fmask expand compute shader 'coord' variable was using TGSI_WRITEMASK_XYZ so subsequent uses of TGSI_WRITEMASK_W were dropped. The result for a 2 samples program was: 0: UMAD TEMP[0].xy, SV[1].xyyy, IMM[0].xxxx, SV[0].xyyy 1: STORE IMAGE[0], TEMP[0], TEMP[1], RESTRICT, 2D_MSAA 2: STORE IMAGE[0], TEMP[0], TEMP[2], RESTRICT, 2D_MSAA 3: END instead of the expected: 0: UMAD TEMP[0].xy, SV[1].xyyy, IMM[0].xxxx, SV[0].xyyy 1: MOV TEMP[0].w, IMM[0].yyyy 2: LOAD TEMP[1], IMAGE[0], TEMP[0], RESTRICT, 2D_MSAA 3: MOV TEMP[0].w, IMM[0].zzzz 4: LOAD TEMP[2], IMAGE[0], TEMP[0], RESTRICT, 2D_MSAA 5: MOV TEMP[0].w, IMM[0].yyyy 6: STORE IMAGE[0], TEMP[0], TEMP[1], RESTRICT, 2D_MSAA 7: MOV TEMP[0].w, IMM[0].zzzz 8: STORE IMAGE[0], TEMP[0], TEMP[2], RESTRICT, 2D_MSAA 9: END This fixes half of https://gitlab.freedesktop.org/mesa/mesa/issues/2248 Fixes: `095a58204d` ("radeonsi: expand FMASK before MSAA image stores are used") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>	2020-01-15 10:10:15 +00:00
Nataraj Deshpande	be08e6a449	egl/android: Restrict minimum triple buffering for android color_buffers The patch restricts triple buffering as minimum at driver for android color_buffers in order to fix onscreen performance hit for T-Rex and Manhattan. v2: Update min_buffer check condition (Tapani Pälli) v3: further code cleanup (Eric Engestrom) Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2332 Fixes: `0661c357c6` ("egl/android: Update color_buffers querying for buffer age") Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3384> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3384>	2020-01-15 09:42:08 +00:00
Lionel Landwerlin	a014105498	anv: fix pipeline switch back for non pipelined states Setting state base address can happen even before pipeline is selected. Also we must ensure it is set to 3D for Gen12, we can't switch back to an invalid pipeline value (UINT32_MAX). v2: Reuse helpers (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `b34422db5e` ("anv: Implement Gen12 workaround for non pipelined state") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3396> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3396>	2020-01-15 11:14:43 +02:00
Samuel Pitoiset	fce28a7341	radv/gfx10: simplify some duplicated NGG GS code Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3382> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3382>	2020-01-15 07:45:29 +00:00
Samuel Pitoiset	53b50be35c	radv/gfx10: enable all CUs if NGG is never used Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3382>	2020-01-15 07:45:29 +00:00
Samuel Pitoiset	5ff12322c9	radv: only use VkSamplerCreateInfo::compareOp if enabled Cc: <mesa-stable@lists.freedesktop.org> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2350 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3392> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3392>	2020-01-15 08:16:15 +01:00
Iago Toral Quiroga	3f3ec07be5	v3d: fix bug when checking result of syncobj fence import Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3383> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3383>	2020-01-15 07:53:58 +01:00
Jonathan Marek	222e127e39	st/mesa: run st_nir_lower_tex_src_plane for lowered xyuv/ayuv Has the effect of removing the nir_tex_src_plane for these formats too. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1896> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1896>	2020-01-15 02:20:00 +00:00
Jonathan Marek	a554b45d73	st/mesa: don't lower YUV when driver supports it natively This fixes YUYV support on etnaviv. Fixes: `7404833c` "gallium: add handling for YUV planar surfaces" Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1896>	2020-01-15 02:20:00 +00:00
Bas Nieuwenhuizen	4e3c81517b	radv: Disable VK_EXT_sample_locations on GFX10. Workaround for https://gitlab.freedesktop.org/mesa/mesa/issues/2163 CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3236> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3236>	2020-01-15 01:54:27 +00:00
Gurchetan Singh	6c978b1362	st/mesa: implement EGLImageTargetTexStorage We can now support this extension. Acked-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3375> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3375>	2020-01-15 01:18:54 +00:00
Gurchetan Singh	2f1032f8f2	st/mesa: refactor egl image binding a bit We'll need it for egl image tex storage. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3375>	2020-01-15 01:18:54 +00:00
Gurchetan Singh	be347863ba	st/dri: track if image is created by a dmabuf Will be used by EXT_EGL_image_storage later. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3375>	2020-01-15 01:18:54 +00:00
Rob Clark	2629cb627c	freedreno/ir3: rename instructions Turns out this range of opcodes are more general purpose if/else/endif instructions. We should re-work tess to create a basic block and use normal flow control. And possibly (for a6xx+) optimize cases to use if/else/endif when appropriate. Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3398> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3398>	2020-01-15 00:56:24 +00:00
Elie Tournier	22c5c54a4f	nir/algebraic: sqrt(x)*sqrt(x) -> fabs(x) total instructions in shared programs: 12840840 -> 12839341 (-0.01%) instructions in affected programs: 122581 -> 121082 (-1.22%) helped: 559 HURT: 0 total cycles in shared programs: 302505756 -> 302490031 (<.01%) cycles in affected programs: 2022900 -> 2007175 (-0.78%) helped: 1090 HURT: 130 Signed-off-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/948> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/948>	2020-01-15 00:30:52 +00:00
Elie Tournier	6f394343b1	nir/algebraic: i2f(f2i()) -> trunc() total instructions in shared programs: 12840968 -> 12840784 (<.01%) instructions in affected programs: 17886 -> 17702 (-1.03%) helped: 77 HURT: 0 total cycles in shared programs: 302508917 -> 302505592 (<.01%) cycles in affected programs: 249964 -> 246639 (-1.33%) helped: 70 HURT: 7 Signed-off-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/948>	2020-01-15 00:30:52 +00:00
Eric Anholt	3d9a3d0be0	i965: Reuse the new core glsl_count_dword_slots(). The only difference I could see was treating interfaces like structs. Maintain that case. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>	2020-01-14 23:55:00 +00:00
Eric Anholt	bc4f089d01	mesa/st: Move the dword slot counting function to glsl_types as well. To implement NIR-to-TGSI, we need to be able to get the size of the uniform variable for the TGSI declaration, not just the .driver_location. With its location in mesa/st, drivers couldn't link to it from nir-to-tgsi. This feels like a common enough function to want, so let's share it in the core compiler. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>	2020-01-14 23:55:00 +00:00
Eric Anholt	4cabd4812a	mesa/prog: Reuse count_vec4_slots() from ir_to_mesa. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>	2020-01-14 23:55:00 +00:00
Eric Anholt	74ee3f76de	mesa/st: Move the vec4 type size function into core GLSL types. The only bit that gallium varied on was handling of bindless. We can retain previous behavior for count_attribute_slots() by passing in "true" (though I suspect this is just giving a silly answer to a silly question), and delete our recursive function from mesa/st. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>	2020-01-14 23:55:00 +00:00
Eric Anholt	b807f7a43a	mesa/st: Deduplicate the NIR uniform lowering code. Just a little refactor as I go looking at the type size functions. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>	2020-01-14 23:55:00 +00:00
Marek Olšák	8832a88434	radeonsi: move PS LLVM code into si_shader_llvm_ps.c This is an attempt to clean up si_shader.c. v2: don't move code that is not specific to LLVM Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1)	2020-01-14 18:46:07 -05:00
Marek Olšák	9b60b3ce93	radeonsi: remove always constant ballot_mask_bits from si_llvm_context_init Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2020-01-14 18:46:07 -05:00
Marek Olšák	37916a66b1	radeonsi: fold si_create_function into si_llvm_create_func Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2020-01-14 18:46:07 -05:00
Marek Olšák	42112010a3	radeonsi: rename si_shader_create -> si_create_shader_variant for clarity Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2020-01-14 18:46:07 -05:00
Marek Olšák	63b5d85baa	radeonsi: rename si_compile_tgsi_main -> si_build_main_function Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2020-01-14 18:46:07 -05:00
Marek Olšák	f4ba457e1e	radeonsi: clean up si_shader_info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2020-01-14 18:46:07 -05:00
Marek Olšák	03950473df	radeonsi: merge si_tessctrl_info into si_shader_info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2020-01-14 18:46:07 -05:00
Marek Olšák	5fa2ab831e	radeonsi: fork tgsi_shader_info and tgsi_tessctrl_info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2020-01-14 18:46:07 -05:00
Marek Olšák	18aaceae8d	radeonsi: rename si_shader_info -> si_shader_binary_info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2020-01-14 18:46:07 -05:00
Marek Olšák	7f4a54d5bd	radeonsi: remove TGSI from comments Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2020-01-14 18:46:07 -05:00
Marek Olšák	b1badf4ad6	radeonsi: rename DBG_NO_TGSI -> DBG_NO_NIR Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2020-01-14 18:46:07 -05:00
Marek Olšák	b144d4be74	radeonsi: don't adjust depth and stencil PS output locations this was for compatibility with TGSI Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2020-01-14 18:46:07 -05:00
Caio Marcelo de Oliveira Filho	3cc501be69	nir: Add missing nir_var_mem_global to various passes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3322> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3322>	2020-01-14 14:42:12 -08:00
Caio Marcelo de Oliveira Filho	d8440a3d2f	spirv: Handle PhysicalStorageBuffer in memory barriers PhysicalStorageBuffer is lowered to nir_var_mem_global, and SPIR-V 1.5rev1 in section "3.25. Memory Semantics <id>" says of UniformMemory: "Apply the memory-ordering constraints to StorageBuffer, PhysicalStorageBuffer, or Uniform Storage Class memory." Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3322>	2020-01-14 14:42:12 -08:00
Caio Marcelo de Oliveira Filho	1ec0d4fdff	spirv: Drop EXT for PhysicalStorageBuffer symbols Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3322>	2020-01-14 14:42:12 -08:00
Timur Kristóf	dfaa3c0af6	aco: Flip s_cbranch / s_cselect to optimize out an s_not if possible. When possible, get rid of an s_not when all it does is invert the SCC, and its successor s_cbranch / s_cselect can be inverted instead. Also modify some parts of instruction_selection to take advantage of this feature. Example: s2: %3900, s1: %3899:scc = s_andn2_b64 %0:exec, %406 s2: %3902 = s_cselect_b64 -1, 0, %3900:scc s2: %407, s1: %3903:scc = s_not_b64 %3902 s2: %3906, s1: %3905:scc = s_and_b64 %407, %0:exec p_cbranch_z %3905:scc Can now be optimized to: s2: %3900, s1: %3899:scc = s_andn2_b64 %0:exec, %406 p_cbranch_nz %3900:scc Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2020-01-14 21:21:06 +01:00
Timur Kristóf	c0f82165a7	aco: Optimize out s_and with exec, when used on uniform bitwise values. Previously all booleans needed an s_and with exec when they were turned into a scalar condition. However, this is not needed for uniform booleans. v2 by Daniel Schürmann: - Make the code more readable v3 by Timur Kristóf: - Fix regressions, make it work in wave32 mode Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2020-01-14 21:21:06 +01:00
Timur Kristóf	1c44129db3	aco: Don't skip combine_instruction when definitions[1] is used. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2020-01-14 21:21:06 +01:00
Timur Kristóf	338d03090f	aco: Allow optimizing vote_all and nir_op_iand. By adding an extra instruction, we can replace the operands of the s_cselect_b64, which allows it to get picked up by the optimizer when it looks for uniform booleans. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2020-01-14 21:21:06 +01:00
Timur Kristóf	d962bbd895	aco: Implement 64-bit constant propagation. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2020-01-14 21:21:06 +01:00
Alyssa Rosenzweig	6bd9c4dc57	panfrost: Fix linear depth textures As pointed out by Boris, what we were calling PAN_LINEAR depth textures was in fact u-interleaved tiled (!), but we never noticed since we flipped the flag used for sampling, leading to all sorts of fun bugs when attempting to directly access depth textures from the CPU. Which begs the question -- if what we called LINEAR was tiled, how do we actually render linear depth textures? It turns out the flags for AFBC form a mali_block_format 2-bit code just like their render-target counterparts, so we can render to any of the above. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reported-by: Boris Brezillon <boris.brezillon@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3393> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3393>	2020-01-14 19:42:20 +00:00
Jason Ekstrand	7c16a1ae4e	vulkan/wsi: Add a driconf option to force WSI to advertise BGRA8_UNORM first The Aztec Ruins benchmark just grabs the first format in the list and SRGB causes it to render washed out. With this workaround, it renders the same as OpenGL. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3350> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3350>	2020-01-14 19:27:13 +00:00
Caio Marcelo de Oliveira Filho	edf6a40cb2	intel/fs: Only use SLM fence in compute shaders Fixes: `b390ff3517` ("intel/fs: Add support for SLM fence in Gen11") Fixes: `e142061399` ("intel/fs: Implement scoped_memory_barrier") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-01-14 10:55:48 -08:00
Marek Olšák	9e699ae690	radeonsi: actually enable VBOs in user SGPRs Fixes: `363b4027fc` - radeonsi: put up to 5 VBO descriptors into user SGPRs Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-14 13:42:36 -05:00
Marek Olšák	f341db3e17	radeonsi: fix assertion and other failures in si_emit_graphics_shader_pointers The assertion was failing. Fixes: `363b4027fc` - radeonsi: put up to 5 VBO descriptors into user SGPRs Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-14 13:42:36 -05:00
Rhys Perry	cc3ef3643a	nir/algebraic: a & ~(a >> 31) -> imax(a, 0) Found in some Doom shaders Totals from affected shaders: SGPRS: 30056 -> 30064 (0.03 %) VGPRS: 28024 -> 28024 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 4278648 -> 4270852 (-0.18 %) bytes Max Waves: 1476 -> 1476 (0.00 %) Instructions: 835287 -> 833338 (-0.23 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3089> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3089>	2020-01-14 17:54:40 +00:00
Marco Felsch	1607123ae7	etnaviv: Fix assert when trying to accumulate an invalid fd Check that the fd is valid before merging it into the context's fd. Signed-off-by: Marco Felsch <m.felsch@pengutronix.de> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3381> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3381>	2020-01-14 17:40:10 +00:00
Afonso Bordado	22217f24ec	pan/midgard: Fix midgard_compile.h includes We now use enum mali_format which is defined in panfrost-job.h Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3243> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3243>	2020-01-14 17:16:11 +00:00
Lionel Landwerlin	a19cdf989b	anv: only use VkSamplerCreateInfo::compareOp if enabled The spec says nothing about the validity of the compareOp field when compareEnable is false. v2: use vulkan enum to pick default value (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2350 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3387> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3387>	2020-01-14 16:40:16 +00:00
Rhys Perry	d8e05edbd9	nir/sink,nir/move: move/sink nir_op_mov Can uncover opportunities to move other instructions. This can increase register usage, but that doesn't seem to actually happen. This optimizes a pattern of a load_per_vertex_input followed by several moves and then a store_output in a different block. v2: add nir_move_copies to make it optional Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Acked-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420>	2020-01-14 13:56:45 +00:00
Rhys Perry	04fac72ec7	nir/sink,nir/move: move/sink load_per_vertex_input Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420>	2020-01-14 13:56:45 +00:00
Tomeu Vizoso	22d976454f	gitlab-ci: Consolidate container and build stages for LAVA Use the normal build job to also prepare the artifacts for LAVA jobs. For that, the build container needs to also build the test suites, kernel, ramdisk, etc. Then the build job will place the just-built Mesa in the ramdisk and the test job can generate a LAVA job and point to those artifacts. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3295> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3295>	2020-01-14 13:17:24 +00:00
Rhys Perry	f978e0e516	aco: add integer min/max to can_swap_operands Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	f92a89a979	aco: improve readfirstlane after uniform LDS loads Totals from affected shaders: SGPRS: 976 -> 968 (-0.82 %) VGPRS: 580 -> 584 (0.69 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 106032 -> 103076 (-2.79 %) bytes Max Waves: 237 -> 237 (0.00 %) Instructions: 19452 -> 18740 (-3.66 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	92ace0bb31	aco: replace extract_vector with copies Helps a small number of small shaders with situations like this: a = p_create_vector ... b = p_extract_vector a, 3 and copy propagation can't be done Totals from affected shaders: SGPRS: 14304 -> 14416 (0.78 %) VGPRS: 8716 -> 6592 (-24.37 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 184664 -> 176888 (-4.21 %) bytes Max Waves: 6260 -> 6260 (0.00 %) Instructions: 35561 -> 33617 (-5.47 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	20d869079d	aco: allow input modifiers on v_cndmask_b32 Totals from affected shaders: SGPRS: 594099 -> 594019 (-0.01 %) VGPRS: 441016 -> 441124 (0.02 %) Spilled SGPRs: 101 -> 101 (0.00 %) Spilled VGPRs: 18 -> 18 (0.00 %) Code Size: 30266652 -> 30125256 (-0.47 %) bytes Max Waves: 67044 -> 67057 (0.02 %) Instructions: 5753097 -> 5726607 (-0.46 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	f9405ceb8a	aco: don't move literal to reg when making an instruction VOP3 on GFX10 pipeline-db (Navi): Totals from affected shaders: SGPRS: 163398 -> 163398 (0.00 %) VGPRS: 143820 -> 143820 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 13065744 -> 13044308 (-0.16 %) bytes Max Waves: 18921 -> 18921 (0.00 %) Instructions: 2514644 -> 2509285 (-0.21 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	e686e4765e	aco: add min(-max(), ) and max(-min(), ) optimization No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	fa8357eb70	aco: improve clamp optimization Not sure why it checked the use count, it doesn't apply the constants. pipeline-db (Navi): Totals from affected shaders: SGPRS: 269409 -> 269745 (0.12 %) VGPRS: 238120 -> 238132 (0.01 %) Spilled SGPRs: 305 -> 305 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 22908584 -> 22904672 (-0.02 %) bytes Max Waves: 20217 -> 20217 (0.00 %) Instructions: 4275312 -> 4263869 (-0.27 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 155409 -> 155233 (-0.11 %) VGPRS: 153072 -> 153072 (0.00 %) Spilled SGPRs: 269 -> 269 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 14650824 -> 14650396 (-0.00 %) bytes Max Waves: 9609 -> 9609 (0.00 %) Instructions: 2762802 -> 2755517 (-0.26 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	edc888ccb1	aco: fix clamp optimization We can't do the optimization if there are neg/abs in-between. No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	f664cb01ec	aco: improve creation of v_madmk_f32/v_madak_f32 Using needs_vop3 check was flawed because it would only combine the literal if the first operand is the literal. If the second or third operand is the literal, then needs_vop3 will be true and the literal will not be combined. pipeline-db (Navi): Totals from affected shaders: SGPRS: 782051 -> 782051 (0.00 %) VGPRS: 630048 -> 630048 (0.00 %) Spilled SGPRs: 195 -> 195 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 54743740 -> 54585548 (-0.29 %) bytes Max Waves: 67340 -> 67340 (0.00 %) Instructions: 10182030 -> 10182030 (0.00 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 701990 -> 699590 (-0.34 %) VGPRS: 566632 -> 566784 (0.03 %) Spilled SGPRs: 218 -> 218 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 49173564 -> 49007856 (-0.34 %) bytes Max Waves: 59650 -> 59612 (-0.06 %) Instructions: 9315135 -> 9293330 (-0.23 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	15e25da3e5	aco: take advantage of GFX10's constant bus limit and VOP3 literals pipeline-db (Navi): Totals from affected shaders: SGPRS: 2397159 -> 2392494 (-0.19 %) VGPRS: 1756036 -> 1753920 (-0.12 %) Spilled SGPRs: 461 -> 470 (1.95 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 110287304 -> 109946304 (-0.31 %) bytes Max Waves: 318341 -> 318475 (0.04 %) Instructions: 21019327 -> 20533618 (-2.31 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 0 -> 0 (0.00 %) VGPRS: 0 -> 0 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 0 -> 0 (0.00 %) bytes Max Waves: 0 -> 0 (0.00 %) Instructions: 0 -> 0 (0.00 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	9c2d37308f	aco: allow an extra SGPR with multiple uses to be applied to VOP3 This is in a separate patch from the apply_sgprs() rewrite so that the rewrite can be more easily tested. pipeline-db (Navi): Totals from affected shaders: SGPRS: 3056 -> 3056 (0.00 %) VGPRS: 1632 -> 1632 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 156468 -> 156304 (-0.10 %) bytes Max Waves: 288 -> 288 (0.00 %) Instructions: 29510 -> 29469 (-0.14 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 2984 -> 2984 (0.00 %) VGPRS: 1616 -> 1616 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 156132 -> 155968 (-0.11 %) bytes Max Waves: 289 -> 289 (0.00 %) Instructions: 29426 -> 29385 (-0.14 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	f4c2c90e1a	aco: allow applying two sgprs to an instruction We could create VALU instructions which read two sgprs, but only if isel created an instruction which already read one of them. This change is in a separate patch from the apply_sgprs() rewrite so that it can be tested if the rewrite affected anything. pipeline-db (Navi): Totals from affected shaders: SGPRS: 216 -> 216 (0.00 %) VGPRS: 64 -> 64 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 1756 -> 1708 (-2.73 %) bytes Max Waves: 120 -> 120 (0.00 %) Instructions: 312 -> 300 (-3.85 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 216 -> 216 (0.00 %) VGPRS: 64 -> 64 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 1784 -> 1736 (-2.69 %) bytes Max Waves: 120 -> 120 (0.00 %) Instructions: 319 -> 307 (-3.76 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	7da07ca3e4	aco: follow through temporary when merging tests into constant comparisons This can happen with v_mov_b32(s_mov_b32(literal)) pipeline-db (Navi): Totals from affected shaders: SGPRS: 632 -> 632 (0.00 %) VGPRS: 492 -> 492 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 77488 -> 76928 (-0.72 %) bytes Max Waves: 67 -> 67 (0.00 %) Instructions: 14426 -> 14332 (-0.65 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 632 -> 632 (0.00 %) VGPRS: 492 -> 492 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 77512 -> 76952 (-0.72 %) bytes Max Waves: 67 -> 67 (0.00 %) Instructions: 14432 -> 14338 (-0.65 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	dc6c35e1c3	aco: be more careful with literals in combine_salu_{n2,lshl_add} No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	fcf52eb42d	aco: add check_vop3_operands() This will be useful when taking advantage of GFX10 features. No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	cef7879719	aco: rewrite apply_sgprs() This will make it easier to apply two different sgprs (for GFX10) or apply the same sgpr twice (just remove the break). No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	0be7409069	aco: rewrite literal combining Should make taking advantage of GFX10's increased constant bus limit and VOP3 literals easier. No pipeline-db changes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	84b9f3786b	aco: improve can_use_VOP3() No pipeline-db changes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	3cb98ed939	aco: combine two sgprs into a VALU if they're the same This was supposed to be done before but it wasn't done correctly and everywhere. pipeline-db (Navi): Totals from affected shaders: SGPRS: 784680 -> 786128 (0.18 %) VGPRS: 574012 -> 573892 (-0.02 %) Spilled SGPRs: 461 -> 461 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 45477088 -> 45478172 (0.00 %) bytes Max Waves: 81294 -> 81277 (-0.02 %) Instructions: 8657970 -> 8622483 (-0.41 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 780664 -> 782072 (0.18 %) VGPRS: 573880 -> 573760 (-0.02 %) Spilled SGPRs: 629 -> 629 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 45445244 -> 45448340 (0.01 %) bytes Max Waves: 81178 -> 81161 (-0.02 %) Instructions: 8649902 -> 8614918 (-0.40 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	c240c1aecf	aco: apply literals to split mads Removing the return is also needed to apply literals to mads (which can be done on GFX10). pipeline-db (Navi): Totals from affected shaders: SGPRS: 368787 -> 367555 (-0.33 %) VGPRS: 312436 -> 312448 (0.00 %) Spilled SGPRs: 461 -> 461 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 26113388 -> 26098260 (-0.06 %) bytes Max Waves: 35982 -> 35982 (0.00 %) Instructions: 5038670 -> 5028941 (-0.19 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 369843 -> 368659 (-0.32 %) VGPRS: 317224 -> 317196 (-0.01 %) Spilled SGPRs: 629 -> 629 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 26310540 -> 26295156 (-0.06 %) bytes Max Waves: 36324 -> 36326 (0.01 %) Instructions: 5073957 -> 5064164 (-0.19 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	8f10e48745	aco: update IR validator GFX10 increased the constant bus limit and allowed literals on VOP3 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:27 +00:00
Rhys Perry	1ffacc3ce1	nir/lower_gs_intrinsics: add option for per-stream counts Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2422> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2422>	2020-01-14 12:11:14 +00:00
Rhys Perry	9fb0c2e033	nir/divergence: handle load_primitive_id in GS Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2323> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2323>	2020-01-14 11:29:44 +00:00
Erik Faye-Lund	9aab36b6eb	mesa/st: use float literals This removes a warning on MSVC. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2020-01-14 12:01:29 +01:00
Erik Faye-Lund	fcdd3c866b	gallium: fix a warning On some platforms (like Win64), unsigned long is 32-bit, so the first cast doesn't do anything, and the compiler complains about an implicit cast to a smaller type. So let's cast to an uintptr_t instead first, as that's large enough on all platforms. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2020-01-14 12:01:05 +01:00
Erik Faye-Lund	1a1e5a763a	st/wgl: eliminate implicit cast warning I get warnings on MSVC for these implicit casts. Let's use explicit casts instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2020-01-14 12:01:00 +01:00
Erik Faye-Lund	d5c0fbfd78	util: initialize float-array with float-literals We currently initialize this float-array with double-literals. Some compilers generate warnings for this, so let's switch these to float-literals instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2020-01-14 12:00:27 +01:00
Lionel Landwerlin	b34422db5e	anv: Implement Gen12 workaround for non pipelined state Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365>	2020-01-14 11:52:36 +02:00
Lionel Landwerlin	b8fbb39ab2	iris: Implement Gen12 workaround for non pipelined state Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365>	2020-01-14 11:52:36 +02:00
Vasily Khoruzhick	55b0aa436e	lima: add new findings to texture descriptor Lower 8 bits of unknown_1_3 seems to be min_lod, rest of 4 bits + miplevels are max_lod and min_mipfilter seems to be lod bias. All are in fixed format with 4 bit integer and 4 bit fraction, lod_bias also has sign bit. Blob also seems to do some magic with lod_bias if min filter is nearest -- it adds 0.5 to lod_bias in this case. Same story when all filters are nearest and mipmapping is enabled, but in this case it subtracts 1/16 from lod_bias. Fixes 134 dEQP tests in dEQP-GLES2.functional.texture.* Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3359> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3359>	2020-01-13 22:50:36 -08:00
Kenneth Graunke	a9bd0668d5	intel: Use similar brand strings to the Windows drivers This updates our product name strings to match the ones reported by the Windows driver, which is typically the marketing name. We retain a platform abbreviation and GT level in parenthesis so that we're able to distinguish similar parts more easily, helping us better understand at a glance which GPU a bug reporter has. Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371>	2020-01-13 19:42:35 -08:00
Kenneth Graunke	f63d6260d1	iris: Simplify iris_get_renderer_string() We use gen_get_device_name() instead of PCI ID list munging. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371>	2020-01-13 19:42:30 -08:00
Kenneth Graunke	44bad9c31a	i965: Simplify brw_get_renderer_string() This stops using driGetRendererString() in favor of a simple snprintf(). This should have the same functionality on 64-bit systems, but drops a "x86/MMX/SSE2" suffix on 32-bit systems. (People shouldn't be using the GL_RENDERER string to check for CPU features...) We also use gen_get_device_name() instead of PCI ID list munging. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371>	2020-01-13 19:42:22 -08:00
Kenneth Graunke	50c47ba49e	Revert "nir: assert that nir_lower_tex runs after lowering derefs" This reverts commit `4cda61f11e` for now, as it appears to break i965 CI (32,000+ failures). Rob and I suspect we need to do the equivalent of `1c6a2efa06` on i965 - we are doing nir_lower_tex and brw_nir_lower_resources in the wrong order and that's likely triggering this condition. Once we fix that, we should put this patch back.	2020-01-13 17:37:40 -08:00
Erik Faye-Lund	09b37ba65f	zink: fixup initialization of operand_mask / num_extra_operands This doesn't change behavior, but makes the code a bit easier to read. Both values are zero, but I somehow swapped the logical meaning of them when initializing.	2020-01-14 01:06:59 +00:00
Eric Anholt	3be4b89c03	mesa: Fix detection of invalidating both depth and stencil. Fixes an extra 1024x1024x4 MSAA Z/S store on WebGL fishtank on cheza. Reported-by: Dave Airlie <airlied@redhat.com> Fixes: `db2ae51121` ("mesa: Skip partial InvalidateFramebuffer of packed depth/stencil.") Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3370> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3370>	2020-01-13 23:37:54 +00:00
Rob Clark	1c6a2efa06	mesa/st: lower samplers before nir_lower_tex Fixes incorrect lowering of YUV samplers when there are non-yuv samplers. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3368> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3368>	2020-01-13 23:19:49 +00:00
Rob Clark	4cda61f11e	nir: assert that nir_lower_tex runs after lowering derefs It isn't going to do the right thing, because texture_index/ sampler_index defaults to zero. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3368>	2020-01-13 23:19:49 +00:00
Gurchetan Singh	d72f178753	i965: support EXT_EGL_image_storage i965 can support this. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-13 14:57:36 -08:00
Gurchetan Singh	b1c266d5fa	i965: refactor intel_image_target_texture_2d intel_image_target_texture_tex_storage can reuse much of this code. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-13 14:57:32 -08:00
Gurchetan Singh	34fe560cd6	i965: track if image is created by a dmabuf Will be used by EXT_EGL_image_storage later. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-13 14:57:27 -08:00
Gurchetan Singh	bf576772ab	dri_util: add driImageFormatToSizedInternalGLFormat function This is needed to implement the EXT_EGL_image_storage spec: "If <target> is GL_TEXTURE_2D, then the resultant texture must have a sized internal format which is colorspace and size compatible with the dma-buf. If the GL is unable to determine such a format, the error INVALID_OPERATION is generated." Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-13 14:57:22 -08:00
Gurchetan Singh	b68ff2b873	glapi / teximage: implement EGLImageTargetTexStorageEXT Check various parts of the EXT_EGL_image_storage spec, and add a new vfunc for drivers implementing it. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-13 14:57:18 -08:00
Gurchetan Singh	1fe23d0e22	teximage: split out helper from EGLImageTargetTexture2DOES The major differences between EXT_EGL_image_storage and EGLImageTargetTexture2DOES are: (1) The texture target is made immutable (2) EXT_EGL_image_storage supports non-2D targets. We can reuse EGLImageTargetTexture2D and FreeTextureImageBuffer for (1) pretty easily. For (2), let's just not support the complicated targets. Let's reuse aspects of the EGLImageTargetTexture2DOES implementation. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-13 14:57:07 -08:00
Jason Ekstrand	7978f2401b	anv: Memset array properties This is probably better than possibly leaving those bytes uninitialized even if the app will theoretically not use them. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3369> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3369>	2020-01-13 22:33:55 +00:00
Jason Ekstrand	d36eed3e69	anv: Don't over-advertise descriptor indexing features We should only advertise sub-features if we advertise the extension. Fixes: `6e230d7607` "anv: Implement VK_EXT_descriptor_indexing" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3369>	2020-01-13 22:33:55 +00:00
Jason Ekstrand	d7ff137445	intel/blorp: Fill out all the dwords of MI_ATOMIC This makes us valgrind clean again. Fixes: `9175c7058e` "intel/blorp: Make blorp update the clear color..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3366> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3366>	2020-01-13 21:48:00 +00:00
Tomeu Vizoso	40dd418e14	gitlab-ci: Upgrade kernel for LAVA jobs to v5.5-rc5 Some fixes got in that should prevent hangs in lima jobs. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3363> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3363>	2020-01-13 21:26:11 +00:00
Daniel Schürmann	05c81875d7	aco: fix unconditional demote_to_helper This patch fixes an out-of-bounds access on p_exit_early and binds the exec register to the correct operand. Fixes: `2ea9e59e8d` ('aco: move s_andn2_b64 instructions out of the p_discard_if') Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3347> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3347>	2020-01-13 21:08:41 +00:00
Marek Olšák	2bb88b2fdc	radeonsi: don't enable VBOs in user SGPRs if compute-based culling can be used Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-13 15:57:07 -05:00
Marek Olšák	363b4027fc	radeonsi: put up to 5 VBO descriptors into user SGPRs gfx6-8: 1 VBO descriptor in user SGPRs gfx9-10: 5 VBO descriptors in user SGPRs We no longer pull up to 5 VBO descriptors from GTT when SDMA is disabled. Totals from affected shaders: SGPRS: 1110528 -> 1170528 (5.40 %) VGPRS: 952896 -> 951936 (-0.10 %) Spilled SGPRs: 83 -> 61 (-26.51 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 23766296 -> 22843920 (-3.88 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 179344 -> 179344 (0.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-13 15:57:07 -05:00
Marek Olšák	220d00314f	ac,radeonsi: increase the maximum number of shader args and return values Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-13 15:57:07 -05:00
Marek Olšák	ef253c6789	radeonsi: simplify si_set_vertex_buffers Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-13 15:57:07 -05:00
Marek Olšák	312e04689a	radeonsi: don't allow draw calls with uninitialized VS inputs These always hang, because vertex buffer descriptors are not set up. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-13 15:57:07 -05:00
Marek Olšák	c278c73f13	radeonsi: add si_context::num_vertex_elements Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-13 15:57:07 -05:00
Marek Olšák	1e03b63b3b	radeonsi: rename desc_list_byte_size -> vb_desc_list_alloc_size Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-13 15:57:07 -05:00
Lionel Landwerlin	2cc14bd7b8	anv: set stencil layout for input attachments If an input attachment has a stencil format, we need to set this. v2: Fish out VkAttachmentReferenceStencilLayoutKHR from VkAttachmentReference2KHR::pNext (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `c1c346f166` ("anv: implement VK_KHR_separate_depth_stencil_layouts") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2891> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2891>	2020-01-13 21:57:33 +02:00
Jason Ekstrand	21bc16a723	anv: Drop an unused variable	2020-01-13 12:20:48 -06:00
Jason Ekstrand	d3737002ee	nir/lower_atomics_to_ssbo: Also lower barriers This is more correct for a pass which is supposed to completely lower away atomic counters. It also lets us stop supporting atomic counter barriers in most of the drivers. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:47 +00:00
Jason Ekstrand	e40b11bbcb	nir: Rename nir_intrinsic_barrier to control_barrier This is a more explicit name now that we don't want it to be doing any memory barrier stuff for us. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:47 +00:00
Jason Ekstrand	bd3ab75aef	intel/nir: Stop adding redundant barriers Now that both GLSL and SPIR-V are adding shared and tcs_patch barriers (as appropriate) prior to the nir_intrinsic_barrier, we don't need to do it ourselves in the back-end. This reverts commit 26e950a5de01564e3b5f2148ae994454ae5205fe. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:47 +00:00
Jason Ekstrand	ba43b66dc9	nir/glsl: Emit memory barriers as part of barrier() The GLSL barrier() intrinsic does an implicit shared memory barrier in compute shaders and an implicit TCS patch output barrier in tessellation control shaders. We'd like NIR's barrier intrinsic to just be a control flow barrier and not have memory implications. To satisfy this, we need to add an extra memory barrier in front of each nir_intrinsic_barrier. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:47 +00:00
Jason Ekstrand	a4125b4d26	spirv: Add output memory semantics to OpControlBarrier in TCS Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:47 +00:00
Jason Ekstrand	2365520c9d	spirv: Add a workaround for OpControlBarrier on old GLSLang As per the Vulkan memory model, the proper translation of GLSL barrier() is an OpControlBarrier with a scope of Workgroup and semantics of Acquire, Release, and WorkgroupMemory. Older versions of GLSLang gave an OpControlBarrier with semantics of None so we need to patch it up on those versions. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:47 +00:00
Jason Ekstrand	60097cc840	nir: Add a new memory_barrier_tcs_patch intrinsic Right now, it's implemented as a no-op for everyone. For most drivers, it's a switch case in the NIR -> whatever which just breaks. For ir3, they already have code to delete tessellation barriers so we just add a case to also delete memory_barrier_tcs_patch. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:47 +00:00
Jason Ekstrand	f2eece773c	llvmpipe: No-op implement more barriers Acked-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:46 +00:00
Jason Ekstrand	3498ab98f5	nir: Handle barriers with more granularity in combine_stores Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:46 +00:00
Jason Ekstrand	f09db0bed5	nir: Handle more barriers in dead_write and copy_prop Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:46 +00:00
Jason Ekstrand	ada49bae5e	intel/vec4: Support scoped_memory_barrier Fixes: `06aecb14c0` "anv: Implement VK_KHR_vulkan_memory_model" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>	2020-01-13 17:23:46 +00:00
Andreas Baierl	40aef2bf3e	lima: Add stencil support This re-enables and fixes support for stencil buffer. It fixes 365 stencil related deqp tests. All tests that use INCR, INCR_WRAP, DECR and DECR_WRAP as a stencil op still fail, but they also fail with the blob, so we may ignore that for now. We still have dEQP-GLES2.functional.depth_stencil_clear.depth_stencil_masked failing, which is strange because it's the only one out of the depth_stencil_clear.* set. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>	2020-01-13 16:11:37 +00:00
Andreas Baierl	2ce71494f1	lima/parser: Make rsw alpha blend parsing more readable Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>	2020-01-13 16:11:37 +00:00
Boris Brezillon	440b0d6eec	panfrost: Remove unneeded phi nodes Add a pass to remove unneeded phi nodes as done in other drivers. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3294> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3294>	2020-01-13 14:09:47 +00:00
Rhys Perry	809c8feb92	aco: check if multiplication/clamp is live when applying output modifier It's possible that a multiplication/clamp is dead code and the single use is from a different user. Fixes portal rendering in Path of Exile when global illumination is enabled. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>	2020-01-13 13:26:43 +00:00
Rhys Perry	ef8abfa790	aco: disable add combining for ds_swizzle_b32 ds_bpermute_b32/ds_permute_b32 are fine, I think Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>	2020-01-13 13:26:43 +00:00
Rhys Perry	69bed1c918	aco: don't DCE atomics with return values We don't create atomics with definitions if they are not used in NIR, but our own DCE can remove the uses if an export turns out to be null. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>	2020-01-13 13:26:43 +00:00
Rhys Perry	8f291dc146	aco: set exec_potentially_empty for demotes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>	2020-01-13 13:26:43 +00:00
Rhys Perry	21eafe30df	aco: better handle neg/abs of sgprs isel/label_instruction currently doesn't create these but we should probably check anyway. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>	2020-01-13 13:26:43 +00:00
Rhys Perry	f29a5a205c	aco: check usesModifiers() when identifying a neg/abs This was fine because a literal used to mean that it didn't use modifiers, but now VOP3 can take a literal on GFX10. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>	2020-01-13 13:26:43 +00:00
Rhys Perry	46fb341b8d	aco: handle omod successors with the constant in the first operand No pipeline-db changes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>	2020-01-13 13:26:43 +00:00
Rhys Perry	7ce244b7d1	aco: handle VOP3 modifiers when combining a constant comparison's NaN test No pipeline-db changes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>	2020-01-13 13:26:43 +00:00
Rhys Perry	bbac52873f	aco: fix uninitialized data in the binary Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>	2020-01-13 13:25:32 +00:00
Rhys Perry	fcd6d83245	aco: fix imageSize()/textureSize() with large buffers on GFX8 Tested on Navi by using dEQP-VK.image.image_size.buffer.* and the GFX8 path with the size multiplied by the stride. dEQP-VK.image.image_size.buffer.* was also run with the tests modified to use a 96bit format. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>	2020-01-13 13:25:32 +00:00
Rhys Perry	49bcd06f97	aco: set vm for pos0 exports on GFX10 RADV's LLVM backend and radeonsi do the same thing. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: 19.3 <mesa-stable@lists.freedesktop.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>	2020-01-13 13:25:32 +00:00
Daniel Ogorchock	632885741f	panfrost: Fix headers and gpu_headers memory leak The per-batch headers/gpu_headers dynarrays need to be freed during the batch cleanup to prevent leaking. Signed-off-by: Daniel Ogorchock <daniel.ogorchock@garmin.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3308> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3308>	2020-01-13 09:11:35 +00:00
Daniel Ogorchock	2848edc0ef	panfrost: Fix panfrost_bo_access memory leak The bo access needs to be freed prior to removing it from its hash table. This prevents leaking them over time. Signed-off-by: Daniel Ogorchock <daniel.ogorchock@garmin.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3308>	2020-01-13 09:11:35 +00:00
Samuel Pitoiset	ecace26853	radv/gfx10: improve performance for TES using PrimID but not exporting it This field is for the primitive ID export to the fragment shader. Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-13 08:14:47 +01:00
Samuel Pitoiset	1db276ba23	radv/gfx10: add support for NGG passthrough mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-13 08:14:45 +01:00
Samuel Pitoiset	471738e97b	radv/gfx10: do not declare LDS for NGG if useless Only needed for NGG without passthrough mode or for NGG streamout. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-13 08:14:43 +01:00
Samuel Pitoiset	0758f645d0	radv/gfx10: determine if a pipeline is eligible for NGG passthrough It can't be enabled for geometry shaders, for NGG streamout and for vertex shaders that export the primitive ID. NGG passthrough requires that LDS isn't used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-13 08:14:40 +01:00
Samuel Pitoiset	c65015f83c	radv/gfx10: disable vertex grouping RadeonSI and AMDVLK do that. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-13 08:14:38 +01:00
Ilia Mirkin	201b88a93b	nvc0: treat all draws without color0 broadcast as MRT Per the semi-recently-released NVIDIA docs, when this bit is not enabled, then the result for RT[0] will be used. So if e.g. only a single RT is drawn to and it's not RT[2], the results will not be visible. Fixes GTF-GL45.gtf33.GL3Tests.explicit_attrib_location.explicit_attrib_location_pipeline which was failing due to a frag shader outputting only to location=2. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2020-01-12 12:11:16 -05:00
Ilia Mirkin	3e9aacb139	gm107/ir: avoid combining geometry shader stores at 0x60 This corresponds to gl_PrimitiveID and gl_Layer. When both of these are stored in a single AST.64 or AST.128 operation, then it appears as though the whole store fails. Fixes the recently extended glsl-1.50-transform-feedback-builtins piglit, and also gtf30.GL3Tests.transform_feedback.transform_feedback_builtins. The issue was reproduced on GM107 and GP108 but not GK208 nor GK104. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2020-01-12 12:11:16 -05:00
Ilia Mirkin	3be708eb31	nvc0: add dummy reset status support Perhaps in a future implementation, such events could be passed back to the driver, or queried directly. However for now, this is required for GL 4.3 robustness contexts. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2020-01-12 12:11:16 -05:00
Ilia Mirkin	838118462e	nv50,nvc0: fix destination coordinates of blit The fix was found by Karol Herbst a long time ago, but it was unclear why it helped or if it would create additional problems. This change adds a comment that explains what's going on, and in the process also normalizes the nv50 implementation to match. The coordinates which are fed to gl_Position map directly to pixel coordinates, since the viewport transform is disabled. If the framebuffer is MSAA, then that doesn't affect the pixel coordinates at all, it's just that each pixel has multiple samples. Note that this makes it really clear that this approach is inappropriate for EXT_framebuffer_multisample_blit_scaled, and also the 3d path will fail terribly for direct copies. Thankfully the 2d path normally takes care of this. Fixes KHR-GL43.packed_depth_stencil.blit.depth32f_stencil8 as well as scaling issues in a number of EXT_framebuffer_multisample-related piglit tests (although they continue to fail due to inaccuracies). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2020-01-12 12:11:16 -05:00
Bas Nieuwenhuizen	bfd9e7ff24	radv: Use new scanout gfx9 metadata flag. This updates for the new metadata ABI in radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3244> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3244>	2020-01-12 14:01:59 +01:00
Vasily Khoruzhick	f06be79457	lima: fix PIPE_CAP_* to mark features that aren't supported yet lima doesn't support alpha test, flat shading, two-sided color nor clip planes. We can enable these caps when corresponding hw features are implemented in the driver. Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2020-01-12 00:10:04 -08:00
Vasily Khoruzhick	8a421135fa	lima: implement polygon offset Fixes some of dEQP-GLES2.functional.polygon_offset.* tests and shadows in Q3A. Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2020-01-12 00:10:04 -08:00
Vasily Khoruzhick	b936b1f9b4	lima: fix viewport clipping Apparently Mali4x0 doesn't do viewport clipping, so anything rendered beyond viewport is still rendered. Looks like we need to use scissors to do clipping. Fixes most of dEQP-GLES2.functional.clipping.*, 6 out of 7 remaining failures fail on blob as well. Remaining [1] fails on many other gallium drivers. [1] dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2020-01-12 00:10:04 -08:00
Vasily Khoruzhick	997a30d709	lima: fix PLBU_CMD_PRIMITIVE_SETUP command Apparently it doesn't depend on primitive type, the value only depends on whether we specify point size via PLBU command -- bit 12 is set in this case Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2020-01-12 00:10:04 -08:00
Timothy Arceri	6bafd230e3	glsl: fix potential bug in nir uniform linker The state value of main_uniform_storage_index will be wrong for add_parameter() when find_and_update_previous_uniform_storage() finds a uniform if there is more than 1 uniform used in multiple shader stages. The new code is also simpler. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2020-01-12 11:02:20 +11:00
Christian Gmeiner	db7967ef9f	etnaviv: add deqp debug option This new debug option will fake some driver CAPs to be able to run dEQP for GLES3. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3351> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3351>	2020-01-11 22:05:35 +00:00
Timur Kristóf	44a6b17df7	aco/wave32: Set the definitions of v_cmp instructions to the lane mask. The output of v_cmp instructions is s1 (a single SGPR) in wave32 mode, as opposed to s2 (an SGPR-pair) in wave64 mode. A couple of cases where this should have been fixed were omitted from the previous patch by mistake. Fixes: `e0bcefc3a0` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2020-01-11 20:15:53 +01:00
Alyssa Rosenzweig	59d30fd4bc	pan/midgard: Support indirect UBO offsets ...in case we have arrays in a UBO block that we'd like to access indirectly. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3352> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3352>	2020-01-10 17:48:42 -05:00
Francisco Jerez	c20dc9b836	intel/fs: Make implied_mrf_writes() an fs_inst method. This will be convenient in a later commit enabling SIMD32 fragment shaders, and happens to fix the calculation for MATH instructions which is currently inaccurate for SIMD-lowered instructions on Gen4-5 platforms (all of them on Gen4 in SIMD16 mode), since it was based on the shader's dispatch width rather than on the actual execution size of the instruction. This causes some shader-db noise on Gen4 due to the more compact register allocation interacting with the SEND dependency workarounds, but otherwise no major changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-10 11:02:30 -08:00
Francisco Jerez	591f146fd2	intel/fs/cse: Fix non-deterministic behavior due to inaccurate liveness calculation. The liveness calculation done by the local CSE pass in order to prune AEB entries whose sources are no longer live is currently inaccurate, because the live intervals are calculated once at the beginning of the pass, so they don't take into account any of the copy instructions inserted by the CSE pass as it makes progress. However the IP counter used in that calculation is based on the start_ip of the basic block, which is updated automatically whenever any instructions are inserted into the CFG. This causes the IP counter and liveness intervals to get out of sync in programs with multiple basic blocks, causing the CSE pass to toss AEB entries prematurely, which can lead to missed optimization opportunities rather non-deterministically. On BDW this leads to the following shader-db changes: total instructions in shared programs: 14952488 -> 14951763 (-0.00%) instructions in affected programs: 45416 -> 44691 (-1.60%) helped: 40 HURT: 4 total spills in shared programs: 20989 -> 20970 (-0.09%) spills in affected programs: 103 -> 84 (-18.45%) helped: 3 HURT: 0 total fills in shared programs: 24981 -> 24926 (-0.22%) fills in affected programs: 127 -> 72 (-43.31%) helped: 3 HURT: 0 In addition it avoids a number of regressions in combination with some of the optimization changes I'm working on for SIMD32, which would have made CSE more effective... Causing it to be less effective elsewhere in the program astonishingly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-10 11:02:06 -08:00
Francisco Jerez	cc0ea482ad	intel/fs: Fix nir_intrinsic_load_barycentric_at_sample for SIMD32. For uniform sample ID, only the first channel of msg_data will be initialized. We need to pass that component only to the SEND message for SIMD lowering to unzip the descriptor source correctly. Fixes several dozens of conformance test failures with SIMD32 fragment shaders enabled, including: dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.dynamic_sample_number.* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-10 11:01:52 -08:00
Francisco Jerez	0703eab012	intel/fs/gen8+: Fix r127 dst/src overlap RA workaround for EOT message payload. The problem occurred when the return payload of a SIMD8 SEND instruction was re-used as source payload of an EOT SEND message. In such cases the interference edge added by that workaround between the payload and grf127_send_hack_node would have no effect, because the payload would be allocated to a fixed range of registers containing r127 by the special handling of EOT message payloads in the same function. This would cause things to blow up if the source payload of the first SIMD8 message ended up being allocated to a range which happened to overlap the destination. Fix it by avoiding r127 altogether in the allocation of EOT message payloads. The problem can be reproduced on ICL with the fp-indirections2 Piglit test-case in combination with the other optimizer changes of this series. Fixes: `232ed89802` "i965/fs: Register allocator shoudn't use grf127 for sends dest" Cc: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-10 11:00:42 -08:00
Francisco Jerez	0a6e46d44d	intel/fs/gen11+: Handle ROR/ROL in lower_simd_width(). Prevents invalid code from being emitted for ROR/ROL instructions in SIMD32 shaders. The problem can be reproduced with the following tests while forcing SIMD32 to be used for fragment shaders: piglit.shaders.glsl-rotate-left piglit.shaders.glsl-rotate-right However the issue could occur in production already with compute shaders and a workgroup size large enough to trigger SIMD32 dispatch. Fixes: `83fdec0f0d` "intel/compiler: Enable the emission of ROR/ROL instructions" Cc: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-10 11:00:24 -08:00
Francisco Jerez	a30bb25a7a	glsl: Fix software 64-bit integer to 32-bit float conversions. The current implementation was broken for any integers between 2^24 and 2^30 (it would return zero for me on ICL). The reason is that for such integers we wouldn't take the 'if (0 <= shiftCount)' early return path, however 'shiftCount + 7' would be positive, leading to a negative 'count' argument passed to __shift64RightJamming(), which would give undefined results. This reworks the affected conversion functions to use either __shortShift64Left() or __shift64RightJamming() based on the sign of the final shift count, which should avoid the problem. In addition this should qualify as a clean-up/optimization -- This implementation of the conversion functions translates to 7 fewer instructions than the original on Intel hardware. This fixes the 'KHR-GL46.shader_ballot_tests.ShaderBallotFunctionBallot' conformance tests on soft fp64 hardware with large enough subgroup size (>16). Fixes: `d5cf6e92b4` "glsl: Add built-in functions to do uint64_to_fp32(uint64_t)" Fixes: `c9d333a6b7` "glsl: Add built-in functions to do int64_to_fp32(int64_t)" Cc: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>	2020-01-10 10:51:58 -08:00
Daniel Schürmann	8b7a42d6d0	aco: compact aco::span<T> to use uint16_t offset and size instead of pointer and size_t. This reduces the size of the Instruction base class from 40 bytes to 16 bytes. No pipelinedb changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3332> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3332>	2020-01-10 17:49:18 +00:00
Daniel Schürmann	ffb4790279	aco: compact various Instruction classes No pipelinedb changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3332>	2020-01-10 17:49:18 +00:00
Andrii Simiklit	ebaab89761	mesa/st: fix a memory leak in get_version This patch prevents memory leak in get_version function in st_manager.c This issue was found by valgrind: 16 bytes in 1 blocks are definitely lost in loss record 6 of 1,418 at 0x483CD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) by 0x63D9476: st_init_extensions (st_extensions.c:1679) by 0x63B803B: get_version (st_manager.c:1271) by 0x63B8124: st_api_query_versions (st_manager.c:1289) by 0x63266EF: dri_init_screen_helper (dri_screen.c:583) by 0x6321B12: dri2_init_screen (dri2.c:2110) by 0x631AACC: driCreateNewScreen2 (dri_util.c:155) by 0x5D58192: dri3_create_screen (dri3_glx.c:897) by 0x5D39829: AllocAndFetchScreenConfigs (glxext.c:815) by 0x5D39C57: __glXInitialize (glxext.c:941) by 0x5D3290A: GetGLXPrivScreenConfig (glxcmds.c:174) by 0x5D34F38: glXQueryExtensionsString (glxcmds.c:1307) Fixes: `eca8032f20` ("gallium: Add ARB_gl_spirv support") Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3345> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3345>	2020-01-10 17:27:39 +00:00
Lasse Lopperi	3de2774dcb	freedreno/drm: Fix memory leak in softpin implementation Free the memory allocated for cmds/reloc_bos array when destroying the associated ringbuffer. For a similar fix for the non-softpin implementation see: `d014af98b7` Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2324 Fixes: `f3cc0d2` ("freedreno: import libdrm_freedreno + redesign submit") Signed-off-by: Lasse Lopperi <lasse.lopperi@ge.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3342> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3342>	2020-01-10 16:21:35 +00:00
Rhys Perry	b5c9688516	aco: limit register usage for large work groups Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2020-01-10 12:10:37 +00:00
Timur Kristóf	eccac46cdc	ac/llvm: Fix ac_build_reduce in wave32 mode. Previously, when cluster_size was set to 0, it always worked as if the cluster size was 64. This commit fixes it in wave32 mode by changing to work as if the cluster size was set to 32. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2020-01-10 12:30:44 +01:00
Pierre-Eric Pelloux-Prayer	a5fe84aefb	radeonsi: release saved resources in si_compute_do_clear_or_copy Fixes: `9b331e462e` ("radeonsi: use compute shaders for clear_buffer & copy_buffer") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-10 08:41:40 +01:00
Pierre-Eric Pelloux-Prayer	6912149ee5	radeonsi: release saved resources in si_compute_clear_12bytes_buffer Fixes: `6c901f0675` ("radeonsi: use compute shader for clear 12-byte buffer") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-10 08:41:38 +01:00
Pierre-Eric Pelloux-Prayer	1acf714d57	radeonsi: release saved resources in si_compute_copy_image Fixes: `1b25d340b7` ("radeonsi: use compute for resource_copy_region when possible") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-10 08:41:35 +01:00
Pierre-Eric Pelloux-Prayer	e1e87466ae	radeonsi: release saved resources in si_compute_clear_render_target Fixes: `984fd73515` ("radeonsi: use compute for clear_render_target when possible") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-10 08:41:33 +01:00
Pierre-Eric Pelloux-Prayer	6c019e28ca	radeonsi: release saved resources in si_compute_expand_fmask Fixes: `095a58204d` ("radeonsi: expand FMASK before MSAA image stores are used") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-10 08:41:31 +01:00
Pierre-Eric Pelloux-Prayer	9211cbe07a	radeonsi: release saved resources in si_retile_dcc Fixes: `1f21396431` ("radeonsi: add support for displayable DCC for multi-RB chips") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2330 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-10 08:41:19 +01:00
Samuel Iglesias Gonsálvez	39c1892dd8	main: fix coverity error in _mesa_program_resource_find_name() We did not take into account if name is NULL, so we could dereference a NULL pointer in strncmp() call. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-10 08:40:00 +01:00
Icecream95	f2f1277624	panfrost: Add negative lod bias support Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-10 06:51:42 +00:00
Gurchetan Singh	daf1d5ad4c	virgl/drm: update UAPI This seems to compile. Header copied over from drm-misc-next 7da5492739db. Acked-by: Eric Engestrom <eric@engestrom.ch>	2020-01-10 04:12:40 +00:00
Vasily Khoruzhick	438c677859	lima: drop support for R8G8B8 format We can only sample from 24-bit packed format and can't render into it, and it causes chromium-based browsers to fail when they create an FBO with GL_RGB format. Drop R8G8B8 altogether so mesa can promote it to RGBX format. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2020-01-09 18:46:08 -08:00
Jason Ekstrand	9b71171442	anv: Re-use flush_descriptor_sets in flush_compute_state There's no reason to hand-roll all of the memory re-allocation fall-back code for compute shaders. It's just duplicated complexity. This also makes it more clear in flush_compute_state where the MEDIA_INTERFACE_DESCRIPTOR_LOAD command gets emitted relative to other packets in the command stream. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-09 19:45:00 -06:00
Jason Ekstrand	ae72d1238c	anv: Flag descriptors dirty when gl_NumWorkgroups is used Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-09 19:45:00 -06:00
Jason Ekstrand	ca6b3b11af	anv: Don't add dynamic state base address to push constants on Gen7 Because Gen7 push constants are already relative to dynamic state base address, they aren't really an address. It's deceptive to return an address from the helper function. Instead, let's leave it as a special-case in the gen7-11 helper; we don't need the helper for code de-duplication for Gen7 anyway. Fixes: `67d2cb3e93` "anv: Add get_push_range_address() helper" Closes: #2323 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-09 19:44:06 -06:00
Vasily Khoruzhick	044da65f52	lima: add debug flag to disable tiling Add debug flag to disable tiling. Note that it prevents lima from creating tiled buffers, but it's still able to import them if modifier is specified Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2020-01-10 01:13:47 +00:00
Vasily Khoruzhick	a533d1d4c6	lima: use linear layout for shared buffers if modifier is not specified Use linear layout for shared buffers if modifier is not specified and use linear layout when importing buffers with invalid modifier. Fixes: `01a451b04d` ("lima: handle DRM_FORMAT_MOD_INVALID in resource_from_handle()") Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2020-01-10 01:13:47 +00:00
Timothy Arceri	87e0dd68f5	glsl: call calculate_subroutine_compat() from the nir linker Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-10 00:41:20 +00:00
Timothy Arceri	726e8f24c6	glsl: move calculate_subroutine_compat() to shared linker code We will make use of this in the nir linker in the following patch. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-10 00:41:20 +00:00
Timothy Arceri	c60d0bd92f	glsl: call uniform resource checks from the nir linker Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-10 00:41:20 +00:00
Timothy Arceri	05c1f7a154	glsl: move uniform resource checks into the common linker code We will call this from the nir linker in the following patch. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-10 00:41:20 +00:00
Timothy Arceri	b85985dd51	glsl: call check_subroutine_resources() from the nir linker Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-10 00:41:20 +00:00
Timothy Arceri	a6fd1c7752	glsl: move check_subroutine_resources() into the shared util code We will make use of this in the nir linker in the following patch. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-10 00:41:20 +00:00
Jason Ekstrand	3dec68e682	genxml: Remove a non-existent HW bit	2020-01-09 18:40:20 -06:00
Kristian H. Kristensen	f9d35ea55b	ir3: Set up full/half register conflicts correctly Setting up transitive conflicts between a full register and its two half registers (eg r0.x and hr0.x and hr0.y) will make the half registers conflict. They don't actually conflict and this prevents us from using both at the same time. Add and use a new ra helper that sets up transitive conflicts between a register and its subregisters, except it carefully avoids the subregister conflict. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@chromium.org>	2020-01-09 16:03:25 -08:00
Dave Airlie	85eed5def3	llvmpipe: add ARB_derivative_control support Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2020-01-10 08:43:40 +10:00
Marek Olšák	269953e779	radeonsi/gfx9: force the micro tile mode for MSAA resolve correctly on gfx9 Fixes: `69ea473` "amd/addrlib: update to the latest version" Closes: #2325 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-09 16:28:28 -05:00
Lionel Landwerlin	60e0db3bfb	anv: fix intel perf queries availability writes The availability is not written at the location changed in ee6fbb95a74d... Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `ee6fbb95a7` ("anv: Properly handle host query reset of performance queries") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-01-09 20:42:36 +02:00
Dylan Baker	da2fe9c15e	docs: Add release notes for 19.3.2, update calendar and home page	2020-01-09 10:33:49 -08:00
Dylan Baker	2d46a7f26d	docs: add SHA256 sums for 19.3.2	2020-01-09 10:32:18 -08:00
Dylan Baker	d4f237dcce	docs: Add release notes for 19.3.2	2020-01-09 10:32:14 -08:00
Satyajit Sahu	4e3a09db25	radeon/vcn: Handle crop parameters for encoder Set proper cropping parameter if frame cropping is enabled Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3328> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3328>	2020-01-09 15:43:18 +00:00
Daniel Schürmann	cd31da4587	nir: fix printing of var_decl with more than 4 components. Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Fixes: `a8ec4082a4` ('nir+vtn: vec8+vec16 support') Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3320> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3320>	2020-01-09 10:31:26 +01:00
Samuel Pitoiset	e298e78a01	radv: advertise VK_AMD_shader_image_load_store_lod This extension allows using LOD with image read/write operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-09 07:58:34 +01:00
Samuel Pitoiset	4d49a7ac73	aco: handle nir_intrinsic_image_deref_{load,store} with lod Use image_load_mip and image_store_mip respectively if the lod parameter isn't zero. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-09 07:58:33 +01:00
Samuel Pitoiset	e77ff89914	amd/llvm: handle nir_intrinsic_image_deref_{load,store} with lod Use image_load_mip and image_store_mip respectively if the lod parameter isn't zero. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-09 07:58:33 +01:00
Samuel Pitoiset	1b808d208f	spirv,nir: add new lod parameter to image_{load,store} intrinsics SPV_AMD_shader_image_load_store_lod allows using a lod parameter with OpImageRead, OpImageWrite and OpImageSparseRead. According to the specification, this parameter should be a 32-bit integer. It is initialized to 0 when no lod parameter is found during SPIR-V->NIR translation. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-09 07:58:33 +01:00
Samuel Pitoiset	37bfd854c7	spirv: add SpvCapabilityImageReadWriteLodAMD New SPIR-V capability for SPV_AMD_shader_image_load_store_lod. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-09 07:58:33 +01:00
Tapani Pälli	1e29ff7b3d	mesa: create program resource hash in a single place This is a cleanup but also a fix for commit `dd09f1d806`. In case of i965 we did not actually create hash for cached shader programs. Fixes: `dd09f1d806` "mesa/st/i965: add a ProgramResourceHash for quicker resource lookup" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-09 07:28:13 +02:00
Dave Airlie	ee9879335e	llvmpipe: add support for ARB_indirect_parameters. This just adds support for getting the draw count from the indirect buffer. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3234> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3234>	2020-01-09 10:35:44 +10:00
Dave Airlie	315fa2e5c9	llvmpipe: enable driver side multi draw indirect Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3234>	2020-01-09 10:35:40 +10:00
Dave Airlie	d10a3d528f	gallium/util: add multi_draw_indirect to util_draw_indirect. ARB_indirect_parameters needs drivers to deal with multi_draw_indirect themselves. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3234>	2020-01-09 10:35:36 +10:00
Thong Thai	3a4f8c8158	mesa: Prevent _MaxLevel from being less than zero When decoding using VDPAU, the _MaxLevel value becomes -1 due to NumLevels being equal to 0 at a certain point, and decoding fails due to an assertion later on. Signed-off-by: Thong Thai <thong.thai@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>	2020-01-08 16:44:20 -05:00
Marek Olšák	9b71041627	ac: add ac_build_s_endpgm Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 16:03:48 -05:00
Marek Olšák	1c44480538	ac: add 128-bit bitcount Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 16:00:41 -05:00
Marek Olšák	d7b565365e	ac/gpu_info: add pc_lines and use it in radeonsi Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 16:00:40 -05:00
Marek Olšák	d1c8aeb24f	ac: unify primitive export code Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 16:00:38 -05:00
Marek Olšák	1c77a18cc2	ac: unify build_sendmsg_gs_alloc_req Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 16:00:36 -05:00
Marek Olšák	fd84e422b6	radeonsi: clean up messy si_emit_rasterizer_prim_state Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 15:48:49 -05:00
Marek Olšák	b64a3240c2	radeonsi: determine accurately if line stippling is enabled for performance Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 15:48:47 -05:00
Marek Olšák	79cc7e6ff0	radeonsi: test polygon mode enablement accurately Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 15:48:43 -05:00
Marek Olšák	898c9cb797	radeonsi: fix context roll tracking in si_emit_shader_vs probably harmless, because we don't need to track context rolls on gfx10 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 15:48:39 -05:00
Marek Olšák	4249a90f5d	radeonsi: fix monolithic pixel shaders with two-sided colors and SampleMaskIn They are never used except for testing AMD_DEBUG=mono. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 15:48:35 -05:00
Marek Olšák	186335d17d	ac/gpu_info: always use distributed tessellation on gfx10 This might fix a hang on Navi14. Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-08 15:48:32 -05:00
Marek Olšák	eb1e10d0be	gallium: bypass u_vbuf if it's not needed (no fallbacks and no user VBOs) This decreases CPU overhead, because u_vbuf is completely bypassed in those cases. Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-08 13:40:59 -05:00
Marek Olšák	9f6020abc6	gallium/cso_context: move non-vbuf vertex buffer and element code into helpers These will be reused. Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-08 13:40:59 -05:00
Marek Olšák	ce648b913f	gallium: put u_vbuf_get_caps return values into u_vbuf_caps Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-08 13:40:59 -05:00
Jonathan Marek	472593e9cf	etnaviv: remove unnecessary vertex_elements_state_create error checking PIPE_CAP_MAX_VERTEX_BUFFERS already sets the maximum vertex_buffer_index. There's no need to error on num_elements == 0 (if that can even happen). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2020-01-08 12:27:35 -05:00
Jonathan Marek	76d93b437b	etnaviv: implement gl_VertexID/gl_InstanceID Fixes: dEQP-GLES3.functional.instanced.* Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2020-01-08 12:27:35 -05:00
Jonathan Marek	93ff6f5919	etnaviv: HALTI2+ instanced draw Fixes: dEQP-GLES3.functional.draw.draw_arrays_instanced.* dEQP-GLES3.functional.draw.draw_elements_instanced.* Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2020-01-08 12:27:34 -05:00
Jonathan Marek	ea608ae23b	etnaviv: update headers from rnndb Update to etna_viv commit 46af5f1d. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2020-01-08 12:27:34 -05:00
Lionel Landwerlin	4578d4ae52	anv: don't close invalid syncfd semaphore Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-01-08 18:20:50 +02:00
Krzysztof Raszkowski	7d33203b44	gallium/swr: Fix glVertexPointer race condition. Sometimes, when using a user buffer (not a VBO), e.g. via glVertexPointer, one thread could free the memory before another thread used it. Instead of copying this memory to the driver, the simpler fix is to block until the draw finishes. Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2020-01-08 15:42:03 +00:00
Jason Ekstrand	b788cccfe2	intel/disasm: Fix decoding of src0 of SENDS There is no instruction field for the register file for src0 because it's always GRF. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3309> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3309>	2020-01-08 14:14:16 +00:00
Yevhenii Kolesnikov	8dcff01c8b	meta: Add cleanup function for Bitmap Buffer object and temporary texture were never freed, causing memory leaks. Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-08 15:34:03 +02:00
Juan A. Suarez Romero	ad4fb7ea04	nir/spirv: skip unreachable blocks in Phi second pass Only the blocks that are reachable are inserted with an end_nop instruction at the end. When handling the Phi second pass, if the Phi has a parent block that does not have an end_nop then it means this block is unreachable, and thus we can ignore it, as the Phi will never come through it. Fixes dEQP-VK.graphicsfuzz.uninit-element-cast-in-loop. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-08 11:32:24 +01:00
Pierre-Eric Pelloux-Prayer	5f8daae4d8	radeonsi: check ctx->sdma_cs before using it `e5167a9276` disabled SDMA for gfx8. This caused 3 piglit arb_sparse_buffer tests (basic, buffer-data and commit) to crash on GFX8. Reported-by: Michel Dänzer <michel@daenzer.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Fixes: `e5167a9276` ("radeonsi: disable SDMA on gfx8 to fix corruption on RX 580")	2020-01-08 09:31:35 +01:00
Samuel Pitoiset	e565fd4255	radv: do not fill keys from fragment shader twice radv_fill_shader_info() already does that. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2020-01-08 08:59:04 +01:00
Yevhenii Kolesnikov	ed43dd62ac	main: allow external textures for BindImageTexture From issue 10 of the OES_EGL_image_external_essl3: A limited set of use-cases is enabled by making glBindImageTexture accept external textures. Shaders can access such external textures using the existing <image2D> sampler type. Fixes: `02a6d901ee` ("mesa: add OES_EGL_image_external_essl3 support") Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-08 09:21:39 +02:00
Jason Ekstrand	803fad43c3	intel/nir: Add a memory barrier before barrier() Our barrier instruction does not implicitly do a memory fence but the GLSL barrier() intrinsic is supposed to. The easiest back-portable solution is to just add the NIR barriers. We'll sort this out more properly in later commits. Cc: mesa-stable@lists.freedesktop.org Closes: #2138 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-07 21:52:19 -06:00
Bas Nieuwenhuizen	7cc0702bbb	radv: Emit a BATCH_BREAK when changing pixel shaders or CB_TARGET_MASK. Fixes a hang on Raven with Resident Evil 2. I did not find anything more restricted to fix it: - Setting persistent_states_per_bin to 1 fixes it too, but likely does an internal break on any descriptor set changes too. - Only breaking the batch when cb_target_mask changes does not fix it (and looking at AMDVLK comments, I suspect the code in radeonsi should really be doing a FLUSH_DFSM). - Always doing a FLUSH_DFSM on shader switch helps, but that is more often than this and I don't think we should be doing that when DFSM is disabled. - Also emitting the existing break on framebuffer change when DFSM is disabled does not fix the issue. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2315 CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2020-01-07 22:44:31 +01:00
Tapani Pälli	dd09f1d806	mesa/st/i965: add a ProgramResourceHash for quicker resource lookup Many resource APIs require searching by name, add a hash table to make this faster. Currently we traverse the whole resource list for name based queries, this change makes all these cases use the hash. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2203 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3254> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3254>	2020-01-07 10:48:41 +00:00
Michel Dänzer	5f0ff004ca	gitlab-ci: Test against LLVM / clang 9 on x86 They're not available for Debian buster yet, so we have to use upstream snapshot packages again. In contrast to earlier, we now store the LLVM APT repository key in Git instead of re-downloading it every time.	2020-01-07 11:00:16 +01:00
Alyssa Rosenzweig	4cd3dc94ad	panfrost: Don't double-flip Z/W for 2D arrays We need to be mindful that we don't clobber the shadow comparator. Fixes dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darrayshadow_* Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2020-01-07 08:54:52 +01:00
Alyssa Rosenzweig	bc4c853b49	pan/midgard: Account for z/w flip in texelFetch Required for proper txf of 2D arrays. Fixes dEQP-GLES3.functional.shaders.texture_functions.texelfetch.2darray Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2020-01-07 08:54:47 +01:00
Alyssa Rosenzweig	4152d45d38	panfrost: Adjust for mismatch between hardware/Gallium in arrays/cube The hardware separates face selection and array indexing, it looks like, whereas Gallium smushes them together with some modulus fun. Let's fix it so mipmapped 2D arrays work without regressing cubemaps. Fixes dEQP-GLES3.functional.texture.filtering.2d_array.* among others. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2020-01-07 08:54:40 +01:00
Alyssa Rosenzweig	0b714f3fa3	panfrost: Respect constant buffer_offset Fixes dEQP-GLES3.functional.ubo.multi_basic_types.single_buffer.* among others Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2020-01-07 08:54:23 +01:00
Timothy Arceri	3bd4bcd418	glsl: use nir version of check_image_resources() for nir linker Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2020-01-07 09:53:51 +11:00
Timothy Arceri	feffd1fa65	glsl: add check_image_resources() for the nir linker This is adapted from the GLSL IR code but doesn't need to iterate over the IR. I believe this also fixes a potential bug in the GLSL IR code which potentially counts the same output twice. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2020-01-07 09:53:51 +11:00
Timothy Arceri	a853de0c95	glsl: use nir linker to link atomics Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2020-01-07 09:50:57 +11:00
Timothy Arceri	8f2cab7767	mesa: add new UseNIRGLSLLinker constant This will be used to disable some GLSL IR passes in following patches. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2020-01-07 08:39:47 +11:00
Timothy Arceri	4caf3fc8df	glsl: reorder link_and_validate_uniforms() calls This is required for the following commit. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2020-01-07 08:39:34 +11:00
Timothy Arceri	ed325ac4dd	glsl: add new gl_nir_link_glsl() helper This will allow us to do some linking in NIR that was previously done by the GLSL IR linker. To start with this just has calls for linking atomics. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2020-01-07 08:39:16 +11:00
Timothy Arceri	0e60ea1d67	glsl: add gl_nir_link_check_atomic_counter_resources() This is pretty much a copy of link_check_atomic_counter_resources() updated to work with the NIR linker. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2020-01-07 08:38:52 +11:00
Timothy Arceri	432ed13dec	glsl: rename gl_nir_link() to gl_nir_link_spirv() A NIR based glsl linking function will be too different to the spirv version to bother attempting any sharing. So let's change the name to be explicit. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2020-01-07 08:38:41 +11:00
Kristian H. Kristensen	6c1c13e90e	st/mesa: Lower vars to ssa and constant prop before gl_nir_lower_buffers The gl_nir_lower_buffers pass relies on recognizing the same literal constants as the GLSL compiler so that constant buffer array indices are constant in nir as well. Without this, get_block_array_index() would see vec1 32 ssa_723 = deref_var &const_temp@1 (function_temp int) vec1 32 ssa_724 = load_const (0x00000001 /* 0.000000 */) ... vec1 32 ssa_5 = deref_var &const_temp@1 (function_temp int) vec1 32 ssa_6 = intrinsic load_deref (ssa_5) (0) /* access=0 */ vec1 32 ssa_7 = deref_var &blockB (ssbo BlockB[1]) vec1 32 ssa_8 = deref_array &(ssa_7)[ssa_6] (ssbo BlockB) /* &blockB[ssa_6] */ instead of a literal 1, and ultimately generate the block name BlockB[0]. That used to work, since before the previous commits we'd compact the block binding points and names. Thus, there would always be a BlockB[0]. Now, if an entry in a block array isn't used, we don't generate that block name, which means that if entry 0 isn't used BlockB[0] isn't present and then get_block_array_index() fails to find the block. In most cases we would have dealt with this in the call to st_nir_opts() in st_nir_link_shaders(), but in the num_shaders == 1 case (for example, compute) we would call gl_nir_lower_buffers() before we lowered GLSL constants. Move that corner case up next to where we call st_nir_link_shaders() so we call st_nir_opts() at the same point in the flow for all shaders. Fixes: dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.18 Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2020-01-06 13:01:19 -08:00
Andrii Simiklit	be6d51e1e3	glsl/nir: do not change an element index to have correct block name When SSBO array is used with packed layout, both IR tree and as a result, NIR tree will be incorrect. In fact, the SSBO dereference indices won't match the array size in some cases like the following: "layout(packed, binding=1) buffer SSBO { vec4 a; } ssbo[3]; out vec4 color; void main() { color = ssbo[2].a; }" After linking the IR and then NIR will have an SSBO array definition with size 1 but dereference still will have index 2 and linked_shader->Program->sh.ShaderStorageBlocks will contain just SSBO with name "SSBO[2]" So this line should be removed at least as a workaround for now to avoid error like: Failed to find the block by name "SSBO[0]" Fixes: `810dde2a` "glsl/nir: Add a pass to lower UBO and SSBO access" Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2020-01-06 13:01:19 -08:00
Andrii Simiklit	4beb0a2308	glsl: fix a binding points assignment for ssbo/ubo arrays This is needed to be in agreement with spec requirements: https://github.com/KhronosGroup/OpenGL-API/issues/46 Piers Daniell: "We discussed this in the OpenGL/ES working group meeting and agreed that eliminating unused elements from the interface block array is not desirable. There is no statement in the spec that this takes place and it would be highly implementation dependent if it happens. If the application has an "interface" in the shader they need to match up with the API it would be quite confusing to have the binding point get compacted. So the answer is no, the binding points aren't affected by unused elements in the interface block array." v2: - 'original_dim_size' field moved above to keep the struct packed better on 64-bit - added a comment for 'total_num_array_elements' field - fixed a binding point calculations for SSBOs array of arrays ( Ian Romanick <ian.d.romanick@intel.com> ) - fixed binding point calculations for non-packed SSBOs v3: - rename 'total_num_array_elements' to 'aoa_size' ( Jason Ekstrand <jason@jlekstrand.net> ) - rename 'boffset' to 'binding_stride' ( Alejandro Piñeiro <apinheiro@igalia.com> ) Fixes: `8cf1333b` "glsl: link uniform block arrays of arrays" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109532 Reported-By: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Fritz Koenig <frkoenig@google.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2020-01-06 13:01:19 -08:00
Andrii Simiklit	a3c9a2881e	glsl: fix an incorrect max_array_access after optimization of ssbo/ubo This is needed to fix these tests: piglit.spec.arb_shader_storage_buffer_object.compiler.unused-array-element_frag piglit.spec.arb_shader_storage_buffer_object.compiler.unused-array-element_comp Fixes: `8cf1333b` "glsl: link uniform block arrays of arrays" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109532 Reported-By: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Fritz Koenig <frkoenig@google.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2020-01-06 13:01:19 -08:00
Marek Olšák	420fe1e7f9	radeonsi: remove TGSI Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-06 15:57:20 -05:00
Marek Olšák	e5167a9276	radeonsi: disable SDMA on gfx8 to fix corruption on RX 580 Closes: #1399 Closes: #1889 Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2020-01-06 15:38:36 -05:00
Marek Olšák	991328498b	radeonsi: move SI and CIK+ SDMA code into 1 common function for cleanups Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2020-01-06 15:38:35 -05:00
Marek Olšák	3c265c2586	radeonsi: rename dma_cs -> sdma_cs Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2020-01-06 15:38:33 -05:00
Marek Olšák	cd6a4f7631	radeonsi: add AMD_DEBUG=nodmacopyimage for debugging Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2020-01-06 15:38:32 -05:00
Marek Olšák	0c9e7a67f9	radeonsi: add AMD_DEBUG=nodmaclear for debugging Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2020-01-06 15:38:30 -05:00
Marek Olšák	4110e6e564	radeonsi: remove broken and unused SI SDMA image copy code Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2020-01-06 15:38:28 -05:00
Marek Olšák	503bd821fa	radeonsi: rename SDMA debug flags Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2020-01-06 15:38:11 -05:00
Tomeu Vizoso	d62dd8b0cb	gitlab-ci: Switch LAVA jobs to use shared dEQP runner Take one step towards sharing code between the LAVA and non-LAVA jobs, with the goals of reducing maintenance burden and use of computational resources. The env var DEQP_NO_SAVE_RESULTS allows us to skip the processing of the XML result files, which can take a long time and is not useful in the LAVA case as we are not uploading artifacts anywhere at the moment. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2020-01-06 14:27:36 +01:00
Tomeu Vizoso	f5c2807ff2	gitlab-ci: Update kernel for LAVA to 5.5-rc1 plus fixes Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2020-01-06 14:27:21 +01:00
Alyssa Rosenzweig	b3ff83c107	panfrost: Handle PIPE_FORMAT_R10G10B10A2_USCALED Same format code as UINT... might be different in how it's fed into a shader but we'll deal with that when we get there. Fixes dEQP-GLES3.functional.vertex_arrays.single_attribute.output_types.usigned_int2_10_10_10.components4_vec2_quads1 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2020-01-06 07:50:00 -05:00
Alyssa Rosenzweig	5c71547c68	panfrost: Report MSAA 4x supported for dEQP Fixes dEQP-GLES3.functional.state_query.integers.max_samples_getinteger64 We'll have to actually implement multisampling next, but hey. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2020-01-06 07:49:58 -05:00
Alyssa Rosenzweig	32851ff715	panfrost: Cleanup tiling selection logic Make it a lot more obvious what we're doing and fix more than a few corner cases in the process. Fixes dEQP-GLES3.functional.buffer.map.write.render_as_index_array.pixel*, and likely others. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2020-01-06 07:49:53 -05:00
Alyssa Rosenzweig	dadfca3775	panfrost: Implement sRGB blend shaders We use the lowering in nir_format_convert. There are native ops for this so this is far from optimal and not remotely efficient but as with most blend shader things right now, it's hard enough to get it working, so let's focus on that for now. We'll make it fast later (once we have GLES3 stable, we can start optimizing these things). Fixes dEQP-GLES3.functional.fragment_ops.blend.fbo_srgb.* Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2020-01-06 07:49:48 -05:00
Alyssa Rosenzweig	ef00849877	panfrost: Support rendering to non-zero Z/S layers Fixes abort in STK's shadow implementation. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2020-01-06 07:49:42 -05:00
Alyssa Rosenzweig	ef8c2ebee1	panfrost: Texture from Z32F_S8 as R32F Z32F_S8 becomes Z32F in texturing, which in turn just becomes R32F. Fixes dEQP-GLES3.functional.texture.format.sized.*.depth32f_stencil8 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2020-01-06 07:49:33 -05:00
Danylo Piliaiev	f3ca47d9f3	iris/query: Implement PIPE_QUERY_GPU_FINISHED Implementation is similar to radeonsi in `5f1cef76` Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-06 12:43:14 +02:00
Erik Faye-Lund	642125edd9	st/mesa: use uint-samplers for sampling stencil buffers Otherwise, we end up mismatching the sampler types when rendering. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-06 09:23:43 +01:00
Samuel Pitoiset	09ea2de2b8	ac/surface: use uint16_t for mipmap level pitches Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-06 07:59:50 +01:00
Jonathan Marek	680d806950	etnaviv: fix incorrectly failing vertex size assert Changes the assert to match the comment above. This assert was failing in some cases while running darkplaces. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2020-01-05 17:04:39 +00:00
Vasily Khoruzhick	c5ae64ebc7	lima: fix PP stream terminator size PP stream terminator size seems to be 4 words. It worked with a full PP stream because we align the stream beginning to 32 bytes and the BO is initialized with zeroes. But with a partial PP stream it sometimes breaks if for the new PP stream we reuse a BO that has a non-zero value at this place. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2020-01-05 00:16:39 -08:00
Vasily Khoruzhick	4f5bfe2a5e	lima: don't reload and redraw tiles that were not updated We don't need to reload and redraw some tiles if the framebuffer was not cleared and the scissor test was enabled for some of the draws. This simple optimization fixes cursor lag in X11. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2020-01-05 00:16:36 -08:00
Vasily Khoruzhick	83abdf8e45	lima: postpone PP stream generation This commit postpones PP stream generation till job is submitted. Doing that this late allows us to skip reloading and redrawing tiles that were not updated. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2020-01-05 00:16:33 -08:00
Andreas Baierl	7ad1896ab8	lima/parser: Fix VS cmd stream parser prefetch is int, not bool. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>	2020-01-05 03:08:01 +00:00
Andreas Baierl	af7dc4675d	lima/parser: Fix rsw parser Drop assert as it is not necessary and used wrong anyway. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>	2020-01-05 03:08:01 +00:00
Kenneth Graunke	defb3a9465	anv: Only enable EWA LOD algorithm when doing anisotropic filtering. Updated documentation renames "Anisotropic Algorithm" to "LOD Algorithm" and adds a note for Gen9+ saying "The EWA Algorithm should only be enabled for Anisotropic Filtering modes." and indicating that the extra accuracy shouldn't be necessary for other modes, and comes at a cost. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-01-04 14:27:22 -08:00
Kenneth Graunke	c0c899cf78	iris: Allow HiZ for copy_region sources Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-01-04 12:25:55 -08:00
Jason Ekstrand	7d75bf4f3f	i965: Allow HiZ for glCopyImageSubData sources v2 (Ken): Handle platforms without sampler support for HiZ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v2 changes]	2020-01-04 12:25:55 -08:00
Jason Ekstrand	52ad1712ed	anv: Allow HiZ in TRANSFER_SRC_OPTIMAL on Gen8-9 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-04 12:25:54 -08:00
Jason Ekstrand	b274469daa	intel/blorp: Use the source format when using blorp_copy with HiZ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-04 12:25:54 -08:00
Jason Ekstrand	ea7446ba82	i965/blorp: Don't resolve HiZ unless we're reinterpreting This eliminates 50% of pixels (2M) rendered for a blit in CS:GO. This accounts for 3% of pixels rendered in the game. Total GPU clocks for the first 900 frames of CS:GO improves by 1%. Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-04 12:25:54 -08:00
Jason Ekstrand	95cc5438eb	blorp: Allow reading with HiZ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-04 12:25:54 -08:00
Jason Ekstrand	4a1093005c	blorp: Stop whacking Z24 depth to BGRA8 The shader code required to do this is int(sat(x) * UINT24_MAX) which isn't really worth all the effort to avoid. Doing the format conversion, on the other hand, prevents us from sampling with HiZ which is something that we very much want on gen8-9 where we can. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2020-01-04 12:25:54 -08:00
Christian Gmeiner	a597a64ae2	etnaviv: move descriptor based texture structs This moves the descriptor based texture structs and their helpers into the only user. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2020-01-04 20:44:36 +01:00
Christian Gmeiner	7c687d221d	etnaviv: move state based texture structs This moves the state based texture structs and their helpers into the only user. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2020-01-04 20:44:36 +01:00
Roman Stratiienko	ed0fa78b46	panfrost: Fix Android build Include missing `encoder/pan_props.c` into the build. Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-04 16:54:38 +00:00
Gert Wollny	9162e2f03f	mesa/st: glsl_to_nir: don't lower atomics to SSBOs if driver supports HW atomics At least on r600 HW atomic operations are way less expensive than SSBO atomic operations. v2: use st->has_hw_atomics (Eric Anholt) v3: remove second invocation of atomic to ssbo lowering (Eric Anholt) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>	2020-01-04 16:22:40 +00:00
Gert Wollny	b119f8b4a0	r600: Delete vertex buffer only if there is actually a shader state Fixes: gl-2.0-vertexattribpointer Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Konstantin Kharlamov <hi-angel@yandex.ru> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>	2020-01-04 16:22:40 +00:00
Gert Wollny	32bb5f2941	r600: Make SID an unsigned value The value is never negative, and making it unsigned fixes some warnings. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Konstantin Kharlamov <hi-angel@yandex.ru> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>	2020-01-04 16:22:40 +00:00
Gert Wollny	e8559ae448	r600: Fix maximum line width There are only 13 bits available to store the line width, hence it can't be larger than 8191 v2: Add Fixes tag v3: - Unify value since for all r600 archs (Konstantin Kharlamov) - Correct the value the line width value is emitted as a 12.4 fixed point value of 1/2 line width on r600-r700 and as 8 * line width on Evergreen and newer. Fixes: `06bfb2d28f` r600: fork and import gallium/radeon Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Konstantin Kharlamov <hi-angel@yandex.ru> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>	2020-01-04 16:22:40 +00:00
Gert Wollny	829107819d	r600/sb: Correct SB disassembler for better debugging Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Konstantin Kharlamov <hi-angel@yandex.ru> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>	2020-01-04 16:22:40 +00:00
Gert Wollny	bfbdaf9a46	r600: Make it possible to include r600_asm.h in a C++ file Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Konstantin Kharlamov <hi-angel@yandex.ru> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>	2020-01-04 16:22:40 +00:00
Gert Wollny	23c5ba8baa	r600: Add functions to dump the shader info This will be helpful to compare the TGSI and NIR code paths. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Konstantin Kharlamov <hi-angel@yandex.ru> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>	2020-01-04 16:22:40 +00:00
Gert Wollny	570a6c6c79	gallium: tgsi_from_mesa - handle VARYING_SLOT_FACE Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>	2020-01-04 16:22:40 +00:00
Gert Wollny	6c9495b392	nir: make nir_get_texture_size/lod available outside nir_lower_tex These functions can be useful in other places. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>	2020-01-04 16:22:40 +00:00
Gert Wollny	f69bf7fe8c	gallium/tgsi_from_mesa: Add 'extern "C"' to be able to include from C++ The r600/nir backend is in C++ and needs to include this file. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3286>	2020-01-04 16:22:40 +00:00
Bas Nieuwenhuizen	96c9483ccf	spirv: Fix glsl type assert in spir2nir. Fixes: `624789e370` "compiler/glsl: handle case where we have multiple users for types" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2020-01-04 15:53:24 +00:00
Christian Gmeiner	b178262cb9	etnaviv: use a better name for FE_VERTEX_STREAM_UNK14680 Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2020-01-04 14:15:36 +01:00
Bas Nieuwenhuizen	17741a0a05	radv: Only use the gfx mipmap level offset/pitch for linear textures. The tiled-case is non-sensical for non-base mips, but Vulkan requires that this function handles it but at the same time does not require returning anything useful. So we can basically return anything. Correct tiled pitch and offset are still required for our own WSI and in the future getting the layouts of images with DRM format modifiers. Both don't have to deal with images with more than 1 level though. Fixes: `824bd0830e` "radv: return the correct pitch for linear mipmaps on GFX10" Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2301 Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2304 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-04 13:04:40 +01:00
Bas Nieuwenhuizen	f0ed67b770	Revert "amd/common: Always initialize gfx9 mipmap offset/pitch." This reverts commit `973181c06c`. Requested by Marek. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-04 13:04:40 +01:00
Kenneth Graunke	645b195312	iris: Delete remnants of the unimplemented ASTC 5x5 workaround I copy and pasted some of the boilerplate but never the implementation. For now, ASTC 5x5 is disabled and faked via uncompressed RGBA; let's delete these remnants until such a time when we implement it properly. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-03 18:06:38 -08:00
Kenneth Graunke	e858321f09	iris: Disable ASTC 5x5 support on Gen9 for now. Intel Gen9 hardware has some nasty restrictions where ASTC 5x5 formats and color compression can't both live in the sampler cache at the same time. To properly support it, we have to track which of those exist in the cache and flush ASTC out or resolve away compression. As far as I'm aware, very little uses ASTC 5x5 textures, so instead of replicating all that for iris, we simply turn it off and rely on the Gallium fallback mechanism to fake it via uncompressed RGBA. This should avoid GPU hangs any time people use ASTC 5x5 with CCS. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2020-01-03 18:06:38 -08:00
Kenneth Graunke	8e6308363b	st/mesa: Allow ASTC5x5 fallbacks separately from other ASTC LDR formats. This patch allows us to fake ASTC 5x5 specifically, while leaving the other ASTC LDR formats with native support. I plan to use this in iris, at least for the time being. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-03 18:06:35 -08:00
Erik Faye-Lund	56fc791b31	etnaviv: use nir_lower_clip_halfz instead of open-coding We already have a helper for this, so let's use that instead of rolling our own version. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Paul Cercueil <paul@crapouillou.net>	2020-01-03 22:48:19 +00:00
Erik Faye-Lund	d9ff5f0414	nir/zink: move clip_halfz-lowering to common code Etnaviv also does the same thing, so let's try to avoid repetition here, and use the same code for it as well. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Paul Cercueil <paul@crapouillou.net>	2020-01-03 22:48:19 +00:00
Erik Faye-Lund	5c2376af63	zink: remove unused code-path in lower_pos_write This code is never reached, because we don't call nir_lower_io before lowering this. So let's get rid of it.	2020-01-03 22:48:19 +00:00
Erik Faye-Lund	87b3d8dce5	zink: use nir_fmul_imm Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Paul Cercueil <paul@crapouillou.net>	2020-01-03 22:48:19 +00:00
Erik Faye-Lund	e51bf4914c	zink: implement load_vertex_id Not 100% sure if this matches the semantics, but it seems to pass the tests, so it seems like an improvement. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-03 22:20:12 +00:00
Erik Faye-Lund	1b2731f268	zink: factor out builtin-var creation This is useful so we can reuse it for the next patch Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-03 22:20:12 +00:00
Erik Faye-Lund	ce1ea6e9c2	zink: simplify front-face type Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-03 22:20:12 +00:00
Caio Marcelo de Oliveira Filho	75a19186b2	anv: Ignore some CreateInfo structs when rasterization is disabled According to the description of VkGraphicsPipelineCreateInfo, pViewportState, pMultisampleState, pDepthStencilState and pColorBlendState must be ignored when rasterization is not enabled. This avoids potentially invalid pointers being dereferenced when rasterization is disabled. Tested with `demos_x64 VK_Parameter_Zoo` from Renderdoc repository. v2: Don't store the `raster_enabled` as part of anv_pipeline, just query it from the create info. This avoids storing a state that's only used during pipeline creation. (Jason) Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2258 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch> [v1] Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> [v1] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-01-03 13:57:31 -08:00
Caio Marcelo de Oliveira Filho	6755b6315b	anv: Drop unused function parameter Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2020-01-03 13:29:49 -08:00
Marek Olšák	66483ee017	radeonsi: remove the "display_dcc_offset == 0" assertion I think it's not needed. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-03 15:07:19 -05:00
Marek Olšák	bfddfd12b6	radeonsi: ignore PIPE_BIND_SCANOUT for imported textures It's obtained from the BO metadata. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-03 15:07:17 -05:00
Marek Olšák	ba10fb3f7f	radeonsi: preserve the scanout flag for shared resources on gfx9 and gfx10 Closes: #2195 Closes: #2294 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2020-01-03 15:07:11 -05:00
Vasily Khoruzhick	1de06e540a	lima: fix allocation of GP outputs storage for indexed draw For an indexed draw, the number of VS invocations is (ctx->max_index - ctx->min_index + 1), so we have to use this number when calculating space for varyings, gl_Position and gl_PointSize. Fixes dEQP-GLES2.functional.buffer.write.use.index_array.array and dEQP-GLES2.functional.buffer.write.use.index_array.element_array Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2020-01-03 18:57:36 +00:00
Jason Ekstrand	9bd8000c6c	anv: Drop unneeded struct keywords All VkFoo structs are typedef'd to not need the struct keyword. Leaving it in there is just extra characters and breaks Vulkan's aliasing when stuff gets promoted to core versions. It's better to just never use struct for VkFoo. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2020-01-03 11:32:34 -06:00
Thong Thai	8dc7c467e6	r600: Remove HEVC related code since HEVC is not supported Signed-off-by: Thong Thai <thong.thai@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3153> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3153>	2020-01-03 16:30:22 +00:00
Thong Thai	466001a226	radeon: Use P010 for decoding of 10-bit videos Previously, P016 was used for the decoding of 10-bit HEVC/H.265 encoded videos, which worked fine for mpv and ffmpeg. GStreamer specifically looks for P010, so this patch sets the default buffer type to P010 for HEVC decoding. Signed-off-by: Thong Thai <thong.thai@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3153>	2020-01-03 16:30:22 +00:00
Thong Thai	68881af435	st/va: Add support for P010, used for 10-bit videos Signed-off-by: Thong Thai <thong.thai@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3153>	2020-01-03 16:30:22 +00:00
Thong Thai	f3569f215d	gallium: Add PIPE_FORMAT_P010 support Signed-off-by: Thong Thai <thong.thai@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3153>	2020-01-03 16:30:22 +00:00
Thong Thai	ee8344bdcf	util/format: Add the P010 format used for 10-bit videos Signed-off-by: Thong Thai <thong.thai@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3153>	2020-01-03 16:30:22 +00:00
Erik Faye-Lund	98885e9f61	zink: implement some more trivial opcodes Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-03 17:16:18 +01:00
Erik Faye-Lund	8c18331afe	zink: implement txf texelFetch is a requirement for OpenGL 3.0, so this gets us a step closer to GL 3.0 support. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-03 15:28:27 +01:00
Samuel Pitoiset	7b70502a5d	radv: implement VK_AMD_mixed_attachment_samples With VK_AMD_mixed_attachment_samples, the number of depth/stencil samples isn't always equal to the number of color samples. Adjust the number of Z samples when it's different but make sure to have a consistent sample count if there are no depth/stencil attachments. Also adjust the number of samples used for fragment shaders which is the number of color samples if mixed attachment samples are used. Only enabled on GFX8+ because it's untested on previous chips. All dEQP-VK.pipeline.multisample.mixed_attachment_samples.* now pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3018> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3018>	2020-01-03 12:31:53 +00:00
Samuel Pitoiset	7bbf497b68	radv: record number of color/depth samples for each subpass Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3018>	2020-01-03 12:31:53 +00:00
Christian Gmeiner	8d50ab5395	etnaviv: gc400 does not support any vertex sampler On STM32MP1 fixes the dEQPs below and changes the dEQP run statistics to: - Passed: 16856/17346 (97.2%) - Failed: 236/17346 (1.4%) - Not supported: 199/17346 (1.1%) + Passed: 16780/17346 (96.7%) + Failed: 86/17346 (0.5%) + Not supported: 425/17346 (2.5%) Warnings: 55/17346 (0.3%) dEQP-GLES2.functional.shaders.struct.uniform.sampler_vertex dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_vertex dEQP-GLES2.functional.shaders.struct.uniform.sampler_array_vertex dEQP-GLES2.functional.shaders.struct.uniform.sampler_in_function_arg_vertex dEQP-GLES2.functional.shaders.struct.uniform.sampler_in_array_function_arg_vertex dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2d dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2dproj_vec3 dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2dproj_vec4 dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2dlod dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2dprojlod_vec3 dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2dprojlod_vec4 dEQP-GLES2.functional.shaders.texture_functions.vertex.texturecube dEQP-GLES2.functional.shaders.texture_functions.vertex.texturecubelod dEQP-GLES2.functional.shaders.random.texture.vertex.0 dEQP-GLES2.functional.shaders.random.texture.vertex.1 dEQP-GLES2.functional.shaders.random.texture.vertex.2 dEQP-GLES2.functional.shaders.random.texture.vertex.3 dEQP-GLES2.functional.shaders.random.texture.vertex.4 dEQP-GLES2.functional.shaders.random.texture.vertex.5 dEQP-GLES2.functional.shaders.random.texture.vertex.6 dEQP-GLES2.functional.shaders.random.texture.vertex.7 dEQP-GLES2.functional.shaders.random.texture.vertex.8 dEQP-GLES2.functional.shaders.random.texture.vertex.9 dEQP-GLES2.functional.shaders.random.texture.vertex.10 dEQP-GLES2.functional.shaders.random.texture.vertex.11 dEQP-GLES2.functional.shaders.random.texture.vertex.12 
dEQP-GLES2.functional.shaders.random.texture.vertex.13 dEQP-GLES2.functional.shaders.random.texture.vertex.14 dEQP-GLES2.functional.shaders.random.texture.vertex.16 dEQP-GLES2.functional.shaders.random.texture.vertex.17 dEQP-GLES2.functional.shaders.random.texture.vertex.18 dEQP-GLES2.functional.shaders.random.texture.vertex.19 dEQP-GLES2.functional.shaders.random.texture.vertex.20 dEQP-GLES2.functional.shaders.random.texture.vertex.22 dEQP-GLES2.functional.shaders.random.texture.vertex.23 dEQP-GLES2.functional.shaders.random.texture.vertex.24 dEQP-GLES2.functional.shaders.random.texture.vertex.26 dEQP-GLES2.functional.shaders.random.texture.vertex.28 dEQP-GLES2.functional.shaders.random.texture.vertex.29 dEQP-GLES2.functional.shaders.random.texture.vertex.31 dEQP-GLES2.functional.shaders.random.texture.vertex.34 dEQP-GLES2.functional.shaders.random.texture.vertex.37 dEQP-GLES2.functional.shaders.random.texture.vertex.38 dEQP-GLES2.functional.shaders.random.texture.vertex.39 dEQP-GLES2.functional.shaders.random.texture.vertex.40 dEQP-GLES2.functional.shaders.random.texture.vertex.42 dEQP-GLES2.functional.shaders.random.texture.vertex.43 dEQP-GLES2.functional.shaders.random.texture.vertex.44 dEQP-GLES2.functional.shaders.random.texture.vertex.45 dEQP-GLES2.functional.shaders.random.texture.vertex.48 dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_nearest_clamp dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_nearest_repeat dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_nearest_mirror dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_linear_clamp dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_linear_repeat dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_linear_mirror dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_nearest_clamp dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_nearest_repeat dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_nearest_mirror 
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_linear_clamp dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_linear_repeat dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_linear_mirror dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_nearest_nearest_clamp dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_nearest_nearest_repeat dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_nearest_nearest_mirror dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_nearest_linear_clamp dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_nearest_linear_repeat dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_nearest_linear_mirror dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_nearest_nearest_clamp dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_nearest_nearest_repeat dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_nearest_nearest_mirror dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_nearest_linear_clamp dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_nearest_linear_repeat dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_nearest_linear_mirror dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_nearest_clamp dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_nearest_repeat dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_nearest_mirror dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_linear_clamp dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_linear_repeat dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_linear_mirror dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_nearest_clamp dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_nearest_repeat 
dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_nearest_mirror dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_linear_clamp dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_linear_repeat dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_linear_mirror dEQP-GLES2.functional.texture.vertex.2d.wrap.clamp_clamp dEQP-GLES2.functional.texture.vertex.2d.wrap.clamp_repeat dEQP-GLES2.functional.texture.vertex.2d.wrap.clamp_mirror dEQP-GLES2.functional.texture.vertex.2d.wrap.repeat_clamp dEQP-GLES2.functional.texture.vertex.2d.wrap.repeat_repeat dEQP-GLES2.functional.texture.vertex.2d.wrap.repeat_mirror dEQP-GLES2.functional.texture.vertex.2d.wrap.mirror_clamp dEQP-GLES2.functional.texture.vertex.2d.wrap.mirror_repeat dEQP-GLES2.functional.texture.vertex.2d.wrap.mirror_mirror dEQP-GLES2.functional.fbo.api.attach_names dEQP-GLES2.functional.uniform_api.info_query.basic.sampler2D_vertex dEQP-GLES2.functional.uniform_api.info_query.basic.sampler2D_both dEQP-GLES2.functional.uniform_api.info_query.basic.samplerCube_vertex dEQP-GLES2.functional.uniform_api.info_query.basic.samplerCube_both dEQP-GLES2.functional.uniform_api.info_query.basic_array.sampler2D_vertex dEQP-GLES2.functional.uniform_api.info_query.basic_array.sampler2D_both dEQP-GLES2.functional.uniform_api.info_query.basic_struct.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.info_query.basic_struct.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.info_query.struct_in_array.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.info_query.struct_in_array.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.info_query.array_in_struct.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.info_query.array_in_struct.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.info_query.nested_structs_arrays.sampler2D_samplerCube_vertex 
dEQP-GLES2.functional.uniform_api.info_query.nested_structs_arrays.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.info_query.unused_uniforms.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.info_query.unused_uniforms.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic.sampler2D_vertex dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic.sampler2D_both dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic.samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic.samplerCube_both dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic_array.sampler2D_vertex dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic_array.sampler2D_both dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic_struct.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.basic_struct.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.struct_in_array.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.struct_in_array.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.array_in_struct.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.array_in_struct.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.nested_structs_arrays.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.nested_structs_arrays.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.initial.render.basic.sampler2D_vertex dEQP-GLES2.functional.uniform_api.value.initial.render.basic.sampler2D_both dEQP-GLES2.functional.uniform_api.value.initial.render.basic.samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.initial.render.basic.samplerCube_both 
dEQP-GLES2.functional.uniform_api.value.initial.render.basic_array.sampler2D_vertex dEQP-GLES2.functional.uniform_api.value.initial.render.basic_array.sampler2D_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic.sampler2D_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic.sampler2D_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic.samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic.samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic_array.sampler2D_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic_array.sampler2D_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic_array_first_elem_without_brackets.sampler2D_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic_array_first_elem_without_brackets.sampler2D_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic_struct.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.basic_struct.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.struct_in_array.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.struct_in_array.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.array_in_struct.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.array_in_struct.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.nested_structs_arrays.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.get_uniform.nested_structs_arrays.sampler2D_samplerCube_both 
dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic.sampler2D_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic.sampler2D_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic.samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic.samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic_array.sampler2D_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic_array.sampler2D_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic_struct.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic_struct.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.struct_in_array.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.struct_in_array.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.array_in_struct.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.array_in_struct.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic.sampler2D_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic.sampler2D_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic.samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic.samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic_array.sampler2D_vertex 
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic_array.sampler2D_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic_array_first_elem_without_brackets.sampler2D_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic_array_first_elem_without_brackets.sampler2D_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic_struct.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.basic_struct.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.struct_in_array.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.struct_in_array.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.array_in_struct.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.array_in_struct.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.nested_structs_arrays.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_value.get_uniform.nested_structs_arrays.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic.sampler2D_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic.sampler2D_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic.samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic.samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic_array.sampler2D_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic_array.sampler2D_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic_struct.sampler2D_samplerCube_vertex 
dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.basic_struct.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.struct_in_array.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.struct_in_array.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.array_in_struct.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.array_in_struct.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.nested_structs_arrays.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.nested_structs_arrays.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_full.basic_array.sampler2D_vertex dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_full.basic_array.sampler2D_both dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_full.array_in_struct.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_full.array_in_struct.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_partial.basic_array.sampler2D_vertex dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_partial.basic_array.sampler2D_both dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_partial.array_in_struct.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.basic_array_assign_partial.array_in_struct.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.unused_uniforms.sampler2D_samplerCube_vertex dEQP-GLES2.functional.uniform_api.value.assigned.unused_uniforms.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.random.0 dEQP-GLES2.functional.uniform_api.random.3 dEQP-GLES2.functional.uniform_api.random.6 
dEQP-GLES2.functional.uniform_api.random.11 dEQP-GLES2.functional.uniform_api.random.14 dEQP-GLES2.functional.uniform_api.random.21 dEQP-GLES2.functional.uniform_api.random.22 dEQP-GLES2.functional.uniform_api.random.24 dEQP-GLES2.functional.uniform_api.random.25 dEQP-GLES2.functional.uniform_api.random.29 dEQP-GLES2.functional.uniform_api.random.30 dEQP-GLES2.functional.uniform_api.random.32 dEQP-GLES2.functional.uniform_api.random.33 dEQP-GLES2.functional.uniform_api.random.37 dEQP-GLES2.functional.uniform_api.random.41 dEQP-GLES2.functional.uniform_api.random.49 dEQP-GLES2.functional.uniform_api.random.51 dEQP-GLES2.functional.uniform_api.random.55 dEQP-GLES2.functional.uniform_api.random.61 dEQP-GLES2.functional.uniform_api.random.69 dEQP-GLES2.functional.uniform_api.random.72 dEQP-GLES2.functional.uniform_api.random.78 dEQP-GLES2.functional.uniform_api.random.79 dEQP-GLES2.functional.uniform_api.random.82 dEQP-GLES2.functional.uniform_api.random.87 dEQP-GLES2.functional.uniform_api.random.88 dEQP-GLES2.functional.uniform_api.random.94 dEQP-GLES2.functional.uniform_api.random.95 dEQP-GLES2.functional.uniform_api.random.96 Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Marek Vasut <marex@denx.de>	2020-01-03 09:23:41 +01:00
Christian Gmeiner	46b8273eb1	etnaviv: check if MSAA is supported Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2020-01-03 08:31:02 +01:00
Iago Toral Quiroga	2271a187c2	u_vbuf: don't try to delete NULL driver CSO Since `18a8c3f7f1` we don't create a driver CSO if there are any incompatible elements, so only ask backends to delete it if it exists. Fixes multiple CTS crashes in V3D. Fixes: `18a8c3f7f1` ("u_vbuf: Only create driver CSO if no incompatible elements") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2020-01-03 07:58:35 +01:00
Kenneth Graunke	d0d28c783d	iris: Set nir_shader_compiler_options::unify_interfaces. This is technically enabling the option in the common intel backend code, but only the st/nir linker uses the option, so it's iris-only. Fixes Piglit's spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out Closes: #2274 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3249> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3249>	2020-01-03 00:41:50 +00:00
Kenneth Graunke	19ed12afd1	st/nir: Optionally unify inputs_read/outputs_written when linking. i965 and iris use inputs_read/outputs_written for a shader stage to determine the layout of input and output storage. Adjacent stages must agree on the layout, so adjacent input/output bitfields must match. This patch adds a new nir_shader_compiler_options::unify_interfaces flag which asks the linker to unify the input/output interfaces between adjacent stages. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3249>	2020-01-03 00:41:50 +00:00
Kenneth Graunke	7a9c0fc0d7	intel: Drop Gen11 WaBTPPrefetchDisable workaround This isn't needed on production Icelake hardware. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3250> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3250>	2020-01-03 00:20:17 +00:00
Jordan Justen	ed17baab5f	intel: Remove unused Tigerlake PCI ID Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2020-01-02 15:18:18 -08:00
Alyssa Rosenzweig	3759b84926	pan/midgard: Use upper ALU tags for MFBD writeout It's not clear yet what the distinction is. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-02 17:27:23 -05:00
Alyssa Rosenzweig	2d1e18ee83	pan/midgard: Identify ld_color_buffer as 32-bit I'm not sure why I mistakenly identified it as an 8-bit op before. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-02 15:20:55 -05:00
Alyssa Rosenzweig	5063ab6a9c	pan/midgard: Remove old comment Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-02 15:20:55 -05:00
Alyssa Rosenzweig	5bc62af2a0	pan/midgard: Generate MRT writeout loops They need a very particular form; the naive way we did before is not sufficient in practice, it doesn't look like. So let's follow the rough structure of the blob's writeout since this is fixed code anyway. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-02 15:20:55 -05:00
Alyssa Rosenzweig	db879b034a	pan/midgard: Generalize IS_ALU and quadword_size There are more ALU tags, let's do some cleanup while we're at it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-02 15:20:55 -05:00
Alyssa Rosenzweig	66f98ffab0	pan/midgard: Use better heuristic for shader termination This still may not be perfect (in the sense that legal shaders might still get cut off) but this fits how writeout is done with both Panfrost and the blob, so it's good enough for what we need and allows MRT shaders to be sanely disassembled. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-02 15:20:55 -05:00
Alyssa Rosenzweig	c298f25c4e	pan/midgard: Fix memory corruption in constant combining It's a long story... but we'd try to insert constants that weren't there and end up clobbering fields in the bundle following the constant array... Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-02 15:20:55 -05:00
Alyssa Rosenzweig	d58600c0e0	panfrost: Pack MRT blend shaders into a single BO Blend shader size and location in memory is considerably constrained, probably to facilitate optimizations (my guess is that blend shaders are run strictly out of i-cache). We need to pack the blend shaders for each RT of a single framebuffer together. The easiest way to do this is at draw time which is not terribly efficient but will hold us over for now. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-02 15:20:55 -05:00
Alyssa Rosenzweig	1b86e0927d	panfrost: Handle RGB16F colour clear We don't handle this format yet, but we will soon, and the abort in pan_pack_color is possible even without exposing the format... Handling this gracefully might not be required by the spec but let's not crash. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-02 15:20:55 -05:00
Tomeu Vizoso	829f338a59	panfrost: Store internal format It's needed by u_transfer_helper to know when the depth+stencil buffer has been split. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-02 12:41:17 -05:00
Tomeu Vizoso	14bc4c7cce	panfrost: Map with size of first layer for 3D textures As that's what Gallium expects in transfer.layer_stride. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-02 12:41:15 -05:00
Tomeu Vizoso	ed3eede296	panfrost: Dynamically allocate array of texture pointers With 3D textures we can have lots of layers, so better allocate it dynamically at runtime. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2020-01-02 12:41:02 -05:00
Bas Nieuwenhuizen	c1a1a86658	meson: Enable -Werror=int-conversion. I think implicit conversions here are almost always wrong: 1) wrong argument position ptr vs. int 2) will often have issues with 32-bit platforms. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2570> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2570>	2020-01-02 11:47:02 +00:00
Bas Nieuwenhuizen	b72182fcfa	turnip: Use VK_NULL_HANDLE instead of NULL. Only occurrence of implicitly converting pointer->int. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2570>	2020-01-02 11:47:02 +00:00
Bas Nieuwenhuizen	973181c06c	amd/common: Always initialize gfx9 mipmap offset/pitch. The WSI expects pitch to be meaningful even for tiled textures. (It is used for the pitch in modesetting and X11) Fixes: `824bd0830e` "radv: return the correct pitch for linear mipmaps on GFX10" Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2301 Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2304 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3245> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3245>	2020-01-02 11:25:51 +00:00
Bas Nieuwenhuizen	59c4fb9d72	nir: print non-uniform tex fields. To ease debugging in the future. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3246> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3246>	2020-01-02 11:42:33 +01:00
Bas Nieuwenhuizen	69bdc1c5fc	nir: Add clone/hash/serialize support for non-uniform tex instructions. These were missed when the fields got added. Added it everywhere where texture_index got used and it made sense. Found this in "The Surge 2", where the inliner does not copy the fields, resulting in corruption and hangs. Fixes: `3bd5457641` "nir: Add a lowering pass for non-uniform resource access" Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1203 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3246>	2020-01-02 11:41:33 +01:00
Afonso Bordado	525cbe85ef	pan/midgard: Optimize branches with inverted arguments Remove the invert on arguments to branches, and invert the branch condition instead. This saves one instruction per inverted argument. Closes #2088 Signed-off-by: Afonso Bordado <afonsobordado@az8.co> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-31 20:01:16 +00:00
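The branch optimization described in 525cbe85ef can be illustrated with a toy model. This is only a sketch: `struct toy_branch`, `toy_branch_taken` and `toy_fold_invert` are hypothetical names, not Midgard compiler types, and the real pass works on the compiler's own IR. The idea shown is that a branch on an inverted argument behaves the same as a branch on the plain argument with the opposite condition, so the separate invert can be dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy branch, not a real Midgard structure. */
struct toy_branch {
    bool arg_inverted;   /* the condition argument is wrapped in a NOT */
    bool branch_if_true; /* taken when the (possibly inverted) argument is true */
};

/* Evaluate whether the toy branch is taken for a given argument value. */
static bool
toy_branch_taken(struct toy_branch b, bool arg)
{
    bool value = b.arg_inverted ? !arg : arg;
    return value == b.branch_if_true;
}

/* Drop the invert on the argument and flip the branch condition instead.
 * The result is taken for exactly the same argument values. */
static struct toy_branch
toy_fold_invert(struct toy_branch b)
{
    if (b.arg_inverted) {
        b.arg_inverted = false;
        b.branch_if_true = !b.branch_if_true;
    }
    return b;
}
```

In this model the folded branch no longer needs the NOT, which corresponds to the "one instruction per inverted argument" saving mentioned in the commit message.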
Afonso Bordado	0e83688f47	pan/midgard: Move midgard_is_branch_unit to helpers Signed-off-by: Afonso Bordado <afonsobordado@az8.co> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-31 20:01:12 +00:00
Marek Vasut	5e9106f7af	etnaviv: Do not filter out PIPE_FORMAT_S8_UINT_Z24_UNORM on pre-HALTI2 The format PIPE_FORMAT_S8_UINT_Z24_UNORM is supported even on pre-HALTI hardware like GCnano. Do not report it as unsupported format. This fixes the following dEQP on GCnano: dEQP-GLES2.functional.fbo.completeness.renderable.texture.color0.depth_stencil_unsigned_int_24_8 Fixes: `64c7cdcae5` ("etnaviv: add missing formats") Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3200> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3200>	2019-12-31 15:12:49 +00:00
Marek Vasut	a812cb57e5	etnaviv: Report correct number of vertex buffers The GCnano has only 4 vertex buffers instead of 16. This information can be extracted from the GPU status registers and is already stored in screen->specs.stream_count. Use PIPE_CAP_MAX_VERTEX_BUFFERS to report this information and permit u_vbuf to reorganize the shaders to fit. This fixes the following dEQP on GCnano: dEQP-GLES2.functional.shaders.conversions.vector_combine.float_float_float_float_to_vec4_vertex This fixes all the other dEQP-GLES2.functional.shaders.conversions.* which used to fail on GCnano. Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3241> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3241>	2019-12-31 14:55:04 +00:00
Timur Kristóf	11e62a9734	aco: Fix uniform i2i64. Fixes 240 failing test cases in dEQP-VK.spirv_assembly which were failing due to a bad s_ashr_i32 instruction. This commit fixes the instruction format along with the definitions of the instruction. Fixes: `11f43caaec` Cc: 19.3 <mesa-stable@lists.freedesktop.org> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-31 14:22:31 +01:00
Robert Foss	182679e7c5	android: Fix u_format_table.c being generated twice Two competing rules for defining u_format_table.c exist, which is an error. Additionally the more general rule lacks the inclusion of format/u_format.csv. Fixes: `882ca6dfb0` ("util: Move gallium's PIPE_FORMAT utils to /util/format/") Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-12-31 14:06:02 +01:00
Alyssa Rosenzweig	a0d65d860d	pan/midgard: Remove prepacked_branch It's an ugly hack that's no longer used. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-31 03:26:24 +00:00
Alyssa Rosenzweig	02f503ef00	pan/midgard: Convert fragment writeout to proper branches This eliminates the only use of prepacked_branch, which is such a hack anyway. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-31 03:26:24 +00:00
Marek Olšák	84b82f8cd1	winsys/radeon: initialize pte_fragment_size Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org> Closes: #2179 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-30 20:20:59 -05:00
Marek Olšák	5c9dcbea77	Revert "u_vbuf: Regard non-constant vbufs with non-instance elements as free" This reverts commit `c6ef79c488`. It broke torcs.	2019-12-30 18:41:24 -05:00
Alyssa Rosenzweig	3909b16000	panfrost: Respect glPointSize() We have native support for this somehow. Fixes the mesa demo `points` Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig	8f4b15636b	panfrost: Remove MRT indirection in blend shaders Since we have a separate blend shader for each render target, let's simplify this structure and reduce the options memory footprint by 88% or something goofy like that. Should also enable separate blending per render target. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig	67fe2afa51	panfrost: Implement integer varyings We need to actually work out the varying format on demand, rather than assuming rgba32f. Fixes dEQP-GLES3.functional.fragment_out.basic.int.* Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig	62d056d8e3	panfrost: Disable some CAPs we want lowered Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig	71df7c69bc	panfrost: Identify glProvokingVertex flag Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig	c17a441666	pan/midgard: Implement flat shading We need to shuffle around some lowerings but it's just a flag. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig	66c2696fda	pan/midgard: Use type-appropriate st_vary We would like to store (u)ints as well. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig	3996fd7b90	pan/midgard: Promote tilebuffer reads to 32-bit Fixes (among others) dEQP-GLES3.functional.fbo.render.shared_colorbuffer_clear.tex2d_rgba16f Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-30 17:11:08 -05:00
Alyssa Rosenzweig	ddc5a371b3	glsl: Set .flat for gl_FrontFacing It is a boolean. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3237> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3237>	2019-12-30 19:54:50 +00:00
Samuel Pitoiset	824bd0830e	radv: return the correct pitch for linear mipmaps on GFX10 On GFX9, the pitch of a level is always the pitch of the entire image but not on GFX10. This fixes graphics glitches with Halo - The Master Chief Collection. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2188 CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-30 14:17:45 +01:00
Yevhenii Kolesnikov	b318bc2072	meta: Cleanup function for DrawTex Buffer object was never freed, causing memory leaks. Fixes: `76cfe2bc44` ("meta: Don't pollute the buffer object namespace in _mesa_meta_DrawTex") CC: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1390> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1390>	2019-12-30 12:41:52 +02:00
Jan Zielinski	7040d6c197	gallium/gallivm/tgsi: enable tessellation shaders Tessellation Control and Evaluation shaders implement tessellation and require special handling of their inputs and outputs. TCS can write out not only per-vertex, but also per-patch (per-primitive) attributes and tessellation factor values that control the tessellator. TES can read TCS outputs, and must also be fed new system values (tessellation coordinates) that are outputs of the tessellator fixed function. TCS can also contain calls to the barrier() function (similar to compute shaders). Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-12-30 11:32:47 +01:00
Dave Airlie	26c5ae80f0	llvmpipe: enable ARB_shader_group_vote. This just adds the NIR paths for shader group vote. v2: drop feq for now. (Roland) Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3213> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3213>	2019-12-30 05:30:30 +00:00
Bas Nieuwenhuizen	88f567b5ce	amd/common: Handle alignment of 96-bit formats. addrlib doesn't quite do it right, so do it ourselves. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2162 CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-30 00:02:46 +01:00
Caio Marcelo de Oliveira Filho	b0203b561c	panfrost: Fix Makefile.sources Add missing `\`. Fixes Android build. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Fixes: `de077c2078` ("panfrost: Remove mali_alt_func")	2019-12-28 12:31:41 -08:00
Eric Engestrom	a6873a8df2	mesa: avoid returning a value in a void function Fixes: `1d1722e910` ("mesa: add EXT_dsa NamedProgram functions") Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-28 11:58:37 +00:00
Eric Engestrom	dcba7731e6	meson: simplify install_megadrivers.py invocation Note: `find_program()` needs a shebang on scripts. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-12-27 22:43:34 +00:00
Eric Engestrom	ff3a2576a4	nine: fix empty-body-issues Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Fixes: `8d43e2b2de` ("meson: add -Werror=empty-body to disallow `if(x);`") Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-12-27 22:09:16 +00:00
Eric Engestrom	51569e525a	amd: fix empty-body issues Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Fixes: `8d43e2b2de` ("meson: add -Werror=empty-body to disallow `if(x);`") Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-12-27 22:09:00 +00:00
Eric Engestrom	7a4a75a185	u_format: move format tests to util/tests/ Suggested-by: Eric Anholt <eric@anholt.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-27 21:04:44 +00:00
Eric Engestrom	da9937d09b	util/format: add trivial srgb<->linear conversion test This would've caught `8829f9ccb0` ("u_format: add ETC2 to util_format_srgb/util_format_linear"). Suggested-by: Eric Anholt <eric@anholt.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-27 21:04:43 +00:00
Eric Engestrom	8f4d4c808b	util/format: add PIPE_FORMAT_ASTC_xx*_SRGB to util_format_{srgb,linear}() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-27 21:04:43 +00:00
Eric Engestrom	cc7a64f101	util/format: remove left-over util_format_description_table declaration Fixes: `3c45c4bc44` ("util: Cope with the fact that formats in u_format.csv are not ordered.") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-27 21:04:43 +00:00
Dave Airlie	baa064f0f5	gallivm: fixup const int64 builder. Pointed out by Ilia. Fixes: `84ba008774` (gallivm: add 64-bit const int creator.) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-28 05:44:31 +10:00
Marek Olšák	e79f55ff86	radeonsi/gfx10: improve performance for TES using PrimID but not exporting it This field is really for the primitive export to the pixel shader. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-27 13:50:57 -05:00
Marek Olšák	aa3df12fc2	radeonsi/gfx10: enable NGG passthrough for eligible shaders Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-27 13:50:57 -05:00
Marek Olšák	17164d4e27	radeonsi/gfx10: don't declare any LDS for NGG if it's not used Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-27 13:50:57 -05:00
Alyssa Rosenzweig	65e5c1942a	panfrost: Remove 32-bit next_job path It has been unused for a while; let's just remove the abstraction. Technically the hardware does support 32-bit job descriptors, but we don't and we can't keep them from breaking so let's not pretend they work. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Suggested-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-12-27 13:03:22 -05:00
Alyssa Rosenzweig	95ba661b49	panfrost: Update comment about work/uniform_count Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-27 13:01:17 -05:00
Alyssa Rosenzweig	de077c2078	panfrost: Remove mali_alt_func There's only one way to encode comparison functions in the command stream, not two. It's just that the semantics for texture comparisons are flipped from the semantics of stencil comparison. We can factor out that flip to common Panfrost code, rather than tying it to a second Gallium routine. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-27 12:58:00 -05:00
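One way to read the "flip" mentioned in de077c2078 is as a swap of the two operands of the comparison. The sketch below works under that assumption only. `enum toy_func`, `toy_compare` and `toy_flip` are hypothetical names for illustration and are not the Panfrost definitions or the hardware encoding.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical comparison functions, not the hardware encoding. */
enum toy_func {
    TOY_NEVER, TOY_LESS, TOY_EQUAL, TOY_LEQUAL,
    TOY_GREATER, TOY_NOTEQUAL, TOY_GEQUAL, TOY_ALWAYS
};

/* Evaluate "a <func> b". */
static bool
toy_compare(enum toy_func f, int a, int b)
{
    switch (f) {
    case TOY_NEVER:    return false;
    case TOY_LESS:     return a < b;
    case TOY_EQUAL:    return a == b;
    case TOY_LEQUAL:   return a <= b;
    case TOY_GREATER:  return a > b;
    case TOY_NOTEQUAL: return a != b;
    case TOY_GEQUAL:   return a >= b;
    case TOY_ALWAYS:   return true;
    }
    return false;
}

/* Return the function g such that "a g b" equals "b f a".
 * Only the ordered comparisons change; the rest are symmetric. */
static enum toy_func
toy_flip(enum toy_func f)
{
    switch (f) {
    case TOY_LESS:    return TOY_GREATER;
    case TOY_GREATER: return TOY_LESS;
    case TOY_LEQUAL:  return TOY_GEQUAL;
    case TOY_GEQUAL:  return TOY_LEQUAL;
    default:          return f;
    }
}
```

With a helper of this shape, a single encoding of comparison functions can serve both uses, and the caller applies the flip where the semantics differ.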
Alyssa Rosenzweig	bc1fc29e21	panfrost: Add missing #include in common header Fixes way back when... Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-27 12:58:00 -05:00
Alyssa Rosenzweig	330e9b154e	panfrost: Add pan_attributes.c to Android.mk Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Fixes: `31305e1b28` ("panfrost: Move instancing routines to encoder/")	2019-12-27 12:58:00 -05:00
Alyssa Rosenzweig	5fe58271b2	panfrost: Implement remaining texture wrap modes Somehow we have native hardware for all of these. Suspected by staring at the bit pattern; confirmed by poking in various texture wrap modes into the textures mesa demo and seeing what happens. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-27 12:58:00 -05:00
Alyssa Rosenzweig	4ccd42e0bc	panfrost: Inline away MALI_NEGATIVE It's an awfully fancy way to add one... Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-27 12:16:09 -05:00
Alyssa Rosenzweig	76519b216b	panfrost: Remove MALI_ATTR_INTERNAL It's a relic from before we understood the varying builtins. It should never actually come up if the builtins are decoded correctly. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-27 12:11:37 -05:00
Alyssa Rosenzweig	5f8376101d	panfrost: Update information on fixed attributes/varyings Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-27 12:10:24 -05:00
Alyssa Rosenzweig	9bde6e551d	panfrost: Remove MALI_SPECIAL_ATTRIBUTE_BASE defines These are conventions by the blob (a convention we happen to follow). They are not at all intrinsic to the hardware, so now that the convention is implemented within the Midgard stack, these defines are wholly unused. Remove them. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-27 12:08:45 -05:00
Alyssa Rosenzweig	8c188722d9	pan/midgard: Fix minor typo Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reported-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-12-27 12:07:45 -05:00
Mauro Rossi	563bd61fee	android: radv: build radv_shader_args.c Updates radv Makefile.sources and fixes the following building error: external/mesa/src/amd/vulkan/radv_shader.c:1122: error: undefined reference to 'radv_declare_shader_args' Fixes: `3b14336` ("ac/nir, radv, radeonsi: Switch to using ac_shader_args") Fixes: `66c703b` ("radv: Move argument declaration out of nir_to_llvm") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-12-27 09:20:54 +01:00
Mauro Rossi	962b70c259	android: radeonsi,ac: fix building error due to ac changes Updates amd Makefile.sources and fixes the following building errors: external/mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c:338: error: undefined reference to 'ac_add_arg' external/mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c:340: error: undefined reference to 'ac_add_arg' external/mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c:341: error: undefined reference to 'ac_add_arg' external/mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c:342: error: undefined reference to 'ac_add_arg' Fixes: `9885af3` ("ac: Add a shared interface between radv, radeonsi, LLVM and ACO") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-12-27 09:20:49 +01:00
Mauro Rossi	ad1c65e322	android: radv: fix vk_format_table.c generated source build RADV Android build rules are now getting the wrong vk_format.h from src/vulkan/util include, the simplest way to fix is to add src/amd/vulkan include prior to src/vulkan/util include Fixes the following building errors: out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/vk_format_table.c:39:4: error: use of undeclared identifier 'VK_FORMAT_LAYOUT_PLAIN' ... out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/vk_format_table.c:131:8: error: use of undeclared identifier 'VK_FORMAT_TYPE_UNSIGNED'; did you mean 'UTIL_FORMAT_TYPE_UNSIGNED'? {VK_FORMAT_TYPE_UNSIGNED, true, false, false, 4, 0}, /* x = a */ fatal error: too many errors emitted, stopping now [-ferror-limit=] 20 errors generated. Fixes: `3a28281` ("util: Add a mapping from VkFormat to PIPE_FORMAT.") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-12-27 09:20:44 +01:00
Mauro Rossi	13ef793770	android: util: Add a mapping from VkFormat to PIPE_FORMAT. Updates Makefile.sources and fixes the following building error: In file included from external/mesa/src/vulkan/util/vk_format.c:24: In file included from external/mesa/src/vulkan/util/vk_format.h:28: external/mesa/src/util/format/u_format.h:33:10: fatal error: 'pipe/p_format.h' file not found #include "pipe/p_format.h" ^~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `3a28281` ("util: Add a mapping from VkFormat to PIPE_FORMAT.") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-27 09:20:40 +01:00
Mauro Rossi	200be80858	android: nir: add a load/store vectorization pass Fixes the following aco building error: external/mesa/src/amd/compiler/aco_instruction_selection_setup.cpp:846: error: undefined reference to 'nir_opt_load_store_vectorize' Fixes: `ce9205c` ("nir: add a load/store vectorization pass") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-27 09:20:35 +01:00
Dave Airlie	c8042c289e	llvmpipe: add debug option to enable OpenCL support. LP_DEBUG=cl will enable CL support for now. Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:26:33 +10:00
Dave Airlie	29784bb49c	gallivm/nir: add vec8/16 support Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:26:33 +10:00
Dave Airlie	5be1ea7d79	gallivm/nir: lower packing This fixes some CL upsample tests, which lower into packing that needs lowering. Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:26:33 +10:00
Dave Airlie	31e0e8a51b	llvmpipe: lower hadd/add_sat Fixes some CL piglits. Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:26:33 +10:00
Dave Airlie	0a73eafdbe	gallivm: handle non-32 bit undefined other sized undefs caused llvm asserts Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:26:33 +10:00
Dave Airlie	b16fd4d9e9	llvmpipe/nir: use nir_max_vec_components in more places This is prep work for when vec8/16 have landed. Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:26:33 +10:00
Dave Airlie	073734ca7f	llvmpipe: add support for compute shader params Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:26:33 +10:00
Dave Airlie	22d631e235	llvmpipe: handle serialized nir as a shader type. Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:26:33 +10:00
Dave Airlie	264663d55d	gallivm/llvmpipe: add support for global operations. Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:26:33 +10:00
Dave Airlie	9630c2ddd8	gallivm/llvmpipe: add support for block size intrinsic We have to pass the main block size into the coroutine and into the shader. Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:26:32 +10:00
Dave Airlie	336954f7e7	gallivm/llvmpipe: add support for work dimension intrinsic. We have to pass the work_dim given by the user into the shader. Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:26:32 +10:00
Dave Airlie	b8d403c03f	tgsi/mesa: handle KERNEL case Translate to compute for now. Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:26:32 +10:00
Dave Airlie	dac8cb981f	gallivm/nir: allow 8/16-bit conversion and comparison. This adds the convert to 8/16 and support for 8/16 comparisons Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:26:32 +10:00
Dave Airlie	3adf74f2ef	gallivm: pick integer builders for alu instructions. This allows these to be used with non 32-bit types. Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:26:32 +10:00
Dave Airlie	df3e0fe9d8	gallivm: add support for 8-bit/16-bit integer builders Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:22:43 +10:00
Dave Airlie	258b9bc02e	llvmpipe/gallivm: add kernel inputs compute shaders need kernel input support Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:22:40 +10:00
Dave Airlie	84ba008774	gallivm: add 64-bit const int creator. Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-12-27 13:22:35 +10:00
Dave Airlie	41c77dbc1e	nir: sanitize work group intrinsics to always be 32-bit. This saves handling them in the backend later. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-12-27 13:22:34 +10:00
Bas Nieuwenhuizen	a435f002c4	radv: Expose all sample counts for integer formats as well. Things work the same between float and integer. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2261 CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-12-26 10:48:29 +01:00
Alyssa Rosenzweig	be691ca22d	panfrost: Route gl_VertexID through cmdstream It shows up as a special (magic?) attribute. We could try to be clever and only include the extra record if gl_VertexID is actually read, but honestly that's just extra complexity for no good reason. Might as well just always include it; this won't be a real bottleneck, I don't think. Fixes dEQP-GLES3.functional.shaders.builtin_variable.vertex_id. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 22:55:04 -05:00
Alyssa Rosenzweig	8781378224	panfrost: Extend attribute_count for vertex builtins They stretch beyond the usual limit for attributes so are included implicitly. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 22:55:04 -05:00
Alyssa Rosenzweig	306800d747	pan/midgard: Lower gl_VertexID/gl_InstanceID to attributes We have special records for these, put in a fixed location by convention per the blob. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 22:55:04 -05:00
Alyssa Rosenzweig	6e68890fd6	pan/midgard: Factor out emit_attr_read We will load attributes directly for gl_VertexID. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 22:55:04 -05:00
Alyssa Rosenzweig	695b35605b	panfrost: Unset vertex_id_zero_based We don't want the lowering; we have native gl_VertexID. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 22:55:04 -05:00
Alyssa Rosenzweig	3b3d9653a7	pan/decode: Handle gl_VertexID/gl_InstanceID Just like varyings have special records for point coordinates (etc), attributes have special records for vertex/instance ID. We can parse these fairly easily, although they don't line up exactly with normal attribute records. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 22:54:58 -05:00
Alyssa Rosenzweig	d36ca7c0a3	panfrost: Remove pan_shift_odd Padded counts are numbers of the form: n = (2k + 1) * 2^s for k, s integers. Rather than explicitly store k and s separately and then compute this formula on demand, it's much cleaner to store the padded number itself, which is what you manipulate most of the time. When you do need k,s it is easy to factor by noticing the bitwise representation: s = ctz(n) k = n >> (s + 1) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 22:42:07 -05:00
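The factoring described in d36ca7c0a3 can be written out as a short sketch. The struct and function names here are hypothetical and are not taken from the Panfrost source. The sketch assumes a nonzero 32-bit count and a compiler that provides the GCC/Clang builtin `__builtin_ctz`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two factors of n = (2k + 1) * 2^s. */
struct padded_factors {
    unsigned shift; /* s */
    unsigned odd;   /* k */
};

/* Factor a nonzero padded count. __builtin_ctz is undefined for 0,
 * hence the assertion. */
static struct padded_factors
factor_padded_count(uint32_t n)
{
    assert(n != 0);

    struct padded_factors f;
    f.shift = (unsigned) __builtin_ctz(n);
    /* Same value as n >> (s + 1), written as two shifts so that
     * s = 31 never produces a 32-bit shift. */
    f.odd = (n >> f.shift) >> 1;
    return f;
}

/* Recompute n from its factors, as a consistency check. */
static uint32_t
rebuild_padded_count(struct padded_factors f)
{
    return (2u * f.odd + 1u) << f.shift;
}
```

For example, 12 = 3 * 2^2 has two trailing zero bits, so s = 2 and k = 12 >> 3 = 1, and (2 * 1 + 1) * 2^2 gives 12 again.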
Alyssa Rosenzweig	62ce9001c2	panfrost: Slight cleanup of Gallium's pan_attribute.c Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 22:42:07 -05:00
Alyssa Rosenzweig	385a4f773f	pan/decode: Fix reference computation for invocations Slight bug with instancing. No harm done but let's get rid of the pandecode warning, it's just noise. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 22:42:07 -05:00
Alyssa Rosenzweig	9c249d3e6b	panfrost: Fix off-by-one in pan_invocation.c When instance_count=2, the packing code was broken. Fixes a dEQP test. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 22:42:07 -05:00
Alyssa Rosenzweig	467ae0d39d	panfrost: Factor out panfrost_compute_magic_divisor The algorithm doesn't need to be tangled up in details about the attribute records themselves. We'll need to compute magic divisors for gl_InstanceID in a second. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 22:42:07 -05:00
Alyssa Rosenzweig	31305e1b28	panfrost: Move instancing routines to encoder/ Nothing Gallium specific or stateful about them. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 19:48:57 -05:00
Alyssa Rosenzweig	8a57672673	panfrost: Factor batch/resource out of instancing routines They don't need them; this will allow us to move the code into encoder/ which in turn will make the messy Gallium code less scary. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 19:48:57 -05:00
Alyssa Rosenzweig	ddcd68f52b	panfrost: Rename pan_instancing.c -> pan_attributes.c Let's follow the naming convention that panfrost command stream code is organized by command stream structure. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 19:48:57 -05:00
Alyssa Rosenzweig	a0e75adabb	pan/midgard: Compute destination override We shift over the mask in this case. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 19:23:02 -05:00
Alyssa Rosenzweig	9a5d462480	pan/midgard: Add mir_upper_override helper Checks if we should emit a dest_override=upper, given a mask. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 19:23:02 -05:00
Alyssa Rosenzweig	fc4193d0c7	pan/midgard: Support loads from R11G11B10 in a blend shader Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 19:22:54 -05:00
Alyssa Rosenzweig	3af5a398f3	pan/midgard: Enable lower_(un)pack_* lowering These show up in some blend shaders. Let's use the shared lowering and remove our own. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 19:21:52 -05:00
Tomeu Vizoso	843a6db6bb	panfrost: Increase PIPE_SHADER_CAP_MAX_OUTPUTS to 16 GL ES 3.0 requires it to be higher, and stuff seems to work just fine. Fixes: dEQP-GLES3.functional.implementation_limits.max_vertex_output_components Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-12-24 19:21:52 -05:00
Tomeu Vizoso	f107059bb2	panfrost: Handle Z24_UNORM_S8_UINT as MALI_Z32_UNORM Fixes dEQP-GLES3.functional.texture.format.sized.2d.depth24_stencil8_pot Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-12-24 19:21:52 -05:00
Alyssa Rosenzweig	6b7243f28f	pan/midgard: Implement shadow cubemaps We need to reshuffle to sync up the shadow coordinate temporary with the cubemap coordinate temporary. Once that's in place, it's simple enough (we load the shadow coordinate into .z like 2D). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 23:46:23 +00:00
Alyssa Rosenzweig	9e5a1412ed	pan/midgard: Generalize temp coordinate to non-2D Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 23:46:23 +00:00
Alyssa Rosenzweig	1bce7fdecd	pan/midgard: Do witchcraft on texture offsets My latest divination spell has uncovered a pattern in the aether. Although the swizzle is unaligned, its format is otherwise standard. Document this, removing the old incorrect understanding of the swizzle (which coincided on common special swizzles only). Fixes dEQP-GLES3.functional.shaders.texture_functions.texelfetchoffset.sampler2d_fixed_fragment Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 23:46:23 +00:00
Alyssa Rosenzweig	4ec1f95d76	pan/midgard: Fix fallthrough from offset to comparator Fixes: `ccbc9a4e67` ("pan/midgard: Implement textureOffset for 2D textures") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 23:46:22 +00:00
Alyssa Rosenzweig	64b2fe9626	pan/midgard: Expand swizzle for texelFetch We zero the extra components anyway. Fixes dEQP-GLES3.functional.shaders.texture_functions.texelfetch.sampler2d_fixed_fragment Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 23:46:22 +00:00
Alyssa Rosenzweig	72e5749a63	pan/midgard: Clamp LOD register swizzle Fixes register allocation failures with textureLodOffset. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 23:46:22 +00:00
Alyssa Rosenzweig	06df977c1c	pan/midgard: Extend IS_VEC4_ONLY to arguments I think both need to be aligned at least for ld_cubemap_coords. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 23:46:22 +00:00
Alyssa Rosenzweig	4e75d75724	pan/midgard: Bounds check lcra_restrict_range We may call it with sentinel values (~0 in particular) corresponding to unused arguments; ignore these. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 23:46:22 +00:00
Rob Clark	0c32063794	freedreno/ir3: fix flat shading again These days `ctx->inputs` is the split scalar input components and `ir->inputs` is the full vecN. This got fixed in the load_input case, but the load_interpolated_input case was missed. Fixes: `bdf6b7018c` ("freedreno/ir3: re-work shader inputs/outputs") Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-24 17:16:31 +00:00
Alyssa Rosenzweig	a8beef332d	pan/midgard: Fix disassembler cycle/quadword counting Due to the succeeding break we would fall into some off-by-one errors. These should be resolved now. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 16:55:46 +00:00
Alyssa Rosenzweig	0cc6e33537	pan/decode: Append 0:0 spills:fills to blobber-db At the moment there's no need to actually count these but we do need a placeholder for report.py to be happy. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 16:55:46 +00:00
Alyssa Rosenzweig	6a74934e7a	pan/decode: Prefix blobberdb with MESA_SHADER_* We use these prefixes in panfrost shader-db and they need to match for shader-db to be happy. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 16:55:46 +00:00
Alyssa Rosenzweig	ead35f586c	pan/decode: Skip COMPUTE in blobber-db The blob uses COMPUTE jobs for some internal purposes. These are essentially free but panfrost doesn't use them, so it messes up the numbering. Just filter them out. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 16:55:46 +00:00
Alyssa Rosenzweig	09671c8d68	panfrost: Decode shader types in pantrace shader-db We see some COMPUTE jobs that were mistakenly identified as VERTEX. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-24 16:55:46 +00:00
Jason Ekstrand	ac70442ce1	anv: Properly advertise sampledImageIntegerSampleCounts We support the same set of samples for integer color formats as for non-integer. We've been advertising it wrong since before the initial Vulkan 1.0 release. :-( Fixes: `d689745303` "vk/0.210.0: Rework device features and limits" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-24 08:31:44 -06:00
Roman Stratiienko	c411d4896c	Android: Fix build issue without LLVM Some of the latest changes are causing the following build error on Android: ``` external/mesa3d/src/gallium/auxiliary/nir/nir_to_tgsi_info.c:403:6: error: redefinition of 'nir_tgsi_scan_shader' void nir_tgsi_scan_shader(const struct nir_shader nir, ^ external/mesa3d/src/gallium/auxiliary/nir/nir_to_tgsi_info.h:37:20: note: previous definition is here static inline void nir_tgsi_scan_shader(const struct nir_shader nir, ^ ``` Include nir_to_tgsi_info.c and nir_to_tgsi_info.h into the build only if LLVM is enabled. Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2978> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2978>	2019-12-23 10:22:02 +02:00
Kenneth Graunke	97e9de1795	iris: Avoid replacing backing storage for buffers with no contents We might get asked to pitch the storage on a buffer that already has no meaningful contents. In this case, the existing buffer is as good as a new one.	2019-12-22 16:18:30 -08:00
Kenneth Graunke	c96c1141fb	iris: Fix shader recompile debug printing I was passing iris keys to brw_debug_key_recompile, leading to out of bounds memory reads. Fixes: `2e654db27a` ("iris: Create smaller program keys without legacy features")	2019-12-22 16:18:30 -08:00
Kenneth Graunke	1ef4514c5b	iris: Make helper functions to turn iris shader keys into brw keys. We'll need to use these in recompile debugging in the next commit. Fixes: `2e654db27a` ("iris: Create smaller program keys without legacy features")	2019-12-22 16:18:30 -08:00
Vinson Lee	2d971cc1ca	swr: Fix build with llvm-10.0. Fix build error after llvm-10 commit 5d986953c8b9 ("[IR] Split out target specific intrinsic enums into separate headers"). ../src/gallium/drivers/swr/rasterizer/jitter/functionpasses/lower_x86.cpp:78:37: error: ‘x86_bmi_bextr_32’ is not a member of ‘llvm::Intrinsic’ {"meta.intrinsic.BEXTR_32", Intrinsic::x86_bmi_bextr_32}, ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com> Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2019-12-21 16:36:27 -08:00
Eric Engestrom	bc943d00aa	travis: autodetect python version instead of hard-coding it Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-12-21 20:23:08 +00:00
Marek Vasut	45e1443fd8	etnaviv: tgsi: Fix gl_FrontFacing support The GPU presents the state of the hardware front_face in internal register 0 (i0), the range of which is 0.0f..1.0f. This patch assigns the fragment shader input to this internal register. Moreover, based on the internal front_ccw state, the value of the i0 register is inverted accordingly using SET.EQ/SEQ.NE instruction before being further processed in the shader. This mimics the operation of the NIR compiler. Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2868> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2868>	2019-12-21 20:17:27 +01:00
Paul Cercueil	63b33120b7	u_vbuf: Return true in u_vbuf_get_caps if nb of vbufs is below minimum Return true in u_vbuf_get_caps if the number of vertex buffers is below the minimum required for proper OpenGL 2.0. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>	2019-12-21 18:29:30 +00:00
Paul Cercueil	c6ef79c488	u_vbuf: Regard non-constant vbufs with non-instance elements as free In the case of unroll_indices, we can regard all non-constant vertex buffers with only non-instance vertex elements as incompatible and thus free. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>	2019-12-21 18:29:30 +00:00
Wladimir J. van der Laan	87a6029ccf	u_vbuf: use single vertex buffer if it's not possible to have multiple Put CONST, VERTEX and INSTANCE attributes into one vertex buffer if necessary due to hardware constraints. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>	2019-12-21 18:29:30 +00:00
Paul Cercueil	18a8c3f7f1	u_vbuf: Only create driver CSO if no incompatible elements Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>	2019-12-21 18:29:30 +00:00
Paul Cercueil	88d041a6b9	u_vbuf: Mark vbufs incompatible if more were requested than HW supports More vertex buffers are used than the hardware supports. In principle, we only need to make sure that fewer vertex buffers are used, and mark some of the latter vertex buffers as incompatible. For now, mark all vertex buffers as incompatible. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>	2019-12-21 18:29:30 +00:00
Wladimir J. van der Laan	5f37e38b81	u_vbuf: add logic to use a limited number of vbufs Make it possible to limit the number of vertex buffers as there exist GPUs with fewer than 32 supported vertex buffers. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>	2019-12-21 18:29:30 +00:00
Christian Gmeiner	5bd6a5c41b	gallium: add PIPE_CAP_MAX_VERTEX_BUFFERS Add PIPE_CAP_MAX_VERTEX_BUFFERS param, which defaults to 16. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2807>	2019-12-21 18:29:30 +00:00
David Heidelberg	5343124932	.mailmap: use correct email address Signed-off-by: David Heidelberg <david@ixit.cz> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3190> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3190>	2019-12-21 17:50:01 +00:00
Paul Cercueil	2bbf8ebadc	kmsro: Extend to include ingenic-drm This enables Mesa to work with Ingenic SoCs through the use of the ingenic-drm modesetting driver along with the render-only drivers, such as Etnaviv on the JZ4770 SoC. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-21 18:27:51 +01:00
Stephan Gerhold	4da46a1c3c	kmsro: Add "mcde" entry point ST-Ericsson Ux500 boards use a Mali 400 GPU together with MCDE ("Multi Channel Display Engine"), which is supported by the "mcde" DRM driver. Adding an entry point for it in kmsro seems to be enough to make Lima work - at least kmscube is working correctly. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3139> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3139>	2019-12-21 16:43:35 +00:00
Rhys Perry	afe1a8ff5b	aco: fix vgpr alloc granule with wave32 We still need to increase the number of physical vgprs Totals from affected shaders: SGPRS: 671976 -> 675288 (0.49 %) VGPRS: 550112 -> 562596 (2.27 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 27621660 -> 27606532 (-0.05 %) bytes Max Waves: 81083 -> 87833 (8.32 %) Instructions: 5391560 -> 5389031 (-0.05 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-21 12:38:42 +01:00
Rhys Perry	01ccd7839c	aco: improve jump threading with wave32 Totals from affected shaders: SGPRS: 748746 -> 748746 (0.00 %) VGPRS: 636984 -> 636984 (0.00 %) Spilled SGPRs: 387 -> 387 (0.00 %) Spilled VGPRs: 15 -> 15 (0.00 %) Code Size: 61138824 -> 60928620 (-0.34 %) bytes Max Waves: 48602 -> 48602 (0.00 %) Instructions: 11967660 -> 11915084 (-0.44 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-21 12:38:42 +01:00
Rhys Perry	6ff92f3d68	aco/wave32: fix comparison optimizations Previously, they weren't done in wave32. Totals from affected shaders: SGPRS: 507726 -> 508006 (0.06 %) VGPRS: 450340 -> 450268 (-0.02 %) Spilled SGPRs: 298 -> 298 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 39689708 -> 39384488 (-0.77 %) bytes Max Waves: 39631 -> 39636 (0.01 %) Instructions: 7865919 -> 7793650 (-0.92 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-21 12:38:42 +01:00
Karol Herbst	4dd08b710b	nv50ir/nir: support vec8 and vec16 Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-12-21 11:00:17 +00:00
Rob Clark	a8ec4082a4	nir+vtn: vec8+vec16 support This introduces new vec8 and vec16 instructions (which are the only instructions taking more than 4 sources), in order to construct 8 and 16 component vectors. In order to avoid fixing up the non-autogenerated nir_build_alu() sites and making them pass 16 src args for the benefit of the two instructions that take more than 4 srcs (ie vec8 and vec16), nir_build_alu() has nir_build_alu_tail() split out and re-used by nir_build_alu2() (which is used for the > 4 src args case). v2 (Karol Herbst): use nir_build_alu2 for vec8 and vec16 use python's array multiplication syntax add nir_op_vec helper simplify nir_vec nir_build_alu_tail -> nir_builder_alu_instr_finish_and_insert use nir_build_alu for opcodes with <= 4 sources v3 (Karol Herbst): fix nir_serialize v4 (Dave Airlie): fix serialization of glsl_type handle vec8/16 in lowering of bools v5 (Karol Herbst): fix load store vectorizer Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-21 11:00:17 +00:00
Karol Herbst	b35e583c17	aco: use NIR_MAX_VEC_COMPONENTS instead of 4 Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-21 11:00:16 +00:00
Karol Herbst	c83b1a4560	nir/serialize: cast swizzle before shifting fixes undefined behaviour with enabled vec16 Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-21 11:00:16 +00:00
Dave Airlie	e6b2af56cb	llvmpipe: switch to NIR by default Add LP_DEBUG=tgsi_ir (tgsi already taken) to fallback to TGSI paths. Disable NIR_VALIDATE in CI (Michel/Eric acked) Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2303> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2303>	2019-12-21 13:07:17 +10:00
Dave Airlie	c717ac1247	gallivm/nir: wrap idiv to avoid divide by 0 (v2) This code is taken from the TGSI paths, and should fix the regression seen with GLES2. v2: use the udiv path which has d3d10 defined return. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2303>	2019-12-21 13:06:58 +10:00
Marek Olšák	7d65614422	ac/surface: fix an assertion failure on gfx9 in CMASK computation addrlib only allows the 2D resource type with CMASK. Fixes: `69ea473eeb` "amd/addrlib: update to the latest version" Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3187> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3187>	2019-12-20 22:57:08 +00:00
Afonso Bordado	3e1e4ad13d	pan/midgard: Optimize comparisons with similar operations Optimizes comparisons by removing the invert flag on operands which we can prove to be equal without the invert. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3036> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3036>	2019-12-20 22:36:06 +00:00
Erico Nunes	8e9e94d084	lima: set shader caps to optimize control flow With these new caps, nir is able to unroll loops and optimize conditionals much more efficiently in both gpir and ppir. panfrost and vc4 were used as reference for the values. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3176> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3176>	2019-12-20 20:59:15 +01:00
Erico Nunes	4322656dee	lima/ppir: remove assert on ppir_emit_tex unsupported feature This assert causes testing tools such as shaderdb to abort on some test cases. This is an unsupported feature and not a compiler bug. The compilation error is already propagated correctly, so we can remove the assert to allow testing tools to run to completion. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3176>	2019-12-20 20:58:50 +01:00
Erico Nunes	d56710ab82	lima/ppir: fix lod bias src ppir has some code that operates on all ppir_src variables, and for that uses ppir_node_get_src. lod bias support introduced a separate ppir_src that is inaccessible by that function, causing it to be missed by the compiler in some routines. Ultimately this caused, in some cases, a bug in const lowering: .../pp/lower.c:42: ppir_lower_const: Assertion `src != NULL' failed. This fix moves the ppir_srcs in ppir_load_texture_node together so they don't get missed. Fixes: `721d82cf06` lima/ppir: add lod-bias support Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3185> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3185>	2019-12-20 19:39:55 +00:00
Andreas Baierl	1b0743dbb6	lima: Fix dump file creation Otherwise lima_dump_file_next() always opens a new file and creates the dumps regardless of what the environment variables say. Fixes `d71cd245d7` ('lima: Rotate dump files after each finished pp frame') Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3179> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3179>	2019-12-20 17:44:12 +01:00
Pierre-Eric Pelloux-Prayer	9c2a3b4e75	radeon/vcn2: enable rate control for hevc encoding Based on `b0626c1f30` ("radeon/vcn: enable rate control for hevc encoding"). Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2225 Fixes: `587b9c5dae` ("radeon/vcn: implement vcn 2.0 encode") Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3134> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3134>	2019-12-20 16:51:53 +01:00
Samuel Pitoiset	02dd1fb859	radv: rely on pipeline layout when creating push descriptors with template descriptorSetLayout should be ignored for push descriptors. While we are at it, also ignore pipelineBindPoint. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2210 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3180> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3180>	2019-12-20 13:41:29 +01:00
Marek Vasut	f51ee564f5	etnaviv: Replace bitwise OR with logical OR The test here is testing whether either variable is non-zero. While currently the test works fine, it's fragile. Replace it with logical OR to avoid the fragility. Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-12-20 13:15:37 +01:00
Christian Gmeiner	6e75f2172b	etnaviv: update resource status after flushing Currently piglit spec@arb_occlusion_query@occlusion_query_conform spins forever as the resource status is never reset. See etna_hw_get_query_result(..) for more details. Fixes: `1456aa61cc` ("etnaviv: Rework resource status tracking") CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Marek Vasut <marex@denx.de>	2019-12-20 12:43:23 +01:00
Ross Zwisler	cabcbb4db0	intel: limit shader geometry on BDW GT1 Similar to the SKL GT1 fix introduced here: `b1ba7ffdbd` we need to limit the .urb.max_entries[MESA_SHADER_GEOMETRY] on BDW GT1 to address failures in these two tests: dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_3d dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_2d_array The value 690 was found via bisection. 691 is the actual max on the hardware I'm using, but 690 seemed like a nice round number. Signed-off-by: Ross Zwisler <zwisler@google.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3173> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3173>	2019-12-20 10:47:52 +00:00
Alyssa Rosenzweig	c57337bbd3	pan/midgard: Lower txd with lower_tex This is a hack since we do have native gradient stuff, but for the moment I'm more interested in conformance and the lowered code is good enough. Fixes dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler2d_fixed_fragment Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169>	2019-12-20 09:10:39 +01:00
Alyssa Rosenzweig	da73651da4	pan/midgard: Fix crash with txs This regressed since we implemented RECT textures natively, oops. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169>	2019-12-20 09:10:36 +01:00
Alyssa Rosenzweig	ccbc9a4e67	pan/midgard: Implement textureOffset for 2D textures Fixes dEQP-GLES3.functional.shaders.texture_functions.textureoffset.sampler2d_fixed_fragment. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169>	2019-12-20 09:10:26 +01:00
Samuel Pitoiset	2eef9e050f	radv: ignore pColorBlendState if rasterization is disabled Or if the subpass has no color attachments. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>	2019-12-20 08:21:02 +01:00
Samuel Pitoiset	021c7b5309	radv: tidy up radv_pipeline_init_blend_state() This is needed for the next commit because pColorBlendState can actually be NULL but some fields might have to be initialized (eg. alpha to coverage with no color attachments). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>	2019-12-20 08:20:58 +01:00
Samuel Pitoiset	ebc7a77869	radv: ignore pDepthStencilState if rasterization is disabled Or if the subpass has no depth stencil attachment. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>	2019-12-20 08:20:55 +01:00
Samuel Pitoiset	ce67e41535	radv: ignore pTessellationState if the pipeline doesn't use tess Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>	2019-12-20 08:20:52 +01:00
Samuel Pitoiset	7735f314b7	radv: ignore pMultisampleState if rasterization is disabled Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>	2019-12-20 08:20:49 +01:00
Samuel Pitoiset	589bfcbde3	radv: init a default multisample state for the resolve FS path pMultisampleState must be a valid pointer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3167>	2019-12-20 08:20:44 +01:00
Caio Marcelo de Oliveira Filho	4fbc99c124	spirv: Implement SPV_KHR_non_semantic_info Do nothing for OpExtInst from extended instruction sets that name start with "NonSemantic.". Since they can be used within the "preamble" to annotate global decorations, also don't stop iterating when one of them is found. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3154> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3154>	2019-12-19 22:49:39 -08:00
Jonathan Marek	13adce2845	turnip: disable B8G8R8 vertex formats Looks like swap doesn't work as expected on these, disable them. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3170> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3170>	2019-12-19 19:03:02 -05:00
Jonathan Marek	54f72c83d6	util/format: add missing vulkan formats Add some missing vulkan formats to util/format, this solves all the missing pipe format cases for the formats that turnip supports. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3170>	2019-12-19 19:03:02 -05:00
Jonathan Marek	b9d4c10e4b	turnip: minor warning fixes Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3177> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3177>	2019-12-19 23:21:01 +00:00
Andreas Baierl	d71cd245d7	lima: Rotate dump files after each finished pp frame This rotates the dump files like the mali-syscall-tracker does. After each finished pp frame a new file is generated. They are numbered like lima.dump.0000, lima.dump.0001 ... The filename and path can be given with the new environment variable LIMA_DUMP_FILE. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3175> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3175>	2019-12-19 23:53:22 +01:00
Vasily Khoruzhick	039f3f6adb	lima: drop suballocator Since we're using a separate per-draw BO for GP outputs we don't need suballocator anymore. Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3158> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3158>	2019-12-19 14:28:32 -08:00
Vasily Khoruzhick	9f72d7195a	lima: use single BO for GP outputs Varyings, gl_Position and gl_PointSize are all GP outputs, so we can use a single BO for them all. Also that allows us to get rid of suballocator. Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3158>	2019-12-19 14:28:32 -08:00
Jonathan Marek	06ae0674fd	nir: fix assign_io_var_locations for vertex inputs Also fixes fragment inputs using the wrong "base" value (which was working only because FRAG_RESULT_DATA0 is less than VARYING_SLOT_VAR0) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3108> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3108>	2019-12-19 21:26:52 +00:00
Jonathan Marek	e9a32af3bf	turnip: implement secondary command buffers Uses a new "tu_cs_add_entries" function because tu_cs_emit_call doesn't work inside draw_cs (which is already called by tu_cs_emit_call). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3075> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3075>	2019-12-19 20:42:08 +00:00
Jonathan Marek	85fff42d08	turnip: compute gmem offsets at renderpass creation time This makes it easier to implement secondary command buffers, since we no longer need to know the render area to set the gmem offsets for input attachments and CmdClearAttachments. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3075>	2019-12-19 20:42:08 +00:00
Jonathan Marek	f81c41a812	turnip: emit_compute_driver_params fixes Offset was wrong, it is in vec4 not dwords. There's a hole between DP_NUM_WORK_GROUPS_Z and DP_LOCAL_GROUP_SIZE_X so use the IR3 enums. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>	2019-12-19 15:13:40 -05:00
Jonathan Marek	bb134c5316	turnip: emit base instance vs driver param Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>	2019-12-19 15:13:40 -05:00
Jonathan Marek	a3a70588c0	freedreno/ir3: support load_base_instance Not supported by hardware, uses same mechanism as base vertex. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>	2019-12-19 15:13:40 -05:00
Jonathan Marek	5c17d9b9ca	freedreno/registers: document vertex/instance id offset bits Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>	2019-12-19 15:13:40 -05:00
Neha Bhende	83ad2e5084	st/mesa: release tgsi tokens for shader states Since we are using st_common_variant while creating the variant for the vertex program, we can release tokens created in st_create_vp_variant which are already stored in respective states. This fixes a memory leak found with piglit tests. Fixes `bc99b22a30` ('st/mesa: use a separate VS variant for the draw module') Reviewed-by: Charmaine Lee <charmainel@vmware.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-12-19 14:40:08 -05:00
Juan A. Suarez Romero	7f821289cb	Revert "nir/lower_double_ops: relax lower mod()" This reverts commit `8172b1fa03`. This commit was done taking into account the Vulkan spec, but did not realize it was affecting OpenGL too. Closes: #2252	2019-12-19 20:01:16 +01:00
Kristian H. Kristensen	a4db9a1512	freedreno/a6xx: Set up multisample sysmem MRTs correctly We had an extra factor of num_samples in the stride. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>	2019-12-19 09:56:05 -08:00
Kristian H. Kristensen	e688a16e2b	freedreno/a6xx: Rewrite compressed blits in a helper function Similar to how we handle zs blits. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>	2019-12-19 09:56:05 -08:00
Kristian H. Kristensen	f8c0ea61e4	freedreno/a6xx: Move handle_rgba_blit() up If we move this function up, we don't have to forward declare it. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>	2019-12-19 09:56:05 -08:00
Kristian H. Kristensen	183d482f7f	freedreno/a6xx: Handle srgb blits on the blitter Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>	2019-12-19 09:56:05 -08:00
Kristian H. Kristensen	3a18e5d420	freedreno/a6xx: Use A6XX_SP_2D_SRC_FORMAT_MASK macro Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>	2019-12-19 09:56:05 -08:00
Kristian H. Kristensen	e4c2bb6a93	freedreno/a6xx: RB6_R8G8B8 is actually 32 bit RGBX Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>	2019-12-19 09:56:05 -08:00
Kristian H. Kristensen	8089fb2e62	freedreno/a6xx: Use blitter for resolve blits We have a SAMPLES_AVERAGE bit that does what we need for resolving multisample buffers - let's use it. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>	2019-12-19 09:56:05 -08:00
Kristian H. Kristensen	1d7267fc91	freedreno/a6xx: Add fd_resource_swap() helper Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>	2019-12-19 09:56:05 -08:00
Kristian H. Kristensen	e0ebaa819d	freedreno/a6xx: Pick blitter swap based on resource tiling The linear levels in a tiled resource are stored in the canonical swap, WZYX. We need to pick the swap based on whether or not the resource is tiled, not whether the level in question is tiled. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>	2019-12-19 09:56:05 -08:00
Kristian H. Kristensen	b59222640e	freedreno/a6xx: Program sampler swap based on resource tiling It doesn't matter whether or not the level in question is linear. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>	2019-12-19 09:56:05 -08:00
Kristian H. Kristensen	a2f6c44a1c	freedreno: Add debug flag for forcing linear layouts Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>	2019-12-19 09:56:05 -08:00
Kristian H. Kristensen	d908a2ab18	freedreno/a6xx: Make DEBUG_BLIT_FALLBACK only dump fallbacks Use new macro, DEBUG_BLIT, for dumping all blits. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>	2019-12-19 09:56:05 -08:00
Jonathan Marek	fe4a8df9a8	freedreno/ir3: fix vertex shader sysvals with pre_assign_inputs The first pre_assign_inputs loop doesn't pre-assign sysvals, so skip the second part for sysvals. The sysvals don't need to be pre-assigned since the state for those isn't shared between binning / nonbinning shaders. Fixes assert failures in cases where the sysvals didn't end up in the same registers for binning / nonbinning. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3168> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3168>	2019-12-19 11:31:12 -05:00
Thong Thai	2add63060b	st/va: Convert interlaced NV12 to progressive In vlVaDeriveImage, convert interlaced NV12 buffers to progressive. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1193 Signed-off-by: Thong Thai <thong.thai@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3157> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3157>	2019-12-19 15:49:09 +00:00
Alyssa Rosenzweig	5710250074	pan/midgard: Add uniform/work heuristic Uniform/work registers are partitioned on a shader-by-shader basis as determined by the compiler. We add a simple heuristic here running before scheduling that prioritizes mitigating spilling at all costs. A more sophisticated heuristic should run after scheduling, doing a dry run of the register allocator itself to determine spilling. Fitting this into our current scheduling model is difficult, so while this heuristic does hurt some shaders, overall the results are acceptable: total instructions in shared programs: 50065 -> 38747 (-22.61%) instructions in affected programs: 37187 -> 25869 (-30.44%) helped: 59 HURT: 77 helped stats (abs) min: 1 max: 757 x̄: 198.46 x̃: 151 helped stats (rel) min: 0.48% max: 62.89% x̄: 32.95% x̃: 36.27% HURT stats (abs) min: 1 max: 9 x̄: 5.08 x̃: 6 HURT stats (rel) min: 0.92% max: 14.29% x̄: 6.71% x̃: 4.60% 95% mean confidence interval for instructions value: -111.15 -55.29 95% mean confidence interval for instructions %-change: -14.33% -6.67% Instructions are helped. total bundles in shared programs: 30606 -> 19157 (-37.41%) bundles in affected programs: 23907 -> 12458 (-47.89%) helped: 58 HURT: 74 helped stats (abs) min: 6 max: 757 x̄: 203.09 x̃: 152 helped stats (rel) min: 5.19% max: 77.00% x̄: 49.38% x̃: 53.79% HURT stats (abs) min: 1 max: 9 x̄: 4.46 x̃: 5 HURT stats (rel) min: 1.85% max: 26.32% x̄: 11.70% x̃: 9.57% 95% mean confidence interval for bundles value: -115.46 -58.01 95% mean confidence interval for bundles %-change: -20.87% -9.41% Bundles are helped. total quadwords in shared programs: 31305 -> 32027 (2.31%) quadwords in affected programs: 20471 -> 21193 (3.53%) helped: 0 HURT: 133 HURT stats (abs) min: 1 max: 9 x̄: 5.43 x̃: 5 HURT stats (rel) min: 0.76% max: 15.15% x̄: 5.47% x̃: 4.65% 95% mean confidence interval for quadwords value: 5.00 5.86 95% mean confidence interval for quadwords %-change: 4.85% 6.08% Quadwords are HURT. total registers in shared programs: 2256 -> 2545 (12.81%) registers in affected programs: 708 -> 997 (40.82%) helped: 0 HURT: 95 HURT stats (abs) min: 1 max: 8 x̄: 3.04 x̃: 3 HURT stats (rel) min: 12.50% max: 100.00% x̄: 39.41% x̃: 37.50% 95% mean confidence interval for registers value: 2.64 3.45 95% mean confidence interval for registers %-change: 34.62% 44.19% Registers are HURT. total threads in shared programs: 1776 -> 1709 (-3.77%) threads in affected programs: 134 -> 67 (-50.00%) helped: 0 HURT: 67 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: -1.00 -1.00 95% mean confidence interval for threads %-change: -50.00% -50.00% Threads are HURT. total spills in shared programs: 3868 -> 2 (-99.95%) spills in affected programs: 3868 -> 2 (-99.95%) helped: 60 HURT: 0 total fills in shared programs: 6456 -> 4 (-99.94%) fills in affected programs: 6456 -> 4 (-99.94%) helped: 60 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3150> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3150>	2019-12-19 15:22:39 +00:00
Samuel Pitoiset	13b4e9adcf	ac: declare an enum for the OOB select field on GFX10 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>	2019-12-19 15:15:32 +01:00
Samuel Pitoiset	f3cccd05d9	radv/gfx10: fix the out-of-bounds check for vertex descriptors When stride is 0, it should check against the offset not the index. This fixes black character models with Beat Saber and missing snow with Dragon Quest. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2233 Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1975 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>	2019-12-19 15:15:30 +01:00
Juan A. Suarez Romero	8172b1fa03	nir/lower_double_ops: relax lower mod() Currently when lowering mod() we add an extra instruction so if mod(a,b) == b then 0 is returned instead of b, as mathematically mod(a,b) is in the interval [0, b). But the Vulkan spec has relaxed this restriction, and allows the result to be in the interval [0, b]. This commit takes this into account to remove the extra instruction required to return 0 instead. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2922> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2922>	2019-12-19 12:36:30 +00:00
Erik Faye-Lund	af65bfb38f	zink: implement nir_texop_txd This lets us enable PIPE_CAP_FRAGMENT_SHADER_TEXTURE_LOD, which in turn gives us ARB_shader_texture_lod. Still fails one piglit test on ANV, namely spec@arb_shader_texture_lod@execution@arb_shader_texture_lod-texgradcube, but with 33 new passing tests, I think this is worth it. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3140> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3140>	2019-12-19 13:14:29 +01:00
Erik Faye-Lund	b31d1b73bc	zink: enable PIPE_CAP_MIXED_COLORBUFFER_FORMATS This just works in Vulkan, there's no work needed to enable it. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3148> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3148>	2019-12-19 10:08:13 +01:00
Jonathan Marek	5785bcc8a0	turnip: don't set SP_FS_CTRL_REG0_VARYING if only fragcoord is used Fixes artifacts in the subpasses demo, which has a shader using fragcoord without any varyings. It looks like setting this bit when there are no varyings can cause weirdness in some cases (without this change, if the previous shader had <= 8 varyings it would work, but with 9 varyings it would have artifacts). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3143> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3143>	2019-12-18 19:03:37 -05:00
Jonathan Marek	4a59bc6df2	turnip: add cache invalidate to fix input attachment cases Fixes artifacts in the subpasses demo. Workaround texture cache with input attachments from GMEM by adding a cache invalidate between subpasses. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3143>	2019-12-18 19:03:37 -05:00
Lionel Landwerlin	fc2552b644	loader: fix close on uninitialized file descriptor value Using a drm syscall layer faking a kernel driver : ==581460== Conditional jump or move depends on uninitialised value(s) ==581460== by 0x48A4C2B: close (drm-hooks.cpp:185) ==581460== by 0x5A815F1: dri3_alloc_render_buffer (loader_dri3_helper.c:1469) ==581460== by 0x5A82050: dri3_get_buffer (loader_dri3_helper.c:1827) ==581460== by 0x5A82662: loader_dri3_get_buffers (loader_dri3_helper.c:2028) ==581460== by 0x6C78109: intel_update_image_buffers (brw_context.c:1870) ==581460== by 0x6C77805: intel_update_renderbuffers (brw_context.c:1499) ==581460== by 0x6C7789D: intel_prepare_render (brw_context.c:1520) ==581460== by 0x6C773D4: intelMakeCurrent (brw_context.c:1341) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `069fdd5f9f` ("egl/x11: Support DRI3 v1.1") Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3152> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3152>	2019-12-19 00:51:36 +02:00
Connor Abbott	648cc22afb	freedreno: Fix CP_MEM_TO_REG flag definitions These actually mean something completely different, at least on A5xx and A6xx. The only other usage of the old flags on something older than A6xx was a typo, so I don't know if it was always this way, but at the same time it means that we don't have to worry too much about that. Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>	2019-12-18 23:09:05 +01:00
Connor Abbott	4c5ac156c3	freedreno: Use new macros for CP_WAIT_REG_MEM and CP_WAIT_MEM_GTE Similar to the existing usage for CP_COND_WRITE5, this makes it clear what each of the magic parameters are for. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>	2019-12-18 23:09:00 +01:00
Connor Abbott	cfa1fb895a	a6xx: Add more CP packets And add fields uncovered by looking at the firmware. I think this covers all the memory, register, and scratch manipulation opcodes that exist on A6xx, plus one additional nice find for Vulkan and describing a previously unknown opcode and documenting CP_WAIT_REG_MEM. Note that the bits for the CP_REG_TO_MEM count, as well as the formula for computing the actual count for both CP_REG_TO_MEM and CP_MEM_TO_REG, are changed because the A630 SQE firmware actually does something different. I haven't investigated older microcodes to see whether this extends back to A5xx and A4xx, but the only non-A6xx uses of this field result in the same bit-pattern when using the A6xx bit range and formula, so it should be safe to change the definition universally. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>	2019-12-18 23:08:55 +01:00
Bas Nieuwenhuizen	a9a3108be7	radv: Limit workgroup size to 1024. Fixes a hang with geekbench. The existence of RX 580 and NAVI10 results shows that the generations before and after this do not have the issue. (They show up on the website). So this is likely a GFX9 only issue. This is not something weird like LDS size since none of the shaders seem to use LDS. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3145> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3145>	2019-12-18 20:41:18 +00:00
Dylan Baker	69decdb28a	docs: Add release notes, news, and update calendar for 19.2.8	2019-12-18 11:25:32 -08:00
Dylan Baker	7017f69a64	docs/relnotes/19.2.8: Add SHA256 sum	2019-12-18 11:24:46 -08:00
Dylan Baker	2f724d2202	docs: add relnotes for 19.2.8	2019-12-18 11:24:44 -08:00
Dylan Baker	d32e1257c0	docs: Add release notes, update calendar, and add news for 19.3.1	2019-12-18 10:58:54 -08:00
Dylan Baker	636175da6d	dcos: add releanse notes for 19.3.1	2019-12-18 10:57:54 -08:00
Lionel Landwerlin	afdc0121b5	i965/iris/perf: factor out frequency register capture Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113>	2019-12-18 14:23:17 +02:00
Jonathan Marek	072e95e07a	freedreno/ir3: update prefetch input_offset when packing inlocs If the input location changes then prefetch input_offset needs to change. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3141> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3141>	2019-12-17 16:41:13 -05:00
Eric Anholt	62998f6e2d	ci: Fix caselist results archiving after parallel-deqp-runner rename. Noticed while reviewing some lava parallel-deqp-runner changes. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3138> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3138>	2019-12-17 20:13:10 +00:00
Kristian H. Kristensen	9aaa23fbad	freedreno/a6xx: Document the CP_SET_DRAW_STATE enable bits There are bits for binning, gmem and sysmem. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3131> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3131>	2019-12-17 11:45:20 -08:00
Caio Marcelo de Oliveira Filho	c61ad77cd2	anv/gen12: Temporarily disable VK_KHR_buffer_device_address (and EXT) For the sake of our testing infrastructure, disable this extension for TGL until we can sort out a hang in Vulkan CTS. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-17 11:07:41 -08:00
Caio Marcelo de Oliveira Filho	766fdeccf9	intel/vec4: Fix lowering of multiplication by 16-bit constant Existing code was ignoring whether the type of the immediate source was signed or not. If the source was signed, it would ignore small negative values but it also would wrongly accept values between INT16_MAX and UINT16_MAX, causing the actual value to later be reinterpreted as a negative number (under 16-bits). Fixes tests/shaders/glsl-mul-const.shader_test in Piglit for older platforms that don't support MUL with 32x32 types and use vec4. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-17 10:45:22 -08:00
Caio Marcelo de Oliveira Filho	2137be22fa	intel/fs: Fix lowering of dword multiplication by 16-bit constant Existing code was ignoring whether the type of the immediate source was signed or not. If the source was signed, it would ignore small negative values but it also would wrongly accept values between INT16_MAX and UINT16_MAX, causing the actual value to later be reinterpreted as a negative number (under 16-bits). Fixes tests/shaders/glsl-mul-const.shader_test in Piglit for platforms that don't support MUL with 32x32 types, including ICL and TGL. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2186 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-17 10:45:22 -08:00
Alyssa Rosenzweig	66013cb1be	pan/midgard: Set Z to shadow comparator for 2D We still need to generalize for other types of (non-2D / array) shadow samplers, but this is enough for sampler2DShadow to work with initial dEQP tests passing. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>	2019-12-17 17:42:57 +00:00
Alyssa Rosenzweig	1a53bed41c	pan/midgard: Set .shadow for shadow samplers Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>	2019-12-17 17:42:57 +00:00
Alyssa Rosenzweig	d183f84585	pan/midgard: Hoist temporary coordinate for cubemaps We'll reuse some of this code for shadow samplers, which are represented by a distinct source in NIR. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>	2019-12-17 17:42:57 +00:00
Alyssa Rosenzweig	96df5f1fbf	pan/midgard: Use a reg temporary for mutiple writes Bug in texelfetch implementation from inspection. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>	2019-12-17 17:42:57 +00:00
Alyssa Rosenzweig	bf5d8cfd28	panfrost: Handle empty shaders I didn't realize this was in spec, but it fixes a crash in shaderdb. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>	2019-12-17 17:42:57 +00:00
Alyssa Rosenzweig	35418f6770	panfrost: Let precompile imply shaderdb This cuts down the number of random environmental variables we need flying around; now PAN_MESA_DEBUG=precompile is sufficient and MIDGARD_MESA_DEBUG=shaderdb will be implied. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>	2019-12-17 17:42:57 +00:00
Alyssa Rosenzweig	271726eaca	panfrost: Add PAN_MESA_DEBUG=precompile for shader-db We would like to use run.c for shader-db runs (rather than capturing in real-time, which is limiting). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>	2019-12-17 17:42:57 +00:00
Lionel Landwerlin	2c8742ed85	mesa: avoid triggering assert in implementation When tearing down a GL context with an active performance query, the implementation can be confused by a query marked active when it's being deleted. This shouldn't happen in the implementation because the context will already be idle. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2235 Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3115> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3115>	2019-12-17 12:52:04 +00:00
Samuel Pitoiset	d399f4f414	radv/gfx10: fix ngg_get_ordered_id Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3133> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3133>	2019-12-17 12:34:18 +00:00
Neil Armstrong	089c8f0b8d	ci: Remove T820 from CI temporarily Our lab will have continuous programmed power cuts until the 6th January 2020, so it's safer to disable the T820 CI running on the BayLibre kernelCI lab to avoid breaking CI. Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3135> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3135>	2019-12-17 11:40:07 +01:00
Tapani Pälli	75caae2268	i965: expose MESA_FORMAT_B8G8R8X8_SRGB visual Patch adds BGRX sRGB visuals, required format translation information to the __DRI_IMAGE_FOURCC_SXRGB8888 format and makes all BGRX visuals sRGB capable just like is done with BGRA. squashed patches from Yevhenii Kolesnikov: dri: Add __DRI_IMAGE_FOURCC_SXRGB8888 conversion i965: force visuals without alpha bits to use sRGB Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1501 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3077> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3077>	2019-12-17 09:28:25 +00:00
Tapani Pälli	8b6b5ce669	dri: add __DRI_IMAGE_FORMAT_SXRGB8 Add format definition and required plumbing to create images. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3077>	2019-12-17 09:28:25 +00:00
Gert Wollny	cffa7bb990	virgl: Increase the shader transfer buffer by doubling the size With only linearly increasing the size of the shader transfer buffer the transfer of very large shaders may fail, so with each attempt double the size of the buffer. CTS: dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.48 for VTK-GL-CTS b5dcfb9c5 and newer virglrenderer bug: https://gitlab.freedesktop.org/virgl/virglrenderer/issues/150 Fixes: `a8987b88ff` virgl: add driver for virtio-gpu 3D (v2) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3121> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3121>	2019-12-17 08:07:51 +00:00
Eric Anholt	2da68c8649	turnip: Fix support for immutable samplers. We were setting up the hardware sampler state when updating a combined image sampler, but never looking at the immutable sampler in the separate case. Fixes failures in dEQP-VK.binding_model.shader_access.primary_cmd_buf.sampler_immutable.fragment.* Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3127> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3127>	2019-12-16 19:51:27 -08:00
Jonathan Marek	edfc4daab8	turnip: don't set LRZ enable at end of renderpass Fixes hanging with cases that use more than one renderpass. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3122> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3122>	2019-12-17 00:59:00 +00:00
Jonathan Marek	c7c5a84cf3	freedreno/ir3: lower pack/unpack ops Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3106> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3106>	2019-12-16 19:20:07 -05:00
Jonathan Marek	004797002f	nir: add option to lower half packing opcodes Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3106>	2019-12-16 19:20:07 -05:00
Eric Anholt	2d3182b429	turnip: Add support for descriptor arrays. I had a bigger rework I was working on, but this is simple and gets tests passing. Fixes 36 failures in dEQP-VK.binding_model.shader_access.primary_cmd_buf.sampler_mutable.fragment.* (now all passing) Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3124> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3124>	2019-12-16 23:57:22 +00:00
Eric Anholt	02d764b96a	turnip: Drop unused variable. We really need -Werror in CI. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3124>	2019-12-16 23:57:22 +00:00
Alyssa Rosenzweig	0eb84eb702	panfrost: Don't double-create scratchpad Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Fixes: `4f7fddbd71` ("panfrost: Pass size to panfrost_batch_get_scratchpad") Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3119> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3119>	2019-12-16 23:32:07 +00:00
Alyssa Rosenzweig	73bd9fe20c	panfrost: Simplify sampler upload condition Makes it more obvious what's going on. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3119>	2019-12-16 23:32:06 +00:00
Icecream95	37bc028367	gallium/auxiliary: Handle count == 0 in u_vbuf_get_minmax_index_mapped This makes u_vbuf_get_minmax_index_mapped return min = 0 / max = 0 when info->count == 0. That should never happen anyway, but this commit makes it at least return a sane value that callers expect, and also allows us - and GCC - to assume count != 0 for optimization purposes. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3050> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3050>	2019-12-16 22:57:35 +00:00
Icecream95	80aca96803	gallium/auxiliary: Reduce conversions in u_vbuf_get_minmax_index_mapped With this patch, GCC generates vectorized code that does the comparisons without converting the indices to 32-bit first. This optimization makes the aforementioned function almost twice as fast for ARM NEON, and should speed up vectorised code on other platforms. Without vectorisation, the function is still a percent or two faster, but slightly larger. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3050>	2019-12-16 22:57:35 +00:00
Marek Olšák	69ea473eeb	amd/addrlib: update to the latest version Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-16 17:04:57 -05:00
Jonathan Marek	a3ea4805aa	turnip: remove duplicate A6XX_SP_CS_CONFIG_NIBO Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>	2019-12-16 21:04:42 +00:00
Jonathan Marek	2d3492bc62	turnip: change emit_ibo to be like emit_textures Adds missing alignment and error checking. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>	2019-12-16 21:04:42 +00:00
Jonathan Marek	718bd4f8b4	turnip: fix emit_ibo Based on the GL driver: -Compute needs different opcode (this fixes a GPU hang problem) -REG_A6XX_SP_IBO_LO/REG_A6XX_SP_CS_IBO_LO were swapped Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>	2019-12-16 21:04:42 +00:00
Jonathan Marek	65007d438c	turnip: remove compute emit_border_color Current tu6_emit_border_color doesn't work for compute and there's no example from the GL driver to base it on, so replace it with a finishme. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>	2019-12-16 21:04:42 +00:00
Jonathan Marek	c9b12c71d7	turnip: fix emit_textures for compute shaders Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3104>	2019-12-16 21:04:42 +00:00
Rafael Antognolli	ed43d01dec	utils/os_socket: Define ssize_t on windows. Fixes: `ef5266ebd5` ("util/os_socket: Add socket related functions.") Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-16 20:35:22 +00:00
Marek Olšák	43f05e0421	radeonsi/gfx10: fix ngg_get_ordered_id This could have caused issues with NGG streamout. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>	2019-12-16 20:06:07 +00:00
Marek Olšák	8edf3df3e4	radeonsi: reset more fields in si_llvm_context_set_ir to fix reusing ctx Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>	2019-12-16 20:06:07 +00:00
Marek Olšák	1436c261e9	radeonsi: fix determining whether the VS prolog is needed Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>	2019-12-16 20:06:07 +00:00
Marek Olšák	378444ce90	radeonsi: allow generating VS prologs with 0 inputs If "ls_vgpr_fix" is set, we use a prolog, but it can have 0 inputs. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>	2019-12-16 20:06:07 +00:00
Marek Olšák	4846aeaf57	radeonsi/gfx10: don't insert NGG streamout atomics if they are never used Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>	2019-12-16 20:06:07 +00:00
Marek Olšák	de4a4595f6	radeonsi: don't wrap the VS prolog in if (ES thread) .. endif We can execute it unconditionally and the values computed for disabled threads won't be used anyway. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>	2019-12-16 20:06:07 +00:00
Marek Olšák	db67e51903	radeonsi: set is_monolithic for VS prologs when the shader is really monolithic This fixes a bug with NGG that is probably harmless. Basically, !is_monolithic makes the VS prolog emit llvm.amdgcn.init.exec.from.input, which sets the EXEC mask to only enable ES threads. In the NGG non-GS case, the GS threads <= ES threads, so it was never an issue. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>	2019-12-16 20:06:07 +00:00
Marek Olšák	451bc91158	radeonsi: disallow compute-based culling if polygon mode is enabled Polygon mode can generate thick points or lines. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>	2019-12-16 20:06:07 +00:00
Marek Olšák	1a07df840e	radeonsi: deduplicate ES and GS thread enablement code Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>	2019-12-16 20:06:07 +00:00
Marek Olšák	f90cbd18ff	ac: fix the return value in cull_bbox when bbox culling is disabled Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>	2019-12-16 20:06:07 +00:00
Marek Olšák	e5e3ffa6b9	ac: fix ac_get_i1_sgpr_mask for Wave32 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>	2019-12-16 20:06:07 +00:00
Alyssa Rosenzweig	5386b7e011	panfrost: Remove asserts in panfrost_pack_work_groups_compute It's a hot routine and these are exceedingly unlikely to break. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3067> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3067>	2019-12-16 19:48:28 +00:00
Alyssa Rosenzweig	6378797a6d	panfrost: Pack invocation_shifts manually instead of a bit field gcc generates exceptionally bad code for panfrost_pack_work_groups_fused otherwise ... although that routine is somehow still hot ... Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3067>	2019-12-16 19:48:28 +00:00
Iván Briano	a649bbffee	anv: Export VK_KHR_buffer_device_address only when really supported Fixes: `1b6991ba1d` ("anv: Implement VK_KHR_buffer_device_address") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>	2019-12-16 19:24:46 +00:00
Iván Briano	0fd93b9589	anv: Export filter_minmax support only when it's really supported Fixes: `bea4d4c78c` ("anv: add VK_EXT_sampler_filter_minmax support") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3071>	2019-12-16 19:24:46 +00:00
Jonathan Marek	b936143327	freedreno/ir3: lower mul_2x32_64 lower_mul_2x32_64 generates mul_high opcodes, and lower_mul_high is done by nir_lower_alu, so call nir_lower_alu after nir_opt_algebraic. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-16 13:37:09 -05:00
Jonathan Marek	d4676d7a16	turnip: implement CmdFillBuffer/CmdUpdateBuffer Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-16 13:13:53 -05:00
Jonathan Marek	8d893a2071	turnip: don't require src image to be set for clear blits Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-16 13:13:53 -05:00
Jonathan Marek	f78c4251f1	turnip: use common blit path for buffer copy Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-16 13:13:53 -05:00
Jonathan Marek	d6c8aa2b72	turnip: use single substream cs Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-16 13:13:53 -05:00
Alyssa Rosenzweig	8959364937	panfrost: Remove fbd_type enum Just use the MALI_MFBD tag directly; it's clean. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118>	2019-12-16 12:51:03 -05:00
Alyssa Rosenzweig	5408700a12	ci: Reinstate Panfrost CI Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118>	2019-12-16 12:51:03 -05:00
Alyssa Rosenzweig	caf55e7bfd	panfrost: Fix FBD issue Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Fixes: `b0e915b4e6` ("panfrost: Emit SFBD/MFBD after a batch, instead of before") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3118>	2019-12-16 12:50:26 -05:00
Lionel Landwerlin	bc36160ccb	vulkan/wsi: error out when image fence doesn't signal If for some reason the fence associated with an image doesn't signal, we're likely in a device lost scenario, we should report that error. We can't really wait for a given amount of time because we could get a timeout and that is not a valid error to report for vkQueuePresentKHR, so just wait forever. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/830 Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-12-16 14:59:10 +02:00
Lionel Landwerlin	c056193288	anv: drop unused parameter from apply layout pass Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-16 14:35:25 +02:00
Lionel Landwerlin	7c223cf316	anv: constify pipeline layout in nir passes Was hoping to find potential issues but nothing. Still probably a good idea. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-16 14:35:22 +02:00
Alyssa Rosenzweig	e7721d8775	pan/midgard: Set r1.w magic I'm honestly unsure what this is for, but it's needed on MFBD systems for unknown reasons, at least when MRT is actually in use and then sometimes without MRT (it fixes a blend shader issue on T760?) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Visoso <tomeu.vizoso@collabora.com>	2019-12-16 09:10:33 +00:00
Alyssa Rosenzweig	3448b2641a	pan/midgard: Fix liveness analysis with multiple epilogues Epilogues are special fixed-function blocks, so they need special handling for liveness analysis to work completely. This in turn fixes RA issues for many shaders using MRT. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Visoso <tomeu.vizoso@collabora.com>	2019-12-16 09:10:33 +00:00
Alyssa Rosenzweig	60396340f5	pan/midgard: Writeout per render target The flow is considerably more complicated. Instead of one writeout loop like usual, we have a separate write loop for each render target. This requires some scheduling shenanigans to get right. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Visoso <tomeu.vizoso@collabora.com>	2019-12-16 09:10:33 +00:00
Alyssa Rosenzweig	281cc6f9a6	pan/midgard: Add schedule barrier after fragment writeout This is a branch, like discard, so we need a barrier to make it safe. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Visoso <tomeu.vizoso@collabora.com>	2019-12-16 09:10:33 +00:00
Alyssa Rosenzweig	a2d5503b68	panfrost: Pass blend RT number through We have to key the blend shader for the render target number due to writeout silliness. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Visoso <tomeu.vizoso@collabora.com>	2019-12-16 09:10:33 +00:00
Pierre-Eric Pelloux-Prayer	2c1983aefe	gallium: refuse to create buffers larger than UINT32_MAX pipe_resource.width0 is 32 bits and hardware support for bigger buffer is limited (eg: AMD hardware doesn't support buffer shader resources bigger than 4GB). Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2053 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2948> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2948>	2019-12-16 09:30:14 +01:00
Pierre-Eric Pelloux-Prayer	0e286f6cbf	radeonsi: disable dcc for 2x MSAA surface and bpe < 4 This fixes a series of dEQP tests on Raven platforms: - dEQP-GLES3.functional.fbo.msaa.2_samples.rgba4 - dEQP-GLES3.functional.fbo.msaa.2_samples.rgb5_a1 - dEQP-GLES3.functional.fbo.msaa.2_samples.rgb565 - dEQP-GLES3.functional.fbo.msaa.2_samples.rg8 - dEQP-GLES3.functional.fbo.msaa.2_samples.r16f Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3090> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3090>	2019-12-16 08:08:08 +00:00
Iago Toral Quiroga	4202cf8bf1	v3d: expose OES_geometry_shader Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	ba7bc83dd5	v3d: support precompiling geometry shaders At present, this is only relevant for shader-db. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	7cee56b1df	v3d: disable lowering of indirect inputs V3D can do indirect inputs so we don't need it. Also, the lowering produces horrible if-ladder code that is particularly bad for geometry shaders where inputs are always arrays and shader bodies usually have a loop indexing into them. This fixes a couple of geometry shader tests in CTS that would fail to register allocate otherwise. There are no changes in shader-db. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	a1b7c0844d	v3d: fix primitive queries for geometry shaders With geometry shaders the number of emitted primitives is decided at run time, so we cannot precompute it in the CPU and we need to use the PRIMITIVE_COUNTS_FEEDBACK commands to have the GPU provide the number like we do for the number of primitives written to transform feedback. This may have a performance impact though, since it requires a sync wait for the draw to complete, so we only do it when geometry shaders are present. v2: remove '> 0' comparison for pointer type (Alejandro) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	6c7a2b69f8	v3d: handle writes to gl_Layer from geometry shaders When geometry shaders write a value to gl_Layer that doesn't correspond to an existing layer in the target framebuffer the rendering behavior is undefined according to the spec, however, there are CTS tests that trigger this scenario on purpose, probably to ensure that nothing terrible happens. For V3D, this situation is problematic because the binner uses the layer index to select the offset to write into the tile state data, and we only allocate tile state for MAX2(num_layers, 1), so we want to make sure we don't produce values that would lead to out of bounds writes. The simulator has an assert to catch this, although we haven't observed issues in actual hardware it is probably best to play safe. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	45bc61add0	v3d: move layer rendering to a separate helper This helps with reducing nesting level after adding the loop to handle layered rendering. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	74a59fdc6e	v3d: support rendering to multi-layered framebuffers When doing layered rendering the binning stage will prepare per-tile lists for each layer in the framebuffer, so we need to make sure we allocate enough space for them. We also need to emit the NUMBER_OF_LAYERS packet. This is required even when the number of layers is only 1, otherwise the simulator detects buffer overflows in the tile_state BO during some CTS test cases involving layered FBOs. When rendering, we need to emit commands for each layer of the framebuffer separately and make sure we address the correct layers for each one. v2: fixed typo in comment (Alejandro) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	a0c94c70ee	v3d: do not limit new CL space allocations with branch to 4096 bytes For layered rendering we need to emit per-layer rendering command lists, so we can end up requiring a fairly large buffer for this if the number of layers is large enough. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	56ba6f42e2	v3d: remove obsolete assertion OES_geometry_shader introduced the concept of layered framebuffers. Removing this assertion gets a bunch of CTS tests to pass. We will also need layered images to implement layered rendering with geometry shaders. v2: fix typo in commit message (Alejandro) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	e054fe0167	v3d: support transform feedback with geometry shaders Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	e54cf64939	v3d: save geometry shader state for blitting Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	a6b318ef52	v3d: predicate geometry shader outputs inside non-uniform control flow Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	b636d4ebc7	v3d: don't try to render if shaders failed to compile This is the same we do in the compute path to avoid crashes at draw time. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	e2f2263433	v3d: add support for adjacency primitives v2: remove obsolete comment (Alejandro) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	a07d70c54b	v3d: we always have at least one output segment If we program an output size of 0 the simulator asserts. This was not a problem until now because our VS would always have to emit fixed function outputs, however, now that it can be paired with a GS we can end up with a VS shader that no longer emits any outputs. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	76fc8c8bb1	v3d: compute appropriate VPM memory configuration for geometry shader workloads Geometry shaders can output many vertices and thus have higher VPM memory pressure as a result. It is possible that too wide geometry shader dispatches exceed the maximum available VPM output allocated, in which case we need to reduce the dispatch width until we can fit the VPM memory requirements. Supported dispatch widths for geometry shaders are 16, 8, 4, 1. There is a limit in the number of VPM output sectors that can be used by a geometry shader that we can meet by lowering the dispatch width at compile time, however, at draw time we need to revisit this number and, together with other elements that can contribute to total VPM memory requirements, decide on a configuration that can fit the program into the available VPM memory. Ideally, we also want to aim for not using more than half of the available memory so that we can run a pair of bin and render programs in parallel. v2: fixed language in comment and typo in commit log. (Alejandro) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	76f4c83815	v3d: add 1-way SIMD packing definition According to the documentation, the 1-way dispatch width is only supported with geometry shaders. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	4f5fbd6490	v3d: implement geometry shader instancing v2: - Remove unused field uses_iid from v3d_gs_prog_data (Alejandro) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	8a81ac2eed	v3d: emit geometry shader state commands This is good enough to get basic GS workloads working, later patches will improve this by adding instancing support, proper SIMD configuration, etc. Notice that most of the TESSELLATION_GEOMETRY_SHADER_PARAMS fields are only relevant when tessellation shaders are present. We do not support tessellation yet, but we still need to fill in these tessellation state with default values since our packing functions require some of these to have non-zero values. v2: - Add a comment in the code explaining why we fill in tessellation fields (Alejandro) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	0934bd4460	v3d: fix packet descriptions for geometry and tessellation shaders Every code address starts at bit 3 (addresses must be 64-bit aligned), with the first 3 bits used to specify threading and NaN propagation parameters for the shader program. We generally skip "reserved" bits, however, doing this when the reserved field is the last in a struct and it is large enough can make us compute incorrect (smaller) struct sizes which can lead to corrupt CLs. In particular, the "Tess/Geom Common Params" struct has a reserved field at the end that is 8-bit, so if we don't include this we compute a packet size that is 1 byte smaller than it should, making the next packet we emit start 1 byte earlier and therefore leading to incorrect CL data from that point forward. The name of one of the fields was not correct. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	5d578c27ce	v3d: add initial compiler plumbing for geometry shaders Most of the relevant work happens in the v3d_nir_lower_io. Since geometry shaders can write any number of output vertices, this pass injects a few variables into the shader code to keep track of things like the number of vertices emitted or the offsets into the VPM of the current vertex output, etc. This is also where we handle EmitVertex() and EmitPrimitive() intrinsics. The geometry shader VPM output layout has a specific structure with a 32-bit general header, then another 32-bit header slot for each output vertex, and finally the actual vertex data. When vertex shaders are paired with geometry shaders we also need to consider the following: - Only geometry shaders emit fixed function outputs. - The coordinate shader used for the vertex stage during binning must not drop varyings other than those used by transform feedback, since these may be read by the binning GS. v2: - Use MAX3 instead of a chain of MAX2 (Alejandro). - Make all loop variables unsigned in ntq_setup_gs_inputs (Alejandro) - Update comment in IO lowering so it includes the GS stage (Alejandro) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	f63750accf	v3d: remove unused variable Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	52cbef0039	v3d: enable debug options for geometry shader dumps Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	d6b0786a38	v3d: add debug assert While lowering vpm outputs we look for the NIR variables matching particular store output instructions and we expect to find a match, so assert on that. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Iago Toral Quiroga	6e68f74395	v3d: add missing plumbing for VPM load instructions We will need to use LDVPMG_IN specifically to read VPM inputs in geometry shaders. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-16 08:42:37 +01:00
Eric Anholt	f58ef5d481	turnip: Lower usub_borrow. Fixes dEQP-VK.glsl.builtin.function.integer.usubborrow.uvec2_mediump_fragment. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2986> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2986>	2019-12-16 04:52:09 +00:00
Caio Marcelo de Oliveira Filho	c06ba83589	intel/fs: Lower 64-bit MOVs after lower_load_payload() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3070> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3070>	2019-12-14 21:12:21 +00:00
Bas Nieuwenhuizen	b53856aca3	amd/common: Always use addrlib for HTILE tc-compat. Even without depth+stencil addrlib can (correctly!) decide to disable tc compatible HTILE. One example is 8x sampling with 32-bit depth on Stoney. The row size on Stoney is 1024, while the tile size is 2048, which results in tile splits which are not supported with tc-compat. On Stoney, this fixes dEQP-VK.glsl.builtin_var.fragdepth.*_list_d32_sfloat_multisample_8 CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>	2019-12-14 20:39:29 +00:00
Bas Nieuwenhuizen	e197fb1c2f	amd/common: Fix tcCompatible degradation on Stoney. addrlib sometimes returns smaller sizes for tcCompat as it does not seem to take into account the depth+stencil matching config gymnastics with tcCompat. This fixes dEQP-VK.pipeline.render_to_image.core.2d_array.huge.height.r8g8b8a8_unorm_d32_sfloat_s8_uint CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>	2019-12-14 20:39:29 +00:00
Denis Pauk	6bf14e9c47	docs/features: mark GL_ARB_texture_compression_bptc as done for llvmpipe, softpipe, swr Signed-off-by: Denis Pauk <pauk.denis@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com> CC: Marek Olšák <maraeo@gmail.com> CC: Rhys Perry <pendingchaos02@gmail.com> CC: Bruce Cherniak <bruce.cherniak@intel.com> CC: Matt Turner <mattst88@gmail.com>	2019-12-14 20:02:10 +00:00
Denis Pauk	3acc15f4f0	gallium/swr: Enable support bptc format. Reuse Code from: `f69bc797e1` gallium/auxiliary: Add helper support for bptc format compress/decompress Signed-off-by: Denis Pauk <pauk.denis@gmail.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com> CC: Marek Olšák <maraeo@gmail.com> CC: Tim Rowley <timothy.o.rowley@intel.com>	2019-12-14 20:02:10 +00:00
Rob Clark	1bf3837395	freedreno/a6xx: fix OUT_REG() vs growable cmdstream BEGIN_RING() could decide we can't fit the next packet in the current cmdstream segment, and grow a new segment. So we need to grab ring->cur after BEGIN_RING(), otherwise we are writing cmdstream past the end of the previous segment. Fixes: `bdd98b892f` ("freedreno: New struct packing macros") Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-14 09:12:39 -08:00
Erico Nunes	ce52b49348	lima: split draw calls on 64k vertices The Mali400 only supports draws with up to 64k vertices per command. To handle this, break the draw_vbo call into multiple commands. Indexed drawing is left to a separate code path. This implementation was ported from vc4_draw_vbo. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>	2019-12-14 07:44:43 +01:00
Erico Nunes	6d46d0e82b	vc4: move the draw splitting routine to shared code This can also be useful for other hardware which has similar limitations on vertex count per single draw. The Mali400 has a similar limitation and can reuse this. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>	2019-12-14 07:44:43 +01:00
Erico Nunes	2d7be5f01f	lima: refactor indexed draw indices upload As of this commit this is just a refactor in preparation to enable support for more than 64k vertices. To support splitting the draw_vbo call, indices shouldn't be re-uploaded every time. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>	2019-12-14 07:44:43 +01:00
Erico Nunes	270c282a43	lima: allocate separate bo to store varyings The current strategy using the suballocator with fixed size doesn't scale and causes some programs with large number of vertices (like some glmark2 scenes) to crash. Change it to dynamically allocate a separate bo to accommodate an arbitrary number of vertices. This also fixes the buffer read/write flags for gp. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>	2019-12-14 07:44:43 +01:00
Erico Nunes	8bf2b5db78	gallium/util: add alignment parameter to util_upload_index_buffer At least on Mali Utgard, index buffers need to be aligned on 0x40. To avoid duplicating this, add an alignment parameter. Keep the previous default for the other existing users. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>	2019-12-14 07:44:43 +01:00
Kenneth Graunke	9fb45c5bbd	drirc: Final Fantasy VIII: Remastered needs allow_higher_compat_version This gets it running on i965 with Mesa master. (The game won't start without GL 3.3 compatibility, but uses 1.20 with GL_EXT_gpu_shader4 for shaders.) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3076> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3076>	2019-12-13 17:58:42 -08:00
Timothy Arceri	7564c5fc6d	st/glsl_to_nir: fix SSO validation regression Fixes: b77907edb554 ("st/glsl_to_nir: use nir based program resource list builder") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2216 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-13 23:09:57 +00:00
Alyssa Rosenzweig	46f0b9ecc5	ci: Remove T760/T860 from CI temporarily I feel really bad about this but this one test is flaking. I don't want to do a mass revert (and bisection is extremely difficult with nondeterministic/Heisenbugs), but it's Friday night and master needs to pass. This commit should be reverted asap (once the flake is solved) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 22:52:39 +00:00
Rafael Antognolli	59de5d9b6a	iris: Implement WA for push constants. v2: Apply WA to gen11+ instead of gen12+ (Jordan). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-12-13 14:15:04 -08:00
Andreas Baierl	8adeeaa7f2	lima/parser: Add texture descriptor parser Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>	2019-12-13 22:02:03 +00:00
Andreas Baierl	5456916309	lima/parser: Add RSW parsing Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>	2019-12-13 22:02:03 +00:00
Andreas Baierl	31ed081ca3	lima/parser: Some fixes and cleanups Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>	2019-12-13 22:02:03 +00:00
Rafael Antognolli	6a3b8811ea	vulkan/overlay: Update docs. Add mention to overlay control socket. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-13 20:53:44 +00:00
Rafael Antognolli	56ccea58ae	vulkan/overlay: Add basic overlay control script. This can be used to start/stop statistics capturing from the command line. v3: - Install script (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-13 20:53:44 +00:00
Rafael Antognolli	a94fa1da93	vulkan/overlay: Add a command to start capturing data to a file. By default, if an output_file is specified, the overlay layer will start capturing data immediately. After this commit, when a control socket is used, the capture starts disabled by default, and is only enabled when a command ":capture=1;" is received. When the capture is enabled, we might have already accumulated some stats. To avoid capturing such noise, we discard and reset the fps and stats, updating the display and capturing only data from that point on. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-13 20:53:44 +00:00
Rafael Antognolli	606dff1b73	vulkan/overlay: Add support for a control socket. Add support for socket from which the overlay layer can receive commands. This control socket can be useful to allow setting options once the application is already running. For instance, triggering the capture of fps data at a certain point. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-13 20:53:44 +00:00
Rafael Antognolli	e87d7fea8a	vulkan/overlay: Add a control socket. v2: Use a socket instead of named pipe. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-13 20:53:44 +00:00
Rafael Antognolli	ef5266ebd5	util/os_socket: Add socket related functions. v3: - Add os_socket.c/h into Makefile.sources (Lionel) - Add empty non-linux implementation to public functions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-13 20:53:44 +00:00
Eric Engestrom	c327245257	anv: drop unused #include Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-13 20:42:40 +00:00
Eric Engestrom	1a837e803b	util/simple_mtx: don't set the canary when it can't be checked Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-13 20:20:21 +00:00
Eric Engestrom	d600b19640	intel/compiler: replace `0` pointer with `NULL` Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-13 20:16:20 +00:00
Eric Engestrom	8074f68b3b	intel/compiler: add ASSERTED annotation to avoid "unused variable" warning Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-13 20:16:20 +00:00
Kenneth Graunke	91efae4f80	iris: Alphabetize source files after iris_perf.c was added	2019-12-13 11:03:13 -08:00
Rob Clark	3b8feefd9c	freedreno/ir3: add iterator macros So many open coded list iterators were getting annoying. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-13 09:25:40 -08:00
Rob Clark	ad92aa36ac	freedreno/ir3: add scheduler traces Add some infrastructure to trace scheduler decisions. The next patch will add some more traces, just splitting this out to reduce clutter. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-13 09:25:40 -08:00
Rob Clark	dd34ccb2c5	freedreno/ir3: add last-baryf shaderdb stat Sometimes sched changes that are a win in terms of instruction count and/or register pressure, are worse in real life, due to keeping varying storage locked for too long. Add a shader-db stat to give this more visibility. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-13 09:25:40 -08:00
Alejandro Piñeiro	2865d79a33	nir/opt_peephole_select: remove unused variables To avoid "unused variable" warnings. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-12-13 17:14:58 +01:00
Alyssa Rosenzweig	7c972eba40	panfrost: Report GPU name in es2_info We can prettify the ID. Closes #2093 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	09a2c74cfd	panfrost: Add panfrost_model_name helper This gives us a string representation of a GPU ID. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	a215289176	panfrost: Move property queries to _encoder We'll want these in non-Gallium devices. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	102789886c	panfrost: Move nir_undef_to_zero to Midgard compiler Nothing Gallium about it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	ddbbb2db48	pandecode: Add cast Fixes minor coverity warning about the format specifier. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	4f7fddbd71	panfrost: Pass size to panfrost_batch_get_scratchpad We'll compute the size with the new scratchpad helpers. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	bc887e8281	panfrost: Calculate maximum stack_size per batch We'll need this so we can allocate a stack for the batch large enough for all the jobs within it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	a337bf319c	pan/midgard: Handle misc. cppcheck warnings Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	f204791cd6	pan/midgard: Remove unused ld/st packing helpers Identified by cppcheck. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	709d8c29cd	panfrost: Handle minor cppcheck issues Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	b0e915b4e6	panfrost: Emit SFBD/MFBD after a batch, instead of before The size of the scratchpad (as well as some tiler details) depends on the contents of the batch, so we need to defer filling out the FBD until after all draws are queued. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig	7597015b85	panfrost: Route stack_size from compiler We'll need it in pan_context.c Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 10:26:35 -05:00
Jonathan Marek	440cd835de	etnaviv: add missing vs_needs_z_div handling to NIR backend Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-12-13 09:31:40 -05:00
Jonathan Marek	64c7cdcae5	etnaviv: add missing formats Add missing texture/render formats supported by hardware. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-12-13 09:10:29 -05:00
Jonathan Marek	d30499a3c8	etnaviv: remove swizzle from format table The only format that needs swizzle is R8 emulated with L8, so we can get rid of the SWIZ(X, Y, Z, W) everywhere. Note: R8G8 also had a swizzle, but it wasn't necessary. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-12-13 09:10:28 -05:00
Jonathan Marek	017cbab5b0	etnaviv: disable integer vertex formats on pre-HALTI2 hardware Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-12-13 09:10:28 -05:00
Jonathan Marek	d34705c891	etnaviv: update INT_FILTER choice for GLES3 formats Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-12-13 09:09:08 -05:00
Jonathan Marek	15e9704ccb	etnaviv: set output mode and saturate bits Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-12-13 09:09:08 -05:00
Jonathan Marek	b7730c54a9	etnaviv: sRGB render target support Note: no srgb render target support before HALTI3 Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-12-13 09:09:08 -05:00
Jonathan Marek	39349e629a	etnaviv: remove sRGB formats from format table This supports all sRGB formats, without having them in the format table. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-12-13 09:09:08 -05:00
Tomasz Pyra	b62217780a	gallium/swr: Fix arb_transform_feedback2 Added support for pause/resume transform feedback. Fixed DrawTransformFeedback. Reviewed-by: Jan Zielinski <jan.zielinski@intel.com> Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>	2019-12-13 10:58:36 +00:00
Samuel Pitoiset	b37c91c12e	radv: handle unaligned vertex fetches on GFX6/GFX10 The Vulkan spec doesn't have any words for vertex attributes alignment. Fixes a test failure on GFX6 and a GPU hang on GFX10 with: dEQP-VK.spirv_assembly.instruction.spirv1p4.entrypoint.tess_con_pc_entry_point vkpipeline-db results on GFX10: Totals from affected shaders: SGPRS: 463772 -> 472972 (1.98 %) VGPRS: 343208 -> 343752 (0.16 %) Spilled SGPRs: 323 -> 336 (4.02 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 13806200 -> 14164472 (2.60 %) bytes Max Waves: 84021 -> 83755 (-0.32 %) Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2161 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-13 09:54:07 +00:00
Lionel Landwerlin	bd888bc1d6	i965/iris: perf-queries: don't invalidate/flush 3d pipeline Our current implementation of performance queries is fairly harsh because it completely flushes and invalidates the 3d pipeline caches at the beginning and end of each query. An argument can be made that this is how performance should be measured but it probably doesn't reflect what the application is actually doing and the actual cost of draw calls. A more appropriate approach is to just stall the pipeline at scoreboard, so that we measure the effect of a draw call without having the pipeline in a completely pristine state for every draw call. v2: Use end of pipe PIPE_CONTROL instruction for Iris (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-13 11:27:22 +02:00
Lionel Landwerlin	a575b3cd5c	intel/perf: drop batchbuffer flushing at query begin This was initially intended to fix issues with the query timings going occasionally high. It turns out there was a bug in the attribution of OA reports to our context when parsing the OA data. This led to reports flagged with other context IDs being included in our query results. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-13 11:27:17 +02:00
Iago Toral Quiroga	ca475d5fba	v3d: actually root the first BO in a command list in the job We were passing cl->bo, which is NULL, so v3d_job_add_bo was a no-op. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-13 08:58:10 +00:00
Christian Gmeiner	06db271a6c	etnaviv: drop compiled_rs_state forward declaration Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-12-13 08:11:03 +00:00
Christian Gmeiner	5f7c5f5dd2	etnaviv: remove not used etna_bits_ones(..) Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-12-13 08:11:03 +00:00
Vinson Lee	8d20b5cba5	swr: Fix build with llvm-10.0. Fix build error after llvm-10.0 commit ("1b2842bf902a [Alignment][NFC] CreateMemSet use MaybeAlign"). ../src/gallium/drivers/swr/swr_shader.cpp: In member function ‘void (* BuilderSWR::CompileGS(swr_context, swr_jit_gs_key&))(HANDLE, HANDLE, SWR_GS_CONTEXT)’: ../src/gallium/drivers/swr/swr_shader.cpp:738:65: error: no matching function for call to ‘BuilderSWR::MEMSET(llvm::Value&, llvm::Constant, int, long unsigned int)’ MEMSET(pStream, C((char)0), VERTEX_COUNT_SIZE + CONTROL_HEADER_SIZE, sizeof(float) * KNOB_SIMD_WIDTH); ^ In file included from ../src/gallium/drivers/swr/rasterizer/jitter/builder.h:163:0, from ../src/gallium/drivers/swr/swr_shader.cpp:43: src/gallium/drivers/swr/rasterizer/jitter/gen_builder.hpp:51:11: note: candidate: llvm::CallInst* SwrJit::Builder::MEMSET(llvm::Value, llvm::Value, uint64_t, llvm::MaybeAlign, bool, llvm::MDNode, llvm::MDNode, llvm::MDNode) CallInst MEMSET(Value Ptr, Value Val, uint64_t Size, MaybeAlign Align, bool isVolatile = false, MDNode TBAATag = nullptr, MDNode ScopeTag = nullptr, MDNode NoAliasTag = nullptr) ^ src/gallium/drivers/swr/rasterizer/jitter/gen_builder.hpp:51:11: note: no known conversion for argument 4 from ‘long unsigned int’ to ‘llvm::MaybeAlign’ In file included from ../src/gallium/drivers/swr/rasterizer/jitter/builder.h:163:0, from ../src/gallium/drivers/swr/swr_shader.cpp:43: src/gallium/drivers/swr/rasterizer/jitter/gen_builder.hpp:56:11: note: candidate: llvm::CallInst SwrJit::Builder::MEMSET(llvm::Value, llvm::Value, llvm::Value, llvm::MaybeAlign, bool, llvm::MDNode, llvm::MDNode, llvm::MDNode) CallInst* MEMSET(Value Ptr, Value Val, Value Size, MaybeAlign Align, bool isVolatile = false, MDNode TBAATag = nullptr, MDNode ScopeTag = nullptr, MDNode NoAliasTag = nullptr) ^ src/gallium/drivers/swr/rasterizer/jitter/gen_builder.hpp:56:11: note: no known conversion for argument 4 from ‘long unsigned int’ to ‘llvm::MaybeAlign’ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2019-12-12 23:43:38 -08:00
Jonathan Marek	828f8f5531	turnip: implement subpass input attachments Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	3b4b5f549f	turnip: CmdClearAttachments fixes Partial depth/stencil clear and skipping unused attachments. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	aac7d6c1dc	turnip: subpass rework A renderpass is a tile load/store cycle. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	4322cf34c4	turnip: add dirty bit for push constants Fixes push constants not updating in some cases. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	27d2174508	turnip: no 8x msaa on 128bpp formats We don't have an entry for cpp 128 in the tile_alignment table, but I don't think the HW supports this at all (blob driver just doesn't have 8x msaa). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	5fd9fd3516	turnip: fix VK_IMAGE_ASPECT_STENCIL_BIT image view Use a special format which allows sampling the stencil and set the correct swizzle. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	e71f79f6c6	turnip: set FRAG_WRITES_SAMPMASK bit GPU hangs if SAMPMASK_REGID is used without this bit. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	99a4f7c79f	turnip: set load_layer_id to zero We don't have layered rendering and ir3 doesn't support this intrinsic, so just set it to zero for now. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	7bbcf7deff	turnip: update tile_align_w/tile_align_h It looks like the actual tile alignment requirement is less than 32x32, but in some cases input attachment texture needs 64 alignment. Reduced the h alignment to 16 to compensate and it seems to work fine. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	402bc111fc	turnip: fix tile layout logic Use DIV_ROUND_UP and stop trying to increase the tile_count width/height once tile_align_w/tile_align_h are reached. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	14cbe2dea5	turnip: fix hw binning render area Fix a mistake in the y2 coordinate. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	029322c100	freedreno/registers: add a6xx texture format for stencil sampler Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	2db03867f6	freedreno/ir3: add GLSL_SAMPLER_DIM_SUBPASS to tex_info Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:17 -05:00
Jonathan Marek	ab54aceaa8	turnip: fix incorrectly failing assert pColorBlendState is allowed to be NULL if subpass has >0 color attachments but they are all unused. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-12 20:33:16 -05:00
Alyssa Rosenzweig	07d8b98b54	panfrost: Query core count and thread tls alloc This is supported only on newer kernels. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 00:47:23 +00:00
Alyssa Rosenzweig	315324614e	panfrost: Factor out panfrost_query_raw We would like to query properties other than product ID. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-13 00:47:23 +00:00
Timothy Arceri	a6aedc662e	st/glsl_to_nir: use nir based program resource list builder Here we use the NIR based builder to add everything to the resource list except for SSO packed varyings. Since the details of those varyings get lost during packing we leave the special handling to the GLSL IR pass for now. In order to do this we add some bools to the build resource list functions. Using the NIR based resource list builder gets us a step closer to using a native NIR based linker. It should also be faster than the GLSL IR builder, one because the NIR optimisations should mean we add fewer entries due to better optimisations, and two because nir gives us better lists to work with and we don't need to walk the entire IR to find the resources. Ack-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Timothy Arceri	144f54e483	st/glsl_to_nir: call gl_nir_lower_buffers() a little later In a following commit we will use a NIR based builder to build the OpenGL resource list, so we want to delay this call a little. Ack-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Timothy Arceri	d0259f4159	glsl: add subroutine support to nir_build_program_resource_list() This is required so we can use the NIR linker to link GLSL in addition to spirv. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Timothy Arceri	46f9f74c57	glsl: add support for named varyings in nir_build_program_resource_list() This adds support for adding names of varyings to the resource list which is required for us to use this function with the glsl linker. Support for names is optional for spirv which is why it had not been added yet. This is mostly a copy of the GLSL IR code adapted to nir. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Timothy Arceri	3c364f90fd	glsl: copy the new data fields when converting to nir These fields added in the previous commit will be used to make use of a NIR based GLSL linker. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Timothy Arceri	56c25b938c	nir: add some fields to nir_variable_data These will be used to provide NIR linking functionality to GLSL. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Timothy Arceri	89b2b0f767	glsl: copy the how_declared field when converting to nir This is needed to make use of nir_build_program_resource_list(). Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Timothy Arceri	c3823d2d29	glsl: move nir_remap_dual_slot_attributes() call out of glsl_to_nir() In order to be able to implement a NIR based glsl linker we need to build the program resource list with NIR. This change delays the remapping so that a later commit can call the NIR based resource list builder. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-13 00:07:19 +00:00
Dylan Baker	e37115c912	docs: Update release notes, index, and calendar for 19.3.0	2019-12-12 12:05:00 -08:00
Dylan Baker	941aa31572	docs/19.3.0: Add SHA256 sums	2019-12-12 11:57:54 -08:00
Dylan Baker	2ab4c2bc22	docs: add release notes for 19.3.0	2019-12-12 11:57:53 -08:00
Jason Ekstrand	fa4d981f6f	i965: Enable GL_EXT_gpu_shader4 on Gen6+ It's already enabled for all gallium drivers that support GLSL 1.40 or above and we already support everything in our compiler on SNB+ Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-12-12 18:43:17 +00:00
Samuel Pitoiset	eda1b77cc2	radv: enable SpvCapabilityImageMSArray The Vulkan spec says that StorageImageMultisample and ImageMSArray SPIR-V capabilities must be enabled if the shaderStorageImageMultisample feature is supported. This fixes a warning with RenderDoc. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2212 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-12 18:52:08 +01:00
Alyssa Rosenzweig	eac9247b2d	panfrost: Add routines to calculate stack size/shift These implement the aforementioned formulas. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig	e6f8ef93ca	panfrost: Split stack_shift nibble from unk0 It's conceptually independent from the upper part (which is not yet understood, but for spilling generally remains equal to 0x1e). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig	6c6372770c	panfrost: Rename unknown_address_0 -> scratchpad It's the analogue pointer in SFBD. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig	8b290bb13d	panfrost: Describe thread local storage sizing rules Deeply nested powers-of-two, basically :-) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig	2b4da476f4	pan/midgard: Fix shift for TLS access Due to this issue we were using 4x the memory we should have for TLS, which was messing up the size calculations. Oops! Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig	05b839f354	pan/midgard: Simplify and fix vector copyprop Fixes a regression in QuakeSpasm. See https://gitlab.freedesktop.org/mesa/mesa/issues/2169 for apitrace. Closes #2169 Fixes: `f72873e6aa` ("pan/midgard: Copypropagate vector creation") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reported-by: Icecream95	2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig	4308d75281	pan/midgard: Don't try to free NULL in LCRA Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Fixes: `12e393bacf` ("panfrost: add lcra_free() to free lcra state")	2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig	5e75eb547f	pan/midgard: Force alignment for csel_v The swizzle on the conditional gets lost. Fixes "horizontal mirroring" in godot. See https://gitlab.freedesktop.org/mesa/mesa/issues/2108 which has attached apitrace. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Fixes: `d3b3daa9d3` ("pan/midgard: Use new scheduler") Reported-by: Icecream95	2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig	8c79467a0d	pan/midgard: Don't use no_spill for memory spill src I'm not totally sure why this would break things, but it's certainly not necessary and it does break things. Somehow this gives the RA more freedom, fixing some spill issues. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig	d48c195acf	pan/midgard: Use no_spill bitmask We would like no_spill decisions to be class-specific -- spilling from special register to a work register doesn't preclude also spilling that work register to stack. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig	08b16fb321	pan/midgard: Dynamically allocate r26/27 for spills This allows us to spill two 128-bit values in the same bundle, since we have two registers we can spill with. This improves the register allocation flexibility in programs with heavy spilling, though unfortunately it isn't sufficient (theoretically, 3.5 128-bit values can be spilled from 3 vector units and 2 scalar units). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 11:42:07 -05:00
Alyssa Rosenzweig	8e7f2b9ae3	pan/midgard: Remove code marked "TODO: remove me" It's a fossil, how cute :-) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 11:42:06 -05:00
Alyssa Rosenzweig	b6d1b32d58	pan/midgard: Remove consecutive_skip code This has been unused since the beginning since it's broken. Let's toss it so it doesn't get in the way of further fixes. Bigger fish to fry. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 11:42:06 -05:00
Alyssa Rosenzweig	3c0f1ea58c	pan/midgard: Move bounds checking into LCRA This simplifies the cost calculation code a bit. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 11:42:06 -05:00
Alyssa Rosenzweig	e985ae25a6	pan/midgard: Remove spill cost heuristic We do need some sort of a cost heuristic, but this one is just causing spilling to behave worse on shaders I'm looking at, and I don't need more noise in the spill implementation right now. Get it working first. We can optimize this later. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 11:42:06 -05:00
Alyssa Rosenzweig	cacb4bc022	pan/midgard: Simplify spillability test Let's not worry about spilling twice in a bundle; that's too restrictive. We'll need to change the schedule itself -- unfortunately, this can have second-order effects due to pipeline registers. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 11:42:06 -05:00
Alyssa Rosenzweig	7cf5bee5aa	pan/midgard: Split spill node selection/spilling Instead of having a giant function for both, split into the two subtasks so we can handle errors better. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 11:42:06 -05:00
Alyssa Rosenzweig	9dc3b18e49	pan/midgard: Move spilling code out of scheduler We move it to the register allocator itself. It doesn't belong in midgard_schedule.c! Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 11:42:06 -05:00
Tomeu Vizoso	88f9522f83	st/mesa: Don't access members of NULL pointers Should be harmless, but UBSAN complains about it and fills the logs with noise. ../src/mesa/state_tracker/st_manager.c:523:27: runtime error: member access within null pointer of type 'struct st_framebuffer'"} #0 0xaad4e89c in st_framebuffer_reference ../src/mesa/state_tracker/st_manager.c:523"} #1 0xaad4e89c in st_api_make_current ../src/mesa/state_tracker/st_manager.c:1091"} #2 0xaab69e0e in dri_make_current ../src/gallium/state_trackers/dri/dri_context.c:301"} #3 0xaab48fd2 in driBindContext ../src/mesa/drivers/dri/common/dri_util.c:581"} #4 0xb682a122 in dri2_make_current ../src/egl/drivers/dri2/egl_dri2.c:1625"} #5 0xb67f95a4 in eglMakeCurrent ../src/egl/main/eglapi.c:884"} #6 0x4c2b0e in tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) (/deqp/modules/gles2/deqp-gles2+0x29b0e)"} #7 0x4c3302 in tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const) const (/deqp/modules/gles2/deqp-gles2+0x2a302)"} #8 0x73a9b0 in glu::createRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::RenderConfig const&, glu::RenderContext const) (/deqp/modules/gles2/deqp-gles2+0x2a19b0)"} #9 0x73ad86 in glu::createDefaultRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::ApiType) (/deqp/modules/gles2/deqp-gles2+0x2a1d86)"} #10 0x4c6a78 in deqp::gles2::Context::Context(tcu::TestContext&) (/deqp/modules/gles2/deqp-gles2+0x2da78)"} #11 0x4c3ba0 in deqp::gles2::TestPackage::init() (/deqp/modules/gles2/deqp-gles2+0x2aba0)"} #12 0x852fd8 in tcu::TestHierarchyIterator::next() (/deqp/modules/gles2/deqp-gles2+0x3b9fd8)"} #13 0x829660 in tcu::TestSessionExecutor::iterate() (/deqp/modules/gles2/deqp-gles2+0x390660)"} #14 0x810aac in tcu::App::iterate() (/deqp/modules/gles2/deqp-gles2+0x377aac)"} #15 0x4c1d4c in main (/deqp/modules/gles2/deqp-gles2+0x28d4c)"} #16 0xb64b6aa8 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x1aaa8)"} ../src/mesa/state_tracker/st_atom.c:115:8: runtime error: member access within null pointer of type 'struct st_program'"} #0 0xaae11a58 in check_program_state ../src/mesa/state_tracker/st_atom.c:115"} #1 0xaae128f6 in st_validate_state ../src/mesa/state_tracker/st_atom.c:192"} #2 0xaadc58c2 in prepare_draw ../src/mesa/state_tracker/st_draw.c:132"} #3 0xaadc58c2 in st_draw_vbo ../src/mesa/state_tracker/st_draw.c:184"} #4 0xabc4f924 in _mesa_validated_drawrangeelements ../src/mesa/main/draw.c:816"} #5 0xabc50240 in _mesa_DrawElements ../src/mesa/main/draw.c:970"} #6 0x73ebd2 in glu::CallLogWrapper::glDrawElements(unsigned int, int, unsigned int, void const) (/deqp/modules/gles2/deqp-gles2+0x2d4bd2)"} #7 0x6d86b2 in deqp::gls::FragOpInteractionCase::iterate() (/deqp/modules/gles2/deqp-gles2+0x26e6b2)"} #8 0x494d16 in deqp::gles2::TestCaseWrapper::iterate(tcu::TestCase) (/deqp/modules/gles2/deqp-gles2+0x2ad16)"} #9 0x7f9cf2 in tcu::TestSessionExecutor::iterateTestCase(tcu::TestCase*) (/deqp/modules/gles2/deqp-gles2+0x38fcf2)"} #10 0x7fa5f0 in tcu::TestSessionExecutor::iterate() (/deqp/modules/gles2/deqp-gles2+0x3905f0)"} #11 0x7e1aac in tcu::App::iterate() (/deqp/modules/gles2/deqp-gles2+0x377aac)"} #12 0x492d4c in main (/deqp/modules/gles2/deqp-gles2+0x28d4c)"} #13 0xb64b9aa8 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x1aaa8)"} Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 16:26:58 +01:00
Tomeu Vizoso	99d4c71f7e	panfrost: Don't lose bits! UBSAN complained that when alpha was 255 and we shifted it 24 positions to the left, it didn't fit in a signed int. That's because bitwise operations automatically promote to signed int. ../src/gallium/drivers/panfrost/pan_job.c:1130:64: runtime error: left shift of 255 by 24 places cannot be represented in type 'int'"} #0 0xacf953d6 in pan_pack_color ../src/gallium/drivers/panfrost/pan_job.c:1130"} #1 0xacf953d6 in panfrost_batch_clear ../src/gallium/drivers/panfrost/pan_job.c:1204"} #2 0xaae3226a in st_Clear ../src/mesa/state_tracker/st_cb_clear.c:513"} #3 0x4c3d0e in deqp::gles2::TestCaseWrapper::iterate(tcu::TestCase) (/deqp/modules/gles2/deqp-gles2+0x2ad0e)"} #4 0x828cf2 in tcu::TestSessionExecutor::iterateTestCase(tcu::TestCase) (/deqp/modules/gles2/deqp-gles2+0x38fcf2)"} #5 0x8295f0 in tcu::TestSessionExecutor::iterate() (/deqp/modules/gles2/deqp-gles2+0x3905f0)"} #6 0x810aac in tcu::App::iterate() (/deqp/modules/gles2/deqp-gles2+0x377aac)"} #7 0x4c1d4c in main (/deqp/modules/gles2/deqp-gles2+0x28d4c)"} #8 0xb64b6aa8 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x1aaa8)"} Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 16:26:54 +01:00
Tomeu Vizoso	165cb0a5fe	util: Don't access members of NULL pointers Should be harmless, but UBSAN complains about it and fills the logs with noise. ../src/gallium/auxiliary/util/u_inlines.h:110:8: runtime error: member access within null pointer of type 'struct pipe_surface'"} #0 0xaaccf186 in pipe_surface_reference ../src/gallium/auxiliary/util/u_inlines.h:110"} #1 0xaaccf186 in util_copy_framebuffer_state ../src/gallium/auxiliary/util/u_framebuffer.c:105"} #2 0xaabfb60e in cso_set_framebuffer ../src/gallium/auxiliary/cso_cache/cso_context.c:723"} #3 0xaae195ce in st_update_framebuffer_state ../src/mesa/state_tracker/st_atom_framebuffer.c:207"} #4 0xaae12316 in st_validate_state ../src/mesa/state_tracker/st_atom.c:261"} #5 0xaae31302 in st_Clear ../src/mesa/state_tracker/st_cb_clear.c:438"} #6 0x4c3d0e in deqp::gles2::TestCaseWrapper::iterate(tcu::TestCase) (/deqp/modules/gles2/deqp-gles2+0x2ad0e)"} #7 0x828cf2 in tcu::TestSessionExecutor::iterateTestCase(tcu::TestCase) (/deqp/modules/gles2/deqp-gles2+0x38fcf2)"} #8 0x8295f0 in tcu::TestSessionExecutor::iterate() (/deqp/modules/gles2/deqp-gles2+0x3905f0)"} #9 0x810aac in tcu::App::iterate() (/deqp/modules/gles2/deqp-gles2+0x377aac)"} #10 0x4c1d4c in main (/deqp/modules/gles2/deqp-gles2+0x28d4c)"} #11 0xb64b6aa8 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x1aaa8)"} Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 16:26:50 +01:00
Tomeu Vizoso	fb579b0347	nir: Don't copy empty array It's undefined behavior UBSAN complains about, so fixing this will reduce the noise a bit. ../src/compiler/nir/nir_clone.c:710:4: runtime error: null pointer passed as argument 2, which is declared to never be null"} #0 0xac781be4 in clone_function ../src/compiler/nir/nir_clone.c:710"} #1 0xac781be4 in nir_shader_clone ../src/compiler/nir/nir_clone.c:740"} #2 0xacf99442 in panfrost_shader_compile ../src/gallium/drivers/panfrost/pan_assemble.c:54"} #3 0xacf6b268 in panfrost_bind_shader_state ../src/gallium/drivers/panfrost/pan_context.c:1960"} #4 0xaae326bc in set_fragment_shader ../src/mesa/state_tracker/st_cb_clear.c:135"} #5 0xaae326bc in clear_with_quad ../src/mesa/state_tracker/st_cb_clear.c:335"} #6 0xaae326bc in st_Clear ../src/mesa/state_tracker/st_cb_clear.c:518"} #7 0x494d0e in deqp::gles2::TestCaseWrapper::iterate(tcu::TestCase) (/deqp/modules/gles2/deqp-gles2+0x2ad0e)"} #8 0x7f9cf2 in tcu::TestSessionExecutor::iterateTestCase(tcu::TestCase) (/deqp/modules/gles2/deqp-gles2+0x38fcf2)"} #9 0x7fa5f0 in tcu::TestSessionExecutor::iterate() (/deqp/modules/gles2/deqp-gles2+0x3905f0)"} #10 0x7e1aac in tcu::App::iterate() (/deqp/modules/gles2/deqp-gles2+0x377aac)"} #11 0x492d4c in main (/deqp/modules/gles2/deqp-gles2+0x28d4c)"} #12 0xb64b9aa8 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x1aaa8)"} Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 16:26:45 +01:00
Tomeu Vizoso	47a73888f5	pan/midgard: Remove undefined behavior As found by UBSAN, it should be harmless but it's good to remove any UB so the tool's output is useful. ../src/panfrost/midgard/midgard_schedule.c:1094:9: runtime error: index -1 out of bounds for type 'midgard_instruction [6]' #0 0xad047872 in schedule_block ../src/panfrost/midgard/midgard_schedule.c:1094 #1 0xad04d41a in schedule_program ../src/panfrost/midgard/midgard_schedule.c:1116 #2 0xad031f98 in midgard_compile_shader_nir ../src/panfrost/midgard/midgard_compile.c:2588 #3 0xacf9874e in panfrost_shader_compile ../src/gallium/drivers/panfrost/pan_assemble.c:68 #4 0xacf6b268 in panfrost_bind_shader_state ../src/gallium/drivers/panfrost/pan_context.c:1960 #5 0xaae2596e in st_update_fp ../src/mesa/state_tracker/st_atom_shader.c:168 #6 0xaae12316 in st_validate_state ../src/mesa/state_tracker/st_atom.c:261 #7 0xaadc58c2 in prepare_draw ../src/mesa/state_tracker/st_draw.c:132 #8 0xaadc58c2 in st_draw_vbo ../src/mesa/state_tracker/st_draw.c:184 #9 0xabc4f924 in _mesa_validated_drawrangeelements ../src/mesa/main/draw.c:816 #10 0xabc50240 in _mesa_DrawElements ../src/mesa/main/draw.c:970 #11 0x73ebd2 in glu::CallLogWrapper::glDrawElements(unsigned int, int, unsigned int, void const) (/deqp/modules/gles2/deqp-gles2+0x2d4bd2) #12 0x6d86b2 in deqp::gls::FragOpInteractionCase::iterate() (/deqp/modules/gles2/deqp-gles2+0x26e6b2) #13 0x494d16 in deqp::gles2::TestCaseWrapper::iterate(tcu::TestCase) (/deqp/modules/gles2/deqp-gles2+0x2ad16) #14 0x7f9cf2 in tcu::TestSessionExecutor::iterateTestCase(tcu::TestCase) (/deqp/modules/gles2/deqp-gles2+0x38fcf2) #15 0x7fa5f0 in tcu::TestSessionExecutor::iterate() (/deqp/modules/gles2/deqp-gles2+0x3905f0) #16 0x7e1aac in tcu::App::iterate() (/deqp/modules/gles2/deqp-gles2+0x377aac) #17 0x492d4c in main (/deqp/modules/gles2/deqp-gles2+0x28d4c) #18 0xb64b9aa8 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x1aaa8) Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 16:26:40 +01:00
Tomeu Vizoso	5dfe41239c	panfrost: Hold a reference to sampler views Before we were just copying, but we need to hold a reference as well. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-12 16:26:35 +01:00
Jan Zielinski	bd5077ae1d	gallium/swr: Fix Windows build Tessellator defines own fmin/fmax functions that conflict with those defined in cmath header. Need to use legacy math.h which was originally used in MS code. Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>	2019-12-12 14:35:23 +01:00
Samuel Pitoiset	a0f1a5fa05	ac/nir: fix out-of-bound access when loading constants from global Global load/store instructions can't know if the offset is out-of-bound because they don't use descriptors (no range). Fix this by clamping the offset for arrays that are indexed with a non-constant offset that's greater or equal to the array size. This fixes VM faults and GPU hangs with Dead Rising 4. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2148 Fixes: `71a6794200` ("ac/nir: Enable nir_opt_large_constants") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-12 10:12:56 +00:00
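The clamping described in a0f1a5fa05 can be illustrated on the CPU side. The real fix operates on the LLVM IR emitted by ac/nir; the sketch below only shows the arithmetic, and `clamp_index` is a hypothetical name that assumes a non-empty array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not the ac/nir code): an index that is greater than or
 * equal to the array size is clamped to the last valid element, so a global
 * load without a descriptor range can never read past the array.
 * Assumes array_size > 0. */
static uint32_t clamp_index(uint32_t index, uint32_t array_size)
{
   return index >= array_size ? array_size - 1 : index;
}
```

In-range indices pass through unchanged; only the out-of-bounds case is redirected to a valid element.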
Lionel Landwerlin	2c5eb1df68	anv: fix assumptions about temporary fence payload Since `f9a3d9738b`, temporary BO_WSI payloads are definitely a thing, so one of our asserts is wrong. Take that opportunity to expand a bit on an existing comment. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `f9a3d9738b` ("anv: Use BO fences/semaphores for AcquireNextImage") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ivan Briano <ivan.briano@intel.com>	2019-12-12 10:10:48 +00:00
Lionel Landwerlin	52bc235f2a	anv: fix fence underlying primitive checks We appear to have got lucky that the only type of temporary fence payload we could have was a syncobj and that would only happen when the type of the permanent payload was also a syncobj. This code was broken if that assumption changed and it did in commit `f9a3d9738b`. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ivan Briano <ivan.briano@intel.com>	2019-12-12 10:10:48 +00:00
Dave Airlie	790bc9a17e	vtn/opencl: add shuffle/shuffle support This adds nir encoding for these, generating them from libclc was very expensive, and this is a lot simpler. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-12-12 19:40:58 +10:00
Dave Airlie	5471ef7532	vtn: convert vload/store to single value loops There is an alignment issue doing this the other way, the spec clearly says vload/store don't require alignment. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-12-12 19:40:15 +10:00
Kenneth Graunke	dcb4230e5e	iris: Default to X-tiling for scanout buffers without modifiers Neither Mutter nor KWin's wayland compositors appear to use modifiers. In the non-modifier case, iris was still trying to use Y-tiling for scan-out surfaces, leading to this error: (gnome-shell:7247): mutter-WARNING **: 09:23:47.787: meta_drm_buffer_gbm_new failed: drmModeAddFB failed: Invalid argument We now fall back to the historical X-tiling for scanout buffers, which ought to work everywhere, at lower performance. To regain that, we need to ensure modifiers are actually supported in environments people use. Fixes: `fbf3124771` ("iris: Rework tiling/modifiers handling") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-11 22:03:48 -08:00
Dave Airlie	3cd903a6c3	llvmpipe: enable ARB_shader_draw_parameters. All the bits should be in place for this now. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-12 10:29:43 +10:00
Dave Airlie	75f21895de	gallivm: fixup base_vertex support base vertex should be 0 for non-indexed draws according to the piglit tests. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-12 10:16:19 +10:00
Dave Airlie	73f5e2d7ef	gallivm/draw: add support for draw_id system value. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-12 10:16:19 +10:00
Dave Airlie	22a40dd1c1	gallivm: add base instance sysval support Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-12 10:16:19 +10:00
Karol Herbst	20d0ae464c	nv50/ir: implement global atomics and handle it for nir TGSI doesn't have any concept of global memory right now. Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-12-11 23:54:39 +00:00
Karol Herbst	70c6bff2f0	nir: handle nir_deref_type_ptr_as_array in rematerialize_deref_in_block I forgot why that was required, but it still is the correct thing to do. Hit it at some point when working on implementing more CL features. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-11 23:54:39 +00:00
Rob Clark	ddb9701a3c	spirv: add OpLifetime* These are just hints so we can ignore them. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2019-12-11 23:54:39 +00:00
Karol Herbst	acc0658942	clover/spirv: allow Int64 Atomics for supported devices Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-11 23:54:39 +00:00
Karol Herbst	dba8bf1169	clover/nir: set spirv environment to OpenCL Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-11 23:54:39 +00:00
Karol Herbst	6d08f034ce	clover/nir: treat UniformConstant as global memory Just like we already do in the llvm backend. The current constant buffer code seems fundamentally flawed and right now we are thinking about how we want to reimplement all of that. But until that happens, just treat it as global memory and go on. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-11 23:54:39 +00:00
Karol Herbst	2402232c90	spirv: handle UniformConstant for OpenCL kernels The caller is responsible for setting up the ubo_addr_format value because, contrary to shared and global, it's not controlled by the spirv. Right now clover's implementation of CL constant memory uses a 24/8 bit format to encode the buffer index and offset, but that code is dead as all backends treat constants as global memory to work around annoying issues within OpenCL. Maybe that will change, maybe not. But just in case somebody wants to look at it, add a toggle for this inside vtn. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-11 23:54:39 +00:00
Dave Airlie	123f90cf36	gallivm/nir: copy compare ordering code from tgsi This fixes some isinf/isnan tests copying what the tgsi code paths do for float compares Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-12 09:16:41 +10:00
Dave Airlie	8f56ba5da4	gallivm/nir: cleanup code and call cmp wrapper Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-12 09:16:37 +10:00
Dave Airlie	63b3d38a50	gallivm: fix perspective enable if usage_mask doesn't have 0 bit set The current code looks like a typo, and fails if the usage_mask is for a y/z enabled input. Fixes piglit ext_transform_feedback-immediate-reuse-index-buffer with llvmpipe/nir Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-12 09:16:33 +10:00
Dave Airlie	bf29040103	gallivm: fix transpose for when first channel isn't created The previous fix worked when the second channel wasn't exposed, but a couple of piglit tests have inputs with just the y/z chans, no x/w. Partly Fixes piglit ext_transform_feedback-immediate-reuse-index-buffer with llvmpipe/nir Fixes: `5363cda52b` ("gallivm: add swizzle support where one channel isn't defined.") Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-12 09:16:28 +10:00
Dave Airlie	e35b2c37cd	llvmpipe/nir: handle texcoord requirements Switch to using texcoord intrinsic support. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-12 09:16:24 +10:00
Kristian H. Kristensen	b6f8c42846	freedreno/a6xx: Silence warning for unused perf counters Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 22:25:47 +00:00
Kristian H. Kristensen	9b09776846	freedreno/a6xx: Convert some tile setup to OUT_REG() Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 22:25:47 +00:00
Kristian H. Kristensen	8a4b0d852c	freedreno/a6xx: Convert gmem blits to OUT_REG() Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 22:25:47 +00:00
Kristian H. Kristensen	201caa7281	freedreno/a6xx: Convert VSC pipe setup to OUT_REG() Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 22:25:47 +00:00
Kristian H. Kristensen	c71348f84a	freedreno/a6xx: Convert emit_zs() to OUT_REG() Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 22:25:47 +00:00
Kristian H. Kristensen	ffa7d9cbeb	freedreno/a6xx: Convert emit_mrt() to OUT_REG() Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 22:25:47 +00:00
Kristian H. Kristensen	781b2dd63b	freedreno/a6xx: Include fd6_pack.h in a few files Including non-functional changes to get the value from the fd_reg_pair in places. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 22:25:47 +00:00
Kristian H. Kristensen	9783f6bc5d	freedreno/a6xx: Drop stale include Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 22:25:47 +00:00
Kristian H. Kristensen	9b05466144	freedreno/registers: Add 64 bit address registers Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 22:25:47 +00:00
Kristian H. Kristensen	bdd98b892f	freedreno: New struct packing macros Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 22:25:47 +00:00
Kristian H. Kristensen	b27b0e8550	freedreno/registers: Remove duplicate register definitions Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 22:25:47 +00:00
Timothy Arceri	f8148d0cc1	docs: remove mailing list as way of submitting patches All developers now use gitlab, don't confuse newcomers by suggesting they might use the mailing list. We want everyone to use gitlab so that patches get run through basic CI before they are merged. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-12-12 09:09:50 +11:00
Jason Ekstrand	776cfde699	anv: Bump the advertised patch version to 129 We've been keeping up with the spec updates. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-11 18:52:08 +00:00
Jason Ekstrand	5f5f5019bd	anv: Unconditionally advertise Vulkan 1.1 Vulkan 1.1 requires VK_KHR_external_fence which requires syncobj support to be actually usable. However, it doesn't strictly require that we support any external handle types. We should be able to advertise 1.1 even on old kernels that don't have syncobj support. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-11 18:52:08 +00:00
Jason Ekstrand	98a83d0fce	anv: Flush the queue on DeviceWaitIdle When we have syncobj_wait, we can trust in WAIT_FOR_SUBMIT but when we don't, we only have BO waits and those aren't quite as nice. This commit adds a flag to _anv_queue_submit to wait for the queue to drain before returning. This gives us the behavior we need to implement DeviceWaitIdle. Fixes: `246261f0ad` "anv: prepare the driver for delayed submissions" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-11 18:52:08 +00:00
Karol Herbst	0bafde717d	nir/tests: MSVC build fix Fixes: `11f736a6f9` "nir/tests: add serializer tests" Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-12-11 17:12:48 +00:00
Jan Zielinski	ab55708200	swr/rasterizer: Add tessellator implementation to the rasterizer This is the initial commit on the way to implementing the ARB_tessellation_shader extension in OpenSWR. It introduces the tessellator implementation taken from Microsoft GitHub (published under MIT license): https://github.com/microsoft/DirectX-Specs/blob/master/d3d/archive/images/d3d11/tessellator.cpp https://github.com/microsoft/DirectX-Specs/blob/master/d3d/archive/images/d3d11/tessellator.hpp It also adds some glue code that connects the tessellator to the internals of the SWR rasterizer. Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Bruce Cherniak <bruce.cherniak@intel.com> Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-12-11 16:54:37 +00:00
Samuel Pitoiset	ff2e11b210	gitlab-ci: set RADV_DEBUG=checkir for RADV test jobs This is used to validate if the driver emits correct LLVM IR. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-11 15:44:40 +00:00
Eric Engestrom	b2dac806f8	intel: add mi_builder_test for gen12 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-11 15:38:19 +00:00
Rohan Garg	2129b4152c	gitlab-ci: Use lavacli from packages lavacli 0.9.8 is now available in Debian Testing. Ref: https://tracker.debian.org/news/1066828/lavacli-098-1-migrated-to-testing/ Fixes: `555c0de` ("gitlab-ci: Move LAVA-related files into top-level ci dir") Signed-off-by: Rohan Garg <rohan.garg@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-12-11 15:19:43 +00:00
Erico Nunes	7701b7b7ee	lima/ppir: enable lower_fdph Otherwise we may lower some fdot to fdph which is not implemented in pp. Fixes #2126 Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-12-11 15:55:48 +01:00
Karol Herbst	11f736a6f9	nir/tests: add serializer tests Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-12-11 13:00:44 +01:00
Karol Herbst	676232d76f	nir/serialize: fix vec8 and vec16 NIR serialization uses nir_ssa_alu_instr_src_components in a few places to determine how many components a src has, but that's not what this function returns. It simply returns how many channels are used, which is still fine for most of the code. This was breaking code like this: vec16 32 ssa_1 = intrinsic load_global vec1 32 ssa_2 = fmax ssa_1.a, ssa_2.b v2: make the 16bit encoding work for identity swizzles again Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-12-11 13:00:44 +01:00
Bas Nieuwenhuizen	2e44bfc14f	radv: Fix RGBX Android<->Vulkan format correspondence. This is correct per the Vulkan spec format equivalence table. Fixes: `f36b52740a` "radv/android: Add android hardware buffer queries." Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-11 11:40:13 +01:00
Tomeu Vizoso	63ae9e61c1	panfrost: Add PAN_MESA_DEBUG=sync Sometimes it's useful to get information about GPU faults in the console, so it's synchronized with other messages. This commit will cause Mesa to wait for completion and check if there are any faults raised by the GPU. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-11 08:01:20 +01:00
Kenneth Graunke	2e654db27a	iris: Create smaller program keys without legacy features A lot of the brw_*_prog_key fields are for emulating features on legacy hardware that iris doesn't support. In particular, all of the texture swizzle fields take up a lot of space. These dead fields make hashing the shader keys more expensive than it ought to be. We introduce iris-specific keys with only the information we need, and translate them to brw keys when actually compiling new variants. This way, key comparisons can use the small keys. The size reductions are: VS: 328 bytes -> 8 bytes TCS: 312 bytes -> 24 bytes TES: 304 bytes -> 24 bytes GS: 284 bytes -> 8 bytes FS: 304 bytes -> 16 bytes CS: 280 bytes -> 4 bytes Scores for the Piglit drawoverhead microbenchmark case with a shader program change improve by roughly 30%. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-10 22:25:41 -08:00
Pierre Moreau	8ccd3f48a0	compiler/spirv: Fix uses of gnu struct = {} extension Fixes: `a24d6fbae6` ("meson: Add -Werror=gnu-empty-initializer to MSVC compat args") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Pierre Moreau <dev@pmoreau.org>	2019-12-11 06:03:22 +00:00
Vinson Lee	9661fc9cdb	util/u_thread: Restrict u_thread_get_time_nano on macOS. macOS does not have pthread_getcpuclockid. src/util/u_thread.h:156:4: error: implicit declaration of function 'pthread_getcpuclockid' is invalid in C99 [-Werror,-Wimplicit-function-declaration] pthread_getcpuclockid(thread, &cid); ^ Fixes: `4913215d14` ("util/u_thread: don't restrict u_thread_get_time_nano() to __linux__") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2171 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-12-10 21:35:47 -08:00
Eric Anholt	8bf590b46b	tu: Move UBWC layout into fdl6_layout() and use that function. This gets us shared non-UBWC layout code between gallium and turnip. Until I fix up the rest of gallium to handle UBWC mipmapping, we do the single-level UBWC setup in gallium as a fixup after layout. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 04:24:18 +00:00
Eric Anholt	de619d7503	freedreno: Switch the 16-bit workaround to match what turnip does. Prevents regressions on argb1555 and rgb565 when making turnip use freedreno's layout. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 04:24:18 +00:00
Eric Anholt	d9cf3e76bd	freedreno: Move a6xx's setup_slices() to a shareable helper function. We pass in all the parameters for setting up the layout, though freedreno still sets a few of them up early (since it uses layout helpers in making some decisions about the layout setup parameters that will be cleaned up once krh's blitter work lands).	2019-12-11 04:24:18 +00:00
Eric Anholt	67258a44d2	tu: Move our image layout into a freedreno_layout struct. This lets us start using some of the fdl_* helpers and have more obviously matching code between gallium and turnip. We can't yet use the fdl_* UBWC helpers, since the gallium driver doesn't do UBWC mipmaps (which I'm working on in another branch). Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 04:24:18 +00:00
Eric Anholt	ea7631a9a6	freedreno: Move UBWC layout into a slices array like the non-UBWC slices. This is a little refactor in preparation for UBWC mipmapping support. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 04:24:18 +00:00
Eric Anholt	bbe84c6c31	freedreno: Refactor the UBWC flags registers emission. It's the same logic for each of these being emitted, and I was about to change the rsc->layout.* for UBWC. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 04:24:18 +00:00
Eric Anholt	97be9503bb	freedreno: Drop the extra offset field for mipmap slices. We can just bake the UBWC-goes-first delta into the slices at setup time. I did have to fix up the resource shadowing swap path to swap the slice fields, as it was missing and regressed the format reinterprets otherwise. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-11 04:24:18 +00:00
Kenneth Graunke	69d7782b15	intel/decoder: Make get_state_size take a full 64-bit address and a base i965 wants to use an offset from a base because everything is in a single buffer whose address may be relocated, and all base addresses are set to the start of that buffer. iris wants to use a full 64-bit address, because state lives in separate buffers which may be in the shader, surface, and dynamic memory zones, where addresses grow downward from the top of a 4GB zone. So it's very possible for a 32-bit offset to exist relative to multiple bases, leading to the wrong state size.	2019-12-10 19:10:49 -08:00
Dongwon Kim	8a8534a698	iris: INTEL performance query implementation Low-level implementation of the INTEL-performance-query APIs in the Intel iris driver. Most of the functions and procedures defined here are adopted from the i965 driver (brw_performance_query.c) v2: - replace genX_init_performance_query with iris_init_perfquery_functions, which is gen-version agnostic - general code clean-up v3: include gen_perf_gens.h as some of the defines were moved to this new header file v4: - checking for kernel 4.13+ won't be needed here as Iris won't be loaded anyway without DRM_SYNCOBJ, which is enabled after Kernel 4.13. - checking whether gen < 8 or is_cherryview won't be required as well because those cases are screened in iris_screen_create. v5: remove genX(init_performance_query) v6: - remove oa_metrics_kernel_support as iris works only with kernel 4.18 and newer. - use perf functions defined in separate file, iris_perf.h/c Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-10 17:02:58 -08:00
Mark Janes	ca2dd99bf6	iris: separating out common perf code The configuration of the gen_perf vtable will be the same for INTEL_performance_query and AMD_performance_monitor. Initialize the table in a single routine that can be called from both implementations. Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-10 17:02:58 -08:00
Dongwon Kim	106054ef79	gallium: enable INTEL_PERFORMANCE_QUERY new state tracker APIs added for INTEL_performance_query This extension is enabled if all vendor specific functions for it exist. v2: add st_cb_perfquery.* to the list of sources in Makefile v3: minor code clean-up v4: - add driver hooks for intel-performance-query apis - add PIPE level performance counter and type enums that match to OpenGL enums - do conversion of pipe_perf_counter_type and pipe_perf_counter_data_type enums to GL defines in state_tracker Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-10 17:02:58 -08:00
Dylan Baker	d0eebda990	meson/broadcom: libbroadcom_cle also needs zlib Fixes: `1ae8018a6a` ("meson: Add support for the vc4 driver.") Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-11 00:49:44 +00:00
Kenneth Graunke	0f2f561a10	anv: Enable Gen11 Color/Z write merging optimization TCCNTLREG contains additional L3 cache write merging optimizations. The default value on my system appears to be: - URB Partial Write Merging (bit 0) - L3 Data Partial Write Merging (bit 2) - TC Disable (bit 3) Windows drivers appear to set bit 1 as well to enable "Color/Z Partial Write Merging". This should solve an issue we were seeing where MRT benchmarks were using substantially more bandwidth than they ought. However, we have not observed it to cause measurable FPS gains. It is unclear whether we should be setting bit 0 or bit 3, so for now we leave those at the hardware default value. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-10 16:19:46 -08:00
Kenneth Graunke	5cc7636993	iris: Enable Gen11 Color/Z write merging optimization TCCNTLREG contains additional L3 cache write merging optimizations. The default value on my system appears to be: - URB Partial Write Merging (bit 0) - L3 Data Partial Write Merging (bit 2) - TC Disable (bit 3) Windows drivers appear to set bit 1 as well to enable "Color/Z Partial Write Merging". This should solve an issue we were seeing where MRT benchmarks were using substantially more bandwidth than they ought. However, we have not observed it to cause measurable FPS gains. It is unclear whether we should be setting bit 0 or bit 3, so for now we leave those at the hardware default value. Improves performance in Manhattan 3.0 by 6% on ICL 8x8 at a fixed frequency, according to Felix Degrood. I didn't see any improvements at out-of-the-box power management settings, however. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-10 16:19:43 -08:00
Kenneth Graunke	0b74f85870	intel/genxml: Add a partial TCCNTLREG definition TCCNTLREG contains additional cache programming settings. In particular, there are several write combining controls we'd like to use. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-10 16:19:33 -08:00
Kenneth Graunke	74665eaf3a	util: Detect use-after-destroy in simple_mtx This makes simple_mtx_destroy set the counter to an invalid canary value and then makes lock/unlock assert that the value is legal. That way, calling lock/unlock after destroy will assert fail, rather than deadlocking or potentially even working. This has caught real deadlocks in dEQP multithreaded tests (in st/mesa shader variant zombie list handling), which have since been fixed. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-12-10 23:48:40 +00:00
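The canary technique described in 74665eaf3a can be sketched with a toy, single-threaded mutex. This is not Mesa's simple_mtx: the `toy_mtx` type, its functions, and the canary value are all hypothetical, and the real lock uses atomics and futexes rather than a plain store.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical canary value; any value outside the legal states works. */
#define TOY_MTX_DESTROYED 0xDEADBEEFu

/* Toy mutex state: 0 = unlocked, 1 = locked. Anything else is illegal. */
struct toy_mtx { uint32_t val; };

static int toy_mtx_is_valid(const struct toy_mtx *m) { return m->val <= 1; }

static void toy_mtx_init(struct toy_mtx *m) { m->val = 0; }

/* Destroy poisons the state so later use is detectable. */
static void toy_mtx_destroy(struct toy_mtx *m) { m->val = TOY_MTX_DESTROYED; }

static void toy_mtx_lock(struct toy_mtx *m)
{
   assert(toy_mtx_is_valid(m));   /* fires on use-after-destroy */
   m->val = 1;                    /* single-threaded toy: no atomics */
}

static void toy_mtx_unlock(struct toy_mtx *m)
{
   assert(toy_mtx_is_valid(m));   /* fires on use-after-destroy */
   m->val = 0;
}
```

Calling `toy_mtx_lock` after `toy_mtx_destroy` trips the assertion immediately, instead of deadlocking or silently appearing to work.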
Rob Clark	fc97643c57	freedreno/a6xx: enable LRZ by default Now that dEQP should be happy, let's flip the switch. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-10 22:55:21 +00:00
Rob Clark	1b4c12d3ee	freedreno/a6xx: fix LRZ logic In particular, we need to invalidate the LRZ state when we cannot be confident in what the Z state would be during rendering: 1) depth test modes not supported by LRZ 2) stencil test, which would require full rasterization and stencil test in the binning pass (whereas LRZ normally just needs to determine the min and max z value in an 8x8 quad) Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-10 22:55:21 +00:00
Rob Clark	3c479849c5	freedreno/a6xx: fix LRZ layout Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-10 22:55:21 +00:00
Rob Clark	6cf101402d	freedreno/a5xx+a6xx: split LRZ layout to per-gen Seems to be a bit different for a6xx, so let's split this out. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-10 22:55:21 +00:00
Rob Clark	3b074a2e53	freedreno/a6xx: disable LRZ when blending Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-10 22:55:21 +00:00
Marek Olšák	a305543c8d	radeonsi: don't rely on CLEAR_STATE to set PA_SC_GENERIC_SCISSOR_* Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-10 16:32:37 -05:00
Marek Olšák	aced18aa61	radeonsi/gfx10: simplify the tess_turns_off_ngg condition Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-10 16:32:36 -05:00
Marek Olšák	42f921387b	radeonsi/gfx10: disable vertex grouping based on PAL. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-10 16:32:34 -05:00
Marek Olšák	75ce078a0a	radeonsi: enable NIR by default and document GL 4.6 support Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-10 15:48:58 -05:00
Marek Olšák	42b28e7ac3	st/dri: assume external consumers of back buffers can write to the buffers This was reverted needlessly because it was part of another series. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-By: Tapani Pälli <tapani.palli@intel.com>	2019-12-10 15:37:37 -05:00
Jason Ekstrand	41691ac016	ANV: Stop advertising smoothLines support on gen10+ Reviewed-by: Ivan Briano <ivan.briano@intel.com>	2019-12-10 20:13:56 +00:00
Dylan Baker	85a9698ac3	meson/broadcom: libbroadcom_cle needs expat headers Fixes: `1ae8018a6a` ("meson: Add support for the vc4 driver.") Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-10 10:48:38 -08:00
Lionel Landwerlin	5fdea9f401	anv: fix incorrect VMA alignment for CCS main surfaces Maybe a finer way of dealing with this requirement would be to increase the number of pdevice->memory.types[] to add a category for special alignment cases. Meanwhile this fixes the problem of CCS surface alignment and it's probably not going to cause issues given the size of our address space. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `6af8a4acc4` ("anv: Add aux-map translation for gen12+") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-10 16:06:54 +00:00
Lionel Landwerlin	dcfe1903c3	anv: fix missing gen12 handling Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `181be14d43` ("anv: Build for gen12") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-12-10 16:06:54 +00:00
Eric Engestrom	865f4b193f	docs: reword a bit and list HTTPS before FTP Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-12-10 15:21:23 +00:00
Eric Engestrom	d90e656fa7	meson: drop `intel_` prefix on imgui_core Again, no real effect, just the name of a temporary build file. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-12-10 15:16:02 +00:00
Eric Engestrom	2b0e3e9fd1	meson: drop duplicate `lib` prefix on libiris_gen* This has no real effect other than the names of the temporary files in the build folder. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-12-10 15:16:02 +00:00
Samuel Pitoiset	e4c8491bdf	radv: implement VK_KHR_separate_depth_stencil_layouts Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-10 13:16:17 +01:00
Samuel Pitoiset	48ee62178f	radv: initialize HTILE for separate depth/stencil aspects It either clears the whole HTILE buffer or part of it depending on the HTILE mask parameter. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-10 13:09:29 +01:00
Samuel Pitoiset	41cebfc9c1	radv: do not init HTILE as compressed state when dst layout allows it I don't think this makes much difference and a potential clear following the initialization will overwrite HTILE anyway. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-10 13:09:26 +01:00
Samuel Pitoiset	b603cc8c84	radv: synchronize after performing separate depth/stencil fast clears For depth+stencil images, the driver might use an optimized path if only one aspect is cleared. It either clears the depth or the stencil part of HTILE. Because the two separate aspects might use the same HTILE memory we have to synchronize. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-10 13:09:22 +01:00
Michel Dänzer	dadd609664	gitlab-ci: Don't exclude any piglit quick_shader tests Now that we're running these with process isolation enabled, their results will hopefully be stable. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-10 11:19:11 +00:00
Krzysztof Raszkowski	cfe00a52f0	gallivm: add TGSI bit arithmetic opcodes support Add TGSI_OPCODE_BFI, TGSI_OPCODE_POPC, TGSI_OPCODE_LSB, TGSI_OPCODE_IMSB, TGSI_OPCODE_UMSB, TGSI_OPCODE_IBFE, TGSI_OPCODE_UBFE, TGSI_OPCODE_BREV support. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2019-12-10 10:34:18 +00:00
Samuel Pitoiset	008fe909ca	radv: fix possibly wrong PA_SC_AA_CONFIG value for conservative rast PA_SC_AA_CONFIG might be updated when conservative rasterization is enabled. Because the driver only re-emits the multisample state if the number of samples is different, that register value might not be updated correctly. Found by inspection, doesn't fix anything known. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-10 11:04:43 +01:00
Samuel Pitoiset	4f659224c8	radv: move emission of two PA_SC_* registers to the pipeline CS They don't have to be updated dynamically. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-10 11:04:40 +01:00
Pierre-Eric Pelloux-Prayer	87f7ec8a2c	st/dri: use st->flush callback to flush the backbuffer Previously the flush was done before the call to st->flush but could lead to problems as FLUSH_VERTICES could push some work that would change the backbuffer (or modify it). With this commit, all the backbuffer flushing code is executed right before the call to st_flush. Closes: https://gitlab.freedesktop.org/drm/amd/issues/842 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205049 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-10 09:25:28 +01:00
Pierre-Eric Pelloux-Prayer	cc0d0afe3b	st/mesa: add a notify_before_flush callback param to flush The new callback is called right before the flush is done to allow users of st->flush to do some work after all the previous work has been flushed. This will be used by dri_flush in the next commit. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-10 09:25:28 +01:00
Pierre-Eric Pelloux-Prayer	f5c1cb2383	radeonsi: dcc dirty flag Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-10 09:25:28 +01:00
Pierre-Eric Pelloux-Prayer	e3e91cebcd	radeonsi: fix multi plane buffers creation When using 3 planes, the sequence produces this chain: plane0 -> plane2 This commit fixes this to produce: plane0 -> plane1 -> plane2 Fixes: `86e60bc265` ("radeonsi: remove si_vid_join_surfaces and use combined planar allocations") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2193 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-10 08:52:16 +01:00
Pierre-Eric Pelloux-Prayer	ff0f108666	radeonsi: use gfx9.surf_offset to compute texture offset Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2177 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-10 08:52:07 +01:00
Sonny Jiang	6c901f0675	radeonsi: use compute shader for clear 12-byte buffer Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-09 23:25:57 -05:00
Marek Olšák	38e9eb9561	st/mesa: release the draw shader properly to fix driver crashes (iris) Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-09 22:41:41 -05:00
Marek Olšák	41118246c6	draw, st/mesa: generate TGSI for ffvp/ARB_vp if draw lacks LLVM Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-09 21:09:28 -05:00
Marek Olšák	a3de63fbb3	st/mesa: don't generate VS TGSI if NIR is enabled it's no longer needed Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-09 21:09:28 -05:00
Marek Olšák	a90f4453fe	st/mesa: remove struct st_vp_variant in favor of st_common_variant Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-09 21:09:28 -05:00
Marek Olšák	6299b90fd4	st/mesa: remove st_vp_variant::num_inputs Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-09 21:09:28 -05:00
Marek Olšák	bc99b22a30	st/mesa: use a separate VS variant for the draw module instead of keeping the IR indefinitely in st_vp_variant. This trivially fixes Selection/Feedback/RasterPos for NIR. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-09 21:09:28 -05:00
Marek Olšák	17e8839a2f	st/mesa: support shader images for Selection/Feedback/RasterPos Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-09 21:09:28 -05:00
Marek Olšák	b7393f1115	st/mesa: support SSBOs for Selection/Feedback/RasterPos Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-09 21:09:28 -05:00
Marek Olšák	e91b044bd8	st/mesa: support samplers for Selection/Feedback/RasterPos Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-09 21:09:28 -05:00
Marek Olšák	2891c4b2e2	st/mesa: save currently bound vertex samplers and sampler views in st_context for st_draw_feedback.c Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-09 21:09:28 -05:00
Marek Olšák	226e7aee70	st/mesa: support UBOs for Selection/Feedback/RasterPos Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-12-09 21:09:28 -05:00
Marek Olšák	60db75cb77	gallivm: implement LOAD with CONSTBUF but don't enable it for llvmpipe This is already used in st_draw_feedback.c, because it uses shaders generated for drivers. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-09 21:09:28 -05:00
Marek Olšák	525c8b90c7	llvmpipe: implement TEX_LZ and TXF_LZ opcodes gallivm receives these opcodes anyway because st_draw_feedback.c uses shaders that were assembled for drivers, not llvmpipe. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-09 21:09:28 -05:00
Gurchetan Singh	3c8ddc8f4b	drirc: set allow_higher_compat_version for Faster Than Light With 781a78 ("mesa: enable ARB_direct_state_access in compat for GL3.1+"), it's possible to have DSA with GL3.1+. FTL creates a GL3.1 compat context, but fails the _mesa_has_geometry_shaders(..) check in frame_buffer_texture. Bump the compat version to pass the check. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-09 15:27:02 -08:00
Roland Scheidegger	23f1b78e8f	util/atomic: Fix p_atomic_add for unlocked and msvc paths Braces mismatch (flagged by CI, untested). Fixes: `385d13f26d` "util/atomic: Add a _return variant of p_atomic_add" Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-09 15:02:58 -08:00
Eric Anholt	0470a03769	freedreno: Track the set of UBOs to be uploaded in UBO analysis. We were iterating over the entire 32-entry array each time, when we can just use a bitset to know that we're only uploading from the first entry normally. Knocks ir3_emit_user_consts down from ~.5% of CPU to .1% on WebGL fishtank. Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-12-09 14:13:50 -08:00
Eric Anholt	10da0a9d18	freedreno: Stop forcing ALLOW_MAPPED_BUFFERS_DURING_EXEC off. The default is to not throw GL errors when drawing with mapped buffers, but we were forcing it on for unclear reasons. Internally we keep all our buffers mapped anyway, so it should be a no-op other than reducing CPU overhead (.23% in a perf report for WebGL fishtank) Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-12-09 14:13:47 -08:00
Rob Clark	dc791d3c68	freedreno/fdperf: use drmOpen() Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-09 13:09:58 -08:00
Alyssa Rosenzweig	a37822f5f7	gallium/util: Support POLYGON in u_stream_outputs_for_vertices u_decomposed_prims_for_vertices cannot support POLYGON, but POLYGON is trivial to support as a special case directly (since we have the number of vertices directly). Fixes aborts in Panfrost in apps using GL_POLYGON. Fixes: `e881aa8c12` ("gallium/util: Add u_stream_outputs_for_vertices helper") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-09 21:09:05 +00:00
Anuj Phogat	1a32fbd48c	intel: Add pci-ids for Jasper Lake Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-09 12:22:57 -08:00
Anuj Phogat	11fdd5f52c	intel: Add device info for 1x4x6 Jasper Lake Also removing the FIXME comments after matching the numbers with updated documentation. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-09 12:22:56 -08:00
Vasily Khoruzhick	9f5fa496cb	lima: expose tiled format modifier in query_dmabuf_modifiers() Fixes: `8c12f4e5f2` ("lima: enable tiling") Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-12-09 15:21:55 +00:00
Vasily Khoruzhick	01a451b04d	lima: handle DRM_FORMAT_MOD_INVALID in resource_from_handle() Assume that resource is tiled if we get DRM_FORMAT_MOD_INVALID in resource_from_handle() and we don't have RO. Fixes: `8c12f4e5f2` ("lima: enable tiling") Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-12-09 15:21:55 +00:00
Jonathan Marek	9d78cf4584	turnip: add hw binning Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-09 08:22:18 -05:00
Samuel Pitoiset	86dfe92bd0	radv: do not use VK_TRUE/VK_FALSE For consistency. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-09 09:21:26 +01:00
Dave Airlie	d7dc14628a	gallivm: add bitfield reverse and ufind_msb Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>	2019-12-09 06:05:02 +10:00
Roland Scheidegger	1c7693e3bd	gallium/scons: fix graw_gdi build Fixes: `44a6b0107b` (gallivm: add nir->llvm translation (v2)) Reviewed-by: Dave Airlie <Airlied@redhat.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-12-07 17:50:53 +01:00
Daniel Schürmann	8259c97b2d	aco: propagate temporaries into expanded vectors Gives a very slight decrease in code size: Totals from affected shaders: Code Size: 1708488 -> 1702768 (-0.33 %) bytes Max Waves: 2858 -> 2855 (-0.10 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	df3e674fb3	aco: improve readfirstlane after uniform ssbo loads on GFX7 pipeline-db changes for GFX7: 80310 shaders in 40472 tests Totals: SGPRS: 3655900 -> 3643916 (-0.33 %) VGPRS: 2678324 -> 2686324 (0.30 %) Spilled SGPRs: 1730 -> 1634 (-5.55 %) Spilled VGPRs: 14 -> 21 (50.00 %) Scratch size: 15540 -> 15536 (-0.03 %) dwords per thread Code Size: 136106120 -> 135457616 (-0.48 %) bytes LDS: 1259 -> 1259 (0.00 %) blocks Max Waves: 601014 -> 600206 (-0.13 %) Totals from affected shaders: SGPRS: 307832 -> 295848 (-3.89 %) VGPRS: 267864 -> 275864 (2.99 %) Spilled SGPRs: 770 -> 674 (-12.47 %) Spilled VGPRs: 14 -> 21 (50.00 %) Scratch size: 16 -> 12 (-25.00 %) dwords per thread Code Size: 22007488 -> 21358984 (-2.95 %) bytes LDS: 65 -> 65 (0.00 %) blocks Max Waves: 28668 -> 27860 (-2.82 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	0837471463	aco: use soffset for MUBUF instructions on SI/CI pipeline-db changes for GFX7: 80310 shaders in 40472 tests Totals: SGPRS: 3655300 -> 3655900 (0.02 %) VGPRS: 2677732 -> 2678324 (0.02 %) Spilled SGPRs: 1730 -> 1730 (0.00 %) Spilled VGPRs: 14 -> 14 (0.00 %) Scratch size: 15540 -> 15540 (0.00 %) dwords per thread Code Size: 136488364 -> 136106120 (-0.28 %) bytes LDS: 1259 -> 1259 (0.00 %) blocks Max Waves: 601039 -> 601014 (-0.00 %) Totals from affected shaders: SGPRS: 316312 -> 316912 (0.19 %) VGPRS: 273844 -> 274436 (0.22 %) Spilled SGPRs: 770 -> 770 (0.00 %) Spilled VGPRs: 14 -> 14 (0.00 %) Scratch size: 16 -> 16 (0.00 %) dwords per thread Code Size: 22724904 -> 22342660 (-1.68 %) bytes LDS: 114 -> 114 (0.00 %) blocks Max Waves: 30861 -> 30836 (-0.08 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	7b38d95b32	radv: Enable ACO on GFX7 (Sea Islands) This patch also disables AMD_shader_ballot on GFX7 by default if ACO is used. Note that shader_ballot works correctly, but performance seems inferior. To enable shader_ballot use RADV_PERFTEST=shader_ballot. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	28c95cc402	aco: return to loop_active mask at continue_or_break blocks Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	0f9447ccb0	radv: disable Youngblood app profile if ACO is used Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	746165e540	aco: implement exclusive scan for SI/CI Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	7ae227effd	aco: implement inclusive_scan for SI/CI Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	f895a8b1df	aco: implement (clustered) reductions for SI/CI Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	9254fb4fc7	aco: don't use a scalar temporary for reductions on GFX10 This patch also adds the scalar temporary for scans on SI/CI Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	8ad43d8838	aco: flush denorms after fmin/fmax on pre-GFX9 Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	21f67a3bdc	radv: only flush scalar cache for SSBO writes with ACO on GFX8+ Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	79ce6c1b33	aco: disable disassembly for SI/CI due to lack of support by LLVM Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	1c4afe38f2	aco: implement 64bit ine/ieq for SI/CI Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	1e1356b2ad	aco: implement 64bit i2b for SI/CI Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	da7ff58835	aco: make 1/2*PI a literal constant on SI/CI Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	90fad7360d	aco: implement 64bit VGPR shifts for SI/CI Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	6a586a6006	aco: split read/writelane opcode into VOP2/VOP3 version for SI/CI Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	23319add93	aco: fix disassembly of writelane instructions. ACO writes an unused 3rd operand for internal usage which makes LLVM recognize it as an illegal instruction. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	6fc9ddfef8	aco: recognize SI/CI SMRD hazards Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	3eed4d2be5	aco: implement quad swizzles for SI/CI Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	bde9c1e3a1	aco: move buffer_store data to VGPR if needed Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	a8195bdf2e	aco: implement nir_op_isign on SI/CI Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	b8783973cd	aco: only use scalar loads for readonly buffers on SI/CI Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	f27783a667	aco: implement nir_op_fquantize2f16 for SI/CI Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	caea4bbfdc	aco: fix SMEM offsets for SI/CI Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	8aab92b393	aco: SI/CI - fix sampler aniso Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Dave Airlie	9b533a2ca3	aco: handle gfx7 int8/10 clamping on exports Co-authored-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	0d42e4d7a0	aco: Initial GFX7 Support Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	3177346bfc	aco: refactor visit_store_fs_output() to use the Builder Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Jason Ekstrand	0f60aa4037	anv: Re-emit all compute state on pipeline switch It's a very odd case to hit in the real world. However, there are some CTS tests which switch back and forth between dispatch and clear without changing the pipeline. Fixes: `bc612536eb` "anv: Emit a dummy MEDIA_VFE_STATE before switching..." Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-07 04:03:35 +00:00
Jason Ekstrand	bce1c3c668	anv: Re-capture all batch and state buffers When we moved from allocating BOs directly to using the BO cache, we lost the EXEC_OBJECT_CAPTURE flag on all our state buffers. Fixes: `3119b96bdf` "anv: Allocate block pool BOs from the cache" Fixes: `ee77938733` "anv: Allocate batch and fence buffers from..." Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-07 04:03:35 +00:00
Jason Ekstrand	865ffe4e02	anv: Return VK_ERROR_OUT_OF_DEVICE_MEMORY for too-large buffers Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-06 22:32:05 +00:00
Eric Anholt	e3b249f166	freedreno: Enable texture upload memory throttling. Fixes oom-killer during streaming-texture-upload, which I found while trying to enable piglit in CI. Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-12-06 14:03:50 -08:00
Fritz Koenig	c496d44284	freedreno: reorder format check With the addition of the planar formats helper, the planar formats no longer have a valid block.bits field. Calling util_format_get_blocksize therefore asserts. Reorder the check to see if the format is supported before doing the query to get the blocksize. Fixes: `20f132e5ef` ("gallium/util: add planar format layouts and helpers") Signed-off-by: Fritz Koenig <frkoenig@google.com> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-12-06 21:27:10 +00:00
Nanley Chery	21376cffb3	iris: Fix import of multi-planar surfaces with modifiers Multi-planar surfaces are allowed to have modifiers. Don't require DRM_FORMAT_MOD_INVALID in order to create a surface for each plane defined by the format. Fixes: `246eebba4a` ("iris: Export and import surfaces with modifiers that have aux data") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-06 20:31:48 +00:00
Nanley Chery	51ee8fff9b	gallium: Store the image format in winsys_handle This format will be used to properly handle planar images with modifiers in iris. Fixes: `246eebba4a` ("iris: Export and import surfaces with modifiers that have aux data") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-06 20:31:48 +00:00
Nanley Chery	d5c857837a	gallium/dri2: Fix creation of multi-planar modifier images The commit noted below assumed and enforced that DRM_MOD_INVALID was the only valid modifier for multi-planar imported images. Due to that, it required that modifier on multi-planar images to: 1. Allow multiple planes. 2. Perform YUV format lowering and extent adjustments. 3. Use buffer_index to correctly map the given planes. Fix these issues by removing or updating the code built on that assumption. Fixes: `2066966c10` ("gallium/dri2: Support creating multi-planar modifier images") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-06 20:31:48 +00:00
Kenneth Graunke	ab016a6a2d	meson: Include iris in default gallium-drivers for x86/x86_64 We build i965 by default on x86/x86_64 platforms; let's build iris too. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-12-06 12:27:26 -08:00
Jason Ekstrand	f9a3d9738b	anv: Use BO fences/semaphores for AcquireNextImage Instead of doing a dummy submit on the command buffer for the fence or a dummy semaphore and trusting in implicit sync, this commit moves us to take advantage of implicit sync and just use the WSI image BO as the fence. Both semaphores and fences require a tiny bit of extra plumbing to do this but the result is that we can get rid of a bunch of the extra synchronization we're doing today. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-06 19:58:07 +00:00
Jason Ekstrand	ecc119a96e	anv: Add a fence_reset_reset_temporary helper Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-06 19:58:07 +00:00
Jason Ekstrand	ccb7d606f1	anv: Use submit-time implicit sync instead of allocate-time In `83b943cc2f`, we started making all VkDeviceMemory BOs resident all the time. One unfortunate side-effect of this is that every vkQueueSubmit sets EXEC_OBJECT_WRITE on every WSI memory object which means that the X server or Wayland compositor, instead of waiting on the last vkQueueSubmit to actually write the buffer, now waits on the last vkQueueSubmit from that driver instance relative to whenever the compositor's GL driver instance calls execbuf. This potentially leads to a lot of extra synchronization that we didn't intend to have. Instead, this commit makes it so that we leave WSI memory objects with EXEC_OBJECT_ASYNC most of the time and only unset EXEC_OBJECT_ASYNC and set EXEC_OBJECT_WRITE in the dummy execbuf that we do as part of vkQueuePresent. This should hopefully result in tighter integration with the compositor, lower latency, and better performance. Testing with DOOM 2016, this seems to reduce latency by at least a frame if not two and makes the game much more responsive. Testing was, however, subjective, so we don't have any hard data on that. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-06 19:58:07 +00:00
Jason Ekstrand	6ebf677cfd	anv: Always add in EXEC_OBJECT_WRITE when specified in extra_flags Otherwise, we're trusting in the execbuf_add_bo which sets EXEC_OBJECT_WRITE to always be the first one that gets called. This is likely true for fences but it seems somewhat fragile. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-06 19:58:07 +00:00
Jason Ekstrand	778b51f491	vulkan/wsi: Add hooks for signaling semaphores and fences Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-06 19:58:07 +00:00
Jason Ekstrand	48e23a6406	vulkan/wsi: Provide the implicitly synchronized BO to vkQueueSubmit This lets us treat the implicit synchronization that we need for X11 and Wayland like a semaphore. Instead of trusting the driver to somehow figure out when that memory object needs to be signaled, we provide an explicit point where the driver can set EXEC_OBJECT_WRITE and signal the dma_fence on the BO. Without this, we have to somehow track inside the driver when WSI buffers are actually used to avoid extra synchronization dependencies. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-06 19:58:06 +00:00
Urja Rannikko	d07ed0c9c9	panfrost: free spill cost table in mir_spill_register Signed-off-by: Urja Rannikko <urjaman@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-06 15:26:13 +00:00
Urja Rannikko	12e393bacf	panfrost: add lcra_free() to free lcra state Signed-off-by: Urja Rannikko <urjaman@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-06 15:26:13 +00:00
Urja Rannikko	5b6108834b	panfrost: free allocations in schedule_block Signed-off-by: Urja Rannikko <urjaman@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-06 15:26:13 +00:00
Urja Rannikko	e2dbea683c	panfrost: free last_read/write tables in mir_create_dependency_graph Signed-off-by: Urja Rannikko <urjaman@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-06 15:26:13 +00:00
Alyssa Rosenzweig	adf716dc7f	panfrost: Rename SET_VALUE to WRITE_VALUE See https://lists.freedesktop.org/archives/dri-devel/2019-December/247601.html Write value emphasises that it's just a generic write primitive. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-06 14:37:17 +00:00
Alyssa Rosenzweig	9eae950342	panfrost: Update SET_VALUE with information from igt It's not a tiler specific initialization; it's a generic GPU-side write primitive that may be used for tiler reset on midgard. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-06 14:37:17 +00:00
Samuel Pitoiset	c1a362722f	gitlab-ci: add a job that runs Vulkan CTS with RADV conditionally Only Polaris10 is tested at the moment, and I disabled a TON of tests to keep a CTS run within 5 minutes because my local runner is a bit slow. A full CTS run takes more than 1h, which means it will hit the timeout. RADV CI can only be triggered manually on personal branches to avoid breaking the world because one runner is definitely not enough. This will allow us to test it until it's stable enough to be enabled by default. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-12-06 10:58:03 +01:00
Samuel Pitoiset	40c6a56751	gitlab-ci: build RADV in meson-testing This requires bumping LLVM to 8 because it's the minimum version supported by RADV. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-12-06 10:58:00 +01:00
Samuel Pitoiset	f32bf4f1e2	gitlab-ci: configure the Vulkan ICD export with VK_DRIVER Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-12-06 10:57:57 +01:00
Samuel Pitoiset	16b999b7d1	gitlab-ci: allow to run dEQP Vulkan with DEQP_VER Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-12-06 10:57:55 +01:00
Samuel Pitoiset	0b246d3558	gitlab-ci: add a new base test job for VK Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-12-06 10:57:54 +01:00
Samuel Pitoiset	35a7ec79db	gitlab-ci: build dEQP VK 1.1.6 in the x86 test image for VK Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-12-06 10:57:52 +01:00
Samuel Pitoiset	4bbb1d3b06	gitlab-ci: build cts_runner in the x86 test image for VK Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-12-06 10:57:50 +01:00
Samuel Pitoiset	f2a594f384	gitlab-ci: add a new job that builds a base test image for VK Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-12-06 10:57:48 +01:00
Samuel Pitoiset	520a77d486	gitlab-ci: add a gl suffix to the x86 test image and all test jobs Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-12-06 10:57:46 +01:00
Samuel Pitoiset	7e0ab6aae0	gitlab-ci: rename build-deqp.sh to build-deqp-gl.sh Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-12-06 10:57:45 +01:00
Michel Dänzer	41797a1fed	gitlab-ci: Overhaul job run policy Use new rules: instead of only: For container stage jobs: * In the main Mesa project, run them by default. * In merge requests, run them by default if any files affecting pipeline results are changed. * In all other cases (in particular branches in personal projects), don't run them by default but allow triggering them manually. build & test stage jobs are left at the default (when: on_success), so they will run automatically once all their dependencies are satisfied. (Using the same rules as above would require these jobs to be manually triggered as well, which is only possible once all dependency jobs have passed) Please be considerate of CI runner resources and cancel unneeded jobs on personal branches with no corresponding merge requests (this can be done before the jobs start running). In summary: No more special branch names. Unnecessary job runs are avoided by default, but jobs which don't run by default can be triggered manually. v2: * Split out LAVA changes to separate commit * Clarify commit log a little, in particular WRT build/test stage jobs Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> # v1 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> # v1 Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> # v1 Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-12-06 10:02:01 +01:00
Michel Dänzer	ebd1309fef	gitlab-ci: Use the common run policy for LAVA jobs as well again Having different policies could have some weird results, e.g. changes only touching documentation (where the intention is not to run the pipeline by default) would still create a pipeline with the LAVA jobs running by default. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-12-06 09:39:40 +01:00
Jonathan Marek	0796e7e70d	turnip: implement border color Fixes the deqp fails in: dEQP-VK.pipeline.sampler.border (minus 1d array/d24 cases which fail for other reasons) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-05 22:12:30 -05:00
Jonathan Marek	095d35eff8	turnip: improve emit_textures Two things: * Texture/sampler pointers aligned to the size of texture/sampler state * Returning errors instead of crashing on OOM Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-05 22:12:30 -05:00
Jonathan Marek	3ab4f99461	turnip: add function to allocate aligned memory in a substream cs To use with texture states that need alignment (texconst, sampler, border) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-05 22:12:29 -05:00
Timothy Arceri	1abca2b3c8	glsl/nir: iterate the system values list when adding varyings Iterate the system values list when adding varyings to the program resource list in the NIR linker. This is needed to avoid CTS regressions when using the NIR to build the GLSL resource list in an upcoming series. Presumably it also fixes a bug with the current ARB_gl_spirv support. Fixes: `ffdb44d3a0` ("nir/linker: Add inputs/outputs to the program resource list") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-12-05 22:04:31 +00:00
Dave Airlie	201ed4b4e7	llvmpipe: enable support for primitives generated outside streamout This enables the draw support when the queries are enabled. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-06 06:48:30 +10:00
Dave Airlie	5f8af9731e	draw: add support for collecting primitives generated outside streamout GL/gallium require gathering primitives generated outside streamout stats. This introduces the draw interfaces to enabling collecting this. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-06 06:48:30 +10:00
Dave Airlie	f137672197	llvmpipe: disable occlusion queries when requested by state tracker Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-06 06:48:30 +10:00
Dave Airlie	3b8e1b3ee4	llvmpipe: add queries disabled flag This flag is set when the state tracker request queries be disabled for meta operations. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-12-06 06:48:30 +10:00
Kenneth Graunke	ef893db468	main: Change u_mmAllocMem align2 from bytes (old API) to bits (new API) The main and Gallium implementations were recently merged, and the align2 parameter in the Gallium one is in bits. execmem.c expected bytes still. This led to every call here asserting. Fixes: b6fd679a9e ("mesa/main/util: moving gallium u_mm to util, remove main/mm") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2019-12-05 21:07:09 +01:00
Eric Anholt	3097efe5f0	ci: Disable egl_ext_device_drm tests in piglit. If the runner has a HW device that would be supported, even without /dev/dri forwarded into the container, it will be enumerated and the tests on llvmpipe fail with (for example): libEGL warning: Not allowed to force software rendering when API explicitly selects a hardware device. libEGL warning: MESA-LOADER: failed to open i965 (search paths /builds/anholt/mesa/install/lib/dri) Given that we can't necessarily control the DRI devices present on the runners (particularly for developers bringing their own runners to reduce the demands on fd.o's shared resources), just skip these tests in CI. Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-12-05 18:06:10 +00:00
Jason Ekstrand	752196a493	util/atomic: Add p_atomic_add_return for the unlocked path Fixes: `385d13f26d` "util/atomic: Add a _return variant of p_atomic_add" Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-12-05 11:55:21 -06:00
Jason Ekstrand	1b6991ba1d	anv: Implement VK_KHR_buffer_device_address The primary difference between the KHR and EXT versions of the extension is that the KHR provides the address at AllocateMemory time for replay so we can replay it safely without moving to a sparse address model. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	4428cd9127	anv: Use a pNext loop in AllocateMemory This function has a lot of possible extensions and some of them we can easily handle on-the-fly so it's easier to just have a loop than to find each structure manually. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	a8e59b3708	anv: Add allocator support for client-visible addresses When a BO is flagged as having a client visible address, we put it in its own heap. We also support the client explicitly specifying an address in said heap. If an address collision happens, we return false from anv_vma_alloc which turns into a VK_ERROR_OUT_OF_DEVICE_MEMORY. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	96e3328ac2	util/vma: Add a function to allocate a particular address range This new function lets you request to remove a specific address range from the allocator. It returns true on success and leaves the allocator unmodified and returns false on failure. It doesn't need to return an offset because, if it succeeds, the offset passed in is the allocated offset. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
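The fixed-range allocation described in `96e3328ac2` can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `HoleHeap` and the method `alloc_addr` are hypothetical names, the free space is modeled as a plain list of holes, and none of this is Mesa's actual `util_vma_heap` code. It shows the contract from the commit message: succeed and carve the range out of a hole, or fail and leave the allocator unmodified.

```python
class HoleHeap:
    """Toy address-range allocator (hypothetical; not Mesa's util_vma_heap)."""

    def __init__(self, start, size):
        # Free space is tracked as a list of (start, size) holes.
        self.holes = [(start, size)]

    def alloc_addr(self, addr, size):
        """Try to claim exactly [addr, addr + size).

        Returns True on success. Returns False and leaves the hole list
        untouched if the range is not entirely inside one free hole.
        """
        end = addr + size
        for i, (hole_start, hole_size) in enumerate(self.holes):
            hole_end = hole_start + hole_size
            if hole_start <= addr and end <= hole_end:
                # Split the hole into the free pieces before and after the range.
                pieces = []
                if addr > hole_start:
                    pieces.append((hole_start, addr - hole_start))
                if end < hole_end:
                    pieces.append((end, hole_end - end))
                self.holes[i:i + 1] = pieces
                return True
        return False


heap = HoleHeap(0, 100)
first = heap.alloc_addr(10, 20)      # fits inside the single hole
holes_after_first = list(heap.holes)
second = heap.alloc_addr(25, 10)     # overlaps the range claimed above
holes_after_second = list(heap.holes)
```

No offset is returned because, as the commit message notes, a successful call means the requested address is the allocated address.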
Jason Ekstrand	782fb5407d	util/vma: Factor out the hole splitting part of util_vma_heap_alloc Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	03450e9cfc	anv: Add an explicit_address parameter to anv_device_alloc_bo We already have a mechanism for specifying that we want a fixed address provided by the driver internals. We're about to let the client start specifying addresses in some very special scenarios as well so we want to pass this through to the allocation function. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	597fdb9e21	anv: Stop advertising two heaps just for the VF cache WA Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	b47bc0202a	anv: Set up VMA heaps independently from memory heaps Our VMA allocations are really independent from the memory heaps we expose via the API. The only thing that really matters is the GTT size so we can make the high heap the right size. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	1037b52cf4	anv: Stop tracking VMA allocations util_vma_heap_alloc will already return 0 if it doesn't have enough space. The only thing the vma_*_available tracking was doing was preventing us from allocating too much on any given heap. Now that we're tracking that in the heap itself, we can drop these. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	a4e3d8f0db	anv: Disallow allocating above heap sizes We're already tracking the amount of memory used in each heap. This commit just makes us start rejecting memory allocations if the heap would grow too large. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	385d13f26d	util/atomic: Add a _return variant of p_atomic_add Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	0a36fafa95	anv: Don't leak when set_tiling fails Fixes: `a44744e01d` "anv: Require a dedicated allocation for..." Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	46af0ecc1d	anv: Use PIPE_CONTROL flushes to implement the gen8 VF cache WA Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	1b5cb92b62	anv: Apply cache flushes after setting index/draw VBs Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	7ce39a55c1	anv: Always invalidate the VF cache in BeginCommandBuffer I think the reason why we only do this for primaries is that we didn't expect to have blorp calls in secondaries. However, you are allowed to have a full render pass in a secondary command buffer so resolves and clears can end up in there. We should just always invalidate. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	a500a6b7f1	blorp: Pass the VB size to the VF cache workaround Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	c142a40a92	anv: Add a has_softpin boolean This separates "has" from "use" which will make the next commit a bit cleaner. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:59:10 -06:00
Jason Ekstrand	0bba88081b	anv: Drop bo_flags from anv_bo_pool In `ee77938733`, we started using the BO cache for anv_bo_pool and stopped using the bo_flags parameter. However, we never dropped it from the struct or the init function. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 10:58:14 -06:00
Michel Dänzer	f6a913bb95	glsl/tests: Use splitlines() instead of strip() strip() removes leading and trailing newlines, but leaves newlines between multiple lines in the string. This could cause failures when comparing the output of cross-compiled Windows binaries (producing Windows-style newlines) to the expected output with Unix-style newlines. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-12-05 12:31:17 +01:00
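The difference that `f6a913bb95` relies on can be seen in a few lines. This is a minimal sketch with made-up sample strings (`windows_output` and `expected` are hypothetical, not taken from the glsl test suite):

```python
# Hypothetical outputs: the same two lines with Windows and Unix newlines.
windows_output = "line one\r\nline two\r\n"
expected = "line one\nline two\n"

# strip() only trims the ends, so the inner "\r\n" survives and the
# two strings still differ.
stripped_equal = windows_output.strip() == expected.strip()

# splitlines() splits on either newline style and drops the terminators,
# so both sides yield the same list of lines.
split_equal = windows_output.splitlines() == expected.splitlines()
lines = windows_output.splitlines()
```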
Mauro Rossi	96aef08dc6	android: radeonsi: fix build after vl refactoring (v2) vl functions moved from radeonsi to gallium/auxiliary/vl have left android build of radeonsi in broken state. libmesa_galliumvl static is needed to build radeonsi, gallium_dri building rules are reworked to avoid multiple symbols and libmesa_galliumvl static dependency is needed in radeonsi. Here is the changelog: - android: gallium/auxiliary: add libmesa_galliumvl static - android: gallium_dri: move libmesa_gallium to static to prevent multiple symbols - android: radeonsi: fix build after vl refactoring Fixes the following building error: external/mesa/src/gallium/drivers/radeonsi/si_uvd.c:47: error: undefined reference to 'vl_video_buffer_create_as_resource' clang.real: error: linker command failed with exit code 1 (use -v to see invocation) Fixes: `86e60bc` ("radeonsi: remove si_vid_join_surfaces and use combined planar allocations") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-05 08:08:23 +00:00
Tapani Pälli	32ebd4207a	intel/compiler: force simd8 when dual src blending on gen8 Patch introduces option to force simd8 and uses it as a workaround for dual source blending issues seen with skqp (skia testsuite) on gen8. Fixes following Piglit test on gen8 platforms: arb_blend_func_extended-dual-src-blending-issue-1917 Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1917 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 09:42:50 +02:00
Tapani Pälli	f6004bac1f	intel/compiler: add newline to limit_dispatch_width message Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-05 08:13:58 +02:00
Eric Anholt	c3efeac4c6	turnip: Add support for compute shaders. Since compute shares the FS state with graphics, we have to re-upload the pipeline state when switching between compute dispatch and graphics draws. We could potentially expose graphics and compute as separate queues and then we wouldn't need pipeline state management, but the closed driver exposes a single queue and consistency with them is probably good. So far I'm emitting texture/ibo state as IBs that we jump to. This is kind of silly when we could just emit it directly in our CS, but that's a refactor we can do later. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	ccf8230547	turnip: Move pipeline BO list adding to BindPipeline. We only need to do it once when we bind, rather than having to check at every draw call. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	e26962f756	turnip: Sanity check that we're adding valid BOs to the list. I tripped over this during CS enabling when my program BO wasn't set up. Easier to debug this way than the kernel telling us a 0 handle is invalid. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	4365e955d8	turnip: Add a helper function for getting tu_buffer iovas. Easier than remembering to add all 3 offsets. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	70d6428be5	turnip: Refactor the graphics pipeline create implementation. The loop over the pipelines to create (and the failure handling) was noisy, and the stub for compute setup looked nicer to me. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	e46da7dbea	turnip: Add basic SSBO support. This is enough to pass dEQP-VK.binding_model.shader_access.primary_cmd_buf.storage_buffer.fragment.single_descriptor.* with fragmentStoresAndAtomics set, and thus to be able to start working on compute. I haven't enabled that flag yet, because it also implies image load/store support, which I haven't filled in. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	1f4e8f3c46	turnip: Reuse tu6_stage2opcode() more. A bit of cleanup for adding more stages later. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	5b23671f6a	turnip: Drop redefinition of VALIDREG now that it's in ir3.h. Fixes: `937b905569` ("freedreno/ir3: fix neverball assert in case of unused VS inputs") Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Eric Anholt	bb49f19c1b	turnip: Fix unused variable warnings. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-12-04 20:32:15 -08:00
Timothy Arceri	1b1b436fa7	glsl: make use of active_shader_mask when building resource list This allows us to avoid walking the entire IR looking for used uniforms. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-12-05 13:18:30 +11:00
Timothy Arceri	f0cb0fe1c0	glsl: don't set uniform block as used when it's not The spec requires unused uniform blocks to be set as active in the program resource list. To support this we tell opt dead code not to remove them. However we can mark them as unused internally and avoid unnecessary state changes. This change is also required for the following clean-up patch. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-12-05 13:18:23 +11:00
Timothy Arceri	50dc4b77f6	glsl: move calculate_array_size_and_stride() to link_uniforms.cpp This is where all the other uniform values are populated so it makes much more sense here. Moving it will also allow us to better share code between the NIR and GLSL IR resource list builders. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-12-05 13:18:02 +11:00
Ian Romanick	c9acf0739f	anv: Fix error message format string See also `246261f0ad` Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> CID: 1455892 Fixes: `246261f0ad` ("anv: prepare the driver for delayed submissions")	2019-12-04 15:34:03 -08:00
Ian Romanick	7840985609	mesa: Silence unused parameter warning Unused since `e4da8b9c33` ("mesa/compiler: rework tear down of builtin/types"). src/mesa/main/context.c: In function ‘_mesa_free_context_data’: src/mesa/main/context.c:1321:54: warning: unused parameter ‘destroy_compiler_types’ [-Wunused-parameter] 1321 \| _mesa_free_context_data(struct gl_context *ctx, bool destroy_compiler_types) \| ^ Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-12-04 15:34:03 -08:00
Ian Romanick	a7e607641a	mesa: Silence 'left shift of negative value' warning in BPTC compression code src/util/format/../../mesa/main/texcompress_bptc_tmp.h:830:31: warning: left shift of negative value [-Wshift-negative-value] 830 \| value \|= (~(int32_t) 0) << n_bits; \| ^~ v2: Rewrite to just shift left then shift right. Based on conversation with Neil in https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2792#note_320272, this should be fine. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> [v1] Reviewed-by: Neil Roberts <nroberts@igalia.com>	2019-12-04 15:34:03 -08:00
Ian Romanick	668635abd2	intel/compiler: Fix 'comparison is always true' warning Without looking at the assembly or something, I'm not sure what the compiler does here. The brw_reg_type enum is marked packed, so I'm guessing that it gets represented as a uint8_t. That's the only reason I could think that comparing with -1 would be always true. This patch adds the same cast that exists in brw_hw_type_to_reg_type. It might be better to add a #define outside the enum for BRW_REGISTER_TYPE_INVALID as (enum brw_reg_type)-1. src/intel/compiler/brw_eu_compact.c: In function ‘has_immediate’: src/intel/compiler/brw_eu_compact.c:1515:20: warning: comparison is always true due to limited range of data type [-Wtype-limits] 1515 \| return type != -1; \| ^~ src/intel/compiler/brw_eu_compact.c:1518:20: warning: comparison is always true due to limited range of data type [-Wtype-limits] 1518 \| return type != -1; \| ^~ Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> CID: 1455194 Fixes: `12d3b11908` ("intel/compiler: Add instruction compaction support on Gen12") Cc: @mattst88	2019-12-04 15:34:03 -08:00
Dylan Baker	5b3d6979a6	docs: Update mesa 19.3 release calendar	2019-12-04 14:42:41 -08:00
Dylan Baker	953d20e6f5	docs: update calendar, add news item and link release notes for 19.2.7	2019-12-04 14:42:41 -08:00
Dylan Baker	bd518aa208	docs: Add SHA256 sums for 19.2.7	2019-12-04 14:42:41 -08:00
Dylan Baker	26aa024cdf	docs: Add release notes for 19.2.7	2019-12-04 14:42:41 -08:00
Jonathan Marek	ec28714b78	turnip: allow writes to draw_cs outside of render pass This is for state commands like CmdSetViewport that can be used outside of a renderpass. Accumulating those into draw_cs outside of the renderpass should have the desired effect. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 17:35:18 -05:00
Rob Clark	372ed42d22	nir/lower_clip: Fix incorrect driver loc for clipdist outputs Somehow adjusting maxloc based on existing outputs got lost, resulting in the clipdist varying clobbering the position varying. Causing a shader that had no position output in freedreno/ir3, which triggers GPU hangs in neverball. Fixes: `d0f746b645` ("nir: Save nir_variable pointers in nir_lower_clip_vs rather than locs.") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-04 13:08:52 -08:00
Rob Clark	937b905569	freedreno/ir3: fix neverball assert in case of unused VS inputs The logic to ensure VS and BS inputs are aligned wasn't accounting for unused inputs in VS. This usually doesn't happen, but it seems it can in the case of ARB programs? Fixes assert: ``` fd6_program_create: Assertion `bs->inputs[i].regid == vs->inputs[i].regid' failed. ``` Fixes: `882d53d8e3` ("freedreno/ir3+a6xx: same VBO state for draw/binning") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-04 13:08:52 -08:00
Rob Clark	4e47c205b9	freedreno/ir3: remove store_output lowered to store_shared_ir3 Fixes crashes that were unnoticed in CI because debug_assert() was not enabled (but become real crashes after the next patch): dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_highp_geometry dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_lowp_geometry dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.ivec2_mediump_geometry dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_highp_geometry dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_lowp_geometry dEQP-GLES31.functional.shaders.builtin_functions.integer.bitfieldextract.uvec2_mediump_geometry Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-12-04 13:08:52 -08:00
Rafael Antognolli	50f60d69e4	iris: Add restriction to 3DSTATE_CONSTANT_ packets. The following programming note shows up in all 3DSTATE_CONSTANT_* packets: "The sum of all four read length fields must be less than or equal to the size of 64." The backend compiler should guarantee this for us, so let's just add a check here. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	d3e339364f	anv: Use 3DSTATE_CONSTANT_ALL when possible. Use this new instruction introduced in Gen12. The instruction itself is smaller, and it also allows us to emit a single instruction to all stages that have the same push constant buffers (e.g. when they don't have constant buffers). There's one restriction to use this instruction, though: the length field is only 5 bits long, so we need to check whether we can use it, and fallback to the old 3DSTATE_CONSTANT_XS if that field is >= 32. v2: - Rebased on top of the latest changes from Jason. - Added review suggestions by Caio. - Removed struct push_bos and merged some code into anv_nir_compute_push_layout(). v3: - Remove code churn due to gen8+ workaround in anv_nir_compute_push_layout(). This code has been removed in an earlier commit, and implemented in cmd_buffer_emit_push_constant(). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	7d5da53d27	anv: Move code for emitting push constants into its own function. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	67d2cb3e93	anv: Add get_push_range_address() helper. Add a helper function to get the push range address. Once we have a separate function for emitting gen12 push constants, we can use this helper and avoid duplicating code. v3: Do not add range->start to the address in gen7 (Caio). v4: Do not drop range->start from gen7 (Caio, Jason). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	c0225a728e	anv: Move gen8+ push constant packet workaround. Store push_ranges in ascending order, and only "shift" them to the end of the array during state packet emission. We don't need this workaround with the new 3DSTATE_CONSTANT_ALL packet. So instead of applying the workaround here just for GEN < 12 (which requires an extra loop through all the ranges to figure out if we should shift them or not), we simply move the whole logic to the state emission code. At that point, in a later commit, we are already looping through all of the ranges anyway to check which packet we will be using, so we might as well implement the workaround there, where it is going to be used. v3: Move gen8+ workaround to the state emission code (Caio). v4: Add explanation of why we moved the workaround (Caio). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	06438ea7fa	iris: Use 3DSTATE_CONSTANT_ALL when possible. Use this new instruction introduced in Gen12. The instruction itself is smaller, and it also allows us to emit a single instruction to all stages that have the same push constant buffers (e.g. when they don't have constant buffers). There's one restriction to use this instruction, though: the length field is only 5 bits long, so we need to check whether we can use it, and fallback to the old 3DSTATE_CONSTANT_XS if that field is >= 32. v2 (Suggestions from Caio): - use max_length instead of large_buffers. - remove UNUSED and use #if GEN_GEN >= 12 instead. - inline "buffers" and drop BITSET_RANGE() usage. - add assert(n <= max_pointers) - move emit to outside of the loop. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
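The packet choice described in `06438ea7fa` comes down to whether the largest read length fits in a 5-bit field. The following is a rough sketch only: `choose_packet` and `CONSTANT_ALL_LENGTH_BITS` are hypothetical names, and the real driver logic in iris is C code that does considerably more than this.

```python
# Width of the length field in 3DSTATE_CONSTANT_ALL, per the commit message.
CONSTANT_ALL_LENGTH_BITS = 5


def choose_packet(read_lengths):
    """Pick a packet name from the push constant read lengths (hypothetical helper).

    A 5-bit field holds values 0..31, so any length >= 32 forces the
    fallback to the older per-stage 3DSTATE_CONSTANT_XS packet.
    """
    max_length = max(read_lengths, default=0)
    if max_length < (1 << CONSTANT_ALL_LENGTH_BITS):
        return "3DSTATE_CONSTANT_ALL"
    return "3DSTATE_CONSTANT_XS"


small = choose_packet([8, 4, 0, 0])   # every length fits in 5 bits
large = choose_packet([32, 0, 0, 0])  # 32 needs a sixth bit
```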
Rafael Antognolli	1ba9a18911	iris: Rework push constants emitting code. Split into a function the logic to gather the push constant buffers, which now stores them in struct push_bos. Another function is added to emit the packet, using data from the push_bos struct. This will be useful when adding a new function for emitting push constants for newer platforms. v2 (Suggestions from Caio): - rename 'n' -> 'buffer_count' - remove large_buffers (for now) - initialize push_bos - remove assert - change for() condition (i <= 3 -> i < 4) v3: - Add comment about size limit. - Rework "shift" logic and 'for' loop. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	9db044792f	intel/blorp: Use 3DSTATE_CONSTANT_ALL to setup push constants. In blorp, all the push constants are disabled, so we only need to emit a single 3DSTATE_CONSTANT_ALL with the bitmask for stage update appropriately set. v2: Update comment (Caio). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	8983622995	intel/aubinator: Decode 3DSTATE_CONSTANT_ALL. Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-04 20:48:25 +00:00
Rafael Antognolli	2d127614a2	intel/genxml: Add 3DSTATE_CONSTANT_ALL packet. Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-12-04 20:48:25 +00:00
Jonathan Marek	1576ff5fbb	turnip: MSAA resolve directly from GMEM Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 14:39:06 -05:00
Jonathan Marek	abaaf0b2e7	turnip: don't set unused BLIT_DST_INFO bits for GMEM clear These bits are ignored when clearing so don't bother setting them. Note: MSAA samples when clearing comes from other registers (tu6_emit_msaa) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 14:39:06 -05:00
Jonathan Marek	4babdc7381	turnip: implement CmdClearAttachments Passes these deqp tests: dEQP-VK.api.image_clearing.core.attachsingle* Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 14:39:06 -05:00
Jonathan Marek	1dfa2e6c99	turnip: don't skip unused attachments when setting up tiling config This makes it easier to find the gmem_offset associated with an attachment. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 14:39:06 -05:00
Vasily Khoruzhick	8c12f4e5f2	lima: enable tiling Now that we have tiled format modifier merged into linux we can enable tiling. That should improve overall performance and also workaround broken mipmapping for linear textures since now we prefer tiled textures. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-12-04 08:20:56 -08:00
Tapani Pälli	272ef5d39a	glsl: additional interface redeclaration check for SSO programs Patch adds additional linker check for SSO programs to make sure they are redeclaring built-in blocks as required by the desktop spec. This fixes following Piglit tests: arb_separate_shader_objects/linker/pervertex-* Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-12-04 15:27:41 +00:00
Tapani Pälli	2d26cc077d	gitlab-ci: bump piglit checkout commit Commit also updates the Piglit quick_gl.txt, list modifications happened due to following Piglit commits: c248bf201,c acff58ca, 5603e2e60. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-12-04 15:27:41 +00:00
Rhys Perry	3e67aa2e4e	nir/load_store_vectorize: fix combining stores with aliasing loads between v2: add test Fixes: `ce9205c03b` ('nir: add a load/store vectorization pass') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v1) Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v2)	2019-12-04 12:21:40 +00:00
Timur Kristóf	637c5a1dd9	aco/wave32: Fix reductions. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	21db083504	aco/wave32: Allow setting the subgroup ballot size to 64-bit. Previously, it would only work when the ballot size was set to the lane mask. This patch makes it possible to set the ballot size to either 32-bit or 64-bit for wave32 mode. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	ed815d503e	aco/wave32: Use wave_size for barrier intrinsic. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	b8f2edb452	aco/wave32: Fix load_local_invocation_index to support wave32. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	e0bcefc3a0	aco/wave32: Use lane mask regclass for exec/vcc. Currently all usages of exec and vcc are hardcoded to use s2 regclass. This commit makes it possible to use s1 in wave32 mode and s2 in wave64 mode. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	b4efe179ed	aco/wave32: Add wave size specific opcodes to aco_builder. Several places in ACO we use SOP1 or SOP2 instructions to operate over the exec mask or VCC, and these need to be adapted to the new size in wave32 mode. This commit adds a way to deal with this problem in aco_builder: the caller can specify a wave size specific opcode and the builder will translate that to the correct opcode based on the current wave size. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	c44af6cbc7	aco/wave32: Introduce emit_mbcnt which takes wave size into account. This is relevant because in wave32 mode the v_mbcnt_hi_u32_b32 instruction is superfluous. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	07754a9c9e	aco/wave32: Replace hardcoded numbers in spiller with wave size. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	c0dbf42a03	aco/wave32: Change uniform bool optimization to work with wave32. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	dd9dad731b	aco: Optimize load_subgroup_id to one bit field extract instruction. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	753670e902	aco: Remove lower_linear_bool_phi, it is not needed anymore. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	0d2d672020	aco: Remove superfluous argument from emit_boolean_logic. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Timur Kristóf	9a43d26b74	aco: Fix operand of s_bcnt1_i32_b64 in emit_boolean_reduce. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Michel Dänzer	5585b8eadd	gitlab-ci: Run piglit glslparser & quick_shader tests separately And only use --process-isolation false for the quick_gl tests. This will hopefully avoid variance in the test results that we've been seeing lately. But even if it doesn't, it should at least help narrow down the cause of the variance. Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-04 10:36:33 +01:00
Lionel Landwerlin	ddacd3d43b	intel/perf: fix improper pointer access This expression was unused by the macro, probably why it didn't register in the compilation. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-04 09:21:15 +00:00
Lionel Landwerlin	8c0b058263	intel/perf: simplify the processing of OA reports This is a more accurate description of what happens in processing the OA reports. Previously we only had a somewhat difficult to parse state machine tracking the context ID. What we really only need to do to decide if the delta between 2 reports (r0 & r1) should be accumulated in the query result is: * whether r0 is tagged with the context ID relevant to us * if r0 is not tagged with our context ID and r1 is: does r0 have an invalid context ID? If not then we're in a case where i915 has resubmitted the same context for execution through the execlist submission port v2: Update comment (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-04 09:21:15 +00:00
Lionel Landwerlin	b364e920bf	intel/perf: take into account that reports read can be fairly old If we read the OA reports late enough after the query happens, we can get a timestamp in the report that is significantly in the past compared to the start timestamp of the query. The current code must deal with the wraparound of the timestamp value (every ~6 minutes). So consider that if the difference is greater than half that wraparound period, we're probably dealing with an old report and make the caller aware it should read more reports when they're available. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-04 09:21:15 +00:00
Lionel Landwerlin	9d0a5c817c	intel/perf: set read buffer len to 0 to identify empty buffer We always add an empty buffer in the list when creating the query. Let's set the len appropriately so that we can recognize it when we read OA reports up to the end of a query. We were using a 0 timestamp value associated with the empty buffer and incorrectly assuming this was a valid value. In turn that led to not reading enough reports and resulted in deltas added to our counter values which should have been discarded because those would be flagged for a different context. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-04 09:21:15 +00:00
Lionel Landwerlin	acea59dbf8	intel/perf: fix invalid hw_id in query results Accumulation happens between 2 reports, it can be between a start/end report from another context. So only consider updating the hw_id of the results when it's not already valid and we have a valid value to put in there. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `41b54b5faf` ("i965: move OA accumulation code to intel/perf") Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-04 09:21:15 +00:00
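The half-period heuristic from the commit above can be sketched as follows. This is a Python illustration rather than the intel/perf C code; it assumes a 32-bit wrapping timestamp, and `timestamp_delta` and `report_is_stale` are hypothetical helper names.

```python
TIMESTAMP_BITS = 32          # assumption: the timestamp counter is 32 bits wide
WRAP = 1 << TIMESTAMP_BITS   # number of ticks in one wraparound period


def timestamp_delta(start: int, end: int) -> int:
    """Ticks from `start` to `end`, treating the counter as wrapping."""
    return (end - start) % WRAP


def report_is_stale(query_start: int, report_ts: int) -> bool:
    """Guess whether a report predates the query.

    A report slightly older than the query start shows up as a delta close
    to a full period. Anything above half a period is treated as old, so
    the caller knows it should keep reading reports.
    """
    return timestamp_delta(query_start, report_ts) > WRAP // 2


if __name__ == "__main__":
    print(report_is_stale(1000, 1500))   # report taken after the query started
    print(report_is_stale(1000, 500))    # report taken before the query started
```

The trade-off is that a report genuinely more than half a period after the start would be misclassified, which is why the commit message says "probably".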
Pierre-Eric Pelloux-Prayer	a7bbebcfb9	radeonsi: display cs blit count for AMD_DEBUG=testdma Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-04 09:08:28 +01:00
Pierre-Eric Pelloux-Prayer	082d1c1686	radeonsi: implement sdma for GFX9 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-04 09:08:28 +01:00
Samuel Pitoiset	4cacba0c86	radv/gfx10: fix the vertex order for triangle strips emitted by a GS My fix wasn't totally correct as pointed out by Marek. Ported from RadeonSI. Fixes: `deafe4cc58` ("radv/gfx10: fix primitive indices orientation for NGG GS") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-04 08:28:57 +01:00
Samuel Pitoiset	dac6bd29ae	radv: simplify a check in radv_fixup_vertex_input_fetches() The number of loaded channels should always be > 0 now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-04 08:04:05 +01:00
Samuel Pitoiset	3b51259f06	radv: remove dead shader input/output variables No pipeline-db changes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-04 08:04:05 +01:00
Jason Ekstrand	0604768ae4	iris: Stop setting up fake params In `d1c4e64a69`, we added a parameter to tell the back-end compiler to ignore the param array and just push however many constants you ask it to push. Iris doesn't want to push anything so it gives a bogus number of parameters and trusts the back-end compiler to dead-code all of them. Now that we can tell the back-end compiler to stop re-arranging things, delete the hack and enable the new simpler code path. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-12-04 04:52:20 +00:00
Dave Airlie	713636766d	gallium/scons: fix graw-xlib build on OSX. Fixes: `44a6b0107b` (gallivm: add nir->llvm translation (v2)) Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-12-04 13:24:44 +10:00
Dave Airlie	3263c9824e	llvmpipe: enable texcoord semantics To make NIR transitioning easier, move the driver to using texcoord semantics. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-04 12:08:14 +10:00
Jason Ekstrand	178a2946c0	anv: Respect the always_flush_cache driconf option Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-03 17:10:51 -06:00
Krzysztof Raszkowski	07adc47460	gallium/swr: Fix crash when using GL_TDFX_texture_compression_FXT1 format. Reject the new formats in swr to prevent crashes because it doesn't know how to handle the new formats. Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2019-12-03 16:51:24 +00:00
Rob Clark	b31637c453	gitlab-ci: disable junit results for deqp They don't seem to be hugely useful, and seem to be bogging down gitlab. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-03 08:46:39 -08:00
Jason Ekstrand	b1f37688ba	anv: Set up SBE_SWIZ properly for gl_Viewport gl_Viewport is also in the VUE header so we need to whack the read offset to 0 and emit a default (no overrides) SBE_SWIZ entry in that case as well. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-12-03 16:20:50 +00:00
Michel Dänzer	0c88d5952a	gitlab-ci: Update to current ci-templates master Fixes skopeo copy failures. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-03 16:03:31 +01:00
Samuel Pitoiset	f63a3132e8	ac/llvm: fix atomic var operations if source isn't a deref Fixes some CTS regressions. Fixes: `e61a826f39` ("ac/llvm: fix pointer type for global atomics") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-03 09:41:33 +01:00
Neil Armstrong	dde734030b	Add support for T820 CI Jobs Tomeu: - Small rebase fixups Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-12-03 06:44:08 +01:00
Dave Airlie	502548a09c	gallivm/llvmpipe: add support for front facing in sysval. This wires up the front facing value as a sysval, I'd like to remove the other facing code but I'd need to confirm VMware don't use it first. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-03 15:29:04 +10:00
Dave Airlie	f52cdaa517	llvmpipe/images: handle undefined atomic without crashing just return 0 for unbound atomic operations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-03 15:29:04 +10:00
Alyssa Rosenzweig	71dd52e056	panfrost: Remove blend shader hack This is no longer used. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-03 04:25:04 +00:00
Tomeu Vizoso	c707b4d0f9	gitlab-ci: Test Panfrost on T720 GPUs Now that the Mali T720 GPU is supported at the same level as the T760, test it on PINE64 H64 boards. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig	6d05e38a96	gitlab-ci: Remove non-default skips from Panfrost During the past months, Panfrost has matured considerably and several tests stopped being flaky or failing at all. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-03 04:25:04 +00:00
Tomeu Vizoso	b655be7252	panfrost: White list the Mali T720 Support for this GPU is equal now to that of T760, so whitelist it. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig	8555bffafd	pan/midgard: Splatter on fragment out Make sure that the fragment is complete when writing it out. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-12-03 04:25:04 +00:00
Tomeu Vizoso	ab81a23d36	panfrost: Simplify shader patching We need to always upload anyway. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig	6ddaa5558a	panfrost: Simplify draw_flags Fixes dEQP-GLES3.functional.primitive_restart.*. Note that the 0x18000 value was somehow accidentally enabling primitive restart. I'm not sure where this value came from but let's not. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig	9fb0904712	panfrost: Implement pan_tiler for non-hierarchy GPUs The algorithm is as described. Nothing fancy here, just need to add some new code paths depending on which model we're running on. Tomeu: - Also disable tiling when !hierarchy and !vertex_count - Avoid creating polygon lists smaller than the minimum when vertex_count > 0 but tile size smaller than 16 bytes - Take into account tile size when calculating polygon list size for !hierarchy - Allow 0-sized tiles in a single dimension Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-12-03 04:25:04 +00:00
Alyssa Rosenzweig	63cd5b8198	panfrost: Add information about T720 tiling We've figured out most of the big pieces, and though it looks faintly like other Midgards, it's much simpler. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-12-03 04:25:04 +00:00
Tomeu Vizoso	6887ff4e79	panfrost: Add quirks system to cmdstream Similarly to how it's already done in the compiler, add a way to express differences between GPU models that need to be taken into account when assembling the cmdstream. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-12-03 04:25:04 +00:00
Ian Romanick	fbd5359a0a	nir/algebraic: Rearrange bcsel sequences generated by nir_opt_peephole_select Reviewed-by: Matt Turner <mattst88@gmail.com> All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 14660366 -> 14653437 (-0.05%) instructions in affected programs: 316166 -> 309237 (-2.19%) helped: 905 HURT: 10 helped stats (abs) min: 1 max: 36 x̄: 7.67 x̃: 6 helped stats (rel) min: 0.13% max: 18.75% x̄: 4.28% x̃: 3.60% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.10% max: 1.33% x̄: 0.70% x̃: 0.97% 95% mean confidence interval for instructions value: -7.91 -7.23 95% mean confidence interval for instructions %-change: -4.46% -3.99% Instructions are helped. total cycles in shared programs: 228571646 -> 228549759 (<.01%) cycles in affected programs: 56239919 -> 56218032 (-0.04%) helped: 681 HURT: 216 helped stats (abs) min: 1 max: 5156 x̄: 45.49 x̃: 10 helped stats (rel) min: <.01% max: 10.45% x̄: 1.29% x̃: 0.65% HURT stats (abs) min: 1 max: 320 x̄: 42.09 x̃: 14 HURT stats (rel) min: <.01% max: 37.04% x̄: 1.38% x̃: 0.49% 95% mean confidence interval for cycles value: -41.51 -7.29 95% mean confidence interval for cycles %-change: -0.80% -0.49% Cycles are helped. LOST: 1 GAINED: 0	2019-12-02 16:46:20 -08:00
Ian Romanick	780b5c1037	nir/algebraic: Simplify some Inf and NaN avoidance code Since a is non-negative, neither fsqrt nor frsq should return NaN. frsq should only return Inf when fsqrt returns 0. The changes are pretty small, but this turns a few hundred hurt shaders in the next patch into helped shaders. An alternative to the intBitsToFloat is to import numpy and do np.finfo(np.float32).max. That's more explicit, but we may also want to have specific bit encodings of float values later. I could be convinced either way, but intBitsToFloat(0x7f7fffff) was what I implemented first. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 14661140 -> 14661104 (<.01%) instructions in affected programs: 7520 -> 7484 (-0.48%) helped: 36 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.32% max: 0.61% x̄: 0.49% x̃: 0.52% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.52% -0.47% Instructions are helped. total cycles in shared programs: 228585416 -> 228584806 (<.01%) cycles in affected programs: 56321 -> 55711 (-1.08%) helped: 32 HURT: 0 helped stats (abs) min: 2 max: 98 x̄: 19.06 x̃: 10 helped stats (rel) min: 0.08% max: 6.41% x̄: 1.09% x̃: 0.65% 95% mean confidence interval for cycles value: -28.32 -9.80 95% mean confidence interval for cycles %-change: -1.63% -0.54% Cycles are helped. 
Sandy Bridge total cycles in shared programs: 152991077 -> 152991075 (<.01%) cycles in affected programs: 11525 -> 11523 (-0.02%) helped: 2 HURT: 2 helped stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.07% max: 0.11% x̄: 0.09% x̃: 0.09% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.08% max: 0.08% x̄: 0.08% x̃: 0.08% 95% mean confidence interval for cycles value: -5.27 4.27 95% mean confidence interval for cycles %-change: -0.16% 0.15% Inconclusive result (value mean confidence interval includes 0). No changes on Iron Lake or GM45.	2019-12-02 16:46:20 -08:00
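The `intBitsToFloat(0x7f7fffff)` constant mentioned in the commit above is the bit pattern of the largest finite 32-bit float. A minimal way to check that with only the Python standard library is shown below; `int_bits_to_float` is a helper written for this example and is not the nir_algebraic implementation.

```python
import struct


def int_bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Exponent field 0xFE with an all-ones mantissa: the largest finite float32.
FLT_MAX = int_bits_to_float(0x7F7FFFFF)

if __name__ == "__main__":
    print(FLT_MAX)
    print(int_bits_to_float(0x7F800000))  # the next bit pattern up is +Inf
```

This avoids a numpy dependency, which is the trade-off the commit message weighs against the more explicit `np.finfo(np.float32).max`.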
Ian Romanick	d15344c0f5	intel/compiler: Increase nir_opt_peephole_select threshold I tried 2, 4, 6, 8, and 10. 8 seemed to be the sweet spot across all Intel platforms. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 14736141 -> 14661140 (-0.51%) instructions in affected programs: 2272413 -> 2197412 (-3.30%) helped: 8416 HURT: 140 helped stats (abs) min: 1 max: 1152 x̄: 8.99 x̃: 6 helped stats (rel) min: 0.13% max: 42.55% x̄: 4.15% x̃: 3.20% HURT stats (abs) min: 1 max: 140 x̄: 4.73 x̃: 1 HURT stats (rel) min: 0.03% max: 3.44% x̄: 0.87% x̃: 0.60% 95% mean confidence interval for instructions value: -9.36 -8.17 95% mean confidence interval for instructions %-change: -4.14% -3.99% Instructions are helped. total cycles in shared programs: 231560416 -> 228585416 (-1.28%) cycles in affected programs: 126536021 -> 123561021 (-2.35%) helped: 7092 HURT: 1898 helped stats (abs) min: 1 max: 419320 x̄: 519.02 x̃: 159 helped stats (rel) min: <.01% max: 77.25% x̄: 13.52% x̃: 11.77% HURT stats (abs) min: 1 max: 14518 x̄: 371.91 x̃: 36 HURT stats (rel) min: <.01% max: 103.23% x̄: 5.92% x̃: 2.55% 95% mean confidence interval for cycles value: -514.34 -147.50 95% mean confidence interval for cycles %-change: -9.69% -9.14% Cycles are helped. 
total spills in shared programs: 5763 -> 5848 (1.47%) spills in affected programs: 1797 -> 1882 (4.73%) helped: 13 HURT: 13 total fills in shared programs: 17163 -> 16931 (-1.35%) fills in affected programs: 7214 -> 6982 (-3.22%) helped: 22 HURT: 19 total sends in shared programs: 730410 -> 730246 (-0.02%) sends in affected programs: 2705 -> 2541 (-6.06%) helped: 114 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 1.44 x̃: 1 helped stats (rel) min: 0.60% max: 20.00% x̄: 7.26% x̃: 5.88% 95% mean confidence interval for sends value: -1.55 -1.33 95% mean confidence interval for sends %-change: -7.90% -6.62% Sends are helped. LOST: 4 GAINED: 0 Sandy Bridge total instructions in shared programs: 10760511 -> 10724637 (-0.33%) instructions in affected programs: 961305 -> 925431 (-3.73%) helped: 3734 HURT: 110 helped stats (abs) min: 1 max: 151 x̄: 9.66 x̃: 8 helped stats (rel) min: 0.14% max: 41.21% x̄: 4.93% x̃: 3.95% HURT stats (abs) min: 1 max: 20 x̄: 1.68 x̃: 1 HURT stats (rel) min: 0.12% max: 5.41% x̄: 0.88% x̃: 0.52% 95% mean confidence interval for instructions value: -9.76 -8.91 95% mean confidence interval for instructions %-change: -4.90% -4.63% Instructions are helped. total cycles in shared programs: 153359411 -> 152991077 (-0.24%) cycles in affected programs: 11615401 -> 11247067 (-3.17%) helped: 2725 HURT: 1138 helped stats (abs) min: 1 max: 2844 x̄: 164.27 x̃: 80 helped stats (rel) min: 0.02% max: 48.60% x̄: 7.47% x̃: 3.91% HURT stats (abs) min: 1 max: 4351 x̄: 69.69 x̃: 25 HURT stats (rel) min: 0.02% max: 40.00% x̄: 3.39% x̃: 1.47% 95% mean confidence interval for cycles value: -103.18 -87.52 95% mean confidence interval for cycles %-change: -4.57% -3.97% Cycles are helped. 
total sends in shared programs: 584038 -> 583855 (-0.03%) sends in affected programs: 3512 -> 3329 (-5.21%) helped: 157 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 1.17 x̃: 1 helped stats (rel) min: 2.38% max: 25.00% x̄: 6.52% x̃: 6.06% 95% mean confidence interval for sends value: -1.26 -1.07 95% mean confidence interval for sends %-change: -7.17% -5.87% Sends are helped. LOST: 23 GAINED: 0 Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8122617 -> 8111592 (-0.14%) instructions in affected programs: 380503 -> 369478 (-2.90%) helped: 912 HURT: 86 helped stats (abs) min: 1 max: 129 x̄: 12.19 x̃: 9 helped stats (rel) min: 0.30% max: 39.21% x̄: 3.69% x̃: 2.57% HURT stats (abs) min: 1 max: 2 x̄: 1.05 x̃: 1 HURT stats (rel) min: 0.12% max: 3.64% x̄: 0.54% x̃: 0.36% 95% mean confidence interval for instructions value: -12.00 -10.10 95% mean confidence interval for instructions %-change: -3.56% -3.10% Instructions are helped. total cycles in shared programs: 188509780 -> 188534398 (0.01%) cycles in affected programs: 7211542 -> 7236160 (0.34%) helped: 859 HURT: 132 helped stats (abs) min: 2 max: 690 x̄: 46.59 x̃: 16 helped stats (rel) min: 0.01% max: 26.76% x̄: 1.53% x̃: 0.33% HURT stats (abs) min: 2 max: 1592 x̄: 489.67 x̃: 618 HURT stats (rel) min: 0.03% max: 185.92% x̄: 23.35% x̃: 6.26% 95% mean confidence interval for cycles value: 9.58 40.10 95% mean confidence interval for cycles %-change: 0.65% 2.93% Cycles are HURT.	2019-12-02 16:46:20 -08:00
Ian Romanick	e342d6970b	nir/opt_peephole_select: Don't count some unary operations In many cases, fsat, fneg, fabs, ineg, and iabs will get folded into another instruction as either source or destination modifiers. Counting them as instructions means that some if-statements won't get converted to selects. For example, vec1 32 ssa_25 = flt32 ssa_0, ssa_23.x /* succs: block_1 block_2 */ if ssa_25 { block block_1: /* preds: block_0 */ vec1 32 ssa_26 = fabs ssa_24 vec1 32 ssa_27 = fneg ssa_26 vec1 32 ssa_28 = fabs ssa_20 vec1 32 ssa_29 = fneg ssa_28 vec1 32 ssa_30 = fmul ssa_27, ssa_29 vec1 32 ssa_31 = fsat ssa_30 /* succs: block_3 */ } else { block block_2: /* preds: block_0 */ /* succs: block_3 */ } block block_3: /* preds: block_1 block_2 */ block_1 isn't really 6 instructions, but it will be counted that way. Most callers of the peephole_select pass use either 1 or 8. It's very easy to blow way past either of these limits with things that are really only one or two actual instructions. I also tried some fancier things like making sure the fsat was of another SSA def from the same block, but the simple test was actually better. The i965 back-end SEL peephole pass still helps ~700 shaders in shader-db with this change. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> All Gen6+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 14743694 -> 14738910 (-0.03%) instructions in affected programs: 156575 -> 151791 (-3.06%) helped: 1204 HURT: 0 helped stats (abs) min: 1 max: 27 x̄: 3.97 x̃: 3 helped stats (rel) min: 0.15% max: 19.57% x̄: 5.15% x̃: 4.55% 95% mean confidence interval for instructions value: -4.12 -3.82 95% mean confidence interval for instructions %-change: -5.35% -4.95% Instructions are helped. 
total cycles in shared programs: 231749141 -> 231602916 (-0.06%) cycles in affected programs: 2818975 -> 2672750 (-5.19%) helped: 876 HURT: 322 helped stats (abs) min: 2 max: 788 x̄: 180.99 x̃: 220 helped stats (rel) min: <.01% max: 43.82% x̄: 20.75% x̃: 19.44% HURT stats (abs) min: 1 max: 1188 x̄: 38.27 x̃: 20 HURT stats (rel) min: 0.09% max: 102.67% x̄: 5.17% x̃: 1.70% 95% mean confidence interval for cycles value: -130.47 -113.64 95% mean confidence interval for cycles %-change: -14.85% -12.72% Cycles are helped. total sends in shared programs: 730495 -> 730491 (<.01%) sends in affected programs: 46 -> 42 (-8.70%) helped: 2 HURT: 0 Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8122757 -> 8122617 (<.01%) instructions in affected programs: 14716 -> 14576 (-0.95%) helped: 46 HURT: 1 helped stats (abs) min: 1 max: 8 x̄: 3.07 x̃: 3 helped stats (rel) min: 0.36% max: 10.00% x̄: 2.54% x̃: 1.06% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.59% max: 1.59% x̄: 1.59% x̃: 1.59% 95% mean confidence interval for instructions value: -3.42 -2.54 95% mean confidence interval for instructions %-change: -3.28% -1.62% Instructions are helped. total cycles in shared programs: 188510100 -> 188509780 (<.01%) cycles in affected programs: 58994 -> 58674 (-0.54%) helped: 32 HURT: 1 helped stats (abs) min: 2 max: 96 x̄: 10.06 x̃: 6 helped stats (rel) min: 0.05% max: 15.29% x̄: 1.37% x̃: 0.31% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.68% max: 0.68% x̄: 0.68% x̃: 0.68% 95% mean confidence interval for cycles value: -16.34 -3.06 95% mean confidence interval for cycles %-change: -2.46% -0.15% Cycles are helped.	2019-12-02 16:46:19 -08:00
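The counting change in the commit above can be illustrated with the block_1 example from its message. This is a Python sketch, not the C code of nir_opt_peephole_select: `count_block_cost` and `can_flatten` are hypothetical names, and comparing the count against the limit with `<=` is an assumption made for this example.

```python
# Unary operations the commit message lists as usually folded into another
# instruction as source or destination modifiers.
FREE_UNARY_OPS = frozenset({"fsat", "fneg", "fabs", "ineg", "iabs"})


def count_block_cost(ops):
    """Count the instructions in a block, skipping the foldable unary ops."""
    return sum(1 for op in ops if op not in FREE_UNARY_OPS)


def can_flatten(ops, limit):
    """Decide whether the block is small enough to turn into selects."""
    return count_block_cost(ops) <= limit


# Opcodes of block_1 from the commit message, in order.
block_1 = ["fabs", "fneg", "fabs", "fneg", "fmul", "fsat"]

if __name__ == "__main__":
    print(len(block_1), count_block_cost(block_1))
    print(can_flatten(block_1, 1))
```

With the naive count the block costs 6 and would be rejected by a caller using a limit of 1; with the modifiers skipped only the fmul is counted.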
Jordan Justen	e277009d8d	iris: Allow max dynamic pool size of 2GB for gen12 Reworks: * Adjust comment to list the state packets that curro found to be affected. Fixes: `8125d7960b` ("intel/dev: Add preliminary device info for Tigerlake") Cc: 19.3 <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-12-02 16:34:12 -08:00
Marek Olšák	7730d583c2	radeonsi/gfx10: fix the vertex order for triangle strips emitted by a GS Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-02 18:22:27 -05:00
Marek Olšák	91da6a98e7	radeonsi/gfx10: simplify some duplicated NGG GS code Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-12-02 18:22:25 -05:00
Jonathan Gray	4913215d14	util/u_thread: don't restrict u_thread_get_time_nano() to __linux__ pthread_getcpuclockid() and clock_gettime() are also available on at least OpenBSD, FreeBSD, NetBSD, DragonFly, Cygwin. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-12-02 17:23:49 -05:00
Jonathan Gray	c91997b6c4	util/futex: use futex syscall on OpenBSD Make use of the futex syscall added in OpenBSD 6.2. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-12-02 17:23:49 -05:00
Kenneth Graunke	dbe923bff9	meson: Add a "prefer_iris" build option Enabling this option makes Intel Gen8-11 hardware load the 'iris' driver by default instead of the older 'i965' driver. Regardless of how this option is set, users can still override which driver the loader selects via two methods. The first is to create a ~/.drirc or /etc/drirc file with the following snippet: <driconf> <device driver="loader" kernel_driver="i915"> <option name="dri_driver" value="i965" /> </device> </driconf> The other option is to set an environment variable: export MESA_LOADER_DRIVER_OVERRIDE=i965 For now, "prefer_iris" defaults to i965 (the historical choice). A separate future patch will change the default driver to iris. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1893 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-02 12:56:27 -08:00
Jonathan Marek	bebfb17a2b	turnip: fix display wsi fence timing out Fixes: `df9f2adf` ("turnip: add display wsi") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-02 14:29:47 -05:00
Rhys Perry	5404b7aaa3	nir/lower_io_to_vector: don't create arrays when not needed Some backends require that there are no array varyings. If there were no arrays in the input shader, the pass shouldn't have to create new ones. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2103 Fixes: `bcd14756ee` ('nir/lower_io_to_vector: add flat mode') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-12-02 17:45:01 +00:00
Rhys Perry	01cacdb71e	aco: fix block_kind_discard s_andn2 definition to exec Improves generated code of dEQP-VK.graphicsfuzz.disc-and-add-in-func-in-loop because a loop exit phi can then be fixed to exec, removing copies and improving jump threading. No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-02 16:56:24 +00:00
Rhys Perry	0e8da9f607	aco: handle loop exit and IF merge phis with break/discard ACO considers discards to be jumps and creates edges in the CFG for them, but NIR does neither of these. This can be fixed instead by keeping track of whether a side of an IF had a break/discard, but this doesn't solve the issue with discards affecting loop exit phis. So this reworks phi handling a bit. Fixes these tests: dEQP-VK.graphicsfuzz.disc-and-add-in-func-in-loop dEQP-VK.graphicsfuzz.loop-call-discard dEQP-VK.graphicsfuzz.complex-nested-loops-and-call Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-02 16:56:19 +00:00
Rhys Perry	06fc83989c	aco: validate the CFG Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-02 16:56:05 +00:00
Alejandro Piñeiro	b6fd679a9e	mesa/main/util: moving gallium u_mm to util, remove main/mm Right now there are two copies of mm: * mesa/main/mm.[ch] * gallium/auxiliary/util/u_mm.[ch] At some point they split, and from the commit message it was not clear why it was not possible to have only one copy at a common place. Taking into account that was several years ago, I'm assuming that it was not possible then. This change allows having one copy of the same code, and also using that code outside of mesa/main or gallium, if needed. This commit moves u_mm and removes mm, as u_mm has slightly more changes. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-12-02 13:59:28 +01:00
Rhys Perry	35fab1ba33	radv: set writes_memory for global memory stores/atomics Fixes: `13ab63bb62` ('radv: Implement VK_EXT_buffer_device_address.') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-12-02 11:47:12 +00:00
Rhys Perry	a814f3d8a7	ac/llvm: improve sync scope for global atomics Stronger ordering is implemented in SPIRV->NIR with barriers. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-12-02 10:48:27 +00:00
Rhys Perry	e61a826f39	ac/llvm: fix pointer type for global atomics Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-12-02 10:48:18 +00:00
Kenneth Graunke	1d416ffd09	iris: Map FXT1 texture formats This exposes GL_TDFX_texture_compression_FXT1 support. It's ancient, only Intel GPUs appear to support it, and I seriously doubt anybody uses it. But i965 supports it, and it's trivial to do, so we may as well support it in the new iris driver as well. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-12-01 22:55:56 -08:00
Kenneth Graunke	1bdd342b60	st/mesa: Add GL_TDFX_texture_compression_FXT1 support Eric recently added PIPE_FORMAT_FXT1_RGB[A] as part of his format unification work. This was really most of the work of implementing the extension. We just need to handle it in a couple of places and expose the extension. v2: Reject the new formats in llvmpipe_is_format_supported to prevent crashes because it doesn't know how to handle the new formats. Reviewed-by: Marek Olšák <marek.olsak@amd.com> [v1] Reviewed-by: Eric Anholt <eric@anholt.net> [v1]	2019-12-01 22:55:21 -08:00
Dave Airlie	3e21e17b2f	nir/samplers: don't zero samplers_used/txf. This allows this pass to be run multiple times and the results are just or'ed together. It fixes one test on llvmpipe nir, and regresses none. Suggested by Kenneth Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-12-02 09:15:55 +10:00
Samuel Pitoiset	0eb78a078e	aco: drop useless lowering of deref operations for shared memory Moved to RADV. No pipeline-db changes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 21:58:25 +01:00
Samuel Pitoiset	c105e6169c	radv,ac/nir: lower deref operations for shared memory This shouldn't introduce any functional changes for RadeonSI when NIR is enabled because these operations are already lowered. pipeline-db (NAVI10/LLVM): SGPRS: 9043 -> 9051 (0.09 %) VGPRS: 7272 -> 7292 (0.28 %) Code Size: 638892 -> 621628 (-2.70 %) bytes LDS: 1333 -> 1331 (-0.15 %) blocks Max Waves: 1614 -> 1608 (-0.37 %) Found this while glancing at some F12019 shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-29 21:58:18 +01:00
Daniel Schürmann	b690543851	aco: fix a couple of value numbering issues Fixes: `3a20ef4a32` 'aco: refactor value numbering' Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-29 21:54:27 +01:00
Daniel Schürmann	8861a82be7	aco: don't split live-ranges of linear VGPRs Fixes: `93c8ebfa78` 'aco: Initial commit of independent AMD compiler' Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-29 21:54:27 +01:00
Rhys Perry	73783ed389	aco: implement global atomics Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:02 +00:00
Rhys Perry	389ee819c0	aco: improve FLAT/GLOBAL scheduling Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:02 +00:00
Rhys Perry	cc742562c1	aco: don't enable store_global for helper invocations Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:02 +00:00
Rhys Perry	31e68e230f	aco: fix SADDR with FLAT on GFX10 The reference guide is incorrect and SADDR is actually used with FLAT on GFX10. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:01 +00:00
Rhys Perry	082e3a68fa	aco: fix assembly of FLAT/GLOBAL atomics They can take both a definition and data operand Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:01 +00:00
Rhys Perry	f1381e6715	aco: fix GFX10 opcodes for some global/flat atomics Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:01 +00:00
Rhys Perry	5986e00194	aco: improve WAR hazard workaround with >64bit stores Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:01 +00:00
Rhys Perry	a9fc81b098	aco: add v_nop inbetween exec write and VMEM/DS/FLAT LLVM and the proprietary compiler seem to do this Fixes: `b01847bd9` ("aco/gfx10: Fix mitigation of VMEMtoScalarWriteHazard.") Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:01 +00:00
Rhys Perry	54742e157d	aco: fix incorrect cast in parse_wait_instr() s_waitcnt is SOPP, not SOPK Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:01 +00:00
Rhys Perry	11f43caaec	aco: fix i2i64 Fixes: `93c8ebfa` ('aco: Initial commit of independent AMD compiler') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:01 +00:00
Rhys Perry	ff70ccad16	aco: propagate p_wqm on an image_sample's coordinate p_create_vector Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2156 Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:19:52 +00:00
Christian Gmeiner	1be220833c	etnaviv: remove dead code ptiled is always NULL so the if statement is useless. CoverityID: 1415572 Fixes: `b962776530` ("etnaviv: rework compatible render base") CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-29 16:22:40 +01:00
Christian Gmeiner	1dfe6a3e9a	etnaviv: handle integer case for GENERIC_ATTRIB_SCALE Reviewed-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-29 15:06:18 +01:00
Christian Gmeiner	5361ea2a9b	etnaviv: fix R10G10B10A2 vertex format entries Reviewed-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-29 15:06:18 +01:00
Christian Gmeiner	06d7071bca	etnaviv: use NORMALIZE_SIGN_EXTEND The blob driver does something like this for all vertex formats: if (normalize) { if (OPENGL_ES30) val = VIVS_FE_VERTEX_ELEMENT_CONFIG_NORMALIZE_SIGN_EXTEND; else val = VIVS_FE_VERTEX_ELEMENT_CONFIG_NORMALIZE_ON; } else { val = VIVS_FE_VERTEX_ELEMENT_CONFIG_NORMALIZE_OFF; } As there is no way to get to that information in gallium we always assume OPENGL_ES30. Reviewed-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-29 15:06:18 +01:00
Christian Gmeiner	ca6c73f335	etnaviv: fix integer vertex formats Reviewed-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-29 15:06:18 +01:00
Jonathan Gray	34dda0ca65	i965: update Makefile.sources for perf changes brw_performance_query_metrics.h was removed in `134e750e16` and brw_performance_query.h was removed in `8ae6667992` remove reference to these files from Makefile.sources Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Fixes: `134e750e16` ("i965: extract performance query metrics") Fixes: `8ae6667992` ("intel/perf: move query_object into perf") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-29 13:20:55 +00:00
Vinson Lee	0d21fe5397	scons: Bump C standard to gnu11 on macOS 10.15. Fix build error on macOS 10.15 Catalina. src/util/u_queue.c:179:7: error: implicit declaration of function 'timespec_get' is invalid in C99 [-Werror,-Wimplicit-function-declaration] timespec_get(&ts, TIME_UTC); ^ timespec_get needs C11 starting with macOS 10.15. /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/time.h 193 #if (__DARWIN_C_LEVEL >= __DARWIN_C_FULL) && \ 194 ((defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L) || \ 195 (defined(__cplusplus) && __cplusplus >= 201703L)) 196 /* ISO/IEC 9899:201x 7.27.2.5 The timespec_get function */ 197 #define TIME_UTC 1 /* time elapsed since epoch */ 198 __API_AVAILABLE(macosx(10.15), ios(13.0), tvos(13.0), watchos(6.0)) 199 int timespec_get(struct timespec *ts, int base); 200 #endif Signed-off-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-11-29 12:38:29 +00:00
Boris Brezillon	c6e2096c47	panfrost: Make sure we reset the damage region of RTs at flush time We must reset the damage info of our render targets here even though a damage reset normally happens when the DRI layer swaps buffers. That's because there can be implicit flushes the GL app is not aware of, and those might impact the damage region: if part of the damaged portion is drawn during those implicit flushes, you have to reload those areas before next draws are pushed, and since the driver can't easily know what's been modified by the draws it flushed, the easiest solution is to reload everything. Reported-by: Carsten Haitzler <raster@rasterman.com> Fixes: `65ae86b854` ("panfrost: Add support for KHR_partial_update()") Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-29 10:20:29 +01:00
Boris Brezillon	b196e1a8cf	gallium: Fix the ->set_damage_region() implementation BACK_LEFT attachment can be outdated when the user calls KHR_partial_update() (->lastStamp != ->texture_stamp), leading to a damage region update on the wrong pipe_resource object. Let's delay the ->set_damage_region() call until the attachments are updated when we're in that case. Reported-by: Carsten Haitzler <raster@rasterman.com> Fixes: `492ffbed63` ("st/dri2: Implement DRI2bufferDamageExtension") Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-29 10:20:29 +01:00
Erik Faye-Lund	5fcb503c73	zink: silence coverity error Coverity doesn't know that we always have coordinates if we have lod. To avoid annoying errors, let's just zero-initialize this. CoverityID: 1455202 Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-11-29 09:54:25 +01:00
Erik Faye-Lund	7a63124a06	zink: error-check right variable That's not the value we just allocated... CoverityID: 1455177 Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-11-29 09:54:25 +01:00
Erik Faye-Lund	c8769ff8dd	zink: avoid NULL-deref Same story as the previous two commits; these functions dereference the memory they are pointed at. We can't do that. CoverityID: 1455180 Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-11-29 09:54:25 +01:00
Erik Faye-Lund	e54240f153	zink: avoid NULL-deref Similar to the previous commit, pipe_resource_reference also dereferences the memory pointed at. Let's avoid it. CoverityID: 1455198 Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-11-29 09:54:25 +01:00
Erik Faye-Lund	bda64440e4	zink: avoid NULL-deref zink_render_pass_reference will dereference the memory 'dst' points at, which can't really go well. All we want to do here is to increase the reference-count, so let's use a different helper for that instead. CoverityID: 1455200 Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-11-29 09:54:25 +01:00
Erik Faye-Lund	8e1dca35ab	zink: handle calloc-failure In case we fail to allocate the context, we should notice and fail gracefully. CoverityID: 1455193 Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-11-29 09:54:25 +01:00
Erik Faye-Lund	8772d95d40	zink: do not try to destroy NULL-fence destroy_fence doesn't handle NULL-pointers gracefully. So let's avoid hitting that code-path, by simply returning NULL early here instead. CoverityID: 1455179 Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-11-29 09:54:25 +01:00
Erik Faye-Lund	49f53ee336	zink: delete query rather than allocating a new one It seems I had some fat fingers when writing this function, and I accidentally ended up allocating a new query and immediately trying to delete an uninitialized pool instead of just deleting the pool of the query that was passed. CoverityID: 1455196 Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-11-29 09:54:25 +01:00
Erik Faye-Lund	f2188e58ce	zink: fix crash when restoring sampler-states When I changed to heap-allocated sampler-objects, I missed the code-path that restores sampler-states after the blitter; it needs an array of pointers, not an array of VkSampler objects to behave. This fixes spec@arb_texture_cube_map@copyteximage for me. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `5ea787950f` ("zink: heap-allocate samplers objects") Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-11-29 09:19:54 +01:00
Erik Faye-Lund	655b9aa711	zink: reject invalid sample-counts Vulkan only allows power-of-two sample counts. We already kinda checked for this, but forgot to validate the result in the end. Let's check the result and error properly. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-11-29 08:58:05 +01:00
Erik Faye-Lund	927363e0b9	zink: use true/false instead of TRUE/FALSE Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-11-29 08:57:33 +01:00
Erik Faye-Lund	c7c0bd9f1e	st/mesa: unmap pbo after updating cache Unmapping first leads to accessing an invalid pointer. So let's switch these lines around. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-29 07:45:06 +00:00
Vinson Lee	de2e5f6f54	panfrost: Fix gnu-empty-initializer build errors. Fixes: `a24d6fbae6` ("meson: Add -Werror=gnu-empty-initializer to MSVC compat args") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-28 16:12:38 -08:00
Timothy Arceri	9d2d609cce	docs: update source code repository documentation This drops all the old documentation around applying for push access. Also this removes the documentation stating that you can push directly to mesa rather than using merge requests. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1969 Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-29 11:09:00 +11:00
Bas Nieuwenhuizen	48fc65413c	radv: Fix timeline semaphore refcounting. Was totally broken ... Removed two if(point) {} because point is always non-NULL and we were counting on that already for counting, since we NULL our references to semaphores without active point earlier. Fixes: `4aa75bb3bd` "radv: Add wait-before-submit support for timelines." Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2137 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-28 23:46:09 +01:00
Jonathan Gray	3fe3bde4f2	winsys/amdgpu: avoid double simple_mtx_unlock() pthread_mutex_unlock() when unlocked is documented by posix as being undefined behaviour. On OpenBSD pthread_mutex_unlock() will call abort(3) if this happens. This occurs in amdgpu_winsys_create() after `cb446dc0fa` winsys/amdgpu: Add amdgpu_screen_winsys Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-11-28 15:03:59 -05:00
Marek Olšák	5e81fbf44a	util/driconfig: print ATTENTION if MESA_DEBUG=silent is not set unix-bytebenchmark refuses to run if the driver prints ATTENTION to stderr. Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-11-28 14:36:32 -05:00
Tapani Pälli	d61a21f439	glsl: handle max uniform limits with lower_const_arrays_to_uniforms Fixes arb_tessellation_shader-large-uniforms Piglit test. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-11-28 14:11:46 +02:00
Bas Nieuwenhuizen	4cde0e04e3	radv: Unify max_descriptor_set_size. They were out of sync. Besides syncing, let's ensure they never diverge again. Fixes: `8d2654a419` "radv: Support VK_EXT_inline_uniform_block." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-28 12:06:44 +01:00
Bas Nieuwenhuizen	e09426ad6b	amd/llvm: Refactor ac_build_scan. Split out the logic for exclusive scans into a separate function that makes clear what it does instead of having this opaque 60 line if. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-28 11:35:11 +01:00
Samuel Pitoiset	d347f2805d	radv: add more constants to avoid using magic numbers Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-28 10:59:14 +01:00
Samuel Pitoiset	52aadbfd04	ac/llvm: convert src operands to pointers if necessary To avoid generating invalid LLVM IR when both operands don't have the same type. This might happen when performing pointer comparisons with SPIRV 1.4. Fixes invalid LLVM IR for: dEQP-VK.spirv_assembly.instruction.spirv1p4.opptrequal.variable_pointers_ssbo_equal dEQP-VK.spirv_assembly.instruction.spirv1p4.opptrnotequal.variable_pointers_ssbo_not_equal Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-28 08:26:51 +01:00
Dave Airlie	18f896e55d	llvmpipe: add initial nir support This adds the hooks between llvmpipe and the gallivm NIR code, for compute and fragment shaders. NIR support is hidden behind LP_DEBUG=nir for now until all the integration issues are solved. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-11-28 14:49:23 +10:00
Dave Airlie	5363cda52b	gallivm: add swizzle support where one channel isn't defined. NIR doesn't always define all output channels relies on outputs being memset to 0 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-11-28 14:49:16 +10:00
Dave Airlie	3eb27cfccd	gallium: add nir lowering passes for the draw pipe stages. (v2) This transforms the NIR shaders like the TGSI transforms worked. v2: fix some nir info requirements, use 32-bit bools Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-11-28 14:49:05 +10:00
Dave Airlie	bf12bc2dd7	draw: add nir info gathering and building support Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-11-28 14:48:56 +10:00
Dave Airlie	44a6b0107b	gallivm: add nir->llvm translation (v2) This adds the initial implementation of the NIR->LLVM conversion for llvmpipe NIR support. v2: lower bool to int32 in nir not llvm Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-11-28 14:48:44 +10:00
Dave Airlie	18ed09d449	gallivm: add selection for non-32 bit types Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-11-28 14:48:38 +10:00
Dave Airlie	9461f2b5df	gallivm: add cttz wrapper this will be used to write find_lsb support Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-11-28 14:48:32 +10:00
Dave Airlie	1a608901cc	gallivm: add popcount intrinsic wrapper Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-11-28 14:48:25 +10:00
Dave Airlie	3b9950098b	gallivm: nir->tgsi info convertor (v2) This is a port of the old radeonsi code to be used for llvmpipe NIR support. Once we remove TGSI support from llvmpipe (I can dream? :-), then we should be able to refine most of this down and remove it. v2: port to later radeonsi code for vertex inputs and sampler/io parsing. Acked-by: Roland Scheidegger <sroland@vmware.com>	2019-11-28 14:48:11 +10:00
Dave Airlie	c879efec09	gallivm: split out the flow control ir to a common file. We can share a bunch of flow control handling between NIR and TGSI. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-11-28 14:47:54 +10:00
Marek Olšák	754c7b8939	radeonsi: enable SPIR-V and GL 4.6 for NIR Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-27 19:28:35 -05:00
Marek Olšák	cf240ea6a5	radeonsi/nir: support interface output types to fix SPIR-V xfb piglits Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-27 19:28:34 -05:00
Marek Olšák	1b45da15a9	radeonsi/nir: fix location_frac handling for TCS outputs Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-27 19:28:32 -05:00
Marek Olšák	268e42e4f8	radeonsi/nir: don't rely on data.patch for tess factors GLCTS SPIR-V tests have this issue. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-27 19:28:30 -05:00
Marek Olšák	59daac686d	radeonsi/nir: validate is_patch because SPIR-V doesn't set it for tess factors Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-27 19:28:29 -05:00
Marek Olšák	272f1369ec	radeonsi: simplify get_tcs_tes_buffer_address_from_generic_indices Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-27 19:28:28 -05:00
Marek Olšák	1e3aab4cd0	radeonsi: simplify the interface of get_dw_address_from_generic_indices Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-27 19:28:26 -05:00
Marek Olšák	756fc9f1bb	radeonsi/nir: implement subgroup system values for SPIR-V Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-27 19:28:23 -05:00
Marek Olšák	42318f9197	ac/nir: don't rely on data.patch for tess factors Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-27 19:28:10 -05:00
Kenneth Graunke	51cc380894	drirc: Set vs_position_always_invariant for Shadow of Mordor on Intel When drawing the main character in Shadow of Mordor, the game appears to draw Talion with one vertex shader, and the Wraith with another. If the compiler optimizes those in different ways which lead to slight imprecisions, then the resulting positions may not line up, leading to Z-fighting occurring as the game decides which of the two are in front. brw_nir_opt_peephole_ffma looks at usages of multiply adds across the entire shader, and may make different decisions between the two, leading to such imprecisions and Z-fighting. This started happening recently after a NIR change to eliminate unnecessary MOVs (`7025dbe7`), but that change simply exposed the existing problem. Improves performance on Skylake GT4e by 1.22945% +/- 0.398672% (n=3), likely due to the fixed rendering. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1985 Fixes: `7025dbe794` ("nir: Skip emitting no-op movs from the builder.") Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-11-27 18:48:04 +00:00
Kenneth Graunke	9b577f2a88	driconf, glsl: Add a vs_position_always_invariant option Many applications use multi-pass rendering and require their vertex shader position to be computed the same way each time. Optimizations may consider, say, fusing a multiply-add based on global usage of an expression in a shader. But a second shader with the same expression may have different code, causing that optimization to make the other choice the second time around. The correct solution is for applications to mark their VS outputs 'invariant', indicating they need multiple shaders to compute that output in the same manner. However, most applications fail to do so. So, we add a new driconf option - vs_position_always_invariant - which forces the gl_Position output in vertex shaders to be marked invariant. Fixes: `7025dbe794` ("nir: Skip emitting no-op movs from the builder.") Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-11-27 18:48:04 +00:00
Eric Anholt	424d5e4e11	turnip: Disable timestamp queries for now. They're not implemented, and not critical to bring up immediately. Avoids failures in the CTS when nothing gets written to the query. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-27 10:05:59 -08:00
Jonathan Marek	080c92e7d4	freedreno/perfcntrs/fdperf: add missing a2xx case in select_counter Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-11-27 12:11:57 -05:00
Jonathan Marek	98d7125b36	freedreno/perfcntrs/fdperf: add missing a20x compatible Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-11-27 12:11:57 -05:00
Jonathan Marek	24cde37e8d	freedreno/perfcntrs/fdperf: fix u64 print on 32-bit builds Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-11-27 12:11:57 -05:00
Jonathan Marek	baab4017b9	freedreno/perfcntrs: add a2xx MH counters Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-11-27 12:11:57 -05:00
Jonathan Marek	0d0c8a9e82	freedreno/registers: add missing MH perfcounter enum for a2xx Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-11-27 12:11:57 -05:00
Michel Dänzer	a3b3d3bfcc	gitlab-ci: Put HTML summary in artifacts for failed piglit jobs This will make it easier to look at details of failed / skipped tests. Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-27 10:20:31 +01:00
Michel Dänzer	07c1346113	gitlab-ci: Stop storing piglit test results as JUnit Since we're not reporting test results as JUnit anymore, we can use the default JSON format. This affects how test results are summarized, update the reference files accordingly. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-27 10:19:22 +01:00
Michel Dänzer	c9cdb7cef0	gitlab-ci: Stop reporting piglit test results via JUnit It was basically useless in this form, and processing the JUnit data in the GitLab backend was pretty expensive. Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-27 10:18:33 +01:00
Iago Toral Quiroga	18a09e788d	v3d: fix indirect BO allocation for uniforms We were always ensuring a minimum size of 4 bytes for uniforms for the case where we don't have any, to account for hardware pre-fetching of the uniform stream. However, pre-fetching could also lead to out-of-bounds reads when we have read the last uniform in the stream, so we probably want to have the extra 4 bytes to prevent the kernel from observing invalid memory accesses when the uniform stream sits right at the end of a page. This seems to fix MMU exceptions reported with a Linux 5.4 kernel. Credit goes to Phil Elwell for identifying the problem and narrowing it down to memory accesses in the uniform stream. Reported-by: Phil Elwell <phil@raspberrypi.org> Tested-by: Phil Elwell <phil@raspberrypi.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-27 08:43:13 +01:00
Samuel Pitoiset	a24f1c8f7f	radv: enable VK_KHR_shader_subgroup_extended_types on GFX10 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-27 07:42:44 +01:00
Samuel Pitoiset	0812dbd403	ac: add 8-bit and 16-bit supports to ac_build_permlane16() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-27 07:42:42 +01:00
Samuel Pitoiset	c9aa843961	radv/gfx10: fix implementation of exclusive scans This implementation is loosely based on ROCm. https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive* on GFX10. Fixes: `227c29a80d` ("amd/common/gfx10: implement scan & reduce operations") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-27 07:39:26 +01:00
Samuel Pitoiset	86a5fbfd4a	radv: fix enabling sample shading with SampleID/SamplePosition When a fragment shader includes an input variable decorated with SampleId or SamplePosition, sample shading should be enabled because minSampleShadingFactor is expected to be 1.0. Cc: 19.2, 19.3 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-27 07:22:54 +01:00
Jonathan Marek	62ff90cc5e	turnip: fix integer render targets Add missing required bits. Fixes at least: dEQP-VK.pipeline.render_to_image.dedicated_allocation.1d.small.r16g16_sint_d24_unorm_s8_uint dEQP-VK.pipeline.render_to_image.dedicated_allocation.2d.mipmap.r16g16_sint_d24_unorm_s8_uint dEQP-VK.renderpass.dedicated_allocation.attachment.4.401 dEQP-VK.renderpass2.suballocation.formats.r16_uint.load.draw dEQP-VK.synchronization.op.single_queue.barrier.write_draw_read_copy_image_to_buffer.image_128x128_r16_uint Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-26 16:01:19 -08:00
Jason Ekstrand	a8965c076b	anv: Push constants are relative to dynamic state on IVB Fixes: `aecde2351` "anv: Pre-compute push ranges for graphics pipelines" Closes: #2136 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-26 22:15:54 +00:00
Dylan Baker	a24d6fbae6	meson: Add -Werror=gnu-empty-initializer to MSVC compat args Only clang has this argument (at least as of clang 8 and gcc 9), which errors when using the gcc empty initializer syntax in C: ```C struct foo f = {}; ``` GCC has a warning for this, but only when using -Wpedantic, which is a lot of noise to lose useful warnings in. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-26 12:48:11 -08:00
Dylan Baker	25e58e3718	gallium/auxiliary: Fix uses of gnu struct = {} extension Most of these will never actually be compiled by windows, but in the interest of being able to make using struct foo = {}; an error and avoiding breaking windows removing a handful of safe uses seems like a good trade off. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-26 12:48:11 -08:00
Marek Olšák	ed1ff99da7	st/mesa: add st_variant base class to simplify code for shader variants Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-26 15:14:10 -05:00
Marek Olšák	b8772a559a	st/mesa: don't use ** in the st_nir_link_shaders signature Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-26 15:14:10 -05:00
Marek Olšák	adbba2142d	st/mesa: simplify looping over linked shaders when linking NIR Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-26 15:14:10 -05:00
Marek Olšák	8567e06046	st/mesa: propagate gl_PatchVerticesIn from TCS to TES before linking for NIR Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-26 15:14:10 -05:00
Marek Olšák	e8f0a39d45	st/mesa: don't call ProgramStringNotify in glsl_to_nir Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-26 15:14:10 -05:00
Marek Olšák	5a714531f7	st/mesa: don't use redundant stp->state.ir.nir Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-26 15:14:10 -05:00
Marek Olšák	6cf011fcc8	st/mesa: don't serialize all streamout state if there are no SO outputs Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-26 15:14:10 -05:00
Kenneth Graunke	3fdf2bb313	iris: Disable VF cache partial address workaround on Gen11+ The vertex cache uses the full 48-bit address on Gen11+. See the documentation for 3DSTATE_VERTEX_BUFFERS, which describes the workaround and lists it as pre-Icelake. Interestingly, the docs don't mention index buffers as needing a workaround at all. So either we've been overzealous, or the docs never got updated to record that. Which begs the question of whether the issue there was fixed, if there was one... Cuts 40% of the PIPE_CONTROLs from Civilization VI's benchmark; appears that it improves performance by about 1-2% on Icelake 8x8 (not frequency locked).	2019-11-26 12:13:34 -08:00
Rob Clark	8d9f5a28e3	freedreno: switch to layout helper The slices table and most of the other layout fields in the freedreno_resource moves into fdl_layout. v2: Changes by anholt to not have duplicate fields, which was introducing a surprising behavior change in resource layout (using the level_linear helper before the setup of the shadowed fields) Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Rob Clark <robdclark@chromium.org>	2019-11-26 18:46:08 +00:00
Eric Anholt	997b8d4749	freedreno/a6xx: Log the tiling mode in resource layout debug. This was important for figuring out what went wrong with the layout refactor. Acked-by: Rob Clark <robdclark@chromium.org>	2019-11-26 18:46:07 +00:00
Eric Anholt	2e62a622e7	freedreno: Convert the slice struct to the new resource header. This gets the worst of the sed required for shared resource layout out of the way. The texture layout comment is dropped now that we're referencing the shared header, which has a more complete description. Acked-by: Rob Clark <robdclark@chromium.org>	2019-11-26 18:46:07 +00:00
Eric Anholt	930432577f	freedreno: Introduce a resource layout header. This will be used for sharing resource layout code between freedreno and tu. Mostly copied from a commit by Rob, with a new location and the slice struct renamed for consistency. Acked-by: Rob Clark <robdclark@chromium.org>	2019-11-26 18:46:07 +00:00
Eric Anholt	2ec420b264	freedreno: Introduce a fd_resource_tile_mode() helper. Multiple places were doing the same thing to get the tile mode of a level, so refactor it out. This will make the shared resource helper transition cleaner. Acked-by: Rob Clark <robdclark@chromium.org>	2019-11-26 18:46:07 +00:00
Eric Anholt	6b09227ede	freedreno: Introduce a fd_resource_layer_stride() helper. This factors out a bit of duplicated code, but will also make the shared resource layout transition process clearer. Acked-by: Rob Clark <robdclark@chromium.org>	2019-11-26 18:46:07 +00:00
Rob Clark	9e9a26c768	freedreno: use rsc->slice accessor everywhere This will make it easier to extract the slice table out into a layout helper. Acked-by: Rob Clark <robdclark@chromium.org>	2019-11-26 18:46:07 +00:00
Eric Anholt	d845dca0f5	nir: Make algebraic backtrack and reprocess after a replacement. The algebraic pass was exhibiting O(n^2) behavior in dEQP-GLES2.functional.uniform_api.random.3 and dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13 (along with other code-generated tests, and likely real-world loop-unroll cases). In the process of using fmul(b2f(x), b2f(x)) -> b2f(iand(x, y)) to transform: result = b2f(a == b); result = b2f(c == d); ... result = b2f(z == w); -> temp = (a == b) temp = temp && (c == d) ... temp = temp && (z == w) result = b2f(temp); nir_opt_algebraic, proceeding bottom-to-top, would match and convert the top-most fmul(b2f(), b2f()) case each time, leaving the new b2f to be matched by the next fmul down on the next time algebraic got run by the optimization loop. Back in 2016 in `7be8d07732` ("nir: Do opt_algebraic in reverse order."), Matt changed algebraic to go bottom-to-top so that we would match the biggest patterns first. This helped his cases, but I believe introduced this failure mode. Instead of reverting that, now that we've got the automaton, we can update the automaton's state recursively and just re-process any instructions whose state has changed (indicating that they might match new things). There's a small chance that the state will hash to the same value and miss out on this round of algebraic, but this seems to be good enough to fix dEQP. Effects with NIR_VALIDATE=0 (improvement is better with validation enabled): Intel shader-db runtime -0.954712% +/- 0.333844% (n=44/46, obvious throttling outliers removed) dEQP-GLES2.functional.uniform_api.random.3 runtime -65.3512% +/- 4.22369% (n=21, was 1.4s) dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13 runtime -68.8066% +/- 6.49523% (was 4.8s) v2: Use two worklists, suggested by @cwabbott, to cut out a bunch of tricky code. Runtime of uniform_api.random.3 down -0.790299% +/- 0.244213% compared to v1. v3: Re-add the nir_instr_remove() that I accidentally dropped in v2, fixing infinite loops. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-26 10:13:46 -08:00
Eric Anholt	90ad6304bf	nir: Refactor algebraic's block walk My motivation was to clarify the changes in the following commit, but incidentally, it reduces runtime of dEQP-GLES2.functional.uniform_api.random.3 (an algebraic-heavy testcase) by -5.39524% +/- 2.21179% (n=15) Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-26 10:13:40 -08:00
Connor Abbott	305d1300f9	nir: Maintain the algebraic automaton's state as we work. In order to have nir_opt_algebraic be able to do further algebraic work on the output of a replacement, we need to maintain the automaton's state. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-26 10:13:19 -08:00
Jonathan Marek	2da4a58ed9	etnaviv: support 3d/array/integer formats in texture descriptors Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-26 19:07:04 +01:00
Jonathan Marek	7806e058c9	etnaviv: blt: fix partial ZS clears with TS If not all bits are cleared, then BLT needs to be given the current clear value and not the new one. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-26 19:04:51 +01:00
Daniel Schürmann	7cd548d352	aco: don't value-number instructions from within a loop with ones after the loop. Fixes: Wolfenstein:Youngblood (w/o shader_ballot) dEQP-VK.descriptor_indexing.combined_image_sampler_in_loop_with_lod Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-26 14:39:27 +00:00
Rhys Perry	46420dd294	aco: set dlc/glc correctly for image loads Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-11-26 14:39:27 +00:00
Rhys Perry	37843e454e	aco: allow constant offsets for global/scratch instructions on GFX10 I don't think the bug applies for global/scratch instructions and load_barycentric_at_sample selection expects this feature to work. Fixes various dEQP-VK.pipeline.multisample_interpolation.* tests on GFX10. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-11-26 14:39:27 +00:00
Bas Nieuwenhuizen	02375b8436	radv: Enable VK_KHR_buffer_device_address. Still no capture/replay or multi device support. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-26 11:59:52 +00:00
Samuel Pitoiset	34dd4251e2	radv: fix reporting subgroup size with VK_KHR_pipeline_executable_properties Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-26 10:48:48 +01:00
Bas Nieuwenhuizen	25bc9102d8	radv: Allocate cmdbuffer space for buffer marker write. Fixes: `946193ae00` "radv: add support for VK_AMD_buffer_marker" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-26 09:35:02 +00:00
Gert Wollny	e41958e344	r600: Disable eight bit three channel formats Commit `0899bf55` made some deqp-gles3 tests related to RGB8 PBOs fail on R600 because it exposed PIPE_FORMAT_R8G8B8_UNORM and R600 doesn't properly handle this. Disabling this format also for buffers fixes the issue. In addition, disabling also the related RGB8 integer formats for buffers fixes some deqp-gles3 tests: dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb8ui_cube dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8i_2d dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8i_cube dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8ui_2d dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb8ui_cube dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8i_2d_array dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8i_3d dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8ui_2d_array dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb8ui_3d dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8i_2d_array dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8i_3d dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8ui_2d_array dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb8ui_3d Fixes: `0899bf55` st/mesa: Map MESA_FORMAT_RGB_UNORM8 <-> PIPE_FORMAT_R8G8B8_UNORM Closes #2118 Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-26 09:28:52 +01:00
Samuel Pitoiset	f6770b9726	ac/llvm: fix warning in ac_build_canonicalize() ../src/amd/llvm/ac_llvm_build.c: In function ‘ac_build_canonicalize’: ../src/amd/llvm/ac_llvm_build.c:4567:9: warning: ‘intr’ may be used uninitialized in this function [-Wmaybe-uninitialized] 4567 | return ac_build_intrinsic(ctx, intr, type, params, 1, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 4568 | AC_FUNC_ATTR_READNONE); | ~~~~~~~~~~~~~~~~~~~~~~ ../src/amd/llvm/ac_llvm_build.c:4567:9: warning: ‘type’ may be used uninitialized in this function [-Wmaybe-uninitialized] Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-26 08:35:10 +01:00
Tapani Pälli	5d58fea660	mapi: add GetInteger64vEXT with EXT_disjoint_timer_query From EXT_disjoint_timer_query spec: "Interaction: This extension adds GetInteger64vEXT if OpenGL ES 3.0 is not supported" See https://github.com/KhronosGroup/OpenGL-Registry/issues/326. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2090 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-26 07:41:24 +02:00
Jason Ekstrand	200a3301e2	vulkan: Update the XML and headers to 1.1.129 Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-26 02:48:42 +00:00
Jason Ekstrand	854859fefa	anv/entrypoints: Better handle promoted extensions In the case of promoted extensions we can end up with an entrypoint that we support being an alias of an entrypoint we do not support. For instance, if an extension gets promoted from EXT to KHR, the EXT entrypoints may be aliases of the KHR ones. We want to leave everything as EXT until we get around to advertising the KHR so that we don't break things when we update the XML and headers. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-26 02:48:42 +00:00
Jason Ekstrand	121551bfdb	vulkan/enum_to_str: Handle out-of-order aliases The current code can only handle enum aliases if the original enum is declared first followed by the alias as we walk the XML in a linear fashion. This commit allows us to handle aliases where the alias declaration comes before the thing it's aliasing. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-26 02:48:42 +00:00
Kenneth Graunke	f6aa51103b	iris: Update SURFACE_STATE addresses when setting sampler views We may have replaced the backing storage for a texture buffer while it was unbound, at which point iris_rebind_buffer would not have caught it and updated it. We need to ensure that the current resource's address matches the one our SURFACE_STATE points at. If not, update addresses and re-upload the SURFACE_STATE. Shader images and buffers do not suffer from this problem because we re-stream the surface state on every set call, since there isn't a created CSO object for those with a saved SURFACE_STATE. Constant buffers are also currently re-streamed (we pitch the SURFACE_STATE on every set_constant_buffer call). Surfaces would need this treatment (as they're created CSOs) except that we never swap out their backing storage today (we only do it for buffers), so it's OK for now. Fixes misrendering in Unreal 4 demos (Elemental, Matinee Fight Scene). Huge thanks to Andrii Simiklit for tracking down the problem - it was quite difficult to find! Also fixes Andrii's new Piglit test for the bug, 'arb_texture_buffer_object-re-init'. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1365	2019-11-25 15:54:54 -08:00
Kenneth Graunke	060a2c52fa	iris: Maintain CPU-side SURFACE_STATE copies for views and surfaces. When replacing the backing storage for texture buffers, image buffers, and so on, we may need to update the "Surface Base Address" field in any corresponding SURFACE_STATE. This is easier to accomplish if we have a copy on the CPU - we can just compare the current field, update it, and re-upload. This patch adds a CPU-side copy to the new iris_surface_state wrapper struct, and reworks allocation and upload to fill things out on the CPU copy first, then upload that to the GPU when finished. This will be necessary to fix iris_invalidate_resource bugs shortly. Technically, we never replace the backing storage for pipe_surfaces (render targets), so we don't need to make this change there. However, it's nice to have surfaces, sampler views, and image views handled similarly. Plus, if we ever wanted to swap out backing storage for busy textures, we'd need this infrastructure. v2: Properly free memory (caught by Andrii Simiklit)	2019-11-25 15:54:54 -08:00
Kenneth Graunke	2b09e818dc	iris: Create an "iris_surface_state" wrapper struct Today, we only have a state reference to the GPU buffer containing our uploaded SURFACE_STATEs. However, we're going to want a CPU-side copy soon. Making a wrapper struct means we can talk about both together, and also put both in the field called "surface_state".	2019-11-25 15:54:54 -08:00
Kenneth Graunke	4c1f81ad62	iris: Drop 'old_address' parameter from iris_rebind_buffer We can just compare the VERTEX_BUFFER_STATE address field to the current BO's address. When calling rebind, we've already updated the resource to the new buffer, but the state will have the old address.	2019-11-25 15:54:54 -08:00
Kenneth Graunke	518be59c1a	iris: Stop mutating the resource in get_rt_read_isl_surf(). Mutating fields of global resources is generally not safe, and the only reason we were doing it was to avoid passing an extra parameter to the fill_surface_state helper.	2019-11-25 15:54:54 -08:00
Marek Olšák	b02e0d2604	radeonsi/nir: don't run si_nir_opts again if there is no change 0.3% less overhead Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-25 16:48:27 -05:00
Marek Olšák	4675cb2019	radeonsi: initialize the per-context compiler on demand This takes a noticeable amount of time in piglit and some tests don't need it. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-25 16:48:27 -05:00
Marek Olšák	f671cc4d95	ac: set swizzled bit in cache policy as a hint not to merge loads/stores LLVM now merges loads and stores for all opcodes, so this must be set. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-25 16:48:27 -05:00
Eric Anholt	8afab607ac	nir: Add a scheduler pass to reduce maximum register pressure. This is similar to a scheduler I've written for vc4 and i965, but this time written at the NIR level so that hopefully it's reusable. A notable new feature it has is Goodman/Hsu's heuristic of "once we've started processing the uses of a value, prioritize processing the rest of their uses", which should help avoid the heuristic otherwise making such systematically bad choices around getting texture results consumed. Results for v3d: total instructions in shared programs: 6497588 -> 6518242 (0.32%) total threads in shared programs: 154000 -> 152828 (-0.76%) total uniforms in shared programs: 2119629 -> 2068681 (-2.40%) total spills in shared programs: 4984 -> 472 (-90.53%) total fills in shared programs: 6418 -> 1546 (-75.91%) Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> (v1) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (v2) v2: Use the DAG datastructure, fold in the scheduling-for-parallelism patch, include SSA defs in live values so we can switch to bottom-up if we want. v3: Squash in improvements from Alejandro Piñeiro for getting V3D to successfully register allocate on GLES3.1 dEQP. Make sure that discards don't move after store_output. Comment spelling fix.	2019-11-25 21:12:21 +00:00
Jonathan Marek	5159db60fc	etnaviv: implement 64bpp clear At the same time, update etna_clear_blit_pack_rgba to work with integer formats. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-25 20:23:22 +01:00
Jonathan Marek	2214f99c07	etnaviv: avoid using RS for 64bpp formats At the same time, this change allows using BLT for 8bpp formats Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-25 20:22:43 +01:00
Christian Gmeiner	92d5e3c692	etnaviv: add support for extended pe formats Use the extended format if such a format was passed. v1 -> v2: - set FORMAT_MASK bit when using ext PE format as suggested by Wladimir J. van der Laan Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-11-25 20:12:52 +01:00
Christian Gmeiner	396818fd9d	etnaviv: handle 8 byte block in tiling Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-11-25 20:11:30 +01:00
Samuel Pitoiset	2af39c719e	radv: select the depth decompress path based on the aspect mask Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-25 16:29:23 +01:00
Samuel Pitoiset	905c005561	radv: create decompress pipelines for separate depth/stencil layouts No functional changes as the driver still uses the depth+stencil pipeline. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-25 16:29:21 +01:00
Samuel Pitoiset	faa58201f3	radv: rework creation of decompress/resummarize meta pipelines This refactoring will help for creating more decompress pipelines. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-25 16:29:18 +01:00
Samuel Pitoiset	8f0fb38825	radv: set the image view aspect mask before resolves No functional changes, but it will be used to decompress separate depth/stencil aspects. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-25 16:29:16 +01:00
Samuel Pitoiset	9dec90b7bc	radv: set the image view aspect mask during subpass transitions No functional changes because the aspect mask is still not used during image transitions but it will be needed for the separate depth/stencil aspects logic. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-25 16:29:13 +01:00
Rhys Perry	459bc77763	aco: enable load/store vectorizer Totals from affected shaders: SGPRS: 1890373 -> 1900772 (0.55 %) VGPRS: 1210024 -> 1215244 (0.43 %) Spilled SGPRs: 828 -> 828 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 252 -> 252 (0.00 %) dwords per thread Code Size: 81937504 -> 74608304 (-8.94 %) bytes LDS: 746 -> 746 (0.00 %) blocks Max Waves: 230491 -> 230158 (-0.14 %) In NieR:Automata and GTA V, the code decrease is especially large: -13.79% and -15.32%, respectively. v9: rework the callback function v10: handle load_shared/store_shared in the callback Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v9)	2019-11-25 13:59:11 +00:00
Rhys Perry	0a759c3be6	nir: add load/store vectorizer tests v7: run nir_opt_algebraic v9: rework the callback function v9: update alignment on all loads/stores, even if they're not vectorized v10: add tests for 64-bit offsets v10: add tests for signed offsets Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v9)	2019-11-25 13:59:11 +00:00
Rhys Perry	ce9205c03b	nir: add a load/store vectorization pass This pass combines intersecting, adjacent and identical loads/stores into potentially larger ones and will be used by ACO to greatly reduce the number of memory operations. v2: handle nir_deref_type_ptr_as_array v3: assume explicitly laid out types for derefs v4: create less deref casts v4: fix shared boolean vectorization v4: fix copy+paste error in resources_different v4: fix extract_subvector() to pass nir_load_store_vectorize_test.ssbo_load_intersecting_32_32_64 v4: rebase v5: subtract from deref/offset instead of scheduling offset calculations v5: various non-functional changes/cleanups v5: require less metadata and preserve more v5: rebase v6: cleanup and improve dependency handling v6: emit less deref casts v6: pass undef to components not set in the write_mask for new stores v7: fix 8-bit extract_vector() with 64-bit input v7: cleanup creation of store write data v7: update align correctly for when the bit size of load/store increases v7: rename extract_vector to extract_component and update comment v8: prevent combining of row-major matrix column accesses v9: rework process_block() to be able to vectorize more v9: rework the callback function v9: update alignment on all loads/stores, even if they're not vectorized v9: remove entry::store_value, since it will not be updated if it was from a vectorized load v9: fix bug in subtract_deref(), causing artifacts in Dishonored 2 v9: handle nir_intrinsic_scoped_memory_barrier v10: use nir_ssa_scalar v10: handle non-32-bit offsets v10: use signed offsets for comparison v10: improve create_entry_key_from_offset() v10: support load_shared/store_shared v10: remove strip_deref_casts() v10: don't ever pass NULL to memcmp v10: remove recursion in gcd() v10: fix outdated comment v11: use the new nir_extract_bits() v12: remove use of nir_src_as_const_value in resources_different v13: make entry key hash function deterministic v13: simplify mask_sign_extend() v14: add comment in hash_entry_key() about hashing pointers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v9)	2019-11-25 13:59:11 +00:00
Rhys Perry	b3a3e4d1d2	radv: set alignment for load_ssbo/store_ssbo in meta shaders Otherwise, nir_intrinsic_align() will assert when called on the intrinsics Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-25 13:59:11 +00:00
Rhys Perry	c14f823ee5	nir: add nir_num_variable_modes and nir_var_mem_push_const These will be useful in the upcoming load/store vectorizer. v11: rebase Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-25 13:59:11 +00:00
Connor Abbott	01eb6ef870	aco: Make unused workgroup id's 0 It shouldn't matter, but the 1 was leftover from when it was handled together with workgroup_size and num_work_groups. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-25 14:17:51 +01:00
Connor Abbott	bb78f9b4e4	aco: Use common argument handling Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-25 14:17:51 +01:00
Connor Abbott	e7f4cadd02	radv: Replace supports_spill with explicit_scratch_args The former was always true and hence dead code. We will want to explicitly declare the ring offset register with ACO, but we also want to declare the scratch offset too, and we can't try to disable it since ACO also supports spilling and the determination of whether spilling has to happen occurs well after setting up registers. So replace supports_spill with something that will actually be used for ACO. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-25 14:17:51 +01:00
Connor Abbott	4d6676d78a	aco: Make num_workgroups and local_invocation_ids one argument each To match the LLVM argument setup code. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-25 14:17:51 +01:00
Connor Abbott	a7f1c63442	aco: Split vector arguments at the beginning Due to how LLVM works we have to make some of the FS inputs become vectors, and therefore have to split them early so that they don't take up extra register pressure due to how RA currently works. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-25 14:17:51 +01:00
Connor Abbott	b45c54ff8d	aco: Use radv_shader_args in aco_compile_shader() Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-25 14:17:51 +01:00
Connor Abbott	680b086db1	aco: Constify radv_nir_compiler_options in isel It's already const for everything else. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-25 14:17:51 +01:00
Connor Abbott	66c703b3e8	radv: Move argument declaration out of nir_to_llvm Now it's executed for ACO too. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-25 14:17:51 +01:00
Connor Abbott	3b143369a5	ac/nir, radv, radeonsi: Switch to using ac_shader_args Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2019-11-25 14:17:10 +01:00
Connor Abbott	9885af3bdf	ac: Add a shared interface between radv, radeonsi, LLVM and ACO ac_shader_args will be similar to ac_shader_abi, except for being free from LLVM-specific concepts and therefore capable of being shared between LLVM and ACO. This will help us accomplish a few different things: - Decouple setting up SGPR and VGPR arguments from translating to LLVM, so that we can reference these arguments in NIR lowering passes, which will let us lower e.g. descriptor sets in NIR. - Stop using radv-specific structures for things like determining the chip generation in ACO. In the end, we should replace ac_shader_abi with this structure + driver-specific lowering passes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-25 14:12:46 +01:00
Connor Abbott	43da33c169	radv: Rename ac_arg_regfile We'll duplicate this in a header file in the next commit, and then remove the original enum. Just rename it temporarily so that things keep building. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-25 14:12:46 +01:00
Danylo Piliaiev	29081c671f	drirc: Add glsl_zero_init workaround for GpuTest GiMark benchmark from GpuTest has such code in VS: out vec4 lightDir0; out vec4 lightDir1; ... lightDir0.xyz = lp0 - vVertex.xyz; lightDir1.xyz = lp1 - vVertex.xyz; In FS: float distSqr = dot(lightDir0, lightDir0); So due to the usage of uninitialized .w channel in the dot product, distSqr may become undefined which results in many black dots in the test on Iris. In https://www.geeks3d.com/forums/index.php/topic,6242.0.html developer stated that this benchmark most likely won't be updated. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1919 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-25 12:22:37 +02:00
Samuel Pitoiset	d6db858771	meson: only build imgui when needed Only required for Intel tools or the Vulkan overlay layer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-25 07:51:56 +00:00
Samuel Pitoiset	bfb307aea9	ac/llvm: fix the local invocation index for wave32 Fixes dEQP-VK.compute.builtin_var.local_invocation_index with RADV_PERFTEST=cswave32. My initial fix was to lower it but Rhys suggested the shift-right and it's much better like this. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-25 07:25:48 +00:00
Samuel Pitoiset	b99295fb33	radv: disable subgroup shuffle operations on GFX10 They are broken like on GFX6-GFX7. It seems better to disable them instead of enabling a broken feature. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-25 08:03:24 +01:00
Dave Airlie	1c5dc4eaf9	docs: add llvmpipe to ARB_query_buffer_object.	2019-11-25 12:37:58 +10:00
Dave Airlie	506e51b856	llvmpipe: initial query buffer object support. (v2) This fails a couple of piglits due to other bugs in llvmpipe, but it adds support for the feature properly. v2: don't reset pipestats, just recalc, fix CI expectation	2019-11-25 12:37:32 +10:00
Timothy Arceri	f54c4e85ce	radv: create a fresh fork for each pipeline compile In order to prevent a potentially malicious pipeline from tainting our secure compile process and interfering with successive pipelines, we want to create a fresh fork for each pipeline compile. Benchmarking has shown that simply forking on each pipeline creation doubles the total time it takes to compile a fossilize db collection. So instead here we fork the process at device creation so that we have a slim copy of the device and then fork this otherwise idle and untainted process each time we compile a pipeline. Forking this slim copy of the device results in only a 20% increase in compile time vs a 100% increase. Fixes: `cff53da3` ("radv: enable secure compile support")	2019-11-25 10:10:14 +11:00
Timothy Arceri	1663bb1f77	radv: add a secure_compile_open_fifo_fds() helper This will be used to create a communication pipe between the user facing device and a freshly forked (per pipeline compile) slim copy of that device. We can't use pipe() here because the fork will not be a direct fork of the user facing process. Instead we use a previously forked copy of the process that was forked at device creation in order to reduce the resources required for the fork and avoid performance issues. Fixes: `cff53da374` ("radv: enable secure compile support")	2019-11-25 10:10:14 +11:00
Timothy Arceri	ef54f15da9	radv: add some infrastructure for fresh forks for each secure compile In the following commits we want to be able to fork an existing lightweight fork created at device creation time. In order for the user facing process to communicate with this new fresh fork we create some members here to hold FIFO file descriptors and a unique id. Here we also add a new fork enum that we use to tell the lightweight process to create a fresh fork. For more information on why we create a fresh fork see the following commits.	2019-11-25 10:10:14 +11:00
Brian Paul	a2689ebcd6	nir: no-op C99 _Pragma() with MSVC This fixes a build failure on MSVC. BTW, it looks like clang supports _Pragma() but I don't know if it understands the "gcc unroll N" directive. Signed-off-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-11-23 10:34:24 -07:00
Michel Zou	95fdde5a60	Meson: Add llvm>=9 modules Fixes build with MinGW, with shared LLVM and lto /tmp/opengl32.dll.BxiIYm.ltrans59.ltrans.o:<artificial>:(.text+0x1674): undefined reference to `LLVMAddInstructionCombiningPass' See also scons/llvm.py Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-11-23 16:09:52 +00:00
Michel Zou	02d63ee5a4	disk_cache_get_function_timestamp: check for dladdr instead of dlopen Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-23 12:01:11 +01:00
Michel Zou	bfd9f3201e	Meson: Check for dladdr with MinGW Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-23 12:01:11 +01:00
Marek Olšák	ad40715f35	nir/serialize: support any num_components for remaining instructions Only NPOT vectors greater than vec4 use the extra uint32. This is for instructions that share the dest code. load_const and undef already support 1-16 in the header. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	c028449c01	nir/serialize: use 3 unused bits in intrinsic for packed_const_indices Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	3d44aed09e	nir/serialize: don't serialize redundant nir_intrinsic_instr::num_components Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	a2df670b14	nir/serialize: serialize writemask for vec8 and vec16 Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	a5c5388234	nir/serialize: serialize swizzles for vec8 and vec16 Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	f1a48d54ea	nir/serialize: reuse the writemask field for 2 src X swizzles of SSA ALU Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	487a495cc0	nir/serialize: remove up to 3 consecutive equal ALU instruction headers vec4 scalarized ALUs typically have 4 equal instruction headers, so remove the last 3. There are no bits left in the ALU header for more flags, so future extensions of NIR will have to use something like instr_type == 15 to describe more complex ALU instructions. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	c3fa9de2a9	nir/serialize: try to pack both deref array src into 32 bits Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	ed6b01d5e0	nir/serialize: cleanup - fold nir_deref_type_var cases into switches Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	a0cd67d292	nir/serialize: try to put deref->var index into the unused bits of the header Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	ca201bfe70	nir/serialize: don't serialize mode for deref non-cast instructions It can be derived from src and var. This frees 10 bits in the header that will be used later. "mode" is moved in the structure, because those bits will be used for something else later. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	2286340fde	nir/serialize: don't store deref types if not needed - type_cast: deduplicate types if the last one is the same - derive the type from the parent for other derefs Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	70a7f85149	nir/serialize: try to pack two alu srcs into 1 uint32 Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	ef4630cf4f	nir/serialize: pack nir_intrinsic_instr::const_index[] better Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	d3346b275a	nir/serialize: pack 1-component constants into 20 bits if possible The majority of constants can be packed like this. v2: - use enum for the packing encoding, - trim packed_value to 20 bits add 1 bit to last_component, which simplifies a later commit Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	75f7c38863	nir/serialize: pack load_const with non-64-bit constants better v2: use blob_write_uint8/16 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	a572ba673b	nir/serialize: try to store a diff in var data locations instead of var data Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	c8314678ee	nir/serialize: deduplicate serialized var types by reusing the last unique one Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	545415f45f	nir/serialize: don't serialize var->data for temporaries Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	c358c2b2bf	nir/serialize: pack src better and limit the object count to 1M from 1G We need to limit the object count to 1M to free 10 bits for the src modifiers. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	35655865cb	nir/serialize: pack instructions better Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Marek Olšák	4fe1d7822b	util/blob: add 8-bit and 16-bit reads and writes Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-23 00:02:10 -05:00
Eric Anholt	59b489f44b	ci: Use a tag from the parallel-deqp-runner repo. If the repo continues development, we don't want to accidentally pick up potentially breaking changes on our next container rebuild. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-22 15:37:04 -08:00
Rob Clark	215866523b	gitlab-ci/freedreno/a6xx: remove most of the flakes xfb + lines/points still flakes too frequently (and the problem isn't even related to xfb), but we can add the rest back into this mix now. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-22 13:48:29 -08:00
Rob Clark	9f422cbe1c	gitlab-ci/deqp: generate junit results Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-22 13:48:29 -08:00
Rob Clark	415d565d96	gitlab-ci/deqp: generate xml results for fails/flakes Extract .qpa for the individual unexpected results and flakes, and translate to xml, preserved with the artifacts. This allows easy browsing of the test logs for fails/flakes, for easier debugging. The # of logs to preserve is capped at 50 to avoid saving 100s of megabytes of logs in case someone pushes a change that breaks everything. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-22 13:48:29 -08:00
Rob Clark	8af7551a9e	gitlab-ci: bump arm test container To pick up updated cts_runner and netcat for the flake reporting. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-22 13:48:29 -08:00
Rob Clark	fdaf777076	gitlab-ci/deqp: detect and report flakes If there are a small number of fails, re-run to determine if they are flakes, and optionally (if `$FLAKES_CHANNEL` configured) report the flakes. This way flakes don't interfere with developers working on other drivers, but get logged so that the developers working on the flaking driver can monitor the situation. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-22 13:48:29 -08:00
Rob Clark	cc6484f164	gitlab-ci/deqp: preserve caselists for blocks with fails Bump cts_runner to pick up the change to preserve .qpa and caselist .txt files for blocks of tests that contain fails, and preserve the caselist files. To reproduce fails that depend on order of running tests, these are useful. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-22 13:48:29 -08:00
Rob Clark	59ed90fc74	gitlab-ci/deqp: preserve full list of unexpected results The log only shows the first 50, but preserve the full list for easier browsing. (Also move return of exit code to end which makes later patches in the series easier) Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-22 13:48:29 -08:00
Rob Clark	5fa397a0d9	gitlab-ci: update deqp build so we can generate xml Update the deqp build to preserve testlog-to-xml and stylesheets, so deqp runner can extract .qpa for failed/flaked tests, and convert to xml. With this, we will be able to browse output from failed tests directly from the artifacts. The main motivation is to give better visibility into what happens with flaked tests, when it is difficult/impossible to reproduce the flake locally (i.e. when it happens once out of N million tests). But this should also make it easier to debug regressions that an MR triggers, especially when it is on hw that you don't have. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-22 13:48:29 -08:00
Markus Wick	dba903ed0b	drirc: Enable glthread for dolphin/citra/yuzu. Dolphin: 75 fps -> 88 fps - Super Mario Galaxy Citra: 81 fps -> 91 fps - A Link Between Worlds Yuzu: 21 fps -> 27 fps - Super Mario Odyssey Dolphin still has many syncs because of glFenceSync and glClientWaitSync. Moving them to the dispatcher thread might yield another speedup. Yuzu uses a compatibility profile by default. This benchmark used the variable MESA_GL_VERSION_OVERRIDE=4.5FC to override this behavior. This profiling was done on a mobile i7-8550U CPU with i965. Signed-off-by: Markus Wick <markus@selfnet.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-22 15:29:29 -05:00
Markus Wick	f4c61d422d	mesa/glthread: Implement ARB_multi_bind. Signed-off-by: Markus Wick <markus@selfnet.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-22 15:29:07 -05:00
Rhys Perry	517728477c	aco: fix waitcnts for barriers at block ends Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `d1b9deee` ('aco: improve waitcnt insertion around loops') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-22 19:56:31 +00:00
Zebediah Figura	a3c8bc10aa	Revert "draw: revert using correct order for prim decomposition." This reverts commit `f97b731c82`. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/250 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-11-22 20:37:42 +01:00
Kenneth Graunke	acd36e488d	iris: Change keybox parenting For temporary lookups, just allocate out of the NULL ralloc context, so we don't have to edit the linked list of ralloc children to add it and then immediately remove it again. When uploading a new shader, allocate the keybox off the shader, so if we delete the shader the keybox also goes away. Less manual cleanup.	2019-11-22 09:50:59 -08:00
Ian Romanick	ca353285cb	nir/range_analysis: Make sure the table validation only occurs once All of the tables are static const, so they only need to be validated once. As noted in the previous commit, the compiler should be able to eliminate all of this code when the assertions would pass. Even with the help of the previous commit, this does not always occur. -Og: -95.688 +/- 3.91935 (-24.9562% +/- 1.0222%) N=5 -O1: No difference proven at 95.0% confidence. N=5 -O2: -1.962 +/- 0.85001 (-0.860013% +/- 0.372589%) N=5 Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-22 08:16:06 -08:00
Ian Romanick	ccefce46cb	nir/range-analysis: Add pragmas to help loop unrolling I was pretty liberal with these assertions when I wrote this code because I had assumed that GCC would unroll the loops, inline the look ups of static const arrays with now constant indices, and then eliminate all the actual assertions. It seems none of this happens even at -O3. Adding the pragmas helps encourage loop unrolling at some optimization levels. I tested by running shader-db with NIR_VALIDATE=false on a Core i7 Haswell desktop system. -Og: No difference proven at 95.0% confidence. N=5 -O1: -48.304 +/- 1.221 (-16.3343% +/- 0.412888%) N=5 -O2: -49.94 +/- 1.23521 (-17.9634% +/- 0.444303%) N=5 v2: Add a _Pragma to an inner loop that was accidentally dropped during a rebase. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-22 08:16:06 -08:00
Danylo Piliaiev	25a00b449f	glsl: Add varyings to "zero-init of uninitialized vars" workaround Varyings are similar to already handled cases. And "glsl_zero_init" name of the workaround already looks like it should include varyings. The issue was observed in GiMark subtest from GpuTest. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-22 15:25:56 +00:00
Alyssa Rosenzweig	4c43b354c3	pan/midgard: Use lower_tex_without_implicit_lod Just a bit of cleanup. lower_tex can do this lowering for us, which should also eliminate some special cases (one less thing to fix if we ever need texturing in tess/geom/etc, perhaps?) Closes #2133 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-11-22 08:38:57 -05:00
Christian Gmeiner	47c7c4263c	etnaviv: use a more self-explanatory param name Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-22 10:47:13 +00:00
Christian Gmeiner	a949fa9d5d	etnaviv: drop not used config_out function param Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-22 10:47:13 +00:00
Samuel Pitoiset	6f7ec6ee39	gitlab-ci: reduce the number of scons build It seems overkill to me to build scons 7x for every pipeline. Scons is now built with the oldest llvm version in scons-old-llvm and with the newest llvm version in scons. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-22 10:39:21 +01:00
Alyssa Rosenzweig	2e14fe6490	panfrost: Add lcra.c to Android.mk This was forgotten. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-11-22 05:07:19 +00:00
Alyssa Rosenzweig	bda2bb31b1	pan/midgard: Enable LOD lowering only on buggy chips T720 and earlier need this workaround, so check the quirk before lowering. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-11-22 05:07:19 +00:00
Alyssa Rosenzweig	68c2c7962a	pan/midgard: Describe quirk MIDGARD_BROKEN_LOD Corresponds to errata #10471, applies to T6xx and T720. Fixed in T760. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-11-22 05:07:19 +00:00
Alyssa Rosenzweig	d32d4acf68	pan/midgard: Add LOD bias/clamp lowering We fetch the info with the new intrinsic and lower with ALU ops for txl instructions, which seemingly correspond to "TEXGRD" instructions (what we call textureLod). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-11-22 05:07:19 +00:00
Alyssa Rosenzweig	4e07e7b232	pan/midgard: Implement load_sampler_lod_paramaters_pan We can stuff this information in as parametrized system values, like we currently do texture size and SSBO addresses. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-11-22 05:07:19 +00:00
Alyssa Rosenzweig	deaebc82a7	nir: Add load_sampler_lod_paramaters_pan intrinsic This loads in the <min_lod, max_lod, lod_bias> settings for a given sampler, which is necessary for lowering clamps/biases on certain Midgard chips. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-11-22 05:07:19 +00:00
Markus Wick	b1156ecdf2	mapi/glapi: Generate sizeof() helpers instead of fixed sizes. Generating source code with a fixed size leads to issues with platform-dependent types. We either hard code 4 or 8 bytes there, and both are wrong on the other platform. So this patch solves this issue by generating e.g. sizeof(GLsizeiptr), which is valid on both 32-bit and 64-bit platforms. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-11-21 22:52:55 -05:00
Ian Romanick	e51eda99df	intel/fs: Disable conditional discard optimization on Gen4 and Gen5 The CMP instruction on Gen4 and Gen5 generates one bit (the LSB) of valid data and 31 bits of junk. Results of comparisons that are used as Boolean values need to have a fixup applied to generate the proper 0/~0 values. Calling fs_visitor::nir_emit_alu with need_dest=false prevents the fixup code from being generated. This results in a sequence like: cmp.l.f0.0(16) g8<1>F g14<8,8,1>F 0x0F /* 0F */ ... cmp.l.f0.0(16) g4<1>F g6<8,8,1>F 0x0F /* 0F */ (+f0.1) or.z.f0.1(16) null<1>UD g4<8,8,1>UD g8<8,8,1>UD instead of cmp.l.f0.0(16) g8<1>F g14<8,8,1>F 0x0F /* 0F */ ... cmp.l.f0.0(16) g4<1>F g6<8,8,1>F 0x0F /* 0F */ or(16) g4<1>UD g4<8,8,1>UD g8<8,8,1>UD (+f0.1) and.z.f0.1(16) null<1>UD g4<8,8,1>UD 1UD I examined a couple of the shaders hurt by this change, and ALL of them would have been affected by this bug. :( Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1836 Fixes: `0ba9497e66` ("intel/fs: Improve discard_if code generation") Iron Lake total instructions in shared programs: 8122757 -> 8122957 (<.01%) instructions in affected programs: 8307 -> 8507 (2.41%) helped: 0 HURT: 100 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.84% max: 6.67% x̄: 2.81% x̃: 2.76% 95% mean confidence interval for instructions value: 2.00 2.00 95% mean confidence interval for instructions %-change: 2.58% 3.03% Instructions are HURT. total cycles in shared programs: 188510100 -> 188510376 (<.01%) cycles in affected programs: 76018 -> 76294 (0.36%) helped: 0 HURT: 55 HURT stats (abs) min: 2 max: 12 x̄: 5.02 x̃: 4 HURT stats (rel) min: 0.07% max: 3.75% x̄: 0.86% x̃: 0.56% 95% mean confidence interval for cycles value: 4.33 5.71 95% mean confidence interval for cycles %-change: 0.60% 1.12% Cycles are HURT. 
GM45 total instructions in shared programs: 4994403 -> 4994503 (<.01%) instructions in affected programs: 4212 -> 4312 (2.37%) helped: 0 HURT: 50 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.84% max: 6.25% x̄: 2.76% x̃: 2.72% 95% mean confidence interval for instructions value: 2.00 2.00 95% mean confidence interval for instructions %-change: 2.45% 3.07% Instructions are HURT. total cycles in shared programs: 128928750 -> 128928982 (<.01%) cycles in affected programs: 67442 -> 67674 (0.34%) helped: 0 HURT: 47 HURT stats (abs) min: 2 max: 12 x̄: 4.94 x̃: 4 HURT stats (rel) min: 0.09% max: 3.75% x̄: 0.75% x̃: 0.53% 95% mean confidence interval for cycles value: 4.19 5.68 95% mean confidence interval for cycles %-change: 0.50% 1.00% Cycles are HURT.	2019-11-21 16:40:50 -08:00
Dylan Baker	bba44ef176	docs: update calendar, add news item and link release notes for 19.2.6	2019-11-21 16:34:00 -08:00
Dylan Baker	3531d74e82	docs: Add SHA256 sum for 19.2.6	2019-11-21 16:32:35 -08:00
Dylan Baker	f8070577a4	docs: Add release notes for 19.2.6	2019-11-21 16:32:34 -08:00
Marek Olšák	0b1452ffdd	nir/serialize: do ctx = {0} instead of manual initializations Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-21 18:49:57 -05:00
Marek Olšák	ff71fae440	nir: strip as we serialize to remove the nir_shader_clone call Serializing stripped NIR is faster now. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-21 18:49:57 -05:00
Christian Gmeiner	8acaab1aa7	etnaviv: add drm-shim Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-21 22:56:04 +00:00
Eric Engestrom	609a6ae23e	vk_util: drop duplicate formats in vk_format_map[] Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-21 22:52:40 +00:00
Jonathan Marek	773d640efa	turnip: implement UBWC This enables UBWC for everything except 3D textures. It breaks many image_to_image copies but those aren't important and it can be worked around later (image_to_image copy needs to be done in two steps, decode from the source format and then encode to the destination format). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-21 22:21:57 +00:00
Jonathan Marek	91fd83d142	freedreno/regs: update UBWC related bits Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-21 22:21:57 +00:00
Vinson Lee	6613a4a029	swr: Fix build with llvm-10.0. Fix build error after llvm-10.0 commit 1dfede3122ee ("Move CodeGenFileType enum to Support/CodeGen.h"). ../src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp: In member function ‘void JitManager::DumpAsm(llvm::Function*, const char*)’: ../src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp:428:45: error: ‘CGFT_AssemblyFile’ is not a member of ‘llvm::TargetMachine’ *pMPasses, filestream, nullptr, TargetMachine::CGFT_AssemblyFile); ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2019-11-21 13:20:08 -08:00
Rhys Perry	29d131d619	aco: fix copy+paste error Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-21 20:28:57 +00:00
Rhys Perry	d1b9deeea8	aco: improve waitcnt insertion around loops Do this by repeating processing of loops until no progress is made. Totals from affected shaders: SGPRS: 162576 -> 162576 (0.00 %) VGPRS: 145228 -> 145228 (0.00 %) Spilled SGPRs: 668 -> 668 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 15778640 -> 15771336 (-0.05 %) bytes LDS: 146 -> 146 (0.00 %) blocks Max Waves: 6087 -> 6087 (0.00 %) v2: use block_kind_loop_header/block_kind_loop_exit to repeat at the end of loops instead of at each continue Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-21 20:28:57 +00:00
Rob Clark	1a8c49d76c	freedreno/perfctrs/fdperf: periodically restore counters When GPU is idle and suspends, the currently selected countables will all reset to the first one. So periodically restore the selected countables. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-21 20:01:03 +00:00
Rob Clark	5a13507164	freedreno/perfcntrs: add fdperf Port from the envytools tree, but converted to use the .c tables for describing the perfcounter groups/countables, rather than using rnndec to get this at runtime from the register xml. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-21 20:01:03 +00:00
Rob Clark	b2338a5b00	freedreno/perfcntrs/a6xx: remove RBBM counters Currently these are getting blocked by the kernel. These counters don't seem to be the most useful ones, and to use them we'd have to somehow probe the kernel by submitting cmdstream to write the selector regs and see if that triggers a GPU fault. So let's just skip them. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-21 20:01:03 +00:00
Rob Clark	6a517b3079	freedreno/perfctrs/a2xx: move CP to be first group fdperf expects this, to find the ALWAYS_COUNT counter Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-21 20:01:03 +00:00
Rob Clark	e35c4e6ad2	freedreno/perfcntrs: add accessor to get per-gen tables Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-21 20:01:03 +00:00
Rob Clark	b21f03ae7e	freedreno/perfcntrs: move to shared location This should eventually be useful for VK_KHR_performance_query as well. And in the more near term, for fdperf. Attempt to not break android build is best-effort and untested. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-21 20:01:03 +00:00
Rob Clark	6727114cba	freedreno/perfcntrs: remove gallium dependencies Prep work to move to a shared location. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-21 20:01:02 +00:00
Rob Clark	3fb6aaf42e	freedreno/perfcntrs: small cleanup When we had one gen supporting performance counters, it made sense to have these builder macros in the .c file with the table. But time has come to de-duplicate. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-21 20:01:02 +00:00
Dave Airlie	cce07ea835	nir: fix deref offset builder Use the correct bit size Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-22 04:37:41 +10:00
Dave Airlie	7325f6ac98	vtn/opencl: add clz support This is needed for OpenCL Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-22 04:37:41 +10:00
Dave Airlie	e3b21dfcb1	nouveau: request ufind_msb64 lowering in the frontend. This passes the piglit CL builtin-ulong-clz-1.0.generated.cl test. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-11-22 04:37:41 +10:00
Dave Airlie	d0d96053e6	nir: add 64-bit ufind_msb lowering support. (v2) This adds the option to lower 64-bit ufind_msb opcodes. v2: use split_x/y removes component loops (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-22 04:37:37 +10:00
Dave Airlie	12913bcf86	spirv/nir/opencl: handle some multiply instructions. This adds support for some missing 24-bit and hi multiply variants. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-22 04:37:25 +10:00
Dave Airlie	5375c30234	spirv: get the correct type for function returns. This needs to be derived from the address format, not always 1/32. Suggested by Jason Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-22 04:37:25 +10:00
Dave Airlie	b62a925ad1	spirv: don't store 0 to cs.ptr_size for non kernel stages. cs is a union so storing this there is wrong. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-22 04:37:25 +10:00
Jonathan Marek	1496e1164f	util: add missing R8G8B8A8_SRGB format to vk_format_map Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-21 17:46:27 +00:00
Elie Tournier	72b44d148d	docs: fix ascii html representation v2 (Eric): Use more readable ascii version Signed-off-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-11-21 16:51:18 +00:00
Elie Tournier	64d7bd96b8	Docs: remove duplicate meson docs for windows This block is duplicated, we already have the windows instruction above. Signed-off-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-11-21 16:51:18 +00:00
Eric Anholt	dd76a6f198	ci: Move freedreno's parallelism to the runner instead of gitlab-ci jobs. I set the runners to concurrency=1, so they serve only one gitlab-ci job at a time. Swap over to using the parallel runner now to keep the runners busy, more efficiently than spawning many docker containers and downloading artifacts multiple times, and producing easier-to-understand results for browsing on the web. This bumps the a306 runners to 4x parallel instead of 2x like before, but cheza gles3 drops from 6 to 4. Current rough timings of the jobs (if no container download): db410c-gles2: 5:00 a630-gles2: 1:30 a630-gles3: 6:00 a630-gles31: 5:30 a630-gles3 is a bit longer than I like, but it should come back down once I can sort out the NIR algebraic rewinding.	2019-11-21 05:48:17 -08:00
Iago Toral Quiroga	c573b50179	glsl: add missing initialization of the location path field This was apparently missed in `67b32190f3`, which added support for ARB_shading_language_include to #line, including the 'path' field for the location. Fixes crashes in CTS with all drivers as they attempt to access an uninitialized path string during parsing. Fixes: `67b32190f3` ("glsl: add ARB_shading_language_include support to #line") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2132 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>	2019-11-21 12:55:15 +01:00
Rhys Perry	1a0500cd04	docs: update features.txt for RADV [skip ci] Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-21 11:00:50 +00:00
Michel Dänzer	32618ee719	gitlab-ci: Directly use host-mapped directory for ccache Use hardcoded /cache/mesa/ccache for the cache, so it will be shared by all jobs of all Mesa projects running on the same runner host. This should increase the hit rate and decrease the worst case storage used. Further benefits of directly using a host-mapped directory: * Saves up to ~1 minute per job for restoring and saving the cache contents via the GitLab CI cache mechanism * Cache contents generated by failed jobs are no longer lost * Jobs running in parallel on the same runner host can get hits from each other Also enable compression, so the default maximum cache size of 5G might be sufficient. v2: * Move CCACHE_DIR variable to the .build-linux template Suggested-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net> # v1	2019-11-21 10:13:43 +01:00
Samuel Pitoiset	0d1085ac4a	gitlab-ci: remove now useless meson-swr-glvnd build job All things are already part of meson-main. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-11-21 09:35:05 +01:00
Samuel Pitoiset	7362176cfe	gitlab-ci: build GLVND in meson-clang Building GLVND in meson-main doesn't work because this disables libEGL and it's needed for running shader-db. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-11-21 09:35:05 +01:00
Samuel Pitoiset	e6d26d77a3	gitlab-ci: build swr in meson-main Now that debugoptimized isn't set and that all test jobs depend on meson-testing, enabling swr shouldn't slow down the CI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-11-21 09:35:05 +01:00
Samuel Pitoiset	6cf9b53fa2	gitlab-ci: do not build with debugoptimized for meson-main This should reduce compile time because optimizations are costly. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-11-21 09:35:05 +01:00
Samuel Pitoiset	66b5627074	gitlab-ci: add a job that only build things needed for testing For turnip and RADV testing, we will need a debugoptimized build without UBSAN. This introduces meson-testing which builds only the things that are needed by the test stage. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-11-21 09:35:04 +01:00
Samuel Pitoiset	eab328fbe9	gitlab-ci: fix ldd check for Vulkan drivers The 'dri' directory isn't created when building Vulkan drivers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-11-21 09:34:08 +01:00
Samuel Pitoiset	24dd730efc	gitlab-ci: move building piglit into a separate script Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-11-21 09:33:39 +01:00
Samuel Pitoiset	8fc8e8e8be	pipe-loader: check that the pointer to driconf_xml isn't NULL This happens when mesa is built with only swrast. The default driver being kmsro and the default driconf file being v3d, it's NULL and then strdup crashes. This fixes a crash with piglit spec/egl_mesa_query_driver/conformance. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-21 07:34:20 +01:00
Alyssa Rosenzweig	046097c092	panfrost: Add the lod_bias field Enough trial and error ... just think even more Midgard about where this field might be! Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-21 06:05:12 +00:00
Timothy Arceri	cd6322366d	compiler: move build definition of pp_standalone_scaffolding.c This should fix android build issues while still allowing scons to build the standalone compiler. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2129 Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2019-11-21 16:07:08 +11:00
Karol Herbst	5934a53bfe	nir/validate: validate num_components on registers and intrinsics also make 8 and 16 components invalid. We will enable that later again when we actually support it. v2: fix validation of nir_intrinsic_instr::num_components correct validation of instr->num_components Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-21 01:10:24 +01:00
Mark Janes	eae8dfef58	Revert "st/mesa: keep serialized NIR instead of nir_shader in st_program" This reverts commit `db0c89d4bf`. Gitlab: mesa/mesa#2128 Acked-by: Marek Olšák <maraeo@gmail.com>	2019-11-20 15:22:32 -08:00
Mark Janes	f1f19b6445	Revert "st/mesa: call nir_serialize only once per shader" This reverts commit `3a8d686889`. Acked-by: Marek Olšák <maraeo@gmail.com>	2019-11-20 15:22:32 -08:00
Arno Messiaen	721d82cf06	lima/ppir: add lod-bias support Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com>	2019-11-20 22:24:00 +00:00
Jason Ekstrand	2fca325ea6	Revert "i965/fs: Merge CMP and SEL into CSEL on Gen8+" This reverts commit `52c7df1643`. The pass, while clearly useful for some shaders, has at least three bugs that I was able to find fairly quickly: 1. It doesn't work for type-converting MOVs because f > 0 is not the same as f2i(f) > 0 2. CSEL is a 3src instruction and only supports one source type; it doesn't take this into account and tries to create instructions which do a F compare and a D select. This is especially nasty to debug because you don't see that in the dumped assembly because we don't properly assert that types are the same in codegen. 3. While you can handle 2, in theory, by reinterpreting types, you can't do that in the presence of source modifiers. This pass doesn't even attempt to detect that. Those are just the ones I found with the one almost trivial shader I was debugging. There very likely may be more. Best thing to do for now is just shut it off until someone has the time to figure out how to do this properly and write tests to ensure it's correct. Fixes: 3cb085e6d61a "i965/fs: Merge CMP and SEL into CSEL on Gen8+" Reviewed-by: Brian Paul <brianp@vmware.com>	2019-11-20 20:47:32 +00:00
Daniel Schürmann	8d7621a53f	radv: Enable Subgroup Arithmetic and Clustered for SI This patch also allows enabling VK_AMD_shader_ballot on SI. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-20 20:31:45 +00:00
Daniel Schürmann	0cbcfc071e	amd/llvm: Add Subgroup Scan functions for SI The idea of this implementation is taken from the ROCm Device Libs: https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-20 20:31:45 +00:00
Andreas Baierl	fca2d3ce3f	lima/streamparser: Add findings introduced with gl_PointSize Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>	2019-11-20 19:24:12 +00:00
Andreas Baierl	804c295039	lima/streamparser: Fix typo in vs semaphore parser Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>	2019-11-20 19:24:12 +00:00
Yevhenii Kolesnikov	9af22ccddc	meson: Fix linkage of libgallium_nine with libgalliumvl Do not link libgallium_nine with libgalliumvl_stub if it's already linked with libgalliumvl. Linking with stub leads to "duplicate symbol" errors. Fixes: `6b4c7047d5` ("meson: build gallium nine state_tracker") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2040 Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-11-20 19:16:20 +00:00
Dylan Baker	bcfc9c0fec	docs/release-calendar: Update for extended 19.3 rc period	2019-11-20 09:57:05 -08:00
Dylan Baker	ff21acc91c	docs: update calendar, add news item and link release notes for 19.2.5	2019-11-20 09:22:29 -08:00
Dylan Baker	d35429239b	docs/relnotes/19.2.5: Add SHA256 sum	2019-11-20 09:19:02 -08:00
Dylan Baker	6567b2daa9	docs: Add relnotes for 19.2.5	2019-11-20 09:19:00 -08:00
Rhys Perry	ca2de7ae9c	nir/large_constants: use nir_index_vars and nir_variable::index Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-20 15:05:42 +00:00
Rhys Perry	9f92e8b721	nir: add nir_variable::index and nir_index_vars This will be useful as a deterministic identifier/index for the variable. v2: fix comment style Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v1)	2019-11-20 15:05:42 +00:00
Rhys Perry	45a0b53490	nir: make nir_variable::{num_members,num_state_slots} a uint16_t Doesn't shrink it (at least, on x86-64) and leaves space for more members. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-20 15:05:42 +00:00
Samuel Pitoiset	645332f3f5	docs: add missing new features for RADV [skip ci] Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-20 16:04:15 +01:00
Hyunjun Ko	02f4c39b8d	freedreno/ir3: enable half precision for pre-fs texture fetch Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Hyunjun Ko	407f8c71d3	freedreno/ir3: fixup when changing to mad.f16 Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Hyunjun Ko	d0f38394b1	freedreno/ir3: fix printing output registers of FS. Fixes: `cea39af2fb` ("freedreno/ir3: Generalize ir3_shader_disasm()") Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Neil Roberts	37f5395783	freedreno/ir3: Enabling lowering 16-bit flrp Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Hyunjun Ko	35124b0311	freedreno: support 16b for the sampler opcode Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Neil Roberts	b934716bd8	freedreno/ir3: Implement f2b16 and i2b16 Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Neil Roberts	030b046df8	freedreno/ir3: Add implementation of nir_op_b16csel Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Neil Roberts	f0a046024d	freedreno/ir3: Support 16-bit comparison instructions v2. [Hyunjun Ko (zzoon@igalia.com)] Avoid using too much open code like "instr->regs[n]->flags |= FOO" v3. [Hyunjun Ko (zzoon@igalia.com)] Remove redundant code for both 16b and 32b operations. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Hyunjun Ko	138542499f	freedreno/ir3: cleanup by removing repeated code Prep-work for the corresponding patch. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-20 14:09:43 +01:00
Neil Roberts	f6b5abe91a	nir/lower_alu_to_scalar: Support lowering 8- and 16-bit reduce ops Reviewed-by: Rob Clark <robdclark@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-20 14:09:43 +01:00
Neil Roberts	634eb9c04b	nir: Add a 8-bit bool type Adds nir_type_bool8 as well as 8-bit versions of all the bool opcodes. Reviewed-by: Rob Clark <robdclark@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-20 14:09:43 +01:00
Neil Roberts	0f5640c577	nir: Add a 16-bit bool type Adds nir_type_bool16 as well as 16-bit versions of all the bool opcodes. Reviewed-by: Rob Clark <robdclark@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-20 14:09:43 +01:00
Neil Roberts	2ec97e78a9	nir/opcodes: Add a helper function to generate reduce opcodes Adds binop_reduce_all_sizes which generates both 1-bit and 32-bit versions of the reduce operation. This reduces the code duplication a bit and will make it easier to later add 16-bit versions as well. Reviewed-by: Rob Clark <robdclark@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-20 14:09:43 +01:00
Neil Roberts	9a96afb97e	nir/opcodes: Add a helper function to generate the comparison binops Adds binop_compare_all_sizes which generates both 1-bit and 32-bit versions of the comparison operation. This reduces the code duplication a bit and will make it easier to later add 16-bit versions as well. Reviewed-by: Rob Clark <robdclark@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-20 14:09:43 +01:00
Samuel Pitoiset	7ecd8a3471	radv: enable VK_KHR_shader_subgroup_extended_types on GFX6-GFX7 Most of the DEQP-VK.subgroups tests are skipped because 16-bit floats aren't supported, but the others pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-20 11:09:58 +00:00
Alejandro Piñeiro	b4bc59e37e	v3d: adds an extra MOV for any sig.ld* Specifically when we are in non-uniform control flow, as we would need to set the condition for the last instruction. If (for example) an image atomic load stores its value directly to a NIR register, last_inst would be a nop, and setting the condition would fail. Fixes piglit test: spec/glsl-es-3.10/execution/cs-ssbo-atomic-if-else-2.shader_test Fixes: `6281f26f06` ("v3d: Add support for shader_image_load_store.") v2: (Changes suggested by Eric Anholt) * Cover all sig.ld* signals, not just ldunif and ldtmu, as all of them have the same restriction. * Update comment explaining why we add a MOV in that case * Tweak commit message. v3: * Drop extra set of parens (Eric) * Add missing ld signal to is_ld_signal to fix shader-db regression. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-20 11:21:16 +01:00
Jose Maria Casanova Crespo	d983055184	v3d: Fix predication with atomic image operations Fixes dEQP test: dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.image_atomic_multiple_interleaved_write_read Fixes piglit test: spec/glsl-es-3.10/execution/cs-image-atomic-if-else.shader_test Fixes: `6281f26f06` ("v3d: Add support for shader_image_load_store.") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-20 11:20:55 +01:00
Tomeu Vizoso	36b099a7b0	panfrost: Don't print the midgard_blend_rt structs on SFBD Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-20 08:04:25 +01:00
Tomeu Vizoso	2dc720cb2c	gitlab-ci: Fix dir name for VK-GL-CTS sources Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-20 08:03:44 +01:00
Tomeu Vizoso	409f6c40ca	panfrost: Rework buffers in SFBD Support cases such as depth-only renders and only set stencil buffers when needed, to match the blob's behaviour. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-20 08:03:36 +01:00
Tomeu Vizoso	697f02c2a1	panfrost: Just print tiler fields as-is for Tx20 The tiler unit in these GPUs is quite different and we haven't reverse engineered enough of it yet to validate and pretty print it. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-20 08:00:41 +01:00
Alyssa Rosenzweig	fcf144d96a	pan/midgard: Introduce quirks checks Rather than open-coding checks on gpu_id in the compiler, let's track quirks applying to whatever we're compiling for, to allow us to manage the complexity of many heterogeneous GPUs in the compiler. It was discovered that a workaround used on T720 is also required on T820 (and presumably T830), so let's fix this. This will also decrease friction as we continue improving T720 support. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-11-20 07:41:39 +01:00
Timothy Arceri	614fba0ce1	gitlab-ci: update for arb_shading_language_include	2019-11-20 05:05:56 +00:00
Timothy Arceri	530d3b2900	gitlab-ci: bump piglit checkout commit	2019-11-20 05:05:56 +00:00
Timothy Arceri	af432be538	mesa: enable ARB_shading_language_include Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/999 Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:56 +00:00
Timothy Arceri	49cdbba9f6	mesa: implement glCompileShaderIncludeARB() Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:56 +00:00
Timothy Arceri	bad2c77aa8	mesa: add shader include lookup support for relative paths Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:56 +00:00
Timothy Arceri	1201d3377e	mesa: add cursor support for relative path shader includes This will allow us to continue searching the current path for relative shader includes. From the ARB_shading_language_include spec: "If it is quoted with double quotes in a previously included string, then the first search point will be the tree location where the previously included string had been found." Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:56 +00:00
Timothy Arceri	db5197cec5	glsl: delay compilation skip if shader contains an include If the shader contains an include, we need to first run the preprocessor before deciding if we can skip compilation based on the shader cache. Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:56 +00:00
Timothy Arceri	17df8f8b5d	glsl: add can_skip_compile() helper We will reuse this in the following commit. Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:56 +00:00
Timothy Arceri	5327b756bf	glsl: error if #include used while extension is disabled In other words make sure the shader does this: Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	13a1426b97	glsl: add preprocessor #include support Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	e0fd2fa689	glsl: pass gl_context to glcpp_parser_create() This is a small tidy up and will be useful in the following commit. Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	67b32190f3	glsl: add ARB_shading_language_include support to #line From the ARB_shading_language_include spec: "#line must have, after macro substitution, one of the following forms: #line <line> #line <line> <source-string-number> #line <line> "<path>" where <line> and <source-string-number> are constant integer expressions and <path> is a valid string for a path supplied in the #include directive. After processing this directive (including its new-line), the implementation will behave as if it is compiling at line number <line> and source string number <source-string-number> or <path> path. Subsequent source strings will be numbered sequentially, until another #line directive overrides that numbering." Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	2497c51717	mesa: implement glDeleteNamedStringARB() Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	f2d01cac7e	mesa: split _mesa_lookup_shader_include() in two The new local function lookup_shader_include() will be used by glDeleteNamedStringARB() in the following patch. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	ae2e41841f	mesa: implement glGetNamedStringivARB() Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	575137e613	mesa: implement glIsNamedStringARB() Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	fafda32127	mesa: make error checking optional in _mesa_lookup_shader_include() This will be useful when implementing glIsNamedStringARB(), which doesn't do error checking; it just returns false for invalid lookups instead. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	a47bfbe189	mesa: implement glGetNamedStringARB() Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	fc573c9816	mesa: add glNamedStringARB() support Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	628d34fddd	mesa: add copy_string() helper This will be used by the various ARB_shading_language_include functions in the following patches. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	8acab84f93	mesa: add _mesa_lookup_shader_include() helper This will be used both by the glsl compiler and the GL API. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	643a533fc2	mesa: add helper to validate and tokenise shader include path Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	06f33d82ca	mesa: add ARB_shading_language_include infrastructure to gl_shared_state Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	35108caa71	glsl: add infrastructure for ARB_shading_language_include Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Timothy Arceri	906f1a2933	mesa: add ARB_shading_language_include stubs Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-11-20 05:05:55 +00:00
Bas Nieuwenhuizen	4eb2a1dc6f	radv: Do not change scratch settings while shaders are active. When the scratch ringbuffer settings are changed, the shader unit has to be idle or we will have shaders using old and new settings. That combination is not supported on the HW (likely the offset is ringbuffer idx * WAVESIZE * 1024). CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-20 01:18:36 +00:00
Eric Anholt	bdf03b738d	turnip: Drop the copy of the formats table. Now that we can (mostly) generate a pipe format for a VkFormat, use that to answer queries about formats. This will let us refactor the freedreno format table surface layout code to be shared between gallium and vulkan. This causes us to expose fewer formats for now (on a 1/100 CTS run I'm doing, skips go from 3671 to 3835 out of 5145 tests). Fails stay about the same (478 -> 434, but the run is pretty flaky and we're doing fewer tests now). v2: Rebase on master, throw a finishme on missing vk-to-pipe formats that tu used to support. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> (v1) Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-11-19 15:35:52 -08:00
Eric Anholt	3a28281bf8	util: Add a mapping from VkFormat to PIPE_FORMAT. I'm planning on using this from radv and tu for queries about formats. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-11-19 15:35:52 -08:00
Marek Olšák	36c055c9b7	winsys/amdgpu: detect noop dependencies on the same ring correctly Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:32:56 -05:00
Marek Olšák	e7fb9c73a7	ac: fill num_rings for remaining IPs Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:31:53 -05:00
Marek Olšák	e9cc4f670f	ac: add radeon_info::num_rings and move ring_type to amd_family.h Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:31:53 -05:00
Marek Olšák	654efd38bb	nir: don't use GLenum16 in nir.h Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-19 18:20:12 -05:00
Marek Olšák	ec7d37c9c0	nir: move data.descriptor_set above data.index for better packing 4 bytes down Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-19 18:20:10 -05:00
Marek Olšák	b160acb9f5	glsl_to_nir: rename image_access to mem_access Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-19 18:20:09 -05:00
Marek Olšák	193e2c9625	nir/print: only print image.format for image variables Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-19 18:20:07 -05:00
Marek Olšák	ebe7579655	nir: move data.image.access to data.access The size of the data structure doesn't change. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-19 18:20:05 -05:00
Marek Olšák	3a8d686889	st/mesa: call nir_serialize only once per shader It was called twice. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:02:06 -05:00
Marek Olšák	db0c89d4bf	st/mesa: keep serialized NIR instead of nir_shader in st_program This decreases memory usage, because serialized NIR is more compact. If shader_has_one_variant is true and the shader is uncached, the first variant is created from nir_shader, otherwise the first variant and all other variants are created from serialized NIR. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:02:06 -05:00
Marek Olšák	610fb0e19c	st/mesa: call nir_sweep in st_finalize_nir This is invoked sooner before (pre-)compiling the first variant and is also applied to fixed-func and ARB programs.	2019-11-19 18:02:06 -05:00
Marek Olšák	4e70cba638	st/mesa: subclass st_vertex_program for VP-specific members Inheritance: gl_program -> st_program -> st_vertex_program Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:02:06 -05:00
Marek Olšák	16e5f13b64	st/mesa: more cleanups after unification of st_vertex/common_program Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:02:06 -05:00
Marek Olšák	6b3d72b041	st/mesa: rename occurrences of stcp to stp to correspond to st_program Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:02:06 -05:00
Marek Olšák	1375217116	st/mesa: cleanups after unification of st_vertex/common program Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:02:06 -05:00
Marek Olšák	5fed208285	st/mesa: rename st_common_program to st_program Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:02:06 -05:00
Marek Olšák	2e39e8b972	st/mesa: trivially merge st_vertex_program into st_common_program A later commit will add back st_vertex_program as a subclass of st_common_program. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:02:06 -05:00
Marek Olšák	c97df7b4c7	st/mesa: consolidate and simplify code flagging program::affected_states Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:02:06 -05:00
Marek Olšák	f71e93db0a	st/mesa: initialize affected_states and uniform storage earlier in deserialize This matches the uncached codepath. affected_states was used before initialization, which was technically a bug, but probably not reproducible due to _NEW_PROGRAM rebinding everything. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:02:06 -05:00
Marek Olšák	60398e2d45	st/mesa: start deduplicating some program code Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:02:06 -05:00
Marek Olšák	445ec0fc63	st/mesa: decrease the size of st_fp_variant_key from 48 to 40 bytes Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:02:06 -05:00
Marek Olšák	2c8652f98a	st/mesa: rename delete_basic_variant -> delete_common_variant Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-19 18:02:06 -05:00
Eric Engestrom	51e214c1db	anv: add missing "fall-through" annotation CoverityID: 1455884 Fixes: `c1c346f166` ("anv: implement VK_KHR_separate_depth_stencil_layouts") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-19 22:03:00 +00:00
Eric Engestrom	99788de909	egl: use EGL_CAST() macro in eglmesaext.h Allows eglmesaext.h to be used in C++ code. This aligns this file with the rest of EGL. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-By: Tapani Pälli <tapani.palli@intel.com>	2019-11-19 22:00:24 +00:00
Eric Engestrom	344859c32d	vulkan: delete typo'd header Two files exist in that directory: - vulkan_xlib_randr.h - vulkan_xlib_xrandr.h Both were imported in `205c271562` ("vulkan: Update the XML and headers to 1.1.70") with identical contents (ie. the VK_EXT_acquire_xlib_display extension), but the former was never included anywhere and can't be found upstream [1], while the latter is included in vulkan.h and found upstream. [1] https://github.com/KhronosGroup/Vulkan-Headers/tree/master/include/vulkan Fixes: `205c271562` ("vulkan: Update the XML and headers to 1.1.70") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-19 21:56:22 +00:00
Eric Engestrom	0d69c2e932	CL: sync C++ headers with Khronos https://github.com/KhronosGroup/OpenCL-CLHPP at commit cf9fc1035e8298c7ce65ee33066a660fd9892ebb Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-19 21:50:26 +00:00
Eric Engestrom	a15aef0d39	CL: sync C headers with Khronos https://github.com/KhronosGroup/OpenCL-Headers at commit 0d5f18c6e7196863bc1557a693f1509adfcee056 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-19 21:50:25 +00:00
Rafael Antognolli	dadb6ebbd1	intel: Add workaround for stencil state. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-11-19 21:43:09 +00:00
Jonathan Marek	d2cf3cad91	turnip: fix sRGB GMEM clear Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-19 21:35:37 +00:00
Jonathan Marek	d68acdb3b9	turnip: implement CmdClearColorImage/CmdClearDepthStencilImage Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-19 21:35:37 +00:00
Rhys Perry	7eb7969213	radv/aco: enable VK_KHR_shader_subgroup_extended_types We could enable it on GFX10 if LLVM wasn't used as a fallback for unsupported stages. Note that the CTS only tests it if VK_KHR_shader_float16_int8 is enabled, even though it's not a requirement. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-19 18:58:04 +00:00
Rhys Perry	56c06c79fc	aco: implement 64-bit integer reductions The multiplication reduction is larger than it could be, but it should be easier to implement this way. No failures with dEQP-VK.subgroups.int64 except those caused by LLVM being used for other stages. v2: don't call setFixed() for v_add carry-out, since setHint sets physReg v3: add and use emit_vadd32() helper v4: use num_opcodes instead of last_opcode Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v3)	2019-11-19 18:58:04 +00:00
Rhys Perry	33277bd66e	aco: refactor reduction lowering helpers Should make 64-bit integer reductions easier to implement. v4: use num_opcodes instead of last_opcode Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v3)	2019-11-19 18:56:21 +00:00
Samuel Pitoiset	c93f2cefd5	radv: advertise VK_KHR_shader_subgroup_extended_types on GFX8-GFX9 This extension allows subgroup operations to be used with 8-bit and 16-bit types. It is untested on GFX6-GFX7, and most subgroup operations are broken on GFX10, so don't enable it there for now. Not enabled on ACO because it still doesn't support 8-bit/16-bit types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	80c71cbbd8	ac: add 16-bit float support to ac_build_alu_op() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	670aa24c69	ac: add 8-bit and 16-bit support to ac_build_optimization_barrier() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	21a9243f5e	ac: add 8-bit and 16-bit support to ac_build_wwm() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	ef352a2466	ac: add 8-bit and 16-bit support to get_reduction_identity() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	c8af1d51d4	ac: add 8-bit and 16-bit support to ac_build_swizzle() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	1565118d8f	ac: add 8-bit and 16-bit support to ac_build_dpp() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	2113867f0c	ac: add 8-bit and 16-bit support to ac_build_set_inactive() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	c29514bd22	ac: add 8-bit and 16-bit support to ac_build_readlane() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	58d5ab98a3	ac: add 8-bit and 16-bit support to ac_build_shuffle() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	204cf54b70	ac: remove useless cast in ac_build_set_inactive() The return type is always the src type (32 or 64 bits). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Samuel Pitoiset	194bee193c	spirv: fix lowering of OpGroupNonUniformAllEqual It should rely on the source type, not on the return type which is always a boolean anyways, so vote_feq was never selected. For OpSubgroupAllEqualKHR it's always an integer comparison. This fixes some VK_KHR_shader_subgroup_extended_types tests with RADV. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-19 18:01:13 +00:00
Tomeu Vizoso	2941a734a0	gitlab-ci: Remove limit on kernel logging We don't seem to fault any more when running dEQP GLES2, and we don't scrape serial output any more anyway so no problems should be caused by that. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-19 15:39:13 +01:00
Pierre-Eric Pelloux-Prayer	99f0feb9e2	mesa: fix warning in 32 bits build Fixes: `febedee4f6` ("mesa: add EXT_dsa glGetVertexArray* 4 functions") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer	3a5a55e5a5	mesa: enable EXT_direct_state_access Always enabled; this doesn't require any driver work, it's just core mesa bits. quick_gl.txt is also updated because previously piglit ext_dsa tests were skipped. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer	1ef297645c	mesa: add ARB_sparse_buffer NamedBufferPageCommitmentEXT function The spec is unclear on how to handle the buffer argument so we reuse the logic from the EXT_direct_state_access spec. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer	8b6d19413f	mesa: add ARB_vertex_attrib_binding glVertexArray* functions We can't simply alias ARB_direct_state_access functions because those fail if the vao has never been bound before. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer	657396aa10	mesa: extend vertex_array_attrib_format to support EXT_dsa Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer	bb2241bf06	mesa: implement ARB_texture_storage_multisample + EXT_dsa functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer	a0d667036d	mesa: add ARB_texture_buffer_range glTextureBufferRangeEXT function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer	b78e2a197a	mesa: add ARB_instanced_arrays EXT_dsa function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer	a807b8c0a8	mesa: add ARB_gpu_shader_fp64 selector-less functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer	e3385eb0c1	mesa: add ARB_clear_buffer_object named functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-19 08:49:45 +01:00
Pierre-Eric Pelloux-Prayer	442fd3d007	mesa: add ARB_vertex_attrib_64bit VertexArrayVertexAttribLOffsetEXT Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-19 08:49:44 +01:00
Pierre-Eric Pelloux-Prayer	8cfb3e4ee5	mesa: add ARB_framebuffer_no_attachments named functions The wording in ARB_framebuffer_no_attachments and EXT_direct_state_access is different. In the former framebuffer names must have been generated using glGenFramebuffers before using the named functions. In the latter framebuffer names have no such constraints, so we can't use the _mesa_lookup_framebuffer_dsa function. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-19 08:49:44 +01:00
Pierre-Eric Pelloux-Prayer	dc057f638c	mesa: update features.txt to reflect EXT_dsa status All features from the EXT_dsa spec are implemented. Interactions with other specs: - GL_AMD_gpu_shader_int64: not needed, since it's not enabled in compatibility profile. - GL_ARB_bindless_texture is DONE "INVALID_OPERATION is generated when calling various functions to modify the state of a texture object from which handles have been extracted" - GL_ARB_buffer_storage/GL_EXT_buffer_storage is DONE (NamedBufferStorageEXT function) - GL_ARB_texture_storage is DONE (3 TextureStorageDEXT functions) - GL_ARB_vertex_attrib_binding is DONE (6 VertexArray functions) - GL_EXT_external_buffer is not supported by Mesa Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-19 08:49:44 +01:00
Alyssa Rosenzweig	8b1548a12f	panfrost: Set PIPE_COMPUTE_CAP_ADDRESS_BITS to 64 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-19 06:22:31 +00:00
Alyssa Rosenzweig	9c28700aaf	panfrost: Disable tiling for GLOBAL resources It doesn't make sense to have nonlinear layouts for a buffer that can be accessed as direct memory for a compute kernel. Turn that off so things work as expected. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-19 06:22:31 +00:00
Alyssa Rosenzweig	21dd7574a8	panfrost: Pass kernel inputs as uniforms We can take the OpenCL kernel inputs and interpret them as uniforms by simply reusing the Gallium callback. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-19 06:22:31 +00:00
Alyssa Rosenzweig	a7b5dd1290	panfrost: Stub out clover callbacks We don't implement these yet but let's not crash. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-19 06:22:31 +00:00
Miguel Casas-Sanchez	b196958574	i965: Ensure that all 2101010 image imports can pass framebuffer completeness. Chrome OS would like to import and render to any supported format that has a corresponding display plane format, and this prevents throwing framebuffer incomplete for FBOs using these textures. See: crbug.com/949260 Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-19 02:21:12 +00:00
Dave Airlie	1468a4f1f3	nir/serialize: fix serializing functions with no implementations. Store a flag stating if there was an implementation, and use fxn->impl as a temporary flag between deserialization stages. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-19 09:30:32 +10:00
Dave Airlie	0fd6b8aa98	nir/serialize: pack function has name and entry point into flags. Suggested by Jason. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-19 09:30:12 +10:00
Jason Ekstrand	fc72df1d93	iris: Re-enable param compaction In `d1c4e64a69`, we added a parameter to tell the back-end compiler to ignore the param array and just push however many constants you ask it to push. I enabled it for iris because this is really what iris wants but it seems to have caused a number of regressions. Revert to the old behavior for now. Fixes: `d1c4e64a69` "intel/compiler: Add a flag to avoid compacting..."	2019-11-18 16:54:07 -06:00
Marek Olšák	189c0cc45b	mesa: enable glthread for 7 Days To Die Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-18 17:25:57 -05:00
Iván Briano	ca94717035	intel/compiler: Don't change hstride if not needed Alignment requirements may have changed the horizontal stride already, so don't set it if not required to avoid breaking said requirements. Fixes several tests such as dEQP-VK.subgroups.vote.graphics.subgroupallequal_int8_t Signed-off-by: Iván Briano <ivan.briano@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-18 14:19:41 -08:00
Jonathan Marek	3cd44839fa	turnip: add x11 wsi Copied from radv Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-18 22:18:05 +00:00
Jonathan Marek	df9f2adfa3	turnip: add display wsi Copied from radv (minus the fence change) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-18 22:18:05 +00:00
Jason Ekstrand	7260df5894	nir: Validate that variables are in the right lists Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-18 16:15:30 -06:00
Jonathan Marek	e2b9d6277e	etnaviv: blt: set TS dirty after clear RS engine does this already, it is missing for BLT engine. This fixes cases where a clear isn't immediately at the start of the frame. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-18 20:59:02 +01:00
Jonathan Marek	d819d4b344	etnaviv: separate PE and RS formats, use RS only for tiling There are PE formats not supported by RS, so we can't have a single function to translate both. Use RS only for same formats until we have a translate_rs_format and test the possible different format blits. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-18 20:58:14 +01:00
Jonathan Marek	e1a86bd634	etnaviv: blt: use only for tiling, and add missing formats * Removes the incorrect usage of translate_rs_format * Disables use of BLT engine for different src/dst format We only really need the BLT engine for tiling/detiling right now, but it would be nice to support as many blit cases as possible to avoid using PE for that. To deal with different formats we need to: * Have a translate_blt_format which has all supported formats * Fix the swizzle translation from gallium (current version was wrong) * Set the src/dst sRGB bits as needed * Find which type conversions the BLT engine can actually do Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-11-18 20:57:40 +01:00
Brian Paul	02c3dad0f3	Call shmget() with permission 0600 instead of 0777 A security advisory (TALOS-2019-0857/CVE-2019-5068) found that creating shared memory regions with permission mode 0777 could allow any user to access that memory. Several Mesa drivers use shared-memory XImages to implement back buffers for improved performance. This patch changes the shmget() calls to use 0600 (user r/w). Tested with legacy Xlib driver and llvmpipe. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-18 12:28:59 -07:00
Jason Ekstrand	fdaf8144a8	anv: Emit a NULL vertex for zero base_vertex/instance If both are zero (the common case), we can emit a null vertex buffer rather than emitting a vertex buffer with zeros in it. The packing of the VERTEX_BUFFER_STATE is faster because no relocation is emitted and we can avoid creating the vertex buffer which means one less anv_state_stream_alloc. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	bc9d7836bc	anv: Use an anv_state for the next binding table This is a bit more natural because we're already getting an anv_state most places in the pipeline. The important part here, however, is that we're no longer calling anv_block_pool_map on every alloc_binding_table call. While it's probably pretty cheap, it is potentially a linear walk over the list of BOs and it was showing up in profiles. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	98dc179c1e	anv: More carefully dirty state in BindPipeline Instead of blindly dirtying descriptors and push constants the moment we see a pipeline change, check to see if it actually changes the bind layout or push constant layout. This doubles the runtime performance of one CPU-limited example running with the Dawn WebGPU implementation when running on my laptop. NOTE: This effectively reverts `beca63c6c0`. While it was a nice optimization, it was based on prog_data and we can't do that anymore once we start allowing the same binding table to be used with multiple different pipelines. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	22f16ff54a	anv: More carefully dirty state in BindDescriptorSets Instead of dirtying all graphics or all compute based on binding point, we're now much more careful. We first check to see if the actual descriptor set changed and then only dirty the stages used by that descriptor set. For dynamic offsets, we keep a bitfield per-stage of which offsets are actually used in that stage and we only dirty push constants and descriptors if that stage has dynamic offsets AND those offsets actually change. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	ca8117b5d5	anv: Use a switch statement for binding table setup It theoretically could be more efficient but the real point here is that it's no longer really a matter of dealing with special cases and then the "real" thing. The way we're handling binding tables, it's more of a multi-step process and a switch is more natural. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	9baa33cef0	anv: Rework push constant handling This substantially reworks both the state setup side of push constant handling and the pipeline compile side. The fundamental change here is that we're no longer respecting the prog_data::param array and instead are just instructing the back-end compiler to leave the array alone. This makes the state setup side substantially simpler because we can now just memcpy the whole block of push constants and don't have to upload one DWORD at a time. This also means that we can compute the full push constant layout up-front and just trust the back-end compiler to not mess with it. Maybe one day we'll decide that the back-end compiler can do useful things there again but for now, this is functionally no different from what we had before this commit and makes the NIR handling cleaner. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	ca91ab8015	anv: Re-arrange push constant data a bit This moves the compute stuff into a anv_push_constants::cs sub-struct. It also moves dynamic offsets into the push constants. This means we have to duplicate the data per-stage but that doesn't seem like the end of the world and one day we may wish to make dynamic offsets per-stage anyway. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	d1c4e64a69	intel/compiler: Add a flag to avoid compacting push constants In vec4, we can just not run the pass. In fs, things are a bit more deeply intertwined. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	aecde23519	anv: Pre-compute push ranges for graphics pipelines It turns out that emitting push constants is one of the hottest paths in the driver and ANY work we do there costs us. By pre-computing things a bit ahead of time, we shave 5% off the runtime of a CPU-limited example running with the Dawn WebGPU implementation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	4b392ced2d	anv: Stop bounds-checking pushed UBOs The bounds checking is actually less safe than just pushing the data. If the bounds checking actually ever kicks in and it's not on the last UBO push range, then the shrinking will cause all subsequent ranges to be pushed to the wrong place in the GRF. One of the behaviors we definitely don't want is for OOB UBO access to result in completely unrelated UBOs returning garbage values. It's safer to just push the UBOs as-requested. If we're really concerned about robustness, we can emit shader code to do bounds checking which should be stupid cheap (a CMP followed by SEL). Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	ebad00d9e7	anv: Delete dead shader constant pushing code As of `2d78e55a8c`, nir_intrinsic_load_constant with a constant offset is constant-folded so we should never end up with any that trigger brw_nir_analyze_ubo_ranges. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	0709c0f6b4	anv: Flatten descriptor bindings in anv_nir_apply_pipeline_layout This lets us stop tracking the pipeline layout. It also means less indirection on a very hot path. As an extra bonus, we can make some of our data structures smaller. No measurable CPU overhead improvement. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	fa120cb31c	anv: Input attachments are always single-plane Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	0a02f2a278	genxml: Mark everything in genX_pack.h always_inline Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Jason Ekstrand	abfd4651ed	anv/pipeline: Assume layout != NULL In the early days of the driver we allowed layout to be VK_NULL_HANDLE and used that for some internal pipelines when we wanted to be lazy. Vulkan doesn't actually allow NULL layouts, however, so there's no reason to have this check. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-18 18:35:14 +00:00
Italo Nicola	59623f211b	intel/compiler: remove old comment This comment was correct some time ago, but since commit `d3c10ad427`, it isn't true anymore. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-11-18 10:20:34 -08:00
Alyssa Rosenzweig	3663340049	pan/midgard: Use shader stage in mir_op_computes_derivative A 'normal' texture op may be emitted in a vertex shader on T720 but it still doesn't take any derivatives. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-18 08:48:54 -05:00
Danylo Piliaiev	6f17fe0606	i965: Unify CC_STATE and BLEND_STATE atoms on Haswell as a workaround Re-emitting 3DSTATE_CC_STATE_POINTERS after emitting 3DSTATE_BLEND_STATE_POINTERS fixes the shadow flickering in SuperTuxKart and Tropico 6 which was seen only on Haswell. The reason for this is unknown and the fix was found empirically. The closest mention in the PRM is that it should improve performance. From the HSW PRM, volume 2b, page 823 (3DSTATE_BLEND_STATE_POINTERS): "When the BLEND_STATE pointer changes but not the CC_STATE pointer, driver needs to force a CC_STATE pointer change to improve blend performance in pixel backend." Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1834 Fixes: `eca4a654` ("i965: Disable dual source blending when shader doesn't support it on gen8+") Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-11-18 11:00:23 +02:00
Samuel Pitoiset	1ebd9459e7	radv: implement VK_AMD_device_coherent_memory This extension adds the device coherent and device uncached memory types. It's known to be slower than non-device coherent memory but it might be useful for debugging. This is only exposed for chips that support L2 uncached. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-18 08:20:19 +00:00
Samuel Pitoiset	2af7511ed2	ac: add radeon_info::has_l2_uncached For chips that have uncached device memory (ie. MTYPE_UC). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-18 08:20:19 +00:00
Pierre-Eric Pelloux-Prayer	3c9ea6bdfd	radeonsi: enable mesa_glthread for GfxBench It improves offscreen tests performance. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-18 09:16:18 +01:00
Alyssa Rosenzweig	bc9a7d0699	pan/midgard: Represent ld/st offset unpacked This simplifies manipulation of the offsets dramatically, fixing some UBO access related bugs. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-17 22:19:31 -05:00
Alyssa Rosenzweig	1798f6bfc3	pan/midgard: Fix masks/alignment for 64-bit loads These need to be handled with special care. Oh, Midgard, you're extra special. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-17 22:19:31 -05:00
Alyssa Rosenzweig	34a860b9e3	pan/midgard: Expose more typesize helpers Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-17 21:30:14 -05:00
Alyssa Rosenzweig	2236904f72	pan/midgard: Implement non-aligned UBOs The field is more fine-grained than we had assumed. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-17 21:18:45 -05:00
Christian Gmeiner	ee3ad0fad2	etnaviv: rs: upsampling is not supported This change makes it possible to support different downsample cases like 4 -> 2 or 4 -> 1. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-11-17 18:42:31 +00:00
Jonathan Marek	75e58d1fae	freedreno/registers: fix a6xx_2d_blit_cntl ROTATE A change from `b7093882` got overwritten by `610c8c93` Fixes: `610c8c93` ("freedreno/registers: Update with GS, HS and DS registers") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-17 17:40:53 +00:00
Jonathan Marek	0f5743429c	freedreno/ir3: disable texture prefetch for 1d array textures Prefetch only supports the basic 2D texture case, checking is_array is needed because 1d array textures pass the coord num_components==2 test. Fixes: `2a0d45ae` ("freedreno/ir3: Add a NIR pass to select tex instructions eligible for pre-fetch") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-17 17:01:18 +00:00
Andreas Baierl	ef9635d0bc	lima: Parse VS and PLBU command stream while making a dump This makes the streams more readable and comparable with the blob's parser as it parses the VS and PLBU stream and shows the currently known values. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>	2019-11-17 05:39:17 +00:00
Andreas Baierl	c76eb7ea84	lima: Beautify stream dumps Change the dump so that the output looks more like the output of mali-syscall-tracker [1]. This is a preparation for a more detailed stream analysis. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> [1]: https://gitlab.freedesktop.org/lima/mali-syscall-tracker	2019-11-17 05:39:17 +00:00
Aaron Watry	3b3494174d	clover/llvm: fix build after llvm 10 commit 1dfede3122ee CodeGenFileType moved from ::llvm::TargetMachine in llvm/Target/TargetMachine.h to ::llvm:: in llvm/Support/CodeGen.h Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-11-15 22:54:31 -06:00
Mauro Rossi	09ab297e9f	android: util/format: fix include path list To avoid the following build error: out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_util_intermediates/format/u_format_table.c:30:10: fatal error: 'u_format.h' file not found ^~~~~~~~~~~~ 1 error generated. Fixes: `882ca6d` ("util: Move gallium's PIPE_FORMAT utils to /util/format/") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2019-11-16 00:06:31 +01:00
Mauro Rossi	3cd522c70a	android: radeonsi: fix build error due to wrong u_format.csv file path GEN10_FORMAT_TABLE_INPUTS requires correction of the u_format.csv file path in order to avoid the following build error: ninja: error: 'external/mesa/util/format/u_format.csv', needed by 'out/target/product/x86_64/gen/STATIC_LIBRARIES/libmesa_pipe_radeonsi_intermediates/radeonsi/gfx10_format_table.h', missing and no known rule to make it Fixes: `882ca6d` ("util: Move gallium's PIPE_FORMAT utils to /util/format/") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2019-11-15 23:20:03 +01:00
Eric Anholt	b30589cbd3	mesa/st: Reuse st_choose_matching_format from st_choose_format(). We had this ad-hoc exact size matching for unsized internalformats, but st_choose_matching_format() can do exactly what we want. This means that, for example, we'll now prefer the matching ordering for 565/565_REV if the driver supports both orders. We also pass Unpack.SwapBytes through from ChooseTextureFormat so that we can hit the memcpy path for 8888 formats when that flag is set. Some interesting format choice changes from this (on softpipe): intf/form/type before after ---------------------------------------------------- RGBA/RGBA/USHORT: R8G8B8A8_UNORM -> RGBA_UNORM16 RGB/RGBA/8888: X8B8G8R8_UNORM -> R8G8B8X8_UNORM RGB/ABGR/8888_REV: X8B8G8R8_UNORM -> R8G8B8X8_UNORM RGBA/RGBA/5551: B5G5R5A1_UNORM -> A1B5G5R5_UNORM RGBA/RGBA/4444: R8G8B8A8_UNORM -> A4B4G4R4_UNORM RGBA/GL_RGBA/1010102: R8G8B8A8_UNORM -> A2B10G10R10_UNORM DEPTH/DEPTH/UINT: Z24X8 -> Z_UNORM32 DEPTH/DEPTH/USHORT: Z24X8 -> Z_UNORM16 v2: Make sure that the baseformat still matches. v1 would pick MESA_FORMAT_L16_UNORM for RED/LUMINANCE/SHORT, when we clearly want a red format. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-11-15 20:32:17 +00:00
Eric Anholt	bc2b14a4a3	mesa: Don't put sRGB formats in the array format table. sRGB vs unorm was the only conflict case being guarded against in this function. Before the PIPE_FORMAT conversion, we always listed the unorm before the sRGB in the enums, but PIPE_FORMAT_A8B8G8R8_SRGB happens to be before _UNORM. We always want the unorm result here. Fixes: `807a800d8c` ("mesa: Redefine MESA_FORMAT_* in terms of PIPE_FORMAT_*.") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-11-15 20:32:17 +00:00
Eric Anholt	e5b06008f1	mesa/st: Simplify st_choose_matching_format(). We now have a nice helper function for finding those memcpy formats, without needing to go through each entry of the mesa format table to see if it happens to match. While looking at sysprof of a softpipe GLES2 CTS run, we were spending ~8% of the CPU on ChooseTextureFormat. With this, roughly the same region of the testsuite was .4%. v2: Add Ken's fix for canonicalizing array formats. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-11-15 20:32:17 +00:00
Kenneth Graunke	69f109cc37	mesa: Handle GL_COLOR_INDEX in _mesa_format_from_format_and_type(). Just return MESA_FORMAT_NONE to avoid triggering unreachable; there's really no sensible thing to return for this case anyway. This prevents regressions in the next commit, which makes st/mesa start using this function to find a reasonable format from GL format and type enums. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-15 20:32:17 +00:00
Alyssa Rosenzweig	ea232c7cfd	pan/midgard: Use generic constant packing for 8/64-bit Eventually, we will want to combine constants across types, but for now let's not break the world. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig	4c182a6d11	pan/midgard: Pack 64-bit swizzles 64-bit ops have their own funky swizzles. Let's pack them, both for native 64-bit sources as well as extended 32-bit sources. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig	ba2fb98d36	pan/midgard: Fix mir_round_bytemask_down for !32b Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig	2655a300a3	pan/midgard: Implement i2i64 and u2u64 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig	855eec93b1	pan/midgard: Expand 64-bit writemasks Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-15 20:08:46 +00:00
Marek Olšák	bda3ec5d55	radeonsi/nir: don't lower fma, instead, fuse fma We want fma. This decreases compile times by 4% for Borderlands 2. 48505 shaders in 30515 tests Totals: SGPRS: 2206584 -> 2204784 (-0.08 %) VGPRS: 1647892 -> 1648964 (0.07 %) Spilled SGPRs: 6256 -> 6078 (-2.85 %) Spilled VGPRs: 72 -> 72 (0.00 %) Private memory VGPRs: 2176 -> 2176 (0.00 %) Scratch size: 2240 -> 2240 (0.00 %) dwords per thread Code Size: 49680804 -> 49837988 (0.32 %) bytes LDS: 74 -> 74 (0.00 %) blocks Max Waves: 371387 -> 371352 (-0.01 %) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-15 14:34:49 -05:00
Marek Olšák	dec34e880d	radeonsi/nir: call nir_lower_flrp only once per shader Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-15 14:34:49 -05:00
Marek Olšák	0714b3d57e	radeonsi/nir: remove dead function temps glxgears has dead temps after lowering color inputs to load intrinsics. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-15 14:34:49 -05:00
Marek Olšák	bc5097a7d9	gallium/noop: call finalize_nir For measuring st/mesa compile time. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-15 14:34:49 -05:00
Tomeu Vizoso	27801b90fa	panfrost: Make sure the shader descriptor is in sync with the GL state State was leaking from previous frames as we weren't updating the descriptor in all cases. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com>	2019-11-15 18:37:34 +00:00
Alyssa Rosenzweig	095654e3c2	pan/midgard: Prioritize texture registers On newer GPUs, this is a no-op. On older GPUs, this prevents needless spilling since texture registers are shared with a subset of work registers. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com>	2019-11-15 18:37:34 +00:00
Alyssa Rosenzweig	339401b53c	pan/midgard: Disassemble with old pipeline always on T720 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com>	2019-11-15 18:37:33 +00:00
Alyssa Rosenzweig	8344d7425b	pan/midgard: Use texture, not textureLod, on early Midgard We have to disable the fixup. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com>	2019-11-15 18:37:33 +00:00
Alyssa Rosenzweig	29f5b00e6e	pan/midgard: Fix vertex texturing on early Midgard We use a different set of texture registers, probably to save hardware. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com>	2019-11-15 18:37:33 +00:00
Alyssa Rosenzweig	3866d0776f	pan/midgard: Generalize texture registers across GPUs Early Midgard uses a different set of texture registers; let's not hardcode. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com>	2019-11-15 18:37:33 +00:00
Rhys Perry	df645fa369	aco: implement VK_KHR_shader_float_controls This actually supports more of the extension than the LLVM backend but we can't enable it because ACO doesn't work with all stages yet. With more of it enabled, some CTS tests fail because our 64-bit sqrt is very imprecise. I can't find any precision requirements for it anywhere, so I'm thinking it might be a CTS issue. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-15 17:36:21 +00:00
Rhys Perry	be1d11249b	aco: fix 64-bit fsign with 0 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `93c8ebfa` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-15 17:36:21 +00:00
Rhys Perry	b062b92ab1	aco: don't combine literals into v_cndmask_b32/v_subb/v_addc No pipeline-db changes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `93c8ebfa` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-15 17:36:21 +00:00
Rhys Perry	d7b0d9a8d8	radv: enable FP16/FP64 denormals earlier and only for LLVM ACO sets this itself and will have to set it differently in the future to support shaderDenormFlushToZeroFloat64. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-15 17:36:21 +00:00
Michel Dänzer	c6c7652753	gitlab-ci: Organize images using new REPO_SUFFIX templates feature Two benefits: Most docker image related environment variables can now be defined in the jobs where they're used instead of globally. The DEBIAN_TAG values are propagated to other jobs via YAML anchors. Images on https://gitlab.freedesktop.org/mesa/mesa/container_registry are now organized in separate repositories with a suffix matching the name of the job which makes sure the image is there. Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-15 16:23:22 +01:00
Michel Dänzer	506e9d5fc7	gitlab-ci: Rename container install scripts to match job names (better) Cleans up .gitlab-ci/ a little, and allows using a single DEBIAN_EXEC line for all container jobs. v2: * Use lava_arm.sh instead of arm_lava.sh for consistency with v2 of the previous change Reviewed-by: Eric Anholt <eric@anholt.net> # v1 Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-15 16:21:10 +01:00
Michel Dänzer	3a48f4565e	gitlab-ci: Use functional container job names This makes it easier to tell which job is which in a pipeline. v2: * Use lava_arm{64,hf} instead of arm{64,hf}_lava to keep these jobs together in pipeline overviews Reviewed-by: Eric Anholt <eric@anholt.net> # v1 Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-15 16:20:16 +01:00
Michel Dänzer	670277846d	gitlab-ci: Document that ci-templates refs must be in sync Otherwise there can be weird breakage. (Removing the include from .gitlab-ci/lava-gitlab-ci.yml doesn't seem possible unfortunately: https://gitlab.freedesktop.org/daenzer/mesa/pipelines/79458) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-15 16:06:54 +01:00
Tomeu Vizoso	7d24cef200	panfrost: Multiply offset_units by 2 Per the spec, the units passed to glPolygonOffset are to be multiplied by an implementation-defined constant. On Midgard, this constant seems to be 2. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-15 14:45:29 +01:00
Lionel Landwerlin	c061185e17	intel/perf: add EHL performance query support Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-11-15 13:14:30 +00:00
Lionel Landwerlin	39fd11a9f8	intel/dev: flag the Elkhart Lake platform We'll use this for performance metrics which are different from ICL. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-11-15 13:14:30 +00:00
Tapani Pälli	7a893a0d57	gitlab-ci: update Piglit commit, update skips Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-11-15 12:06:15 +02:00
Tapani Pälli	1d970f15e2	mesa: allow bit queries for EXT_disjoint_timer_query Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2090 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-15 12:05:56 +02:00
Samuel Pitoiset	41a1152cdc	radv: make sure to not clear the ds attachment after resolves To not overwrite the resolve if there is pending clear aspects, same as color resolves. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-15 09:36:43 +01:00
Samuel Pitoiset	519d9b30de	radv: remove useless RADV_DEBUG=unsafemath debug option This option is useless and shouldn't be used at all. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-15 09:07:34 +01:00
Nathan Kidd	9a80b7fd8f	llvmpipe: Check thread creation errors In the case of glibc, pthread_t is internally a pointer. If lp_rast_destroy() passes a 0-value pthread_t to pthread_join(), the latter will SEGV dereferencing it. pthread_create() can fail if either the user's ulimit -u or the Linux kernel's /proc/sys/kernel/threads-max is reached. Choosing to continue, rather than fail, on the theory that it is better to run with the one main thread than not run at all. Keeping as many threads as we got, since lack of threads severely degrades llvmpipe performance. Signed-off-by: Nathan Kidd <nkidd@opentext.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-11-15 02:43:22 +01:00
Ben Crocker	9c3be6d21f	llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders Large programs, e.g. gnome-shell and firefox, may tax the addressability of the Medium code model once a (potentially unbounded) number of dynamically generated JIT-compiled shader programs are linked in and relocated. Yet the default code model as of LLVM 8 is Medium or even Small. The cost of changing from Medium to Large is negligible: - an additional 8-byte pointer stored immediately before the shader entrypoint; - change an add-immediate (addis) instruction to a load (ld). Testing with WebGL Conformance (https://www.khronos.org/registry/webgl/sdk/tests/webgl-conformance-tests.html) yields clean runs with this change (and crashes without it). Testing with glxgears shows no detectable performance difference. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1753327, 1753789, 1543572, 1747110, and 1582226 Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/223 Co-authored by: Nemanja Ivanovic <nemanjai@ca.ibm.com>, Tom Stellard <tstellar@redhat.com> CC: mesa-stable@lists.freedesktop.org Signed-off-by: Ben Crocker <bcrocker@redhat.com>	2019-11-14 23:07:26 +00:00
Kenneth Graunke	4242c57227	iris: Wrap iris_fix_edge_flags in NIR_PASS So nir_validate happens properly. Unfortunately this means we have to play the metadata song and dance, so walk over all impls and say that we didn't hurt anything. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-14 14:50:11 -08:00
Kenneth Graunke	39c23fd1bb	iris: Properly move edgeflag_out from output list to global list When demoting it from an output to a global, we need to actually move it to the correct list. While here, we also refactor so it's clear we aren't mutating the list while iterating. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2106 Fixes: `f9fd04aca1` ("nir: Fix non-determinism in lower_global_vars_to_local") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-14 14:50:09 -08:00
Eric Anholt	790d0ebef3	mesa: Move compile of common Mesa core files to a static lib. We were compiling them twice, costing extra build time. Reduces my ccache-hot clean build time by a second (24.3s to 23.3s, 3 runs each). The windows args are a little strange -- it's not clear to me that they're actually used for building these files, but keep them in place just in case, since we don't have a good windows CI story yet. We should want them on both gallium and classic regardless: Only osmesa could be built for windows in classic, and classic OSMesa's scons build defines these flags too. Closes: #2052 Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-11-14 21:46:10 +00:00
Prodea Alexandru-Liviu	cc758f1224	Appveyor: Quickly fix meson build. As this required use of Python 3.8, mako module also had to be updated. v2 - Unbind mako module version when using Meson. Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-11-14 21:45:23 +00:00
Danylo Piliaiev	0904ee0c60	intel/fs: Do not lower large local arrays to scratch on gen7 On gen7 and earlier the scratch space size is limited to 12kB. By enabling this optimization we may easily exceed this limit without having any fallback. arb_compute_shader/linker/bug-93840.shader_test crashes with this lowering on IVB due to exceeding scratch size limit. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2092 Fixes: `69244fc7` Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-14 20:08:30 +00:00
Eric Anholt	882ca6dfb0	util: Move gallium's PIPE_FORMAT utils to /util/format/ To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to move their helpers out of gallium. Since u_format used util_copy_rect(), I moved that in there, too. I've put it in a separate directory in util/ because it's a big chunk of related code, and it's not clear to me whether we might want it as a separate library from libmesa_util at some point. Closes: #1905 Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-14 10:47:20 -08:00
Eric Engestrom	ac78ca4b39	gitlab-ci: auto-cancel CI runs when a newer commit is pushed to the same branch Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-11-14 17:32:31 +00:00
Timur Kristóf	9b8dc6929e	aco: Optimize out trivial code from uniform bools. This should remove most of the excess code size that was introduced by making all booleans per-lane. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-14 17:27:11 +01:00
Timur Kristóf	8995c0b30a	aco: Treat all booleans as per-lane. Previously, instruction selection had two kinds of booleans: 1. divergent which was per-lane and stored in s2 (VCC size) 2. uniform which was stored in s1 Additionally, uniform booleans were made per-lane when they resulted from operations which were supported only by the VALU. To decide which type was used, we relied on the destination size, which was not reliable due to the per-lane uniform bools, but it mostly works on wave64. However, in wave32 mode (where VCC is also s1) this approach makes it impossible to keep track of which boolean is uniform and which is divergent. This commit makes all booleans per-lane. The resulting excess code size will be taken care of by the optimizer. v2 (by Daniel Schürmann): - Better names for some functions - Use s_andn2_b64 with exec for nir_op_inot - Simplify code due to using s_and_b64 in bool_to_scalar_condition v3 (by Timur Kristóf): - Fix several subgroups regressions Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-14 17:27:11 +01:00
Daniel Schürmann	a1622c1a11	aco: use s_and_b64 exec to reduce uniform booleans to one bit Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-14 17:27:10 +01:00
Timur Kristóf	94e355148f	aco: Make sure not to mistakenly propagate 64-bit constants. ACO's optimizer would try to propagate 64-bit constants, but does so in such a way that wouldn't work due to how the 64-bit constants are handled in the IR. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-14 17:27:10 +01:00
Daniel Schürmann	9d3e070524	aco: value number instructions using the execution mask This patch tries to give instructions with the same execution mask also the same pass_flags and enables VN for SALU instructions using exec as Operand. This patch also adds back VN for VOPC instructions and removes VN for phis. v2 (by Timur Kristóf): - Fix some regressions. v3 (by Daniel Schürmann): - Fix additional issues Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-14 17:27:10 +01:00
Daniel Schürmann	8657eede8a	aco: check if SALU instructions are preceded by exec when calculating WQM needs Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-14 17:27:10 +01:00
Samuel Pitoiset	ee9811a0bb	ac: fix build with recent LLVM Build is broken since "Move CodeGenFileType enum to Support/CodeGen.h". Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-11-14 14:41:55 +00:00
Tapani Pälli	94cb4916e3	Revert "mesa: allow bit queries for EXT_disjoint_timer_query" This reverts commit `66d24a9ef7`. This commit made Mesa CI red because it depends on a Piglit test change.	2019-11-14 13:34:33 +00:00
Connor Abbott	f9fd04aca1	nir: Fix non-determinism in lower_global_vars_to_local Using a hash-table walk means that variables will get inserted in different orders on different runs. Just walk the list of globals instead, even if some of them can't be turned into locals. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-14 13:10:58 +00:00
Iago Toral Quiroga	f512965b0b	mesa/st: make sure we remove dead IO variables before handing NIR to backends Commit "1c2bf82d24a glsl: disable lower_fragdata_array() for NIR drivers" disabled the GLSL IR lowering that turned gl_FragData from an array into a collection of scalar outputs under the assumption that this was already being handled properly elsewhere, however there are some corner cases where NIR would fail to do this, leaving gl_FragData[] as an array variable. This can break backends that assume that all their outputs will be scalar and use the variable definitions from the shader to do their output setup, such as the case of V3D. At least one corner case was found in some Portal shaders from shader-db, where NIR would optimize out the full body of a fragment shader. In this scenario, the empty shader would keep the original array definition of gl_FragData[], causing the backend to assert. We need to do this late enough for it to be effective, since doing it in st_nir_preprocess does not fix the original problem. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2091 Fixes: `1c2bf82d` ("glsl: disable lower_fragdata_array() for NIR drivers") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-14 10:49:00 +01:00
Tapani Pälli	66d24a9ef7	mesa: allow bit queries for EXT_disjoint_timer_query Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2090 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-14 09:27:13 +02:00
Tapani Pälli	1a093a06d6	Revert "dri_interface: add interface for EGL_EXT_image_flush_external" This reverts commit `7520478461`. This series caused unexpected flickering artifacts with Iris driver on Chrome OS and EGL_EXT_image_flush_external spec has not been published yet. Acked-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-14 07:46:36 +02:00
Tapani Pälli	7951eb146c	Revert "st/dri: assume external consumers of back buffers can write to the buffers" This reverts commit `1d1b457821`. This series caused unexpected flickering artifacts with Iris driver on Chrome OS and EGL_EXT_image_flush_external spec has not been published yet. Acked-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-14 07:46:30 +02:00
Tapani Pälli	25f596e6ba	Revert "st/dri: add support for EGL_EXT_image_flush_external" This reverts commit `1d122c104a`. This series caused unexpected flickering artifacts with Iris driver on Chrome OS and EGL_EXT_image_flush_external spec has not been published yet. Acked-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-14 07:46:20 +02:00
Tapani Pälli	ff05f16c99	Revert "egl: handle EGL_IMAGE_EXTERNAL_FLUSH_EXT" This reverts commit `34b1aa957a`. This series caused unexpected flickering artifacts with Iris driver on Chrome OS and EGL_EXT_image_flush_external spec has not been published yet. Acked-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-14 07:46:14 +02:00
Tapani Pälli	e64b91e34a	Revert "egl: implement new functions from EGL_EXT_image_flush_external" This reverts commit `c1c574fdf1`. This series caused unexpected flickering artifacts with Iris driver on Chrome OS and EGL_EXT_image_flush_external spec has not been published yet. Acked-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-14 07:46:04 +02:00
Alyssa Rosenzweig	ad6b2ac374	pan/midgard: Fix copypropagation for textures total instructions in shared programs: 3562 -> 3457 (-2.95%) instructions in affected programs: 575 -> 470 (-18.26%) helped: 16 HURT: 0 helped stats (abs) min: 1 max: 14 x̄: 6.56 x̃: 10 helped stats (rel) min: 5.71% max: 24.56% x̄: 16.83% x̃: 18.87% 95% mean confidence interval for instructions value: -9.07 -4.06 95% mean confidence interval for instructions %-change: -19.00% -14.66% Instructions are helped. total bundles in shared programs: 1846 -> 1830 (-0.87%) bundles in affected programs: 338 -> 322 (-4.73%) helped: 16 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 2.50% max: 20.00% x̄: 8.85% x̃: 3.33% 95% mean confidence interval for bundles value: -1.00 -1.00 95% mean confidence interval for bundles %-change: -13.02% -4.67% Bundles are helped. total quadwords in shared programs: 3191 -> 3144 (-1.47%) quadwords in affected programs: 606 -> 559 (-7.76%) helped: 16 HURT: 0 helped stats (abs) min: 1 max: 14 x̄: 2.94 x̃: 3 helped stats (rel) min: 5.17% max: 22.22% x̄: 11.20% x̃: 5.62% 95% mean confidence interval for quadwords value: -4.58 -1.29 95% mean confidence interval for quadwords %-change: -15.16% -7.24% Quadwords are helped. total registers in shared programs: 312 -> 303 (-2.88%) registers in affected programs: 27 -> 18 (-33.33%) helped: 9 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 33.33% max: 33.33% x̄: 33.33% x̃: 33.33% 95% mean confidence interval for registers value: -1.00 -1.00 95% mean confidence interval for registers %-change: -33.33% -33.33% Registers are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-14 02:36:21 +00:00
Alyssa Rosenzweig	f72873e6aa	pan/midgard: Copypropagate vector creation total instructions in shared programs: 3457 -> 3431 (-0.75%) instructions in affected programs: 787 -> 761 (-3.30%) helped: 14 HURT: 0 helped stats (abs) min: 1 max: 12 x̄: 1.86 x̃: 1 helped stats (rel) min: 1.01% max: 11.11% x̄: 9.22% x̃: 11.11% 95% mean confidence interval for instructions value: -3.55 -0.16 95% mean confidence interval for instructions %-change: -11.41% -7.03% Instructions are helped. total bundles in shared programs: 1830 -> 1826 (-0.22%) bundles in affected programs: 279 -> 275 (-1.43%) helped: 2 HURT: 0 total quadwords in shared programs: 3144 -> 3121 (-0.73%) quadwords in affected programs: 645 -> 622 (-3.57%) helped: 13 HURT: 0 helped stats (abs) min: 1 max: 11 x̄: 1.77 x̃: 1 helped stats (rel) min: 2.09% max: 16.67% x̄: 12.61% x̃: 14.29% 95% mean confidence interval for quadwords value: -3.45 -0.09 95% mean confidence interval for quadwords %-change: -15.43% -9.79% Quadwords are helped. total registers in shared programs: 303 -> 301 (-0.66%) registers in affected programs: 14 -> 12 (-14.29%) helped: 2 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-14 02:36:21 +00:00
Alyssa Rosenzweig	39b5f2fa0b	pan/lcra: Use Chaitin's spilling heuristic Not much of a difference but slightly better and slightly less arbitrary. total instructions in shared programs: 3560 -> 3559 (-0.03%) instructions in affected programs: 44 -> 43 (-2.27%) helped: 1 HURT: 0 total bundles in shared programs: 1844 -> 1843 (-0.05%) bundles in affected programs: 23 -> 22 (-4.35%) helped: 1 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-14 02:36:21 +00:00
Alyssa Rosenzweig	23c83f3f05	pan/midgard: Compute spill costs Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-14 02:36:21 +00:00
Paulo Zanoni	eb6352162d	intel/compiler: fix nir_op_{i,u}*32 on ICL On ICL we have the src1 restriction which is applied through fix_byte_src() and potentially changes the type of the operands from 8 to 32 bits. When this change happens, we fall into the "else if (bit_size < 32)" case and miscompute src_type because it takes into consideration bit_size (8) instead of the adjusted size of temp_op (32). This results in the shader reading unused memory, giving us mostly failures, but occasional passes due to whatever was already in the registers we were reading. This commit fixes a lot of dEQP subgroup i8vec2 tests on ICL, such as: dEQP-VK.subgroups.arithmetic.compute.subgroupadd_i8vec2 This can also be verified by simply changing fix_byte_src() to apply on all platforms. Fixes: `5847de6e9a` ("intel/compiler: don't use byte operands for src1 on ICL") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-11-13 22:13:52 +00:00
Caio Marcelo de Oliveira Filho	7ae506e5b8	spirv: Consider the sampled_image case in wa_glslang_179 workaround Fixes: `9e440b8d0b` ("spirv: Sort out the mess that is sampled image") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-13 12:02:29 -08:00
Dylan Baker	943f630f8e	docs: update calendar, add news item and link release notes for 19.2.4	2019-11-13 11:12:53 -08:00
Dylan Baker	ff5bcd7ce9	docs: Add SHA256 sum for 19.2.4	2019-11-13 11:10:51 -08:00
Dylan Baker	67fd2b936d	docs: Add release notes for 19.2.4	2019-11-13 11:10:47 -08:00
Eric Anholt	f0eeb98c6c	ci: Expand the freedreno blit skip regex to cover more cases. We've had flaps on at least: - r16f_to_r16f - r16i_to_rg16i Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-11-13 10:58:52 -08:00
Caio Marcelo de Oliveira Filho	0aaf47f7cd	anv: Initialize depth_bounds_test_enable when not explicitly set This was causing uninitialized value to end up propagated to the 3DSTATE_DEPTH_BOUNDS packet, leading to asserts on packet building due to the value being greater than 1. Fixes: `939ddccb7a` ("anv: Add support for depth bounds testing.") Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2019-11-13 10:13:27 -08:00
Alyssa Rosenzweig	771d23584a	pan/midgard: Remove util/ra support It's now unused, in favour of LCRA. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-13 15:27:56 +00:00
Alyssa Rosenzweig	e343f2ceb9	pan/midgard: Integrate LCRA Pretty routine, we do have a hack to force swizzle alignment for !32-bit until we implement !32-bit the right way. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-13 15:27:56 +00:00
Alyssa Rosenzweig	66ad64d73d	pan/midgard: Implement linearly-constrained register allocation Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-13 15:27:56 +00:00
Alyssa Rosenzweig	fd81916ee5	pan/midgard: Add blend shader selection bits for MRT This is less complicated than previously thought. Note we have no way of specifying the work register count for blend shaders; it must be strictly less than the work register count of the corresponding fragment shader (which is fine since we force the fragment shader to report a count of 16 with a blend shader as a major hack until we get register pressure down for blend shaders). TODO: pandecode the flags. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-13 15:27:56 +00:00
Christian Gmeiner	e101af8671	drm-shim: fix EOF case Close input end of the pipe after data was written. Without this fix I have seen a hang in sysfs_uevent_get(.., "OF_FULLNAME") when key was not found. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-13 12:39:14 +00:00
Tapani Pälli	b12911c88e	util/android: fix android build errors Fixes: `9020f519` ("util/u_endian: Add error checks") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2078 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-13 12:31:31 +02:00
Samuel Pitoiset	47ba227448	gitlab-ci: build RADV on ARM64 The ARMHF LLVM package is LLVM 7 but RADV requires LLVM 8. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-13 10:52:10 +01:00
Samuel Pitoiset	cb19f69ff0	gitlab-ci: build a specific libdrm version for ARM64 RADV requires libdrm-2.4.100 but the distrib package is too old. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-13 10:52:08 +01:00
Erik Faye-Lund	4c1cef68cf	zink: move drawing to separate source This code is kinda stand-alone, and it makes it a bit easier to find the right source in the source-tree.	2019-11-13 09:14:05 +01:00
Erik Faye-Lund	589e8651e6	zink: move blitting to separate source This code is kinda stand-alone, and it makes it a bit easier to find the right source in the source-tree	2019-11-13 09:14:05 +01:00
Erik Faye-Lund	1605a0c8f2	zink: move filter-helper to separate helper-header This will help code-reuse a bit in the next commit.	2019-11-13 09:12:36 +01:00
Erik Faye-Lund	36f3902213	zink: move format-checking to separate source This code is more or less stand-alone, and this keeps the formats array a bit more encapsulated.	2019-11-13 09:12:36 +01:00
Eric Anholt	fd777d2cea	ci: Disable flappy blit tests on a630. These have shown up with the new CTS runner, which has changed test ordering. Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-11-12 16:43:04 -08:00
Rob Clark	0f33c255d3	freedreno/ir3: remove unused parameter Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-12 13:57:52 -08:00
Rob Clark	df7a88dca3	freedreno/ir3: legalize cleanups We can clear the "needs" flags once we emit a flag. And also, don't open-code the opcode name. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:57:52 -08:00
Rob Clark	b22617fb57	freedreno/ir3: fix gpu hang with pre-fs-tex-fetch For pre-fs-dispatch texture fetch, we need to assign bary_ij to r0.x, even if it is not used in the shader (ie. only varying use is for tex coords). But if, for example, gl_FragCoord is used, it could get assigned on top of bary_ij, resulting in a GPU hang. The solution to this is two-fold: (1) the inputs/outputs rework has the benefit of making RA realize bary_ij is a vec2, even if there are no split/collect instructions (due to no varying fetches in the shader itself). And (2) extend the live ranges of meta:input instructions to the first non-input, to prevent RA from assigning the same register to multiple inputs. Backport note: because of (1) above, a better solution for 19.3 would be to revert `f30c256ec0`. Fixes: `f30c256ec0` ("freedreno/ir3: enable pre-fs texture fetch for a6xx") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:57:52 -08:00
Rob Clark	4bb697d938	freedreno/ir3: only tex instructions have wrmask At the ir3 level, we would assume that we could use wrmask to mask off other components of an instruction returning a vecN when they are not used. Which would let RA use components not written for other live values. But this is only true for tex instructions. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:57:52 -08:00
Rob Clark	bdf6b7018c	freedreno/ir3: re-work shader inputs/outputs Allow inputs/outputs to be vecN (ie. whatever their actual size is), and use split to get scalar components of inputs, and collect to gather up scalar components of outputs. The main motivation is to simplify RA, by only having to consider split/ collect to figure out where values need to land in consecutive scalar registers, rather than having to also deal with left/right neighbors. Because of varying packing, and the resulting fractional location (location_frac), to implement load_input/store_output, it is still convenient to have a table of scalar inputs/outputs. We move this to the compile ctx (since it is only needed for nir->ir3). Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:57:52 -08:00
Rob Clark	2aae13f642	freedreno/ir3: simplify creating sysval inputs In almost all places, the add_sysval_input() is paired directly with a create_input(). (The one exception is frag shader ij bary coord, and this exception will go away in a later patch.) So go ahead and clean this up before reworking input/output handling. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	68d2ec5f7e	freedreno/ir3: remove first-vertex sysval This is a driver-param (loaded from uniform), not a sysval (populated by hw into a register). So there is no value in having a sysval slot. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	7b2166785a	freedreno/ir3: helper to print ir if debug enabled Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	7a5f073da3	freedreno/ir3: show input/output wrmask's in disasm Currently it is always 0x1 (scalar), but that will change in a later patch. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	c00a67171c	freedreno/ir3: add input/output iterators We can at least get rid of the if-not-NULL check in a bunch of places. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	b2417801e5	freedreno/ir3: remove impossible condition We keep kill's alive w/ keeps these days, rather than a fake output. This condition was left over from prior to that change. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	611258d578	freedreno/ir3: rename fanin/fanout to collect/split If I'm going to refactor a bit to use these meta instructions to also handle input/output, then might as well cleanup the names first. Nouveau also uses collect/split for names of these meta instructions, and I like those names better. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	4af86bd0b9	freedreno/ir3: remove half-precision output This doesn't really work, we can't necessarily just change the outputs to half-precision like this in anything but simple cases. Keep the shader key entry around though, eventually with proper mediump support we could use this with a nir pass to use lower precision frag shader outputs when the render target format has <= 16b/component. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Rob Clark	089b105396	freedreno/ir3: fix valgrind complaint with STLW The instruction has 3 src regs, so `instr->regs[0..3]` are valid, but `instr->regs[4]` is not. ``` Test case 'dEQP-GLES31.functional.shaders.linkage.es31.tessellation.varying.rules.output_superfluous_declaration'.. ==29239== Invalid read of size 8 ==29239== at 0x5BE9CDC: emit_cat6 (ir3.c:841) ==29239== by 0x5BEA1BF: ir3_assemble (ir3.c:921) ==29239== by 0x5BDF0A7: ir3_shader_assemble (ir3_shader.c:133) ==29239== by 0x5BDF193: assemble_variant (ir3_shader.c:162) ==29239== by 0x5BDF407: create_variant (ir3_shader.c:215) ==29239== by 0x5BDF4DB: shader_variant (ir3_shader.c:241) ==29239== by 0x5BDF553: ir3_shader_get_variant (ir3_shader.c:257) ==29239== by 0x5BA85F7: ir3_shader_variant (ir3_gallium.c:80) ==29239== by 0x5BA7703: ir3_cache_lookup (ir3_cache.c:96) ==29239== by 0x5B8B8B3: fd6_emit_get_prog (fd6_emit.h:119) ==29239== by 0x5B8C137: fd6_draw_vbo (fd6_draw.c:186) ==29239== by 0x5BB1FBB: fd_draw_vbo (freedreno_draw.c:290) ==29239== Address 0xb97f2d0 is 0 bytes after a block of size 240 alloc'd ==29239== at 0x4848D54: malloc (in /usr/lib/aarch64-linux-gnu/valgrind/vgpreload_memcheck-arm64-linux.so) ==29239== by 0x61BD35B: ralloc_size (ralloc.c:119) ==29239== by 0x61BD41B: rzalloc_size (ralloc.c:151) ==29239== by 0x5BE599B: ir3_alloc (ir3.c:45) ==29239== by 0x5BEA583: instr_create (ir3.c:984) ==29239== by 0x5BEA5DF: ir3_instr_create2 (ir3.c:1000) ==29239== by 0x5BEE317: ir3_STLW (ir3.h:1431) ==29239== by 0x5BF12D3: emit_intrinsic_store_shared_ir3 (ir3_compiler_nir.c:903) ==29239== by 0x5BF418B: emit_intrinsic (ir3_compiler_nir.c:1802) ==29239== by 0x5BF5D07: emit_instr (ir3_compiler_nir.c:2339) ==29239== by 0x5BF603F: emit_block (ir3_compiler_nir.c:2426) ==29239== by 0x5BF624B: emit_cf_list (ir3_compiler_nir.c:2474) ==29239== ``` Probably this only triggers in non-optimized builds? 
Fixes: `1f3b52ce50` ("freedreno/a6xx: Add register offset for STG/LDG") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 13:55:03 -08:00
Eric Anholt	f3244c6019	ci: Remove old commented copy of freedreno artifacts. This path was from an older version of freedreno CI.	2019-11-12 12:54:04 -08:00
Eric Anholt	52843ec5d3	ci: Enable all of GLES3/3.1 testing for softpipe. Now that we're not using so many job slots, it's easy to get these jobs run in a reasonable amount of time (gles3 took 10 minutes for 4 cores, and gles31 was 15 minutes for 4 cores). Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-11-12 12:54:04 -08:00
Eric Anholt	f08c810028	ci: Use cts_runner for our dEQP runs. This runner is a little project by Bas, written in C++, that spawns threads that then loop grabbing chunks of the (randomly shuffled but consistently so) test list and hand it to a dEQP instance. As the remaining list gets shorter, so do the chunks, so hopefully the threads all complete effectively at once. It also handles restarting after crashes automatically. I've extended the runner a bit to do what I was doing in the bash scripts before, like the skip list and expected failures handling. This project should also be a good baseline for extending to handle retesting of intermittent failures. By switching to it, we can have the swrast tests just take up one job slot on the shared runners and keep their allotment of CPUs busy, instead of taking up job slots with single-threaded dEQP jobs. It will also let us (eventually, once I reprovision) switch the freedreno runners over to threading within the job instead of running concurrent jobs, so that memory scribbles in one pipeline don't affect unrelated pipelines, and I can experiment with their parallelism (particularly on a306 where we are frequently backed up) without trashing other people's jobs. What we lose in this process is per-test output in the log (not a big loss, I think, since we summarize fails at the end and reducing log length keeps chrome from choking on our logs so badly). We also drop the renderer sanity checking, since it's not saving qpa files for us to go poke through. Given that all the drivers involved have fail lists, if we got the wrong renderer somehow, we'd get a job failure anyway. v2: Rebase on dropping of the autoscale cluster and the arm64 build/test split. Use a script to deduplicate the cts-runner build. v3: Rebase on the amd64 build/test container split. Acked-by: Daniel Stone <daniels@collabora.com> (v1) Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> (v2)	2019-11-12 12:54:04 -08:00
Eric Anholt	7f52df7fc9	ci: Make the skip list regexes match the full test name. The bash scripts were using grep in the manner that matches any subset of the line, but the new CTS runner matches the whole line and I think that's a pretty good behavior. Given that some of the skip lists already were written to match the full test name, just make them consistently do so. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Daniel Stone <daniels@collabora.com> Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-11-12 12:54:04 -08:00
Eric Anholt	66719e0242	ci: Use several debian buster packages instead of hand-building. This helps cut down our container build time. I've left a few that we're likely to rev more frequently or I was less confident in dropping. v2: Rebase on the build/test container split, now bumps the build container tag in this commit. Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1) Acked-by: Daniel Stone <daniels@collabora.com> (v1)	2019-11-12 12:54:04 -08:00
Rafael Antognolli	a4da6008b6	iris: Use mocs from isl_dev. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-12 20:41:52 +00:00
Rafael Antognolli	d4f628235e	anv: Use mocs settings from isl_dev. v2: Remove device->default_mocs and external_mocs (Jason). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-12 20:41:52 +00:00
Rafael Antognolli	2b01636ddb	intel/isl: Add MOCS settings to isl_device. Centralize mocs settings into isl. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-12 20:41:52 +00:00
Rob Clark	d509a46225	freedreno: fix eglDupNativeFenceFD error We can end up with scenarios where last_fence is associated with a batch that is flushed through some other path before needs_out_fence_fd gets set. Resulting in returning a fence that has no backing fd. The simplest thing is to just skip the optimization to try and avoid no-op batches when a fence-fd is requested. This should normally be just once a frame anyways. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-12 11:38:16 -08:00
Brian Paul	bd49dedae0	nir: fix a couple signed/unsigned comparison warnings in nir_builder.h Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-12 11:44:02 -07:00
Brian Paul	a69e105361	s/APIENTRY/GLAPIENTRY/ in teximage.c The latter is the right symbol for entrypoint functions. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-12 11:44:01 -07:00
Lepton Wu	5c2d307a10	android: mesa: Revert "android: mesa: revert "Enable asm unconditionally"" Commit `45206d7673` fixed PIC issue of x86 asm stub. We can enable asm for Android x86 now. This should slightly improve performance. Acked-by: Eric Anholt <eric@anholt.net> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2019-11-12 18:09:43 +00:00
Rhys Perry	6914b0236f	aco: combine read_invocation and shuffle implementations They do mostly the same thing now. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-12 17:21:38 +00:00
Rhys Perry	2c98d79d11	aco: don't propagate vgprs into v_readlane/v_writelane Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `93c8ebfa` ('aco: Initial commit of independent AMD compiler')	2019-11-12 17:21:38 +00:00
Rhys Perry	5a1bacb6f9	aco: fix read_invocation with VGPR lane index Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `93c8ebfa` ('aco: Initial commit of independent AMD compiler')	2019-11-12 17:21:38 +00:00
Rhys Perry	c877f4d320	nir/divergence: improve DA of shuffle If the data is uniform, then it's really a uniform copy. If the index is uniform, then it's really a read_invocation. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-12 17:21:38 +00:00
Rhys Perry	f97d933426	aco: fix shuffle with uniform operands Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `93c8ebfa` ('aco: Initial commit of independent AMD compiler')	2019-11-12 17:21:38 +00:00
Rhys Perry	3204e83768	aco: use DPP instead of exec modification when lowering GFX10 shuffles Seems we can use DPP's row_mask field to get an effect similar to modifying exec. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-12 17:21:38 +00:00
Eric Engestrom	06347989a0	gitlab-ci: build libdrm using meson instead of autotools Autotools was deprecated for a while and has now been removed, so let's start using meson here so that we won't have any issues next time we update libdrm. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-11-12 17:08:02 +00:00
Daniel Schürmann	746b9380bd	aco: rematerialize s_movk instructions Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-12 15:59:48 +00:00
Daniel Schürmann	b6f5085dfe	aco: preserve kill flag on moved operands during RA Fixes: `93c8ebfa78` aco: Initial commit of independent AMD compiler Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-12 15:59:48 +00:00
Daniel Schürmann	a2a6880743	aco: fix invalid access on Pseudo_instructions Fixes: `93c8ebfa78` aco: Initial commit of independent AMD compiler Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-12 15:59:48 +00:00
Erik Faye-Lund	5b09a7e2e4	zink: remove no-longer-needed hack It seems whatever was causing this is no longer an issue. So let's get rid of the hack here. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-11-12 13:30:35 +00:00
Erik Faye-Lund	e1c87bbb4b	zink: implement buffer-to-buffer copies	2019-11-12 12:40:49 +00:00
Erik Faye-Lund	9352991880	zink: always allow transfer to/from buffers	2019-11-12 12:40:49 +00:00
Danylo Piliaiev	d4c8182018	intel/blorp: Fix usage of uninitialized memory in key hashing The automatically generated padding in structs contains undefined values, force pack the structs to eliminate the padding. Otherwise structs with the same values may generate different hashes. Valgrind output: Conditional jump or move depends on uninitialised value(s) util_fast_urem32 (fast_urem_by_const.h:71) hash_table_search (hash_table.c:262) _mesa_hash_table_search (hash_table.c:296) anv_pipeline_cache_search_locked (anv_pipeline_cache.c:318) anv_pipeline_cache_search (anv_pipeline_cache.c:335) lookup_blorp_shader (anv_blorp.c:38) blorp_params_get_mcs_partial_resolve_kernel (blorp_clear.c:1112) blorp_mcs_partial_resolve (blorp_clear.c:1205) anv_image_mcs_op (anv_blorp.c:1742) anv_cmd_predicated_mcs_resolve (genX_cmd_buffer.c:774) transition_color_buffer (genX_cmd_buffer.c:1159) cmd_buffer_end_subpass (genX_cmd_buffer.c:4840) Uninitialised value was created by a stack allocation blorp_params_get_mcs_partial_resolve_kernel (blorp_clear.c:1103) Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-12 13:59:29 +02:00
Danylo Piliaiev	3349b4b056	i965/program_cache: Lift restriction on shader key size This will allow usage of packed structs which may have size not divisible by 4. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-12 13:59:24 +02:00
Michel Dänzer	af684753f3	gitlab-ci: Delete install/bin from artifacts as well This cuts the x86 artifacts zip file size in less than half. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 10:18:31 +01:00
Michel Dänzer	aebf43dcc1	gitlab-ci: Use separate docker images for x86 build/test jobs Same as was done for the ARM images before. This should make it less painful to update to newer dEQP / piglit as well as to make changes to the build/test environment. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 10:17:21 +01:00
Michel Dänzer	576f7b6ea5	gitlab-ci: Run piglit tests with llvmpipe One job for the quick_gl profile, one for the glslparser & quick_shader profiles (doing these together takes hardly any more time than quick_shader alone). v2: * Don't break lava tests v3: * Remove piglit test artifacts paths: * Exclude some quick_shader tests again: - Test whose result flips between pass/fail/skip - @vs_in tests, as not the same one of these gets picked every time v4: Do not list passing tests in .gitlab-ci/piglit/.txt (Eric Anholt) Include the test number summary in .gitlab-ci/piglit/.txt Completely disable generating any vs_in tests in the piglit build. * Remove some more unneeded files from the piglit build tree. * Exclude quick_gl arb_gpu_shader5 tests; they were all skipped anyway, as llvmpipe doesn't support this extension yet, but occasionally they would spuriously fail instead. v5: * Set LD_LIBRARY_PATH, so we actually test the Mesa build from the pipeline... * Verify that wflinfo reports the expected Mesa version * Pass -noreset to Xvfb v6: * Don't use autoscale runners, run piglit with -j4 (Eric Anholt) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 10:16:23 +01:00
Michel Dänzer	4b25b5885b	gitlab-ci: Sort packages in debian-install.sh And remove duplicates. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 10:16:08 +01:00
Michel Dänzer	df26e18b9f	gitlab-ci: Share dEQP build process between x86 & ARM test image scripts See https://gitlab.freedesktop.org/mesa/mesa/issues/2056 v2: * Rename .gitlab-ci/deqp-build.sh => .gitlab-ci/build-deqp.sh (Eric Anholt) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 10:14:49 +01:00
Michel Dänzer	59fcb019d0	gitlab-ci: Move artifact preparation to separate script It's currently only needed for the meson-main and meson-arm64 jobs, not the other meson build jobs. Also remove MESON_SHADERDB, just run .gitlab-ci/run-shader-db.sh directly from the meson-main job. v2: * Also run prepare-artifacts.sh in meson-arm64 script v3: * Move tarball creation into the new script as well, as it prevented ccache --show-stats from running in after_script Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> # v1 Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 10:14:26 +01:00
Michel Dänzer	2921a38484	gitlab-ci: Use ninja -j4 for building dEQP By default, ninja tries to saturate all cores of the runner host machine, which could overload it due to other jobs running in parallel. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-12 10:14:04 +01:00
Jason Ekstrand	0c7e0c5599	spirv: Fix the MSVC build Fixes: `9cc4c2c916` "spirv: Add a vtn_decorate_pointer helper" Tested-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-12 08:34:55 +00:00
Erik Faye-Lund	9b8964d064	nir: patch up deref-vars when lowering clip-planes Otherwise, we fail validation and potentially generate invalid code. Let's fix up the mode of the accesses to the variable. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-12 09:13:22 +01:00
Samuel Pitoiset	bef7b2f805	ac: handle pointer types to LDS in ac_get_elem_bits() This fixes crashes with some dEQP-VK.spirv_assembly.instruction.spirv1p4.* tests. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-12 08:32:15 +01:00
Jonathan Marek	01cae57c80	freedreno: add Adreno 640 ID A640 seems to work without any other changes (glmark and vkcube). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-11-11 20:46:01 -05:00
Luis Mendes	0cb5c96a83	radv: fix radv secure compile feature breaks compilation on armhf EABI and aarch64 __NR_select is not defined the same way across architectures, and sometimes it is not even defined, as on armhf EABI and aarch64. Signed-off-by: Luis Mendes <luis.p.mendes@gmail.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2042	2019-11-12 11:47:20 +11:00
Marek Olšák	3a23af9f44	st/mesa: remove unused TGSI-only debug printing functions Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-11 19:45:12 -05:00
Marek Olšák	d29a332862	st/mesa: add ST_DEBUG=nir to print NIR shaders Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-11 19:45:10 -05:00
Marek Olšák	265abc54f8	st/mesa: print TCS/TES/GS/CS TGSI in the right place & keep disk cache enabled The old place only printed on a disk cache miss, which is why the disk cache was disabled. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-11 19:45:08 -05:00
Marek Olšák	98e27e5e28	st/mesa: remove \n being only printed in debug builds after printed TGSI Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-11 19:45:07 -05:00
Marek Olšák	c3351bb44b	st/mesa: rename DEBUG_TGSI -> DEBUG_PRINT_IR Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-11 19:45:04 -05:00
Marek Olšák	e00791c552	st/mesa: fix Sanctuary and Tropics by disabling ARB_gpu_shader5 for them They use the "sample" keyword as a variable name. Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-11 19:23:37 -05:00
Lionel Landwerlin	34f32a6d66	anv: implement VK_KHR_timeline_semaphore v2: Fix inverted condition in vkGetPhysicalDeviceExternalSemaphoreProperties() v3: Add anv_timeline_* helpers (Jason) v4: Avoid variable shadowing (Jason) Split timeline wait/signal device operations (Jason/Lionel) v5: s/point/signal_value/ (Jason) Drop piece of drm-syncobj timeline code (Jason) v6: Add missing sync_fd semaphore signaling (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-11 21:46:51 +00:00
Jason Ekstrand	5a4f15ef2c	anv: Plumb timeline semaphore signal/wait values through from the API Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-11 21:46:51 +00:00
Lionel Landwerlin	edc6606d4e	anv/wsi: signal the semaphore in the acquireNextImage We seem to have forgotten about the semaphore in the acquireNextImageInfo. v2: Signal semaphore/fence regardless of presentation status (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-11 21:46:51 +00:00
Jason Ekstrand	b10b455c1d	anv: Lock around fetching sync file FDs from semaphores Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-11 21:46:51 +00:00
Lionel Landwerlin	246261f0ad	anv: prepare the driver for delayed submissions Timeline semaphores introduce support for wait before signal behavior, which means that it is now allowed to call vkQueueSubmit() with wait semaphores not yet submitted for execution. Our kernel driver requires all of the wait primitives to be created before calling the execbuf ioctl. As a result, we must delay submissions in the userspace driver. This change stores the necessary information to be able to delay a VkSubmitInfo submission to the kernel driver. v2: Fold count++ into array access (Jason) Move queue list to another patch (Jason) v3: Document cleanup of temporary semaphores (Jason) v4: Track semaphores of SYNC_FD type that need updating after delayed submission v5: Don't forget to update sync_fd in signaled semaphores after submission (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-11 21:46:51 +00:00
Lionel Landwerlin	3e22363537	anv: refcount semaphores Delayed submissions required by timeline semaphores mean we need to be able to update the sync fd backed semaphores in a delayed fashion. This could mean a race between the application destroying the semaphore and the submission code trying to update it with the new sync fd. This change prepares semaphores to be refcounted, we'll most likely only take a reference for cases where we signal a sync fd semaphore. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-11 21:46:51 +00:00
Lionel Landwerlin	3da798c9f1	anv: prepare driver to report submission error through queues When we will submit to i915 from a submission thread, we won't be able to directly report the error to the user (in particular through the debug report callbacks). So prepare 2 paths to report errors device -> notifying the user immediately, queue -> notifying the user the next time an entry point is called. In this change we still report directly for both paths, this will change in the next commit. v2: Split NULL batch parameter handling in anv_queue_submit_simple_batch() in a different commit Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-11 21:46:51 +00:00
Lionel Landwerlin	89de271bc2	anv: allow NULL batch parameter to anv_queue_submit_simple_batch We can reuse device->trivial_batch_bo Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-11 21:46:51 +00:00
Lionel Landwerlin	f606c12731	anv: move queue init/finish to anv_queue.c Prepare the queue initialization to take on more responsibilities and possibly fail. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-11 21:46:51 +00:00
Lionel Landwerlin	206ab49ba1	anv: expose timeout helpers outside of anv_queue.c Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-11 21:46:51 +00:00
Lionel Landwerlin	2f4dcc8a1c	anv: detach batch emission allocation from device In the future we'll have 2 different allocations depending on whether we're using threaded submission or not. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-11 21:46:51 +00:00
Lionel Landwerlin	935f8f0e56	anv: remove list items on batch fini This doesn't seem to fix anything because those destroy() calls happen right before the command buffer object & its list of batch_bo is also destroyed. Still looks a bit cleaner. v2: Found a second occurrence Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2) Fixes: `26ba0ad54d` ("vk: Re-name command buffer implementation files") Cc: <mesa-stable@lists.freedesktop.org>	2019-11-11 21:46:51 +00:00
Lionel Landwerlin	048f0690ee	anv: invalidate file descriptor of semaphore sync fd at vkQueueSubmit We always close the in_fence at the end of anv_cmd_buffer_execbuf() so when we take it from the semaphore, let's not forget to invalidate it. Note that the code leaks the fence_in if we get any error before reaching the close(). Let's fix that in another patch or better, rewrite the whole thing! v2: drop redundant fd = -1 (Jason) v3: Update commit message (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-11 21:46:51 +00:00
Rhys Perry	de998d3eb5	radv: fix radv_nir_get_max_workgroup_size when nir=NULL Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `84a1a2578` ('compiler: pack shader_info from 160 bytes to 96 bytes') Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-11 20:44:12 +00:00
Lionel Landwerlin	f93bb90302	mesa: check framebuffer completeness only after state update The change made in `88d665830f` ("mesa: check draw buffer completeness on glClearBufferfi/glClearBufferiv") correctly updated the state prior to checking the framebuffer completeness on glClearBufferiv but not in glClearBufferfi. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Fixes: `88d665830f` ("mesa: check draw buffer completeness on glClearBufferfi/glClearBufferiv") Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/2072	2019-11-11 22:04:55 +02:00
Caio Marcelo de Oliveira Filho	d4a3b09c4b	glsl: Check earlier for MaxTextureImageUnits and MaxImageUniforms Currently the linker does all the work then checks for the limits, which means num_textures and num_images in shader_info may have to store more than the limit. This breaks down now since shader_info was packed and doesn't expect to store larger invalid values. To fix this, pull the check before we set the counts in shader_info. Add necessary plumbing to make sure we bail once those errors are found. Fixes: `84a1a2578d` ("compiler: pack shader_info from 160 bytes to 96 bytes") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-11 10:58:40 -08:00
Caio Marcelo de Oliveira Filho	fce76ae769	glsl: Check earlier for MaxShaderStorageBlocks and MaxUniformBlocks Currently the linker does all the work then checks for the limits, which means num_ssbos and num_ubos in shader_info may have to store more than the limit. This breaks down now since shader_info was packed and doesn't expect to store larger invalid values. To fix this, pull the check before we set the counts in shader_info. One drawback of this approach is that for some cases we might not see the collected errors from various stages, but bail as soon as a stage breaks the limits. Fixes: `84a1a2578d` ("compiler: pack shader_info from 160 bytes to 96 bytes") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-11 10:58:40 -08:00
Dylan Baker	a8d941091f	util: Use ZSTD for shader cache if possible This allows ZSTD instead of ZLIB to be used for compressing the shader cache. On a 72 core system emulating skl with a full shader-db (with i965): ZSTD: 1915.10s user 229.27s system 5150% cpu 41.632 total (cold cache) 225.40s user 10.87s system 3810% cpu 6.201 total (warm cache) 154M (235M on disk) ZLIB: 2231.33s user 194.24s system 1899% cpu 2:07.72 total (cold cache) 229.15s user 10.63s system 3906% cpu 6.139 total (warm cache) 163M (244M on disk) Tim Arceri sees (8 core ryzen and a full shader-db): ZSTD: 2505.22 user 40.50 system 3:18.73 elapsed 1280% CPU (cold cache) 418.71 user 14.93 system 0:46.53 elapsed 931% CPU (warm cache) 454.3 MB (681.7 MB on disk) ZLIB: 3069.83 user 40.02 system 4:20.13 elapsed 1195% CPU (cold cache) 425.50 user 15.17 system 0:46.80 elapsed 941% CPU (warm cache) 470.3 MB (701.4 MB on disk) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-11 18:53:45 +00:00
Laurent Carlier	57acf921e2	egl: avoid local modifications for eglext.h Khronos standard header file Move differences into the eglextchromium.h header file, then provide the same header as libglvnd-1.2, so programs that omit to include eglextchromium.h will fail to build with both mesa and libglvnd headers. Fixes: `a0a8109f` "include: add the definition of EGL_EXT_image_flush_external" Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-11 17:20:16 +00:00
Eric Engestrom	eaf4396602	egl: move #include of local headers out of Khronos headers Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-11 17:20:16 +00:00
Jason Ekstrand	69244fc72a	intel/fs: Lower large local arrays to scratch Shader-db results on Kaby Lake: total instructions in shared programs: 14929212 -> 14880028 (-0.33%) instructions in affected programs: 72428 -> 23244 (-67.91%) helped: 6 HURT: 2 helped stats (abs) min: 2165 max: 15981 x̄: 8590.00 x̃: 7624 helped stats (rel) min: 56.06% max: 74.52% x̄: 67.55% x̃: 72.08% HURT stats (abs) min: 1178 max: 1178 x̄: 1178.00 x̃: 1178 HURT stats (rel) min: 350.60% max: 361.35% x̄: 355.97% x̃: 355.97% 95% mean confidence interval for instructions value: -11947.03 -348.97 95% mean confidence interval for instructions %-change: -125.72% 202.37% Inconclusive result (%-change mean confidence interval includes 0). total cycles in shared programs: 368585300 -> 342557344 (-7.06%) cycles in affected programs: 28144921 -> 2116965 (-92.48%) helped: 6 HURT: 2 helped stats (abs) min: 1404978 max: 7766106 x̄: 4353922.00 x̃: 3890682 helped stats (rel) min: 82.01% max: 95.57% x̄: 89.95% x̃: 92.28% HURT stats (abs) min: 47778 max: 47798 x̄: 47788.00 x̃: 47788 HURT stats (rel) min: 278.20% max: 282.98% x̄: 280.59% x̃: 280.59% 95% mean confidence interval for cycles value: -5900438.73 -606550.27 95% mean confidence interval for cycles %-change: -140.79% 146.16% Inconclusive result (%-change mean confidence interval includes 0). total spills in shared programs: 9243 -> 8901 (-3.70%) spills in affected programs: 2718 -> 2376 (-12.58%) helped: 4 HURT: 4 total fills in shared programs: 21831 -> 10141 (-53.55%) fills in affected programs: 11804 -> 114 (-99.03%) helped: 6 HURT: 2 total sends in shared programs: 815912 -> 815912 (0.00%) sends in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 1 GAINED: 3 The helped shaders are all compute shaders in Aztec Ruins. There is also a compute shader in synmark2 OglCSDof that's helped but it doesn't show up in above shader-db results because it went from SIMD8 to SIMD16. That shader improves enough to yield a 15-20% performance boost to the benchmark as a whole on my KBL laptop. The hurt shaders are a couple shaders in Kerbal Space Program and a couple in Aztec Ruins. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-11 17:17:02 +00:00
Jason Ekstrand	53bfcdeecf	intel/fs: Implement the new load/store_scratch intrinsics This commit fills in a number of different pieces: 1. We add support to brw_nir_lower_mem_access_bit_sizes to handle the new intrinsics. This involves simple plumbing work as well as a tiny bit of extra logic to always scalarize scratch intrinsics 2. Add code to brw_fs_nir.cpp to turn nir_load/store_scratch intrinsics into byte/dword scattered read/write messages which use the A32 stateless model. 3. Add code to lower_surface_logical_send to handle dword scattered messages and the A32 stateless model. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-11 17:17:02 +00:00
Jason Ekstrand	e2297699de	intel/nir: Plumb devinfo through lower_mem_access_bit_sizes Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-11 17:17:02 +00:00
Jason Ekstrand	1dff48af05	intel/fs: refactor surface header setup Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-11 17:17:02 +00:00
Jason Ekstrand	a0999bc049	intel/fs: Add DWord scattered read/write opcodes Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-11 17:17:02 +00:00
Jason Ekstrand	83f04d80b0	intel/nir: Use nir_extract_bits in lower_mem_access_bit_sizes The new helper solves most of the annoying problems with data wrangling in brw_nir_lower_mem_access_bit_sizes. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-11 17:17:02 +00:00
Jason Ekstrand	b8d45d9307	nir: Add tests for nir_extract_bits	2019-11-11 17:17:02 +00:00
Jason Ekstrand	d0bbf98c96	nir/builder: Add a nir_extract_bits helper This new helper is better than nir_bitcast_vector because it's able to take a (mostly) arbitrary range from the source vector. The only requirement is that first_bit has to be aligned to the smaller of the two bit sizes. It wouldn't be hard to lift that requirement but it's reasonable for now. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-11 17:17:02 +00:00
Eric Engestrom	86d3a346f1	egl: fix _EGL_NATIVE_PLATFORM fallback When the X11 or Haiku platforms were compiled in, they would bypass the `_EGL_NATIVE_PLATFORM` fallback by always returning themselves instead. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-11 17:14:07 +00:00
Ricardo Garcia	20b403aad0	anv: Unify GetDeviceQueue and GetDeviceQueue2 Avoid duplicating some checks and code by making anv_GetDeviceQueue a subcase of anv_GetDeviceQueue2, like radv does. Signed-off-by: Ricardo Garcia <rgarcia@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-11 16:14:56 +00:00
Alyssa Rosenzweig	5b31182665	panfrost: Select format-specific blending intrinsics If we have an accelerated path for a particular framebuffer format, let's use it to save a bunch of instructions in a blend shader. [Tomeu: Only use the faster intrinsic on >T760] Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-11-11 15:23:44 +00:00
Alyssa Rosenzweig	3295edaadf	pan/midgard: Pack load/store masks While most load/store operations are on 32-bit/vec4 intrinsically, some are not and have special type-size-dependent semantics for the mask. We need to convert into this native format. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-11-11 15:23:44 +00:00
Alyssa Rosenzweig	843874c7c3	pan/midgard: Implement nir_intrinsic_load_output_u8_as_fp16_pan We can use the native Midgard ops for this, depending what chip we're on. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-11-11 15:23:44 +00:00
Alyssa Rosenzweig	5885b64e42	pan/midgard: Identify ld_color_buffer_u8_as_fp16* There are two versions of this opcode, depending what version of the ISA you're using. I'm not sure if there's a semantic difference; I think there might be some slight subtleties but it's too early to know at this stage. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-11-11 15:23:44 +00:00
Alyssa Rosenzweig	03f73c7fc6	nir: Add load_output_u8_as_fp16_pan intrinsic This is a single opcode, at least on newer Midgard chips. It's easier to have this represented in NIR rather than trying to optimize out the conversions, so let's add the intrinsic. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-11-11 15:23:44 +00:00
Tomeu Vizoso	ee5321f239	panfrost: Set depth and stencil for SFBD based on the format Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-11 15:23:44 +00:00
Erik Faye-Lund	b4d47e21d7	zink: correct depth-stencil format When using packed vulkan-formats on little-endian systems, we need to swap the components for the gallium formats. And since Zink isn't big-endian safe yet, little-endian is the only endianness we care about right now. This fixes a bunch of piglit tests, among others: - spec@arb_depth_texture@depth-level-clamp - spec@arb_depth_texture@depthstencil-render-miplevels * d=z24 - spec@arb_depth_texture@fbo-depth-gl_depth_component24-blit - spec@arb_depth_texture@fbo-depth-gl_depth_component24-copypixels - spec@arb_depth_texture@fbo-depth-gl_depth_component24-drawpixels - spec@arb_depth_texture@fbo-depth-gl_depth_component24-readpixels Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `8d46e35d16` ("zink: introduce opengl over vulkan")	2019-11-11 14:35:53 +00:00
Erik Faye-Lund	d7a6cc8f4a	zink/spirv: add support for nir_op_flrp This fixes the following piglit: spec@ati_fragment_shader@ati_fragment_shader-render-fog Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-11-11 14:25:30 +01:00
Chris Wilson	863872e141	egl: Mention if swrast is being forced The system can be disabling HW acceleration unbeknown to the user, leading to a long debug session trying to work out which component is failing. A quick mention that it is the environment override would be very useful. v2: Use more generic "CPU renderer" and so try to avoid jargon. Reviewed-By: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Martin Peres <martin.peres@linux.intel.com>	2019-11-11 11:52:02 +00:00
Jason Ekstrand	9e440b8d0b	spirv: Sort out the mess that is sampled image This commit makes two major changes. First, we add a second case to OpLoad for sampled images which constructs a vtn_sampled_image and stashes that rather than stashing a pointer to the combined image sampler like we do for bare samplers and images. This should be more in line with how SPIR-V is intended to work and hopefully doesn't cause any weird problems. The second is a rework of vtn_handle_texture to assume that everything has an image but not everything has a sampler. We also add a vtn_fail_if for the case where a texture instruction requires a sampler but none is provided. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-09 15:29:01 +00:00
Jason Ekstrand	9cc4c2c916	spirv: Add a vtn_decorate_pointer helper This helper makes a duplicate copy of the pointer if any new access flags are set at this stage. This way we don't end up propagating access flags further than the actual SPIR-V decorations. In several instances where we create new pointers, we still call the decoration helper directly because no copy is needed. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-09 15:29:01 +00:00
Jason Ekstrand	4f9688e571	spirv: Remove the type from sampled_image We have types on all vtn_values at this point so there's no reason to carry the redundant type information. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-09 15:29:01 +00:00
Rob Clark	a3dc975ee7	freedreno/ir3: also track # of nops for shader-db The instruction count is (mostly) a measure of what optimization passes can do, while # of nops is more an indication of how effectively the scheduler is balancing register pressure vs instruction count. So track these independently. (There could be opportunities to rematerialize values to reduce register pressure, swapping some nop's with other alu instructions, so nothing is truly independent.. but it is still useful to break these stats out.) Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:15 +00:00
Rob Clark	5f45818673	freedreno/ir3: sync disasm changes from envytools Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:15 +00:00
Rob Clark	f3980a8ef7	freedreno/a4xx: fix SP_FS_MRT_REG.HALF_PRECISION Set flag based on actual output reg type. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:15 +00:00
Rob Clark	f0f9ec6882	freedreno/a3xx: fix SP_FS_MRT_REG.HALF_PRECISION We should really be setting this based on the actual output register type. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:15 +00:00
Rob Clark	df229977c3	freedreno/ir3: remove obsolete comment The meta PHI instruction was removed long ago. And fanin/fanout themselves do not contribute actual instructions (at least not by the time you get to sched, they may prevent copy-propagating away a mov) Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:15 +00:00
Rob Clark	e804b42fd7	freedreno/ir3/ra: remove ir print after livein/out The IR hasn't changed at this point, so it isn't really adding any value. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:15 +00:00
Rob Clark	8b92052f10	freedreno/ir3/ra: move regs_count==0 check Fold it in to writes_gpr() (since an instruction that does not reference any registers by definition does not write a register). This lets us avoid having to handle this case in a few other places. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:15 +00:00
Rob Clark	bd21c73d3f	freedreno/ir3: ir3_print tweaks Handle HALF/HIGH flags in all cases, and colorize SSA src notation. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:15 +00:00
Rob Clark	5da10704bb	freedreno/ir3: use SSA flag on dest register too We did this in some places before, but not consistently. But it will be useful for two-pass RA, to identify which registers have already been assigned. While we are cleaning this up, use __ssa_src() and new __ssa_dst() helper more consistently. (If nothing else, this reduces the # of callers of ir3_reg_create() to audit that we didn't miss something) Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:14 +00:00
Rob Clark	8449f6183f	freedreno/ir3: split pre-coloring to its own function Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-11-09 02:49:14 +00:00
Caio Marcelo de Oliveira Filho	087ecd9ca5	spirv: Don't leak GS initialization to other stages The stage specific fields of shader_info are in a union. We've likely been lucky that this value was either overwritten or ignored by other stages. The recent change in shader_info layout in commit `84a1a2578d` ("compiler: pack shader_info from 160 bytes to 96 bytes") made this issue visible. Fixes: `cf2257069c` ("nir/spirv: Set a default number of invocations for geometry shaders") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-08 16:28:21 -08:00
Marek Olšák	84a1a2578d	compiler: pack shader_info from 160 bytes to 96 bytes Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-08 16:54:08 -05:00
Marek Olšák	9950523368	glsl/linker: pass shader_info to analyze_clip_cull_usage directly This will be needed by the next commit. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-11-08 16:54:06 -05:00
Marek Olšák	3ef50b023e	radeonsi/nir: fix compute shader crash due to nir_binary == NULL This partially reverts `8b30114dda`. Fixes: `8b30114dda` "radeonsi/nir: call nir_serialize only once per shader"	2019-11-08 16:47:59 -05:00
Marek Olšák	8b30114dda	radeonsi/nir: call nir_serialize only once per shader We were calling it twice. First serialize it, then use it to compute the cache key. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-08 15:30:28 -05:00
Marek Olšák	ad56022b0d	util: add blob_finish_get_buffer Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-08 15:30:28 -05:00
Eric Anholt	b1f38aed84	u_format: Fix swizzle of A1R5G5B5. Found once I started using the generated unpack code from the Mesa side. Fixes: `4bbaac3782` ("gallium: Add some more channel orderings of packed formats.") Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-11-08 11:56:02 -08:00
David Stevens	0466239aae	virgl: support emulating planar image sampling Mesa emulates planar format sampling with per-plane samplers. Virgl now supports this by allowing the plane index to be passed when creating a sampler view from a planar image. With this change, mesa now passes that information to virgl. Signed-off-by: David Stevens <stevensd@chromium.org> Reviewed-by: Lepton Wu <lepton@chromium.org>	2019-11-08 17:06:56 +00:00
Krzysztof Raszkowski	084431ce45	gallium/swr: Enable some ARB_gpu_shader5 extensions Enable / add to features.txt: - Enhanced textureGather. - Geometry shader instancing. - Geometry shader multiple streams. Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2019-11-08 16:04:47 +00:00
Krzysztof Raszkowski	e5ed9a1b91	gallium/swr: Fix GS invocation issues - Fixed proper setting gl_InvocationID. - Fixed GS vertices output memory overflow. Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2019-11-08 14:52:16 +00:00
Timur Kristóf	911a826141	ac: Handle invalid GFX10 format correctly in ac_get_tbuffer_format. It happens that some games try to access a vertex buffer without a valid format. This case was incorrectly handled by ac_get_tbuffer_format which made ACO emit an invalid instruction. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Cc: 19.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-08 13:30:30 +01:00
Boris Brezillon	ee82f9f07e	panfrost: Try to evict unused BOs from the cache The panfrost BO cache can only grow since all newly allocated BOs are returned to the cache (unless they've been exported). With the MADVISE ioctl that's not a big issue because the kernel can come and reclaim this memory, but MADVISE will only be available on 5.4 kernels. This means an app can currently allocate a lot of memory without ever releasing it, leading to some situations where the OOM-killer kicks in and kills the app (or even worse, kills another process consuming more memory than the GL app) to get some of this memory back. Let's try to limit the amount of BOs we keep in the cache by evicting entries that have not been used for more than one second (if the app stopped allocating BOs of this size, it's likely to not allocate similar BOs in the near future). This solution is based on the VC4/V3D implementation. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-08 11:26:47 +01:00
Boris Brezillon	25059cc41f	panfrost: Move BO cache related fields to a sub-struct We will soon introduce an LRU list to evict BOs that have been unused for more than 1 second. Let's first move all BO cache fields to a sub-struct to clarify which fields are used by the BO caching logic. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-08 11:26:47 +01:00
Alyssa Rosenzweig	5f768eda43	pan/midgard: Switch base for vertex texturing on T720 There aren't texture pipeline registers anymore; instead, space is shared with work and ldst registers for output and input respectively. We need to shift the base registers to represent this correctly. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-08 06:45:03 +00:00
Alyssa Rosenzweig	ac14facf7a	pan/midgard: Pass shader stage to disassembler Vertex texturing behaves differently from fragment texturing on some GPUs. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-08 06:45:03 +00:00
Alyssa Rosenzweig	515941202d	pan/midgard: Disassemble half-steps correctly The meaning of some bits shifts; we need to account for this to print swizzles sanely. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-08 06:45:03 +00:00
Alyssa Rosenzweig	ec2af6bc97	pan/midgard: Fix printing of half-registers in texture ops We were using old style half-registers; let's update that to be consistent, preparing us for more disassembler changes in this area. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-08 06:45:03 +00:00
Kristian H. Kristensen	4a4fad7f40	freedreno/ir3: Use regid() helper when setting up precolor regs Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:46:21 -08:00
Kristian H. Kristensen	3699a74a43	freedreno/a6xx: Turn on tessellation shaders Wow. Very triangle. So shader. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	53782571ae	freedreno/a6xx: Only use merged regs and four quads for VS+FS When other geometry stages are present, we choose two quads and no merged regs. Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	07aedc367c	freedreno/blitter: Save tessellation state We have tessellation state now. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	d2d0c8186d	freedreno/a6xx: Only set emit.hs/ds when we're drawing patches At least the gallium blitter helper will call us to draw with tessellation shaders set but a non-patch primitive. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	e584790885	freedreno: Use bypass rendering for tessellation It seems like tiling could work in the Adreno architecture, but we've only ever seen bypass rendering with tessellation. For now, let's do that too. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	47e2c19511	freedreno/a6xx: Program state for tessellation stages Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	03a30e7c3d	freedreno/a6xx: Emit constant parameters for tessellation stages Assemble the information the stages need and emit the constants. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	5dd51d2da7	freedreno/a6xx: Allocate and program tessellation buffer Tessellation needs a couple of buffers that should hold the entire output from a full VS+TCS draw call. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	f0ef3e9697	freedreno/a6xx: Build the right draw command for tessellation We need to select the right primitive type, set a bit to turn on tessellation and or in the TES output primitive type. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	7272e8a709	freedreno/ir3: Allocate const space for tessellation parameters The tessellation stages need size and stride of the patch layout as well as locations of attributes in the patch. The tessellation stages also use two system memory BOs and need the iovas of those. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	8739ea3ab5	freedreno/ir3: Pre-color TCS header and primitive ID inputs Similar to GS, the registers are shared and not reinitialized between VS and TCS, so we need to make sure to allocate the same registers for the system values between stages. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	b12ebe3e81	freedreno/ir3: Don't assume binning shader is always VS In tessellation mode, the TES is (probably) the binning shader. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	3cedeba7c9	freedreno/ir3: Setup inputs and outputs for tessellation stages Similar to GS, some inputs are reused when we chsh from VS to TCS or TES to GS, so we need to make sure we setup the right inputs and make the shared system values outputs so they don't get clobbered. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	e28fbbd861	freedreno/ir3: Implement TCS synchronization intrinsics We add two new IR3 specific nir intrinsics that map to the new condend and endpatch instructions. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:40:27 -08:00
Kristian H. Kristensen	4915231b8a	freedreno/ir3: Implement tess coord intrinsic Our lowering pass made the z component unused by replacing its uses by 1 - x - y. The intrinsic implementation then just needs to return the x and y components. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:37:08 -08:00
Kristian H. Kristensen	e16e48d00c	freedreno/ir3: End TES with chsh when using GS When we have both TES and GS, the TES needs to chain to the VS with chmask and chsh GS just like the VS does to either TCS or GS. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:37:05 -08:00
Kristian H. Kristensen	581cd59692	freedreno/ir3: Add new synchronization opcodes There are two new opcodes in use in tessellation control shaders: category 0, opcodes 13 and 15. unk13 is a kill type of instruction that terminates threads where !p0.x and is used to narrow down a patch wavefront to just thread 0. Then, once thread 0 has written the tess levels, it issues unk15, which might signal the TE that another patch has been fully written. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:37:02 -08:00
Kristian H. Kristensen	56ed835bff	freedreno/ir3: Extend geometry lowering pass to handle tessellation VS and TCS pass varyings the same way as VS and GS do. TCS then writes entire patch to a system memory BO and TES eventually reads back from the BO once the TE starts generating vertices. TES outputs vertices the same way as VS and GS, except when there's a GS as well, in which case TES passes varyings to GS same way the VS would. In addition, the TCS needs a little bit of control flow massaging so that it only runs for valid invocations and needs a couple of unknown instructions to synchronize with the TE. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:59 -08:00
Kristian H. Kristensen	8621fbc37b	freedreno/ir3: Add tessellation field to shader key Whether we're tessellating and which primitives the TES outputs affects the entire pipeline so let's add a field to the key to track that. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:56 -08:00
Kristian H. Kristensen	77b96b843e	freedreno/ir3: Use imul24 in offset calculations With the imul24 opcode in place, we can now use it for computing local offsets (ie for ldlw/stlw). Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:53 -08:00
Kristian H. Kristensen	41984c8422	freedreno/ir3: Add ir3 intrinsics for tessellation These provide the iovas for system memory buffers used for tessellation as well as a new HW specific system value. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:50 -08:00
Kristian H. Kristensen	d6209a50bb	freedreno: Don't count primitives for patches The gallium helper doesn't like patches and we can't determine how many primitives it gets tessellated into anyway. On gens where we have tessellation, we get the prim count from a HW counter so just skip counting on the CPU. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:47 -08:00
Kristian H. Kristensen	fe450ef4cf	freedreno/ir3: Add load and store intrinsics for global io These intrinsics take an ivec2 for the 64 bit base address and an integer offset. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:44 -08:00
Kristian H. Kristensen	5d67da13a3	freedreno/ir3: Emit link map as byte or dword offsets as needed Stages that load inputs with ldlw (TCS, GS) need byte offsets, stages that load with ldg (TES) need dword offsets. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:42 -08:00
Kristian H. Kristensen	1f3b52ce50	freedreno/a6xx: Add register offset for STG/LDG These instructions take a 64 bit iova as two consecutive registers and an immediate offset. This patch adds support for the offset to be a single register, which is added to the 64 bit iova. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:39 -08:00
Kristian H. Kristensen	3d16ec4a71	freedreno/a6x: Rename z/s formats What we call eRB6_Z24_UNORM_S8_UINT now is actually RB6_Z24_UNORM_S8_UINT_AS_R8G8B8A8 and RB6_X8Z24_UNORM is actually RB6_Z24_UNORM_S8_UINT. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:36 -08:00
Kristian H. Kristensen	50124afe34	freedreno/a6xx: Fix layered texture type enum 2D array textures and 3D textures are different enum values after all. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:33 -08:00
Kristian H. Kristensen	0276d0766d	freedreno: Add nogmem debug option to force bypass rendering Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:31 -08:00
Kristian H. Kristensen	7fed7c2a7d	freedreno/a6xx: Clear sysmem with CP_BLIT Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:28 -08:00
Kristian H. Kristensen	b0b443dcab	freedreno/a6xx: Fix primitive counters again We use one mechanism for (REG_A6XX_RBBM_PRIMCTR_8_LO) PIPE_QUERY_PRIMITIVES_GENERATED, which counts all primitives that exit the geometry pipeline, whether or not xfb is on. Then for PIPE_QUERY_PRIMITIVES_EMITTED, we use the CP_EVENT_WRITE subfunction that writes out per-stream counts for generated and emitted, but only when xfb is enabled. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:22 -08:00
Kristian H. Kristensen	835f8d1ba1	freedreno/registers: Add comments about primitive counters Adding comments about best guess at what the counters count. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:19 -08:00
Kristian H. Kristensen	96968d0ba2	freedreno/registers: Move SP_PRIMITIVE_CNTL and SP_VS_VPC_DST Move these two to be in order with the other VS regs. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:36:16 -08:00
Kristian H. Kristensen	ba54f7dd03	freedreno/registers: Fix typo Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-11-07 16:35:27 -08:00
Rhys Perry	78e3ea9a0f	aco: add Instruction::usesModifiers() and add more checks in the optimizer No pipeline-db changes. v2: use early-exit for VOP3 Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v1)	2019-11-08 00:14:06 +00:00
Rhys Perry	76544f632d	radv: adjust loop unrolling heuristics for int64 In particular, increase the cost of 64-bit integer division. Fixes huge shaders with dEQP-VK.spirv_assembly.type.scalar.i64.mod_geom , with ACO used for GS this creates shaders requiring a branch with >32767 dword offset. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-07 23:29:12 +00:00
Erico Nunes	9817bff4da	lima: fix bo submit memory leak Fix memory leak on allocation for lima submit, reported by valgrind. 128 bytes in 1 blocks are definitely lost in loss record 38 of 84 at 0x484A6E8: realloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so) by 0x58689C7: util_dynarray_ensure_cap (u_dynarray.h:91) by 0x5868BBB: util_dynarray_grow_bytes (u_dynarray.h:139) by 0x5868BBB: lima_submit_add_bo (lima_submit.c:113) by 0x585D7D3: lima_ctx_buff_va (lima_context.c:57) by 0x586378F: lima_pack_plbu_cmd (lima_draw.c:802) by 0x586378F: lima_draw_vbo (lima_draw.c:1351) by 0x5406A2F: u_vbuf_draw_vbo (u_vbuf.c:1184) by 0x55D0A57: st_draw_vbo (st_draw.c:268) by 0x55576CB: _mesa_draw_arrays (draw.c:374) by 0x55576CB: _mesa_draw_arrays (draw.c:351) by 0x43610B: Mesh::render_vbo() (mesh.cpp:583) by 0x415DBB: SceneBuild::draw() (scene-build.cpp:242) by 0x41131B: MainLoop::draw() (main-loop.cpp:133) by 0x411947: MainLoop::step() (main-loop.cpp:108) Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-11-07 23:03:01 +00:00
Erico Nunes	d939f5d463	lima: fix nir shader memory leak Fix memory leak on allocation for nir shader, reported by valgrind. 3,502 (480 direct, 3,022 indirect) bytes in 1 blocks are definitely lost in loss record 77 of 84 at 0x48483F8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so) by 0x5750817: ralloc_size (ralloc.c:119) by 0x5750977: rzalloc_size (ralloc.c:151) by 0x575C173: nir_shader_create (nir.c:45) by 0x5763ACB: nir_shader_clone (nir_clone.c:728) by 0x55D5003: st_create_fp_variant (st_program.c:1242) by 0x55D789F: st_get_fp_variant (st_program.c:1522) by 0x55D789F: st_get_fp_variant (st_program.c:1507) by 0x56400C3: st_update_fp (st_atom_shader.c:163) by 0x563D333: st_validate_state (st_atom.c:261) by 0x55D07CB: prepare_draw (st_draw.c:132) by 0x55D08DF: st_draw_vbo (st_draw.c:184) by 0x55576CB: _mesa_draw_arrays (draw.c:374) by 0x55576CB: _mesa_draw_arrays (draw.c:351) Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-11-07 23:03:01 +00:00
Prodea Alexandru-Liviu	1a05811936	Meson: Remove lib prefix from graw and osmesa when building with Mingw. Also remove version suffix from osmesa swrast on Windows. v2: Make sure we don't remove lib prefix on *nix platforms. Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Cc: "19.3" <mesa-stable@lists.freedesktop.org>	2019-11-07 22:04:50 +00:00
Marek Olšák	0b3111ed84	mesa: expose SPIR-V extensions in the Compatibility profile too We would like to have GL 4.6 Compatibility too. The extensions don't support compatibility features, so no other changes are needed. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-11-07 16:04:30 -05:00
Drew DeVault	299c55df88	st_get_external_sampler_key: improve error message Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-11-07 15:57:23 -05:00
Eric Anholt	9d2c8df3eb	mesa/st: Make st_pipe_format_to_mesa_format an effective no-op. All callers other than the unit test just wanted to convert back from a known-mesa-equivalent format, which is now a no-op. v2: Fix assertion failure in iris GL startup with BGR565 by continuing to return MESA_FORMAT_NONE for non-Mesa formats. Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2019-11-07 19:43:41 +00:00
Eric Anholt	75921a0912	mesa/st: Gut most of st_mesa_format_to_pipe_format(). Now that MESA_FORMAT_x is just a PIPE_FORMAT_x define, we can strip this function down to just the compression fallbacks. v2: Restore the SRGB format for ASTC SRGB fallback case. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-07 19:43:41 +00:00
Eric Anholt	807a800d8c	mesa: Redefine MESA_FORMAT_* in terms of PIPE_FORMAT_*. There are various places in Mesa where we would like to be able to have a shared format enum between Mesa and gallium (NIR compiler's image formats, for example, or mapping from gallium's formats to mesa's and vice versa in st_format.c). Rewriting all MESA_FORMAT to PIPE_FORMAT would be disruptive and possibly more work than it's worth (And I actually prefer MESA_FORMAT's name scheme), so for now just make it so that there's one shared set of enum values. The #defines here were generated by printing out from the tests/st_format.c round-tripping loop, with the exception of 8888 formats where I hand-edited the #defines to point at the corresponding gallium packed format define. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-07 19:43:41 +00:00
Eric Anholt	d27dda907a	mesa: Prepare for the MESA_FORMAT_* enum to be sparse. To redefine MESA_FORMAT in terms of PIPE_FORMAT enums, we need to fix places where we iterated up to MESA_FORMAT_COUNT. I use _mesa_get_format_name(f) == NULL as the signal that it's not an enum value with a MESA_FORMAT. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-07 19:43:41 +00:00
Eric Anholt	6b1c250245	mesa/st: Test round-tripping of all compressed formats. We checked round-tripping of formats without fallbacks, but weren't setting the compression support flags in the mock context and thus needed to skip testing those. Just set all the flags and assert that no fallbacks are triggered, so we get full test coverage. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-07 19:43:41 +00:00
Eric Anholt	80a8021d6c	mesa: Stop defining a full separate format for RGBA_UINT8. We have packed formats for RGBA and ABGR already, so we can just pack/unpack code. v2: Rebase on endianness macro rename Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2019-11-07 19:43:41 +00:00
Eric Anholt	b28eb044cd	gallium: Add equivalents of packed MESA_FORMAT_*UINT formats. These are the last formats that MESA_FORMAT had and PIPE_FORMAT didn't. The .csv entries' channel sizes and swizzles all came from the corresponding UNORM format. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-07 19:43:41 +00:00
Eric Anholt	6fab4a7b59	gallium: Add an equivalent of MESA_FORMAT_BGR_UNORM8. This is the last unorm format that MESA_FORMAT had and PIPE_FORMAT didn't. Note that it's an array format on gallium's side as well, since it's a NPOT pixel size. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-07 19:43:41 +00:00
Eric Anholt	4bbaac3782	gallium: Add some more channel orderings of packed formats. This covers everything that MESA_FORMAT had for packed unorm. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-07 19:43:41 +00:00
Eric Anholt	6196259d95	gallium: Add defines for FXT1 texture compression. This texture compression is exposed by 830 and 915, and to make MESA_FORMAT match PIPE_FORMAT defines I need a corresponding PIPE_FORMAT. v2: Set is_hand_written so we don't try to generate pack/unpack code. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-07 19:43:41 +00:00
Eric Anholt	cb9fefe1db	mesa/st: Add mapping of MESA_FORMAT_RGB_SNORM16 to gallium. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-07 19:43:41 +00:00
Samuel Pitoiset	deafe4cc58	radv/gfx10: fix primitive indices orientation for NGG GS The primitive indices have to be swapped to follow the drawing order. This fixes corruption with Overwatch when NGG GS is force enabled. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-07 19:21:15 +00:00
Kenneth Graunke	49ee657ef8	Revert "intel/blorp: Fix usage of uninitialized memory in key hashing" This reverts commit `4432a2d14d`. Pretty much every SKQP test dies with this assertion: skqp: ../src/mesa/drivers/dri/i965/brw_program_cache.c:102: hash_key: Assertion `item->key_size % 4 == 0' failed.	2019-11-07 09:27:12 -08:00
Danylo Piliaiev	4432a2d14d	intel/blorp: Fix usage of uninitialized memory in key hashing The automatically generated padding in structs contains undefined values, force pack the structs to eliminate the padding. Otherwise structs with the same values may generate different hashes. Valgrind output: Conditional jump or move depends on uninitialised value(s) util_fast_urem32 (fast_urem_by_const.h:71) hash_table_search (hash_table.c:262) _mesa_hash_table_search (hash_table.c:296) anv_pipeline_cache_search_locked (anv_pipeline_cache.c:318) anv_pipeline_cache_search (anv_pipeline_cache.c:335) lookup_blorp_shader (anv_blorp.c:38) blorp_params_get_mcs_partial_resolve_kernel (blorp_clear.c:1112) blorp_mcs_partial_resolve (blorp_clear.c:1205) anv_image_mcs_op (anv_blorp.c:1742) anv_cmd_predicated_mcs_resolve (genX_cmd_buffer.c:774) transition_color_buffer (genX_cmd_buffer.c:1159) cmd_buffer_end_subpass (genX_cmd_buffer.c:4840) Uninitialised value was created by a stack allocation blorp_params_get_mcs_partial_resolve_kernel (blorp_clear.c:1103) Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-07 16:02:55 +00:00
Dylan Baker	0013af540d	osmesa/tests: Extend render test to cover other working cases Only the GL_UNSIGNED_BYTE cases actually work, the rest all fail, but we should test the working cases to ensure that they continue to work. Reviewed-by: Brian Paul <brianp@vmware.com>	2019-11-07 06:11:19 -08:00
Dylan Baker	7bfb56a135	gallium/osmesa: Convert osmesa test to gtest This uses a bunch of additional C++ features for niceness and safety. Reviewed-by: Brian Paul <brianp@vmware.com>	2019-11-07 06:11:19 -08:00
Dylan Baker	d1767362aa	meson: gtest needs pthreads Reviewed-by: Brian Paul <brianp@vmware.com>	2019-11-07 06:11:19 -08:00
Tomeu Vizoso	072207bc18	panfrost: Pipe the GPU ID into compiler and disassembler Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-11-07 08:48:45 +00:00
Daniel Schürmann	a47e232ccd	aco: workaround Tonga/Iceland hardware bug The workaround got accidentally moved to the wrong place Fixes: `08d510010b` aco: increase accuracy of SGPR limits Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-11-07 09:19:50 +01:00
Boris Brezillon	b60ed3c7b2	panfrost: Release the ctx->pipe_framebuffer ref ctx->pipe_framebuffer contains the last bound FB state, let's release resources pointed to by this FB state when the context is destroyed. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-07 08:33:08 +01:00
Boris Brezillon	8c8e4fd5c6	panfrost: Destroy the upload manager allocated in panfrost_create_context() pipe->stream_uploader has been allocated with u_upload_create_default() in panfrost_create_context(), let's destroy it in the context destroy path. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-07 08:33:08 +01:00
Kai Wasserbäch	ddc588ff71	intel/gen_decoder: Fix unused-but-set-variable warning This commit fixes the following warning: ../src/intel/common/gen_decoder.c: In function ‘gen_spec_load_from_path’: ../src/intel/common/gen_decoder.c:741:11: warning: variable ‘len’ set but not used [-Wunused-but-set-variable] 741 \| size_t len, filename_len = strlen(path) + 20; \| ^~~ Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-11-07 11:32:55 +11:00
Kai Wasserbäch	acfea09dbd	nir: fix unused function warning in src/compiler/nir/nir.c This commit fixes the following warning: ../src/compiler/nir/nir.c:1827:1: warning: ‘dest_is_ssa’ defined but not used [-Wunused-function] 1827 \| dest_is_ssa(nir_dest *dest, void *_state) \| ^~~~~~~~~~~ Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-07 11:32:55 +11:00
Kai Wasserbäch	4f8cc032b7	nir: fix unused variable warning in find_and_update_previous_uniform_storage This commit fixes the following warning: ../src/compiler/glsl/gl_nir_link_uniforms.c: In function ‘find_and_update_previous_uniform_storage’: ../src/compiler/glsl/gl_nir_link_uniforms.c:166:16: warning: unused variable ‘num_blks’ [-Wunused-variable] 166 \| unsigned num_blks = nir_variable_is_in_ubo(var) ? \| ^~~~~~~~ Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-07 11:32:55 +11:00
Kai Wasserbäch	8aa4d0bff6	nir: fix unused variable warning in nir_lower_vars_to_explicit_types This commit fixes the following warning: ../src/compiler/nir/nir_lower_io.c: In function ‘nir_lower_vars_to_explicit_types’: ../src/compiler/nir/nir_lower_io.c:1435:22: warning: unused variable ‘supported’ [-Wunused-variable] 1435 \| nir_variable_mode supported = nir_var_mem_shared \| nir_var_shader_temp \| nir_var_function_temp; \| ^~~~~~~~~ Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-07 11:32:55 +11:00
Lepton Wu	5a40e153fd	gallium: dri2: Use index as plane number. This fixes wrong colors when playing video under an Android + virgl configuration. Fixes: `2decad495f` ("gallium/dri2: Support images with multiple planes for modifiers") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Lepton Wu <lepton@chromium.org>	2019-11-06 21:58:28 +00:00
Lionel Landwerlin	c1c346f166	anv: implement VK_KHR_separate_depth_stencil_layouts v2: Use ternary to simplify code (Jason) v3: Reorder switch cases to follow existing section ordering (Nanley) Add missing comment in cmd_buffer_end_subpass() about new layout (Nanley) v4: Fix layout comparison for stencil case (Nanley) Update a few more comments (Nanley) Move VK_IMAGE_LAYOUT_STENCIL_ATTACHMENT_OPTIMAL_KHR in color attachment case for future stencil-CCS support (Nanley) v5: Missed comments update (Nanley) Updated relnotes.txt (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-11-06 20:13:30 +00:00
Eric Anholt	cb655d2554	Revert "ci: Switch over to an autoscaling GKE cluster for builds." This reverts commit `c9df92bf79`. It turns out that gitlab-runner uses kubernetes all wrong, spawning Pods and sshing into them to run the script instead of Jobs containing the script to run. This means that when anything goes wrong with the pod (autoscale, preemption, VM maintenance, cluster reconfiguration), the job fails and only sometimes gets handled as a runner system failure. Even worse, due to bugs in either the runner or k8s itself, some classes of timeout-related failure end up not being reported as failures, and the job will incorrectly report success! Disable using the "autoscale" cluster until we can do something else (docker-machine instead of k8s, or the custom third-party k8s-native runner). Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Acked-by: Daniel Stone <daniels@collabora.com>	2019-11-06 11:38:07 -08:00
Tomeu Vizoso	94e6d17043	panfrost: Print the right zero field Copy paste error. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-11-06 18:13:16 +01:00
Dylan Baker	401d7221ed	docs: update calendar, add news item and link release notes for 19.2.2	2019-11-06 09:07:02 -08:00
Dylan Baker	6fb82263d4	docs: add sha256 sum to 19.2.3 release notes	2019-11-06 09:05:58 -08:00
Dylan Baker	d7418d67af	docs: add release notes for 19.2.3	2019-11-06 09:05:56 -08:00
Tomeu Vizoso	6469c1a445	panfrost: Generate polygon list manually for SFBD On clears without draws, the SFBD GPUs need for userspace to generate the trivial polygon list. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-06 16:19:31 +01:00
Tomeu Vizoso	8e1ae5fa14	panfrost: Decode blend shaders for SFBD Also set MALI_HAS_BLEND_SHADER as needed. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-06 16:18:46 +01:00
Tomeu Vizoso	afeda06062	panfrost: Take into account texture layers in SFBD Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-06 16:18:46 +01:00
Tomeu Vizoso	9447a84f69	panfrost: Rework format encoding on SFBD Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-06 16:18:46 +01:00
Tomeu Vizoso	e40d11ccb2	panfrost: Set 0x10 bit on mali_shader_meta.unknown2_4 on T720 Testing shows that it's needed. Also remove ctx->is_t6xx as it was the last use of it. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-06 16:17:13 +01:00
Tomeu Vizoso	23fe7cd2d6	panfrost: Add checksum fields to SFBD descriptor During tests on T720, these fields were discovered. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-06 16:17:13 +01:00
Erik Faye-Lund	bc80900b6c	zink: do advertize integer support in shaders This is supported, so let's correct this. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-11-06 13:43:14 +01:00
Erik Faye-Lund	8920689a58	zink/spirv: implement ball_fequal[2-4]	2019-11-06 13:43:14 +01:00
Erik Faye-Lund	ea2d9b3d38	zink/spirv: implement ball_iequal[2-4]	2019-11-06 13:43:14 +01:00
Erik Faye-Lund	0515ac4571	zink/spirv: implement bany_inequal[2-4]	2019-11-06 13:43:14 +01:00
Erik Faye-Lund	c18c81edc6	zink/spirv: implement bany_fnequal[2-4]	2019-11-06 13:43:14 +01:00
Erik Faye-Lund	4e0ca477d8	zink/spirv: support loading bool constants Seems I missed this before; let's add support for this.	2019-11-06 13:43:14 +01:00
Erik Faye-Lund	6630baecf1	zink/spirv: drop temp-array for component-count	2019-11-06 13:43:14 +01:00
Michel Dänzer	e0fff37f70	gitlab-ci: Don't build libdrm for ARM The Debian packages work fine. Saves a little bit of time and disk space. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-11-06 13:10:07 +01:00
Michel Dänzer	b4d3ae2269	gitlab-ci: Use separate arm64 build/test docker images The image used for test jobs is only about 1/6 as big as before, which may help avoid some issues with some of the test boards. Inspired by https://gitlab.freedesktop.org/mesa/mesa/issues/2046 . v2: * Leave LIBDRM_VERSION at 2.4.99 (Daniel Stone) * Delete more build artifacts from dEQP tree (Daniel Stone) v3: * Set LD_LIBRARY_PATH for ldd Acked-by: Daniel Stone <daniels@collabora.com> # v2 Reviewed-by: Eric Anholt <eric@anholt.net> # Except for the ldd line	2019-11-06 13:10:07 +01:00
Erik Faye-Lund	dd4587b55c	zink: use u_blitter when format-reinterpreting	2019-11-06 11:37:36 +00:00
Erik Faye-Lund	7b9d17fe84	zink: always allow sampling of images This is required if we're going to blit from/to it using u_blitter.	2019-11-06 11:37:36 +00:00
Erik Faye-Lund	1277192d55	zink: transition resources before resolving	2019-11-06 11:37:36 +00:00
Erik Faye-Lund	b385ad0c75	zink: disable fragment-shader texture-lod We don't support nir_texop_txd, which is required by this cap. So let's disable it for now. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `8d46e35d16` ("zink: introduce opengl over vulkan")	2019-11-06 11:37:36 +00:00
Duncan Hopkins	aa64b6dc7f	zink: make sure src image is transfer-src-optimal Fixes: `d2bb63c8d4` ("zink: Use optimal layout instead of general. Reduces valid layer warnings. Fixes RADV image noise.")	2019-11-06 11:37:36 +00:00
Erik Faye-Lund	a32a92f53a	zink: do not advertize coherent mapping We do not support them yet, so let's not pretend. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `8d46e35d16` ("zink: introduce opengl over vulkan")	2019-11-06 11:37:36 +00:00
Erik Faye-Lund	ca87a53b46	zink: always allow mutating the format There's no good way to know if a texture-view will be created, so we just have to accept it for all resources. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `8d46e35d16` ("zink: introduce opengl over vulkan")	2019-11-06 11:37:36 +00:00
Erik Faye-Lund	f3a72fd61c	zink: use actual format for render-pass We should use the format derived from the image-view here, not from the image itself. Otherwise, we'll end up with incompatible render-passes. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `8d46e35d16` ("zink: introduce opengl over vulkan")	2019-11-06 11:37:36 +00:00
Pierre-Eric Pelloux-Prayer	21be5c8edd	radeonsi: fix shader disk cache key Use unsigned values; otherwise sign extension produces a 64-bit value whose upper 32 bits are all 1. Fixes: `2afeed3010` ("radeonsi: tell the shader disk cache what IR is used")	2019-11-06 10:15:37 +01:00
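The sign-extension effect mentioned in the radeonsi cache-key commit above can be sketched with plain integer arithmetic. This is an illustration only: the helper names `widen_signed` and `widen_unsigned` and the sample bit pattern are assumptions, not code from the driver.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def widen_signed(value32: int) -> int:
    """Treat a 32-bit pattern as int32, then widen it to 64 bits."""
    value32 &= MASK32
    if value32 & 0x80000000:      # sign bit set, so the int32 is negative
        value32 -= 1 << 32
    return value32 & MASK64       # two's-complement 64-bit pattern


def widen_unsigned(value32: int) -> int:
    """Treat a 32-bit pattern as uint32, then widen it to 64 bits."""
    return value32 & MASK32


bits = 0x80000001                 # hypothetical value with the top bit set
signed64 = widen_signed(bits)
unsigned64 = widen_unsigned(bits)
```

With the top bit set, the signed path fills the upper 32 bits with ones, while the unsigned path leaves them zero, so the two produce different 64-bit keys from the same 32-bit input.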
Samuel Pitoiset	fb07fd4e6c	radv: implement VK_EXT_subgroup_size_control This extension allows controlling the subgroup size, both by allowing a varying subgroup size and by specifying a required subgroup size. This implementation only allows specifying a required subgroup size for compute shaders because there are some caveats with other shader stages (e.g. NGG with geometry shader). This basically allows apps to use Wave32 for compute shaders. This extension is enabled for all chips but only GFX10 supports Wave32. ACO doesn't support it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-06 09:20:39 +01:00
Samuel Pitoiset	da6c30f9f6	radv: rely on shader's wavesize when computing NGG info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-06 09:20:36 +01:00
Samuel Pitoiset	d3f9957de4	radv: determine shaders wavesize at pipeline level Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-06 09:20:34 +01:00
Samuel Pitoiset	d1e1f7c4d5	radv: hardcode the number of waves for the GFX6 LS-HS bug It's always 64. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-06 09:20:32 +01:00
Samuel Pitoiset	f010b90ac5	radv/gfx10: enable wave32 for compute based on shader's wavesize This will allow to change wavesize on-demand. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-06 09:20:30 +01:00
Samuel Pitoiset	c0f76528ae	nir: fix packing of nir_variable The maximum number of descriptor sets is indeed 32 but without the sign bit. The maximum number of bindings for RADV is way larger, keep it as 32-bit. Fixes: `96e6ef80d9` ("nir: pack the rest of nir_variable::data") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2019-11-06 08:51:53 +01:00
Samuel Pitoiset	0b3bd1a7c2	radv: fix 32-bit compiler warnings Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2031 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-06 08:00:33 +01:00
Samuel Pitoiset	50b3ec35d2	radv: add a note about perftest/debug options Now that all environment variables are documented, it would be appreciated if we can keep this up-to-date. [skip ci] Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-06 07:58:33 +01:00
Samuel Pitoiset	cc66976d0a	docs: document all RADV environment variables Requested by https://gitlab.freedesktop.org/mesa/mesa/issues/2022 [skip ci] Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-06 07:58:22 +01:00
Marek Olšák	8145492f4a	nir/serialize: pack nir_variable flags Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-05 23:35:31 -05:00
Marek Olšák	3aa72a394a	nir/serialize: store 32-bit object IDs instead of 64-bit That means we have only 30 bits for object IDs, because 2 bits are sometimes used for something else. This decreases the uncompressed shader size for the biggest Borderlands 2 shader from 33.6 KB to 23.2 KB. (31% decrease) Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-05 23:35:31 -05:00
Marek Olšák	d5768fcd45	nir/serialize: don't expand 16-bit variable state slots to 32 bits the swizzle also needs only 16 bits Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-05 23:35:31 -05:00
Marek Olšák	96e6ef80d9	nir: pack the rest of nir_variable::data Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-05 23:32:34 -05:00
Marek Olšák	442ef8c3e3	radeonsi: keep serialized NIR instead of nir_shader in si_shader_selector This decreases memory usage, because serialized NIR is more compact. The main shader part is compiled from nir_shader. Monolithic shader variants are compiled from nir_binary. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-05 23:28:45 -05:00
Marek Olšák	abb8011f9d	radeonsi: don't keep compute shader IR after compilation not needed. We also need to free TGSI in the destroy function for the case when an app is terminated and si_create_compute_state_async is never executed because of util_queue_drop_job. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-05 23:28:43 -05:00
Marek Olšák	62229e8949	radeonsi: use IR SHA1 as the cache key for the in-memory shader cache instead of using whole IR binaries. This saves some memory. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-05 23:28:42 -05:00
Vasily Khoruzhick	65a5b24aee	lima: add support for gl_PointSize GP handles gl_PointSize similar to gl_Position, i.e. it needs separate buffer and it has special type in varying descriptors, also for indexed draw we need to emit special PLBU command to pass address of gl_PointSize buffer. Blob also clamps gl_PointSize to 1 .. 100 (as well as line width), so let's do the same. Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-11-05 17:44:56 -08:00
Eric Engestrom	73cc2fec10	mesa/imports: let the build system detect strtok_r() Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2013 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-11-05 22:38:04 +00:00
Eric Engestrom	66dd53584e	meson: require `nm` again on Unix systems This was made optional in `ff9bf223c2` ("meson: make nm binary optional") for Windows, but proper Windows support has been added and `nm` is now only used on Unix systems. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers>	2019-11-05 20:56:44 +00:00
Eric Engestrom	4d5cde1fff	meson: add windows support to symbols checks Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers>	2019-11-05 20:31:37 +00:00
Eric Engestrom	2f652e0b36	meson: move the generic symbols check arguments to a common variable Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers>	2019-11-05 20:30:47 +00:00
Eric Engestrom	2c4395e61c	meson: add variable to control the symbols checks Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers>	2019-11-05 20:12:32 +00:00
Pierre-Eric Pelloux-Prayer	67718ca352	mesa: fix call to _mesa_lookup_vao_err Fixes: `3e842a0b0e` ("mesa: rework _mesa_lookup_vao_err to allow usage from EXT_dsa") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2055 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-11-05 12:05:33 -08:00
Dylan Baker	5d085ad052	meson: Add dep_glvnd to egl deps when building with glvnd Otherwise if glvnd is not installed systemwide, but only in a prefix, its headers won't be found. This happens because if its headers are in /usr/include/ then another dependency will provide the necessary -I arguments and compilation will work. Fixes: `035ec7a2bb` ("meson: Add support for EGL glvnd") Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-11-05 16:44:41 +00:00
Dylan Baker	9020f519d2	util/u_endian: Add error checks As suggested by Eric Engestrom and Michel Dänzer.	2019-11-05 16:39:55 +00:00
Dylan Baker	ee4f1bc187	util: rename PIPE_ARCH_*_ENDIAN to UTIL_ARCH_*_ENDIAN As requested by Tim. This was generated with: grep 'PIPE_ARCH_.*_ENDIAN' -rIl \| xargs sed -ie 's@PIPE_ARCH_\(.*\)_ENDIAN@UTIL_ARCH_\1_ENDIAN@'g v2: - add this patch Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-05 16:39:55 +00:00
Dylan Baker	6b6897a9f9	gallium/osmesa: Use PIPE_ARCH_*_ENDIAN instead of little_endian function Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-05 16:39:55 +00:00
Dylan Baker	39b9fe03a9	mesa/main: delete now unused _mesa_little_endian Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-05 16:39:55 +00:00
Dylan Baker	f73a9c6586	mesa/swrast: replace instances of _mesa_little_endian with preprocessor Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-05 16:39:55 +00:00
Dylan Baker	453d52acd8	mesa/main: replace uses of _mesa_little_endian with preprocessor Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-05 16:39:55 +00:00
Dylan Baker	f9f60da813	util/u_endian: set PIPE_ARCH_*_ENDIAN to 1 This will allow it to be used as a drop in replacement for _mesa_little_endian in a number of cases. v2: - Always define PIPE_ARCH_LITTLE_ENDIAN and PIPE_ARCH_BIG_ENDIAN, define the one that reflects the host system to 1 and the other to 0 - replace all uses of #ifdef, #ifndef, and #if defined() with #if and #if ! with PIPE_ARCH_*_ENDIAN Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-05 16:39:55 +00:00
Dylan Baker	37e54736a7	util/u_endian: Use _WIN32 instead of _MSC_VER _WIN32 is defined by basically all windows compilers (MSVC, ICL, MinGW), whereas _MSC_VER is not defined by MinGW. Without this change MinGW falls through and doesn't define PIPE_ARCH at all, and is caught by some extra code in gallium. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-05 16:39:55 +00:00
Dylan Baker	cb0dbdd369	dri/osmesa: use preprocessor for selecting endian code paths Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-05 16:39:55 +00:00
Dylan Baker	68d8c1f971	r100: Use preprocessor to select big vs little endian paths Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-05 16:39:55 +00:00
Dylan Baker	a550b6b7f8	r200: use preprocessor for big vs little endian checks Instead of using a function at runtime we can just build the right code for the right platform. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-11-05 16:39:55 +00:00
Philipp Sieweck	38e706656d	svga: check return value of define_query_vgpu{9,10} Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2019-11-05 16:23:39 +00:00
Tomeu Vizoso	427d0c4b6a	gitlab-ci: Run only LAVA jobs in special-named branches Run only jobs needed for testing on LAVA devices if a branch starts with lava-ci-. This allows developers to have faster test cycles as these pipelines take only a bit above 8 minutes. Also has the advantage of conserving resources. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-05 16:09:47 +01:00
Pierre-Eric Pelloux-Prayer	febedee4f6	mesa: add EXT_dsa glGetVertexArray* 4 functions The implementation doesn't share much with get.c because: * the refactoring needed for get.c to not depend on ctx->Array.VAO would be quite large * glGetVertexArray* would still need to filter pname to only accept the ones specified by the spec * these functions are getters, so the implementation is trivial (the complexity is in the correct filtering of pname input) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer	2b44ca779b	mesa: extract helper function from _mesa_GetPointerv Will be used by EXT_dsa glGetVertexArrayPointervEXT implementation. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer	5adeff8033	mesa: add EXT_dsa EnableVertexArrayAttribEXT / DisableVertexArrayAttribEXT Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer	f793a8663d	mesa: add EXT_dsa glEnableVertexArrayEXT / glDisableVertexArrayEXT Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer	a053361793	mesa: add gl_vertex_array_object parameter to client state helpers This will allow to use the same helper for the EXT_direct_state_access implementation. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer	aef5d99671	mesa: add EXT_dsa glVertexArray* functions implementation Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer	a78d4e7e75	mesa: add vao/vbo lookup helper for EXT_dsa Add a single helper dealing with the lookup of both the vao and the vbo to avoid duplicating this code in all the glVertexArray* functions. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer	3e842a0b0e	mesa: rework _mesa_lookup_vao_err to allow usage from EXT_dsa ARB_dsa and EXT_dsa differ slightly when an uninitialized VAO is requested. In this case ARB_dsa fails while EXT_dsa requires initializing the object. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer	a26bb93943	mesa: add EXT_dsa glVertexArray* functions declarations Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-05 13:58:28 +01:00
Pierre-Eric Pelloux-Prayer	bfc1e4c112	mesa: pass vao as a function parameter This change will allow reusing the same function for the EXT_direct_state_access implementation. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-05 13:58:28 +01:00
Michel Dänzer	d80dece065	gitlab-ci: Set arm job CCACHE_DIR properly $PWD doesn't work for variables:, it ended up as "/ccache", always starting with an empty cache. v2: * Use relative path and realpath v3: * Use $CI_PROJECT_DIR (Eric Anholt) * Clear ccache stats in before_script if the cache is in $CI_PROJECT_DIR Fixes: `c9df92bf79` "ci: Switch over to an autoscaling GKE cluster for builds." Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-11-05 09:27:32 +01:00
Kenneth Graunke	337f58438e	nir: Handle image arrays when setting variable data Fixes a ton of regressions in image load store tests. Fixes: `4319cc8c0f` ("nir: pack nir_variable::data::xfb_*") Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-04 18:16:06 -08:00
Paulo Zanoni	b57383a944	intel/compiler: remove the operand restriction for src1 on GLK Commit `5847de6e9a` implemented a restriction that applies to ICL, but wrongly marked it as also applying to GLK. Reviewers of MR !1125 pointed this out, and the commit history shows the removal of GLK from parts of the patch, but it turns out there was still a left-over GLK check in the code. This code was breaking some of the i8vec2 tests on GLK, for example: dEQP-VK.subgroups.arithmetic.compute.subgroupadd_i8vec2 Removing the GLK check solves the issue for GLK. I don't see a reason why implementing this restriction would actually break GLK, so there's still more to investigate here since this bug may be affecting ICL+, but let's apply the real GLK fix while we analyze and discuss the other possible issues. Fixes: `5847de6e9a` ("intel/compiler: don't use byte operands for src1 on ICL") BSpec: 3017 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-11-05 00:08:34 +00:00
Marek Olšák	4319cc8c0f	nir: pack nir_variable::data::xfb_* Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-04 18:17:34 -05:00
Marek Olšák	08dc541b66	nir: pack nir_variable::data::stream Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-04 18:17:34 -05:00
Ian Romanick	9be4a422a0	nir/algebraic: Mark other comparison exact when removing a == a This prevents some additional optimizations that would change the original result. This includes things like (b < a && b < c) => b < min(a, c) and !(a < b) => b >= a. Both of these optimizations were specifically observed in the piglit tests added in piglit!160. This was discovered while investigating https://gitlab.freedesktop.org/mesa/mesa/issues/1958. However, the problem in that issue was that Chrome or ANGLE is replacing calls to isnan() with some stuff that we (correctly) optimize to false. If they had left the calls to isnan() alone, everything would have just worked. No shader-db changes on any Intel platform. I also tried marking the comparison generated by the isnan() function precise. The precise marker "infects" every computation involved in calculating the parameter to the isnan() function, and this severely hurt all of the (few) shaders in shader-db that use isnan(). I also considered adding a new ir_unop_isnan opcode that would implement the functionality. During GLSL IR-to-NIR translation, the resulting comparison operation would be marked exact (and the same thing would need to happen in SPIR-V translation). The approach taken by this patch seemed easier, but we may want to do the ir_unop_isnan thing anyway. Fixes: `d55835b8bd` ("nir/algebraic: Add optimizations for "a == a && a CMP b"") Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-04 14:05:49 -08:00
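Why rewrites of this kind are unsafe once NaN is involved can be shown with ordinary IEEE 754 floats. The snippet below is a sketch in Python rather than NIR, and the helper name `rewrite_is_safe` is made up for illustration; it only demonstrates that a negated ordered comparison and its flipped form disagree for NaN.

```python
nan = float("nan")
one = 1.0

# a == a is false only for NaN, so it acts as an "is not NaN" test.
self_equal = nan == nan

# Every ordered comparison with NaN is False, so negating one yields True,
# while the "equivalent" flipped comparison still yields False.
negated_less = not (nan < one)
flipped = nan >= one


def rewrite_is_safe(a: float, b: float) -> bool:
    """Check whether !(a < b) and a >= b agree for this pair of inputs."""
    return (not (a < b)) == (a >= b)


ordered_ok = rewrite_is_safe(2.0, 1.0)
nan_ok = rewrite_is_safe(nan, 1.0)
```

For ordered inputs the two forms agree, but for NaN they do not, which is why a comparison that stands in for a removed `a == a` check has to be protected from further algebraic rewriting.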
Ian Romanick	ea19f2fb68	nir/algebraic: Add the ability to mark a replacement as exact Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-04 14:05:49 -08:00
Marek Olšák	af94600484	compiler: make variable::data::binding unsigned Nothing seems to set a negative value. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-04 16:49:46 -05:00
Marek Olšák	4b4b383f38	st/mesa: call nir_lower_flrp only once per shader Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-04 16:49:44 -05:00
Marek Olšák	7d00218aed	st/mesa: call nir_opt_access only once Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-11-04 16:49:42 -05:00
Leo Liu	352b57d709	ac: add missing Arcturus to the info of pc lines Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: Marek Olšák <marek.olsak@amd.com>	2019-11-04 16:27:35 -05:00
Alyssa Rosenzweig	4da648a170	panfrost/ci: Update T760 expectations Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig	12d071024b	pan/midgard: Extend default_phys_reg to !32-bit We can pass through a size. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig	762623381d	pan/midgard: Extend swizzle packing for vec4/16-bit We would like to pack not just xyzw swizzles but also efgh swizzles. This should work for vec4/16-bit. More work will be needed to pack swizzles for vec8/16-bit and even more work for 8-bit, of course. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig	bf5508f7b9	pan/midgard: Extend offset_swizzle to non-32-bit We take a size parameter; use it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig	f538981384	pan/midgard: offset_swizzle doesn't need dstsize This argument should be omitted. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig	9eac9389fb	pan/midgard: Add bizarre corner case Someone really needs to look into this. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig	4ae4d82e21	pan/midgard: Compute bundle interference Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig	45ac8ea8bd	pan/midgard: Fix quadword_count handling Spilling can mess with this considerably. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig	0a77dd3203	pan/midgard: Validate tags when branching Midgard prefetches instructions based on tag (ALU, LD/ST, texture * size). To do so, the shader descriptor specifies the tag of the first instruction, all instructions specify the tag of the next linear instruction, and all branches explicitly specify the tag of the branch target. If you mess this up, you get an INSTR_TYPE_MISMATCH, which unambiguously refers to this problem, but it's still annoying to try to work out all the branch targets in your head to debug. Instead, let's track the tags of various blocks over time, so we can automatically validate tags of branch targets, to make INSTR_TYPE_MISMATCH issues immediately obvious in a disassembly. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-04 15:36:08 -05:00
Daniel Schürmann	efe737fc4f	aco: fix accidental reordering of instructions when scheduling Fixes: `8678699918` "aco: implement VGPR spilling" Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-04 20:14:14 +01:00
Daniel Schürmann	5c7dcb15e0	aco: only use single-dword loads/stores for spilling Fixes: `8678699918` "aco: implement VGPR spilling" Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-04 20:14:14 +01:00
Daniel Schürmann	d97c0bdd55	aco: fix immediate offset for spills if scratch is used Fixes: `8678699918` "aco: implement VGPR spilling" Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-04 20:14:14 +01:00
Lionel Landwerlin	ee6fbb95a7	anv: Properly handle host query reset of performance queries The host query reset entry point didn't use the availability offset for performance queries. To fix this, reorder the availability of performance queries to match other queries. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `2b5f30b1d9` ("anv: implement VK_INTEL_performance_query") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-04 19:04:38 +00:00
Paul Gofman	ecc31d032e	state_tracker: Handle texture view min level in st_generate_mipmap() Signed-off-by: Paul Gofman <gofmanp@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-11-04 13:24:31 -05:00
James Xiong	b6d45e7f74	iris: try to set the specified tiling when importing a dmabuf When importing a dmabuf with a specified tiling, the dmabuf user should always try to set the tiling mode because: 1) the exporter can set tiling AFTER exporting/importing. 2) a dmabuf could be exported from a kernel driver other than i915, in this case the dmabuf user and exporter need to set tiling separately. This patch fixes a problem when running vkmark under weston with iris on ICL, it crashed to console with the following assert. i965 doesn't have this problem as it always tries to set the specified tiling mode. weston: ../src/gallium/drivers/iris/iris_resource.c:990: iris_resource_from_handle: Assertion `res->bo->tiling_mode == isl_tiling_to_i915_tiling(res->surf.tiling)' failed. Signed-off-by: James Xiong <james.xiong@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-11-04 17:59:52 +00:00
Kenneth Graunke	fc7b748086	iris: Fix "Force Zero RTA Index Enable" setting again In `2ca0d913ea`, we began updating cso_fb->layers to the actual layer count, rather than 0. This fixed cases where we were setting "Force Zero RTA Index Enable" even when doing layered rendering. Sadly, it also broke the check entirely: cso_fb->layers is now 1 for non-layered cases, but the Force Zero RTA Index check was still comparing for 0. Fixes: `2ca0d913ea` ("iris: Fix framebuffer layer count")	2019-11-04 08:57:37 -08:00
Dylan Baker	717606f9f3	nir: correct use of identity check in python Python has the identity operator `is`, and the equality operator `==`. Using `is` with strings sometimes works in CPython due to optimizations (they have some kind of cache), but it may not always work. Fixes: `96c4b135e3` ("nir/algebraic: Don't put quotes around floating point literals") Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-11-04 16:06:39 +00:00
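The distinction described in the `717606f9f3` message can be sketched in a few lines of Python. This is only an illustration and not the code touched by that commit; the helper name `same_text` is hypothetical.

```python
def same_text(a: str, b: str) -> bool:
    """Compare two strings by value, which is what the fix calls for."""
    return a == b


literal = "1.0"
computed = str(1.0)  # built at runtime, so it may be a distinct object

# Equality compares contents and is reliable regardless of caching.
assert same_text(literal, computed)

# Identity compares object identity. Whether this is True depends on
# interpreter-level caching, so code should not rely on it for strings.
identity_result = literal is computed
```

The value of `identity_result` is deliberately left unstated, since it can differ between interpreters and versions.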
Boris Brezillon	28440820ef	panfrost: MALI_DEPTH_TEST is actually MALI_DEPTH_WRITEMASK MALI_DEPTH_TEST should only be set when depth->writemask is true, not when the depth test is enabled. Let's rename the flag and patch panfrost_bind_depth_stencil_state() to do the right thing. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-04 16:14:09 +01:00
Lionel Landwerlin	71634b1003	vulkan: bump headers/registry to 1.1.127 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-04 17:07:11 +02:00
Samuel Pitoiset	9ab27647ff	radv: fix compute pipeline keys when optimizations are disabled If an app first creates a compute pipeline with VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT set, then re-compiles it without that flag, the driver should re-compile the compute shader. Otherwise, it will return the unoptimized one. Fixes: `ce188813bf` ("radv: add initial support for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-11-04 08:50:00 +01:00
Karol Herbst	538d2c33b8	nv50/ir: fix crash in isUniform for undefined values Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-11-03 01:02:52 +01:00
Lionel Landwerlin	88d665830f	mesa: check draw buffer completeness on glClearBufferfi/glClearBufferiv Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-02 09:14:26 +00:00
Vasily Khoruzhick	103378f332	lima: set dithering flag when necessary Bit 13 in aux1 enables dithering Reviewed-by: Qiang.Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-11-01 21:44:31 -07:00
Marek Olšák	c236e6c1e3	glsl: encode struct/interface types better Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-01 19:19:03 -04:00
Marek Olšák	5dde2aa8d9	glsl: encode array types better Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-01 19:19:03 -04:00
Marek Olšák	c141366560	glsl: encode explicit_stride for basic types better Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-01 19:19:03 -04:00
Marek Olšák	86adce4fef	glsl: encode vector_elements and matrix_columns better Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-01 19:19:03 -04:00
Marek Olšák	21d2fbb8c3	glsl: encode/decode types using a union with bitfields for readability Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-01 19:19:03 -04:00
Vasily Khoruzhick	dd52744201	lima: ignore flags while looking for BO in cache Any BO would work, we don't have any BO types yet anyway. Moreover lima_submit_add_bo() changes BO flags so they won't match allocation flags. Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-11-01 13:12:07 -07:00
Vasily Khoruzhick	ae0b05d8db	lima: align size before trying to fetch BO from cache Otherwise we may be looking in wrong bucket Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-11-01 13:12:03 -07:00
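Why the order of alignment and lookup matters in `ae0b05d8db` can be shown with a small Python sketch. The page size and the power-of-two bucket scheme below are assumptions made for illustration; the actual lima bucket layout is not given in the log.

```python
PAGE_SIZE = 4096  # assumed allocation granularity


def align(size: int, alignment: int) -> int:
    """Round size up to the next multiple of a power-of-two alignment."""
    return (size + alignment - 1) & ~(alignment - 1)


def bucket_index(size: int) -> int:
    """Hypothetical bucket scheme: one bucket per power of two."""
    return size.bit_length() - 1


requested = 4000
# A BO allocated for this request has the aligned size, so it is
# stored in the bucket that matches the aligned size.
stored_bucket = bucket_index(align(requested, PAGE_SIZE))
# Looking up with the raw size would probe a different bucket.
naive_bucket = bucket_index(requested)
```

Under these assumptions the raw size lands one bucket below the aligned size, so a cached BO of the right size would be missed unless the size is aligned first.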
Vasily Khoruzhick	08d6416a1d	lima: add debug prints for BO cache LIMA_DEBUG=bocache now activates debug prints for BO allocation, destruction and BO cache. Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-11-01 13:11:47 -07:00
Alyssa Rosenzweig	b32caa6f1f	pan/midgard: Use fp32 blend shaders Clearly we do want to have fp16 at some point ... but I kind of give up debugging and it turns out the issues with fp16 support in 'frost are so deeply rooted that I might as well disable this non-opt and land LCRA now. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-01 13:47:52 -04:00
Bas Nieuwenhuizen	8efb8f55a6	radv: Close all unnecessary fds in secure compile. The seccomp filter allows read/write, let us make sure nobody can do anything with this. Fixes: `cff53da374` "radv: enable secure compile support" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-11-01 17:15:34 +01:00
Erik Faye-Lund	dd77bdb34b	anv: remove incorrect polygonMode=point early-out This is incorrect, because polygonMode only applies if the final primitive type is a polygon; polygonMode doesn't apply to line-primitives as the comment suggests. The Vulkan 1.1 spec, section 26.11, "Polygons" defines that polygons are separate from points and line segments: " A polygon results from the decomposition of a triangle strip, triangle fan or a series of independent triangles. Like points and line segments, polygon rasterization is controlled by several variables in the VkPipelineRasterizationStateCreateInfo structure. " Further, section 26.11.2, "Polygon Mode", only defines polygonMode to apply to polygons: " Possible values of the VkPipelineRasterizationStateCreateInfo::polygonMode property of the currently active pipeline, specifying the method of rasterization for polygons, are: " This seems to clearly define that polygonMode doesn't apply to points and lines, so let's make sure that we don't early out with the wrong value. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-11-01 07:26:03 +00:00
Alyssa Rosenzweig	c3a46e7644	pan/midgard: Eliminate blank_alu_src We don't need it in practice, so this is some more cleanup. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-01 01:01:47 +00:00
Alyssa Rosenzweig	70072a20e0	pan/midgard: Refactor swizzles Rather than having hw-specific swizzles encoded directly in the instructions, have a unified swizzle array so we can manipulate swizzles generically. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-01 01:01:47 +00:00
Alyssa Rosenzweig	e7fd14ca8a	pan/midgard: Add a dummy source for loads We want symmetry between loads and stores, so we add a dummy source. So we get, e.g. st_int4 _, val, arg_1, arg_2 ld_int4 dest, _, arg_1, arg_2 Semantically, this dummy source represents the data itself, as if the load is simply a move. That means it has a swizzle that acts as a source. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-01 01:01:47 +00:00
Alyssa Rosenzweig	b5938be51d	pan/midgard: Remove OP_IS_STORE_VARY Unused. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-11-01 01:01:46 +00:00
Timothy Arceri	1c2bf82d24	glsl: disable lower_fragdata_array() for NIR drivers This function was added in `7e414b5864` to work around a defect in lower_output_reads(). As of the previous commit no NIR driver calls lower_output_reads(). This change means we don't need the special GLSL IR style gl_FragData handling for building the resource list in a NIR based linker. No shader-db change on SKL i965. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-01 11:33:54 +11:00
Timothy Arceri	0e186c18ba	glsl: just use NIR to lower outputs when driver can't read outputs This will allow us to stop lowering gl_FragData in GLSL IR for NIR drivers which means we won't need the special GLSL IR type handling for building the resource list in a NIR based linker. i965 has been doing this since `b828f7a27b`. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-11-01 11:33:33 +11:00
Icenowy Zheng	8fa13db251	lima: support indexed draw with bias When doing an indexed draw with index_bias set to a non-zero value (e.g. by glDrawElementsBaseVertex), the vertex buffer should be offset by index_bias vertices. Add this offset when setting the vertex buffer address. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-10-31 21:56:45 +00:00
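The address adjustment described in `8fa13db251` amounts to simple arithmetic, sketched below in Python. The function and parameter names are hypothetical and do not come from the lima sources.

```python
def vertex_buffer_address(base: int, buffer_offset: int,
                          stride: int, index_bias: int) -> int:
    """Start address of a vertex buffer, shifted by index_bias vertices.

    Each vertex occupies `stride` bytes, so skipping `index_bias`
    vertices moves the start address by index_bias * stride bytes.
    """
    return base + buffer_offset + index_bias * stride


# With a bias of 3 and a 16-byte stride, the start moves by 48 bytes.
unbiased = vertex_buffer_address(0x1000, 16, 16, 0)
biased = vertex_buffer_address(0x1000, 16, 16, 3)
```

With the start address shifted this way, index 0 in the index buffer fetches vertex `index_bias` of the original buffer, which is the behaviour glDrawElementsBaseVertex asks for.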
Jason Ekstrand	f60ef0fff4	anv: Move the RT BTI flush workaround to begin_subpass Now that we're no longer compacting binding table entries, the only time they can possibly change is when we actually switch subpasses. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-31 21:07:15 +00:00
Jason Ekstrand	6a8f43030c	anv: Stop compacting render targets in the binding table Instead, always emit one entry for every color attachment in the subpass or one NULL if there are no color attachments. This will let us adjust an Ice Lake workaround so we don't get a stall on every draw call. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-31 21:07:15 +00:00
Jason Ekstrand	c765e2156a	anv: Don't claim the null RT as a valid color target If it's NULL, we can let the compiler go ahead and delete it or flag it as NULL. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-31 21:07:15 +00:00
Jason Ekstrand	df7a730b4f	anv: Don't delete fragment shaders that write sample mask Also, use color_outputs_valid rather than nr_color_outputs since it should be a bit more accurate. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-31 21:07:15 +00:00
Yevhenii Kolesnikov	265e4d9432	glsl: Enable textureSize for samplerExternalOES From OES_EGL_image_external_essl3 Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1901 Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-31 20:23:56 +00:00
Eric Anholt	c9df92bf79	ci: Switch over to an autoscaling GKE cluster for builds. The GKE pool we're using is 1-3 32-core VMs, preemptible (to keep costs down), with 8 jobs concurrent per system. We have plenty of memory (4G/core), so we run make -j8 to try to keep the cores busy even when one job is in a single-threaded step (docker image download, git clone, artifacts processing, etc.) When all jobs are generating work for all the cores, they'll be scheduled fairly. The nodes in the pool have 300GB boot disks (over-provisioned in space to provide enough iops and throughput) mounted to /ccache, and CACHE_DIR set pointing to them. This means that once a new autoscaled-up node has run some jobs, it should have a hot ccache from then on (instead of having to rely on the docker container cache having our ccache laying around and not getting wiped out by some other fd.o job). Local SSDs would provide higher performance, but unfortunately are not supported with the cluster autoscaler. For now, the softpipe/llvmpipe test runs are still on the shared runners, until I can get them ported onto Bas's runner so they can be parallelized in a single job. Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-10-31 11:19:43 -07:00
Eric Anholt	da6cc72237	ci: Make lava inherit the ccache setup of the .build script. It was just duplicating the code. Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-10-31 11:19:43 -07:00
Eric Engestrom	6e21dcc5a3	meson: revert glvnd workaround This effectively reverts MR !2112. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 17:09:59 +00:00
Eric Engestrom	0f201e9dbc	meson: require glvnd 1.2.0 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 17:09:59 +00:00
Eric Engestrom	9b58ab803d	gitlab-ci: build a recent enough version of GLVND (ie. 1.2.0) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 17:09:59 +00:00
Eric Engestrom	c32236811d	meson: move idep_xmlconfig_headers to xmlpool/ That's where `xmlpool_options_h` is defined, and this way we can make sure nobody starts making use of it in the future :) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 16:03:57 +00:00
Jason Ekstrand	02d9403067	anv: Use the new BO alloc API for Android Fixes: `a44f5ee0d8` "anv: Rework the internal BO allocation API" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 15:46:39 +00:00
Erik Faye-Lund	b7674829a1	zink: emit line-width when using polygon line-mode When switching this to dynamic state, I forgot that this also needs to be emitted when we use a polygon-mode set to lines. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `6d30abb4f1` ("zink: use dynamic state for line-width")	2019-10-31 15:38:21 +00:00
Eric Engestrom	fbb98ae0ed	radeon: replace xmlpool_options_h with idep_xmlconfig_headers Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 15:29:06 +00:00
Eric Engestrom	2c9898a329	r200: replace xmlpool_options_h with idep_xmlconfig_headers Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 15:29:06 +00:00
Eric Engestrom	ea36ddae1e	nouveau: replace xmlpool_options_h with idep_xmlconfig_headers Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 15:29:06 +00:00
Eric Engestrom	4c5c31a651	i915: replace xmlpool_options_h with idep_xmlconfig_headers Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 15:29:06 +00:00
Eric Engestrom	039797bef9	dri: replace xmlpool_options_h with idep_xmlconfig_headers Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 15:29:06 +00:00
Eric Engestrom	5774abe725	targets/xvmc: replace xmlpool_options_h with idep_xmlconfig_headers Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 15:29:06 +00:00
Eric Engestrom	ad8cd21def	targets/xa: replace xmlpool_options_h with idep_xmlconfig_headers Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 15:29:06 +00:00
Eric Engestrom	ec2555a3d6	targets/vdpau: replace xmlpool_options_h with idep_xmlconfig_headers Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 15:29:06 +00:00
Eric Engestrom	8be89b4319	targets/va: replace xmlpool_options_h with idep_xmlconfig_headers Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 15:29:06 +00:00
Eric Engestrom	375094c70b	targets/omx: replace xmlpool_options_h with idep_xmlconfig_headers Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 15:29:06 +00:00
Eric Engestrom	71ca5fb68a	loader: replace xmlpool_options_h with idep_xmlconfig_headers Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 15:29:06 +00:00
Eric Engestrom	0bd6fc0a84	pipe-loader: drop unnecessary xmlpool_options_h idep_xmlconfig already covers that Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 15:29:06 +00:00
Eric Engestrom	a2eba4b17d	radv: drop unnecessary xmlpool_options_h idep_xmlconfig already covers that Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 15:29:06 +00:00
Eric Engestrom	791ece114e	anv: add missing xmlconfig headers dependency Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 15:29:06 +00:00
Eric Engestrom	4072b3360b	meson: split out idep_xmlconfig_headers from idep_xmlconfig A bunch of components need the former but not the latter. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-31 15:29:06 +00:00
Alyssa Rosenzweig	bf15318991	pipe-loader: Build kmsro loader with all kmsro targets Build failure reported by i965 CI, triggered by building dynamic pipeloaders with kmsro drivers (besides 'frost). At this point, there's no reason to actually do that -- mesa CI didn't mind -- but let's not break the build. v2: Simplify script. Add extra dependencies for v3d. Fixes: `afb0d08cb0` ("pipe-loader: Default to kmsro if probe fails") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reported-by: Clayton Craft <clayton.a.craft@intel.com> Tested-by: Clayton Craft <clayton.a.craft@intel.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-10-31 15:26:10 +00:00
Erik Faye-Lund	5ea787950f	zink: heap-allocate sampler objects VkSampler is 64-bit even on 32-bit systems, so casting it to a pointer is a bad idea there. So let's heap-allocate the sampler-object instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2017 Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com> Tested-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-10-31 13:57:43 +00:00
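The hazard named in `5ea787950f` can be modelled in Python by masking a 64-bit handle to the pointer width. This only simulates the effect of the cast; the handle value and the helper name `cast_to_pointer` are made up for illustration.

```python
def cast_to_pointer(handle: int, pointer_bits: int) -> int:
    """Model storing a 64-bit handle in a pointer of the given width."""
    return handle & ((1 << pointer_bits) - 1)


handle = 0x0000_0001_2345_6789  # hypothetical 64-bit VkSampler value

on_64bit = cast_to_pointer(handle, 64)  # round-trips unchanged
on_32bit = cast_to_pointer(handle, 32)  # upper 32 bits are lost
```

Heap-allocating an object that holds the full 64-bit handle, and passing a pointer to that object instead, avoids the truncation on 32-bit systems.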
Jason Ekstrand	0ca0ad1252	anv: Zero released anv_bo structs Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	b3c0b1b218	anv: Use a bitset for tracking residency Now that we can conveniently map between GEM handles and struct anv_bo pointers, we can use a simple bitset for residency tracking instead of the complex hash set. This shaves about 3% off of a CPU-limited example running with the Dawn WebGPU implementation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
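The idea behind `b3c0b1b218`, using a small integer handle as a direct index into a bitset rather than hashing it, can be sketched as follows. This is a minimal Python model under the assumption that GEM handles are small non-negative integers; the class name `ResidencySet` is hypothetical and the real driver code is C.

```python
class ResidencySet:
    """Tracks which GEM handles are resident, one bit per handle."""

    def __init__(self) -> None:
        self.bits = 0  # Python ints are arbitrary precision bitsets

    def add(self, gem_handle: int) -> bool:
        """Mark a handle resident; return True if it was newly added."""
        mask = 1 << gem_handle
        newly_added = not (self.bits & mask)
        self.bits |= mask
        return newly_added

    def __contains__(self, gem_handle: int) -> bool:
        return bool((self.bits >> gem_handle) & 1)


resident = ResidencySet()
first = resident.add(5)   # handle 5 was not tracked before
second = resident.add(5)  # adding it again is a no-op
```

Membership tests and insertions are single bit operations here, with no hashing or probing, which is consistent with the CPU saving the commit message reports.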
Jason Ekstrand	9ef198c59a	anv: Set the batch allocator for compute pipelines Otherwise relocations just up and crash. Fixes: `a3153162a9` "anv: Delay allocation of relocation lists" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	9f665d9c1c	anv: Add a device parameter to anv_execbuf_add_bo We're about to start needing to lookup BO pointers by GEM handle so we need access to the device. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	63d7a38630	anv: Drop anv_bo_init and anv_bo_init_new BOs are now only ever allocated through the BO cache so there's no need to have these exposed. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	853d3b59fd	anv: Allocate misc BOs from the cache Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	d0ec55d5a3	anv: Allocate scratch BOs from the cache While we're here, we get rid of the locking and use a lock-free algorithm. The chances of spilling contention are low and this is actually a bit simpler in some ways. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	ee77938733	anv: Allocate batch and fence buffers from the cache Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	e4f01eca3b	util: Add a free list structure for use with util_sparse_array Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	0a6d2593b8	anv: Allocate descriptor buffers from the BO cache Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	e0ee23660f	anv: Set more flags on descriptor pool buffers the ASYNC flag, in particular, has the potential to help performance because it means less sync tracking in the kernel. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	c3eb4b3ba5	anv: Allocate query pool BOs from the cache Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	0d2787f7c9	anv: Use the query_slot helper in vkResetQueryPoolEXT Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	3119b96bdf	anv: Allocate block pool BOs from the cache This commit switches block pools over to being allocated from the BO cache rather than being allocated manually by the block pool. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	cc972d72c7	anv/tests: Initialize the BO cache and device mutex We're about to start depending on the BO cache in the state and block pools so we need them properly initialized for the tests to work. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	9076e9f375	anv/tests: Zero-initialize instances Some of the tests were actually relying on some of those uninitialized bits to be non-zero. In particular, a couple want use_softpin = true. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	5c664dff75	anv: Choose BO flags internally in anv_block_pool All block pools are allocated with the same flags. There's no good reason why it needs to be configurable. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	a44f5ee0d8	anv: Rework the internal BO allocation API This makes a number of changes to the current API: 1. Everything is renamed to anv_device_* instead of anv_bo_cache_* because the BO cache is soon going to be the sole BO allocation path and not some special case to make import/export work. 2. Drop the cache parameter. It's totally redundant with the device and just annoying to keep typing. 3. Rework flags so that they go the convenient direction for usage in ANV rather than whichever awkward way the i915 specified it to maintain backwards compatibility. This also gives us the opportunity to set some defaults. 4. Add flags for mapping and coherency. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:09 +00:00
Jason Ekstrand	1be2e4c0ef	anv: Use anv_block_pool_foreach_bo in get_bo_from_pool While we're at it, use gen_48b_address(). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Jason Ekstrand	3178e583c8	anv: Rework anv_block_pool_expand_range The growing algorithms for the softpin case and the userptr version are almost entirely different. Having this weird join doesn't make the code more comprehensible. This rework does a few things: 1. Move the comment about 48-bit addresses to anv_device_init where we actually unset the EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag. 2. Separate the paths in anv_block_pool_expand_range so it's easier to see what happens in the two different cases. 3. Use the anv_block_pool::bos array for storing all allocated BOs in both paths rather than using the cleanup list in both paths. This lets us make the cleanups array only used for mmaps of the memfd for the userptr case. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Jason Ekstrand	bb257e1852	anv: Fix a potential BO handle leak Fixes: `731c4adcf9` "anv/allocator: Add support for non-userptr" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Jason Ekstrand	6f4fa81769	anv: Handle state pool relocations using "wrapper" BOs Instead of depending on a mutable BO in the state pool for handling growing state pools, add a concept of "wrapper" BOs which just wrap an actual BO. This way, the wrapper can exist once for all of time and we can put it in relocation lists even if the actual BO it references gets swapped out. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Jason Ekstrand	b781c85c79	anv: Replace ANV_BO_EXTERNAL with anv_bo::is_external We're not THAT strapped for space that we can't burn one extra bit for a boolean. If we're really worried about it, we can always shrink the flags field to 16 bits because the kernel only uses 7 currently. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Jason Ekstrand	5534358ef6	anv: Inline anv_block_pool_get_bo It has exactly one caller and we're about to change some of the dynamics which would make this confusing as a separate function. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Jason Ekstrand	c0a4722f29	anv: Declare the bo in the anv_block_pool_foreach_bo loop Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Jason Ekstrand	325345b2bd	anv: Stop storing the GEM handle in anv_reloc_list_add We have to go through and rewrite them all anyway so it doesn't do us any good to put them in the list in anv_reloc_list_add. Also, for state pools the handles are likely wrong by the time vkQueueSubmit is called. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Jason Ekstrand	c4be72934e	anv: Fix a relocation race condition Previously, we would read the offset from the BO in anv_reloc_list_add to generate the presumed offset and then again in the caller to compute the 64-bit address to write into the buffer. However, if the offset somehow changed between these two points, the presumed offset would no longer match the written offset. This is unlikely to actually ever be a problem in practice because the presumed offset gets recorded first and so if the written address is wrong then the presumed offset is almost certainly wrong and the relocation will trigger. However, it's much safer to simply have anv_reloc_list_add return the 64-bit address. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Jason Ekstrand	bbf389013f	anv: Use a util_sparse_array for the GEM handle -> BO map This lets us do less allocation because the anv_bo's are now embedded in the sparse array and it also allows lock-free translation from GEM handle to BO which will be useful in future commits. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Jason Ekstrand	821ce7be36	anv: Move refcount to anv_bo Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Jason Ekstrand	09ec6917c1	util: Add a util_sparse_array data structure Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 13:46:08 +00:00
Pierre-Eric Pelloux-Prayer	8a723282e3	mesa: enable msaa in clear_with_quad if needed If the DrawBuffer sample count is > 1 and msaa is enabled we must also enable msaa when clearing it. Fixes: `ea5b7de138` ("radeonsi: make gl_SampleMaskIn = 0x1 when MSAA is disabled") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1991 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Witold Baryluk <witold.baryluk@gmail.com>	2019-10-31 12:30:53 +01:00
Lionel Landwerlin	b087b7bd90	intel/perf: fix Android build Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `15b7b56eb2` ("intel/perf: add TGL support") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-By: Tapani Pälli <tapani.palli@intel.com>	2019-10-31 11:20:30 +00:00
Tomeu Vizoso	01af59b2d9	gitlab-ci: Disable lima jobs The runner that submits jobs there is down and will take some time to get fixed. Disable them for now to keep the CI green. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-10-31 11:08:11 +00:00
Bas Nieuwenhuizen	6ced684e27	radv: Fix disk_cache_get size argument. Got some int->pointer warnings and 20 is not a valid pointer .... Fixes: `2e3a635ee6` "radv: Add an early exit in the secure compile if we already have the cache entries." Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-31 11:40:43 +01:00
Andrii Simiklit	e06fcbe2c8	main: fix several 'may be used uninitialized' warnings This patch fixes approximately 39 warnings in 'texcompress_etc.c' for the release configuration v2: Fixed by adding the unreachable case to the etc2_rgb8_fetch_texel ( Eric Engestrom <eric.engestrom@intel.com> ) Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2019-10-31 10:14:09 +00:00
Bas Nieuwenhuizen	3e86d553a4	anv: Remove _mesa_locale_init/fini calls. The resulting locale is not used for Vulkan, and it is not reference counted, giving issues when multiple instances are created. CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-31 09:47:56 +00:00
Bas Nieuwenhuizen	72f858fc07	turnip: Remove _mesa_locale_init/fini calls. The resulting locale is not used for Vulkan, and it is not reference counted, giving issues when multiple instances are created. CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-31 09:47:56 +00:00
Bas Nieuwenhuizen	344ba56b0f	radv: Remove _mesa_locale_init/fini calls. The resulting locale is not used for Vulkan, and it is not reference counted, giving issues when multiple instances are created. CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-31 09:47:56 +00:00
Pierre-Eric Pelloux-Prayer	2afeed3010	radeonsi: tell the shader disk cache what IR is used Until `8bef4df196` the IR (TGSI or NIR) was used in disk_cache driver_flags. This commit restores this feature to avoid crashing when switching from one IR to the other. As radeonsi's default is TGSI, I used "driver_flags & 0x8000000 = 0" for TGSI to keep the same driver_flags. Fixes: `8bef4df196` ("radeonsi: add si_debug_options for convenient adding/removing of options") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-31 10:43:40 +01:00
Lionel Landwerlin	15b7b56eb2	intel/perf: add TGL support Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-31 09:13:20 +00:00
Robert Foss	f140467b5b	android: Add panfrost support to build scripts Currently the Android build system doesn't expose the panfrost driver. This patch enables the panfrost driver to be built for the Android platform. Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-By: Rohan Garg <rohan.garg@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-31 10:03:54 +01:00
Robert Foss	6f3f855320	nir: Build nir_lower_point_size.c in libmesa_nir nir_lower_point_size.c was not built into the libmesa_nir library for non-meson builds. However it was included in the meson build. This patch fixes that. Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-31 10:03:54 +01:00
Iago Toral Quiroga	e7e501efce	v3d: rename vertex shader key (num)_fs_inputs fields Until now this made sense because we always paired vertex shaders with fragment shaders, but as soon as we implement geometry and tessellation shaders that will no longer be the case, so rename this to (num_)used_outputs. v2: Use 'used_outputs' instead of ns_outputs, which is more explicit (Eric). Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-31 08:46:35 +00:00
Mauro Rossi	d688e4166c	android: aco: fix Lower to CSSA Fixes the following building error: external/mesa/src/amd/compiler/aco_spill.cpp:1768: error: undefined reference to 'aco::lower_to_cssa(aco::Program, aco::live&, radv_nir_compiler_options const)' Fixes: `0b8216b` ("aco: Lower to CSSA") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2019-10-31 07:38:46 +00:00
Jan Zielinski	7baedc9162	gallium/swr: Fix depth values for blit scenario Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-10-31 07:25:54 +00:00
Jordan Justen	bb0c5c487e	iris/gen11+: Move flush for render target change When starting a BLORP operation, we do the BTI-change flush. However, when ending it and transitioning back to regular drawing, we change the render target again - without a set_framebuffer_state() call. We need to do the BTI flush there too. BLORP flags IRIS_DIRTY_RENDER_BUFFER now, which will cause the next draw to get the BTI flush again. (explanation of fix by Ken) Fixes: `2b956a093a` ("iris: totally untested icelake support") Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-31 00:24:25 -07:00
Jordan Justen	a2c3c65a31	iris: Add IRIS_DIRTY_RENDER_BUFFER state flag Fixes: `2b956a093a` ("iris: totally untested icelake support") Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-31 00:24:25 -07:00
Samuel Pitoiset	1e36a8f41d	radv: declare NGG scratch for VS or TES and only on GFX10 Do not need to declare it for other stages because this is for streamout. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-31 06:51:01 +00:00
Arno Messiaen	a9391a1a01	lima: add cubemap support Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com>	2019-10-31 06:29:31 +00:00
Arno Messiaen	9890590fba	lima: introduce ppir_op_load_coords_reg to differentiate between loading texture coordinates straight from a varying vs loading them from a register Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com>	2019-10-31 06:29:31 +00:00
Arno Messiaen	28e1d55d6e	lima: add layer_stride field to lima_resource struct Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com>	2019-10-31 06:29:31 +00:00
Arno Messiaen	f3686083a4	lima: fix stride in texture descriptor Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com>	2019-10-31 06:29:31 +00:00
Ian Romanick	7b3f38ef69	intel/compiler: Report the number of non-spill/fill SEND messages on vec4 too This makes shader-db's report.py work on Haswell and earlier platforms. The problem is that the script would detect the "sends" output for scalar shaders and expect it in vec4 shaders too. When it didn't find it, the script would fail with: Traceback (most recent call last): File "./report.py", line 351, in <module> main() File "./report.py", line 182, in main before_count = before[p][m] KeyError: 'sends' Fixes: `f192741ddd` ("intel/compiler: Report the number of non-spill/fill SEND messages") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 21:27:03 -07:00
Tapani Pälli	b380d47998	nir: fix couple of compile warnings Fixes "warning: braces around scalar initializer" warnings. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-31 00:21:44 +00:00
Bas Nieuwenhuizen	ec770085c2	radv: Fix timeout handling in syncobj wait. libdrm returns -errno instead of directly the ioctl ret of -1. Fixes: `1c3cda7d27` "radv: Add syncobj signal/reset/wait to winsys." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-31 00:48:17 +01:00
Ilia Mirkin	1b9d1e13d8	nv50/ir: mark STORE destination inputs as used Observed an issue when looking at the code generated by the image-vertex-attrib-input-output piglit test. Even though the test itself worked fine (due to TIC 0 being used for the image), this needs to be fixed. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2019-10-30 19:13:18 -04:00
Ilia Mirkin	869e32593a	gm107/ir: fix loading z offset for layered 3d image bindings Unfortunately we don't know if a particular load is a real 2d image (as would be a cube face or 2d array element), or a layer of a 3d image. Since we pass in the TIC reference, the instruction's type has to match what's in the TIC (experimentally). In order to properly support bindless images, this also can't be done by looking at the current bindings and generating appropriate code. As a result all plain 2d loads are converted into a pair of 2d/3d loads, with appropriate predicates to ensure only one of those actually executes, and the values are all merged in. This goes somewhat against the current flow, so for GM107 we do the OOB handling directly in the surface processing logic. Perhaps the other gens should do something similar, but that is left to another change. This fixes dEQP tests like image_load_store.3d.*_single_layer and GL-CTS tests like shader_image_load_store.non-layered_binding without breaking anything else. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "20.0" <mesa-stable@lists.freedesktop.org>	2019-10-30 19:12:36 -04:00
Lionel Landwerlin	e02c181bfd	intel/dev: set default num_eu_per_subslice on gen12 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `8125d7960b` ("intel/dev: Add preliminary device info for Tigerlake") Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-30 22:30:09 +00:00
Dylan Baker	4226952199	docs/new_features: Empty the feature list for the 20.0 cycle	2019-10-30 15:18:27 -07:00
Dylan Baker	1fdcc2494e	Bump VERSION to 20.0.0-devel	2019-10-30 14:56:02 -07:00
Jordan Justen	98da208660	docs/relnotes/new_features.txt: Add note about gen12 support Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-30 14:08:51 -07:00
Jordan Justen	2b186264cc	intel/eu/validate/gen12: Add TGL to eu_validate tests. These reworks were combined into this patch: * Matt Turner: i965: Disable NoDDChk/NoDDClr test on Gen12+ * Francisco Jerez: intel/eu/validate/gen12: Disable qword_low_power_no_depctrl eu_validate test. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 14:08:51 -07:00
Jordan Justen	8125d7960b	intel/dev: Add preliminary device info for Tigerlake Reworks: * adjust 64-bit support, hiz (Jason Ekstrand) * sim-id (Lionel Landwerlin) * adjust threads, urb size (Rafael Antognolli) * adjust urb size (Kenneth Graunke) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 14:08:48 -07:00
Lionel Landwerlin	632995227c	intel/dump_gpu: handle context create extended ioctl Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-30 21:58:31 +02:00
Bas Nieuwenhuizen	ae454a03b7	radv: Allocate space for temp. semaphore parts. Calculated the number for allocation and did not reserve space .... Fixes: `2117c53b72` "radv: Add temporary datastructure for submissions." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 20:51:39 +01:00
Rafael Antognolli	3c317e8187	anv: Add Tile Cache Flush for Unified Cache.	2019-10-30 19:51:03 +00:00
Rafael Antognolli	a99c67b690	blorp: Add Tile Cache Flush for Unified Cache.	2019-10-30 19:51:03 +00:00
Rafael Antognolli	d3995c19eb	iris: Add Tile Cache Flush for Unified Cache.	2019-10-30 19:51:03 +00:00
Jordan Justen	f573cd4757	intel/genxml: Add gen12 tile cache flush bit Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-30 19:51:03 +00:00
Daniel Schürmann	8678699918	aco: implement VGPR spilling VGPR spilling is implemented via MUBUF instructions and scratch memory. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	c79972b604	aco: always set scratch_offset in startpgm This patch also moves private_segment_buffer and scratch_offset to Program to easily access it. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	b0de16b7de	aco: omit linear VGPRs as spill variables Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	aded548e66	aco: ensure that spilled VGPR reloads are done after p_logical_start Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	a7ff1bb5b9	aco: simplify calculation of target register pressure when spilling Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Rhys Perry	e73de4e1d8	aco: fix new_demand calculation for first instructions Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	93b42a1907	aco: don't add interferences between spilled phi operands Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	fdf8ad0256	aco: consider loop_exit blocks like merge blocks, even if they have only one predecessor Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	d48d72e98a	aco: don't insert the exec mask into set of live-out variables when spilling Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	cd20e29de1	aco: fix transitive affinities of spilled variables Variables spilled on both branch legs need to be assigned to the same spilling slot. These affinities can be transitive through multiple merge blocks. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	8023dcd71e	aco: fix live-range splits of phis Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	655a703349	aco: remove potential critical edge on loops. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	78bca0d0ce	aco: improve live variable analysis This patch makes the live variable analysis more precise w.r.t. killed phi operands and the block's register pressure. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:32 +00:00
Daniel Schürmann	0b8216b2cd	aco: Lower to CSSA Converting to 'Conventional SSA Form' ensures correctness w.r.t. spilling of phi nodes. Previously, it was possible that phi operands have intersecting live-ranges, and thus, couldn't get spilled to the same spilling slot. For this reason, ACO tried to avoid to spill phis, even if it was beneficial. This patch implements a conversion pass which is currently only called if spilling is necessary. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:32 +00:00
Jonathan Marek	329d322a16	etnaviv: fix non-pointsprite points on GC7000L Fixes these deqp tests (and more): dEQP-GLES2.functional.draw.draw_arrays.points.single_attribute dEQP-GLES2.functional.draw.draw_arrays.points.multiple_attributes dEQP-GLES2.functional.draw.draw_arrays.points.default_attribute dEQP-GLES2.functional.draw.draw_elements.points.single_attribute dEQP-GLES2.functional.draw.draw_elements.points.multiple_attributes dEQP-GLES2.functional.draw.draw_elements.points.default_attribute Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-10-30 19:44:41 +00:00
Jonathan Marek	ad5cbbd228	etnaviv: stencil fix The final version of previous stencil fix patch ended up breaking one-sided stencil. Fixes remaining failures in these deqp tests (tested on GC3000/GC7000L): dEQP-GLES2.functional.fragment_ops.depth_stencil.* Note: deqp tests require --deqp-gl-config-name=rgba8888d24s8ms0 Fixes: `05da025f` ("etnaviv: fix two-sided stencil") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-10-30 19:44:41 +00:00
Jonathan Marek	7b524e1acb	etnaviv: fix depth bias Fixes remaining failures in these deqp tests (tested on GC3000/GC7000L): dEQP-GLES2.functional.polygon_offset.* Fixes: `6c3c05dc` ("etnaviv: fix polygon offset") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-10-30 19:44:41 +00:00
Jordan Justen	b529db00ee	iris: Set MOCS for external surfaces to uncached Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 12:42:54 -07:00
Rafael Antognolli	ffb46b2bb7	iris: Align fast clear color state buffer to a page. On gen11 and older, compressed images are tiled and aligned to 4K. On gen12 this 4K alignment restriction was removed. However, only aligning the fast clear color buffer to 64B (a cacheline, as it's on the documentation) is causing some bugs where the fast clear color is not converted during the fast clear operation. Aligning things to 4K seems to fix it. v2: Fix typo case in the comment (Nanley) v3: Rebase and fix conflicts. v4: Fix rebase mistake (Nanley). Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-30 19:41:29 +00:00
Rafael Antognolli	e51722a7c7	anv: Align fast clear color state buffer to a page. On gen11 and older, compressed images are tiled and aligned to 4K. On gen12 this 4K alignment restriction was removed. However, only aligning the fast clear color buffer to 64B (a cacheline, as it's on the documentation) is causing some bugs where the fast clear color is not converted during the fast clear operation. Aligning things to 4K seems to fix it. v2: Assert that image->planes[plane].offset is 4K aligned (Nanley) Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-30 19:41:29 +00:00
Erik Faye-Lund	477f019812	zink: only enable KHR_external_memory_fd if supported While we're at it, make sure we error out if it's not supported when required. This brings us a bit closer to being able to test on SwiftShader, which doesn't currently support KHR_external_memory_fd.	2019-10-30 19:40:50 +00:00
Bas Nieuwenhuizen	780c937a5d	radv: Start signalling semaphores in WSI acquire. Winsys semaphores without signal operation get silently ignored. Not so for syncobjs, so actually signal them. Fixes: `84d9551b23` "radv: Always enable syncobj when supported for all fences/semaphores." Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2030 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 19:42:10 +01:00
Rhys Perry	e1bcc7a828	aco: rename README to README.md Closes: #1974 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 18:16:00 +00:00
Rhys Perry	d4684a294b	aco: a couple loop handling fixes for GFX10 hazard pass It was joining from the wrong blocks and block.kind is a bitmask instead of an enum. Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-10-30 18:13:53 +00:00
Matt Turner	12d3b11908	intel/compiler: Add instruction compaction support on Gen12 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-30 11:11:50 -07:00
Matt Turner	c8fbc8823f	intel/compiler: Make separate src0/src1 index tables TGL uses different data (and even a different format!) for each source. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-30 11:11:50 -07:00
Matt Turner	cde73625f8	intel/compiler: Inline get_src_index() TGL will have separate tables for src0 and src1, so the shared function will no longer make sense. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-30 11:11:50 -07:00
Matt Turner	d0eff8a539	intel/compiler: Restructure instruction compaction in preparation for Gen12 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-30 11:11:50 -07:00
Matt Turner	ded9fb2b18	intel/compiler: Remove unreachable() from brw_reg_type.c The EU compaction unit test fuzzes the compaction code by flipping bits. We use a simple skip_bits() function with a list of reserved bits to ignore, but for more complex cases like invalid combinations of register file:type, we need either machinery to check validity or for these functions to simply inform us whether a combination was valid. enum brw_reg_type is a 4-bit field in brw_reg, so rather than expanding it with an "INVALID" value, just return -1 and let the caller check for that. Scott suggested redefining unreachable() within the unit test to longjmp() which would allow driver code like this to still use it and allow the test to handle expected failures like this. If that plan works out, I plan to revert this.	2019-10-30 11:11:50 -07:00
Jonathan Marek	fa3baeab76	freedreno/a2xx: add missing vertex formats (SSCALE/USCALE/FIXED) Mostly for vertex formats, but they are supported as texture formats too (untested however). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-10-30 18:04:17 +00:00
Pierre-Eric Pelloux-Prayer	03a132912f	radeonsi: disable sdma for gfx10 Disable sdma on gfx10 until all timeouts bugs are fixed. See: https://gitlab.freedesktop.org/mesa/mesa/issues/1907 https://bugs.freedesktop.org/show_bug.cgi?id=111481 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-30 18:03:14 +01:00
Pierre-Eric Pelloux-Prayer	2fb4b3c476	radeonsi: sdma misc fixes SDMA IB doesn't need to be padded for SDMA. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-30 18:03:14 +01:00
Pierre-Eric Pelloux-Prayer	21b9a6b590	radeonsi: align sdma byte count to dw If src/dst addresses are dw aligned and size is > 4 then we align byte count to dw as well. PAL implementation works like this. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-30 18:03:14 +01:00
Timur Kristóf	f53811aeac	radv: Enable ACO on Navi. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 16:54:41 +00:00
Leo Liu	a886ae5162	radeonsi: enable 8K video decode support for HEVC and VP9 HW 8K decode support starts at Renoir Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>	2019-10-30 12:43:04 -04:00
Leo Liu	b4c812a269	radeon/vcn: Add VP9 8K decode support Require increase of context buffer size Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>	2019-10-30 12:43:04 -04:00
Rhys Perry	8235bc6411	aco: try to group together VMEM loads of the same resource v2: remove accidental shaderInt16 change v2: simplify can_move_down initialization v2: simplify VMEM_CLAUSE_MAX_GRAB_DIST Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-30 17:23:49 +01:00
Daniel Schürmann	8b5aee78cc	aco: don't schedule instructions through depending VMEM instructions Previously, the scheduler tried to move up instructions from below depending VMEM instructions only to move them down again when scheduling the VMEM instruction. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 16:12:10 +00:00
Daniel Schürmann	636d45e46a	aco: add can_reorder flags to load_ubo and load_constant These got lost due to some refactoring. Due to the way our scheduler works currently, for now we add back the reorder flag for divergent loads only. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 16:12:10 +00:00
Daniel Schürmann	576f92d900	aco: only skip RAR dependencies if the variable is killed somewhere This patch changes VMEM scheduling in a way that they can only be moved upwards by previous VMEM instructions but not downwards. This way, it improves the order of VMEM instructions in relation to their users. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 16:12:10 +00:00
Daniel Schürmann	703ce617ca	aco: restrict scheduling depending on max_waves Previously, we allowed all shaders to reduce the number of max_waves to as low as 5. Restricting this on shaders with low register demand, increases the total number of waves while the VMEM def-use distances hardly change. This patch also changes the max number of move operations per MEM instruction. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 16:12:10 +00:00
Jason Ekstrand	beca63c6c0	anv: Avoid emitting UBO surface states that won't be used This shaves around 4-5% off of a CPU-limited example running with the Dawn WebGPU implementation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 16:05:57 +00:00
Jason Ekstrand	24c0545b2d	intel/vec4: Set brw_stage_prog_data::has_ubo_pull In `0e4a75f917`, Ken added a flag brw_stage_prog_data which indicates whether any UBO pulls ever occur. Unfortunately, he neglected to set the bit in the vec4 back-end. This was fine at the time because the optimization was intended for iris which does not support gen7 and using the vec4 back-end on Gen8+ requires an environment variable. We want to use this in Vulkan which does support Gen7 so we want the information from the vec4 back-end as well as scalar. Fixes: `0e4a75f917` "intel/compiler: Record whether any pull constant..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 16:05:57 +00:00
Samuel Pitoiset	5a9d777f5a	radv: fix perftest options RADV_PERFTEST=outooforder has been removed a while ago. This fixes dumping the options into hang reports. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 14:49:30 +01:00
Samuel Pitoiset	c895e08281	radv: move nomemorycache debug option at the right place Fixes: `6571000071` ("radv: add debug option to turn off in memory cache") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 14:49:28 +01:00
Samuel Pitoiset	d4e0bef1bb	radv: fix dumping SPIR-V into hang reports Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 13:02:08 +00:00
Tapani Pälli	4f8c86e6a5	mesa: enable ARB_gpu_shader_int64 in compat profile Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 14:37:27 +02:00
Tapani Pälli	2d8b8d3bd1	mesa: add [Program]Uniform*64ARB display list support This is required for int64 to be enabled in compat profile. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-30 14:37:27 +02:00
Bas Nieuwenhuizen	396195e8f1	radv: Enable VK_KHR_timeline_semaphore. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	4aa75bb3bd	radv: Add wait-before-submit support for timelines. This is actually a non-threaded implementation. I'd summarize this as event-based submission. When submit happens we walk a tree of submissions that depend on the syncobj signal operations to be submitted and if those submissions have no other dependencies we start to execute them immediately. Or, well I still use a list to avoid issues with long chains and the stacksize when using recursion. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	88d41367b8	radv: Add timelines with a VK_KHR_timeline_semaphore impl. This does not fully do wait-before-submit, to be done in a follow up patch. For kernels without support for timeline syncobjs, this adds an implementation of non-shareable timelines using legacy syncobjs. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	2117c53b72	radv: Add temporary datastructure for submissions. So we can defer them. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	c3eae659e7	radv: Split semaphore into two parts as enum+union. This is in preparation to adding more types. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	84d9551b23	radv: Always enable syncobj when supported for all fences/semaphores. This simplifies code for timeline semaphores by needing to support less configurations. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	45f4a639a8	radv: Improve fence signalling in QueueSubmit. Only signalling it once. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	a9c8424e08	radv: Do sparse binding in queue submission. So we have one place to do queue things if we end up deferring submissions. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	915e9178fa	radv: Split out commandbuffer submission. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	43ba44357c	radv: Clean up unused variable. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	2e3a635ee6	radv: Add an early exit in the secure compile if we already have the cache entries. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-30 11:38:50 +01:00
Bas Nieuwenhuizen	d78809632f	radv: Compute hashes in secure process for secure compilation. To prevent poisoning arbitrary cache entries. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-30 11:37:41 +01:00
Erik Faye-Lund	4c4ac2d4d5	zink: drop nop descriptor-updates If there's nothing to be done, let's actually do nothing. Seems like a good idea. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-10-30 10:29:23 +00:00
Erik Faye-Lund	b222f28357	zink: use bitfield for dirty flagging Bitfields are a bit more idiomatic than explicit flags, and harder to get wrong. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-10-30 10:29:23 +00:00
Erik Faye-Lund	6d30abb4f1	zink: use dynamic state for line-width This will lead to fewer pipelines in the cache, which is assumed to become our most unavoidable performance bottle-neck down the line. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-10-30 10:29:23 +00:00
Duncan Hopkins	d2bb63c8d4	zink: Use optimal layout instead of general. Reduces valid layer warnings. Fixes RADV image noise. Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-10-30 09:09:49 +00:00
Michel Dänzer	aaf1b09270	gitlab-ci: Disable meson-windows job for the time being It needs a CI runner carrying the mesa-windows tag, but there's none available currently.	2019-10-30 09:38:20 +01:00
Timothy Arceri	cf25664686	radv: make use of radv_sc_read() Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 04:49:58 +00:00
Timothy Arceri	28fff3efbc	radv: add radv_sc_read() helper This is a function with timeout support for reading from the pipe between processes used for secure compile. Initially we hardcode the timeout to 5 seconds. We can adjust the timeout limit in future if needed. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 04:49:58 +00:00
Timothy Arceri	23a6827e4d	radv: allow select() calls in secure compile This will be used in the following patch to support timeouts for reading the pipe between processes. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 04:49:58 +00:00
Lepton Wu	1abf05764b	mapi: Improve the x86 tsd stubs performance. This skips touching %ebx most times and it shows that glGetString performance increased from 114M/s to 120M/s on my desktop. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2019-10-29 20:50:05 -07:00
Lepton Wu	41407d5e9f	mapi: Inline call x86_current_tls. This saves one return and a simple benchmark which calls glGetString repeatedly on my desktop shows it improves calls per second from 123M to 141M. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1997 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2019-10-29 17:18:06 -07:00
Lepton Wu	b2b8639d8e	mapi: Clean up entry_patch_public for x86 tls Remove hard coded 16 and use entry_generate_or_patch to patch public stubs. The generated code actually is slightly tighter than before since the "nop" instructions before the final "jmp" get removed. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2019-10-29 17:18:06 -07:00
Lepton Wu	1fb75bee90	mapi: split entry_generate_or_patch for x86 tls The code works exactly the same as before. Just split this function out so we can reuse it. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2019-10-29 17:18:06 -07:00
Jonathan Gray	45206d7673	mapi: Adapted libglvnd x86 tsd changes The x86 assembly language stub in src/mapi/entry_x86_tsd.h does not generate PIC (position-independent code). This causes text relocations which bring troubles on recent versions of FreeBSD, OpenBSD, Android. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108541 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2019-10-29 17:13:14 -07:00
Caio Marcelo de Oliveira Filho	9c3c206e71	spirv: Don't fail if multiple ordering semantics bits are set Vulkan requires that only one bit for the ordering is set, but old versions of GLSLang just set all the bits. This was fixed as part of `c51287d744` but we can still find older versions (or shaders compiled with it) around. So instead of failing, emit a warning and fallback to the effective result of any combination of multiple bits: AcquireRelease. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2018 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-29 14:53:46 -07:00
Sagar Ghuge	f0db4c5204	intel/isl: Allow stencil buffer to support compression on Gen12+ v2: (Nanley Chery) - Fix commit title - Fix comment Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	b22b349443	iris: Resolve stencil resource prior to copy or used by CPU v2: Decide aux usage in get_copy_region_aux_settings (Nanley Chery) v3: Use isl_surf_usage_is_stencil function (Nanley Chery) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	5d331251cf	iris: Prepare resources before stencil blit operation We have to resolve destination surfaces if we are blitting to and from the same surface. v2: Revert unrelated change (Nanley Chery) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	4e0ed40ed7	iris: Prepare depth resource if clear_depth enable Avoid preparing depth resource, if we did fast depth clear before. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	81de49a9f2	iris: Prepare stencil resource before clear depth stencil Let aux surface state tracker track the stencil buffer's aux state while clearing depth stencil buffer. v2: Fix condition check (Nanley Chery) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	b8223991b5	iris: Resolve stencil buffer lossless compression with WM_HZ_OP packet Even though stencil buffer compression looks like regular lossless color compression w/o fast clear support, we have to resolve stencil buffer with WM_HZ_OP packet. v2: Check if resource is stencil with helper function (Nanley Chery) v3: Remove unnecessary included file (Nanley Chery) v4: (Nanley Chery) - Avoid stencil buffer aux state transition by improving condition check Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	87c57b8dae	intel/blorp: Set stencil resolve enable bit When set, the stencil buffer is filled with the true stencil values and we have to disable stencil buffer clear enable bit. v2: 1) Refactor code a little bit (Nanley Chery) 2) Fix assertion (Nanley Chery) v3: 1) Remove unnecessary assignment (Nanley Chery) 2) Fix GEN_GEN check (Nanley Chery) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	c401186762	intel: Track stencil aux usage on Gen12+ Enable stencil compression enable and control surface enable bit if stencil buffer lossless compression is enabled. v2: Remove unnecessary GEN_GEN check (Nanley Chery) v3: (Nanley Chery) - Change commit subject tag from intel/isl to intel - Keep assignment order correct Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	53d472df24	intel/blorp: Add helper function for stencil buffer resolve On Gen12+, Stencil buffer's lossless compression should be resolved with WM_HZ_OP packet. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	ce208be2d8	intel/blorp: Assign correct view while clearing depth stencil We never saw any failures regarding this typo but it's good to assign correct stencil view while constructing blorp_params. Fixes: `0cabf93b80` "intel/blorp: Add an entrypoint for clearing depth and stencil" Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Sagar Ghuge	4287e0a4e4	genxml/gen12: Add Stencil Buffer Resolve Enable bit Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-29 14:46:15 -07:00
Nanley Chery	0a2a9a4a5b	iris: Allocate main and aux surfaces together On Gen12, the CCS buffer address doesn't have to be referenced in state packets. In the case of a stencil buffer with CCS, the kernel won't know the location of the CCS unless an extra call is made to pin its address. To avoid this extra call, make the CCS part of the main surface. v2. Update comment above bo_size. (Jordan) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-29 14:46:15 -07:00
Nanley Chery	ff5bc81b51	iris: Determine aux offsets within configure_aux If a resource has a modifier, the main and aux surfaces will share a BO. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-29 14:46:15 -07:00
Nanley Chery	f0ed86c6c6	iris: Bail resource creation upon aux creation error The functions used during aux buffer configuration and creation only return false for exceptional errors. Don't proceed with surface creation in those cases. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-29 14:46:15 -07:00
Nanley Chery	8b62e3d978	iris: Drop iris_resource::aux::extra_aux::bo The primary and secondary aux buffers are always allocated in the same BO. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-29 14:46:15 -07:00
Duncan Hopkins	bb8e6994cc	zink: pass line width from rast_state to gfx_pipeline_state. Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-10-29 20:38:26 +00:00
Jason Ekstrand	52aa7f3e05	anv: Reduce the minimum number of relocations The original value of 256 was under the assumption that you're a batch buffer which is likely going to have a large number of relocations. However, pipeline objects on Gen7 will have at most 6 relocations (one per shader stage and one for the workaround BO) so this is a lot of per-pipeline wasted space. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-29 20:27:52 +00:00
Jason Ekstrand	a3153162a9	anv: Delay allocation of relocation lists The old relocation list code always allocated 256 relocations and a hash set up-front without knowing whether or not we really need them. In particular, in the softpin case, this is two fairly large allocations that we don't need to be making. Also, for pipeline objects on haswell where we don't have softpin, we don't need relocations unless scratch is used so this is extra data per-pipeline. Instead, we should do it on-demand. This shaves 3.5% off of a cpu-limited example running with the Dawn WebGPU implementation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-29 20:27:52 +00:00
Plamena Manolova	4fe2317601	anv: Implement new way for setting streamout buffers. For gen12 we set the streamout buffers using 4 separate commands instead of 3DSTATE_SO_BUFFER. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-29 19:21:20 +00:00
Plamena Manolova	0f610e17bc	iris: Implement new way for setting streamout buffers. For gen12 we set the streamout buffers using 4 separate commands instead of 3DSTATE_SO_BUFFER. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-29 19:20:25 +00:00
Plamena Manolova	665b81e29a	genxml: Add 3DSTATE_SO_BUFFER_INDEX_* instructions For gen12 we set the streamout buffers using 4 separate commands instead of 3DSTATE_SO_BUFFER. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-29 19:19:58 +00:00
Rob Clark	ff6e148a3d	freedreno/a6xx: add a618 support Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-10-29 09:19:34 -07:00
Rob Clark	afd224fac3	freedreno/a6xx: cleanup magic registers Extract out values for the handful of unknown registers which have different values across different a6xx models, to simplify adding support for new a6xx's. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-10-29 09:19:31 -07:00
Rob Clark	1fdc259bfc	freedreno/a6xx: remove some left over dead code These registers don't exist, just remnants of initial port from a5xx. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-10-29 09:19:27 -07:00
Plamena Manolova	f9ad73cdfd	anv: Set depthBounds to true in anv_GetPhysicalDeviceFeatures. Add depth bounds testing to the list of supported physical device features. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-29 16:05:33 +00:00
Plamena Manolova	e6c8750278	genxml: Change 3DSTATE_DEPTH_BOUNDS bias. The bias for the 3DSTATE_DEPTH_BOUNDS instruction should be 2 not 1. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-29 16:05:33 +00:00
Michel Dänzer	2a38fc1027	gitlab-ci: Only run the pipeline if any files affecting it have changed E.g. documentation-only changes cannot affect the outcome of the pipeline, so don't waste resources on running it. The thing we need to be careful about here is that the container stage jobs must always run if any later stage jobs using the corresponding docker images run. We're currently using the same .ci-run-policy template for all jobs, so this is trivially true. v2: * Add bin/ and common.py (Eric Engestrom) Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> # v1 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-29 15:09:56 +00:00
Krzysztof Raszkowski	163d5fde06	gallium/swr: Enable GL_ARB_gpu_shader5: multiple streams Added support for geometry shader multiple streams (part of GL_ARB_gpu_shader5 extension). Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2019-10-29 14:50:02 +00:00
Alyssa Rosenzweig	44971b84b7	panfrost: Remove unused definitions in mali-job.h Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-29 13:02:53 +00:00
Alyssa Rosenzweig	fa14cdf6e4	panfrost: Cleanup _shader_upper -> shader I don't believe this is actually a tagged pointer; warn if it is. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-29 13:02:53 +00:00
Eric Engestrom	b4f508ab59	meson: define _GNU_SOURCE on FreeBSD _mesa_strtod() needs this to use strtod_l(), which behaves correctly wrt `,` vs `.` decimal separator. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2008 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-29 12:12:58 +00:00
Lionel Landwerlin	1a2246a5e0	intel/perf: update ICL configurations A few equations/programming changes for ICL. v2: Fix a couple of issues in naming and floating/integer operations (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-29 13:00:26 +02:00
Alexandros Frantzis	1257d06ba7	gitlab-ci: Update required libdrm version Commit `9edcce2a32` bumped the required libdrm-amdgpu version to 2.4.100. Update the version we use in our CI scripts to avoid CI build failures. Also bump the debian image name for this change to take effect. Note that amdgpu is only built with the debian-buster image, so only this image requires an update. Fixes: `9edcce2a` ("ac: get tcc_harvested from the kernel") Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-10-29 09:50:09 +00:00
Eric Engestrom	690d359b6f	travis: fix scons build after deprecation warning Fixes: `54053bc8d0` ("scons: Print a deprecation warning about using scons on not windows") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-29 09:25:40 +00:00
Caio Marcelo de Oliveira Filho	e2155158e9	anv: Fix output of INTEL_DEBUG=bat for chained batches The anv_batch_bo contents are linked one to another, and when printing we have to start with the first of those. Since in `u_vector` new elements are added to the head, to get the first element we need the vector's tail. Fixes: `32ffd90002` ("anv: add support for INTEL_DEBUG=bat") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-28 19:34:54 -07:00
Marek Olšák	f9fe86e02a	winsys/amdgpu: use the new GPU reset query	2019-10-28 21:38:01 -04:00
Marek Olšák	9edcce2a32	ac: get tcc_harvested from the kernel Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-28 21:38:01 -04:00
Marek Olšák	4d1e43badb	radeonsi: initialize shader compilers in threads on demand It takes a noticeable amount of time with piglit. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-28 21:36:18 -04:00
Marek Olšák	1380db9fa8	radeonsi: don't print diagnostic LLVM remarks and notes We don't use them. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-28 21:36:18 -04:00
Timur Kristóf	c52ebbcea4	aco: Introduce vgpr_limit to keep track of available VGPRs. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-28 23:52:50 +00:00
Timur Kristóf	d59f702e26	aco: Implement subgroup shuffle in GFX10 wave64 mode. Previously subgroup shuffle was implemented using the bpermute instruction, which only works across half-waves, so by itself it's not suitable for implementing subgroup shuffle when the shader is running in wave64 mode. This commit adds a trick using shared VGPRs that still allows subgroup shuffle to be implemented relatively effectively in this mode. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-28 23:52:50 +00:00
Rhys Perry	c2eebfe3ea	aco: Remove dead code in reduction lowering. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-28 23:52:50 +00:00
Rhys Perry	3865448012	aco: Fix reductions on GFX10. Fixes p_reduce (all cluster sizes), p_inclusive_scan and p_exclusive_scan with all reduction operations. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-28 23:52:50 +00:00
Eric Engestrom	cd04b63c00	loader: default to iris for all future PCI IDs The existing "fallback" code didn't actually do anything, so this removes it, and instead we just always fall back to `iris` for future PCI IDs. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 23:21:39 +00:00
Eric Engestrom	ea8116908c	anv: add a couple printflike() annotations Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-28 23:17:16 +00:00
Erik Faye-Lund	21b7f79a76	st/mesa: lower global vars to local after lowering clip When this code was merged, this wasn't necessary because the state-tracker would do it later anyway. But this recently got changed, without changing the code that depended on this. Arguably, this was a mistake in the lowering pass to begin with. Either way, let's fix it by not assuming that the lowering code gets called later when it's not needed. This fixed user-defined clip-planes in Zink. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `eaffdad108` ("st/mesa: don't lower_global_vars_to_local for VS if there are no dead inputs") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-28 21:17:40 +00:00
Sagar Ghuge	3ac688b0c2	iris: Create resource with aux_usage MCS_CCS Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-28 14:02:02 -07:00
Sagar Ghuge	366fcbf2d8	intel/isl: Support lossless compression with multisamples GEN12 adds the ability to losslessly compress each sample plane in a multisampled buffer that uses MCS compression. v2: Remove unnecessary assertion (Nanley Chery) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-28 14:02:01 -07:00
Sagar Ghuge	758a6a3a00	iris: Get correct resource aux usage for copy Add case for MCS_CCS so that we get the correct aux usage during copy operations. v2: Fix commit subject (Nanley Chery) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-28 14:02:01 -07:00
Sagar Ghuge	e80bca6895	intel/blorp: Use isl_aux_usage_has_mcs instead of comparing Depending on MCS_CCS or MCS we can emit blorp blit shaders. As we support MCS_CCS and MCS, it makes sense to use isl_aux_usage_has_mcs function. v2: Fix commit message (Nanley Chery) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-28 14:02:01 -07:00
Sagar Ghuge	d156632374	iris: Define MCS_CCS state transitions and usages v2: 1) Fix assertion check (Nanley Chery) 2) Correct commit subject (Nanley Chery) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-28 14:02:01 -07:00
Sagar Ghuge	2cd849cf17	iris: Initialize CCS to fast clear while using with MCS v2: Explain Bspec quotes properly (Nanley Chery) v3: 1) Fix comment format (Nanley Chery) 2) Fix typo in comment (Nanley Chery) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-28 14:02:01 -07:00
Sagar Ghuge	2f0fbe06e6	intel/isl: Don't reconfigure aux surfaces for MCS If aux for MCS is already configured, don't configure again. v2: Fix missing period in commit message (Nanley Chery) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-28 14:02:01 -07:00
Erik Faye-Lund	810fc75dab	zink: emulate optional depth-formats The Vulkan spec says that an implementation has to support one of VK_FORMAT_X8_D24_UNORM_PACK32 and VK_FORMAT_D32_SFLOAT, as well as one of VK_FORMAT_D24_UNORM_S8_UINT and VK_FORMAT_D32_SFLOAT_S8_UINT. So let's keep track of which one of each pair is supported, and emulate one on top of the other one. This won't give the exact result for comparisons, or when mapping and unmapping the resources. But it's better than flat out failing to create the resource, and we can fix the map/unmap issue later if needed. Tested-by: Duncan Hopkins <duncan@thefoundry.co.uk>	2019-10-28 17:57:49 +00:00
Erik Faye-Lund	e6ea350fb0	zink: error if VK_KHR_maintenance1 isn't supported While we're at it, remove the VK_-prefix from the extension bool; all extensions have this so it's kinda superfluous.	2019-10-28 17:57:49 +00:00
Nanley Chery	d298740a1c	iris: Disallow incomplete resource creation If a modifier specifies an aux, it must be created. Fixes: `75a3947af4` ("iris/resource: Fall back to no aux if creation fails") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	f2fc5dece9	iris: Don't leak the resource for unsupported modifier Make sure the res struct is free'd before returning. Fixes: `2dce0e94a3` ("iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.") Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	7a619b5c75	iris: Enable HIZ_CCS sampling Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	8e7644e48f	intel/blorp: Satisfy clear color rules for HIZ_CCS Store the converted depth value into two dwords. Avoids regressing the piglit test "fbo-depth-array depth-clear", when HIZ_CCS sampling is enabled in a later commit. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	0aa308f420	intel: Fix and use HIZ_CCS write through mode Write through to the CCS if the surface is used as a texture and can be sampled by the HW with CCS. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	fee4dbcb4d	iris: Start using blorp_can_hiz_clear_depth() Check that the alignment requirements for HIZ_CCS are satisfied by using this function. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	5425fcf2cb	intel/blorp: Satisfy HIZ_CCS fast-clear alignments Prevent the piglit test, amd_vertex_shader_layer-layered-depth-texture-render, from regressing in a future commit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	6451008e8b	intel: Refactor blorp_can_hiz_clear_depth() Prepare this function to be used in iris and to handle new Gen12 behavior. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	cc99d0adc0	isl: Add isl_surf_supports_hiz_ccs_wt() Add a helper to determine if an ISL surface supports the write-through mode of HIZ_CCS. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	6020ebf799	iris: Enable HIZ_CCS in depth buffer instructions Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	af6ff48894	iris: Define initial HIZ_CCS state and transitions Make it match those of HIZ. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	c991045d38	iris: Create an unusable secondary aux surface The HIZ_CCS and MCS_CCS auxiliary surface modes require that drivers store information about two aux buffers. We choose to represent this as HiZ/MCS being the primary aux surface and the CCS as a secondary/extra aux surface. This representation has the effect of placing most of the code that will have to choose between the two aux surfaces around the aux-map entry points. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	909030bca6	iris: Don't guess the aux_usage Instead of guessing an aux_usage, then confirming it if the isl_surf_get_*_surf functions are successful, just call the ISL functions up-front. This will help us to more easily determine if a depth buffer supports HIZ_CCS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	04e5f7e8a9	intel/blorp: Treat HIZ_CCS like HiZ Allow it in depth buffer instructions but disable it for blits. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:06 -07:00
Nanley Chery	cc415f911f	intel/blorp: Assert against HiZ in surface states Avoid unexpected behavior if the caller happens to pass in a HiZ aux usage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Nanley Chery	c50f8b2fc9	intel: Support HIZ_CCS in isl_surf_get_ccs_surf Add an extra aux parameter which will be filled out with CCS if the first two isl_surf parameters fit the requirements for HiZ_CCS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Nanley Chery	e2e67b3f11	isl: Reduce assertions during aux surf creation Return false more often to reduce the burden on the caller. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Nanley Chery	6670e07a6e	intel: Enable CCS_E for R24_UNORM_X8_TYPELESS on TGL+ While this format isn't listed in BSpec: 53911, other documentation and empirical evidence suggest that it's fine to remap it to R32_FLOAT. I've filed a bug for the BSpec page. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Nanley Chery	f93bc14618	intel: Use 3DSTATE_DEPTH_BUFFER::ControlSurfaceEnable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Jason Ekstrand	ab994ecae6	intel/isl: Support HIZ_CCS in emit_depth_stencil_hiz v2. Remove undocumented CCS_E-only mode for depth. (Nanley) Co-authored-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Nanley Chery	6312328a61	intel: Use RENDER_SURFACE_STATE::DepthStencilResource Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Jordan Justen	5d34a9975f	intel: Update alignment restrictions for HiZ surfaces. v2 (Nanley): * Maintain a chronological ordering for HiZ alignments. Suggested by Ken. Co-authored-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Nanley Chery	6cd9731d96	iris: Clear ::has_hiz when disabling aux Fixes: `2cddc953cd` ("iris: some initial HiZ bits") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Nanley Chery	d5fb9cccdc	intel/blorp: Disable depth testing for slow depth clears We'll start doing slow depth clears more often on HIZ_CCS buffers in a future commit. Reduce the performance impact by making them use less bandwidth. From the Depth Test section of the BSpec: This function is enabled by the Depth Test Enable state variable. If enabled, the pixel's ("source") depth value is first computed. After computation the pixel's depth value is clamped to the range defined by Minimum Depth and Maximum Depth in the selected CC_VIEWPORT state. Then the current ("destination") depth buffer value for this pixel is read. and from the Depth Buffer Updates section of the BSpec: If depth testing is disabled or the depth test passed, the incoming pixel's depth value is written to the Depth Buffer. Taken together, it's clear that depth testing isn't necessary to perform a depth buffer clear. Mark Janes and I analyzed this patch with frameretrace and a depthrange piglit test. I disabled HiZ to ensure we'd get slow depth clears. We've observed the bandwidth consumption by the depth buffer access to be cut ~50% on BDW and SKL during depth clears. On a more graphically intensive workload, the Shadowmapping Sascha benchmark, I took the average of 3 runs on a BDW with a display resolution of about 1920x1200 (minus some desktop environment decorations). I measured a 22.61% FPS improvement when HiZ is disabled. v2. The BSpec doesn't mandate this behavior, update comment accordingly. (Ken) Fixes: `bc4bb5a7e3` ("intel/blorp: Emit more complete DEPTH_STENCIL state") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Nanley Chery	e655eed531	intel: Enable CCS_E for some formats on Gen12 In ISL: Update the format table to add CCS_E support for some 8BPP formats, some 16BPP formats, and R10G10B10A2_UNORM_SRGB. In the helper for determining CCS_E support, we return false for some 16BPP formats because they aren't properly handled in blorp_copy(). In BLORP: Allow the new and non-problematic formats for CCS_E-enabled copies. v2. Update other fields for A1B5G5R5_UNORM and A4B4G4R4_UNORM in table. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)	2019-10-28 10:47:05 -07:00
Nanley Chery	126c9562d9	isl: Redefine the CCS layout for Gen12 The CCS could be described in a number of ways, but this format was chosen to minimize churn in the drivers. We may decide on a different direction in the future. v2. Increase alignment for display surfaces. (Nanley) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1) Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Nanley Chery	1e91280242	isl: Add and use isl_tiling_flag_to_enum() Use a helper that will automatically handle Gen12's CCS tiling when creating a CCS isl_surf. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:05 -07:00
Nanley Chery	82822bc549	iris: Allow for non-Y-tiled aux allocation The Gen12 CCS is not Y-tiled. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:05 -07:00
Nanley Chery	22be1447bb	isl/drm: Map HiZ and CCS tilings to Y In the function which translates ISL tilings to i915 tilings, map ISL's HiZ and CCS tilings to Y instead of NONE (linear). The HW docs describe HiZ and pre-Gen12 CCS surfaces as being Y-tiled in memory. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:05 -07:00
Jason Ekstrand	901bed5122	intel/isl: Update surf_fill_state for gen12 v2 (Nanley): * Avoid driver churn for now. * Include some media compression changes. Co-authored-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:05 -07:00
Jason Ekstrand	caf4cc548e	intel/isl/fill_state: Separate aux_mode handling from aux_surf v2. Avoid driver churn for now. (Nanley) Co-authored-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:05 -07:00
Jason Ekstrand	a1e0b21061	intel/isl: Add new aux modes available on gen12 v2. Add media compression. (Nanley) Co-authored-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:05 -07:00
Nanley Chery	77f506382f	i965/miptree: Avoid -Wswitch for the Gen12 aux modes Avoid the compiler warnings for the new enums that will be introduced in a future commit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:05 -07:00
Nanley Chery	8af1853331	anv/private: Modify aux slice helpers for Gen12 CCS The isl_surf structs for Gen12's CCS won't describe how many slices in the main surface can be compressed. All slices will be compressible if CCS is enabled, so look up the main surface's logical dimension. v2. Add a space before a `?`. (Jordan) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:05 -07:00
Nanley Chery	ba52cd7ab2	intel/blorp: Don't assert aux slices match main slices This isn't accurate enough for HiZ which can have a discontiguous range of supported aux slices. This also won't work with the plan to represent Gen12 CCS as a single slice surface. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:05 -07:00
Jason Ekstrand	4021a3925c	intel/blorp: Use surf instead of aux_surf for image dimensions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:05 -07:00
Nanley Chery	d90bffaef8	intel/blorp: Halve the Gen12 fast-clear/resolve rectangle Update their dimensions according to the Bspec. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:05 -07:00
Rafael Antognolli	43b48ee752	intel/blorp/gen12: Set FWCC when storing the clear color. From "Render Target Fast Clear" description for Gen12: "SW must store clear color using MI_STORE_DATA_IMM with ForceWriteCompletionCheck bit set." From Instruction_MI_STORE_DATA_IMM, bitfield 10 (when set to 1): "Following the last write from this command, Command Streamer will wait for all previous writes are completed and in global observable domain before moving to next command." We use 4 SDIs to store the clear color (one per channel). From the description, it looks to me that setting that flag only on the last SDI should be enough. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:05 -07:00
Nanley Chery	07e16221d9	isl: Round up some pitches to 512B for Gen12's CCS Gen12's CCS requires that the main surface have a pitch aligned to 512B. v2. Provide a BSpec citation. (Ken) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Nanley Chery	f6aefa94cc	iris: Don't assume CCS_E includes CCS_D There's no longer a clear-only compression mode of CCS on Gen12+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:05 -07:00
Nanley Chery	300d77c2fa	anv/cmd_buffer: Don't assume CCS_E includes CCS_D There's no longer a clear-only compression mode of CCS on Gen12+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:05 -07:00
Nanley Chery	4f0b5f9732	anv/image: Disable CCS_D on Gen12+ Clear-only compression no longer exists on TGL. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:04 -07:00
Nanley Chery	a94cb6503f	isl: Disable CCS_D on Gen12+ Clear-only compression no longer exists on TGL. v2. Add BSpec reference. (Sagar) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:04 -07:00
Nanley Chery	83fc15e5ba	iris: Drop support for I915_FORMAT_MOD_Y_TILED_CCS on TGL+ The format of the CCS has changed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:04 -07:00
Nanley Chery	0eaf293b47	anv/formats: Disable I915_FORMAT_MOD_Y_TILED_CCS on TGL+ The format of the CCS has changed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 10:47:04 -07:00
Nanley Chery	d0fcc2dd50	anv: Properly allocate aux-tracking space for CCS_E add_aux_state_tracking_buffer() actually checks the aux usage when determining how many dwords to allocate for state tracking. Move the function call to the point after the CCS_E aux usage is assigned. Fixes: `de3be61801` ("anv/cmd_buffer: Rework aux tracking") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:04 -07:00
Nanley Chery	698d723a6d	anv/blorp: Use BLORP_BATCH_NO_UPDATE_CLEAR_COLOR Avoid failing the `info->use_clear_address` assertion in ISL on Gen12+. Fixes: `6c9f9a82d7` ("intel/genxml,isl: Add gen12 render surface state changes") Reported-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 10:47:04 -07:00
Plamena Manolova	939ddccb7a	anv: Add support for depth bounds testing. In gen12 we use the 3DSTATE_DEPTH_BOUNDS instruction to enable depth bounds testing. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-28 14:13:04 +00:00
Plamena Manolova	1df871f8ff	iris: Add support for depth bounds testing. In gen12 we use the 3DSTATE_DEPTH_BOUNDS instruction to enable depth bounds testing. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 13:46:06 +00:00
Plamena Manolova	1ecd37eac6	genxml: Add 3DSTATE_DEPTH_BOUNDS instruction. In gen12 we add the 3DSTATE_DEPTH_BOUNDS instruction which enables support for depth bounds testing. Signed-off-by: Plamena Manolova <plamena.manolova@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 13:45:24 +00:00
Danylo Piliaiev	8818e0df74	glsl: Initialize all fields of ir_variable in constructor Better be safe, even if we could technically avoid this for some fields. Cc: <mesa-stable@lists.freedesktop.org> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1999 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Tested-by: Witold Baryluk <witold.baryluk@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-28 12:49:15 +00:00
Timothy Arceri	1909bc526d	util: remove LIST_IS_EMPTY macro Just use the inlined function directly. The new function was introduced in `addcf410`. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:39 +00:00
Timothy Arceri	7f106a2b5d	util: rename list_empty() to list_is_empty() This makes it clear that it's a boolean test and not an action (eg. "empty the list"). Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Timothy Arceri	c578600489	util: remove LIST_DEL macro Just use the inlined function directly. The macro was replaced with the function in `ebe304fa54`. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Timothy Arceri	c976b427c4	util: remove LIST_DELINIT macro Just use the inlined function directly. The macro was replaced with the function in `ebe304fa54`. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Timothy Arceri	d23d47c065	util: remove LIST_REPLACE macro Just use the inlined function directly. The macro was replaced with the function in `ebe304fa54`. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Timothy Arceri	40258fb8b8	util: remove LIST_ADD macro Just use the inlined function directly. The macro was replaced with the function in `ebe304fa54`. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Timothy Arceri	255de06c59	util: remove LIST_ADDTAIL macro Just use the inlined function directly. The macro was replaced with the function in `ebe304fa54`. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Timothy Arceri	7ae1be1028	util: remove LIST_INITHEAD macro Just use the inlined function directly. The macro was replaced with the function in `ebe304fa54`. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Erik Faye-Lund	15e7f94278	gitlab-ci: fixup debian tags When resolving a merge-conflict, I accidentally only updated the ARM64-tag tag. Let's correct this. Fixes: `3d529c1739` ("gitlab-ci: also build Zink on CI")	2019-10-28 12:07:30 +01:00
Danylo Piliaiev	12a8f2616a	intel/compiler: Fix C++ one definition rule violations When building with "-flto" brw::block_data definitions were colliding. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-28 12:02:40 +02:00
Erik Faye-Lund	3d529c1739	gitlab-ci: also build Zink on CI This prevents accidentally breaking the driver-build while working on other drivers. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-10-28 08:51:48 +00:00
Erik Faye-Lund	86ed8132a5	zink: simplify gl-to-vulkan lowering Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Erik Faye-Lund	412e2aa23b	zink/spirv: more complete sampler-dim handling Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Erik Faye-Lund	f26eab3175	zink: fixup scissoring Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Duncan Hopkins	c4446098cf	zink: limited uniform buffer size so the limits is not exceeded. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Erik Faye-Lund	4ef088f241	zink: do not set lineWidth to invalid value Some implementations don't support the lineWidth-feature, so let's avoid setting invalid state to them. But since we don't have a fallback for this, inform the user. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Erik Faye-Lund	59f8ba05f5	zink: pass screen to zink_create_gfx_pipeline Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Duncan Hopkins	5cf93985a0	zink: respect ubo buffer alignment requirement The driver can report a minimum alignment for UBOs, and that can be larger than 64, which we've currently been using. Let's play ball, and use the reported value instead. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Duncan Hopkins	108ba81c95	zink: fix line-width calculation There are two things that go wrong in this code on some drivers: 1. Rounding off the line-width to granularity can push it outside the legal range. 2. A granularity of 0.0 results in NaN, because we divide by zero. So let's make this code a bit more robust. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Erik Faye-Lund	df11f3f2ab	zink: fixup return-value Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Erik Faye-Lund	d5cbc05cde	zink: refactor blitting Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Erik Faye-Lund	a7fbc8bc7f	zink: implement resource_from_handle Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Erik Faye-Lund	65fbb1836a	zink: use VK_FORMAT_B8G8R8A8_UNORM for PIPE_FORMAT_B8G8R8X8_UNORM Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Erik Faye-Lund	867d892d90	zink: do not set VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT for non-3D textures Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Erik Faye-Lund	d8f1cf4946	zink/spirv: alias var0 on tex0 etc instead This fixes Quake3, and is more in line with directx semantics. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Erik Faye-Lund	c7bcb6e5dc	zink: lower two-sided coloring Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Erik Faye-Lund	67a9749ada	zink/spirv: alias generic varyings on non-generic ones This gets rid of the nasty location-allocation hack. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Erik Faye-Lund	1f3d2b9f80	zink/spirv: implement load_front_face We're now adding interface-types during code-emitting, so we need to defer emitting the entry-point. No biggie, spirv_builder is prepared for this. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:48 +00:00
Erik Faye-Lund	a046957a79	zink/spirv: fixup b2i32 Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	b28156413f	zink: do not lower bools to float Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	3ed41e3bb6	zink/spirv: prepare for 1-bit booleans Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	c24c3da00a	zink/spirv: fixup b2i32 and implement b2f32 Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	0a912269d4	zink/spirv: clean up get_[fu]vec_constant Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	3ceba2d312	zink/spirv: inline get_uvec_constant into emit_load_const This is the only call-site that wants to specify unique values per component for any of the get_*_constant functions. So let's give this its own implementation instead, so we can ease the burden for the rest. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	20f6b19fdf	zink/spirv: add emit_uint_const-helper While we're at it, let's move emit_float_const to the same location as this needs to be defined at. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	f048196f9e	zink/spirv: add emit_bitcast-helper Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	0f697be76d	zink/spirv: use bit_size instead of hard-coding Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	54c46db1c8	zink/spirv: implement emit_float_const helper Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	89591c895c	zink/spirv: implement emit_select helper Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	2419022a0c	zink/spirv: implement b2i32 Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	f4ad93462c	zink/spirv: implement bitwise ops Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	103776ab9c	zink/spirv: implement bcsel Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	0947afaa8f	zink/spirv: assert bit-size This is going to make it easier to verify that 1-bit float sizes don't leak into the rest of the code. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	bb895afaa0	zink/spirv: implement f2b1 Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	04bb08ed35	zink/spirv: use ordered compares Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	3ef3ab2d54	zink: lower point-size Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	f24e14cc08	zink: add missing sRGB DXT-formats Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	d50ec9f798	zink: disable PIPE_CAP_QUERY_TIME_ELAPSED for now Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	b525348729	zink: support shadow-samplers Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	d9c068cba1	zink: fix rendering to 3D-textures Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	65e2cf98d5	zink: initialize nr_samples for pipe_surface Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	8575295c17	zink: use primconvert to get rid of 8-bit indices Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	2942becfe9	zink: also accept txl Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	2683619955	HACK: zink: suspend / resume queries on batch-boundaries HACK because we assert that we don't overrun the pool. We need a fallback here instead. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	67cde39c8c	zink: move set_active_query_state-stub to zink_query.c Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	7ebdf5be15	zink: disable timestamp-queries We don't implement the get_timestamp context-method, so this is just going to crash if anyone tries to use it. Let's implement it later. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	e084211c08	zink: fixup boolean queries Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:47 +00:00
Erik Faye-Lund	69189417ae	zink/spirv: support vec1 coordinates Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	499bf41487	zink: do not use both depth and stencil aspects for sampler-views Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	5f14168edf	zink/spirv: always enable Sampled1D for fragment shaders Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	967e570511	zink: add note about enabling PIPE_CAP_CLIP_HALFZ Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	755037e09d	zink: don't crash when setting rast-state to NULL Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	7004afcd24	zink: remove insecure comment This isn't as inaccurate as the comment says, the Vulkan documentation even seems to suggest this is the same. Let's drop the comment. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	a10d43d845	zink: avoid texelFetch until it's implemented Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	a9770e2bd2	zink: set ExecutionModeDepthReplacing when depth is written Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	10f26ef92d	zink: fixup: save rasterizer Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	c96963a8d1	zink: ensure layout is reasonable before copying Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	c947aee63b	zink/spirv: debug-print unknown varying slots Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	c2f52cf94f	zink/spirv: be a bit more strict with fragment-results Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	97f4827e2e	zink: wait for transfer when reading TODO: this could really benefit from a separate transfer-queue, I think. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	a005fae564	zink: support more texturing Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	44f374ced5	zink/spirv: correct opcode Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	baf34dbd75	zink: make sure imageExtent.depth is 1 for arrays Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	67d2e6258e	zink: stub resource_from_handle Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	b8a9bbeb00	zink: abort on submit-failure Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	4a64ee192a	zink: crash hard on unknown queries Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	86c0217ee9	zink: add more compares Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	06859b70b9	zink: more converts Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	b5bfb72fce	zink: more comparison-ops Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	bcd12adce5	zink: implement ineg Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	d19f0b437b	zink: add shift ops Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	6032fc65b0	zink: add division ops Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	60bfee1f31	zink: add some opcodes Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	4e60d4d52a	zink: clean up opcode-emitting a bit Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	55bcf9b1e0	zink: process one aspect-mask bit at the time Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	cd59de1e3f	zink: save all supported util_blitter states Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	4887ceb79e	zink: save original scissor and viewport Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	d29cc33a9b	zink: store sampler and image_view counts Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	8e5fe441bd	zink: use pipe_stencil_ref instead of uint32_t-array Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	e14c29b9f2	zink: document end-of-frame hack Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Erik Faye-Lund	10439594ec	zink: only consider format-desc if checking details Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:46 +00:00
Dave Airlie	4480aefc38	zink: attempt to get multisample resource creation right Use the exposed vulkan limits to fill out supported formats. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Dave Airlie	e234116a96	zink: add samples to rasterizer Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Dave Airlie	0c5f3e50ae	zink: add sample mask support This isn't really used yet, but may as well just fill it in. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	dbf67e8a20	zink: refactor fence destruction Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	4c5ade8ca6	zink: drop unused argument Because si.waitSemaphoreCount is 0, this won't even be looked at by the driver, so let's just drop it. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	03efb6dd27	zink: cleanup zink_end_batch This inlines submit_cmdbuf into zink_end_batch, the only place it's used. This makes the code a bit more straight-forward to read. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	8edd357795	zink: request ucp-lowering Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	80673264cb	zink: do not lower io Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	af0dc71d6f	zink/spirv: rename vec_type These aren't guaranteed to be vectors, they can also be scalars. The var-part is the significant part here, not the vector-ness. So let's rename these. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	62f7d9afe8	zink/spirv: var -> regs These track nir-registers, so it's clearer if we refer to them by that name instead. There's potentially more vars than these. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Dave Airlie	5dbfb02459	zink: add support for compressed formats Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	6ae8686bff	zink: request alpha-test lowering Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	d9b7d7b051	zink: pool descriptors per batch Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	9913e5c40b	zink: reuse constants Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	8e5d24fedf	zink: fix off-by-one in assert Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	35c0ef8852	zink: squashme: trade cplusplus wrapper for header-guard Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	bb76a3f61d	zink: squashme: forward declare hash_table Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	fe34a35333	zink: do not use hash-table for regs Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	ca074edc7f	zink: clamp scissors Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	249cd3fc13	zink: kill dead code Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Duncan Hopkins	d850e2a3f2	zink: clamped limits to INT_MAX when stored as uint32_t. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	737a2bba35	zink: prepare for shadow-samplers Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	5dfa6be36e	zink: keep a reference to used render-passes Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	1927d11fc0	zink: pass screen instead of device to program-functions Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	f90ee9e33a	zink: rename sampler-view destroy function This name is more consistent with other functions. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	e4bbdcbf80	zink: clean up render-pass management Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	0296e8981d	zink: remove hack-comment This isn't a hack, it's how this should work. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	2a7302075d	zink: ensure sampler-views survive a batch we don't need to track the resources for the samplers any longer, as the sampler view holds a reference instead. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	09e20d88e7	zink: fixup parameter name Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	795c0e95c5	zink: use helper Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	9e0ff0ffda	zink: more batch-ism Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	33b2f914db	zink: cache framebuffers Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	a872f46369	zink: cache render-passes Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	5f21637370	zink: simplify renderpass/framebuffer logic a tad Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	9cac63cae9	zink: implement batching This reduces stalling quite a bit. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:45 +00:00
Erik Faye-Lund	56b1048bb0	zink: return after blitting Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	ef8750da3d	zink: remove unusual alignment Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	94d3b9389e	zink: tweak state handling Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	8f6449f296	zink: move primitive-topology stuff into program The primitive topology is a bit of an odd-ball, as it's the only truly draw-call specific state that needs to be passed to the program to get a pipeline. So let's make this a bit more explicit, by passing it separately. This makes the flow of data a bit easier to wrap your head around. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	e0a93ba351	zink: assign increasing locations to varyings Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	cedf3598b4	zink: ensure textures are transitioned properly Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	c471525fdc	zink: ensure non-fragment shaders use lod-versions of texture Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	9cf6163ea1	zink: emit dedicated block for variables Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	93af00502e	zink: use uvec for undefs Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	a8e63387f3	zink: do not destroy staging-resource, deref it Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	819f9fd2f2	zink: track used resources Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	5a9f235ac2	zink: implement fmod Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	22d080b3ac	zink: store shader_info in zink_shader Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	ce6f19c4ec	zink: texture-rects? Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	4ae362c0ef	zink: delete samplers after the current cmdbuf This makes them zombies for a little while. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	2e2ad61ef1	zink: add curr_cmdbuf-helper Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	806f040bb3	zink: reference blit/copy-region resources Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	b89eb298ff	zink: whitespace cleanup Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	453d9f193a	zink: wait for idle on context-destroy Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	8541b58e39	zink: reference ubos and textures Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	21cffebe4f	zink: reference vertex and index buffers Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	a27b84dd2e	zink: return old fence from zink_flush Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	0fcc9550b2	zink: reference renderpass and framebuffer from cmdbuf Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	ce66749e0b	zink: cache those pipelines Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	8e56b828e4	zink: move renderpass inside gfx pipeline state Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	1cdbeefd2c	zink: cache programs Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	fba0293bef	zink: pass zink_render_pass to pipeline-creation Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	86d0e741ec	zink: prepare for multiple cmdbufs Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	229cd042d3	zink: move cmdbuf-resetting into a helper Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	ac45bc2359	zink: do not leak image-views Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:44 +00:00
Erik Faye-Lund	e64cc463e3	zink: move render-pass begin to helper Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Erik Faye-Lund	7034422389	zink: prepare for caching of renderpases/framebuffers Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Erik Faye-Lund	b458863c1e	zink/spirv: implement loops Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Erik Faye-Lund	acdd12dae3	zink/spirv: implement discard Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Erik Faye-Lund	11ad9bfc35	zink/spirv: implement if-statements Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Erik Faye-Lund	8bbf86e7bc	zink/spirv: prepare for control-flow Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Erik Faye-Lund	32aea77cfe	zink/spirv: handle reading registers Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Erik Faye-Lund	f317105dd9	zink/spirv: implement some integer ops Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Dave Airlie	d2abe0ac61	zink/spirv: store all values as uint. This adds bitcasting to uint everywhere for now, and stores all spir-v ssa values as uints. It also casts bool to 0/0xffffffff for now (nir 1-bit bools may be coming in the future). This makes a lot of piglit tests pass now. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Erik Faye-Lund	ac530c1ce2	zink: remove discard_if Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Dave Airlie	6d96578912	zink: query support (v2) This at least passes piglit occlusion_query test for me here now. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Erik Faye-Lund	b533de12a5	zink: transform z-range In vulkan, the Z-range of clip-space goes from 0..W instead of -W..+W as is the case in OpenGL. So we need to transform the Z-range to account for this. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
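The Z-range remapping described in the commit above can be written as z' = (z + w) / 2, which sends -W to 0 and +W to W. The sketch below only illustrates that arithmetic: `struct vec4` and `gl_to_vk_clip_z` are hypothetical names, and the driver itself may apply the transform elsewhere (for example in the generated shader).

```c
/* Illustrative clip-space position type (not a zink structure). */
struct vec4 {
   float x, y, z, w;
};

/* Hypothetical helper: remap an OpenGL clip-space Z in [-w, +w] to the
 * Vulkan convention [0, w] using z' = (z + w) / 2. */
static struct vec4
gl_to_vk_clip_z(struct vec4 p)
{
   p.z = (p.z + p.w) * 0.5f;
   return p;
}
```

After the perspective divide, this corresponds to mapping normalized depth from [-1, 1] to [0, 1].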
Dave Airlie	9fa7400564	zink: add dri loader export MESA_LOADER_DRIVER_OVERRIDE=zink should now work without using swrast paths Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Erik Faye-Lund	4249e4a598	zink/spirv: implement point-sprites This passes glsl-fs-pointcoord_gles2 from piglit. Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Dave Airlie	c3bd0274c6	zink: ask for flatshade lowering Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Erik Faye-Lund	48f1f20a9d	zink: detect presence of VK_KHR_maintenance1 Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Erik Faye-Lund	8d46e35d16	zink: introduce opengl over vulkan Here's zink, a so far pretty simple vulkan-gallium driver that is able to translate some applications from OpenGL to Vulkan. The compiler is quite limited for now, this will be improved on later. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Acked-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 08:51:43 +00:00
Samuel Pitoiset	5912792501	radv: fix OpQuantizeToF16 for NaN on GFX6-7 Do not flush NaN to 0. Fixes dEQP-VK.spirv_assembly.instruction.compute.opquantize.propagated_nans Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-28 09:31:52 +01:00
Samuel Pitoiset	d82dfca872	radv: enable fast depth/stencil clears with separate aspects on GFX8 It's similar to GFX9+. Shadow of Mordor (Vulkan beta) hits that path and it works fine. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-28 07:54:11 +00:00
Jordan Justen	66796a1787	iris: Mark aux-map BO as used by all batches Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 00:09:14 -07:00
Jordan Justen	2e6a7ced4d	iris/gen12: Write GFX_AUX_TABLE base address register Rework: * Move last_aux_map_state to iris_batch. (Nanley, Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 00:09:14 -07:00
Jordan Justen	f046c6d090	iris: Map each surf to its aux-surf in the aux-map tables Rework: Nanley Chery Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 00:09:14 -07:00
Jordan Justen	d09db2d7b2	isl/gen12: 64k surface alignment Reworks: * Update size for aux map change (Nanley) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 00:09:14 -07:00
Jordan Justen	f118ca2075	iris/bufmgr: Initialize aux map context for gen12 Reworks: * free gen_buffer in gen_aux_map_buffer_free. (Rafael) * lock around aux_map_bos accesses. (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 00:09:14 -07:00
Lionel Landwerlin	6af8a4acc4	anv: Add aux-map translation for gen12+ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-28 00:09:14 -07:00
Jordan Justen	7737f56544	anv/gen12: Write GFX_AUX_TABLE base address register Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-28 00:09:14 -07:00
Jordan Justen	109c96b322	genxml/gen12: Add AUX MAP register definitions Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 00:09:13 -07:00
Jordan Justen	d4a3299ba1	anv/gen12: Initialize aux map context Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-28 00:09:13 -07:00
Jordan Justen	0d0290bb3f	intel/common: Add surface to aux map translation table support Reworks: * Add ISL_FORMAT_B8G8R8X8_UNORM_SRGB to get_format_encoding (Nanley) * ralloc_free aux_map_buffer entries in gen_aux_map_finish. (Rafael) * verify_aligned_space => align_and_verify_space (Rafael) * Add mutex to aux-map code. (Rafael, Nanley) * Add gen_aux_map_fill_bos (Ken) * Make gen_aux_map_get_state_num lockless Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 00:09:13 -07:00
Jordan Justen	062022f2e4	anv: Implement aux-map allocator interface This interface allows the aux-map code in the intel/common library to allocate and free buffers. Reworks: * free gen_buffer in gen_aux_map_buffer_free. (Rafael) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-28 00:09:13 -07:00
Jordan Justen	c848ab45f3	intel/common: Add interface to allocate device buffers Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 00:09:13 -07:00
Lionel Landwerlin	830cdaf3f0	intel/dev: store whether the device uses aux map tables on devinfo Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 00:09:10 -07:00
Tapani Pälli	412badd059	i965: setup sized internalformat for MESA_FORMAT_R10G10B10A2_UNORM Commit `d2b60e433e` introduced restrictions (as per GLES spec) on the internal format. We need to set up a sized format for the texture image so framebuffers created with that are considered complete. This change fixes the following Android CTS test in the AHardwareBufferNativeTests category: SingleLayer_ColorTest_GpuColorOutputAndSampledImage_R10G10B10A2_UNORM Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Fixes: `d2b60e433e` ("mesa/main: R10G10B10_(A2) formats are not color renderable in ES") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-28 07:13:10 +02:00
Eric Engestrom	32cff3781a	tu: fix empty-body instruction Fixes: `8d43e2b2de` ("meson: add -Werror=empty-body to disallow `if(x);`") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-27 22:10:31 +00:00
Eric Engestrom	0581a86753	v3d: fix empty-body instruction Fixes: `8d43e2b2de` ("meson: add -Werror=empty-body to disallow `if(x);`") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-27 22:10:31 +00:00
Eric Engestrom	c2430f3edc	radv: fix empty-body instruction Fixes: `8d43e2b2de` ("meson: add -Werror=empty-body to disallow `if(x);`") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-27 22:10:31 +00:00
Eric Engestrom	493903199c	anv: fix empty-body instruction Fixes: `8d43e2b2de` ("meson: add -Werror=empty-body to disallow `if(x);`") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-27 22:09:14 +00:00
Jonathan Marek	521cdde8fc	freedreno/a2xx: use sysval for pointcoord Fixes a problem with shaders using gl_PointCoord. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reported-by: Fabio Estevam <festevam@gmail.com> Tested-by: Fabio Estevam <festevam@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-10-27 16:53:32 +00:00
Alyssa Rosenzweig	a0c0030075	pan/midgard: Disable precise occlusion queries I thought there was hardware support for this, but it seems to be broken, or at least more complex than I believed. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-26 14:38:59 +00:00
Urja Rannikko	dff99ce7d5	panfrost: allocate bo for occlusion query results This memory needs to still be available after all the drawing is done and forgotten about, so cannot be transient. Also clear the result so that no rendering returns a zero. Signed-off-by: Urja Rannikko <urjaman@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-26 14:38:25 +00:00
Alyssa Rosenzweig	728a975700	panfrost: Expose serialized NIR support Serialized NIR is required for clover with the SPIR-V pipeline. With this change and PAN_MESA_DEBUG=deqp, clinfo is able to successfully probe panfrost. Code from Nouveau (commit `7955fabcf8` by Karol Herbst). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-26 13:17:42 +00:00
Alyssa Rosenzweig	afb0d08cb0	pipe-loader: Default to kmsro if probe fails A device supported by kmsro will not automatically probe kmsro since the driver name will be panfrost/lima/v3d/..., not "kmsro". Since kmsro is a bit of a catch-all for generic (mostly embedded) GPUs, add a fallback on kmsro for the dynamic loader. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Acked-by: Karol Herbst <kherbst@redhat.com>	2019-10-26 13:17:42 +00:00
Alyssa Rosenzweig	4949876dd0	pipe-loader: Add kmsro pipe_loader target kmsro is used by numerous embedded GPUs for a common winsys abstraction. Let's add support for it for the dynamic pipe loader, so clover can probe on these drivers. We build the target with Panfrost. When other drivers need kmsro+clover, we can revisit the build system part; my mesonfu is wanting. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Acked-by: Karol Herbst <kherbst@redhat.com>	2019-10-26 13:17:42 +00:00
Jose Fonseca	ace5138548	scons: Fix force_scons parsing. - Use parsed options instead of using ARGUMENTS directly. - Handle the case of mingw cross compilation. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2003	2019-10-26 08:23:48 +01:00
Timothy Arceri	cff53da374	radv: enable secure compile support Can be enabled via the environment variable which tells the driver how many compilation threads are expected to be called, and therefore how many forked processes the driver should create. For example we would expect to call fossilize replay with something like this: RADV_SECURE_COMPILE_THREADS=8 ./fossilize-replay --num-threads 8 \ --shader-cache-size 0 --ignore-derived-pipelines pipeline_cache.foz Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	57c95d2ce2	radv: add support for a secure compile fork at device creation This adds support for the fork, the installation of the seccomp filter, and the main loop for the actual compilation to be called from, i.e. run_secure_compile_device(). Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	3f2283b3e2	radv: add radv_secure_compile() This function will be called by the parent process when doing a secure compile. It first selects a free process to work with then passes it all the information it needs to compile the pipeline. Once the pipeline information has been passed to the secure process, it then waits around to read/write any disk cache entries required before exiting. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	07692f703f	radv: for secure compile exit early from radv_shader_variant_create() We don't have permission to be creating shared memory etc. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	5cd437b1ed	radv: allow the secure process to read and write from disk cache This allows the secure process to read and write to the disk cache via the parent process. This commit just adds the functionality needed for the secure process, the following commit will add the functionality for the parent process. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	5d25aee005	radv: add radv_device_use_secure_compile() helper Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	d33f2165c9	radv: add some new members to radv device and instance for secure compile These will be used by the following commits to hold information about the forked secure compile processes. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	e8cb13d499	radv: add radv_secure_compile_type enum This will be used to identify information being passed between the parent and secure process during a secure compile. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	2d2b113e86	radv: add radv_create_shaders() to radv_shader.h In a following commit we want to be able to call this for secure compiles from radv_device.c Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	6571000071	radv: add debug option to turn off in memory cache This can be useful for debugging the on disk cache, but is also useful in the following patch for secure compiles which will be used to compile huge pipeline collections. These pipeline collections can be multiple GBs and the in memory cache grows to multiple GBs very quickly when they are compiled so we want to be able to turn off the in memory cache. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	637776629d	radv: get topology from pipeline key rather than VkGraphicsPipelineCreateInfo This is cleaner and avoids having to read/write an additional copy of topology for use with secure compile. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Marek Olšák	83a346cd58	docs: document new feature EGL_EXT_image_flush_external	2019-10-25 19:59:04 -04:00
Marek Olšák	c1c574fdf1	egl: implement new functions from EGL_EXT_image_flush_external Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-By: Tapani Pälli <tapani.palli@intel.com>	2019-10-25 19:59:04 -04:00
Marek Olšák	34b1aa957a	egl: handle EGL_IMAGE_EXTERNAL_FLUSH_EXT Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-By: Tapani Pälli <tapani.palli@intel.com>	2019-10-25 19:59:04 -04:00
Marek Olšák	1d122c104a	st/dri: add support for EGL_EXT_image_flush_external Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-By: Tapani Pälli <tapani.palli@intel.com>	2019-10-25 19:59:04 -04:00
Marek Olšák	1d1b457821	st/dri: assume external consumers of back buffers can write to the buffers Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-By: Tapani Pälli <tapani.palli@intel.com>	2019-10-25 19:59:04 -04:00
Marek Olšák	7520478461	dri_interface: add interface for EGL_EXT_image_flush_external Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-By: Tapani Pälli <tapani.palli@intel.com>	2019-10-25 19:59:04 -04:00
Marek Olšák	a0a8109fb6	include: add the definition of EGL_EXT_image_flush_external Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-By: Tapani Pälli <tapani.palli@intel.com>	2019-10-25 19:59:04 -04:00
Dylan Baker	19851c9ad6	gitlab-ci: Add a job for meson on windows This adds a new CI job that runs on windows with MSVC. It currently builds softpipe and osmesa, and runs the related unit tests. It does rely on meson's wraps for zlib, but I've set up caching of the wrap dependencies so hopefully that won't be a problem. I really wanted to use powershell for this, but there just isn't an easy way to do that, it's much easier to use batch scripts, so that's what I used. The leading `/` for .gitlab-ci/lava... must be removed because windows doesn't understand it, and when it reads the file the job ends in error. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-25 22:47:32 +00:00
Dylan Baker	06e4647cb0	gitlab-ci: refactor out some common stuff for Windows and Linux Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-25 22:47:32 +00:00
Dylan Baker	09ee11f5da	nir: Fix invalid code for MSVC Fixes: `ee2050b111` ("nir: Use BITSET for tracking varyings in lower_io_arrays")	2019-10-25 22:47:32 +00:00
Dylan Baker	ca0c1e69ca	docs: update releasing process to use new scripts and gitlab There were several out of date entries in this document, update them to current practices. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-10-25 15:46:19 -07:00
Dylan Baker	8a4541aae2	bin/gen_release_notes.py: Add a warning if new features are introduced in a point release Fixes: `86079447da` ("scripts: Add a gen_release_notes.py script") Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-10-25 15:46:15 -07:00
Dylan Baker	b153785370	bin/gen_release_notes.py: html escape all external data All of these (bug titles, patch titles, features, and people's names) can contain characters that are not valid html. Just escape everything for safety. Fixes: `86079447da` ("scripts: Add a gen_release_notes.py script") Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-10-25 15:46:13 -07:00
Dylan Baker	7e4b87f987	bin/post_release.py: Add .html to hrefs oops. Fixes: `3226b12a09` ("release: Add an update_release_calendar.py script") Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-10-25 15:46:11 -07:00
Dylan Baker	5eef803625	bin/post_version.py: white space fixes Fixes: `3226b12a09` ("release: Add an update_release_calendar.py script") Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-10-25 15:46:08 -07:00
Dylan Baker	abf9e7ac7b	bin/post_version.py: Pass version as an argument I made a bad assumption; I assumed this would be run in the release branch. But we don't do that, we run in the master branch. As a result we need to pass the version as an argument. Fixes: `3226b12a09` ("release: Add an update_release_calendar.py script") Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-10-25 15:46:06 -07:00
Dylan Baker	c6d41e7f0b	bin/gen_release_notes.py: Return "None" if there are no new features Which is very likely for .Z > 0 releases. Fixes: `86079447da` ("scripts: Add a gen_release_notes.py script") Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-10-25 15:46:03 -07:00
Dylan Baker	df3d4ad82d	bin/gen_release_notes.py: strip '#' from gitlab bugs If they use the `Fixes: #1` form. Fixes: `86079447da` ("scripts: Add a gen_release_notes.py script") Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-10-25 15:46:00 -07:00
Dylan Baker	69f540c017	bin/gen_release_notes.py: fix conditional of bugfix Previously this would result in the .0 warning being generated for .z > 0 and the .z == 0 would get the other message. Fixes: `86079447da` ("scripts: Add a gen_release_notes.py script") Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-10-25 15:45:53 -07:00
Illia Iorin	6b672e342a	mesa/main: Ignore filter state for MS texture completeness After the discussion in https://github.com/KhronosGroup/OpenGL-API/issues/45 the section 8.17 (texture completeness) of the OpenGL 4.6 core profile was changed to explicitly say that multisample texture completeness ignores filter state of the texture. "Using the preceding definitions, a texture is complete unless any of the following conditions hold true: ... - The minification filter requires a mipmap (is neither NEAREST nor LINEAR), the texture is not multisample, and the texture is not mipmap complete. - The texture is not multisample; either the magnification filter is not NEAREST, or the minification filter is neither NEAREST nor NEAREST_- MIPMAP_NEAREST; and any of – The internal format of the texture is integer (see table 8.12). – The internal format is STENCIL_INDEX. – The internal format is DEPTH_STENCIL, and the value of DEPTH_- STENCIL_TEXTURE_MODE for the texture is STENCIL_INDEX." Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Signed-off-by: Illia Iorin <illia.iorin@globallogic.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-25 21:16:23 +00:00
Illia Iorin	71d4ece366	Revert "mesa/main: Fix multisample texture initialize" This reverts commit `a113a42e73`. Per https://github.com/KhronosGroup/OpenGL-API/issues/45 it was a wrong way to fix the issue. Signed-off-by: Illia Iorin <illia.iorin@globallogic.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-25 21:16:23 +00:00
Marek Olšák	88e9042b6c	glsl/serialize: optimize for equal offsets in uniform remap tables Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1416 This decreases the shader cache size in the ticket from 1.6 MB to 40 KB. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-25 17:01:26 -04:00
Marek Olšák	e90269d90a	glsl/serialize: restructure remap table code Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-25 17:01:25 -04:00
Kenneth Graunke	f306d07932	nir: Use VARYING_SLOT_TESS_MAX to size indirect bitmasks MAX_VARYINGS_INCL_PATCH subtracts VARYING_SLOT_VAR0 giving us a size that's too small, so BITSET_SET writes words out of bounds, corrupting the stack and causing all kinds of chaos. VARYING_SLOT_TESS_MAX is the right value to use here, as it's the largest location. Closes: 2002 Fixes: `ee2050b111` ("nir: Use BITSET for tracking varyings in lower_io_arrays") Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-25 13:29:09 -07:00
Neil Armstrong	e919c44c3b	Revert "ci: Disable lima until its farm can get fixed." This reverts commit `fb9362c6fb`. Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-10-25 20:52:03 +02:00
Jason Ekstrand	e2bb7fef94	Revert "mapi: Inline call x86_current_tls." This reverts commit `e137b3a9b7`. It completely broke 32-bit EGL such that wflinfo can't even run without crashing.	2019-10-25 11:31:51 -05:00
Jon Turney	2649609ac5	rbug: Fix use of alloca() without #include "c99_alloca.h" [12/60] Compiling C object 'src/gallium/auxiliary/eb820e8@@gallium@sta/rbug_rbug_texture.c.o'. FAILED: src/gallium/auxiliary/eb820e8@@gallium@sta/rbug_rbug_texture.c.o [...] ../src/gallium/auxiliary/rbug/rbug_texture.c: In function 'rbug_send_texture_info_reply': ../src/gallium/auxiliary/rbug/rbug_texture.c:302:21: error: implicit declaration of function 'alloca'; did you mean 'malloc'? [-Werror=implicit-function-declaration] uint32_t *height = alloca(sizeof(uint32_t) * height_len); ^~~~~~ malloc ../src/gallium/auxiliary/rbug/rbug_texture.c:302:21: warning: initialization makes pointer from integer without a cast [-Wint-conversion] ../src/gallium/auxiliary/rbug/rbug_texture.c:303:20: warning: initialization makes pointer from integer without a cast [-Wint-conversion] uint32_t *depth = alloca(sizeof(uint32_t) * height_len); ^~~~~~ cc1: some warnings being treated as errors Include c99_alloca.h to portably make the alloca() prototype available. See also: `498d9d0f`, `adfb9c5c`, `fc8139b1` Fixes: `6174cba7` ("rbug: fix transmitted texture sizes") Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-10-25 16:04:34 +01:00
Alyssa Rosenzweig	f98e9a2771	pan/midgard: Express allocated registers as offsets Rather than supplying a mask/swizzle to compose with the original, just supply the offset of the allocated register so we can directly offset the mask/swizzle, without resorting to composition. This is simpler, cleaner, and will generalize to non-32-bit. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-25 08:45:39 -04:00
Alyssa Rosenzweig	c1d36eb115	pan/midgard: Expose more typesize manipulation routines These internal mir.c routines will help the RA. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-25 08:45:39 -04:00
Alyssa Rosenzweig	9bba182840	pan/midgard: Add mir_set_bytemask helper Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-25 08:45:39 -04:00
Timur Kristóf	85cc40f7ce	st/nine: Fix unused variable warnings in release build. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-25 12:44:44 +02:00
Timur Kristóf	f091b02825	st/nine: Fix build with -Werror=empty-body Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1995 Fixes: `8d43e2b2de` ("meson: add -Werror=empty-body to disallow `if(x);`") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-25 12:44:44 +02:00
Timur Kristóf	c580f134ae	aco: Refactor hazard mitigations, separate pass for GFX10. GFX10 hazards require a different approach compared to previous generations, for example it doesn't need s_nop, and most hazards can't be solved by adding NOPs at all. Also, they are not resolved by branch instructions. This commit reorganizes aco_insert_NOPs so that there is now a separate pass for GFX10. The new GFX10 pass also respects the control flow of the shader. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	b01847bd94	aco/gfx10: Fix mitigation of VMEMtoScalarWriteHazard. This commit refines the VMEMtoScalarWriteHazard mitigation, based upon a closer look at what LLVM does. Also changes the code to match the structure of the other hazard mitigations. * The hazard is not only triggered by VMEM, FLAT and GLOBAL but also SCRATCH and DS instructions. * The SMEM/SALU instructions only cause a hazard when they write a register that the VMEM/etc. are reading. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	c037ba1bb7	aco/gfx10: Mitigate LdsBranchVmemWARHazard. There is a hazard caused when there is a branch between a VMEM/GLOBAL/SCRATCH instruction and a DS instruction. This commit adds a workaround that avoids the problem. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	09d676d81a	aco/gfx10: Mitigate SMEMtoVectorWriteHazard. There is a hazard that happens when an SMEM instruction reads an SGPR and then a VALU instruction writes that same SGPR. This commit adds a workaround that avoids the problem. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	d6dfce02d0	aco/gfx10: Mitigate VcmpxExecWARHazard. There is a hazard when a non-VALU instruction reads the EXEC mask and then a VALU instruction writes the EXEC mask. This commit adds a workaround that avoids the problem. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	e5a8616973	aco/gfx10: Mitigate VcmpxPermlaneHazard. Any permlane instruction that follows any VOPC instruction can cause a hazard. This commit implements a workaround that avoids the problem. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	99aed688d3	aco/gfx10: Add notes about some GFX10 hazards. ACO currently mitigates VMEMtoScalarWriteHazard and Offset3fBug (names from LLVM). There are some bugs that ACO needn't care about. Just to be on the safe side, add an assertion that makes sure that we aren't hit by FlatSegmentOffsetBug. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:41 +02:00
Samuel Pitoiset	2bf8a9b337	radv: fix VK_KHR_shader_float_controls dependency on GFX6-7 From the Vulkan spec 1.1.126 : "VK_SHADER_FLOAT_CONTROLS_INDEPENDENCE_32_BIT_ONLY_KHR specifies that shader float controls for 32-bit floating point can be set independently; other bit widths must be set identically to each other." Forgot to update this when I enabled that extension recently. Fixes dEQP-VK.spirv_assembly.instruction.compute.float_controls.independence_settings.independence_setting Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-25 07:49:20 +02:00
Lepton Wu	e137b3a9b7	mapi: Inline call x86_current_tls. This saves one return and a simple benchmark which calls glGetString repeatedly on my desktop shows it improves calls per second from 118M to 128M. Signed-off-by: Lepton Wu <lepton@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 23:37:18 +00:00
Lepton Wu	a4fec4dd6a	virgl: Remove formats with unusual sample count. Most GPUs require the sample count to be a power of 2. Just remove those formats with unusual sample count. This decreases dEQP EGL tests run time a lot. Signed-off-by: Lepton Wu <lepton@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-24 23:11:08 +00:00
Kristian H. Kristensen	ee2050b111	nir: Use BITSET for tracking varyings in lower_io_arrays MAX_VARYINGS_INCL_PATCH is greater than 64, so we'll need more than 64 bits (per component) to track which vars have indirects. This pass was trying to track patch varyings (which start at bit 63) in a separate 64 bit word, but failed to subtract VARYING_SLOT_PATCH0 and accessed out of bounds. Do away with the ad-hoc bit mask tracking and just use a BITSET. Fixes: dEQP-GLES31.functional.tessellation.user_defined_io.per_patch_block.vertex_io_array_size_implicit.triangles Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 15:32:20 -07:00
Dylan Baker	5658b13fb6	docs: update calendar, add news item and link release notes for 19.2.2	2019-10-24 14:12:04 -07:00
Dylan Baker	89d94f5ecc	docs: Add sha256 sum for 19.2.2	2019-10-24 14:08:25 -07:00
Dylan Baker	849415d615	docs: Add release notes for 19.2.2	2019-10-24 14:07:30 -07:00
Rob Clark	bc67b892d0	freedreno/ir3: handle the progress case In some cases, in particular when you have things that can be src modifiers ((abs)/(neg)), once eliminating one mov, there is a possibility to remove another. Handle this by re-visiting an instruction after eliminating a copy on one of its srcs. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 13:08:56 -07:00
Rob Clark	97b24efd9f	freedreno/ir3: remove restrictions on const + (abs)/(neg) These date back to relatively early days of ir3, when a lot was still not well understood. But according to CI (and what I've seen blob driver do), these are not actually real restrictions. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 13:08:56 -07:00
Rob Clark	e665e65f96	freedreno/ir3: allow copy-propagate out of fanout Now that we fixed the sharp edges that this was papering over, we can relax the restriction about eliminating a mov coming out of a fanout (for example from result of texture fetch). Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 13:08:56 -07:00
Rob Clark	3ac328875e	freedreno/ir3: treat high vs low reg as conversion This avoids copy-propagating a high register into an instruction which cannot consume it. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 13:08:56 -07:00
Rob Clark	9e211b57b8	freedreno/ir3: propagate dest flags for collect/fanin We did this properly already for split/fanout. But collect was missed. Extract out a helper to share. This way we avoid copy propagating a mov from high or half reg into an instruction which cannot consume a high/half reg. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 13:08:56 -07:00
Rob Clark	49ab94694d	freedreno/ir3: make high regs easier to see in IR dumps Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 13:08:56 -07:00
Rob Clark	0f395f0933	freedreno/ir3: debug cleanup 1) deduplicate IR3_SHADER_DEBUG=disasm versus fs/vs/etc handling 2) standardize shader stage name prints, in particular VERT vs BVERT 3) don't mix stderr and stdout Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 13:08:56 -07:00
Caio Marcelo de Oliveira Filho	d31f415ba0	spirv: Add helper to find args of Image Operands Avoid keeping track of the idx and all possible image operands for each operation. Note for convenience we split up the handling of ImageOperandsOffsetMask and ImageOperandsConstOffsetMask. Suggested by Jason Ekstrand. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho	c7d8fe2f0d	spirv: Check that only one offset is defined as Image Operand Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho	d27b853c08	spirv: Add imageoperands_to_string helper Change the information to also include the category, so that the particulars of BitEnum enumeration can be handled in the template. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho	06aecb14c0	anv: Implement VK_KHR_vulkan_memory_model Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho	b8784fe652	spirv: Handle MakePointerAvailable/Visible Emit barriers with semantics matching the access operand and the storage class of the pointer. v2: Fix order of visible / available emission relative to the operations. (Bas) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho	129c85c28b	spirv: Handle MakeTexelAvailable/Visible Set the memory semantics and scope for later emitting the barrier. Note the barrier emission code already exists in vtn_handle_image for the Image atomics. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho	c649e64edc	spirv: Add option to emit scoped memory barriers Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho	c022043102	spirv: Add SpvMemoryModelVulkan and related capabilities Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho	1bb191a0d1	spirv: Emit memory barriers for atomic operations Add a helper to split the memory semantics into before and after the operation, and use that result to emit memory barriers. v2: Be more explicit about which bits we are keeping around when splitting memory semantics into a before and after. For now we are ignoring Volatile. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho	d6992f996b	spirv: Parse memory semantics for atomic operations Including the right storage memory semantic based on the storage class of the operation. These will be used later to emit memory barriers. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho	e142061399	intel/fs: Implement scoped_memory_barrier Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho	901071044e	nir/tests: Add copy propagation tests with scoped_memory_barrier Three groups of tests, effectively defining in which cases the optimization is allowed or prevented: - Redundant loads (a load generated the value) - Propagate SSA values (a store generated the value) - Propagate a var (a copy generated the value) Change the shader type of the tests to COMPUTE so nir_var_mem_shared can also be used. This doesn't affect the semantics of the copy propagation. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-24 11:39:56 -07:00
Caio Marcelo de Oliveira Filho	73572abc2a	nir: Add scoped_memory_barrier intrinsic Add a NIR intrinsic that represents a memory barrier in the SPIR-V / Vulkan Memory Model, with extra attributes that describe the barrier: - Ordering: whether it is an Acquire or Release; - "Cache control": availability ("ensure this gets written in the memory") and visibility ("ensure my cache is up to date when I'm reading"); - Variable modes: which memory types this barrier applies to; - Scope: how far this barrier applies. Note that unlike in SPIR-V, the "Storage Semantics" and the "Memory Semantics" are split into two different attributes so we can use variable modes for the former. NIR passes that took barriers into consideration were also changed: - nir_opt_copy_prop_vars: clean up the values for the modes of an ACQUIRE barrier. The effect of copy propagation is to "pull up a load" (by not performing it), which is what ACQUIRE restricts. - nir_opt_dead_write_vars and nir_opt_combine_writes: clean up the pending writes for the modes of a RELEASE barrier. The effect of dead-write elimination is to "push down a store", which is what RELEASE restricts. - nir_opt_access: treat ACQUIRE and RELEASE as a full barrier for the modes. This is conservative, but since this is a GL-specific pass, it doesn't make a difference for now. v2: Fix the scoped barrier handling in copy propagation. (Jason) Add scoped barrier handling to nir_opt_access and nir_opt_combine_writes. (Rhys) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-24 11:39:55 -07:00
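The acquire/release ordering described in the scoped_memory_barrier commit has a close analogue in C11 atomics. The sketch below is an illustration of that ordering only, not NIR or Mesa code; `payload`, `ready`, `publish` and `try_consume` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* plain data guarded by the flag below */
static atomic_bool ready;  /* zero-initialized, so it starts out false */

/* Writer side: the release store keeps the payload write from being
 * "pushed down" past the flag update. */
static void
publish(int value)
{
   payload = value;
   atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader side: the acquire load keeps the payload read from being
 * "pulled up" above the flag check. */
static bool
try_consume(int *out)
{
   assert(out);
   if (!atomic_load_explicit(&ready, memory_order_acquire))
      return false;
   *out = payload;
   return true;
}
```

A compiler pass that forwarded an earlier known value of `payload` across the acquire load would break the reader, which is the kind of transformation the commit message says copy propagation must avoid.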
Jason Ekstrand	0ebe89459c	spirv/info: Add a memorymodel_to_string helper Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-24 11:39:55 -07:00
Dylan Baker	cdff9b00e1	docs: Add release note about scons deprecation	2019-10-24 18:33:50 +00:00
Dylan Baker	61ed9891c7	scons: Also print a deprecation warning on windows This warning is different. Meson support for windows is less mature than for other platforms, and the goal here is to alert people that eventually we plan to drop scons and move to meson, and that they should try out meson and report issues. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-24 18:33:50 +00:00
Dylan Baker	54053bc8d0	scons: Print a deprecation warning about using scons on not windows At this point meson should be able to handle all of the non-windows platforms just fine; we'd like to be able to stop maintaining scons for those platforms sooner than later. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-24 18:33:50 +00:00
Dylan Baker	79e73887e7	scons: Use print_function in SConstruct This ensures that we get python3's print() function behavior even in python2, instead of python2's print statement behavior. We'll be using this in the next patch. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-24 18:33:50 +00:00
Adam Jackson	2e9aef4651	gallium: Fix a bunch of undefined left-shifts in u_format_* Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2019-10-24 14:21:51 -04:00
Samuel Pitoiset	4b17311e52	radv: compute the number of records correctly for vertex buffers On GFX8 the number of records is in bytes while on other chips it's in units of "stride". Fixes dEQP-VK.robustness.vertex_access..draw.vertex_ on RAVEN. Tested on GFX6, GFX8, GFX10 and RAVEN. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-24 17:14:43 +02:00
Michel Dänzer	75cc8c0b82	gitlab-ci: Enable UBSan for the meson-vulkan job It doesn't report any errors now. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-24 16:21:48 +02:00
Michel Dänzer	9ffe477412	util/tests: Avoid int64_t overflow issues in fast_idiv_by_const test Flagged by UBSan: ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:233:14: runtime error: negation of -2147483648 cannot be represented in type 'int'; cast to an unsigned type to negate this value to itself #0 0x55b4c1a2a428 in rand_sint ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:233 #1 0x55b4c1a2ad3a in random_sdiv_test ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:308 #2 0x55b4c1a2b837 in fast_idiv_by_const_int32_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:410 #3 0x55b4c1abc13f in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2402 #4 0x55b4c1aa7a4d in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2438 #5 0x55b4c1a4ce57 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474 #6 0x55b4c1a4f530 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656 #7 0x55b4c1a51cbe in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774 #8 0x55b4c1a6d698 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649 #9 0x55b4c1abfd58 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2402 #10 0x55b4c1aab425 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2438 #11 0x55b4c1a64cba in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257 #12 0x55b4c1ae4b73 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233 
#13 0x55b4c1ae4a33 in main ../src/gtest/src/gtest_main.cc:37 #14 0x7ff172d1dbba in __libc_start_main ../csu/libc-start.c:308 #15 0x55b4c1a28dc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9) ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:309:52: runtime error: negation of -9223372036854775808 cannot be represented in type 'long int'; cast to an unsigned type to negate this value to itself #0 0x563b24dafd2d in random_sdiv_test ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:309 #1 0x563b24db0f0f in fast_idiv_by_const_int64_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:473 #2 0x563b24e41111 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2402 #3 0x563b24e2ca1f in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2438 #4 0x563b24dd1e29 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474 #5 0x563b24dd4502 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656 #6 0x563b24dd6c90 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774 #7 0x563b24df266a in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649 #8 0x563b24e44d2a in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2402 #9 0x563b24e303f7 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2438 #10 0x563b24de9c8c in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257 #11 
0x563b24e69b45 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233 #12 0x563b24e69a05 in main ../src/gtest/src/gtest_main.cc:37 #13 0x7f9a90330bba in __libc_start_main ../csu/libc-start.c:308 #14 0x563b24daddc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9) v2: * Use INT64_MIN instead of LLONG_MIN (Jason Ekstrand) * Simpler test for INT64_MIN result from rand_sint (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-24 16:21:27 +02:00
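The UBSan report in the fast_idiv_by_const test commit suggests casting to an unsigned type before negating. A minimal sketch of that idea follows; `magnitude_u64` is a hypothetical helper, and this is not the change the commit itself made (its v2 notes mention a simpler test for an INT64_MIN result from rand_sint).

```c
#include <stdint.h>

/* Hypothetical helper: magnitude of a signed 64-bit value, computed
 * without ever negating in a signed type. */
static uint64_t
magnitude_u64(int64_t x)
{
   /* Converting to uint64_t is well defined for every input, and
    * unsigned negation wraps modulo 2^64, so INT64_MIN maps to 2^63
    * instead of overflowing. */
   return x < 0 ? -(uint64_t)x : (uint64_t)x;
}
```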
Michel Dänzer	69420c28bd	util: Use uint64_t for shifting left in sign_extend and strunc Shifting int64_t values left into the sign bit has undefined behaviour: ../src/util/fast_idiv_by_const.c:175:14: runtime error: left shift of 131 by 56 places cannot be represented in type 'long int' #0 0x561337ed10c1 in sign_extend ../src/util/fast_idiv_by_const.c:175 #1 0x561337ed1335 in util_compute_fast_sdiv_info ../src/util/fast_idiv_by_const.c:239 #2 0x561337e17519 in fast_idiv_by_const_int8_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:357 #3 0x561337ea815d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2402 #4 0x561337e93a6b in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2438 #5 0x561337e38e75 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474 #6 0x561337e3b54e in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656 #7 0x561337e3dcdc in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774 #8 0x561337e596b6 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649 #9 0x561337eabd76 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2402 #10 0x561337e97443 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2438 #11 0x561337e50cd8 in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257 #12 0x561337ed0b91 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233 #13 0x561337ed0a51 in main ../src/gtest/src/gtest_main.cc:37 #14 0x7f85ba483bba in 
__libc_start_main ../csu/libc-start.c:308 #15 0x561337e14dc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9) ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:51:14: runtime error: left shift of negative value -63 #0 0x55fc3c0e67cc in strunc ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:51 #1 0x55fc3c0e6d93 in smul_high ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:140 #2 0x55fc3c0e7067 in fast_sdiv ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:181 #3 0x55fc3c0e858b in fast_idiv_by_const_int8_Test::TestBody() ../src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test.cpp:358 #4 0x55fc3c17915d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2402 #5 0x55fc3c164a6b in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2438 #6 0x55fc3c109e75 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474 #7 0x55fc3c10c54e in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656 #8 0x55fc3c10ecdc in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774 #9 0x55fc3c12a6b6 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649 #10 0x55fc3c17cd76 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2402 #11 0x55fc3c168443 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2438 #12 0x55fc3c121cd8 in testing::UnitTest::Run() 
../src/gtest/src/gtest.cc:4257 #13 0x55fc3c1a1b91 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233 #14 0x55fc3c1a1a51 in main ../src/gtest/src/gtest_main.cc:37 #15 0x7fd224759bba in __libc_start_main ../csu/libc-start.c:308 #16 0x55fc3c0e5dc9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/util/tests/fast_idiv_by_const/fast_idiv_by_const_test+0x96dc9) v2: * Use two casts instead of changing the argument type (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-24 16:21:01 +02:00
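The "two casts" approach named in the v2 note of the sign_extend commit can be sketched as follows. `sign_extend_sketch` is a hypothetical name and this is not a copy of the Mesa function. It assumes the usual two's-complement conversion back to int64_t and an arithmetic right shift for negative values, both of which are implementation-defined in C rather than undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend the low `bits` bits of x to 64 bits. */
static int64_t
sign_extend_sketch(int64_t x, unsigned bits)
{
   assert(bits >= 1 && bits <= 64);
   unsigned shift = 64 - bits;
   /* First cast: shift left as uint64_t, where moving a 1 into the top
    * bit is well defined. Second cast: back to int64_t so the right
    * shift replicates the sign bit. */
   return (int64_t)((uint64_t)x << shift) >> shift;
}
```

With the value from the report, `sign_extend_sketch(131, 8)` treats 131 (0x83) as an 8-bit quantity and yields -125 on such platforms.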
Michel Dänzer	65e376a721	gallium/util: Cast to target type before shifting left Otherwise a smaller type may be promoted to int, which can hit undefined behaviour: ../src/gallium/auxiliary/util/u_half.h:126:29: runtime error: left shift of 32768 by 16 places cannot be represented in type 'int' #0 0x5646ff63d488 in util_half_to_float ../src/gallium/auxiliary/util/u_half.h:126 #1 0x5646ff63d749 in _mesa_half_to_float ../src/util/half_float.c:145 #2 0x5646ff54d557 in nir_const_value_negative_equal ../src/compiler/nir/nir_instr_set.c:372 #3 0x5646ff44d29a in const_value_negative_equal_test_nir_type_float16_trivially_true_Test::TestBody() ../src/compiler/nir/tests/negative_equal_tests.cpp:121 #4 0x5646ff505c05 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2402 #5 0x5646ff4f1513 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2438 #6 0x5646ff4979b5 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474 #7 0x5646ff49a08e in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656 #8 0x5646ff49c81c in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774 #9 0x5646ff4b81f6 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649 #10 0x5646ff50981e in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2402 #11 0x5646ff4f4eeb in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2438 #12 0x5646ff4af818 in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257 #13 0x5646ff52e639 in RUN_ALL_TESTS() 
../src/gtest/include/gtest/gtest.h:2233 #14 0x5646ff52e4f9 in main ../src/gtest/src/gtest_main.cc:37 #15 0x7f6bacb78bba in __libc_start_main ../csu/libc-start.c:308 #16 0x5646ff448019 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/compiler/nir/negative_equal+0x17c019) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-24 16:20:31 +02:00
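The integer-promotion issue behind the gallium/util commit can be shown in isolation. In the sketch below, `half_sign_to_float_sign` is a hypothetical helper, not the code in u_half.h: a 16-bit operand is promoted to int before the shift, so the cast to the 32-bit target type has to come first.

```c
#include <stdint.h>

/* Hypothetical helper: move the sign bit of a 16-bit half float to the
 * sign-bit position of a 32-bit float. */
static uint32_t
half_sign_to_float_sign(uint16_t half_bits)
{
   /* (half_bits & 0x8000) << 16 would be evaluated in int and overflow
    * for 0x8000. Casting first makes the shift happen in uint32_t. */
   return (uint32_t)(half_bits & 0x8000) << 16;
}
```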
Michel Dänzer	2b1b56cb3a	intel/fs: Check for NULL key in fs_visitor constructor Flagged by UBSan: ../src/intel/compiler/brw_fs_visitor.cpp:986:20: runtime error: member access within null pointer of type 'const struct brw_base_prog_key' #0 0x559fadb48556 in fs_visitor::init() ../src/intel/compiler/brw_fs_visitor.cpp:986 #1 0x559fadb46db3 in fs_visitor::fs_visitor(brw_compiler const, void, void, brw_base_prog_key const, brw_stage_prog_data, nir_shader const, unsigned int, int, brw_vue_map const) ../src/intel/compiler/brw_fs_visitor.cpp:962 #2 0x559fad9c7cd8 in saturate_propagation_fs_visitor::saturate_propagation_fs_visitor(brw_compiler, brw_wm_prog_data, nir_shader) (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/fs_saturate_propagation+0x61bcd8) #3 0x559fad9960a1 in saturate_propagation_test::SetUp() ../src/intel/compiler/test_fs_saturate_propagation.cpp:65 #4 0x559fadd7a32d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2402 #5 0x559fadd65c3b in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2438 #6 0x559fadd0af75 in testing::Test::Run() ../src/gtest/src/gtest.cc:2470 #7 0x559fadd0d8a4 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656 #8 0x559fadd10032 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774 #9 0x559fadd2ba0c in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649 #10 0x559fadd7df46 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2402 #11 0x559fadd69613 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool 
(testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2438 #12 0x559fadd2302e in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257 #13 0x559fadda2d61 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233 #14 0x559fadda2c21 in main ../src/gtest/src/gtest_main.cc:37 #15 0x7fe8f6748bba in __libc_start_main ../csu/libc-start.c:308 #16 0x559fad9950f9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/fs_saturate_propagation+0x5e90f9) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-24 16:20:04 +02:00
Michel Dänzer	41623be20e	intel/compiler: Cast to target type before shifting left Otherwise a smaller type may be promoted to int, which can hit undefined behaviour: ../src/intel/compiler/brw_packed_float.c:66:17: runtime error: left shift of 128 by 24 places cannot be represented in type 'int' #0 0x5604a03969aa in brw_vf_to_float ../src/intel/compiler/brw_packed_float.c:66 #1 0x5604a0391305 in vf_float_conversion_test_test_vf_to_float_Test::TestBody() ../src/intel/compiler/test_vf_float_conversions.cpp:70 #2 0x5604a041a323 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2402 #3 0x5604a0405c31 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2438 #4 0x5604a03ab03b in testing::Test::Run() ../src/gtest/src/gtest.cc:2474 #5 0x5604a03ad714 in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656 #6 0x5604a03afea2 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774 #7 0x5604a03cb87c in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649 #8 0x5604a041df3c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2402 #9 0x5604a0409609 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2438 #10 0x5604a03c2e9e in testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257 #11 0x5604a0442d57 in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233 #12 0x5604a0442c17 in main ../src/gtest/src/gtest_main.cc:37 #13 0x7f9a1983dbba in __libc_start_main ../csu/libc-start.c:308 #14 0x5604a0390d89 in 
_start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/vf_float_conversions+0x8dd89) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-24 16:19:23 +02:00
Michel Dänzer	59b72bdfb4	intel/compiler: Don't left-shift by >= the number of bits of the type To avoid it, use the modulo of the number of bits in the value being shifted, which is presumably what ended up happening on x86. Flagged by UBSan: ../src/intel/compiler/brw_eu_validate.c:974:33: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int' #0 0x561abb612ab3 in general_restrictions_on_region_parameters ../src/intel/compiler/brw_eu_validate.c:974 #1 0x561abb617574 in brw_validate_instructions ../src/intel/compiler/brw_eu_validate.c:1851 #2 0x561abb53bd31 in validate ../src/intel/compiler/test_eu_validate.cpp:106 #3 0x561abb555369 in validation_test_source_cannot_span_more_than_2_registers_Test::TestBody() ../src/intel/compiler/test_eu_validate.cpp:486 #4 0x561abb742651 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2402 #5 0x561abb72e64d in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test, void (testing::Test::)(), char const) ../src/gtest/src/gtest.cc:2438 #6 0x561abb6d5451 in testing::Test::Run() ../src/gtest/src/gtest.cc:2474 #7 0x561abb6d7b2a in testing::TestInfo::Run() ../src/gtest/src/gtest.cc:2656 #8 0x561abb6da2b8 in testing::TestCase::Run() ../src/gtest/src/gtest.cc:2774 #9 0x561abb6f5c92 in testing::internal::UnitTestImpl::RunAllTests() ../src/gtest/src/gtest.cc:4649 #10 0x561abb74626a in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2402 #11 0x561abb732025 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl, bool (testing::internal::UnitTestImpl::)(), char const) ../src/gtest/src/gtest.cc:2438 #12 0x561abb6ed2b4 in 
testing::UnitTest::Run() ../src/gtest/src/gtest.cc:4257 #13 0x561abb768b3b in RUN_ALL_TESTS() ../src/gtest/include/gtest/gtest.h:2233 #14 0x561abb7689fb in main ../src/gtest/src/gtest_main.cc:37 #15 0x7f525e5a9bba in __libc_start_main ../csu/libc-start.c:308 #16 0x561abb538ed9 in _start (/home/daenzer/src/mesa-git/mesa/build-amd64-sanitize/src/intel/compiler/eu_validate+0x1b8ed9) Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-24 16:16:49 +02:00
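The modulo approach described in the brw_eu_validate commit can be sketched like this. `shift_left_mod64` is a hypothetical helper, not the code in brw_eu_validate.c.

```c
#include <stdint.h>

/* Hypothetical helper: left shift whose count is reduced modulo the
 * width of the type, so a count of 64 never reaches the shift operator. */
static uint64_t
shift_left_mod64(uint64_t value, unsigned count)
{
   return value << (count % 64);
}
```

Note the consequence of this choice: a count of 64 leaves the value unchanged rather than producing zero, which matches the behaviour the commit message presumes for x86.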
Eric Engestrom	47571a01ec	anv: fix error message `strerror()` takes an `errno`, not the negative value returned by the `ioctl()`. Instead of fixing this as `"%s", strerror(errno)`, let's just use the `"%m"` shortcut for it. Fixes: `2b5f30b1d9` ("anv: implement VK_INTEL_performance_query") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-24 13:57:40 +00:00
Eric Engestrom	8d43e2b2de	meson: add -Werror=empty-body to disallow `if(x);` This would have prevented a bug in MR 2058 [1]; with that MR fixed, nothing else uses empty-body blocks, so let's just forbid them altogether. [1] https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2058#note_237880 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 14:54:09 +01:00
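The kind of mistake that `-Werror=empty-body` rejects is easiest to see next to the intended code. The function below is a hypothetical example, not taken from MR 2058.

```c
/* Hypothetical example. With a stray semicolon the guard would read
 *
 *    if (count > limit);
 *       count = limit;
 *
 * and the assignment would run unconditionally. GCC's -Wempty-body
 * flags the empty `if` body, and -Werror=empty-body turns that warning
 * into a build failure. The intended form is: */
static int
clamp_to_limit(int count, int limit)
{
   if (count > limit)
      count = limit;
   return count;
}
```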
Eric Engestrom	1177151b6d	llvmpipe: avoid generating empty-body blocks Suggested-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 14:54:09 +01:00
Eric Engestrom	abe32f56f5	llvmpipe: avoid compiling no-op block on release builds Suggested-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-24 14:54:09 +01:00
Thomas Hellstrom	91146c0796	winsys/svga: Limit the maximum DMA hardware buffer size The kernel total GMR/DMA size is limited, but it's definitely possible for the kernel to allow a larger buffer allocation to succeed, but command submission using that buffer as a GMR would fail typically causing an application crash. So have the winsys limit the size of GMR/DMA buffers. The pipe driver will then resort to allocating smaller buffers and perform the DMA transfer in multiple bands, also allowing for the pre-flush mechanism to kick in. This avoids the related application crashes. Fixes: `e7843273fa` ("winsys/svga: Update to vmwgfx kernel module 2.1") Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-10-24 15:08:43 +02:00
Thomas Hellstrom	00db976905	svga: Fix banded DMA upload unmap Even with banded DMA uploads, st->hwbuf is always non-NULL, but when we've allocated a software buffer to hold the full upload, unmapping of the hardware buffer has already been done before svga_texture_transfer_unmap_dma(), and the code was performing an unmap of an already unmapped buffer. Fix this by testing for software buffer not present. Fixes: `a9c4a861d5` ("svga: refactor svga_texture_transfer_map/unmap functions") Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-10-24 15:08:43 +02:00
Tomeu Vizoso	3168b8defa	gitlab-ci: Update kernel for LAVA jobs to 5.4-rc4 Update to 5.4-rc4 so we can test Panfrost on devices with Mali T720 and T820. A bug was found that prevented things working at all on RK3288 devices, so we carry a patch for now in my personal fork. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Daniel Stone <daniels@collabora.com>	2019-10-24 08:47:37 +02:00
Timothy Arceri	1961653c89	glsl: remove propagate_invariance() call from the linker This was added in `586f4a42e7` and became redundant with `34ab9b0947` Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-24 13:24:49 +11:00
Timothy Arceri	922801b77d	nir: improve nir_variable packing Before: /* size: 136, cachelines: 3, members: 10 */ After: /* size: 128, cachelines: 2, members: 10 */ Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-10-24 13:24:40 +11:00
Timothy Arceri	c412ff426b	nir: fix nir_variable_data packing Before: /* size: 60, cachelines: 1, members: 29 */ After: /* size: 56, cachelines: 1, members: 29 */ Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-10-24 13:22:59 +11:00
Marek Olšák	fff884e09d	radeonsi/nir: implement pipe_screen::finalize_nir Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-23 21:12:52 -04:00
Marek Olšák	92196fe74b	st/mesa: use pipe_screen::finalize_nir Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-23 21:12:52 -04:00
Marek Olšák	43efccb657	tgsi_to_nir: use pipe_screen::finalize_nir Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-23 21:12:52 -04:00
Marek Olšák	fb04e5da97	gallium: add pipe_screen::finalize_nir Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-23 21:12:52 -04:00
Marek Olšák	8a0dd0af3f	st/mesa: update VS shader_info for NIR after lowering passes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-23 21:12:52 -04:00
Marek Olšák	28199aeee5	st/mesa: assign driver locations for VS inputs for NIR before caching fix up edge flags in the NIR pass, because st/mesa doesn't touch the inputs after caching Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-23 21:12:52 -04:00
Marek Olšák	eaffdad108	st/mesa: don't lower_global_vars_to_local for VS if there are no dead inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-23 21:12:52 -04:00
Marek Olšák	3634dca99a	st/mesa: move some NIR lowering before shader caching Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-23 21:12:52 -04:00
Marek Olšák	c2efd2cbfb	util/u_queue: skip util_queue_finish if num_threads is 0 This fixes a deadlock in pthread_barrier_destroy. Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-23 21:11:17 -04:00
Marek Olšák	e096011def	util/disk_cache: finish all queue jobs in destroy instead of killing them If there are queued shaders to be written to disk, wait for that. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-23 20:22:50 -04:00
Kenneth Graunke	8dadef2ec5	iris: Rework edgeflag handling We were relying on specific pass ordering in st to avoid setting inputs_read/outputs_written for edge flags. Instead, just assume that it happens and throw out the results we don't want. We should probably revisit this and try and add a vertex element property like I originally wanted so we can avoid having it be associated with the VS altogether.	2019-10-23 16:38:27 -07:00
Marek Olšák	6b166d6fb1	gallium/noop: implement get_disk_shader_cache and get_compiler_options trivial	2019-10-23 18:11:19 -04:00
Rhys Perry	fc04a2fc31	aco: take LDS into account when calculating num_waves pipeline-db (Vega): SGPRS: 344 -> 344 (0.00 %) VGPRS: 424 -> 524 (23.58 %) Spilled SGPRs: 84 -> 80 (-4.76 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 52812 -> 52484 (-0.62 %) bytes LDS: 135 -> 135 (0.00 %) blocks Max Waves: 56 -> 53 (-5.36 %) v2: consider WGP, rework to be clearer and apply the "maximum 16 workgroups per CU" limit properly v2: use "SIMD" instead of "EU" v2: fix spiller by introducing "Program::max_waves" v2: rename "lds_size" to "lds_limit" v3: make max_waves actually independent of register usage v3: fix issue where max_waves was way too high v3: use DIV_ROUND_UP(a, b) instead of max(a / b, 1) v3: rename "workgroups_per_cu" to "workgroups_per_cu_wgp" v4: fix typo from "workgroups_per_cu" rename Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v3)	2019-10-23 19:11:21 +01:00
Rhys Perry	08d510010b	aco: increase accuracy of SGPR limits SGPRs are allocated in groups of 16 on GFX8/GFX9. GFX10 allocates a fixed number of SGPRs and has 106 addressable SGPRs. pipeline-db (Vega): SGPRS: 5912 -> 6232 (5.41 %) VGPRS: 1772 -> 1780 (0.45 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 88228 -> 87904 (-0.37 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 559 -> 571 (2.15 %) pipeline-db (Navi): SGPRS: 341256 -> 363384 (6.48 %) VGPRS: 171536 -> 170960 (-0.34 %) Spilled SGPRs: 832 -> 581 (-30.17 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 14207332 -> 14190872 (-0.12 %) bytes LDS: 33 -> 33 (0.00 %) blocks Max Waves: 18072 -> 18251 (0.99 %) v2: unconditionally count vcc as an extra sgpr on GFX10+ v3: pass SGPRs rounded to 8 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-23 19:11:21 +01:00
Rhys Perry	7453c1adff	radv: round vgprs/sgprs before calculating max_waves Note that ACO doesn't correctly round SGPR counts on GFX8/GFX9. pipeline-db (ACO/Vega): SGPRS: 11000 -> 11000 (0.00 %) VGPRS: 3120 -> 3120 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 164328 -> 164328 (0.00 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 1125 -> 1000 (-11.11 %) v2: consider wave32 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-23 19:11:20 +01:00
Lionel Landwerlin	254d9976b6	docs: Add new Intel extension Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-23 19:07:34 +03:00
Erik Faye-Lund	8ae024d029	Revert "vc4: do not report alpha-test as supported" This reverts commit `a79b93269c`. Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>	2019-10-23 13:03:59 +02:00
Erik Faye-Lund	65328bd32d	Revert "v3d: do not report alpha-test as supported" This reverts commit `9d0523b569`. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>	2019-10-23 13:03:55 +02:00
Erik Faye-Lund	acf1bf47cc	Revert "nir: drop support for using load_alpha_ref_float" This reverts commit `5af272b474`. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>	2019-10-23 13:03:52 +02:00
Erik Faye-Lund	beb6639a9d	Revert "nir: drop unused alpha_ref_float" This reverts commit `e8095f2af0`. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jose Maria Casanova <jmcasanova@igalia.com>	2019-10-23 13:03:38 +02:00
Samuel Pitoiset	f11ea22666	radv: fix a performance regression with graphics depth/stencil clears I recently changed the slow depth/stencil clear path to make sure depth values are explicitly exported by the fragment shader. This is actually only useful when VK_EXT_depth_range_unrestricted is enabled. While this path is correct, it introduced a performance regression with Heroes of the Storm, Shadow of Mordor (Vulkan beta) and probably more titles. This is because it prevents the hardware from doing some optimizations like discarding fragments. This commit re-introduces the previous (a bit faster) slow depth/stencil clear path and it selects the unrestricted path only if VK_EXT_depth_range_unrestricted is enabled. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/863 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-23 10:23:47 +02:00
Samuel Pitoiset	7562a2cbe3	radv: fix vkUpdateDescriptorSets with inline uniform blocks descriptorCount is the number of bytes into the descriptor, so it shouldn't be used as an index. srcArrayElement/dstArrayElement specify the starting byte offset within the binding to copy from/to. This fixes new CTS tests: dEQP-VK.binding_model.descriptor_copy.*.inline_uniform_block_* dEQP-VK.binding_model.descriptor_copy.*.mix_3 dEQP-VK.binding_model.descriptor_copy.*.mix_array1 Fixes: `8d2654a419` ("radv: Support VK_EXT_inline_uniform_block.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-23 09:59:22 +02:00
Samuel Pitoiset	9c92a21fe5	radv/gfx10: fix 3D images GFX10 does act like GFX9 actually. This fixes dEQP-VK.glsl.texture_functions.query.texturesize.sampler3d_. Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-23 09:45:49 +02:00
Samuel Pitoiset	41ace1d939	radv/gfx10: re-enable fast depth/stencil clears with separate aspects It used to cause weird issues on GFX10 in the past with vkmark and Wreckfest, and they can't be reproduced now. Shadow Of Mordor (Vulkan beta) hits that path and it works fine. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-23 09:18:06 +02:00
Samuel Pitoiset	956d825ed8	radv: do not emit rbplus if attachments are undefined Fixes some crashes with dEQP-VK.geometry.layered.*.secondary_cmd_buffer on Raven and other chips that allow rbplus. This just prevents a crash and rbplus probably needs more work. Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-23 08:57:31 +02:00
Samuel Pitoiset	411ad8e7c5	radv: add an assertion in radv_gfx10_compute_bin_size() To prevent out of bounds access. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-23 08:33:12 +02:00
Samuel Pitoiset	f4ab58c1a0	radv: do not create meta pipelines with 16 samples The driver only supports up to 8 samples, so it's useless to create more pipelines than needed. This fixes a conditional jump reported by Valgrind on GFX10: ==194282== Conditional jump or move depends on uninitialised value(s) ==194282== at 0xDBF925A: radv_gfx10_compute_bin_size (radv_pipeline.c:3242) ==194282== by 0xDBF95A6: radv_pipeline_generate_binning_state (radv_pipeline.c:3334) ==194282== by 0xDBFC1A0: radv_pipeline_generate_pm4 (radv_pipeline.c:4440) ==194282== by 0xDBFD15E: radv_pipeline_init (radv_pipeline.c:4764) ==194282== by 0xDBFD23E: radv_graphics_pipeline_create (radv_pipeline.c:4788) ==194282== by 0xDBB95A3: create_pipeline (radv_meta_clear.c:114) ==194282== by 0xDBB9AC5: create_color_pipeline (radv_meta_clear.c:297) ==194282== by 0xDBBCF05: radv_device_init_meta_clear_state (radv_meta_clear.c:1277) ==194282== by 0xDB9ACD9: radv_device_init_meta (radv_meta.c:363) ==194282== by 0xDB7FE3A: radv_CreateDevice (radv_device.c:2080) This is caused by an out-of-bounds access of 'fmask_array' (i.e. index is 4 as for 16 samples). Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-23 08:33:08 +02:00
Lionel Landwerlin	2b5f30b1d9	anv: implement VK_INTEL_performance_query v2: Introduce the appropriate pipe controls Properly deal with changes in metric sets (using execbuf parameter) Record marker at query end v3: Fill out PerfCntr1&2 v4: Introduce vkUninitializePerformanceApiINTEL v5: Use new execbuf extension mechanism v6: Fix comments in genX_query.c (Rafael) Use PIPE_CONTROL workarounds (Rafael) Refactor on the last kernel series update (Lionel) v7: Only I915_PERF_IOCTL_CONFIG when perf stream is already opened (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:15 +00:00
Lionel Landwerlin	5ba6d9941b	intel/perf: add mdapi writes for register perf counters Those are not part of the OA reports. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:15 +00:00
Lionel Landwerlin	a2a1873a82	intel/genxml: add RPSTAT register for core frequency Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:15 +00:00
Lionel Landwerlin	e0ab658acd	intel/genxml: add generic perf counters registers We have 2 of those we can configure to source programmable events. Those are not part of the OA reports. Configuration happens in i915 through the metric set selected by the application. On the Mesa side we'll just sample those and do a diff. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:14 +00:00
Lionel Landwerlin	11c4bf9417	intel/perf: add support for querying kernel loaded configurations We use this as a communication mechanism between MDAPI & Anv. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:14 +00:00
Lionel Landwerlin	13f802291d	drm-uapi: Update headers from drm-next Pull new updates from drm-next as of the following commit: commit f1b4a9217efd61d0b84c6dc404596c8519ff6f59 Merge: 400e91347e1d f3a36d469621 Author: Dave Airlie <airlied@redhat.com> Date: Tue Oct 22 15:04:00 2019 +1000 Merge tag 'du-next-20191016' of git://linuxtv.org/pinchartl/media into drm-next Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-23 05:41:14 +00:00
Lionel Landwerlin	db7a6847dd	intel/perf: move registers to their own header Will conflict with the genxml RPSTAT register. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:14 +00:00
Lionel Landwerlin	e1d5d75257	intel/perf: extract register configuration We want to query the content of register configurations from the kernel. Let's pull this out of the query. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:14 +00:00
Lionel Landwerlin	a338b7d739	intel/perf: expose some utility functions The Vulkan performance query extension is a bit lower level than the GL one. Expose some of the functions to do the result accumulation directly in the Anv driver. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:14 +00:00
Lionel Landwerlin	a0e0e75db1	intel/perf: add mdapi marker helper A simple utility to put the marker at the right location. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-23 05:41:14 +00:00
Kenneth Graunke	c352cdf970	st/mesa: Silence chatty debug printf Other debug_printf's in this file are in if (0) blocks. Trivial.	2019-10-22 18:01:41 -07:00
Chris Wilson	0899bf55d4	st/mesa: Map MESA_FORMAT_RGB_UNORM8 <-> PIPE_FORMAT_R8G8B8_UNORM This is useful for PBO texture upload with GL_RGB and GL_UNSIGNED_BYTE. v2: Vasily Khoruzhick provided an update for the Lima CI expectations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-22 22:13:14 +00:00
Lionel Landwerlin	0dfa643feb	anv: fix unwind of vkCreateDevice fail We're skipping the context destruction in some cases, which in the grand scheme of things is not that important because closing device->fd will destroy the associated context as well. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Fixes: `b30e01aef5` ("anv: fix memory leak on device destroy")	2019-10-22 20:44:26 +00:00
Rhys Perry	118a32e5ba	Revert "aco: only emit waitcnt on loop continues if we there was some load or export" We don't properly pass on ctx.lgkm_cnt/ctx.barrier_imm/etc, so this waitcnt was necessary for barriers and correctly waiting for SMEM before s_dcache_wb on GFX10. Totals from affected shaders: SGPRS: 33200 -> 33200 (0.00 %) VGPRS: 31376 -> 31376 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 2431804 -> 2433956 (0.09 %) bytes LDS: 316 -> 316 (0.00 %) blocks Max Waves: 1609 -> 1609 (0.00 %) This reverts commit `2c050b49b3`. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	964ce47abc	aco: add missing bld.scc() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	c96289a70e	aco: keep can_reorder/barrier when combining addition into SMEM Affects 30 shaders in the pipeline-db (all youngblood). Totals from affected shaders: SGPRS: 2656 -> 2456 (-7.53 %) VGPRS: 2260 -> 2260 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 240680 -> 240944 (0.11 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 90 -> 90 (0.00 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	57c2cfb608	aco: add a few missing checks in value numbering Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	a8d0101d69	aco: use ds_read2_b64/ds_write2_b64 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	bdf47a1273	aco: properly combine additions into ds_write2_b64/ds_read2_b64 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	58d4aee5df	aco: fix sparse store_lds() p_extract_vector's second operand is in units of the definition size, not dwords. v2: move extract_subvector() to right before ds_write_helper Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	a856629e8f	aco: create load_lds/store_lds helpers We'll want these for GS, since VS->GS IO on Vega is done using LDS. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	a400928f4a	aco: fix 64-bit p_extract_vector on 32-bit p_create_vector Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Rhys Perry	f6f15859de	aco: small stage corrections Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-22 18:52:29 +00:00
Marek Olšák	f764725b3e	st/mesa: replace pipe_shader_state with tgsi_token* in st_vp_variant we don't need more than that Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-22 14:41:25 -04:00
Marek Olšák	a0b711d8e9	nir: allow nir_lower_uniforms_to_ubo to be run repeatedly for st/mesa Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-22 14:41:23 -04:00
Rob Clark	aa8515463e	freedreno/ir3: fixup register footprint fixup Small typo resulted in not converting footprint to vec4, meaning that we could potentially ask for quite a few more registers than required Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-22 17:46:19 +00:00
Rob Clark	4c060235a2	freedreno/ir3: handle scalarized varying inputs If the load_interpolated_input is scalarized, we would be too conservative about deciding the tex instruction wasn't a candidate to pre-fetch: vec1 32 ssa_0 = load_const (0x00000000 /* 0.000000 */) vec2 32 ssa_1 = intrinsic load_barycentric_pixel () (0) /* interp_mode=0 */ vec1 32 ssa_2 = intrinsic load_interpolated_input (ssa_1, ssa_0) (0, 0) /* base=0 */ /* component=0 */ /* packed:v_uv,v_uv1 */ vec1 32 ssa_3 = intrinsic load_interpolated_input (ssa_1, ssa_0) (0, 1) /* base=0 */ /* component=1 */ /* packed:v_uv,v_uv1 */ vec2 32 ssa_8 = vec2 ssa_2, ssa_3 vec4 32 ssa_9 = tex ssa_8 (coord), 0 (texture), 0 (sampler) Really we don't care that the texcoord components come from different load_interpolated_input instructions, just that they have consecutive varying offsets. Reported-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-22 17:46:19 +00:00
Daniel Schürmann	3a20ef4a32	aco: refactor value numbering Previously, we used one hashset per BB, so that we could always initialize the current hashset from the immediate dominator. This patch changes the behavior to a single hashmap using the block index per instruction to resolve dominance. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-22 17:18:59 +02:00
Erik Faye-Lund	3a71e1d27b	mesa/st: assert that lowering is supported Some of these lowerings aren't supported for drivers that support tessellation and geometry shaders. Let's add a couple of asserts to make it obvious if these have been enabled when it's not possible. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-22 12:07:23 +00:00
Michel Dänzer	793f6b30d9	gitlab-ci: Enable llvmpipe in ARM build jobs v2: * Use LLVM 8 from buster-backports v3: * Use LLVM 7 again for armhf, llvmpipe is still broken there with LLVM 8 Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-22 10:26:29 +00:00
Michel Dänzer	59e7f1413c	gitlab-ci: Update the meson cross file for LLVM_VERSION as well Cross builds don't use the llvm-config path from the native file.	2019-10-22 10:26:29 +00:00
Michel Dänzer	163ec5d808	gitlab-ci: Use native aarch64 runner for ARM build jobs This allows running the regression tests. One downside is that we can't easily build the Vulkan overlay layer, because only x86 binaries of the glslang validator are available. If that's important, we could either use those binaries via qemu, or build it from source. v2: * Add :amd64 suffix to existing debian-9/10 job names (Eric Engestrom) Acked-by: Eric Engestrom <eric.engestrom@intel.com> # v1	2019-10-22 10:26:29 +00:00
Michel Dänzer	c5aa2711a4	gitlab-ci: Explicitly list debian-10 in needs: for .deqp-test template Apparently needs: in a definition overwrites inherited ones. So .deqp-test effectively didn't declare needs: for debian-10, which means any jobs based on .deqp-test could spuriously run after the debian-10 job failed or was cancelled.	2019-10-22 10:26:29 +00:00
Michel Dänzer	38d42cf1d5	gitlab-ci: Bring ARM docker image install script in line with x86_64 Use https:// URLs in the APT configuration. Drop --no-install-recommends, the image generation template disables installation of recommended packages in /etc/apt/apt.conf. Run apt-get autoremove at the end, cleaning up packages which were installed to satisfy dependencies but are no longer needed. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-22 10:26:29 +00:00
Michel Dänzer	e3c7e04dfa	gitlab-ci: Sort ARM docker image packages in alphabetical order No functional change. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-22 10:26:29 +00:00
Samuel Pitoiset	a13320370e	radv: fix updating bound fast ds clear values with different aspects On GFX9, the driver is able to do an optimized fast depth/stencil clear with only one aspect (ie. clear the stencil part of a depth/stencil image). When this happens, the driver should only update the clear values of the given aspect. Note that it's currently only supported on GFX9 but I have some local patches that extend this optimized path for other gens. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1967 Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-22 11:16:13 +02:00
Sagar Ghuge	97e6d34e66	intel/compiler: Refactor disassembly of sources in 3src instruction Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Sagar Ghuge	18b28b5654	intel/compiler: Don't move immediate in register On Gen12, we support mixed mode HF/F operands, and 3-source instructions also support immediate values, so keep the immediate as it is if it fits in a 16-bit field. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Sagar Ghuge	bf943bdf24	intel/compiler: Set bits according to source file On Gen >= 12, if src0 or src2 holds an immediate value, we need to set the src[0/2]_is_imm bits instead of the register file. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Sagar Ghuge	c018c5a339	intel/compiler: Add Immediate support for 3 source instruction On Gen >= 10, either src0 or src2 can use a 16-bit immediate value, but not both. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-21 20:32:43 -07:00
Eric Anholt	fb9362c6fb	ci: Disable lima until its farm can get fixed. It's been throwing the following error today: "<Fault -32603: 'Internal Server Error (contact server administrator for details): could not extend file "base/17952/18226": No space left on device\nHINT: Check free disk space.\n'>" Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-10-21 20:31:34 -07:00
Sagar Ghuge	7fb75ddfa7	intel: Add missing entry for brw_nir_lower_alpha_to_coverage in Makefile Fixes: `7ecfbd4f6d` ("nir: Add alpha_to_coverage lowering pass") Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-21 16:19:24 -07:00
Dave Airlie	bde08ce4d7	llvmpipe: handle compute shader launch with 0 threads If you set LP_NUM_THREADS=0 compute shaders would hang, just execute the workloads in sequence if we have no threads in the pool. Fixes: `1b24e3ba75` ("llvmpipe: add compute threadpool + mutex") Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-10-21 22:51:23 +00:00
Marijn Suijten	0141a4cdc0	freedreno/ir3: Add missing ir3_nir_lower_tex_prefetch.c to Android.mk This file is created in `2a0d45ae6c` but addition to android makefiles was omitted. It breaks the build with missing references which are defined in this file. List the file in ir3_SOURCES to make the build succeed. Signed-off-by: Marijn Suijten <marijns95@gmail.com>	2019-10-21 22:43:00 +00:00
Samuel Pitoiset	39760793b5	ac/llvm: fix ac_to_integer_type() for 32-bit const addr space pointers This fixes some crashes with dEQP-VK.descriptor_indexing.* when read_first_invocation has its source from a descriptor. Most of these tests still fail because of an LLVM bug (they work with ACO). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-21 22:32:01 +02:00
Rhys Perry	73184e51d1	aco: run opt_algebraic in a loop Totals from affected shaders: SGPRS: 13920 -> 13656 (-1.90 %) VGPRS: 12972 -> 12960 (-0.09 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 1005680 -> 1000648 (-0.50 %) bytes LDS: 91 -> 91 (0.00 %) blocks Max Waves: 688 -> 688 (0.00 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-21 19:18:30 +00:00
Rhys Perry	132ae89b19	aco: use nir_lower_idiv_precise v7: rename _nv50/_llvm to _fast/_precise Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-21 18:49:46 +00:00
Rhys Perry	8b98d0954e	nir/lower_idiv: add new llvm-based path v2: make variable names snake_case v2: minor cleanups in emit_udiv() v2: fix Panfrost build failure v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature v4: remove nir_op_urcp v5: drop nv50 path v5: rebase v6: add back nv50 path v6: add comment for nir_lower_idiv_path enum v7: rename _nv50/_llvm to _fast/_precise v8: fix etnaviv build failure Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-21 18:49:46 +00:00
Sagar Ghuge	f729ecefef	intel/compiler: Remove emit_alpha_to_coverage workaround from backend Remove emit_alpha_to_coverage workaround from backend compiler and start using ported workaround from NIR. v2: Copy comment from brw_fs_visitor (Caio Marcelo de Oliveira Filho) Fixes piglit test on HSW: - arb_sample_shading-builtin-gl-sample-mask-mrt-alpha-to-coverage-combinations Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-21 11:27:29 -07:00
Sagar Ghuge	7ecfbd4f6d	nir: Add alpha_to_coverage lowering pass Importing this pass from fs_visitor::emit_alpha_to_coverage_workaround() in intel/compiler. v2 (Caio Marcelo de Oliveira Filho): - Track store output and sample mask instruction - Nest math instruction for more readability - Bail out early if no gl_SampleMask v3: (Caio Marcelo de Oliveira Filho): - Do math instructions after instruction block - Restructure code - Move pass under src/intel/compiler v4: (Caio Marcelo de Oliveira Filho): - Organize dither mask calculation Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-21 11:27:29 -07:00
Daniel Schürmann	0e4bd261b1	aco: ensure that uniform booleans are computed in WQM if their uses happen in WQM This fixes graphical corruption in SC2. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-21 17:39:46 +00:00
Dylan Baker	a9a9249288	meson: Require meson >= 0.49.1 when using icc or icl 0.49.0 can compile most of mesa with ICC or ICL, but not SWR without additional workarounds in our meson.build files. Bumping patch version is easier and shouldn't be a big burden anyway, especially to cover a niche compiler. The check originally only covered ICC, but now covers ICL as well. Fixes: `3740ffb59c` ("meson: add switches for SWR with MSVC") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1937 Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-21 17:21:57 +00:00
Juan A. Suarez Romero	d33fe2d5eb	docs: update calendar, add news item and link release notes for 19.1.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-10-21 19:13:55 +02:00
Juan A. Suarez Romero	62a0e8421e	docs: add release notes for 19.1.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `cc88eeb6ff`)	2019-10-21 19:10:52 +02:00
Juan A. Suarez Romero	7aa63ffe4f	docs: add release notes for 19.1.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `5c6d266c59`)	2019-10-21 19:10:49 +02:00
Timur Kristóf	7e5f87b533	aco/gfx10: Update constant addresses in fix_branches_gfx10. Due to a bug in GFX10 hardware, s_nop instructions must be added if a branch is at 0x3f. We already do this, but forgot to also update the constant addresses that come after this instruction. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-21 14:33:54 +00:00
Timur Kristóf	f380398f8f	aco/gfx10: Fix PS exports for SPI_SHADER_32_AR. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-21 14:33:54 +00:00
Timur Kristóf	1749953ea3	aco/gfx10: Wait for pending SMEM stores before loads Currently if you have an SMEM store followed by an SMEM load that loads the same location as was written, it won't work because the store isn't finished before the load is executed. This is NOT mitigated by an s_nop instruction on GFX10. Since we currently don't have proper alias analysis, this commit adds a workaround which will insert an s_waitcnt lgkmcnt(0) before each SSBO load if they follow a store. We should further refine this in the future when we can make sure to only add the wait when we load the same thing as has been stored. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-21 14:33:54 +00:00
Boris Brezillon	7fa5cd3ee3	panfrost: Fix the DISCARD_WHOLE_RES case in transfer_map() The current implementation does not synchronize on BO readiness when DISCARD_WHOLE_RES flag is set, which can lead to misbehaviours when the resource being updated is being used by one of the pending or already flushed batches. Adding unconditional BO synchronization would do the trick, but we can sometimes optimize this path by re-allocating a new BO instead of waiting for the existing one to be ready. Reported-by: Daniel Stone <daniels@collabora.com> Reported-by: Heinrich Fink <heinrich.fink@daqri.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-21 14:37:02 +02:00
Iago Toral Quiroga	2d5edf2558	st/mesa: only require ESSL 3.1 for geometry shaders According to the OES_geometry_shader spec, section Dependencies: "OpenGL ES 3.1 and OpenGL ES Shading Language 3.10 are required." Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-21 09:09:15 +00:00
Lepton Wu	f4ba31ff50	egl/android: Remove our own reference to buffers. We currently don't maintain it correctly, and the buffer gets leaked if the surface is destroyed before swapping buffers. From Android frameworks/native/libs/nativewindow/include/system/window.h: The window holds a reference to the buffer between dequeueBuffer and either queueBuffer or cancelBuffer, so clients only need their own reference if they might use the buffer after queueing or canceling it. v2: Remove our own reference. Fixes: `0212db3504` ("egl/android: Cancel any outstanding ANativeBuffer in surface destructor") Reviewed-by: Chia-I Wu <olvaffe@gmail.com> (v1) Reviewed-By: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2019-10-21 07:50:31 +00:00
Samuel Pitoiset	b72205a4c1	radv: advertise VK_KHR_spirv_1_4 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-21 09:21:40 +02:00
Samuel Pitoiset	b139198b06	radv: do not dump descriptors twice in hang reports If a pipeline has both graphics and compute, the descriptors are the same. While we are at it, use queue->device for simplicity. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-21 08:50:39 +02:00
Samuel Pitoiset	cf5e55558e	radv: dump trace files earlier if a GPU hang is detected To make sure a trace file is generated in case the driver crashes during the hang report generation (which happens sometimes). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-21 08:50:39 +02:00
Samuel Pitoiset	bc2319deb2	radv: print which ring is dumped in hang reports Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-21 08:50:39 +02:00
Samuel Pitoiset	076f9dce7c	radv: do not print useless descriptors info in hang reports This information has never been useful. All descriptors are already dumped with colors etc, and it's more useful. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-21 08:50:39 +02:00
Samuel Pitoiset	9da94e510c	radv: enable VK_KHR_shader_float_controls on GFX6-GFX7 Disable 16-bit features because fp16 isn't exposed on these chips. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-21 08:47:28 +02:00
Alyssa Rosenzweig	4c9b9ed5f9	panfrost/ci: Update expectations list A bunch of blend tests fixed on T760. A single blend test regressed on both T760/T860 but I am unable to reproduce locally so am just documenting the regression and moving on. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	b8c4fb235e	pan/midgard: Implement SIMD-aware dead code elimination We would like to eliminate not just entire dead instructions, but also dead components, which increases scheduler flexibility (since some vector instructions can become scalar after eliminating dead components). This also will allow better RA in the future. Results are meh. total instructions in shared programs: 3453 -> 3451 (-0.06%) instructions in affected programs: 60 -> 58 (-3.33%) helped: 2 HURT: 0 total bundles in shared programs: 1826 -> 1824 (-0.11%) bundles in affected programs: 33 -> 31 (-6.06%) helped: 2 HURT: 0 total quadwords in shared programs: 3144 -> 3144 (0.00%) quadwords in affected programs: 0 -> 0 helped: 0 HURT: 0 total registers in shared programs: 321 -> 321 (0.00%) registers in affected programs: 45 -> 45 (0.00%) helped: 11 HURT: 11 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 16.67% max: 50.00% x̄: 39.70% x̃: 50.00% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for registers value: -0.45 0.45 95% mean confidence interval for registers %-change: -1.87% 62.18% Inconclusive result (value mean confidence interval includes 0). total threads in shared programs: 445 -> 447 (0.45%) threads in affected programs: 2 -> 4 (100.00%) helped: 1 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	6c4b97011b	pan/midgard: Create dependency graph bytewise This allows for vec16 dependencies in the scheduler, not that we have any yet (thankfully). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	825f11e739	pan/midgard: Handle nontrivial masks in texture RA The texture instruction has a mask we need to take into account. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	d1d3411ba5	pan/midgard: Implement per-byte liveness tracking Now that we have a notion of byte masks, liveness tracking can be updated to reflect this extra granularity without loss of correctness. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	43fd730fc4	pan/midgard: Simplify mir_bytemask_of_read_components There are easy ways to iterate sources! Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	e9202ff3cb	pan/midgard: Report byte masks for read components Read component masks don't have a particular type associated, since the type of the ALU operation may not match the type of the operands in question. So let's generate byte masks instead, and update the rest of the compiler to use byte masks when analyzing reads. Preparation for mixed types. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	d079631248	pan/midgard: Add helpers for manipulating byte masks There are essentially two formats of masks in play beginning with this commit: masks per-channel and masks per-byte. The former make sense within a given fixed-size instruction; the latter are typesize-independent. It turns out you need the latter to meaningfully manipulate instructions containing multiple sizes (which is quite possible with ALU operations). Similarly, we have mir_srcsize. We calculate the size of the source by analyzing the size of the instruction itself and stepping down if there is a half-modifier. Finally, we have mir_round_bytemask_down, for when we want to take a byte mask and "round it down" to a given component size, so that we can use it as a component mask. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	e981b69484	pan/midgard: Implement OP_IS_STORE with table ..rather than open-coding. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	8e31b14858	pan/midgard: Tableize load/store ops This will allow us to encode properties about the load/store ops like we do for ALU ops. We include now properties about whether we have a store, and if there are special cases on the load/store op. We also tag each instruction by its natural size... this is probably not totally right, but it's a start. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	5952add9a9	pan/midgard: Factor out mir_get_alu_src This helper is used in a bunch of places ... might as well make that common. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	f77ea9798d	pan/midgard/disasm: Fix printing 8-bit/16-bit masks The trick is realizing even with a destination override, the masks are encoded in the same mode as the instruction itself, rather than stepping down. The override means that the smaller type is used, but the mask is parsed as if it were the higher type. Overriding down is printed by blindly doing this. Overriding up can be thought of as printing in the upper size, but shifting the alphabet to use the upper half, i.e. shifting xyzw to become abcd. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	d49fdca229	pan/midgard: Identify 64-bit atomic opcodes They are symmetric to their 32-bit counterparts, just shifted. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig	6601570ead	pan/midgard: Debug mir_insert_instruction_after_scheduled Add some comments explaining what's going on in a more natural flow in order to solve the actual bug. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Fixes: `2d914ebe81` ("pan/midgard: Fix memory corruption in register spilling")	2019-10-20 12:02:31 +00:00
Christian Gmeiner	a6de05a968	etnaviv: keep track of buffer valid ranges for PIPE_BUFFER This allows a write to proceed to an uninitialized part of a buffer even when the GPU is using the previously-initialized portions. Such a situation can be triggered with the following API usage example: glBufferSubData(..., offset, size, data1); glDrawArrays(...); // append new vertex data glBufferSubData(..., offset+size, size, data2); glDrawArrays(...); Same is done for freedreno, nouveau and radeon. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-20 09:03:06 +00:00
Christian Gmeiner	eab6d75066	etnaviv: store updated usage in pipe_transfer object Store the changed usage in the newly created transfer object. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-20 09:03:06 +00:00
Christian Gmeiner	cd4528563f	etnaviv: fix code style Fixes: `1194afdfe3` ("etnaviv: rework the stream flush to always go through the context flush") Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-10-20 10:20:22 +02:00
Lionel Landwerlin	b30e01aef5	anv: fix memory leak on device destroy v2: handle vma destruction if vkCreateDevice fails (Jordan) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1959 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-20 08:02:22 +00:00
Christian Gmeiner	f834656a41	etnaviv: fix compile warnings Fixes: `e5cc66dfad` ("etnaviv: Rework locking") Fixes: `1456aa61cc` ("etnaviv: Rework resource status tracking") Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-10-20 08:28:18 +02:00
Eric Anholt	d8741ad251	mesa: Redefine the RG formats as array formats. This is the layout used in the GL API, and maps directly to PIPE formats with no endianness trickery. As with the LA change, this fixes big-endian fetching from texbos. Also cleans up some endian shenanigans in shader images. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-20 04:39:48 +00:00
Eric Anholt	4f384ddf5f	gallium: Drop the unused PIPE_FORMAT_AL formats. Now that Mesa is also using an array format for LA, nothing was using these. (And, clearly, no HW driver had exposed them). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-20 04:39:48 +00:00
Eric Anholt	6a819cabe8	mesa: Replace MESA_FORMAT_L8A8/A8L8 UNORM/SNORM/SRGB with an array format. The array format is what the GL API wants (fixing texbos on big-endian), and matches directly to gallium's corresponding array format. The only driver exposing A8L8 was radeon/r200 in big-endian, where the HW's underlying format was trying to read as array and we needed to flip things around to make our packed format come out right (note that while the radeon format tables had both AL and LA, ChooseTextureFormat would only pick one of them based on endianness). v2: Don't make r200/radeon use endian swaps. v3: Rebase on dropping the r200 _be/_le format table removal patch v4: reword commit message to explain why we can drop both formats from radeon. Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2019-10-20 04:39:48 +00:00
Eric Anholt	236b478b2e	mesa: Replace the LA16_UNORM packed formats with one array format. The array format is what the GL API wants (and we made a mistake in the format returned for texbos on big-endian!), and it's exactly what the gallium-side PIPE_FORMAT_L16A16 is. The only downside is that dri_util tries to fall back to sampling RG16 using LA16, which doesn't have a match for big-endian any more. No HW drivers supported A16L16 anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-20 04:39:48 +00:00
Eric Anholt	1165e3f360	radeon: Drop the unused first arg of OUT_BATCH_RELOC. This was a trap when trying to figure out how to fit data bits into the reloc. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-20 04:39:48 +00:00
Eric Anholt	2a548cf92f	radeon: Fill in the TXOFFSET field containing the tile bits in our relocs. The first arg to OUT_BATCH_RELOC is ignored, we actually wanted these in the third arg. They're always 0 so far, so it didn't matter. v2: Reword commit message that I don't end up using the tile bits, but keep the commit as a cleanup anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2019-10-20 04:39:48 +00:00
Eric Anholt	ecddabfa76	r100/r200: factor out txformat/txfilter setup from the TFP path. No matter what, we deref the texFormat from the table, except for a mistake in cpp=4 where we pulled a 0 out of the table either way. v2: Rebase on dropping r200 table deduplication patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2019-10-20 04:39:48 +00:00
Vasily Khoruzhick	7ceafa4b40	lima: fix PP stack size PP stack size should be set to maximum PP stack size, not to stack size of last shader. Fixes: `27e7603c34` ("lima: fix ppir spill stack allocation") Tested-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-10-19 18:15:18 -07:00
Marijn Suijten	224b267282	freedreno/a5xx: enable a510 Kernel support for this GPU is added by the following series: https://patchwork.kernel.org/project/linux-arm-msm/list/?series=187609 In particular https://patchwork.kernel.org/patch/11189953/ Tested on Sony Xperia X and X Compact. Signed-off-by: Marijn Suijten <marijns95@gmail.com> Tested-by: AngeloGioacchino Del Regno <kholk11@gmail.com>	2019-10-19 16:48:24 +02:00
Prodea Alexandru-Liviu	48d617118a	Appveyor/Meson: Add build test of osmesa gallium Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com> Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-19 14:44:44 +00:00
Lionel Landwerlin	3f8f52b241	anv: fix vkUpdateDescriptorSets with inline uniform blocks With inline uniform blocks descriptor, the meaning of descriptorCount is a number of bytes to copy into the descriptor. Don't try to use that size as an index into the descriptor table. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `43f40dc7cb` ("anv: Implement VK_EXT_inline_uniform_block") Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1195 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-19 13:16:40 +03:00
Rob Clark	1cea76274e	freedreno/ir3: handle imad24_ir3 case in UBO lowering Similar to iadd, we can fold an added constant value from an imad24_ir3 into the load_uniform's constant offset. This avoids some cases where the addition of imad24_ir3 could otherwise be a regression in instr count. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	d9424e5821	freedreno/ir3: add imul24 opcode This maps to mul.s24 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	c7b8f16bee	freedreno/ir3: optimize immed 2nd src to mad We can't encode immed sources for cat3 (mad) instructions, but we can use const in first or third src. We handled this case already, but we weren't considering that we could lower immed to const. For manhattan: total instructions in shared programs: 35202 -> 34718 (-1.37%) instructions in affected programs: 14931 -> 14447 (-3.24%) helped: 90 HURT: 0 total full in shared programs: 2451 -> 2359 (-3.75%) full in affected programs: 653 -> 561 (-14.09%) helped: 69 HURT: 2 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 15:08:54 -07:00
Rob Clark	666b6236f7	freedreno/ir3: add rule to generate imad24 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	5e08f070f0	nir: add nir_lower_amul pass Lower amul to either imul or imul24, depending on whether 24b is enough bits to calculate an offset within the thing being dereferenced. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-10-18 15:08:54 -07:00
Rob Clark	1bdde31392	nir: add address calc related opt rules Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	6320e37d4b	nir: add amul instruction Used for address/offset calculation (ie. array derefs), where we can potentially use less than 32b for the multiply of array idx by element size. For backends that support `imul24`, this gives a lowering pass an easy way to find multiplies that potentially can be converted to `imul24`. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	0568761f8e	nir: Add a new ALU nir_op_imul24 Some hardware can do 24b multiply in a single instruction, but not 32b. However in most cases 24b is sufficient for address/offset calculation. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Eduardo Lima Mitev	bc2ccdc45a	freedreno/ir3: Handle newly added opcode nir_op_imad24_ir3 Simply emit an ir3_MAD_S24 instruction in the backend. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Eduardo Lima Mitev	32e5fbf47c	nir: Add a new ALU nir_op_imad24_ir3 ir3 compiler has a signed integer multiply-add instruction (MAD_S24) that is used for different offset calculations in the backend. Since we intend to move some of these calculations to NIR, we need a new ALU op that can directly represent it. Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	6ad442acae	freedreno/ir3: rename mul.s/mul.u to mul.s24/mul.u24, to better reflect that these are 24b multiply. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	ad8167c1e0	nir/search: fix the PoT helpers Otherwise, if the base type is (for example) uint32, we would incorrectly think that PoT optimizations could not apply. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2019-10-18 15:08:54 -07:00
Rob Clark	f30c256ec0	freedreno/ir3: enable pre-fs texture fetch for a6xx Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	72048dd799	turnip: add support for pre-fs texture fetch Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	a5afcc76d5	freedreno/a6xx: add support for pre-fs texture fetch Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Hyunjun Ko	e9450ad27d	freedreno/ir3: Add support for texture sampling pre-dispatch Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Eduardo Lima Mitev	2a0d45ae6c	freedreno/ir3: Add a NIR pass to select tex instructions eligible for pre-fetch The pass should run once at the end of shader compilation, for a4xx onwards. It iterates texture sampling instructions and marks those eligible for pre-dispatch by changing the tex op from 'tex' to 'tex_prefetch'. An instruction is eligible if: * The coordinate is a vector where all its components come from a shader input. * The order of the components matches exactly that of the input (no swizzles). * The instruction is in the 'main' function, and in the outermost block. The first two restrictions were arrived at empirically, so more testing could tighten or loosen them. The 3rd restriction is there to allow moving the instructions eligible for pre-dispatch to the beginning of the shader, so that we don't block the registers holding the result for too long. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	7d4213fe88	freedreno/ir3: force i/j pixel to r0.x It seems that pre-fs texture fetch only works if ij_pix ends up in r0.x. I've tried unknown zero bits, to no avail, and blob also seems to force r0.x when this feature is used. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	07e9bf564f	freedreno/ir3: add pre-dispatch tex fetch to disasm Useful to see in disassembly listing texture fetches that were moved to pre-dispatch. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	2b93eb9c76	freedreno/ir3: add dummy bary.f(ei) for pre-fs-fetch If the only use of varyings is a pre-shader texture-fetch, we still need to issue a bary.f with the end-input flag, otherwise we'll block further VS invocations, as the hw will think varying storage is still busy. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	392a309a55	freedreno/ir3: fixup register footprint to account for prefetch It is possible that the result of a pre-fs texture fetch is an output (or partially an output) of the FS. Since the meta:tex_prefetch instructions are dropped before the assembler, we need to account for this when we fixup the register footprint. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	482e1b9955	freedreno/ir3: add meta instruction for pre-fs texture fetch Add a placeholder instruction to track texture fetches made prior to FS shader dispatch. These, like meta:input instructions are scheduled before any real instructions, so that RA realizes their result values are live before the first real instruction. And to give legalize a way to track usage of fetched sample requiring (sy) sync flags. There is some related special handling for varying texcoord inputs used for pre-fs-fetch, so that they are not DCE'd and remain in linkage between FS and previous stage. Note that we could almost avoid this special handling by giving meta:tex_prefetch real src arguments, except that in the FS stage, inputs are actual bary.f/ldlv instructions. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	11e467c378	freedreno/ir3: don't DCE ij_pix if used for pre-fs-texture-fetch When we enable pre-dispatch texture fetch, we could have a scenario where the barycentric i/j coord sysval is not used in the shader, but only used for the varying fetch for the pre-dispatch texture fetch. In this case we need to take care not to DCE this sysval. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	af817a44c1	freedreno/ir3: track sysval slot for inputs Will be needed for special handling of SYSTEM_VALUE_BARYCENTRIC_PIXEL (ij_pix) when pre-fs texture fetch is enabled. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	35692fab86	freedreno/ir3: remove unused ir3_instruction::inout Not sure I remember how long this has been unused for. But it's unused now. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Hyunjun Ko	fd14788e1f	freedreno/ir3: Add data structures to support texture pre-fetch Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Rob Clark	766a68cdb9	freedreno: update registers Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Eduardo Lima Mitev	f1d4fadf1b	nir: Add new texop nir_texop_tex_prefetch This is like nir_texop_tex, but signals that the sampling coordinates are immutable during the shader stage, in a way that allows the HW that supports pre-dispatching sampling operations to pre-fetch the result prior to scheduling the shader stage. This is introduced to support the feature in Freedreno. Adreno HW from a4xx supports it. A NIR pass introduced later in this series will detect sampling operations that are eligible for pre-dispatch, and replace nir_texop_tex by this new op, to tell the backend to enable pre-fetch. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-18 21:11:54 +00:00
Eric Engestrom	27df3e015b	osmesa: add missing #include <stdint.h> Fixes: `281466332b` ("gallium/osmesa: Introduce a test.") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1947 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-18 22:07:21 +01:00
Dylan Baker	1ce23b5653	docs: Add new feature for compiling for windows with meson Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-18 13:02:58 -07:00
Dylan Baker	0b6b7ff3ca	appveyor: Move appveyor script into .appveyor directory This clears out the scripts directory completely Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-18 13:02:58 -07:00
Dylan Baker	fbb969b98a	appveyor: Add support for building llvmpipe with meson Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-18 13:02:58 -07:00
Dylan Baker	41b3eb08d9	docs: update meson docs for windows Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-18 13:02:58 -07:00
Dylan Baker	821cf6942a	meson: Use cmake to find LLVM when building for windows We don't use cmake normally because it always results in static linking. This is very problematic for *nix OSes which expect shared linking by default, but for windows this isn't a problem as LLVM doesn't support shared linking on windows anyway. Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-18 13:02:58 -07:00
Dylan Baker	b962c7c971	meson: Add support for wrapping llvm For building on Windows (when not using cygwin), users may want to use a binary wrap of LLVM, this provides a fallback to the LLVM dependency which may be used in this case Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-18 13:02:58 -07:00
Dylan Baker	dbd554ba05	meson/llvmpipe: Add dep_llvm to driver_swrast This fixes build errors in gl-gdi on windows when using llvmpipe Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-10-18 13:02:58 -07:00
Hal Gentz	fa611b07f1	Revert "egl: Add EGL_CONFIG_SELECT_GROUP_MESA ext." This reverts commit `173bc9d684`.	2019-10-18 18:41:51 +00:00
Hal Gentz	94386d476c	Revert "egl: Fixes transparency with EGL and X11." This reverts commit `90a19074b4`.	2019-10-18 18:41:51 +00:00
Hal Gentz	9997693960	Revert "egl: Puts RGBA visuals in the second config selection group." This reverts commit `a800d16e4f`.	2019-10-18 18:41:51 +00:00
Hal Gentz	4ef2c53755	Revert "egl: Configs w/o double buffering support have no `EGL_WINDOW_BIT`." This reverts commit `075a96aa92`.	2019-10-18 18:41:51 +00:00
Jonathan Marek	9a7a92c1ec	etnaviv: check NO_ASTC feature bit Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-18 19:30:41 +02:00
Jonathan Marek	15c5ec0024	etnaviv: fix TS samplers on GC7000L Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-18 19:23:59 +02:00
Jonathan Marek	ad48411d72	etnaviv: fix linear_nearest / nearest_linear filters on GC7000Lite MIN filter is only used when LOD MAX is at least 4 (I guess the 2 LSB don't actually exist). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-18 19:06:44 +02:00
Lucas Stach	95adc393eb	etnaviv: GC7000: flush TX descriptor and instruction cache The etnaviv kernel driver will only ever flush write caches. As both the TX descriptor and instruction cache are read caches they must be flushed from the user cmdstream at an appropriate time. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-10-18 19:06:39 +02:00
Lucas Stach	54dd288317	etnaviv: add linear texture support on GC7000 It's just a matter of writing the addressing mode into the texture descriptor. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-10-18 19:06:35 +02:00
Wladimir J. van der Laan	eda73d7127	etnaviv: GC7000: Texture descriptors Create a separate implementation file with texture-descriptor-based sampler views and sampler states. Initialize the one or the other based on the GPU. There is so little in common that this seemed more appropriate; keeping them as one type of state object would only be confusing. This commit is actually a combination of the original commit by Wladimir, fixes and TS implementation from Jonathan and changes to use softpin by Lucas. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-10-18 19:06:20 +02:00
Lucas Stach	5bc3fcf620	etnaviv: check for softpin availability on Halti5 devices Halti5 uses texture descriptors to control the samplers, and thus needs to know the GPU virtual address for the texture buffers to fill into the descriptor buffer. Without softpin userspace has no control over the GPU VM and also no way to fix up the texture descriptor buffer, so there is no point in creating a screen on a Halti5 device without softpin being available. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-10-18 19:05:25 +02:00
Lucas Stach	0bdf5420f1	etnaviv: drm: add softpin interface If softpin is available on the kernel side, we transparently replace the relocs with self-managed GPU virtual addresses. This allows the kernel to skip some work, as it doesn't need to touch the command stream anymore before submitting it to the hardware. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-10-18 19:05:21 +02:00
Marek Vasut	e5cc66dfad	etnaviv: Rework locking Replace the per-screen locking of flushing with per-context one and add per-context lock around command stream buffer accesses, to prevent cross-context flushing from corrupting these command stream buffers. Signed-off-by: Marek Vasut <marex@denx.de>	2019-10-18 17:03:25 +00:00
Marek Vasut	0c38c5454b	etnaviv: Command buffer realloc Reallocate the command stream buffer in case it is too small. The older kernel versions are limited to 64 kiB buffer, so limit the size to avoid oversized buffers. Signed-off-by: Marek Vasut <marex@denx.de>	2019-10-18 17:03:25 +00:00
Marek Vasut	1456aa61cc	etnaviv: Rework resource status tracking Have each context track which resources it marked as pending read and pending write. Have each resource track in which context it is pending. This way, it is possible to identify when a resource is both pending read and pending write at the same time. Moreover, the status field can be correctly calculated and updated when necessary. Signed-off-by: Marek Vasut <marex@denx.de>	2019-10-18 17:03:25 +00:00
Lucas Stach	1194afdfe3	etnaviv: rework the stream flush to always go through the context flush This way we can ensure that the pipe driver tracking of pending resources stays in sync with the actual command buffer state, even if a space reservation triggers a forced flush. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-10-18 17:03:25 +00:00
Lucas Stach	1864fcd8c7	etnaviv: drm: remove unused etna_cmd_stream_finish It's not used by anything and gets in the way for the refactoring of the flush handling. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-10-18 17:03:25 +00:00
Lucas Stach	9e672e4d20	etnaviv: keep references to pending resources As long as a resource is pending in any context we must not destroy it, otherwise we'll hit a classical use-after-free with fireworks. To avoid this take a reference when the resource is first added to the pending set and put the reference when no longer pending. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-10-18 17:03:25 +00:00
Marek Vasut	90e223646b	etnaviv: Make contexts track resources Currently, the screen tracks all resources for all contexts, but this is not correct. Each context should track the resources it uses. This also allows a context to detect whether a resource is used by another context and to notify another context using a resource that the current context is done using the resource. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Guido Günther <guido.gunther@puri.sm> Cc: Lucas Stach <l.stach@pengutronix.de>	2019-10-18 17:03:25 +00:00
Brian Paul	2946bd6628	REVIEWERS: add VMware reviewers	2019-10-18 16:42:40 +00:00
Samuel Pitoiset	7c50214aab	radv: implement VK_KHR_shader_float_controls This exposes what's required for DX and this is what we already configure. The driver flushes denorms for FP32 and preserves them for FP16/FP64. Note that we can't allow both preserving and flushing denorms because this won't work for merged shaders. This will require LLVM to update the float mode register to make it work. Only enabled on GFX8+ with the LLVM path because it's untested on previous chips and ACO doesn't support it. This extension is required for SPIRV 1.4. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-18 16:55:58 +02:00
Samuel Pitoiset	2c2aaf275c	ac/llvm: force fneg/fabs to flush denorms to zero if requested LLVM optimizes these instructions with XOR/AND and it loses the sign bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-18 16:55:55 +02:00
Samuel Pitoiset	7dfb15fff1	ac/llvm: add AC_FLOAT_MODE_ROUND_TO_ZERO Because some instructions will be optimized by the backend compiler, the driver has to manually flush to zero to keep the result exact. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-18 16:55:51 +02:00
Samuel Pitoiset	d94bd4e512	ac/llvm: add ac_build_canonicalize() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-18 16:55:48 +02:00
Eric Engestrom	3ad6154f4e	travis: test meson install as well Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-18 15:27:37 +01:00
Eric Engestrom	b0853a43da	travis: don't (re)install python The new Mac OS X images apparently already have python2 and python3, and `brew` considers asking to install something already installed as a fatal error... Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-18 15:27:37 +01:00
Lepton Wu	a651926884	gbm: Add GBM_MAX_PLANES definition This removed hard coded "4". Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2019-10-18 13:18:28 +00:00
Jose Maria Casanova Crespo	f8da0f6198	v3d: Explicitly expose OpenGL ES Shading Language 3.1 This will expose GL_EXT_primitive_bounding_box and GL_OES_primitive_bounding_box after previous commits expose OpenGL ES 3.1 once Compute Shaders are available. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-18 14:08:52 +02:00
Iago Toral Quiroga	db87439232	v3d: request the kernel to flush caches when TMU is dirty This adapts the v3d driver to the new CL submit ioctl interface that allows the driver to request a flush of the caches after the render job has completed. This seems to eliminate the kernel write violation errors reported during CTS and Piglit executions, fixing some CTS tests and GPU resets along the way. v2: - Adapt to changes in the kernel side. - Disable shader storage and shader images if the kernel doesn't implement cache flushing. Fixes CTS tests: KHR-GLES31.core.shader_image_size.basic-nonMS-fs-float KHR-GLES31.core.shader_image_size.basic-nonMS-fs-int KHR-GLES31.core.shader_image_size.basic-nonMS-fs-uint KHR-GLES31.core.shader_image_size.advanced-nonMS-fs-float KHR-GLES31.core.shader_image_size.advanced-nonMS-fs-int KHR-GLES31.core.shader_image_size.advanced-nonMS-fs-uint KHR-GLES31.core.shader_atomic_counters.advanced-usage-many-draw-calls2 KHR-GLES31.core.shader_atomic_counters.advanced-usage-draw-update-draw KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-fs-int KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-fs-std140-matR KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-fs-std140-struct KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-fs-std430-matC-pad KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-fs-std430-vec Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-18 14:08:52 +02:00
Eric Anholt	66e2d3b69f	v3d: Add Compute Shader support Now that the UAPI has landed, add the pipe_context function for dispatching compute shaders. This is the last major feature for GLES 3.1, though it's not enabled quite yet.	2019-10-18 14:08:52 +02:00
Iago Toral Quiroga	2d8b51ea4d	broadcom: document known hardware issues for L2T flush command Suggested-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-18 14:08:52 +02:00
Iago Toral Quiroga	46182fc1da	v3d: add new flag dirty TMU cache at v3d_compiler That we set for any TMU write on spills and general tmu. It is then used as part of v3d_emit_gl_shader_state later. v2: add a new flag at v3d_compiler instead of dirtying the flag at v3dx if there is any spill (change suggested by Eric, added by Alejandro) v3: set this for anything that is not a load and do it also in v3d40_vir_emit_image_load_store (Eric) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-18 14:08:52 +02:00
Iago Toral Quiroga	d2203d74c6	v3d: trivial update to obsolete comment Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-18 14:08:52 +02:00
Bas Nieuwenhuizen	fd21ee8b52	radv: Fix single stage constant flush with merged shaders. e.g. a VERTEX only flush with tess on Vega should look at the TCS to see which bits are needed. CC: <mesa-stable@lists.freedesktop.org> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1953 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-18 10:49:29 +00:00
Lucas Stach	1b65c49c58	rbug: remove superfluous NULL check The SCR_INIT macro used to install the rbug resource_changed method will only do so when the driver below rbug exposes this method, so the check will always evaluate to true. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-18 10:12:07 +00:00
Lucas Stach	93d47932b8	rbug: implement resource creation with modifier Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-18 10:12:07 +00:00
Lucas Stach	5b3e57059c	rbug: forward can_create_resource to pipe driver Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-18 10:12:07 +00:00
Lucas Stach	8eea8c9691	rbug: forward texture_barrier to pipe driver Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-18 10:12:07 +00:00
Lucas Stach	024eaa7fec	rbug: implement missing explicit sync related fence functions Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-18 10:12:07 +00:00
Lucas Stach	5f76d3cce8	rbug: move flush_resource initialization All the other context method initializations follow the order of the pipe_context structure definition making it easy to find unimplemented methods in rbug. Move the flush_resource init to follow the same order. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-18 10:12:07 +00:00
Lucas Stach	a75eb888e0	rbug: unwrap index buffer resource All resources passed to the drivers below rbug need to be unwrapped before being passed down. We missed doing this for the index buffer resource when this was made part of the draw_info structure. Fixes: `330d0607ed` (gallium: remove pipe_index_buffer and set_index_buffer) Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-18 10:12:07 +00:00
Lucas Stach	6174cba748	rbug: fix transmitted texture sizes The rbug wire format defines the texture size parameters to be uint32_t sized and uses memcpy to move the function parameters to the message structure. This caused totally wrong transmitted texture sizes since the height and depth parameters have been changed to uint16_t in the gallium API. Fix this by doing an explicit conversion to the correct representation before packing into the wire message. Fixes: `e6428092f5` (gallium: decrease the size of pipe_resource - 64 -> 48 bytes) Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-18 10:12:07 +00:00
Lucas Stach	f6461df63a	gallium/util: don't depend on implementation defined behavior in listen() Using 0 as the backlog argument to listen() is exploiting implementation defined behavior and will lead to no connections being accepted on some libc implementations. Quote of the listen manpage: "A backlog argument of 0 may allow the socket to accept connections, in which case the length of the listen queue may be set to an implementation-defined minimum value." Fix this by using a more sensible backlog value. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2019-10-18 10:12:07 +00:00
Iago Toral Quiroga	5be5b53b6d	mesa/main: GL_GEOMETRY_SHADER_INVOCATIONS exists in GL_OES_geometry_shader It seems that for desktop GL this was included with ARB_gpu_shader5, but for OpenGL ES this is already included with the base extension and there is a CTS test that checks this. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-18 09:03:21 +00:00
Pierre-Eric Pelloux Prayer	af60187153	mesa: implement glTextureStorageNDEXT functions Implement the 3 functions using the texturestorage_error() helper. _mesa_lookup_or_create_texture is always called to make sure that 'texture' is initialized (even if the texturestorage_error() generates an error afterwards). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer	50533d408d	mesa: add EXT_dsa NamedCopyBufferSubDataEXT function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer	da21435a7a	mesa: add EXT_dsa NamedRenderbufferStorageMultisampleEXT function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer	2e14749f8f	mesa: add EXT_dsa Generate*MipmapEXT functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer	fb804266a3	mesa: refactor GenerateTextureMipmap handling Rework _mesa_GenerateTextureMipmap to allow code sharing with EXT_dsa functions. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer	cfc0ebe7f1	mesa: add EXT_dsa glGetFloati_vEXT/glGetDoublei_vEXT Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer	a4e935f2d7	mesa: add EXT_dsa + EXT_gpu_program_parameters functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer	78b65343e8	mesa: add EXT_dsa + EXT_gpu_shader4 functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer	c2d6f61f26	mesa: add EXT_dsa + EXT_texture_integer functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer	2bdf809e66	mesa: add EXT_dsa + EXT_texture_buffer_object functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer	28cc07a876	mesa: add EXT_dsa glProgramUniform*EXT functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer	1d1722e910	mesa: add EXT_dsa NamedProgram functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer	eaeab0a998	mesa: add EXT_dsa glClientAttribDefaultEXT / glPushClientAttribDefaultEXT Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-18 10:26:26 +02:00
Pierre-Eric Pelloux-Prayer	01666ad206	mesa: add EXT_dsa glNamedRenderbufferStorageEXT and glGetNamedRenderbufferParameterivEXT Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-18 10:26:26 +02:00
Daniel Stone	40689f5ac0	panfrost: Respect offset for imported resources When we import a resource through Gallium, we need to take account of the offset parameter passed. Fixes a failure seen with the VIVID V4L2 driver, which would create NV12 resources within the same BO, with an offset. Sample pipeline to reproduce (replace videoN with your actual VIVID device node): gst-launch-1.0 v4l2src device=/dev/videoN ! video/x-raw,format=NV12 ! glimagesink Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reported-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Tested-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>	2019-10-18 09:38:52 +02:00
Jordan Justen	22859a18d5	iris/resource: Use isl surface alignment during bo allocation Reworks: * Change subject from "iris: Align main surface allocation to 64k on gen12+" * Make use of isl surf alignment. (Nanley) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-17 21:22:00 -07:00
Jason Ekstrand	48c153e21b	intel/isl: Add isl_aux_usage_has_ccs Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-17 21:22:00 -07:00
Jordan Justen	d83fe059c2	intel/isl: Add R10G10B10_FLOAT_A2_UNORM format Reworks: * Fill out the format's entry in the ISL format table. (Nanley) * Support CCS_E-enabled BLORP copies with the format. (Nanley) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-17 21:21:56 -07:00
Kenneth Graunke	f192741ddd	intel/compiler: Report the number of non-spill/fill SEND messages This can be useful to measure whether memory access optimizations are having the desired effect. For example, we might see a reduction in image loads/stores, or constant buffer loads. We can already see this in cycle estimates to some extent, but this is a more direct approach, minus a lot of the noise of random scheduler shuffling. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-17 20:44:00 -07:00
Marek Olšák	cac5182992	st/mesa: don't call variables "tgsi" when they can reference NIR Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-17 20:31:34 -04:00
Marek Olšák	48b4843c30	st/mesa: merge st_fragment_program into st_common_program Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-17 20:31:34 -04:00
Marek Olšák	e94da4ab80	st/mesa: remove redundant function st_reference_compprog Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-17 20:31:34 -04:00
Marek Olšák	614331738d	st/mesa: remove unused st_xxx_program::sha1 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-17 20:31:34 -04:00
Marek Olšák	0c74e354d1	st/mesa: remove st_vp_variant_key in favor of st_common_variant_key Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-17 20:31:34 -04:00
Marek Olšák	6468df0533	st/mesa: remove num_tgsi_tokens from st_xx_program Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-17 20:31:34 -04:00
Marek Olšák	64dfc82340	st/mesa: rename basic -> common for st_common_program Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-17 20:31:34 -04:00
Marek Olšák	33d53f0614	st/mesa: rename st_xxx_program::tgsi to state Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-17 20:31:34 -04:00
Marek Olšák	dd4d791821	st/mesa: lower doubles for NIR after linking This allows dropping 1 call to st_nir_opts, because shaders are always optimized after linking. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-17 20:31:37 -04:00
Marek Olšák	7908e82f60	st/mesa: call st_nir_opts for linked shaders only once The removed st_nir_opts calls are mostly redundant. There is an improvement with shader-db on radeonsi: Before: real 1m54.047s user 28m37.857s sys 0m7.573s After: real 1m52.012s user 28m3.412s sys 0m7.808s Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-17 20:31:34 -04:00
Ian Romanick	92252219d3	intel/vec4: Don't try both sources as immediates for DPH DPH isn't actually commutative, so this doesn't work. If the immediate in src0 would be a VF candidate, we could do better. shrug No shader-db changes on any Intel platform. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Fixes: `b04beaf41d` ("intel/vec4: Try both sources as candidates for being immediates")	2019-10-17 15:07:01 -07:00
Ian Romanick	050e4e28bf	nir/search: Fix possible NULL dereference in is_fsign Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Fixes: `09705747d7` ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern")	2019-10-17 15:07:01 -07:00
Jordan Justen	da10fa9d63	iris: Let isl decide the supported tiling in more situations Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Suggested-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-17 14:47:23 -07:00
Jordan Justen	be89fbd51e	intel/isl: Add gen12 depth/stencil surface alignments Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-17 14:47:23 -07:00
Jason Ekstrand	d9565160b2	intel/isl: Select Y-tiling for stencil on gen12 Rework: * Disallow linear 1D stencil buffers (Nanley) * Force Y for gen12 stencil rather than ~W (Nanley) Co-authored-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-17 14:47:22 -07:00
Jason Ekstrand	9dd9c3363b	intel/genxml: Remove W-tiling on gen12 It's no longer supported by the hardware Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-17 14:47:22 -07:00
Jordan Justen	523ba0a3e7	intel/genxml,isl: Add gen12 stencil buffer changes Rework: * NULL stencil buffer path (Jason) * genxml fixes (Nanley) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-17 14:47:22 -07:00
Jordan Justen	d2a490d1d9	intel/genxml,isl: Add gen12 depth buffer changes Reworks: * Fix 3DSTATE_DEPTH_BUFFER "Surface Format" end in xml (Jason) * Remove WM_HZ_OP changes (Nanley) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-17 14:47:22 -07:00
Jordan Justen	6c9f9a82d7	intel/genxml,isl: Add gen12 render surface state changes Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-10-17 14:47:17 -07:00
Eric Anholt	75c601b6cf	mesa: Refactor the entirety of _mesa_format_matches_format_and_type(). This function was difficult to implement for new formats due to the combination of endianness and swapbytes support. Since it's mostly used for fast paths, bugs in it were often missed during testing. Just reimplement it on top of the recent _mesa_format_from_format_and_type() which can give us a canonical MESA_FORMAT for a format and type enum (while respecting endianness). Fixes: - R4G4B4A4_UNORM, B4G4R4_UINT, R4G4B4A4_UINT incorrectly matched with swapBytes (you can't just reverse the channels if the channels aren't bytes) - A4R4G4B4_UNORM and A4R4G4B4_UINT missing BGRA/4444_REV matches - failing to match RGB/BGR unorm8 array formats on BE - 2101010 formats incorrectly matching with swapBytes set. - UINT/SINT byte formats failed to match with swapBytes set. This deletes the part of tests/mesa_formats.cpp that called _mesa_format_matches_format_and_type() to make sure it didn't assertion fail, as it now would assertion fail due to the fact that we were passing an invalid format (GL_RG) for most types. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-17 21:07:29 +00:00
Eric Anholt	d77c77936b	mesa: Add support for array formats of depth and stencil. In desktop GL, you can specify things like GL_DEPTH_COMPONENT/GL_BYTE as a ReadPixels format, and we need to be able to represent that to see if we have proper MESA_FORMATs for them. That's exactly what the mesa_array_format enum is for. v2: Drop _mesa from static fn. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-17 21:07:29 +00:00
Eric Anholt	4f4fc75357	mesa: Add format/type matching for DEPTH/UINT_24_8. We had missed this case where GLES3 allows glReadPixels(DEPTH, UINT_24_8), and just got lucky by the readpixels path never asking for the matching format from this function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-17 21:07:29 +00:00
Eric Anholt	7be72b24f5	mesa: Fix depth/stencil ordering in _mesa_format_from_format_and_type(). The GL spec says the 24-bit component is in the high bits, and format_unpack.c looks at the high 24 bits in the S8Z24 case, not Z24SS8. Avoids a regression in the next commit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-17 21:07:29 +00:00
Eric Anholt	df5fe86232	mesa: Add debug info to _mesa_format_from_format_and_type() error path. The unreachable() that follows isn't very useful for debug, and by adding this here we get a nice description of the failure in debug builds. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-17 21:07:29 +00:00
Kristian H. Kristensen	0a4e6726ba	freedreno/a6xx: Turn on geometry shaders Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:45:03 -07:00
Kristian H. Kristensen	d3945e3b9b	freedreno/ci: Add failing tests to skip list Some queries are still failing and layered rendering needs more work. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:45:03 -07:00
Kristian H. Kristensen	622afc8dbd	freedreno/a6xx: Implement PIPE_QUERY_PRIMITIVES_GENERATED for GS When we don't have streamout enabled, we have to read this register to get the number of primitives emitted. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	c8e1522a50	freedreno/blitter: Save GS state We have GS state now. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	946a1e206f	st/mesa: Also enable GS when ESSLVersion > 320 Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	7cb672227b	freedreno/a6xx: Support layered render targets Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	0eebedb619	freedreno/a6xx: Emit program state for GS Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	d6ed39e20e	freedreno/ir3: End VS with CHMASK and CHSH in GS pipelines When used in a GS pipeline, the VS doesn't end with the END instruction. Instead it chains to the GS, which continues running with the same register allocation. The intended use case seems to be that you can compile a regular VS (ie outputs in registers and ending with END) but then tack on link-time generated code past the END to write the outputs using STLW, in case the VS is used with GS. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	4b7312b763	freedreno/ir3: Start GS with (ss) and (sy) We don't know what kind of loads we might have to wait on when coming in from chsh in the VS so set both sync flags. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	c347708bea	freedreno/ir3: Pre-color GS header and primitive ID These sysvals have to be unclobbered by VS and in the same registers in both VS and GS, since the chsh from VS to GS doesn't reload the values. We use the pre-color argument to ir3_ra() to always place these values in r0.x and r0.y. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	ce08fddbbe	freedreno/ir3: Setup ir3 inputs and outputs for GS Inputs are the GS header, which contains vertex ID, local primitive ID and thread ID as well as primitive ID. The setup is a little different from other sysvals, since we always have to receive them in the VS so that it can pass them on into the GS. The vertex flag output from GS is set up as a proper nir output in the lowering pass and doesn't need special handling here. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	0293d14719	freedreno/ir3: Implement primitive layout intrinsics This implements the load_vs_primitive_stride_ir3, load_vs_vertex_stride_ir3 and load_primitive_location_ir3 intrinsics, used for getting the primitive layout strides and locations. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	8e16fb1528	freedreno/ir3: Implement lowering passes for VS and GS This introduces two new lowering passes. One to lower VS to explicit outputs using STLW and one to lower GS to load input using LDLW and implement the GS specific functionality. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	8f39985b01	freedreno/ir3: Add has_gs flag to shader key Since the presence of GS changes how the VS operates we need to track that in the shader key. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	2703844cb3	freedreno/a6xx: Add missing adjacency primitives to table Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	0324706764	freedreno/ir3: Add intrinsics that map to LDLW/STLW These intrinsics will let us do all the offset calculations in nir, which is nicer to work with and lets nir_opt_algebraic eat it all up. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	436d125adf	freedreno/ir3: Add new LDLW/STLW instructions These access memory used for passing data between geometry stages. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	956d319446	freedreno/ir3: Extend RA with mechanism for pre-coloring registers We'll need to pre-color certain input registers between VS and GS shaders. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	0b6625d825	freedreno/ir3: Use third register for offset for LDL and LDLV Before, offset held the offset, which can be either immediate or a register. Use a third register to hold the offset so that we can use a register. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	3a93e60e7b	freedreno/ir3: Add support for CHSH and CHMASK instructions Just add the constructors for now and special case similar to END so we don't remove them. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	f335a6663d	freedreno/a6xx: Trim a few regs from fd6_emit_restore() We know what these do and either write them in the program stateobj or don't need to write them. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Kristian H. Kristensen	610c8c938e	freedreno/registers: Update with GS, HS and DS registers Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-17 13:43:53 -07:00
Eric Anholt	628ed1bbd5	freedreno/ci: Ban texsubimage2d_pbo.r16ui_2d, due to two flakes reported. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-10-17 20:32:46 +00:00
Marek Olšák	12d92714e9	st/mesa: silence a warning in st_nir_lower_tex_src_plane trivial	2019-10-17 16:07:26 -04:00
Marek Olšák	3ed1dd3d42	gallium/u_blitter: remove an unused variable trivial	2019-10-17 16:07:02 -04:00
Marek Olšák	9aa5b348de	radeonsi: recreate aux_context after a GPU reset Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-17 14:56:26 -04:00
Marek Olšák	438ede3ca3	radeonsi: call the reset callback if get_device_reset_status returns a failure Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-17 14:56:24 -04:00
Marek Olšák	93707457b6	st/mesa: call the reset callback if glGetGraphicsResetStatus returns a failure so that we immediately set the no-op dispatch Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-17 14:56:23 -04:00
Caio Marcelo de Oliveira Filho	c847bfaaf5	intel/fs/gen12: Add tests for scoreboard pass Tests the combinations of cases of RAW, WAW and WAR hazards involving both inorder and outoforder instructions. Also tests that dependencies combine and propagate correctly through control flow (loops and conditionals). v2: Add an extra test illustrating that the non-logical CFG edge between then-block and else-block is being taken into account. (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-10-17 10:02:35 -07:00
Daniel Schürmann	4b458b3e8f	aco: don't combine minmax3 if there is a neg or abs modifier in between This fixes a graphical corruption in HotS. No pipelinedb changes other than that. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-17 16:21:19 +00:00
Roland Scheidegger	045f05a2f6	gallivm: Fix saturated signed psub/padd intrinsics on llvm 8 LLVM 8 did remove both the signed and unsigned sse2/avx intrinsics in the end, and provide arch-independent llvm intrinsics instead. Fixes a crash when using snorm framebuffers (tested with piglit arb_color_buffer_float-render GL_RGBA8_SNORM -auto). Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Dave Airlie <airlied@redhat.com> CC: <mesa-stable@lists.freedesktop.org>	2019-10-17 17:42:16 +02:00
Samuel Pitoiset	c644644c65	radv: fix DCC fast clear code for intensity formats (correctly) Previous fix was pretty bogus. This fixes a rendering regression with Nier (minimap too large). Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1943 Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1952 Fixes: `ea92273cea` ("radv: fix DCC fast clear code for intensity formats") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-17 15:29:43 +02:00
Tomeu Vizoso	82f18b713a	panfrost: Keep track of active BOs If two jobs use the same GEM object at the same time, the job that finishes first will (previous to this commit) close the GEM object, even if there's a job still referencing it. To prevent this, have all jobs use the same panfrost_bo for a given GEM object, so it's only closed once the last job is done with it. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Rohan Garg <rohan.garg@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-17 14:33:59 +02:00
Karol Herbst	730f06a44d	nv50/ir: remove DUMMY edge type it was never used Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-10-17 14:00:50 +02:00
James Xiong	023282a4f6	gallium: do not increase ref count of the new throttle fence A new throttle fence was initialized to 1, and increased by 1 again when it was put in drawable->throttle_fence; the ref was decreased by 1 when it was removed from drawable->throttle_fence, so it never reached 0, causing a leak. Fixes: ff77bf5cbf7 ("gallium: simplify throttle implementation") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1949 Signed-off-by: James Xiong <james.xiong@intel.com> Reported-by: Florian Wesch <fw@info-beamer.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2019-10-17 10:18:07 +00:00
Erik Faye-Lund	e8095f2af0	nir: drop unused alpha_ref_float Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	5af272b474	nir: drop support for using load_alpha_ref_float Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	9d0523b569	v3d: do not report alpha-test as supported This triggers lowering in the state-tracker, which makes things a bit simpler. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	a79b93269c	vc4: do not report alpha-test as supported This triggers lowering in the state-tracker, which makes things a bit simpler. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	2da792d398	panfrost: do not report alpha-test as supported This triggers lowering in the state-tracker, which makes things a bit simpler. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	3298aedd6e	mesa/st: support lowering user-clip-planes automatically Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	439f499591	mesa/program: support referencing the clip-space clip-plane state Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	71c0dcf266	nir: support feeding state to nir_lower_clip_[vg]s Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	eb3047c094	nir: support lowering clipdist to arrays This allows us to make sure clipdist is emitted as a scalar array rather than two vec4s. This matches SPIR-V semantics, and will be useful for Zink. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	28543f1640	mesa/gallium: automatically lower two-sided lighting Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	011d692a52	nir: support derefs in two-sided lighting lowering Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	3b4fc2401b	mesa/gallium: automatically lower point-size Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	878c94288a	nir: add lowering-pass for point-size mov Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	b786738454	st/mesa: move point_size_per_vertex-logic to helper Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	b1c4c4c7f5	mesa/gallium: automatically lower alpha-testing Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	6d7e02e37d	nir: allow passing alpha-ref state to lowering-code Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	fdc4450c28	mesa: expose alpha-ref as a state-variable Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Dave Airlie	cce3ad166a	st/mesa: handling lower flatshading for NIR drivers. This uses the NIR pass to lower flatshading when the driver requests it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Dave Airlie	731260de7d	gallium: add flatshade lowering capability This allows the driver to request flatshade lowering. (NIR drivers only so far). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Dave Airlie	dc91a02a72	nir: add a pass to lower flat shading. This takes any color or backcolor that has unspecified shading and converts it to flat shading. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 10:41:36 +02:00
Erik Faye-Lund	26c6640835	gallium/u_blitter: set a more sane viewport-state This actually corresponds to legal GL depth-ranges, because depth-clear values are always in the 0..1 range in OpenGL. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-17 09:26:25 +02:00
Marek Olšák	4857d695f5	st/mesa: reorder and document code in st_translate_vertex_program move the TGSI code after the ARB_vp code Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-16 20:10:47 -04:00
Marek Olšák	5d0630e504	st/mesa: call prog_to_nir sooner for ARB_fp Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-16 20:10:47 -04:00
Marek Olšák	f54dcaf232	st/mesa: don't call translate_*_program functions for NIR move the initialization to st_link_nir Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-16 20:10:47 -04:00
Marek Olšák	f86b28dfdc	st/mesa: finalize NIR after shader variant passes for TCS/TES/GS/CS Same as VS and FS. This might fix vertex color clamping. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-16 20:10:47 -04:00
Marek Olšák	45378689e0	st/mesa: unify transform feedback info translation code Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-16 20:10:47 -04:00
Marek Olšák	b23967a5e1	st/mesa: move vertex program preparation code into st_prepare_vertex_program Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-16 20:10:47 -04:00
Marek Olšák	8dfcec405a	st/mesa: clean up more after the removal of st_compute_program Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-16 20:10:47 -04:00
Marek Olšák	196fc59c40	st/mesa: deduplicate st_common_program code in st_program_string_notify Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-16 20:10:47 -04:00
Marek Olšák	05f59bb777	st/mesa: sink TCS/TES/GS/CS translate code into st_translate_common_program Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-16 20:10:47 -04:00
Marek Olšák	74c007ba7f	st/mesa: deduplicate cases in st_deserialise_ir_program Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-16 20:10:47 -04:00
Marek Olšák	1cc866c264	st/mesa: remove st_compute_program in favor of st_common_program The conversion from pipe_shader_state to pipe_compute_state is done at the end. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-16 20:10:47 -04:00
Marek Olšák	691240cdbe	st/mesa: don't store stream output info to shader cache for tess ctrl shaders Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-16 20:10:47 -04:00
Marek Olšák	33de483d55	st/mesa: simplify the signature of st_release_basic_variants Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-16 20:10:47 -04:00
Marek Olšák	ab843a3702	st/mesa: deduplicate code for ATI fs in st_program_string_notify Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-16 20:10:47 -04:00
Marek Olšák	b596bb5b66	st/mesa: use *prog at the end of st_link_nir Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-16 20:10:47 -04:00
Dylan Baker	4e07869a06	appveyor: Cache meson's wrap downloads This makes us less reliant on wrap-db (and reduces the amount of downloading that goes on during the build). Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1936 Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-16 23:26:09 +00:00
Dylan Baker	c65f907ce9	gitlab-ci: Set the meson wrapmode to disabled This will prevent us from accidentally falling back to the wrap-db instead of using locally installed versions. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-16 23:26:09 +00:00
Dylan Baker	449f831088	Revert "gitlab-ci: Disable meson-mingw32-x86_64 job again for now" This reverts commit `d60b8679a4`. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-16 23:26:09 +00:00
Dylan Baker	6e375ff1aa	gitlab-ci: Add a pkg-config for mingw The one debian provides is broken in buster+, so I've just written my own. This allows meson to find the installed zlib and prevents it from falling back to wraps. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-16 23:26:09 +00:00
Dylan Baker	4441da0044	meson: Don't use expat on windows It's not really needed, and there's no debian package for it so we're forced to fall back to wraps in mesa's CI. This can be problematic in itself. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-16 23:26:09 +00:00
Karol Herbst	656c038d01	st/mesa: fix crash for drivers supporting nir defaulting to tgsi nvc0 and I assume radeonsi as well hit an assert inside glsl_to_tgsi as atan instructions get inserted into the shader. Fixes: `cece947a8d` ("glsl/builtin: Add alternate versions of atan using new ops") Cc: Neil Roberts <nroberts@igalia.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-10-16 21:53:46 +00:00
Eric Engestrom	aaab70035a	util/u_atomic: fix return type of p_atomic_{inc,dec}_return() and p_atomic_{cmp,}xchg() We're trying to cast the return type to the type of the var, but instead we were casting `sizeof(*v)`. Fixes: `6df72e970c` ("util: Make u_atomic.h typeless.") Fixes: `0a7f17cf5b` ("util/u_atomic: add p_atomic_xchg") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-16 19:41:47 +01:00
Eric Engestrom	d3b06a199e	mesa/math: delete duplicate extern symbol It's already defined in `m_debug_util.h`, along with an explanation of what it is and how to use it. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-16 19:31:24 +01:00
Eric Engestrom	38427da02b	mesa/math: delete leftover... from 18 years ago (!) Left over from `0a79baf1bf` ("remove dead vertex assembly"). Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-16 19:31:20 +01:00
Andreas Baierl	0ee931c1de	lima: Fix crash when there are no vertex shader attributes Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-10-16 16:45:05 +00:00
Andreas Baierl	f906f5f053	lima: Fix compiler warning in standalone compiler 'struct lima_context' has to be declared before usage in lima_program.h Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-10-16 15:13:13 +00:00
Rhys Perry	88f1c0a360	aco: emit_split_vector() s_memtime results Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-16 15:31:19 +01:00
Rhys Perry	ded51b13da	aco: don't CSE s_memtime Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-16 15:31:19 +01:00
Rhys Perry	d7838152f5	aco: fix scheduling with s_memtime/s_memrealtime Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-16 15:31:19 +01:00
Alan Coopersmith	6804b8e1ff	intel/common: include unistd.h for ioctl() prototype on Solaris Fixes build errors of: In file included from ../src/intel/vulkan/anv_private.h:48, from ../src/intel/vulkan/genX_blorp_exec.c:26: ../src/intel/common/gen_gem.h: In function ‘gen_ioctl’: ../src/intel/common/gen_gem.h:68:15: error: implicit declaration of function ‘ioctl’ [-Werror=implicit-function-declaration] 68 \| ret = ioctl(fd, request, arg); \| ^~~~~ In file included from ../include/c11/threads_posix.h:35, from ../include/c11/threads.h:66, from ../src/mesa/main/mtypes.h:39, from ../src/intel/compiler/brw_compiler.h:30, from ../src/intel/vulkan/anv_private.h:51, from ../src/intel/vulkan/genX_blorp_exec.c:26: /usr/include/unistd.h: At top level: /usr/include/unistd.h:471:12: error: conflicting types for ‘ioctl’ 471 \| extern int ioctl(int, int, ...); \| ^~~~~ /usr/include/unistd.h:471:1: note: a parameter list with an ellipsis can’t match an empty parameter name list declaration 471 \| extern int ioctl(int, int, ...); \| ^~~~~~ In file included from ../src/intel/vulkan/anv_private.h:48, from ../src/intel/vulkan/genX_blorp_exec.c:26: ../src/intel/common/gen_gem.h:68:15: note: previous implicit declaration of ‘ioctl’ was here 68 \| ret = ioctl(fd, request, arg); \| ^~~~~ Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-16 13:45:57 +01:00
Alan Coopersmith	d8a9420f6f	meson: recognize "sunos" as the system name for Solaris Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-16 13:45:57 +01:00
Alan Coopersmith	7040795a69	util: Solaris has linux-style pthread_setname_np Fixes: `dcf9d91a` ("util: Handle differences in pthread_setname_np") Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-16 13:45:57 +01:00
Alan Coopersmith	b3028a9fb8	util: Workaround lack of flock on Solaris v2: Replace autoconf check for flock() with meson check Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-16 13:45:57 +01:00
Alan Coopersmith	a56c3e3a47	util: Make Solaris implementation of p_atomic_add work with gcc gcc is very particular about where you place the (void) cast The previous placement made it error out with: In file included from disk_cache.c:40:0: ../../src/util/u_atomic.h:203:29: error: void value not ignored as it ought to be #define p_atomic_add(v, i) ((void) \ ^ disk_cache.c:658:4: note: in expansion of macro ‘p_atomic_add’ p_atomic_add(cache->size, size); ^ Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-16 13:45:57 +01:00
Alan Coopersmith	ddde652e70	c99_compat.h: Don't try to use 'restrict' in C++ code Fixes build failures on Solaris in C++ files using gcc: ../src/util/u_math.h:628:41: error: expected ‘,’ or ‘...’ before ‘dest’ 628 \| util_memcpy_cpu_to_le32(void * restrict dest, const void * restrict src, size_t n) \| ^~~~ ../src/util/u_math.h: In function ‘void* util_memcpy_cpu_to_le32(void*)’: ../src/util/u_math.h:641:18: error: ‘dest’ was not declared in this scope 641 \| return memcpy(dest, src, n); \| ^~~~ ../src/util/u_math.h:641:24: error: ‘src’ was not declared in this scope 641 \| return memcpy(dest, src, n); \| ^~~ ../src/util/u_math.h:641:29: error: ‘n’ was not declared in this scope; did you mean ‘yn’? 641 \| return memcpy(dest, src, n); \| ^ \| yn Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-16 13:45:57 +01:00
Alyssa Rosenzweig	c94ccbf201	pan/midgard: Do not repeatedly spill same value It doesn't make sense. You already spilled it once, and it didn't help. Don't try again, or you'll end up in a loop. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-16 08:17:56 -04:00
Alyssa Rosenzweig	2d914ebe81	pan/midgard: Fix memory corruption in register spilling Essentially an off-by-one error ... bit of an edge case, but seems to occur in some glamor shaders. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-16 08:17:56 -04:00
Alyssa Rosenzweig	01a78dbbab	pan/midgard: Allow COMPUTE jobs in panfrost_bo_access_for_stage Fixes: `ada752afe4` ("panfrost: Extend the panfrost_batch_add_bo() API to pass access flags") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-16 08:17:56 -04:00
Alyssa Rosenzweig	fd2216e1fd	pan/midgard: Use 16-bit liveness masks We'll want liveness per-byte, so we need to accommodate up to 16 bytes. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-16 08:17:56 -04:00
Alyssa Rosenzweig	4fee7b30c0	panfrost: Disable frame throttling The new frame throttling implementation interacts unfortunately with pipelining, leading to fence fds leaking like crazy and ultimately apps crashing quickly. With this patch, apps still crash but not as quickly. We need to either figure out the real cause or revert the core changes. Nevertheless, we don't want frame throttling in the first place, so. Fixes: `a65e29ccb2` ("gallium: simplify throttle implementation") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-16 08:13:38 -04:00
Pierre-Eric Pelloux-Prayer	16233797f4	mesa: fix invalid target error handling for teximage This commit moves the target check before using _mesa_get_current_tex_object to fix a "Mesa implementation error: bad target in _mesa_get_current_tex_object()" error. Fixes: `9dd1f7cec0` ("mesa: pass gl_texture_object as arg to not depend on state") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-16 10:41:31 +02:00
Marek Olšák	268e0e01f3	radeonsi/nir: simplify si_lower_nir signature just a cleanup	2019-10-15 21:52:09 -04:00
Alyssa Rosenzweig	923aa3918c	pan/midgard: Fix mir_mask_of_read_components with dot products Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-15 21:41:12 -04:00
Alyssa Rosenzweig	47b58199f0	pan/midgard: Add perspective ops to mir_get_swizzle I really need to just make this a table.. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-15 21:41:12 -04:00
Alyssa Rosenzweig	7db36d94af	pan/midgard: Don't try to propagate swizzles to branches Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-15 21:41:12 -04:00
Alyssa Rosenzweig	9c0915ba4a	pan/midgard: Allow non-contiguous masks in UBO lowering We don't really need to impose this condition, but we do need to cope with the slightly more general case. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-15 21:41:11 -04:00
Alyssa Rosenzweig	a6867fb3fd	pan/midgard: Report read mask for branch arguments Conditionals in particular read values. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-15 21:41:11 -04:00
James Xiong	fd235484fe	iris: finish aux import on get_param A buffer and its aux are imported separately, if the aux import is not completed yet when resource_get_param is called, merge the separate aux a.k.a the 2nd image into the main image. Fixes: `246eebba4a` ("iris: Export and import surfaces with modifiers that have aux data") Signed-off-by: James Xiong <james.xiong@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-15 23:19:04 +00:00
Kenneth Graunke	e6ca6e587e	mesa: Handle pbuffers in desktop GL framebuffer attachment queries Once again, we were handling back-to-front in the GLES3 case, but not the desktop GL case. Fixes GTF-GL46.gtf30.GL3Tests.framebuffer_srgb.framebuffer_srgb_default_encoding when run with --deqp-surface-type=pbuffer --deqp-gl-context-type=egl. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-15 15:44:27 -07:00
Kenneth Graunke	c512eca4da	mesa: Make back_to_front_if_single_buffered non-static So I can use it in fbobject.c in the next commit. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-15 15:44:25 -07:00
Kenneth Graunke	d947276b4a	mesa: Use ctx->ReadBuffer in glReadBuffer back-to-front tests We were looking at ctx->DrawBuffer when asking about the read buffer, which was good enough for CTS purposes, but definitely not right. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-15 15:44:16 -07:00
Lionel Landwerlin	701e0ac077	etnaviv: remove variable from global namespace Found out by accident this was clashing with another driver. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Cc: <mesa-stable@lists.freedesktop.org>	2019-10-15 21:07:25 +00:00
Marek Olšák	7f6b9baee2	st/mesa: always allocate pack/unpack buffers as staging Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-15 14:24:23 -04:00
Adam Jackson	3f840e5ccd	gallium/xlib: Fix xmesa drawable creation The first time you call glXMakeCurrent, current != ctx. As a result we would never look up whether the drawable already had an XMesaDrawable, and would instead always create one. Then XMesaBufferList would have two different buffers for the same XID, and you'd be reading and drawing to different places, and that's not what you want at all. Instead just always look up the drawable. Fixes: `db8be355` (gallium/xlib: Remove drawable caching from the MakeCurrent path) Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1196 Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-10-15 17:24:41 +00:00
Eric Engestrom	3bcd54f3fc	gitlab-ci: set a common job parent for test stage Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-10-15 17:42:39 +01:00
Eric Engestrom	aba78c2d38	gitlab-ci: set a common job parent for build stage Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-10-15 17:42:39 +01:00
Eric Engestrom	81b98e99cd	gitlab-ci: set a common job parent for container stage While at it, rename to singular "container" for consistency. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-10-15 17:42:39 +01:00
Samuel Pitoiset	4a3bdc6d22	Revert "radv: do not emit PKT3_CONTEXT_CONTROL with AMDGPU 3.6.0+" This reverts commit `2ca8629fa9`. This was initially ported from RadeonSI, but in the meantime it has been reverted because it might hang. Be conservative and re-introduce this packet emission. Unfortunately this doesn't fix anything known. Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-15 15:58:34 +02:00
Jonathan Marek	39d7cb36ff	spirv: set correct dest_type for texture query ops Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-15 08:42:22 -04:00
Jonathan Marek	37dec33676	turnip: more descriptor sets Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:20 -04:00
Jonathan Marek	ac9f0d2dd4	turnip: push constants Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:20 -04:00
Jonathan Marek	5b7fbcbdde	turnip: depth/stencil Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:20 -04:00
Jonathan Marek	f1efc9a1c8	turnip: basic msaa working Not perfect but gets through some tests. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:20 -04:00
Jonathan Marek	d3c9914152	turnip: improve CmdCopyImage and implement CmdBlitImage Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:20 -04:00
Jonathan Marek	571b2611b3	turnip: use nir_assign_io_var_locations instead of nir_assign_var_locations Variables with same location should use the same driver_location. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:20 -04:00
Jonathan Marek	a5635a8a50	turnip: add missing nir passes Avoids assert fails in ir3. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:20 -04:00
Jonathan Marek	d930be9f4c	turnip: add code to lower indirect samplers Taken from nir_lower_samplers. Sampler arrays don't work though, this is just to avoid an assert fail in ir3. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:20 -04:00
Jonathan Marek	e336076838	turnip: fixup consts Fix some mistakes in previous series. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:20 -04:00
Jonathan Marek	29464712ce	turnip: update some shader state bits from GL driver Notably includes centroid varying bits that were missing. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:20 -04:00
Eric Anholt	9a5f3594ee	turnip: Emit clears of gmem using linear. This is what we do in freedreno. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:20 -04:00
Eric Anholt	776a9ce36b	turnip: Set up the correct tiling mode for small attachments. Noticed while debugging a tiling-looking issue by comparing our gmem blit setup to freedreno's. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Eric Anholt	8193c2b08b	turnip: Tell spirv_to_nir that we want fragcoord as a sysval. Fixes ir3 compiler failure in dEQP-VK.renderpass.dedicated_allocation.formats.r8g8b8a8_unorm.clear.clear_draw (now just a rendering failure where the subpass clear isn't happening) Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Eric Anholt	0ce1672a2c	turnip: Fill in clear color packing for r10g11b11 and rgb9e5. Fixes assertion failures in dEQP-VK.api.image_clearing.core.clear_color_image.2d.* for these formats, though the test set as a whole is still failing. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Eric Anholt	1b16c5c98e	turnip: Drop unused tu_pack_clear_value() return. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	8626d33986	turnip: add anisotropy and compressed formats to device features Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	f4154e7d3e	turnip: disable tiling as necessary Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	057c0f5caa	turnip: update setup_slices Deal with tiled r8g8 having different alignment and other updates taken from fd6_resource. Additionally track image samples/cpp. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	c47f58bd4d	turnip: add VK_KHR_sampler_mirror_clamp_to_edge Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	2f939ef889	turnip: add black border color Avoids hangs and some texture tests are happy with just this. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	ffbffe19f9	turnip: improve sampler descriptor Fixes anisotropy and shadow texture Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	68b8d0b70e	turnip: improve view descriptor Changes to make compressed, tiled, 3d, etc textures work Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	31351a0281	turnip: add more 2d_ifmt translations Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	acdc75301f	turnip: format table fixes * Fix R16G16 SCALED and R16G16B16A16 SCALED having texture format * Fix B5G6R5 swap value * Use R8_UINT instead of R8_UNORM for S8_UINT rb format * Disable 96-bit texture formats instead of having a check for NPOT formats * Don't fail assert on D24X8 format Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	eb67d9f0f3	turnip: add format_is_uint/format_is_sint Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	12ede7565f	turnip: add astc format layout Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	b6e1544852	turnip: fix assert failing for 0 color attachments Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	467f9982df	turnip: fix segmentation fault with compute pipeline Not supported, so always set pointer to NULL Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	eef195c9cc	turnip: fix segmentation fault in events Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	03772df450	turnip: fix 32 vertex attributes case Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	8580726f90	turnip: fix triangle strip Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:19 -04:00
Jonathan Marek	b7093882eb	freedreno/regs: update a6xx 2d blit bits Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-15 07:56:18 -04:00
Samuel Pitoiset	50c8c4144b	radv: rename VK_KHR_shader_float16_int8 structs/constants Trivial change. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-15 12:13:53 +02:00
Iago Toral	e353656f3d	v3d: drop unused shader_rec_count member from context Looks like this was copied from the vc4 driver where it is actually included in the submit CL ioctl. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-15 06:56:45 +00:00
Jonathan Marek	278c9b5cc7	freedreno/ir3: implement fquantize2f16 Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robclark@gmail.com>	2019-10-14 17:48:22 -04:00
Jonathan Marek	92d756f22d	freedreno/ir3: implement texop_texture_samples Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robclark@gmail.com>	2019-10-14 17:48:22 -04:00
Jonathan Marek	3cfd5ffb8c	freedreno/ir3: fix GETLOD for negative LODs Note: for output type U32, negative LOD is not sign extended from 16 bits Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robclark@gmail.com>	2019-10-14 17:48:22 -04:00
Jonathan Marek	cfc6a3e394	freedreno/ir3: implement fdd{x,y}_coarse opcodes Same as regular fddx/fddy. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robclark@gmail.com>	2019-10-14 17:48:22 -04:00
Jonathan Marek	b094b384e2	freedreno/ir3: increase size of inputs/outputs arrays Makes it possible to support 32 varyings. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robclark@gmail.com>	2019-10-14 17:48:22 -04:00
Jonathan Marek	08003c37b9	freedreno/ir3: remove input ncomp field Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robclark@gmail.com>	2019-10-14 17:48:22 -04:00
Lucas Stach	ce23bc9283	etnaviv: fix vertex buffer state emission for single stream GPUs GPUs with a single supported vertex stream must use the single state address to program the stream. Fixes: `3d09bb390a` (etnaviv: GC7000: State changes for HALTI3..5) Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-10-14 19:18:37 +00:00
Dave Airlie	c2efc7c637	gallivm/draw/swr: make the gs_iface not depend on tgsi. This gs_iface doesn't seem to require a dependence on the tgsi context, except for the swr end prim code. This refactors the API to include all the info that the swr code needs in the interface rather than having to dig it out of the struct inheritance. This is a precursor to adding NIR support to llvmpipe. Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2019-10-15 04:43:30 +10:00
Kenneth Graunke	ac7af7c500	iris: Implement the Gen < 9 tessellation quads workaround Fixes several CTS tests: - KHR-GL46.tessellation_shader.vertex.vertex_spacing - KHR-GL46.tessellation_shader.tessellation_shader_point_mode.points_verification Fixes: `823609b1a3` ("iris/WIP: add broadwell support")	2019-10-14 09:48:36 -07:00
Caio Marcelo de Oliveira Filho	58286c7969	anv: Advertise VK_KHR_spirv_1_4 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-14 08:25:42 -07:00
Caio Marcelo de Oliveira Filho	90a7893ca8	vulkan: Update the XML and headers to 1.1.125 Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-14 08:23:27 -07:00
Mauro Rossi	072c94f724	android: amd/common: export amd/llvm headers Fixes the following build error: external/mesa/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c:42:10: fatal error: 'ac_llvm_util.h' file not found ^~~~~~~~~~~~~~~~ 1 error generated. Fixes: 3a08110 ("amd: Move all amd/common code that depends on LLVM to amd/llvm.") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-14 10:46:45 +02:00
James Xiong	4f963b03a1	gallium: rename PIPE_CAP_MAX_FRAMES_IN_FLIGHT to PIPE_CAP_THROTTLE v2: [ Michel Dänzer ] * Update src/gallium/docs/source/screen.rst accordingly Signed-off-by: James Xiong <james.xiong@intel.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> # v1 Reviewed-by: Marek Olšák <marek.olsak@amd.com> # v1	2019-10-14 10:05:46 +02:00
James Xiong	a65e29ccb2	gallium: simplify throttle implementation All gallium drivers currently set MAX_FRAME_IN_FLIGHT to either 1 or 0, which means that the drivers either throttle on the previous render or don't throttle. The current implementation is more complicated than necessary and can be simplified. Signed-off-by: James Xiong <james.xiong@intel.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-14 10:05:40 +02:00
Samuel Pitoiset	ea92273cea	radv: fix DCC fast clear code for intensity formats This fixes a rendering issue with DiRT 4 on GFX10. Only GFX10 was affected because intensity formats are different. Cc: 19.2 <mesa-stable@lists.freedesktop.org> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1923 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-14 08:36:14 +02:00
Eric Engestrom	ebe176eeff	gbm: use size_t for array indexes Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-13 18:10:47 +01:00
Eric Engestrom	ad7e410893	gbm: replace NULL sentinel with explicit ARRAY_SIZE() Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-13 18:10:47 +01:00
Eric Engestrom	0d74f4bb16	gbm: replace 1/0 bool with true/false Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-13 18:10:47 +01:00
Eric Engestrom	e9d8081135	gbm: turn 0/-1 bool into true/false Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-13 18:10:47 +01:00
Eric Engestrom	48289d8853	radv: add exported symbols check Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-13 17:40:54 +01:00
Eric Engestrom	960038d550	anv: add exported symbols check Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-13 17:40:47 +01:00
Eric Engestrom	f1c22390f7	symbols-check: ignore exported C++ symbols Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-13 17:40:43 +01:00
Boris Brezillon	35e92a11dd	panfrost: Fix support for packed 24-bit formats pan_pack_color() was missing the 24-bit packed format case. Looks like putting the clear color in a 32-bit slot does the trick. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-13 14:44:25 +02:00
Timothy Arceri	1294f01e06	glsl: fix crash compiling bindless samplers inside unnamed UBOs The check to see if we were dealing with a buffer block was too late and only worked for named UBOs. Fixes: `f32b01ca43` "glsl/linker: remove ubo explicit binding handling" Reviewed-by: Marek Olšák <marek.olsak@amd.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1900	2019-10-12 22:04:23 +11:00
Neil Roberts	cece947a8d	glsl/builtin: Add alternate versions of atan using new ops Adds alternate versions of the atan builtin functions that use ir_unop_atan and ir_binop_atan2 instead of inlining to the IR implementation of the function. These alternatives are selected if the IR is going to be consumed by NIR. In that case the IR ops will be translated to the appropriate NIR op. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-12 09:43:18 +02:00
Neil Roberts	77f3fbb4aa	glsl: Add opcodes for atan and atan2 Adds ir_binop_atan2 and ir_unop_atan. When converting to NIR these are expanded out using the appropriate builtin generator. If they are used with anything else then it will just hit an assert. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-12 09:43:18 +02:00
Neil Roberts	0832845dc6	nir/builtin: Add extern "C" guards to nir_builtin_builder.h That way it can also be included from a C++ source. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-12 09:43:18 +02:00
Neil Roberts	9eaeedd54b	nir/builtin: Add #include u_math.h to the header The inline functions use M_PI so they should include a header to make sure it is defined. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-12 09:43:18 +02:00
Neil Roberts	2098ae16c8	nir/builder: Move nir_atan and nir_atan2 from SPIR-V translator Moves build_atan and build_atan2 into nir_builtin_builder. The goal is to be able to use this from the GLSL translator too. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-12 09:43:17 +02:00
Hal Gentz	075a96aa92	egl: Configs w/o double buffering support have no `EGL_WINDOW_BIT`. When users pass a config to `eglCreateWindowSurface` it requests double buffering, but if the config doesn't have the appropriate `__DRIconfig`, `eglCreateWindowSurface` fails with a `EGL_BAD_MATCH`. Given that such behaviour is completely unacceptable, we drop the `EGL_WINDOW_BIT` if we don't have at least one `__DRIconfig` supporting double buffering, otherwise dropping the `EGL_PIXMAP_BIT`. Fixes: `049f343e8a` "egl: Allow 24-bit visuals for 32-bit RGBA8888 configs" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67676 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Hal Gentz <zegentzy@protonmail.com>	2019-10-11 21:57:21 +00:00
Hal Gentz	a800d16e4f	egl: Puts RGBA visuals in the second config selection group. That way applications don't get windows that are compositor alpha-blended accidentally. In the ideal world, this would be done by the xserver, as it does for GLX, however, an appropriate place could not be found, so it's being placed here instead. Fixes: `049f343e8a` "egl: Allow 24-bit visuals for 32-bit RGBA8888 configs" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67676 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Hal Gentz <zegentzy@protonmail.com>	2019-10-11 21:57:21 +00:00
Hal Gentz	90a19074b4	egl: Fixes transparency with EGL and X11. This commit does this by allowing both RGB and RGBA visuals to match with EGL configs. We also expose the `EGL_MESA_config_select_group` egl extension, which is similar to GLX's visual select group extension, to allow the RGBA visuals to get less priority. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67676 Fixes: `049f343e8a` "egl: Allow 24-bit visuals for 32-bit RGBA8888 configs" Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Hal Gentz <zegentzy@protonmail.com>	2019-10-11 21:57:21 +00:00
Hal Gentz	173bc9d684	egl: Add EGL_CONFIG_SELECT_GROUP_MESA ext. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67676 Fixes: `049f343e8a` "egl: Allow 24-bit visuals for 32-bit RGBA8888 configs" Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Hal Gentz <zegentzy@protonmail.com>	2019-10-11 21:57:20 +00:00
Kenneth Graunke	44754279ac	intel/fs/gen12: Use TCS 8_PATCH mode. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-10-11 12:24:16 -07:00
Jason Ekstrand	c92fb60007	intel/fs/gen12: Implement gl_FrontFacing on gen12+. The bit moved on gen12 in order to prepare for dual-SIMD8 dispatch. This implementation isn't entirely complete as it only works on SIMD8 and SIMD16 and not dual-SIMD8. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	ceb123befa	intel/fs/gen11+: Fix CS_OPCODE_CS_TERMINATE codegen. Apparently the ts_request_type and ts_resource_select thread spawner message descriptor bits were removed from the hardware at least since ICL. Drop them in order to avoid assertion failures on Gen12+ platforms which don't have any encoding for this. On Gen9+ these are probably just ignored by the hardware, so this is unlikely to have had any functional implications prior to Gen12. v2: Mark TS message fields as non-existing in brw_inst.h on ICL. (Caio) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	a5efb0eae8	intel/fs/gen12: Fix barrier codegen. The WAIT instruction has been removed, but SYNC.bar can be used instead to wait for a notification on n0.0. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	6b52f81395	intel/eu: Don't set notify descriptor field of gateway barrier message. Apparently this field was removed on SKL, and according to the hardware docs for previous platforms "This field is only valid for a ForwardMsg message. It is ignored for other messages. The BarrierMsg message always increments the N0 notification counter". Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	b0e69d115e	intel/ir/gen12: Update assert in brw_stage_has_packed_dispatch(). Confirmed no regressions after a full Piglit run on TGL with the brw_fs_test_dispatch_packing() test enabled. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Jason Ekstrand	ca7b6fd392	intel/eu/validate/gen12: Don't blow up on indirect src0. They look like a NULL source if you don't look at the address mode. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	ab5aa01689	intel/eu/validate/gen12: Validation fixes for SEND instruction. The following fix-up by Jordan Justen is squashed in: intel/eu/validate: gen12 send instruction doesn't have a dst type field Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	a81f9b5e3e	intel/eu/validate/gen12: Fix validation of SYNC instruction. src0 will typically be null for this instruction. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	45768e6b3c	intel/eu/validate/gen12: Implement integer multiply restrictions in EU validator. Due to hardware bug filed as HSDES#1604601757. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Jordan Justen	f9ec4ac5a1	intel/ir: Lower fpow on Gen12. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	cb6db5bfb3	intel/fs/gen12: Don't support source mods for 32x16 integer multiply. Due to hardware bug filed as HSDES#1604601757. v2: Only return if result of fs_inst::can_do_source_mods() is known to be false, in case new orthogonal restrictions are implemented below in the future. (Caio) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	de5d106ccf	intel/disasm: Disassemble register file of split SEND sources. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	c03869323b	intel/disasm: Don't disassemble saturate control on SEND instructions. The field is gone on Gen12+ and it was illegal on previous generations. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	f15e0b3439	intel/disasm/gen12: Disassemble Gen12 SEND instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	fd7e21dd90	intel/disasm/gen12: Disassemble Gen12 SYNC instruction. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	606d823b42	intel/disasm/gen12: Disassemble three-source instruction source and destination regions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	8263d300c2	intel/disasm/gen12: Fix disassembly of some common instruction controls. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	83612c0127	intel/disasm/gen12: Disassemble software scoreboard information. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	396f6b27a7	intel/fs/gen12: Demodernize software scoreboard lowering pass. Kept as a separate commit in order to avoid distracting reviewers of the software scoreboard pass with memory management boilerplate. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	265c7c8971	intel/fs/gen12: Introduce software scoreboard lowering pass. Gen12+ hardware lacks the register scoreboard logic that used to guarantee data coherency between register reads and writes in previous generations. This lowering pass runs after register allocation in order to make up for it. It works by performing global dataflow analysis in order to determine the set of potential dependencies of every instruction in the shader, and then inserts any required SWSB annotations and additional SYNC instructions in order to guarantee data coherency. v2: Drop unnecessary _safe list iteration (Caio). v3: Temporarily workaround potential WaR hazard between FPU instruction and subsequent out-of-order write, pending clarification from the hardware team. Drop redundant tracking of implicit access of acc0-1, since the hardware guarantees coherency of these (but not the other accumulators...). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	e0b8d7953e	intel/fs/gen12: Add scheduling information to the IR. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	15e3a0d9d2	intel/eu/gen12: Set SWSB annotations in hand-crafted assembly. Reviewers are encouraged to audit the code generation pass independently in case I missed some potential data hazard or new code has been added in the meantime. v2: Add SYNC instruction to cr0 workaround in brw_float_controls_mode(). v3: Drop likely redundant (and potentially harmful) RegDist SWSB annotation from ce0 read in brw_find_live_channel() (Caio). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	d3f3bdcd18	intel/eu/gen12: Add tracking of default SWSB state to the current brw_codegen instruction. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	6154cdf924	intel/eu/gen12: Add auxiliary type to represent SWSB information during codegen. v2: Introduce extra tgl_swsb_sbid() constructor (Caio). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	c22db5e188	intel/fs/gen12: Add codegen support for the SYNC instruction. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	0e57dbc55c	intel/ir/gen12: Add SYNC hardware instruction. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	7499e10383	intel/eu/gen12: Don't set thread control, it's gone. An effect similar to the one formerly provided by setting thread control to "switch" can be achieved now by setting a RegDist of 1 on the SWSB field. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	a66ea33991	intel/eu/gen12: Don't set DD control, it's gone. A future lowering pass will simulate the same behavior originally provided by NoDDChk/NoDDClr at the IR level by using appropriate SWSB annotations. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	8a5fad0d92	intel/eu/gen12: Use SEND instruction for split sends. The new SEND instruction behaves like the former SENDS instruction. The original single-payload SEND instruction is gone. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	6634ede7aa	intel/eu/gen12: Codegen SEND descriptor regions correctly. The SEND instruction is now four-source. The descriptor is no longer part of source 1, so avoid touching it to avoid corruption while initializing the descriptor. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	2c4c9aba30	intel/eu/gen12: Codegen pathological SEND source and destination regions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	bafc9515db	intel/eu/gen12: Codegen control flow instructions correctly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	6e1daba3b4	intel/eu/gen12: Codegen three-source instruction source and destination regions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	9fdb67aa09	intel/eu/gen12: Fix codegen of immediate source regions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	6cb764ae9c	intel/eu/gen12: Add Gen12 opcode descriptions to the table. Quite a lot of churn because the encoding of most hardware opcodes has changed unfortunately. v2: Split dot-product description fixes to separate patch (Caio). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	31182e7aa9	intel/eu/gen11+: Mark dot product opcodes as unsupported on opcode_descs table. These instructions have been removed from the hardware. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	c742be1437	intel/eu/gen12: Implement datatype binary encoding. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Sagar Ghuge	a12533f2ce	intel/eu/gen12: Implement immediate 64 bit constant encoding. On Gen12, 64 bit immediate constants are loaded in reverse order. Lower 32 bit gets loaded from bit 96-127 and higher 32 bits from 64-95 in instruction encoding. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Co-authored-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	5291283af0	intel/eu/gen12: Implement compact instruction binary encoding. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	77d09d0d50	intel/eu/gen12: Implement indirect region binary encoding. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	81400470be	intel/eu/gen12: Implement SEND instruction binary encoding. v2: Fix off-by-one upper GET_BITS() bound, combine 25-29 and 30-31 descriptor fields (Ken). Shorten name of GEN12_MD() macro, drop some removed TS message descriptor fields (Jordan). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	d24b8af23d	intel/eu/gen12: Implement control flow instruction binary encoding. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	956c156dc4	intel/eu/gen12: Implement three-source instruction binary encoding. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	fa48281795	intel/eu/gen12: Implement basic instruction binary encoding. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	143176163d	intel/eu/gen12: Add sanity-check asserts to brw_inst_bits() and brw_inst_set_bits(). These caught a few bugs during the development of this series. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	7e5a8638d3	intel/eu/gen12: Extend brw_inst.h macros for Gen12 support. The encoding of almost every instruction field has changed in Gen12, so this involves adding a Gen12+ bitfield spec to every brw_inst macro. In addition some new macros are required to handle certain discontiguous and variable-length fields. This commit doesn't actually include the Gen12 updated bitfield specs, only the macros are extended here for reviewability. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Rename FDC() to FFDC() and FDC1() to FDC() for consistency with the existing F() and FF() macros.	2019-10-11 12:24:16 -07:00
Francisco Jerez	6965a02e09	intel/ir: Represent physical edge of unconditional CONTINUE instruction. This edge doesn't exist in the original scalar program, but it represents a potential control flow path the EU will take in cases where control flow isn't uniform across channels of the same SIMD thread. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	eeaad2992c	intel/ir: Represent physical edge of ELSE instruction. This edge doesn't exist in the original scalar program, but it represents a potential control flow path the EU will take in cases where the condition isn't uniform across channels of the same SIMD thread. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	152754665a	intel/ir: Represent logical edge of BREAK instruction. Currently only the physical back-edge is represented, which incidentally also leads to the exit block of the loop, but we need the direct logical edge in addition for our logical CFG representation to be complete. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	c344c92b31	intel/ir: Add helper function to push block onto CFG analysis stack. Requested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	d6a9731d8f	intel/ir: Represent physical and logical subsets of the CFG. This represents two control flow graphs in the same cfg_t data structure: The physical CFG that will include all possible control flow paths the EU can physically take, and the logical CFG restricted to the control flow paths that exist in the original scalar program. The latter is a subset of the former because in case of divergence the SIMD vectorized program will take control flow paths that aren't part of the original scalar program. The bblock_link constructor and bblock_t::add_successor() now take a "kind" parameter that specifies whether the edge is purely physical or whether it's part of both the logical and physical CFGs (a logical edge is of course always guaranteed to be in the physical CFG as well). bblock_t::is_predecessor_of() and ::is_successor_of() also take a kind parameter specifying which CFG is being queried. The '~>' notation will be used now in order to represent purely physical edges in IR dumps. This commit doesn't actually add nor remove any edges from the CFG (the only edges marked as purely physical here are the two WHILE loop ones that already existed). Optimization passes should continue using the same (incomplete) physical CFG they were using before until they're fixed to do something smarter in a later commit, so this shouldn't lead to any functional changes. v2: Remove tabs from lines changed in this file (Caio). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	1b570456ca	intel/ir: Drop hard-coded correspondence between IR and HW opcodes. Having the IR opcodes locked to their hardware representation is risky because it causes opcodes as different as BRC and IFF to compare equal at the IR level (luckily the back-end only ever uses one opcode from each group, right now), and it prevents us from supporting instructions that change their hardware representation across generations, which will become a problem on Gen12+ platforms. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	057902dcf8	intel/eu: Encode and decode native instruction opcodes from/to IR opcodes. Change brw_inst_set_opcode() and brw_inst_opcode() to call brw_opcode_encode/decode() transparently in order to translate between hardware and IR opcodes, and update the EU compaction code in order to do the same as needed, so we can eventually drop the one-to-one correspondence between hardware and IR opcodes. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	25dd67099d	intel/eu: Rework opcode description tables to allow efficient look-up by either HW or IR opcode. This rewrites the current opcode description tables as a more compact flat data structure. The purpose is to allow efficient constant-time look-up by either HW or IR opcode, which will allow us to drop the hard-coded correspondence between HW and IR opcodes -- See the next commits for the rationale. brw_eu.c is now built as C++ source so we can take advantage of pointers to member in order to make the look-up function work regardless of the opcode_desc member used as look-up key. v2: Optimize devinfo struct comparison (Caio) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	51dc40cefb	intel/eu: Fix up various type conversions in brw_eu.c that are illegal C++. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-11 12:24:16 -07:00
Francisco Jerez	35bcd08d61	intel/eu: Split brw_inst ex_desc accessors for SEND(C) vs. SENDS(C). The brw_inst opcode accessors are going away in one of the following commits. We could potentially replace them with the new helpers that do opcode remapping, but that would lead to a circular dependency between brw_inst.h and brw_eu.h. This way we also avoid ordering issues that can cause the semantics of the ex_desc accessors to change depending on whether the ex_desc field is set after or before the opcode instruction field. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	b2ae65c7d9	intel/fs: Fix constness of implied_mrf_writes() argument. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	6f275a863d	intel/fs: Define is_send() convenience IR helper. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	f326d9d218	intel/fs: Define is_payload() method of the IR instruction class. This is required because SEND message payload sources are fetched asynchronously by the hardware, which can lead to WaR data corruption on Gen12+ platforms if not handled specially by the compiler to guarantee proper synchronization. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-11 12:24:16 -07:00
Francisco Jerez	a42581fa8f	intel/fs: Teach fs_inst::is_send_from_grf() about some missing send-like instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-11 12:24:16 -07:00
Bas Nieuwenhuizen	6da3bf2600	nir/dead_cf: Remove dead control flow after infinite loops. And after discard-only loops. Otherwise we end up with dead code which confuses nir_repair_ssa into adding a whole bunch of uses of undefined. However, for derefs, we sometimes always expect to get a variable instead of undefined. Fixes dEQP-VK.graphicsfuzz.write-red-in-loop-nest on radv. Fixes: `c832820ce9` "nir/dead_cf: Repair SSA if the pass makes progress" Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1928 Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-10-11 17:24:26 +02:00
Rhys Perry	f13ad839f1	aco: don't use p_as_uniform for vgpr sampler/image indices p_as_uniform can get CSE'd, which can be incorrect and break some dEQP-VK.descriptor_indexing.* tests. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-11 14:26:58 +00:00
Rhys Perry	0c3fe323b6	aco: implement divergent vulkan_resource_index Fixes the UBO/SSBO dEQP-VK.descriptor_indexing.* tests v2: remove bld.copy() usage Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-11 14:26:58 +00:00
Rhys Perry	5526a557ee	aco: readfirstlane vgpr pointers in convert_pointer_to_64_bit() This can happen when bcsel is used between the results of two vulkan_resource_index. It's also probably needed for non-uniform descriptor indexing Fixes dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.compute.reads_opselect_two_buffers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-11 14:26:58 +00:00
Rhys Perry	45d6c69b99	aco: use can_accept_constant in valu_can_accept_literal Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-11 14:26:58 +00:00
Rhys Perry	b37857bcea	aco: don't apply sgprs/constants to read/write lane instructions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-11 14:26:58 +00:00
Rhys Perry	599d634c2c	nir/lower_input_attachments: pass on non-uniform access flag Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-11 14:26:58 +00:00
Rhys Perry	5ef04d7982	nir/lower_non_uniform: lower image/texture instructions taking derefs v2: always assert on the texture/sampler handle's num_components v3: replicate the deref inside the loop v4: remove a case of useless line wrapping Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-11 14:26:58 +00:00
Jonathan Marek	7e3b900c80	etnaviv: rework etna_resource_create tiling choice Now that the base resource is allowed to be incompatible with PE, we can make a smarter choice of tiling mode to avoid allocating a PE compatible base that is never used for regular textures. This affects GPUs like GC2000 where there is no tiling compatible with both PE and TE. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-10-11 07:26:52 -04:00
Jonathan Marek	b962776530	etnaviv: rework compatible render base For PE-incompatible layouts, use a mechanism similar to what texture does to create a compatible base resource. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-10-11 07:26:52 -04:00
Jonathan Marek	e7e02435a8	etnaviv: get addressing mode from tiling layout Remove the "addressing_mode" state, which is currently set incorrectly, and instead deduce the addressing mode from the tiling layout. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-10-11 07:26:52 -04:00
Jonathan Marek	5403b36653	etnaviv: clear texture cache and flush ts when texture is modified Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-10-11 07:26:52 -04:00
Christian Gmeiner	6dc650fe71	etnaviv: output the same shader-db format as freedreno, v3d and intel This lets us reuse their report.py. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-10-11 12:35:15 +02:00
Christian Gmeiner	140bc0f040	etnaviv: nir: start to make use of compile_error(..) Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-10-11 11:37:03 +02:00
Michel Dänzer	d60b8679a4	gitlab-ci: Disable meson-mingw32-x86_64 job again for now The wrapdb.mesonbuild.com SSL certificate expired, causing the job to fail: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/731864 Switching to http:// doesn't avoid it: https://gitlab.freedesktop.org/daenzer/mesa/-/jobs/732043	2019-10-11 11:10:01 +02:00
Michel Dänzer	eb86cbabe6	gitlab-ci: Add .use-debian-10 template It simplifies the definitions of jobs using the Debian 10 image. The needs: was previously missing from the llvmpipe/softpipe test jobs, so they could spuriously run if the debian-10 job failed or was cancelled. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-11 10:05:21 +02:00
Michel Dänzer	9691329727	gitlab-ci: Remove redundant .meson-cross template script It was identical to the one inherited from the .meson-build template. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-11 10:04:45 +02:00
Dave Airlie	f59ff014b1	gallivm: fix coroutines on aarch64 with llvm 8 The coroutine split pass is missing a dependency before LLVM 9.0, and fails to initialise properly if the CallGraphWrapperPass hasn't been initialised earlier (x86 does it due to some of its passes requiring it). This is a workaround for llvm 8 (coroutines are only supported in 8 and higher). It adds another pass that has a dependency on the pass the coroutines split requires. This pass shouldn't have any real effects. Fixes: `d32690b43c` (gallivm: add coroutine pass manager support) Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-10-11 12:15:45 +10:00
Dave Airlie	05b008c961	llvmpipe: add support for tg4 component selection. This is needed as part of GLES3.1 and helps for ARB_gpu_shader5. Fixes: KHR-GLES31.core.texture_gather.* cases Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-10-11 00:32:15 +00:00
Dave Airlie	a70f0a8841	st/glsl: add support for alternate TG4 encoding. This will encode the component selection value (0, 1, 2, 3) into the X swizzle of the sampler, if the driver requests it. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-10-11 00:32:15 +00:00
Dave Airlie	0c09df52e1	gallium: add a new cap for changing the TGSI TG4 instruction encoding Accessing the TG4 component via immediates in the llvmpipe backend is quite messy (like really messy). Roland suggested we change the instruction encoding, so introduce a cap to allow the component to be selected to be stored in the sampler swizzle, which should be otherwise unused. I could probably switch all drivers over, but virgl would need some work that I'd prefer not to rush. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-10-11 00:32:15 +00:00
Dave Airlie	1e65757f4e	gallivm/sample: add gather component selection to the key. This allows for component selection to work as per ARB_gpu_shader5/GLES3.1 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-10-11 00:32:15 +00:00
Roland Scheidegger	5084e9785b	llvmpipe: increase max texture size to 2GB The 1GB limit was arbitrary, increase this to 2GB (which is the max possible without code changes). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-10-11 01:41:08 +02:00
Dylan Baker	d905d9b600	gitlab-ci: Add a mingw x86_64 job Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1) Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:05 -07:00
Dylan Baker	f066c96078	appveyor: Add support for meson as well as scons on windows This job uses the vs2017 backend of meson (msbuild) as opposed to the ninja backend used on MacOS and Linux. v7: - rebase on master - remove llvm (we'll add that back later) - remove cygwin (we'll add that back later too) v6: - rebase on master, including the addition of cygwin - consolidate 3 appveyor patches into this one patch v5 - use the new b_vscrt option instead of manually specifying the crt v4: - rebase on python3 generators - cache meson wraps - Build x86 instead of x86_64, since that's what the pre-built LLVM is - update to vs2017 from vs2015 - set the default-library to static - use the new vscrt override - add the /m switch to msbuild to make the build somewhat faster Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:05 -07:00
Dylan Baker	44c5e634a5	docs: update meson docs for windows Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:05 -07:00
Dylan Baker	638868bbff	glsl/tests: Handle no-exec errors Currently meson doesn't correctly handle passing compiled binaries to scripts in tests. This patch looks to the future (0.53) when meson will have this functionality, but also immediately it fixes these tests in cross compiles by causing them to return 77, which meson interprets as skip. Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:05 -07:00
Dylan Baker	1bf5e5a011	meson/util: Don't run string_buffer tests on mingw They succeed with MSVC but not with MinGW. I don't understand why they fail. Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	09d21b554a	meson: glcpp tests are expected to fail on windows v2: - Exclude the tests rather than xfail them Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	8f363ce5b5	meson: only build timespec test if timespec is available Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	fe8f8981d0	meson: don't error on formatters with mingw MSVC is generally happy, but mingw errors. I've spent as much time (several days) trying to squash all of these warnings and I'm done with it, just leave them as warnings with MinGW. Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	1e2c05b82a	meson: add msvc compat args to swr This has always been present in the scons build, so it should be in the meson build as well. Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	63f5aee694	meson: maintain names of shared API libraries Mesa uses the lib prefix, and doesn't use a version for its dynamic libraries, which meson defaults to. v2: - this patch Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	e1dbf10749	meson: don't build or run mesa-sha1 test on windows It crashes hard (pop-up window and all). v2: - Change comment to FIXME Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	b6b59813c3	meson: disable graw tests on mingw I can't figure out why symbols are being exposed that shouldn't. v2: - change comment to FIXME Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	56db696875	meson: don't build gallium trivial tests on windows They require the pipe-loaders, which require xmlconfig, which doesn't build with msvc. Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	880ca3c964	meson: Set visibility and compat args for graw Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	095bdbda2b	meson: Add msvc compat args to util/tests To keep this building with msvc Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	00fca07c3b	meson: Add idep_getopt for tests There are quite a few tests that require getopt, when using MSVC we need to use the bundled version of getopt since there isn't a system version. Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	8eee724b73	meson: don't define USE_ELF_TLS for windows Because the macros for exporting dll symbols and using TLS are mutually exclusive. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	3740ffb59c	meson: add switches for SWR with MSVC This makes two changes for SWR. The first is that it reorders the arguments to try to put the ICL ones first. This is required to support older versions of meson that don't add enough "error in this case" switches to ICL, which causes it to happily accept -mavx (for example) even though it doesn't support them, resulting in compilation failures. The second is to fix the names of the libraries, setting the soversion to '' will result in <lib>.dll, instead of <lib>-0.dll. Since these are not versioned dll's, but implement an internal API we should communicate that. It's also what scons does. Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	d2cb0a59ce	meson: disable sse4.1 optimizations with msvc There isn't an obvious command line switch here, /arch:AVX might be the right thing, but meson doesn't know what to do here either and leaves the -msse4.1 and -mstackrealign. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	150aec5d1f	meson: force inclusion of inttypes.h for glcpp with msvc Because we provide a copy if MSVC doesn't, and we need it to make flex do what we want. Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	8b19c5b145	meson: Add support for using win_flex and win_bison on windows Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	81d44c01ee	meson: don't look for rt on windows v6: - Minor refactor Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	e3f5c3232c	meson: fix pipe-loader compilation for windows v2: - Add missing D to pound define - Simply define the variable rather than set it to 1 (mirrors android.mk not scons) Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	474d6f8e08	util/xmlconfig: include strndup.h for windows Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	7ef85a0d92	meson: Don't check for posix_memalign on windows There's a mingw bug for this, it exports __builtin_posix_memalign but not posix_memalign, so the check will succeed, but compiling will fail. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	597a063551	meson: fix gallium-osmesa to build for windows v2: - set so_version to '' (only affects windows) - always set lib prefix to 'lib', even on msvc v5: - key NO_EXPORTS on shared glapi instead of gles. Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	b97a341017	meson: build graw-gdi target Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	a2c79cc3cb	meson: build libgl-gdi target v4: - Fix check for broken mingw (should be for x86 not x86_64) - Add comment about why check is needed Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	3c8934343b	meson: build wgl state tracker v4: - Handle enable gles properly - Add comments about what various #defines do v5: - key NO_EXPORTS on shared glapi instead of gles. Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	560cdabebe	meson: build gallium gdi winsys v6: - use null_dep instead of [] Reviewed-by: Eric Anholt <eric@anholt.net> (v5) Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	2a53a06793	meson: Add necessary defines for mesa_gallium on windows v4: - Retain scons comments for windows specific defines v5: - key GLAPI_NO_EXPORTS off of shared-glapi instead of gles Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	2e17348600	meson: Add windows defines to glapi These are needed to control the export of symbols due to differences between the way windows and *nix handle symbol exports. Reviewed-by: Eric Anholt <eric@anholt.net> (v2) Acked-by: Kristian H. Kristensen <hoegsberg@google.com> v5: - key NO_EXPORT off of shared-glapi instead of gles	2019-10-10 16:33:04 -07:00
Dylan Baker	3aee462781	meson: add windows compiler checks and libraries v4: - Fix typo in warning code (4246 -> 4267) - Copy comments from scons for what MSVC warnings codes do - Merge linker argument changes into this commit v5: - Add /GR- on windows if LLVM is built without rtti (equivalent to GCC's -fno-rtti) - Add /wd4291, which is catching the same thing that -Wno-non-virtual-dtor is on GCC/Clang Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Dylan Baker	16bc3073cb	util: use _WIN32 instead of WIN32 MinGW defines only _WIN32, but doesn't have fcntl, so we need to use the windows path. Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Acked-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-10 16:33:04 -07:00
Rob Clark	f1fe656a92	freedreno/ir3: handle multi component alu src when propagating shifts Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-10-10 23:12:05 +00:00
Rob Clark	61a0a86d28	freedreno/ir3: drop unused param Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-10-10 23:12:05 +00:00
Marek Olšák	c38c8d012e	clover: fix the nir_serialize build failure Fixes: `dd4cc56ebd` "nir: add a strip parameter to nir_serialize"	2019-10-10 18:44:40 -04:00
Dave Airlie	1b221f4e7b	llvmpipe/draw: handle UBOs that are < 16 bytes. Not sure if this is a bug in the user or not, but some CTS tests fail due to using an 8 byte constant buffer. Fixes: KHR-GLES31.core.layout_binding.block_layout_binding_block_VertexShader Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-10-10 21:52:20 +00:00
Dave Airlie	744b8936df	llvmpipe/draw: fix image sizes for vertex/geometry shaders. since images are a single level, minify before passing the w/h to draw. Fixes: KHR-GLES31.core.shader_image_size.basic-nonMS-vs-* Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-10-10 21:52:20 +00:00
Dave Airlie	7cac880831	llvmpipe: make texture buffer offset alignment == 16 Due to the use of vmovdqa instructions in the asm, which require 16-byte aligned buffers. This fixes a crash in KHR-GLES31.core.texture_buffer.texture_buffer_texture_buffer_range Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-10-10 21:52:20 +00:00
Eric Engestrom	34ba363ab0	meson: skip installation of GLVND-provided headers Fixes: `93df862b6a` ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1846 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-10 22:33:09 +01:00
Eric Engestrom	1a7e9652c4	meson: split Mesa headers as a separate installation Fixes: `93df862b6a` ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-10 22:33:05 +01:00
Eric Engestrom	daae003f47	meson: split headers one per line Fixes: `93df862b6a` ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-10 22:21:00 +01:00
Eric Engestrom	b9a5fb1f05	meson: move a couple of include installs around Preparation for a later commit. Fixes: `93df862b6a` ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-10 22:18:04 +01:00
Eric Engestrom	b57fa7ca49	meson: rename `glvnd_missing_pc_files` to `not glvnd_has_headers_and_pc_files` This reflects better what is provided by glvnd or not. Fixes: `93df862b6a` ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-10 22:18:04 +01:00
Eric Engestrom	a0829cf23b	GL: drop symbols mangling support SCons and Meson have never supported that feature, and Autotools was deleted over 6 months ago and no-one complained yet, so it's pretty obvious nobody cares about it. Fixes: `95aefc94a9` ("Delete autotools") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-10 21:40:48 +01:00
Rhys Perry	2026ff5165	aco: update print_ir Mostly adds GFX10 stuff. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-10-10 20:02:36 +00:00
Rhys Perry	283eda71cf	aco: rework scratch resource code Fix compute, cleanup and add GFX10 support. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-10-10 20:02:36 +00:00
Rhys Perry	f64b1a3454	aco/gfx10: disable GFX9 1D texture workarounds Navi added back support for 1D textures. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-10-10 20:02:36 +00:00
Rhys Perry	de0748c42e	aco/gfx10: fix inline uniform blocks Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-10-10 20:02:36 +00:00
Rhys Perry	ba71be228f	radv/aco: disable NGG when ACO is used Note that radv_device.c still has to be modified to use ACO with Navi. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-10-10 20:02:36 +00:00
Marek Olšák	b7fc082b28	ac/nir: add back nir_op_fmod radeonsi doesn't lower it for doubles. This partially reverts commit `d861401554`. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-10 15:57:50 -04:00
Marek Olšák	09e0e4c93c	gallium: remove PIPE_SHADER_CAP_SCALAR_ISA Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-10 15:49:19 -04:00
Marek Olšák	1f718bfc78	tgsi_to_nir: use nir_shader_compiler_options::lower_to_scalar Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-10 15:49:19 -04:00
Marek Olšák	e4f7d2576e	st/mesa: use nir_shader_compiler_options::lower_to_scalar Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-10 15:49:19 -04:00
Marek Olšák	cebc38ff60	nir: add nir_shader_compiler_options::lower_to_scalar This will replace PIPE_SHADER_CAP_SCALAR_ISA. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-10 15:49:18 -04:00
Marek Olšák	7fc5919793	tgsi_to_nir: add #ifdef header guards Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-10 15:49:18 -04:00
Marek Olšák	e5209e6a95	nir/drawpixels: fix what appears to be a copy-paste bug in get_texcoord_const Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-10 15:49:18 -04:00
Marek Olšák	e621b30787	nir/drawpixels: handle load_color0, load_input, load_interpolated_input for radeonsi Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-10 15:49:18 -04:00
Marek Olšák	3340c066a1	nir: move gl_nir_opt_access from glsl directory Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-10 15:49:18 -04:00
Marek Olšák	dd4cc56ebd	nir: add a strip parameter to nir_serialize so that drivers don't have to call nir_strip manually. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-10-10 15:47:07 -04:00
Bas Nieuwenhuizen	e6986bcb73	radv: Enable VK_ANDROID_external_memory_android_hardware_buffer. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	e92b9c5f4f	radv: Check the size of the imported buffer. This is a security feature to disallow malicious apps from passing a buffer that is too small. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	dad047a56a	radv: Expose image handle compat types for Android handles. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	1b0ceba925	radv: Allow Android image binding. Using delayed layout of images. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	83a012b603	radv/android: Add android hardware buffer import/export. Support does not include images yet. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	adad61239c	radv: Deal with Android external formats. To abstract things a bit, this adds a helper function in radv_android.c. However, this means we have to link in radv_android.c on non-android as well, which means some scaffolding changes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	041fc7beb8	radv: Derive android usage from create flags. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	53b1372571	radv: Disallow sparse shared images. Since we really cannot share them ever. Also remove an unused switch. Fixes: `b70829708a` "radv: Implement VK_KHR_external_memory" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	f36b52740a	radv/android: Add android hardware buffer queries. Derived from the Intel code. For the internal format we just use the internal Vulkan format, as we have Vulkan formats for all android formats we care about. For the ycbcr properties we just do something. I do not have a real clue what would be recommended. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	a34e4dd0d2	radv/android: Add android hardware buffer field to device memory. You cannot go from BO to Android hardware buffer, so for export we have to remember it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	9ea72b5337	radv: Add VK_ANDROID_external_memory_android_hardware_buffer. Still disabled but now we can add entrypoints. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	4a495e1a85	radv: Unset vk_info in radv_image_create_layout. For better test coverage of this corner case. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	64768111c3	radv: Handle slightly different image dimensions. The minigbm comment really says it all. We should fix minigbm as well, but for now this is the more robust solution. Note that this only changes width and height for the surface creation, not for the image and hence also not for the sampler, where it would wreak havoc due to the normalized coords. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	852c64ca65	radv: Delay patching for imported images until layout time. We want this flexibility because in GFX10 we lose any stride fields, so we have to make sure our width/height are in alignment with the external image we import. Furthermore, we need the ability to inject tiling modifiers on import time which is strictly after create time for Android. So, with the layout & patch functions being fully independent of pCreateInfo, we can delay it until import/bind time. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	2ab4d418f9	radv: Split out layout code from image creation. So we can delay the layout until later in some import cases. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	825ddfee59	radv: Handle device memory alloc failure with normal free. Less duplication/complexity. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen	e1469c02cf	radv: Cleanup buffer_from_fd. Unused stride/offset args. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 17:02:34 +00:00
Tomeu Vizoso	6397dff6d7	gitlab-ci/lava: Test Lima driver with dEQP Run dEQP on boards with Mali 400 and 450 in Baylibre's lab. There's lots of skipped tests because of crashes and undetermined behavior. May be a good idea to run the tests with valgrind and fix any issues found. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>	2019-10-10 14:50:14 +00:00
Tomeu Vizoso	8a168683d0	gitlab-ci/lava: Use files to list tests to skip As the non-LAVA runner script does, have per-GPU version files listing the tests that are to be skipped, due to being very slow, unstable, etc. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>	2019-10-10 14:50:14 +00:00
Rafael Antognolli	01122a78b3	intel/tools: Support multiple contexts in intel_dump_gpu. Create basic aub_context on GEM_CONTEXT_CREATE. Set it up and submit a context + ring + pphwsp during execbuf submission, if it has not been initialized yet. v2: Write the HWSP only once per engine (Lionel). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-10 14:08:50 +00:00
Rafael Antognolli	12feafc28e	intel/tools: Add basic aub_context code and helpers. v2: - Only dump context if there were no errors (Lionel). - Store counter for context handles in aub_file (Lionel). v3: - Add a comment about aub_context -> GEM context (Lionel). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-10 14:08:50 +00:00
Rafael Antognolli	472de61187	intel/tools: Use common code for GGTT address allocation. We want to be able to create contexts on demand, and increase the GGTT as needed for that. Use the aub_map_ggtt() function for that. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-10 14:08:50 +00:00
Rafael Antognolli	9968316ed0	intel/tools: Factor out GGTT allocation. We want to reuse it in execlists_setup(). v2: Rename it to write_ggtt_ptes() (Lionel). v3: Rename it to aub_map_ggtt() (Lionel). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-10 14:08:50 +00:00
Bas Nieuwenhuizen	a9687c4e05	radv: Implement & enable VK_EXT_texel_buffer_alignment. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-10 13:24:16 +00:00
Samuel Pitoiset	9d17d97ee4	radv: use a compute shader for copying timestamp query results When the timestamp is not ready (i.e. UINT64_MAX), the availability bit should be zero. The previous code used to copy the timestamp value as the availability bit and that's completely wrong. Because it's not that simple to emit a conditional with the CP, the driver now uses a compute shader for copying timestamp query results. Fixes dEQP-VK.pipeline.timestamp.misc_tests.reset_query_before_copy. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-10 13:23:22 +02:00
Samuel Pitoiset	dad80eadb2	radv: sync before resetting query pools if timestamps have been written Otherwise, the GPU might write timestamp queries after the reset operation. This is similar to other query operations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-10 13:23:20 +02:00
Timur Kristóf	aa75be05af	aco: Clean up usages of PhysReg::reg from aco_assembler. These are not needed anymore, since PhysReg has an implicit conversion operator that can convert it to unsigned int, which is equivalent to accessing this field. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Timur Kristóf	d729d8f1dc	aco: Add extra assertion for number of FS input VGPRs. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Timur Kristóf	a89153d038	aco: Fix s_dcache_wb on GFX10. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Rhys Perry	68c9554732	aco: Have s_waitcnt_vscnt write to NULL. Not sure if this instruction actually writes anything, but LLVM disassembles a destination and sets it to NULL. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Rhys Perry	619f0a71cc	aco: Use the VOP3-only add/sub GFX10 instructions if needed. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Rhys Perry	6a6bef59b0	aco: Initial work to avoid GFX10 hazards. Currently just breaks up SMEM groups and fixes FeatureVMEMtoScalarWriteHazard (name from LLVM). Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Rhys Perry	d63c175897	aco: pad code with s_code_end on GFX10 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Rhys Perry	83993f535e	aco: workaround GFX10 0x3f branch bug According to LLVM, branches with an offset of 0x3f are buggy. v2: (by Timur Kristóf) - extract the GFX10 specific part to its own function Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Timur Kristóf	0be1dd8564	aco: Fix VS input VGPRs on GFX10. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Rhys Perry	c24cd97515	aco: Assemble opsel in VOP3 instructions. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Rhys Perry	818bdab796	aco: Allow literals on VOP3 instructions. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-10-10 09:57:53 +02:00
Timur Kristóf	7cf1dcf22d	aco: Support subvector loops in aco_assembler. These are currently not used, but could be useful later. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Timur Kristóf	21f1953383	aco: Set GFX10 dimensionality on the instructions that need it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Timur Kristóf	eaa2a7cdf6	aco: Use ac_get_sampler_dim, delete duplicate code. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Timur Kristóf	1de9ef9c96	aco: Set GFX10 DLC bit properly. The DLC bit is now set to 1 for all loads when GLC is also set, but cleared to 0 for all stores (otherwise it causes issues), and also cleared to 0 for atomics. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Timur Kristóf	89b074be86	aco: Support GFX10 VOP3 and VOP1 as VOP3 in aco_assembler. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Timur Kristóf	d3a48c272f	aco: Support GFX10 EXP in aco_assembler. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Timur Kristóf	e6330d71b5	aco: Fix GFX9 FLAT, SCRATCH, GLOBAL instructions, add GFX10 support. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Timur Kristóf	64d74ca816	aco: Support GFX10 MIMG and GFX9 D16 in aco_assembler. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Timur Kristóf	c0df15e645	aco: Support GFX10 MTBUF in aco_assembler. Also remove img_format from aco_ir, since it can be calculated from dfmt and nfmt. So only the assembler needs to deal with it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Timur Kristóf	e96124bd65	aco: Link ACO with amd/common. We'd like to use some functions, for example some ac_shader_util functions in ACO, so we need to link ACO to AC. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:52 +02:00
Timur Kristóf	c57503b932	amd/common: Add extern "C" to some headers that were missing it. We'd like to include some of these in C++ code later. Specifically, ACO is written in C++ and we would like to use some of this code in ACO in order to avoid code duplication. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:52 +02:00
Timur Kristóf	9e27816252	aco: Support GFX10 MUBUF in aco_assembler. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:52 +02:00
Timur Kristóf	6106d4bce9	aco: Support GFX10 DS in aco_assembler. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:52 +02:00
Timur Kristóf	bbe87eb6c3	aco: Support GFX10 VINTRP in aco_assembler. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:52 +02:00
Timur Kristóf	b6235651b9	aco: Support GFX10 SMEM in aco_assembler. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:52 +02:00
Timur Kristóf	fd1d947457	aco: Add missing GFX10 specific fields and some README notes. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:52 +02:00
Timur Kristóf	a01d796de4	aco: Set +wavefrontsize64 for LLVM disassembler in GFX10 wave64 mode. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:52 +02:00
Alejandro Piñeiro	fa41a51891	v3d: take into account prim_counts_offset Specifically when reading the primitive counters. This fixed ~700 CTS tests using this pattern: dEQP-GLES3.functional.transform_feedback.* when run after tests like dEQP-GLES3.functional.prerequisite.read_pixels on the same caselist. When run individually those tests were passing because prim_counts_offset was zero. Fixes: `0f2d1dfe65` ("v3d: use the GPU to record primitives written to transform feedback") Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-10-10 09:51:50 +02:00
Samuel Pitoiset	42b2d1119a	radv: get the device name from radeon_info::name Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-10 08:15:41 +02:00
Dave Airlie	b1f3173d0f	st/mesa: fix R8 bitmap texture for TGSI paths. The initial patch only fixed up the NIR path, but forgot the TGSI path needed fixing as well. Fixes: `f92226931b` ("st/mesa: Prefer R8 for bitmap textures") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-10 10:22:37 +10:00
Jason Ekstrand	c7e5d24d8f	anv/pipeline: Capture serialized NIR This allows the serialized NIR to be displayed in RenderDoc and similar tools. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-09 22:28:01 +00:00
Matt Turner	b2f6fda542	clover: Remove unused code Fixes: `96b592696f` ("gallium: Require LLVM >= 3.9") Bug: https://bugs.gentoo.org/685678	2019-10-09 14:54:07 -07:00
Greg V	6da865bcfe	clover: use iterator_range in get_kernel_nodes With libc++ (LLVM's STL implementation), the original code does not compile because an appropriate vector constructor cannot be found (for the _ForwardIterator one, requirement is_constructible is not satisfied).	2019-10-09 14:54:07 -07:00
Marek Olšák	aed1f7ad34	radeonsi: enable MSAA shader images Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-09 17:12:38 -04:00
Marek Olšák	095a58204d	radeonsi: expand FMASK before MSAA image stores are used Image stores don't use FMASK, so we have to turn it into identity. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-09 17:12:36 -04:00
Marek Olšák	98b88cc1f6	radeonsi: apply FMASK to MSAA image loads Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-09 17:12:34 -04:00
Marek Olšák	c0575a6241	radeonsi: clean up image_fetch_rsrc Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-09 17:12:33 -04:00
Marek Olšák	743a9d85e2	radeonsi: add FMASK slots for shader images (for MSAA images) Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-09 17:12:31 -04:00
Marek Olšák	1881b35bf6	radeonsi: set the sample index for shader images correctly Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-09 17:12:30 -04:00
Marek Olšák	0a0def7317	radeonsi: fix GLSL imageSamples() We haven't supported MSAA images, so it doesn't matter much. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-09 17:12:28 -04:00
Marek Olšák	279da8a201	tgsi/scan: add tgsi_shader_info::msaa_images_declared Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-09 17:12:27 -04:00
Marek Olšák	e26bd397a8	nir: add shader_info::last_msaa_image for radeonsi Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-09 17:12:19 -04:00
Marek Olšák	e4f4bb8abd	radeonsi: don't set BO metadata for non-zero planes pointed out by Bas	2019-10-09 17:06:54 -04:00
Marek Olšák	28da990bed	radeonsi: ignore metadata for non-zero planes Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-09 17:06:54 -04:00
Marek Olšák	86e60bc265	radeonsi: remove si_vid_join_surfaces and use combined planar allocations Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-09 17:06:54 -04:00
Marek Olšák	0f7c9dad44	radeonsi: allocate planar multimedia formats in 1 buffer Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-09 17:06:54 -04:00
Marek Olšák	35680bfea1	vl: use u_format in vl_video_buffer_formats Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-09 17:06:54 -04:00
Marek Olšák	a122e70858	gallium/u_tests: test NV12 allocation and export Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-09 17:06:54 -04:00
Marek Olšák	20f132e5ef	gallium/util: add planar format layouts and helpers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-09 17:06:54 -04:00
Marek Olšák	3d06b9952c	gallium/util: remove enum numbering from util_format_layout Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-09 17:06:54 -04:00
Caio Marcelo de Oliveira Filho	9b58863f87	i965: Disable fast clears when running with INTEL_DEBUG=nofc Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-09 13:29:26 -07:00
Caio Marcelo de Oliveira Filho	bb9af8abbd	iris: Disable fast clears when running with INTEL_DEBUG=nofc Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-09 13:29:26 -07:00
Caio Marcelo de Oliveira Filho	44978baece	anv: Disable fast clears when running with INTEL_DEBUG=nofc Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-09 13:29:26 -07:00
Caio Marcelo de Oliveira Filho	d438261e05	intel: Add INTEL_DEBUG=nofc for disabling fast clears Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-10-09 13:29:26 -07:00
Maya Rashish	e0d89b90d4	llvmpipe: avoid left-shifting a negative number. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Maya Rashish <coypu@sdf.org>	2019-10-09 20:20:40 +00:00
Danilo Spinella	962aca1910	egl: Include stddef.h in generated source Required for NULL macro used throughout the generated file. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-09 13:16:38 -07:00
OBATA Akio	1ee4258383	util: fix to detect NetBSD properly <sys/param.h> is required for NetBSD version detection, and __NetBSD__ must be used to detect even on older releases. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-10-09 13:01:17 -07:00
Jan Beich	6ea0a918bb	util: simplify BSD includes Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jan Beich <jbeich@FreeBSD.org>	2019-10-09 12:55:15 -07:00
Jan Beich	e892d9337f	util: detect AltiVec at runtime on BSDs Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jan Beich <jbeich@FreeBSD.org>	2019-10-09 12:55:13 -07:00
Jan Beich	8d2dd1f4f3	util: skip AltiVec detection if built with -maltivec Helps platforms where runtime detection isn't implemented. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jan Beich <jbeich@FreeBSD.org>	2019-10-09 12:55:11 -07:00
Jan Beich	601a098338	util: detect NEON at runtime on FreeBSD Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jan Beich <jbeich@FreeBSD.org>	2019-10-09 12:55:10 -07:00
Jan Beich	7d5ad8e77e	util: skip NEON detection if built with -mfpu=neon Helps platforms where runtime detection isn't implemented. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jan Beich <jbeich@FreeBSD.org>	2019-10-09 12:55:00 -07:00
Adam Jackson	5218c3b27e	egl: Make native display detection work more than once eglGetDisplay is awful because you have to inspect the pointer you're given and guess what type of native display it corresponds to. We make it worse by caching the type of the first such display we detect, so if the second call to eglGetDisplay is to a different display type, kaboom. Fortunately this is a problem that can be solved with the delete key. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/156	2019-10-09 18:12:29 +00:00
Rhys Perry	3f6e91a8d8	aco: enable nir_opt_sink SGPRS: 880272 -> 838936 (-4.70 %) VGPRS: 705316 -> 680988 (-3.45 %) Spilled SGPRs: 1032 -> 832 (-19.38 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 252 -> 252 (0.00 %) dwords per thread Code Size: 55150788 -> 55172436 (0.04 %) bytes LDS: 451 -> 451 (0.00 %) blocks Max Waves: 66178 -> 68706 (3.82 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-09 17:55:25 +00:00
Connor Abbott	5ac32b2954	nir/sink: Don't sink load_ubo to outside of its defining loop Previously, this could have made the resource divergent in code like that which is generated by nir_lower_non_uniform_access. Fixes: `da8ed68a` ('nir: replace nir_move_load_const() with nir_opt_sink()') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-09 17:55:25 +00:00
Connor Abbott	af9296b8c0	nir/sink: Rewrite loop handling logic Previously, for code like: loop { loop { a = load_ubo() } use(a) } adjust_block_for_loops() would return the block before the first loop. Now we compute the range of allowed blocks and then walk the dominance tree directly, guaranteeing directly that we always choose a block that dominates all the uses and is dominated by the definition. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-09 17:55:25 +00:00
Marek Olšák	b049ebcf90	amd: don't use AMD_FAMILY definitions from amdgpu_drm.h use the ones from addrlib Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-09 13:27:13 -04:00
Dylan Baker	770ab4db82	docs: update calendar, add news item, and link release notes for 19.2.1	2019-10-09 10:26:23 -07:00
Dylan Baker	4c302ba941	docs: Add SHA256 sum for 19.2.1	2019-10-09 10:26:23 -07:00
Dylan Baker	970a83ef34	docs: Add relnotes for 19.2.1	2019-10-09 10:26:23 -07:00
Rhys Perry	2ea9e59e8d	aco: move s_andn2_b64 instructions out of the p_discard_if And use a new p_discard_early_exit instruction. This fixes some cases where a definition having the same register as an operand causes issues. v2: rename instruction to p_exit_early_if v2: modify the existing instruction instead of creating a new one v3: merge the "i == num - 1" IFs Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-09 16:19:02 +00:00
Daniel Schürmann	f584c42707	aco: don't reorder instructions in order to lower boolean phis Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-09 17:50:23 +02:00
Daniel Schürmann	10be90671f	aco: re-use existing phi instruction when lowering boolean phis Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-09 17:50:23 +02:00
Michael Schellenberger Costa	a607ea51a7	aco: Cleanup insert_before_logical_end Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-09 17:50:23 +02:00
Vasily Khoruzhick	c8554f849e	lima/ppir: don't clone texture loads Cloning texture loads isn't a good idea since we may move it into a block that is not shared between all the invocations of the shader. We'd like to avoid that since it may result in undefined behavior. Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-10-09 08:24:27 -07:00
Michel Dänzer	94cfe59070	gitlab-ci/lava: Add needs: for container image to test jobs Without this, the test jobs could spuriously run after the container job failed or was cancelled, even if the build job didn't run at all. Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-10-09 16:19:56 +02:00
Samuel Pitoiset	030e67fac3	radv: bump minTexelBufferOffsetAlignment to 4 The spec has probably been misinterpreted during RADV bringup. This fixes GPU hangs with dEQP-VK.binding_model.offset_nonzero. Fixes: `f4e499ec79` ("radv: add initial non-conformant radv vulkan driver") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-09 11:22:58 +00:00
Sergii Romantsov	1b21b97511	meta: leak of shader program when decompressing tex-images CC: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>	2019-10-09 10:49:08 +00:00
Erik Faye-Lund	bbdbb02a5f	mesa/main: prefer R8-textures instead of A8 for glBitmap in display lists This allows drivers to communicate that they prefer R8 textures rather than A8 for glBitmap usage. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-09 09:56:00 +02:00
Dave Airlie	f92226931b	st/mesa: Prefer R8 for bitmap textures If it's not available, we fall back to A8. This should work on all drivers, because we depend on it in the display-list code already. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-09 09:56:00 +02:00
Samuel Pitoiset	ad96c4987c	drirc: enable vk_x11_override_min_image_count for DOOM DOOM fails to handle more images than expected when the adaptive sync mode is enabled. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1902 Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-09 08:38:38 +02:00
Samuel Pitoiset	cbd6f0a0c2	radv: implement VK_KHR_shader_clock NIR->LLVM and ACO already support nir_intrinsic_shader_clock. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-09 08:43:14 +02:00
Kenneth Graunke	0b7ecfdda5	iris: Implement the Broadwell NP Z PMA Stall Fix This should help avoid stalls in the pixel mask array in certain non-promoted depth cases. It especially helps for Z16, as each bit in the PMA corresponds to two pixels when using Z16, as opposed to the usual one pixel. Improves performance in GFXBench5 TRex by 22% (n=1).	2019-10-08 21:53:12 -07:00
Caio Marcelo de Oliveira Filho	4327837be9	docs: Update recently enabled VK extensions on Intel	2019-10-08 16:34:00 -07:00
Caio Marcelo de Oliveira Filho	9560c9b498	anv: Enable VK_EXT_shader_subgroup_{ballot,vote} Anvil now supports and passes Vulkan CTS tests matching dEQP-VK.subgroups.*.ext_shader_subgroup_ballot.* dEQP-VK.subgroups.*.ext_shader_subgroup_vote.* and crucible tests matching func.shader-ballot.* func.shader-subgroup-vote.* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-08 16:34:00 -07:00
Kenneth Graunke	b453b29fc7	st/mesa: Fix inverted polygon stipple condition Fixes Piglit's gl-2.1-polygon-stipple-fs on iris. Fixes: `63f24c3c01` ("gallium: Enable MESA_framebuffer_flip_y") Reviewed-by: Fritz Koenig <frkoenig@google.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-08 16:18:13 -07:00
Fritz Koenig	63f24c3c01	gallium: Enable MESA_framebuffer_flip_y Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-08 13:53:01 -07:00
Fritz Koenig	66937abe2b	mesa: Allow MESA_framebuffer_flip_y for GLES 3 Implement glFramebufferParameteriMESA on GLES 3 so that the extension is not dependent on GLES 3.1 Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-08 13:53:01 -07:00
Fritz Koenig	9fb76392de	mesa: GetFramebufferParameteriv spelling GetFramebufferParameteriv was incorrectly spelled as GetFramebufferParameteri. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-08 13:53:01 -07:00
Fritz Koenig	ab8e5a1539	include/GLES2: Sync GLES2 headers with Khronos Bring in glFramebufferParameteriMESA/glGetFramebufferParameterivMESA Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-10-08 13:53:01 -07:00
Clément Guérin	5afbe87d21	radeonsi: enable zerovram for Rocket League Fixes corruption on game startup. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1888 Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-08 16:07:30 -04:00
Kenneth Graunke	face221283	iris: Properly unreference extra VBOs for draw parameters bound_vertex_buffers doesn't include extra draw parameters buffers. Tracking this correctly is kind of complicated, and iris_destroy_state isn't exactly in a hot path, so just loop over all VBO bindings. Fixes: `4122665dd9` (iris: Enable ARB_shader_draw_parameters support) Reported-by: Sergii Romantsov <sergii.romantsov@globallogic.com>	2019-10-08 11:14:21 -07:00
Eric Engestrom	6f26eae077	meson: fix sys/mkdev.h detection on Solaris On Solaris, sys/sysmacros.h has long-deprecated copies of major() & minor() but not makedev(). sys/mkdev.h has all three and is the preferred choice. Let's make sure we check for all 3 major(), minor() and makedev(). Reported-by: Alan Coopersmith <alan.coopersmith@oracle.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Alan Coopersmith <alan.coopersmith@oracle.com> Tested-by: Alan Coopersmith <alan.coopersmith@oracle.com>	2019-10-08 16:26:50 +01:00
Eric Engestrom	02b3aa3cf3	include: update drm-uapi `drm.h` was missing a `#include <stdint.h>`, which was completely breaking the non-linux builds after `272f9cfe6a` ("dri: Use DRM_FORMAT_* instead of defining our own copy.") started making use of it. Fixes: `272f9cfe6a` ("dri: Use DRM_FORMAT_* instead of defining our own copy.") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/950 Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-08 14:23:51 +01:00
Michel Dänzer	3b8aeb0906	loader: Simplify handling of the radeonsi driver The list of AMD/ATI devices supported by radeon/r200/r300/r600 is complete, so anything else must use radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-08 09:02:34 +00:00
Bas Nieuwenhuizen	a0c930d284	amd/llvm: Fix warning due to asserted-only variable. [212/893] Compiling C object 'src/amd/llvm/ce8261c@@amd_common_llvm@sta/ac_nir_to_llvm.c.o'. ../mesa/src/amd/llvm/ac_nir_to_llvm.c: In function ‘visit_image_atomic’: ../mesa/src/amd/llvm/ac_nir_to_llvm.c:2636:17: warning: unused variable ‘format’ [-Wunused-variable] 2636 | const GLenum format = nir_intrinsic_format(instr); | ^~~~~~ Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-08 10:22:56 +02:00
Boris Brezillon	71eda74f7c	panfrost: Draw the wallpaper when only depth/stencil bufs are cleared When only the depth/stencil bufs are cleared, we should make sure the color content is reloaded into the tile buffers if we want to preserve their content. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-08 10:07:54 +02:00
Boris Brezillon	c138ca80d2	panfrost: Make sure a clear does not re-use a pre-existing batch glClear()s are expected to be the first thing GL apps do before drawing new things. If there's already an existing batch targeting the same FBO that has draws attached to it, we should make sure the new clear gets a new batch assigned to guarantee that the FB content is actually cleared with the requested color/depth/stencil values. We create a panfrost_get_fresh_batch_for_fbo() helper for that and call it from panfrost_clear(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-08 10:07:54 +02:00
Kenneth Graunke	016c19bc89	iris: Update comment about 3-component formats and buffer textures You can't render to PIPE_BUFFER so there's no reason to prefer RGBX. PBO upload would like to use proper RGB textures as source data.	2019-10-07 23:11:45 -07:00
Chris Wilson	64207ebe66	iris: Allow packed RGB pbo uploads Hitting any fallback path on Broxton as we require clflushing the whole buffer even for an upload of a subtexture. However, since gallium provides a pbo upload path, allow it to sample packed RGB if supported. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-07 23:11:38 -07:00
Tapani Pälli	e4a826b2c8	anv/android: fix images created with external format support This fixes a case where user first creates image and then later binds it with memory created from AHW buffer. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-08 07:19:05 +03:00
Bas Nieuwenhuizen	72665a0f1f	meson: Always add LLVM coroutines module. It gets used by the gallium auxiliary draw module, which gets used pretty much always when LLVM is used as JIT. At the same time most builds don't hit the issue here because the shared library of LLVM contains all modules. Fixes: `d32690b43c` ("gallivm: add coroutine pass manager support") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/951 Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-10-08 03:24:49 +02:00
Timur Kristóf	3a08110d43	amd: Move all amd/common code that depends on LLVM to amd/llvm. This commit is a step towards the goal of being able to build RADV without LLVM. In the future we would like to offer the option to use RADV solely with ACO. There is still a need for the common AMD code located in amd/common but the LLVM specific parts need to be separated. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-08 00:44:08 +00:00
Ilia Mirkin	738bbee603	nvc0: add support for GL_EXT_demote_to_helper_invocation Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-10-07 20:42:11 -04:00
Ilia Mirkin	71c34a51c3	gallium/tgsi: add support for DEMOTE and READ_HELPER opcodes This mirrors the intrinsics in the GLSL IR. One could imagine an alternate definition where reading the semantic would account for the READ_HELPER functionality, but that feels potentially dodgy and could be subject to CSE unpleasantness. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-10-07 20:41:59 -04:00
Marek Olšák	eec7b0a865	radeonsi: use simple_mtx_t instead of mtx_t Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-07 20:05:07 -04:00
Marek Olšák	5498a8d23c	st/mesa: use simple_mtx_t instead of mtx_t Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-07 20:05:04 -04:00
Marek Olšák	732ea0b213	gallium: add PIPE_RESOURCE_FLAG_SINGLE_THREAD_USE to skip util_range lock u_upload_mgr sets it, so that util_range_add can skip the lock. The time spent in tc_transfer_flush_region decreases from 0.8% to 0.2% in torcs on radeonsi. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-07 20:05:00 -04:00
Marek Olšák	59dd4dafb5	util: use simple_mtx_t for util_range Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-07 20:04:49 -04:00
Marek Olšák	3b2b83924e	winsys/radeon: initialize SIMD properties in radeon_info This was missed when I added them. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1839 Fixes: `0692ae34e9` ("ac: move ac_get_num_physical_sgprs into radeon_info") Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-10-07 18:44:19 -04:00
Kenneth Graunke	6d9c1f30e4	iris: Drop vtbl usage for some load_register calls We can just call the actual functions directly.	2019-10-07 14:10:33 -07:00
Jordan Justen	ae9c311b9a	iris/state: Move reg/mem load/store functions earlier in file Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2019-10-07 14:10:33 -07:00
Eric Engestrom	c84bd2b095	meson: drop unused inc_nir Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-07 21:49:40 +01:00
Eric Engestrom	1234505bd6	meson: drop duplicate inc_nir from spirv2nir Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-07 21:49:40 +01:00
Eric Engestrom	f5808e6088	meson: drop duplicate inc_nir from libglsl Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-07 21:49:40 +01:00
Eric Engestrom	326be1774c	meson: drop duplicate inc_nir from libiris Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-07 21:49:40 +01:00
Eric Engestrom	7a1dc6ab44	meson: rename libnir to _libnir to make it clear it's not meant to be used anywhere else Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-07 21:49:40 +01:00
Eric Engestrom	3e95b2773f	meson: use idep_nir instead of libnir in pipe-loader Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-07 21:49:40 +01:00
Eric Engestrom	612e70c594	meson: use idep_nir instead of libnir in haiku softpipe Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-07 21:49:40 +01:00
Eric Engestrom	1975c5a59d	meson: use idep_nir instead of libnir in gallium nine Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-07 21:49:40 +01:00
Eric Engestrom	140d7e8b3a	meson: use idep_nir instead of libnir in libclnir Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-07 21:49:40 +01:00
Eric Engestrom	a0a8b24078	meson: use idep_nir instead of libnir in libnouveau Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-07 21:49:40 +01:00
Eric Engestrom	731097c747	meson: add missing idep_nir_headers in iris_gen_libs Fixes: `4929f020c3` ("iris: better SBE") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-07 21:49:40 +01:00
Eric Engestrom	721b880e4c	script: drop get_reviewer.pl This script doesn't make sense anymore in the age of GitLab. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-07 21:33:38 +01:00
Eric Engestrom	b91ae0379b	meson/loader: drop unneeded *.h file Meson automatically tracks any file included by a file it already tracks, and `pci_id_driver_map.h` & `loader.h` are included by `loader.c`, while `loader_dri3_helper.h` is included by `loader_dri3_helper.c`. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-07 21:30:16 +01:00
Eric Engestrom	b9157ea415	loader: use ARRAY_SIZE instead of NULL sentinel Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-07 21:30:16 +01:00
Eric Engestrom	5be6c8959c	loader: s/int/bool/ for predicate result Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-07 21:30:16 +01:00
Eric Engestrom	26149d119b	loader: replace int/1/0 with bool/true/false Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-07 21:30:16 +01:00
Eric Engestrom	6202a13b71	egl: replace MESA_EGL_NO_X11_HEADERS hack with upstream EGL_NO_X11 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-10-07 20:28:59 +00:00
Kenneth Graunke	90a35752b4	iris: Drop bonus parameters from iris_init_*_context() Nothing uses vtbl or dbg, and screen is available from the batch.	2019-10-07 13:15:56 -07:00
Rhys Perry	2d78e55a8c	nir/constant_folding: fold load_constant intrinsics These can appear after loop unrolling. v2: stylistic changes v2: replace state->mem_ctx with state->shader v2: add bounds checking v3: use nir_intrinsic_range() for bounds checking v3: fix issue where partially out-of-bounds reads are replaced with undefs v4: fix merge conflicts during rebase v5: split into two commits v6: set constant_data to NULL after freeing (fixes nir_sweep()/Iris) v7: don't remove the constant data if there are no constant loads Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v6) Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2019-10-07 19:49:53 +01:00
Rhys Perry	ec054a67da	nir/constant_folding: add back and use constant_fold_state Useful for load_constant folding. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-10-07 19:49:53 +01:00
Caio Marcelo de Oliveira Filho	f7ca072ab2	anv: Implement VK_KHR_shader_clock Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-07 09:12:12 -07:00
Caio Marcelo de Oliveira Filho	f20cea0162	spirv: Implement SPV_KHR_shader_clock We only have the subgroup variant in NIR (equivalent to clockARB), so only support that for now. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-07 09:12:12 -07:00
Caio Marcelo de Oliveira Filho	3f304617cb	vulkan: Update the XML and headers to 1.1.124 Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-10-07 09:12:12 -07:00
Kenneth Graunke	bd46dfa889	Revert "iris: Hack up a SKL/Gen9LP PS push constant fifo depth workaround" This reverts commit `4f857423b3`. It caused GPU hangs on all affected platforms, in e.g. Piglit bin/stencil-twoside -auto -fbo.	2019-10-07 09:08:41 -07:00
Tomeu Vizoso	c00f017e65	gitlab-ci/lava: Fix image to use in test jobs In the test stage, we can use either of the two container images as we aren't going to do anything architecture-dependent when submitting the jobs to LAVA. But if we are in a pipeline in which the images need to be rebuilt and one finishes much earlier than the other, it could happen that the test job that executes first fails to find the container image. To avoid that, have each job in the test stage use the image that has already been implicitly built by depending on the build job for the given arch. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-10-07 07:31:55 -07:00
Boris Brezillon	8d0830de05	Revert "Revert "st/dri2: Implement DRI2bufferDamageExtension"" This reverts commit `19546108d3`. This commit breaks the build because lima implements ->set_damage_region(). I guess we'll need more discussion before removing the ->set_damage_region() hook.	2019-10-07 12:24:51 +02:00
Boris Brezillon	19546108d3	Revert "st/dri2: Implement DRI2bufferDamageExtension" This reverts commit `492ffbed63`. BACK_LEFT attachment can be outdated when the user calls KHR_partial_update(), leading to a damage region update on the wrong pipe_resource object. Let's not expose the ->set_damage_region() method until the core is fixed to handle that properly. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Daniel Stone <daniels@collabora.com>	2019-10-07 11:38:26 +02:00
Tomeu Vizoso	555c0de8c6	gitlab-ci: Move LAVA-related files into top-level ci dir In preparation for testing drivers other than Panfrost in LAVA labs. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-06 07:47:41 -07:00
Tomeu Vizoso	7b01f725dd	gitlab-ci: Run dEQP on devices with Panfrost Include Panfrost's gitlab.ci.yml file from Mesa's main .gitlab-ci.yml so we test on devices with Panfrost. This uses LAVA to schedule jobs in the devices and will be the base for testing Etnaviv, Lima, etc. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-06 07:47:21 -07:00
Kenneth Graunke	4f857423b3	iris: Hack up a SKL/Gen9LP PS push constant fifo depth workaround This is a port of Nanley's `904c2a617d` from i965 to iris. One concern is that iris uses larger batches, and also emits far fewer commands, so we may come closer to the 500 limit within a batch, and could need to supplement this with actual counting. Manhattan 3.0 had 239 3DSTATE_CONSTANT_PS packets in a batch, Unigine Valley had 155. So it seems like we're still in the realm of safety.	2019-10-05 17:18:45 -04:00
Kenneth Graunke	f1bba22f69	iris: Refactor push constant allocation so we can reuse it We'll need this for a workaround shortly. While refactoring, also improve the comment slightly.	2019-10-05 17:18:44 -04:00
Lionel Landwerlin	12bf1308c4	intel/isl: set vertical surface alignment on null surfaces Just following the spec. Somewhat unclear whether this applies to NULL surfaces. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-05 20:54:33 +00:00
Lionel Landwerlin	ff1a5aadbf	intel/isl: set surface array appropriately This doesn't seem to affect anything. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-05 20:54:33 +00:00
Lionel Landwerlin	c445d6f66e	intel/isl: Set null surface format to R32_UINT It appears we never had a test in piglit or deqp sampling from a null surface... It turns out this triggers a hang on IVB only. Updating the null surface format to R32_UINT fixes the hang on ivb and doesn't affect other platforms, so set it by default for all platforms. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1872 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-05 20:54:33 +00:00
Jonathan Marek	1249cf19b0	etnaviv: set texture INT_FILTER bit This should improve texture sampling performance on GC3000. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-10-05 20:31:36 +00:00
Jonathan Marek	c877142fca	etnaviv: implement texture comparator Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-10-05 20:31:36 +00:00
Jonathan Marek	686e9fa0fb	etnaviv: update headers from rnndb Update to etna_viv commit 7ff8029. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-10-05 20:31:36 +00:00
Lionel Landwerlin	d36763b2a4	intel: fix subslice computation from topology data We're missing the offset of the slice in the subslice mask... This worked for most platforms that don't have first slice fused off because we would reread the same mask from slice0 again and again... Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c1900f5b0f` ("intel: devinfo: add helper functions to fill fusing masks values") Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1869 Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2019-10-05 23:05:03 +03:00
Kenneth Graunke	396b410959	dri: Avoid swapbuffer throttling in glXCopySubBufferMESA We were supplying __DRI2_THROTTLE_SWAPBUFFER, rather than the obvious choice of __DRI2_THROTTLE_COPYSUBBUFFER. This meant that we hit the swap-based frame throttling. glXCopySubBuffer doesn't seem like it's intended to be a frame boundary, so we'd like to avoid this throttling. Tested-by: Michel Dänzer <mdaenzer@redhat.com> # DRI3 only Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-10-05 13:19:37 +00:00
Kenneth Graunke	72beda4fb4	st/dri: Perform MSAA downsampling for __DRI2_THROTTLE_COPYSUBBUFFER glXCopySubBufferMESA copies data from the back buffer to the front, so it needs to perform a MSAA downsampling operation just like glXSwapBuffers would. Currently, the CopySubBuffer implementations supply a throttle reason of __DRI2_THROTTLE_SWAPBUFFER, so they hit this path and work today. But we'd like to avoid swapbuffer throttling in this case, so the next patch will change that reason. Tested-by: Michel Dänzer <mdaenzer@redhat.com> # DRI3 only Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-10-05 13:19:37 +00:00
Prodea Alexandru-Liviu	6309c31fd8	scons/MSYS2-MinGW-W64: Fix build options defaults Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Cc: <mesa-stable@lists.freedesktop.org> When building in a MSYS2 Mingw-w64 environment Mesa3D sets wrong default build options which inevitably lead to build failure.	2019-10-05 08:43:13 +00:00
Lionel Landwerlin	907c2397f0	intel/error2aub: add support for platforms without PPGTT Not much to do to enable this, just make sure to always write to the GGTT :) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-04 22:31:15 +00:00
Rhys Perry	77ebb030ed	aco: fix load_constant with multiple arrays I thought I fixed this, but I guess I must have broken it again. Fixes various dEQP-VK.draw.* tests Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-04 22:43:11 +01:00
Eric Anholt	ce76be9933	nir: Fix some wonky whitespace in nir_search.h. Reviewed-by: Ian Romanick <ian.d.romainck@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-10-04 19:15:01 +00:00
Eric Anholt	3cc914921e	nir: Factor out most of the algebraic passes C code to .c/.h. Working on the algebraic implementation, I was being driven nuts by my editor not highlighting and handling indentation for the C code. It turns out that it's basically not pass-specific code, and we can move it over to the relevant .c file. Replaces 30KB of code with 34KB of data on my i965 build. No perf diff on shader-db (n=3) Reviewed-by: Ian Romanick <ian.d.romainck@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-10-04 19:15:01 +00:00
Eric Anholt	c23db0df18	nir: Keep the range analysis HT around intra-pass until we make a change. This lets us memoize range analysis work across instructions. Reduces runtime of shader-db on Intel by -30.0288% +/- 2.1693% (n=3). Fixes: `405de7ccb6` ("nir/range-analysis: Rudimentary value range analysis pass") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-10-04 19:15:01 +00:00
Eric Anholt	7025dbe794	nir: Skip emitting no-op movs from the builder. Having passes generate these is just making more work for copy propagation (and thus probably calling more optimization passes) later. Noticed while trying to debug nir_opt_algebraic() top-to-bottom having O(n^2) behavior due to not finding new matches in replacement code. Reviewed-by: Ian Romanick <ian.d.romainck@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-10-04 19:15:01 +00:00
Eric Anholt	e7b754a05c	nir: Make nir_search's dumping go to stderr. Reviewed-by: Ian Romanick <ian.d.romainck@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-10-04 19:15:01 +00:00
Adam Jackson	3746ee912f	surfaceless: Support EGL_WL_bind_wayland_display Feature parity with the drm, x11, and wayland platforms. Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1870 Tested-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>	2019-10-04 15:49:10 +00:00
Rhys Perry	1264acdf4b	nir/print: always use the right FILE * Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-04 15:24:10 +00:00
Erik Faye-Lund	49b32233a0	nir: initialize needs_helper_invocations as well Similar to the previous commit, we should also initialize needs_helper_invocations here. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-04 14:55:40 +00:00
Erik Faye-Lund	1d6d2ca9f1	nir: initialize uses_discard to false This matches what we do for uses_sample_qualifier, and what we do in ir_set_program_inouts.cpp as well. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-04 14:55:40 +00:00
Rhys Perry	a87b0f5141	radv/aco,aco: set lower_fmod This simplifies ACO and allows the lowered code to be optimized (in particular, constant folded). Totals from affected shaders: SGPRS: 1776 -> 1776 (0.00 %) VGPRS: 1436 -> 1436 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 203452 -> 203564 (0.06 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 103 -> 103 (0.00 %) At least some of the code size increase seems to be from literals being applied to instructions as a result of constant folding. v2: remove fmod/frem handling in init_context() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-04 14:00:46 +00:00
Prodea Alexandru-Liviu	0fe2e04f2d	scons/windows: Fix build with LLVM>=8 Fixes `eebe091d29` ("scons/windows: Enable compute shaders when possible.") Signed-off-by: Prodea Alexandru-Liviu <liviuprodea@yahoo.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-10-04 13:48:08 +00:00
Michel Dänzer	b012f06d66	dri3: Pass __DRI2_THROTTLE_COPYSUBBUFFER from loader_dri3_copy_drawable 0 is __DRI2_THROTTLE_SWAPBUFFER, which doesn't really make sense here. Avoids dri_flush() throttling twice for the same glFlush call with front buffer rendering, as described in https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2057 . Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-04 10:55:43 +02:00
Gert Wollny	7cbb44aa6a	r600: Fix interpolateAtCentroid If the instruction interpolateAtCentroid is used the extra interpolator must also be enabled in the state. Fixes: fs-interpolateatcentroid-block Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-10-04 10:09:01 +02:00
Dylan Baker	1481d05409	meson: Only error building gallium video without libdrm when the platform is drm Fixes: `3b265f61f5` ("meson: gallium media state trackers require libdrm with x11") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1878 Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-10-03 22:14:20 -07:00
Alyssa Rosenzweig	dcd2f26b98	pan/midgard: Replace mir_is_live_after with new pass Now that we have live_out calculated per block as metadata, calculating liveness of an instruction at a given point in the program becomes O(n) in the size of the block worst-case, rather than O(n) in the size of the program. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 22:29:51 -04:00
Alyssa Rosenzweig	39a4b3ebe9	pan/midgard: Calculate temp_count for liveness This needs to be correct or the analysis fails. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 22:29:51 -04:00
Alyssa Rosenzweig	ad5fcac005	pan/midgard: Invalidate liveness for mir_is_live_after Callers should have liveness info ready. Ideally we'd have a nice metadata tracking framework like NIR to handle this automatically, but for now this will allow us to make forward progress... when we're about to do something with liveness, invalidate everything ahead to force a clean calculation. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 22:29:51 -04:00
Alyssa Rosenzweig	3450c013c5	pan/midgard: Begin tracking liveness metadata This will allow us to explicitly invalidate liveness analysis results so we can cache liveness results. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 22:29:51 -04:00
Alyssa Rosenzweig	846e5d5ba8	pan/midgard: Don't try to OR live_in of successors By definition, once liveness analysis has occurred: live_out = OR {succ} succ->live_in Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 22:29:50 -04:00
Alyssa Rosenzweig	013cd6bed2	pan/midgard: Move RA's liveness analysis into midgard_liveness.c There are unfortunately two distinct liveness analysis passes in the compiler right now -- one good (but complex) pass used by RA based on solving data flow equations, and one awful (but simple) pass used for dead code elimination and bundling based on an abstract walk of the AST. Let's move RA's pass into shared code so we can work on unifying. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 22:29:50 -04:00
Alyssa Rosenzweig	76a76de7af	pan/midgard: Add mir_calculate_temp_count helper This allows us to fill in ctx->temp_count explicitly, even if we haven't squished down the MIR. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 22:29:50 -04:00
Alyssa Rosenzweig	c59fae0fef	pan/midgard: Remove mir_has_multiple_writes We already enforce this with the SSA/register distinction in the backend. There is no need to duplicate this logic merely for an assert. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 22:29:50 -04:00
Erik Faye-Lund	3f4be0d199	.mailmap: add a couple of aliases for Jakob Bornecrantz Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>	2019-10-03 17:11:20 -04:00
Erik Faye-Lund	2eb916a58d	.mailmap: add an alias for Tomeu Vizoso Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-10-03 17:11:10 -04:00
Erik Faye-Lund	27ae5c81f7	.mailmap: add an alias for Gert Wollny Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2019-10-03 17:10:59 -04:00
Erik Faye-Lund	28b64049d0	.mailmap: add an alias for Alexandros Frantzis Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-10-03 17:10:28 -04:00
Erik Faye-Lund	b7baf70778	.mailmap: specify spelling for Elie Tournier Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2019-10-03 17:09:42 -04:00
Boris Brezillon	1ac33aae49	panfrost: Get rid of the flush in panfrost_set_framebuffer_state() Now that we track inter-batch dependencies, the flush done in panfrost_set_framebuffer_state() is no longer needed. Let's get rid of it. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 16:55:38 -04:00
Boris Brezillon	70cf93c4d7	panfrost: Kill the explicit serialization in panfrost_batch_submit() Now that we have all the pieces in place to support pipelining batches we can get rid of the drmSyncobjWait() at the end of panfrost_batch_submit(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 16:55:38 -04:00
Boris Brezillon	0a12a16bae	panfrost: Do fine-grained flushing when preparing BO for CPU accesses We don't have to flush all batches when we're only interested in reading/writing a specific BO. Thanks to the panfrost_flush_batches_accessing_bo() and panfrost_bo_wait() helpers we can now flush only the batches touching the BO we want to access from the CPU. This fixes the dEQP-GLES2.functional.fbo.render.texsubimage.* tests. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 16:55:38 -04:00
Boris Brezillon	2225383af8	panfrost: Make sure the BO is 'ready' when picked from the cache This is needed if we want to free the panfrost_batch object at submit time in order to not have to GC the batch on the next job submission. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 16:55:38 -04:00
Boris Brezillon	22190bc27b	panfrost: Add flags to reflect the BO imported/exported state Will be useful to make the ioctl(WAIT_BO) call conditional on BOs that are not exported/imported (meaning that all GPU accesses are known by the context). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 16:55:38 -04:00
Boris Brezillon	82399b58d3	panfrost: Add a panfrost_flush_batches_accessing_bo() helper This will allow us to only flush batches touching a specific resource, which is particularly useful when the CPU needs to access a BO. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 16:55:38 -04:00
Boris Brezillon	a45984b244	panfrost: Add a panfrost_flush_all_batches() helper And use it in panfrost_flush() to flush all batches, and not only the one currently bound to the context. We also replace all internal calls to panfrost_flush() by panfrost_flush_all_batches() ones. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 16:55:38 -04:00
Boris Brezillon	b5d8f9bbbf	panfrost: Prepare panfrost_fence for batch pipelining The panfrost_fence logic currently waits on the last submitted batch, but the batch serialization that was enforced in panfrost_batch_submit() is about to go away, allowing for several batches to be pipelined, and the last submitted one is not necessarily the one that will finish last. We need to make sure the fence logic waits on all flushed batches, not only the last one. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 16:55:38 -04:00
Boris Brezillon	2dad9fde50	panfrost: Start tracking inter-batch dependencies The idea is to track which BO are being accessed and the type of access to determine when a dependency exists. Thanks to that we can build a dependency graph that will allow us to flush batches in the correct order. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 16:55:38 -04:00
Boris Brezillon	40a07bfbd7	panfrost: Add a panfrost_freeze_batch() helper We'll soon need to freeze a batch not only when it's flushed, but also when another batch depends on us, so let's add a helper to avoid duplicating the logic. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 16:55:38 -04:00
Boris Brezillon	819738e4af	panfrost: Use the per-batch fences to wait on the last submitted batch We just replace the per-context out_sync object by a pointer to the fence of the last submitted batch. Pipelining of batches will come later. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 16:55:38 -04:00
Boris Brezillon	6936b7f319	panfrost: Add a batch fence So we can implement fine-grained dependency tracking between batches. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 16:55:38 -04:00
Boris Brezillon	a8bd265cef	panfrost: Make panfrost_batch->bos a hash table So we can store the flags as data and keep the BO as a key. This way we keep track of the type of access done on BOs. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 16:55:38 -04:00
Boris Brezillon	ada752afe4	panfrost: Extend the panfrost_batch_add_bo() API to pass access flags The type of access being done on a BO has impacts on job scheduling (shared resources being written enforce serialization while those being read only allow for job parallelization) and BO lifetime (the fragment job might last longer than the vertex/tiler ones; if we can, it's good to release BOs earlier so that others can re-use them through the BO re-use cache). Let's pass extra access flags to panfrost_batch_add_bo() and panfrost_batch_create_bo() so the batch submission logic can take the appropriate action when submitting batches. Note that this information is not used yet, we're just patching callers to pass the correct flags here. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 16:55:38 -04:00
Boris Brezillon	12f790f7da	panfrost: Add the shader BO to the batch in patch_shader_state() We know a shader will be used by a batch when panfrost_patch_shader_state() is called, so let's add the shader BO at that time. Suggested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-10-03 16:55:38 -04:00
Andres Gomez	02c265be9d	egl: Remove the 565 pbuffer-only EGL config under X11. The CTS finally has agreed to drop the requirement for a 565-no-depth-no-stencil config for ES 3.0. Hence we can now remove the code to satisfy this requirement using a pbuffer-only visual with whatever other buffers the driver happens to have given us. This reverts commit `82607f8a90`, commit `6ad31c4ff3` and commit `dacb11a585`. v2: - Reference the VK-GL-CTS issue (Eric E.). v3: - Don't revert `fc21394bc4` ("egl: Quiet warning about front buffer rendering for pixmaps/pbuffers") (Kenneth). References: VK-GL-CTS issue 1601. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Andres Gomez <agomez@igalia.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-10-03 23:51:46 +03:00
Dylan Baker	974e3ad004	bin: delete unused releasing scripts Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Juan A. Suarez <jasuarez@igalia.com>	2019-10-03 20:15:19 +00:00
Dylan Baker	3226b12a09	release: Add an update_release_calendar.py script This script is for updating post version bump. Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Juan A. Suarez <jasuarez@igalia.com>	2019-10-03 20:15:19 +00:00
Dylan Baker	86079447da	scripts: Add a gen_release_notes.py script This script is responsible for generating an entire page in the docs/relnotes/ directory. It includes a template for the page, and uses mako to fill in the necessary bits. It is designed to be purely fire and forget, calculating previous versions, shortlogs, bug fixes, and dates. Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Juan A. Suarez <jasuarez@igalia.com>	2019-10-03 20:15:19 +00:00
Dylan Baker	7ff49c25ed	docs: add a new_features.txt file and remove 19.3.0 release notes The next patch is going to introduce a tool that creates the entire release html page for us, without any user intervention. As such we can't be editing it. To that end the script will read the new_features.txt file to get a list of new features. This is a flat text file, one entry per line. Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Juan A. Suarez <jasuarez@igalia.com>	2019-10-03 20:15:19 +00:00
Rafael Antognolli	cdc331c6f9	anv/block_pool: Align anv_block_pool state to 64 bits. On 64 bits platforms, some atomic operations like __sync_fetch_and_add() have constant time, but on 32 bits platforms they are implemented with a loop and might take much longer. Additionally, it seems like if their operands are not aligned to 64 bits, they also require extra memory accesses. From the Intel Architecture's Developer Manual Vol. 1, 4.1.1: "A word or doubleword operand that crosses a 4-byte boundary or a quadword operand that crosses an 8-byte boundary is considered unaligned and requires two separate memory bus cycles for access." Forcing the u64 field to be aligned to 64 bits seems to make the unit tests that are stressing this finish much faster. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-03 12:40:33 -07:00
Erik Faye-Lund	0103d4747a	loader/dri3: do not blit outside old/new buffers Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-03 18:58:34 +00:00
Dylan Baker	9af6c38def	docs: Add use of Closes: tag for closing gitlab issues This replaces the old Bugzilla: tag, which no longer makes sense because we don't use bugzilla anymore. Reviewed-by: Eric Anholt <eric@anholt.net> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-03 17:45:51 +00:00
Anuj Phogat	0d60621101	intel/isl/icl: Use halign 8 instead of 4 hw workaround v1 by Topi Pohjolainen v2,v3 by Anuj Phogat: - Apply for gen >= 11 - Remove wa_bug_xxx function - Use helper functions Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-03 17:18:41 +00:00
Samuel Pitoiset	d861401554	ac/nir: remove unused code for nir_op_{fmod,frem} RADV and RadeonSI both lower these two NIR instructions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-03 18:15:17 +02:00
Samuel Pitoiset	5ebe1a17e9	radv: enable lower_fmod for the LLVM path This lowers fmod and frem at NIR level like RadeonSI. fmod is already lowered directly in NIR->LLVM, and frem will be lowered by LLVM anyways. This fixes a LLVM crash with: dEQP-VK.glsl.builtin.precision_fp16_storage32b.frem.compute.scalar. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-03 18:15:14 +02:00
Adam Jackson	1b87f4058d	egl/dri2: Don't dlclose() the driver on dri2_load_driver_common failure ... because it's wrong to do so. The error path out of dri2_initialize_drm ends with dri2_display_destroy, which calls functions in the vtable we're trying to set up, so if we dlclose the driver then those function pointers will point off into space and things crash. Noticed this because after !1923 eglinfo would crash when setting up the GBM platform. This was something of a cascade failure, because my kernel is too old for DRM_IOCTL_I915_GETPARAM to work without DRM_AUTH, so i965 wouldn't load. platform_drm.c then got very confused when it tries to load swrast as a dri2 driver. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-03 09:39:51 -04:00
Bas Nieuwenhuizen	c837872fba	radv: Fix warning in 32-bit build. uintptr_t is 32 bits in a 32-bits build, resulting in shifting out of bounds. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-03 13:06:08 +00:00
Bas Nieuwenhuizen	8ad3d8b178	radv: Fix condition for skipping the continue CS. We need the continue CS for referencing the tess/GDS/sample position BOs. Fixes: `46e52df34d` "radv: add tessellation ring allocation support. (v2)" Fixes: `e1dc3ab753` "radv/gfx10: allocate GDS/OA buffer objects for NGG streamout" Fixes: `1171b304f3` "radv: overhaul fragment shader sample positions." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-03 13:02:07 +00:00
Michel Dänzer	4712fdf7ae	gitlab-ci: Use per-job ccache Instead of a single cache shared between all jobs, but reduce the maximum cache size to 1.5G (from 5G). Rationale for smaller cache: Pulling & pushing a 5G cache could take a long time. Consider https://gitlab.freedesktop.org/mesa/mesa/-/jobs/684010 (click the "Show complete raw" button to see timestamps): Pulling the cache took 1569927241-1569927194 = 47 seconds, pushing it 1569927671-1569927519 = 152, for a total of 199 seconds. The actual build took comparable 1569927518-1569927243 = 275 seconds, despite no cache hits from ccache. In other words, the cache transfers almost doubled the job duration, and they would have negated any build time benefits from ccache even with a high cache hit rate. Also, the smaller caches avoid blowing up storage requirements for them too much. Rationale for per-job caches: Making a single cache significantly smaller might result in cached build products from one job getting evicted by another job, reducing the likelihood of cache hits from previous pipelines. v2: * Move up "ccache --max-size=1500M" call (Eric Engestrom) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-03 09:26:11 +02:00
Gurchetan Singh	a1a5672118	virgl: honor winsys supplied metadata To truly do this correctly, we'll have to fix the discrepancy between drm_virtgpu_3d_transfer_to_host and virtio_gpu_transfer_host_3d. However, this is a good starting point. Since virtio-gpu only supports self-import and export, this should be fine. Let's only do WINSYS_HANDLE_TYPE_FD for this currently. Reviewed by: Robert Tarasov <tutankhamen@chromium.org>	2019-10-02 17:57:59 -07:00
Gurchetan Singh	9bde8f3a8f	virgl: modify internal structures to track winsys-supplied data The winsys might supply dimensions that are different than those we calculate. In addition, it may supply virtualized modifiers. In practice, a stride != bpp * width and virtualized modifiers don't happen yet, but the plan is to move in that direction. Also make virgl_resource_layout static. Reviewed by: Robert Tarasov <tutankhamen@chromium.org>	2019-10-02 17:57:53 -07:00
Gurchetan Singh	aad4127c41	virgl: modify resource_create_from_handle(..) callback This commit makes no functional changes, just adds the relevant plumbing. Reviewed by: Robert Tarasov <tutankhamen@chromium.org>	2019-10-02 17:57:47 -07:00
Gurchetan Singh	2899bbe37a	virgl: remove stride from virgl_hw_res It's not used anywhere, and stride isn't really an intrinsic property of a GEM buffer. Reviewed by: Robert Tarasov <tutankhamen@chromium.org>	2019-10-02 17:57:40 -07:00
Lionel Landwerlin	1c6fdbc83c	intel: fix topology query i915 will report ENODEV on generations prior to Haswell because there is no point in reporting values on those. This is prior any fusing could happen on parts with identical PCI ids. This query call was previously only triggered on generations that support performance queries, which happens to match generation for which i915 reports topology, but the commit pointed below started using it on all generations. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1860 Cc: <mesa-stable@lists.freedesktop.org> Fixes: `96e1c945f2` ("i965: Move device info initialization to common code") Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2019-10-02 22:25:44 +00:00
Caio Marcelo de Oliveira Filho	faf98be290	docs: Fix GL_EXT_demote_to_helper_invocation name Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-02 14:33:42 -07:00
Samuel Pitoiset	a2a68d551c	radv/gfx10: fix the ESGS ring size symbol Random hangs no longer happen, I'm actually not sure if they were related to this. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-02 21:50:40 +02:00
Samuel Pitoiset	34be977f80	radv: fix build Forgot to amend the commit before updating the MR. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-02 20:37:43 +02:00
Samuel Pitoiset	4304162744	Revert "radv: disable viewport clamping even if FS doesn't write Z" This was actually the wrong fix. This reverts commit `0a313cc285`. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-02 19:40:39 +02:00
Samuel Pitoiset	b8fe6189a9	radv: rework the slow depthstencil clear to write depth from PS Make sure to export the expected clear values to the depth stencil attachment. This fixes dEQP-VK.pipeline.depth_range_unrestricted.* on GFX10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-02 19:31:51 +02:00
Samuel Pitoiset	e19d1ee2d1	radv/gfx10: fix NGG streamout with triangle strips for VS The number of vertices has to be adjusted with the output primitive type. This fixes dEQP-VK.transform_feedback.simple.triangle_strip_*. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-02 18:09:35 +02:00
Samuel Pitoiset	08ab13d340	radv/gfx10: fix storing/loading NGG stream outputs for GS The GS outputs are stored differently in the LDS storage, they are indexed by out_idx which is incremented for each stored DWORD. Thus, we need a different path for exporting the stream outputs. This fixes a bunch of CTS failures when NGG GS is force enabled. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-02 18:09:32 +02:00
Samuel Pitoiset	3be21b5ab1	radv/gfx10: use the component mask when storing/loading NGG stream outputs It's unnecessary to store/load more components than needed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-02 18:09:30 +02:00
Samuel Pitoiset	60f8224171	radv/gfx10: fix storing/loading NGG stream outputs for VS and TES The LDS storage allocated for stream outputs is 4 * N, where N is the number of outputs. So, we have to store/load with N as index and not with the output location as index. This doesn't fix anything known but it should fix out-of-bounds access and it also reduces the number of outputs written to the LDS storage. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-02 18:09:27 +02:00
Samuel Pitoiset	56e1b1ff0c	radv/gfx10: add missing counter buffer to the BO list The buffer isn't necessarily used before. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-02 18:09:25 +02:00
Samuel Pitoiset	683c5e27c7	radv/gfx10: add radv_device::use_ngg Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-02 18:06:01 +02:00
Eric Engestrom	2236cf24a7	git: delete .gitattributes The last of these was deleted in `44a8e51354` ("d3d1x: Remove.") over 6 years ago. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-10-02 13:29:55 +01:00
Gert Wollny	c5da8230de	etnaviv: enable triangle strips only when the hardware supports it Some hardware has a bug with triangle strips and it is signalled by the flag BUG_FIXED8 whether this bug has been fixed. So only enable triangle strips when this flag is set. Thanks: Jonathan Marek and Christian Gmeiner for the pointers v2: Add TODO to indicate that the handling should be refined (Jonathan & Christian) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-10-02 07:34:36 +00:00
Dylan Baker	d855e19b87	meson: remove -DGALLIUM_SOFTPIPE from st/osmesa It's unused here, and undefined in scons. It is used in targets/osmesa, but it's properly defined there already. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-10-01 12:34:27 -07:00
Lionel Landwerlin	2208d79dde	mesa: don't forget to clear _Layer field on texture unit On the Android Antutu benchmark we ran into an assert in ISL where the (base layer + num layers) > total layers. It turns out the core of mesa forgot to clear the _Layer variable, potentially leaving an inconsistent value. v2: Pull setting u->_Layer out of the conditional blocks (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-01 21:49:13 +03:00
Robin Murphy	563f8974d8	egl/gbm: Fix config validation In converting to shift/size-based validation, we lost a condition from the ARGB/XRGB equivalence check, which left it working one way round but not the other, and broke applications like glmark2-es2-drm on some platforms. Restore the equivalent check that both configs actually have an alpha channel before considering a mismatch. Fixes: `7b4ed2b513` ("egl: Convert configs to use shifts and sizes instead of masks") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-10-01 14:45:15 +01:00
Ken Mays	4943c89d6d	haiku: fix Mesa build 1. The hgl.c file is a read-only file versus read-write. Ref: src/gallium/state_trackers/hgl/hgl.c 2. I've included the Haiku-specific patches I used to get a successful build of Mesa 19.1.7 on Haiku using the meson/ninja build procedure. Shows "[764/764] linking target ... libswpipe.so" at build completion. v2: Remove autotools files (Eric) v3: Update the patch Reported-by: Ken Mays <kmays2000@gmail.com> Tested-by: Ken Mays <kmays2000@gmail.com> CC: mesa-stable@lists.freedesktop.org Reviewed-by: Alexander von Gluck IV <kallisti5@unixzen.com>	2019-10-01 10:31:02 +00:00
Michel Dänzer	e55df4c859	gitlab-ci: Set ccache path for cross compilers in meson cross file Without this, meson didn't pick up ccache for cross builds. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-10-01 11:16:33 +02:00
Andres Gomez	f83874a405	docs/relnotes: add support for GL_ARB_gl_spirv, GL_ARB_spirv_extensions and OpenGL 4.6 on i965 and iris After `41549a18e6` ("i965: Enable OpenGL 4.6 for Gen8+"), i965 implements GL_ARB_gl_spirv, GL_ARB_spirv_extensions and OpenGL 4.6. After `15e439071d` ("iris: Enable ARB_gl_spirv and ARB_spirv_extensions"), iris implements GL_ARB_gl_spirv, GL_ARB_spirv_extensions and OpenGL 4.6. v2: - Explicit the support is for i965 and iris. v3: - Add also GL_ARB_spirv_extensions to the release notes (Alejandro). Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-10-01 12:09:48 +03:00
Kevin Strasser	641320ce02	egl: Fix implicit declaration of ffs Found when building for Android in C99 mode. Include bitscan.h to ensure ffs is available. Fixes: `7b4ed2b5` ("egl: Convert configs to use shifts and sizes instead of masks") Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-30 14:33:43 -07:00
Rafael Antognolli	b9994cb8d5	intel/tools: Fix aubinator usage of rb_tree. The order of comparison has changed, so we need to invert the logic of "insert_left" when using rb_tree_insert_at(). Fixes: `dae33052db` (util/rb_tree: Reverse the order of comparison functions). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-30 13:43:23 -07:00
Caio Marcelo de Oliveira Filho	089da33c4d	docs/relnotes: Add EXT_demote_to_helper_invocation support on iris, i965 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho	54f1de1c5c	i965: Enable EXT_demote_to_helper_invocation Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho	a3776df7b1	iris: Enable EXT_demote_to_helper_invocation Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho	008de52305	gallium: Add PIPE_CAP_DEMOTE_TO_HELPER_INVOCATION To enable EXT_demote_to_helper_invocation: This extension adds a "demote" keyword that is similar to "discard" but only suppresses subsequent writes and outputs to the framebuffer, and does not terminate the execution of the invocation. For the remainder of the execution, the invocation is "demoted" to act like a helper invocation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho	61fa4b5707	glsl: Add helperInvocationEXT() builtin From EXT_demote_to_helper_invocation, implemented with the existing nir_intrinsic_is_helper_invocation. Such builtin is necessary when using `demote` because we can't redefine the value of gl_HelperInvocation (since it is an input variable). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho	3439956377	glsl: Parse `demote` statement When the EXT_demote_to_helper_invocation extension is enabled, `demote` is treated as a keyword, and produces an ir_demote. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho	af1a6f0f77	glsl: Add ir_demote To represent the new `demote` keyword when using EXT_demote_to_helper_invocation extension. Most of the changes are to include it in the visitors. Demote is not considered a control flow, so also include an empty visit member function in ir_control_flow_visitor. Only NIR actually supports `demote`, so assert the translations for TGSI and Mesa's gl_program -- since the demote is not expected to appear for those. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-30 12:44:30 -07:00
Caio Marcelo de Oliveira Filho	c81b912eb7	mesa: Extension boilerplate for EXT_demote_to_helper_invocation Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-30 12:44:30 -07:00
Kenneth Graunke	309924c3c9	iris: Fix iris_rebind_buffer() for VBOs with non-zero offsets. We can't just check for the BO base address, we need to check for the full address including any offset we may have applied. When updating the address, we need to include the offset again. Fixes: `5ad0c88dbe` ("iris: Replace buffer backing storage and rebind to update addresses.")	2019-09-30 12:41:03 -07:00
Eric Engestrom	fa0dcaaae0	docs/install: drop autotools references 19.3 will be the 3rd release without autotools, people know it's gone by now. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-09-30 19:45:15 +01:00
Maya Rashish	c0330461c9	meson: Test for -Wl,--build-id=sha1 instead of hard-coding OS list. Helps Solaris ld builds. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Signed-off-by: Maya Rashish <coypu@sdf.org>	2019-09-30 18:38:14 +00:00
Dylan Baker	4913ad9a37	docs: remove stray newline Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-30 18:27:52 +00:00
Dylan Baker	bc2d73c36b	docs: use https for mesonbuild.com Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-30 18:27:52 +00:00
Dylan Baker	5d11a828e1	docs: update install docs for meson Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-30 18:27:52 +00:00
Marek Olšák	a1545af079	ac/nir: fix GLSL imageSamples() Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-09-30 14:21:42 -04:00
Marek Olšák	0cc233e3dc	ac: add ac_build_image_get_sample_count from radeonsi Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-09-30 14:21:42 -04:00
Marek Olšák	39e638c14e	ac/surface: don't allocate FMASK if there is no graphics Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-09-30 14:21:42 -04:00
Marek Olšák	f704fb7f0b	tgsi_to_nir: handle PIPE_FORMAT_NONE in image opcodes radeonsi doesn't use the format and internal shaders don't set it. Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-09-30 14:20:48 -04:00
Dylan Baker	3b265f61f5	meson: gallium media state trackers require libdrm with x11 v2: - update copyright year in all changed files - rebase on master Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-30 18:06:56 +00:00
Kenneth Graunke	a0a93763fb	iris: Disable CCS_E for 32-bit floating point textures. A while back, Michael Larabel noticed that Paraview's Wavelet Volume case runs significantly slower on iris than i965. It turns out this is because we enable CCS_E for 32-bit floating point formats, while i965 disables it, with an oblique comment saying that we benchmarked it (on what exactly?) and determined that it was a loss. Paraview uses both R32_FLOAT and R32G32B32A32_FLOAT, and I observed large framerate drops when enabling CCS_E for either format. However, several other benchmarks (Aztec Ruins, many Synmark cases) use 16-bit floating point formats, with no apparent ill effects. So, disable compression for 32-bit float formats for now, but leave it enabled for 16-bit float formats as they seem to be working fine. Improves performance in Paraview's Wavelet Volume test by 62% on a Skylake GT4e. Fixes: `3cfc6a207b` ("iris: Fill out res->aux.possible_usages")	2019-09-30 10:44:52 -07:00
Marek Olšák	4a0d2e2880	ac: reorder and print all radeon_info fields Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-30 13:36:21 -04:00
Marek Olšák	e8b1538587	ac: set the number of SDPs same as the number of TCCs Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-30 13:36:21 -04:00
Marek Olšák	b7c2f7c5a6	ac: fix num_good_cu_per_sh for harvested chips Cc: 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-30 13:36:20 -04:00
Marek Olšák	235ebe9163	radeonsi/gfx10: fix corruption for chips with harvested TCCs Cc: 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-30 13:36:20 -04:00
Marek Olšák	8cbe83445b	ac: add radeon_info::tcc_harvested Cc: 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-30 13:36:20 -04:00
Marek Olšák	7d97013294	ac: fix incorrect vram_size reported by the kernel Cc: 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-30 13:36:20 -04:00
Marek Olšák	3c0938bece	radeonsi/gfx10: fix L2 cache rinse programming Cc: 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-30 13:36:20 -04:00
Eric Engestrom	0efc253f02	etnaviv: fix bitmask typo Fixes: `d92689c46f` ("etnaviv: nir: add native integers (HALTI2+)") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-09-30 17:54:33 +01:00
Adam Jackson	855dc17fcf	glx: Log the filename of the drm device if we fail to open it Helps point the user to the specific device that's having issues, since you're increasingly likely to have more than one. Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/107 Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-30 15:30:16 +00:00
pal1000	eebe091d29	scons/windows: Enable compute shaders when possible. Tests done with llvm-config indicate that there are only 2 libraries in irreader and not in engine, LLVMAsmParser and LLVMIRReader and both of them are part of coroutines so I replaced irreader with coroutines and added libraries unique to coroutines. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-09-30 15:49:46 +01:00
Alyssa Rosenzweig	7be00b2a06	pan/midgard: Allow scheduling conditions with constants Now that we have constant adjustment logic abstracted, we can do this safely. Along with the csel inversion patch, this allows many more common csel ops to inline their condition in the bundle. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	c20063aa4a	pan/midgard: Add csel invert optimization Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	f0f4b39548	pan/midgard: Add mir_flip helper Useful for various operations on both commutative and anticommutative ops. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	10037ce523	pan/midgard: Tightly pack 32-bit constants If we can reuse constant slots from other instructions, we would like to do so to include more instructions per bundle. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	a3ca283bc1	pan/midgard: Allow writeout to see into the future If an instruction could be scheduled to vmul to satisfy the writeout conditions, let's do that and save an instruction+cycle per fragment shader. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	12a70ccd9e	pan/midgard: Allow 6 instructions per bundle We never had a scheduler good enough to hit this case before! :) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	34ff50cadd	pan/midgard: Only one conditional per bundle allowed There's no r32 to save ya after you use up r31 :) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	2715bd02ee	pan/midgard: Schedule to smul/sadd Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	57bac68fff	pan/midgard: Extend choose_instruction for scalar units Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	e9edae3ecb	pan/midgard: Don't double check SCALAR units Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	d3b3daa9d3	pan/midgard: Use new scheduler We still emit in-order but we switch to using the bundles created from the new scheduler, which will allow greater flexibility and room for out-of-order optimization. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	1409af9fc7	pan/midgard: Add distance metric to choose_instruction We require chosen instructions to be "close", to avoid ballooning register pressure. This is a kludge that will go away once we have proper liveness tracking in the scheduler, but for now it prevents a lot of needless spilling. v2: Lower threshold to 6 (from 8). Schedule is hurt, but a few shaders that spilled excessively are fixed. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Derp	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	e9571b53e1	pan/midgard: Add mir_choose_alu helper Based on a given unit. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	8462e82467	pan/midgard: Implement load/store pairing We can bundle two load/store together. This eliminates the need for explicit load/store pairing in a prepass, as well. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	7cf4932410	pan/midgard: Extend csel_swizzle to branches Conditions for branches don't have a swizzle explicitly in the emitted binary, but they do implicitly get swizzled in whatever instruction wrote r31, so we need to handle that. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	c9ce5a92a0	pan/midgard: Add helpers for scheduling conditionals Conditional instructions (csel and conditional branches) require their condition to be written to a special condition pipeline register (r31.w for scalar, r31.xyzw for vector). However, pipeline registers are live only for the duration of a single bundle. As such, the logic to schedule conditionals correctly is surprisingly complex. Essentially, we see if we could stuff the conditional within the same bundle as the csel/branch without breaking anything; if we can, we do that. If we can't, we add a dummy move to make room. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	6f92288e85	pan/midgard: Implement predicate->unit This allows ALUs to select for each unit of the bundle separately. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	5a9a48b81a	pan/midgard: Add predicate->exclude A bit of a kludge but allows setting an implicit dependency of synthetic conditional moves on the actual condition, fixing code generated like: vmul.feq r0, .. sadd.imov r31, .., r0 vadd.fcsel [...] The imov runs simultaneous with feq so it gets garbage results, but it's too late to add an actual dependency practically speaking, since the new synthetic imov doesn't have a node associated. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	6284f3ec25	pan/midgard: Add constant intersection filters In the future, we will want to keep track of which components of constants of various sizes correspond to which parts of the bundle constants, like in the old scheduler. For now, let's just stub it out for a simple rule of one instruction with embedded constants per bundle. We can eventually do better, of course. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	941bdd2088	pan/midgard: Remove csel constant unit force Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	da18525b6f	pan/midgard: Add mir_schedule_texture/ldst/alu helpers We don't actually do any scheduling here yet, but add per-tag helpers to consume an instruction, print it, pop it off the worklist. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	72a03bcafa	pan/midgard: Add mir_choose_bundle helper It's not always obvious what the optimal bundle type should be. Let's break out the logic to decide. Currently set for purely in-order operation. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	b5396369d2	pan/midgard: Add mir_update_worklist helper After we've chosen an instruction, popped it off, and processed it, it's time to update the worklist, removing that instruction from the dependency graph to allow its dependents to be put onto the worklist. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	826fd7308b	pan/midgard: Add mir_choose_instruction stub In the future, this routine will implement the core scheduling logic to decide which instruction out of the worklist will be scheduled next, in a way that minimizes cycle count and register pressure. In the present, we are more interested in replicating in-order scheduling with the much-more-powerful out-of-order model. So rather than discriminating by a register pressure estimate, we simply choose the latest possible instruction in the worklist. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	f48038b588	pan/midgard: Initialize worklist This flows naturally from the dependency graph Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	a3b46c0db6	pan/midgard: Calculate dependency graph Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	adda411263	pan/midgard: Add flatten_mir helper We would like to flatten a linked list of midgard_instructions into an array of midgard_instruction pointers on the heap. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	0ecfcbf462	pan/midgard: Squeeze indices before scheduling This allows node_count to be correct while scheduling. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	ad05e8a52c	pan/midgard: Fix component count handling for ldst It's not based on the writemask and it can't be inferred; it's just intrinsic to the op itself. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig	cc0544a0f5	pan/midgard: Add missing parans in SWIZZLE definition Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-30 08:40:11 -04:00
Daniel Schürmann	b3c1f601aa	nouveau: set lower_sub = true Subtractions are already implemented as additions anyway. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Eric Anholt	ca1aa5d225	v3d: Enable the late algebraic optimizations to get real subs. This worked better than my original v3d-local pass for just subs, and is a huge win over not producing subs. total instructions in shared programs: 6408469 -> 6167932 (-3.75%) total threads in shared programs: 153784 -> 154104 (0.21%) total uniforms in shared programs: 2157078 -> 1905823 (-11.65%) total max-temps in shared programs: 904546 -> 895796 (-0.97%) total spills in shared programs: 4959 -> 4993 (0.69%) total fills in shared programs: 6558 -> 6670 (1.71%) total sfu-stalls in shared programs: 25845 -> 25175 (-2.59%) total inst-and-stalls in shared programs: 6434314 -> 6193107 (-3.75%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Daniel Schürmann	1d29895e5b	aco: call nir_opt_algebraic_late() exhaustively 57559 shaders in 28980 tests Totals: SGPRS: 2963407 -> 2959935 (-0.12 %) VGPRS: 2014812 -> 2016328 (0.08 %) Spilled SGPRs: 1077 -> 1077 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 10348 -> 10348 (0.00 %) dwords per thread Code Size: 114545436 -> 114498084 (-0.04 %) bytes LDS: 933 -> 933 (0.00 %) blocks Max Waves: 375997 -> 375866 (-0.03 %) Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Daniel Schürmann	0fb27f1e5a	radv/aco: Don't lower subtractions 40228 shaders in 20236 tests Totals: SGPRS: 2045512 -> 2046496 (0.05 %) VGPRS: 1430856 -> 1430464 (-0.03 %) Spilled SGPRs: 1077 -> 1077 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 10348 -> 10348 (0.00 %) dwords per thread Code Size: 77202840 -> 77151832 (-0.07 %) bytes LDS: 863 -> 863 (0.00 %) blocks Max Waves: 260729 -> 260754 (0.01 %) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Daniel Schürmann	239423d234	nir: Remove unnecessary subtraction optimizations These optimizations are already covered after lowering. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Daniel Schürmann	99848a57b7	nir: recombine nir_op_*sub when lower_sub = false There are some optimizations which are only implemented for additions and some optimizations which assume that subtractions have been lowered. By lowering all subtractions first and later recombine for backends which prefer this option, we don't have to implement them twice. This patch also moves lower_negate to nir_opt_algebraic_late() to enable these optimizations for backends which make use of it. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Daniel Schürmann	10e508c815	freedreno: Enable the nir_opt_algebraic_late() pass. Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Eric Anholt	d54ae70ee7	vc4: Enable the nir_opt_algebraic_late() pass. Upcoming changes to sub optimization will make this pass required. Over the course of that series, we see uniforms +.46%, instructions -.24% (seems like a fine tradeoff -- uniforms are 1/2 the size of instructions as far as cache occupancy) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-30 09:44:10 +00:00
Michel Dänzer	f2b8051d69	gitlab-ci: Add test-container:arm64 to needs: for arm64 test jobs Without this, it was theoretically possible for the jobs to run before the docker image was ready. v2: * Use - list syntax instead of [] (Eric Engestrom) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-30 09:17:44 +02:00
Michel Dänzer	42a18280e4	gitlab-ci: Add needs: for x86 buster docker image This allows most build jobs to run before the stretch or arm64 docker images are ready. v2: * Use - list syntax instead of [] (Eric Engestrom) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-30 09:17:38 +02:00
Michel Dänzer	88319f2678	gitlab-ci: Declare needs: for stretch docker image This allows the -old-llvm jobs to run before the buster docker images are ready. v2: Use - list syntax instead of [] (Eric Engestrom) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-30 09:17:00 +02:00
pal1000	ffb0d3a25c	scons: Fix MSYS2 Mingw-w64 build. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> This patch is based on `28e3f85e09/mingw-w64-mesa/link-ole32.patch` but with tweaks to avoid MSVC build break when applied. v2: Create Mingw platform alias pointing to windows host platform define to avoid spurious crosscompilation; v3: Fix obviously wrong compiler flags for swr driver; v4: Update original patch URL because it has been relocated; v5: Don't bother patching autotools stuff as it's not used by MSYS2 Mingw-w64 build and its days are numbered anyway; v6: After Mingw posix flag fix in 295851eb things are far simpler as we don't need more linking of uuid, ole32, version and shell32 than what is already in place.	2019-09-29 10:57:16 +01:00
pal1000	bcb4dfb14b	scons/windows: Support build with LLVM 9. As X86AsmPrinter component is gone, LLVMX86AsmPrinter got replaced with LLVMRemarks, LLVMBitstreamReader and LLVMDebugInfoDWARF. Tests done with llvm-config on both LLVM 8 and 9 indicate that mcjit, bitwriter and x86asmprinter fully fit inside engine component. On other platforms and with meson build mcdisassembler was used to replace X86AsmPrinter but mcdisassembler also fully fits inside engine component for LLVM>=8 according to same tests. v2: Avoid duplicating code related to Mingw pthreads. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> On 19.1 this patch does not apply cleanly without `88eb2a1f`	2019-09-29 10:51:34 +01:00
Vasily Khoruzhick	336b021d36	lima: set uniforms_address lower bits properly Looks like blob uses following values for uniforms buffer: 0 for 8 bytes, 1 for 16 bytes, 2 for 24 bytes, 2 for 32 bytes, 3 for 40 bytes, 3 for 48 bytes, 3 for 56 bytes, 3 for 64 bytes, 4 for 72 bytes. It all looks like log2(size / 8) rounded up, so let's do the same. Fixes: 931fc2a7b3f9 ("lima: do not set the PP uniforms address lowest bits") Reviewed-by: Icenowy Zheng <icenowy@aosc.io> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-28 10:34:19 -07:00
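The "log2(size / 8) rounded up" rule from the message above can be checked against the listed values with a small helper. This is only a sketch: the function name `uniform_size_field` is made up here and is not the code in the lima driver.

```c
#include <stdint.h>

/* Illustrative helper (name is an assumption): returns the smallest n such
 * that (8 << n) >= size_bytes, which equals ceil(log2(size_bytes / 8)) for
 * any size of at least 8 bytes. The 64-bit shift avoids overflow for large
 * 32-bit sizes. */
static unsigned
uniform_size_field(uint32_t size_bytes)
{
    unsigned n = 0;
    while (((uint64_t)8 << n) < size_bytes)
        n++;
    return n;
}
```

With this definition, 8 bytes maps to 0, 16 to 1, 24 and 32 to 2, 40 through 64 to 3, and 72 to 4, matching the values observed from the blob.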
Michel Zou	3f92d17894	scons: add py3 support SCons 3.1 has moved to python 3, requiring this fix to continue supporting scons builds. Closes: #944 Cc: mesa-stable@lists.freedesktop.org Acked-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Eric Engestrom <eric@engestrom.ch>	2019-09-28 16:53:08 +00:00
Mauro Rossi	411e50a8fd	android: aco: add support for libmesa_aco Android building rules are added in src/amd/Android.compiler.mk. The libmesa_aco static library is built conditionally to radeonsi, as done for the vulkan.radv module; this will prevent Android build errors for non-x86 systems. Filter out the compiler/aco_instruction_selection_setup.cpp source, as it is already included by compiler/aco_instruction_selection.cpp and would cause several multiple-definition linker errors. NOTE: libLLVM requires AMDGPU Disassembler to build radv with aco. Fixes: `93c8ebf` ("aco: Initial commit of independent AMD compiler") Fixes: `a70a998` ("radv/aco: Setup alternate path in RADV to support the experimental ACO compiler") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2019-09-28 15:56:34 +02:00
Mauro Rossi	268fb10e9c	android: compiler/nir: build nir_divergence_analysis.c Prerequisite to avoid the following radv linking error happening with aco: FAILED: out/target/product/x86_64/obj_x86/SHARED_LIBRARIES/vulkan.radv_intermediates/LINKED/vulkan.radv.so ... external/mesa/src/amd/compiler/aco_instruction_selection_setup.cpp:178: error: undefined reference to 'nir_divergence_analysis' clang.real: error: linker command failed with exit code 1 (use -v to see invocation) Fixes: `df86c5f` ("nir: add divergence analysis pass.") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2019-09-28 15:56:28 +02:00
Mauro Rossi	c24ad565ae	android: aco: fix undefined template 'std::__1::array' build errors Fixes a few building errors similar to the following: In file included from external/mesa/src/amd/compiler/aco_instruction_selection.cpp:26: In file included from external/libcxx/include/algorithm:639: external/libcxx/include/utility:321:9: error: implicit instantiation of undefined template 'std::__1::array<aco::Temp, 4>' _T2 second; ^ Fixes: `93c8ebf` ("aco: Initial commit of independent AMD compiler") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2019-09-28 15:56:23 +02:00
Jonathan Marek	b38fcaa221	etnaviv: nir: fix gl_FragDepth Fixes the following piglit test: fragdepth_gles2 (for ETNA_MESA_DEBUG=nir) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-09-28 00:34:44 -04:00
Jonathan Marek	d4e35e62d2	etnaviv: disable earlyZ when shader writes fragment depth Fixes the following piglit test: fragdepth_gles2 Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-09-28 00:34:43 -04:00
Jonathan Marek	dc3656c9c4	etnaviv: nir: make lower_alu easier to follow Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-09-28 00:34:43 -04:00
Jonathan Marek	c4f63be5a6	etnaviv: remove extra allocation for shader code Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-09-28 00:34:43 -04:00
Jonathan Marek	0b3957331d	etnaviv: nir: remove "options" struct It just makes things more complicated for no reason. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-09-28 00:34:43 -04:00
Jonathan Marek	8f1b2ea7a9	etnaviv: nir: use store_deref instead of store_output Allows some simplification. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-09-28 00:34:43 -04:00
Jonathan Marek	d92689c46f	etnaviv: nir: add native integers (HALTI2+) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-09-28 00:34:35 -04:00
Jonathan Marek	d446134d2a	etnaviv: nir: use new immediates when possible Note it can still be improved a bit: * Use alu swizzle to determine if src is scalar * Take into account new immediates in the multiple uniform src lowering Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-09-28 00:33:42 -04:00
Jonathan Marek	95fa799c86	etnaviv: nir: set num_components for inputs/outputs This can improve performance by allowing the LAST_VARYING_2X bit to be set when possible (and possibly more benefits on HALTI5 where the number of components is set for each varying). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-09-28 00:33:42 -04:00
Jonathan Marek	0036e078e3	etnaviv: nir: allocate contiguous components for LOAD destination LOAD starts reading into the first enabled destination component, and doesn't skip disabled components, so we need to allocate a destination with contiguous components. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-09-28 00:33:42 -04:00
Jonathan Marek	7da15bdd2d	etnaviv: nir: fix gl_FrontFacing Only invert front facing when glFrontFace is GL_CW. Fixes following deqp test: dEQP-GLES2.functional.shaders.builtin_variable.frontfacing Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-09-28 00:33:33 -04:00
Icenowy Zheng	931fc2a7b3	lima: do not set the PP uniforms address lowest bits The PP uniforms address register in render state is not a direct pointer to the uniforms storage -- instead, it points to a one-item array, and the array item is the real pointer to the uniforms storage. This register reuses some of its LSBs as a size field. Currently the size is set according to the length of the real uniforms storage. However, as the register itself contains only a pointer to the one-item array, the size field should be set to the length of the one-item array minus 1, which means a fixed value of 0. That means we can just omit it now. Test shows this should be the correct approach to set this register. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-28 08:49:20 +08:00
Andrii Simiklit	b32bb888c7	glsl: disallow incompatible matrices multiplication glsl 4.4 spec section '5.9 expressions': "The operator is multiply (*), where both operands are matrices or one operand is a vector and the other a matrix. A right vector operand is treated as a column vector and a left vector operand as a row vector. In all these cases, it is required that the number of columns of the left operand is equal to the number of rows of the right operand. Then, the multiply (*) operation does a linear algebraic multiply, yielding an object that has the same number of rows as the left operand and the same number of columns as the right operand. Section 5.10 “Vector and Matrix Operations” explains in more detail how vectors and matrices are operated on." This fix disallows a multiplication of incompatible matrices like: mat4x3(..) * mat4x3(..) mat4x2(..) * mat4x2(..) mat3x2(..) * mat3x2(..) .... CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111664 Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2019-09-27 21:42:09 +00:00
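The rule quoted from the GLSL spec can be written down as a small check. This is only an illustration of the dimension rule, not the compiler change itself; `struct mat_type`, `can_multiply` and `multiply_result` are names invented for the sketch. Recall that a GLSL `matCxR` type has C columns and R rows.

```c
#include <stdbool.h>

/* Hypothetical description of a GLSL matrix type: matCxR has C columns
 * and R rows, so mat4x3 is { .columns = 4, .rows = 3 }. */
struct mat_type {
    unsigned columns;
    unsigned rows;
};

/* left * right is valid only when the number of columns of the left
 * operand equals the number of rows of the right operand. */
static bool
can_multiply(struct mat_type left, struct mat_type right)
{
    return left.columns == right.rows;
}

/* For a valid product, the result has the rows of the left operand and
 * the columns of the right operand. */
static struct mat_type
multiply_result(struct mat_type left, struct mat_type right)
{
    struct mat_type result = { right.columns, left.rows };
    return result;
}
```

Under this rule `mat4x3 * mat4x3` is rejected (4 columns against 3 rows), while `mat4x3 * mat3x4` is accepted and yields a 3x3 matrix.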
Eric Anholt	67e8977290	turnip: Fix failure behavior of vkCreateGraphicsPipelines. According to the 1.1.123 spec: "The implementation will attempt to create all pipelines, and only return VK_NULL_HANDLE values for those that actually failed." Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-27 13:34:28 -07:00
Eric Anholt	ab3cf128a6	turnip: Silence compiler warning about uninit pipeline. The code was fine as far as I see, but the warning was irritating. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-27 13:34:28 -07:00
Eric Anholt	a6cc68106c	turnip: Add a .editorconfig and .dir-locals.el I was inheriting the one from src/freedreno with funny tabs, while this driver is written with normal Mesa 3-space indents. Unfortunately I have to add both files, because I use emacs and emacs prefers .dir-locals to .editorconfig :( Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-27 13:34:28 -07:00
Eric Anholt	7a4647ee39	shader_enums: Move MAX_DRAW_BUFFERS to this file. We include shader_enums.h from freedreno's compiler for both GL and Vulkan, and the main/config.h include resulted in polluting the namespace with things like MAX_VIEWPORTS that other Vulkan drivers use as their driver-specific maximums. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-27 13:34:28 -07:00
Jason Ekstrand	6c858b9a91	intel/fs: Fix fs_inst::flags_read for ANY/ALL predicates Without this, we were DCEing flag writes because we didn't think their results were used because we didn't understand that an ANY32 predicate actually read all the flags. Fixes: `df1aec763e` "i965/fs: Define methods to calculate the flag..." Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-09-27 19:31:43 +00:00
Christian Gmeiner	2391ef7785	etnaviv: support ARB_framebuffer_object Passes most of piglit's tests regarding arb_framebuffer_object and unlocks some more piglit tests. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-09-27 18:22:08 +00:00
Christian Gmeiner	fd1ed6f4f8	etnaviv: etna_resource_copy_region(..): drop assert We are using util_resource_copy_region(..) as fallback which supports different formats for src and dst. Improves the experience when running deqp or piglit with a debug build. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-09-27 18:22:08 +00:00
Dylan Baker	e456a053c3	meson: Link xvmc with libxv Prior to xvmc 1.0.12 libxvmc incorrectly required libxv, but that was fixed. This results in compilation failures for the gallium xvmc tracker and tools. This patch fixes that by explicitly linking to libxv. Fixes: `22a817af8a` ("meson: build gallium xvmc state tracker") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1844 Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-09-27 16:39:01 +00:00
Dylan Baker	8c5c21d7e3	meson: Try finding libxvmcw via pkg-config before using find_library This fixes cross compiling issues, because pkg-config is less likely to get the wrong libs. v2: - Fix typo in comment Fixes: `22a817af8a` ("meson: build gallium xvmc state tracker") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/939 Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-09-27 16:39:01 +00:00
Andreas Gottschling	c5a2ccec5e	drisw: Fix shared memory leak on drawable resize XDestroyImage will mark the segment as to-be-destroyed, but it will persist until we detach it, and we weren't doing so. Cc: mesa-stable@lists.freedesktop.org Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/121 Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-09-27 16:06:05 +00:00
Adam Jackson	90d58286cc	drisw: Fix and simplify drawable setup We don't want to require a visual for the drawable, because there exist fbconfigs that don't correspond to any visual (say a 565 pixmap|pbuffer config on a depth-24 display). Fortunately, we don't need one either. Passing the visual to XCreateImage serves only to fill in the XImage's {red,green,blue}_mask fields, which libX11 itself never uses; they exist only for the client's convenience, and we don't care. And we already have the drawable depth in glx_config::rgbBits. So replace the XVisualInfo field in the drawable private with a pointer to the glx_config. Having done that driswCreateGCs becomes trivial, so inline it into its caller. Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1194 Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-27 11:18:15 -04:00
Adam Jackson	3c0eb762e2	drisw: Simplify GC setup There's no reason to have two GCs here. The only difference between them is that swapgc would generate graphics exposures, except we only ever use this GC for PutImage, and PutImage doesn't generate graphics exposures. We also don't need to explicitly ChangeGC to GXCopy, because that's the default. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-27 11:18:10 -04:00
Bas Nieuwenhuizen	e4a52bd653	turnip: Add todo for d24_s8 copies Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-27 15:05:21 +02:00
Bas Nieuwenhuizen	fa522b8a47	turnip: Disallow NPoT formats. Copying is a mess for these formats for now. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-27 15:05:21 +02:00
Bas Nieuwenhuizen	9e822957cd	turnip: Always use UINT formats for copies. Looks like r16_unorm might have precision issues. dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.color.r16_unorm.r16_unorm.general_general fails, but the dumped images in the xml are the same so I'd guess the low bits are the issue. r8_unorm and r16_uint work. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-27 15:05:21 +02:00
Bas Nieuwenhuizen	b48fe29e3c	turnip: Add image->image blitting. 3D blits & format reinterpretation are still TBD. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-27 15:05:21 +02:00
Rhys Perry	1f2813e103	aco: don't remove the loop exec mask in transition_to_Exact() No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-09-27 10:57:03 +01:00
Rhys Perry	b711e62e61	aco: set loop_info::has_discard for demotes We need the loop header phis for the outer exec masks. Needed for dEQP-VK.glsl.demote.dynamic_loop_texture Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-09-27 10:57:03 +01:00
Kenneth Graunke	237c7636ca	iris: Only resolve for image levels/layers which are actually in use. There's no need to resolve everything.	2019-09-26 22:49:10 -07:00
Vasily Khoruzhick	6dd0ad66de	lima/ppir: add NIR pass to split varying loads NIR may emit a single intrinsic to load several packed varyings, but that's suboptimal for Utgard PP for several reasons: - varyings that are used as sampler inputs can be passed using pipeline register with increased precision - we have a small number of regs, so using a vec4 reg for storing two vec2 varyings increases reg pressure. Add NIR pass to split a single load into several loads and utilize it in lima. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-26 18:51:10 -07:00
Timur Kristóf	c372dc762d	radv: Fix L2 cache rinse programming. According to radeonsi, GLM doesn't support WB alone, so we have to set INV too when WB is set. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-26 22:18:16 +00:00
Jonathan Marek	8727253329	turnip: emit texture and uniform state Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-09-26 17:18:13 -04:00
Jonathan Marek	cb14f56b4f	turnip: add some shader information in pipeline state This information is needed by texture/uniform descriptors. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-09-26 17:18:13 -04:00
Jonathan Marek	ee4fa15a86	turnip: use nir_opt_copy_prop_vars Avoids getting a "load_output" in a case like this: gl_Position = ubuf.MVP * ubuf.position[gl_VertexIndex]; frag_pos = gl_Position.xyz; Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-09-26 17:18:13 -04:00
Jonathan Marek	b54f9e9e9e	turnip: lower samplers and uniform buffer indices Lower these to something compatible with ir3, and save the descriptor set and binding information. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-09-26 17:18:13 -04:00
Jonathan Marek	c39afe68f0	turnip: basic descriptor sets (uniform buffer and samplers) Mostly copy-paste from radv, with a few modifications. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-09-26 17:18:13 -04:00
Jonathan Marek	386f46ea82	turnip: enable linear filtering Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-09-26 17:18:13 -04:00
Jonathan Marek	02ca326a04	turnip: align layer_size Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-09-26 17:18:13 -04:00
Jonathan Marek	195abadd2c	turnip: use linear tiling for scanout image Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-09-26 17:18:13 -04:00
Jonathan Marek	54c80d080a	turnip: implement image view descriptor Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-09-26 17:18:13 -04:00
Jonathan Marek	5f2fb904a1	turnip: implement sampler state Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-09-26 17:18:13 -04:00
Jonathan Marek	53277757aa	turnip: fix vertex_id ir3 uses non-zero based vertex id for a6xx Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-09-26 17:18:13 -04:00
Jonathan Marek	1e8aff9ff3	turnip: emit shader immediates Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-09-26 17:18:13 -04:00
Jason Ekstrand	5ca4f57469	util/rb_tree: Stop relying on &iter->field != NULL The old version of the iterators relies on a &iter->field != NULL check which works fine on older GCC but newer GCC versions and clang have optimizations that break if you do pointer math on a null pointer. The correct solution to this is to do the null comparisons before we do any sort of &iter->field or use rb_node_data to do the reverse operation. Acked-by: Michel Dänzer <mdaenzer@redhat.com> Tested-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-26 20:36:41 +00:00
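A reduced sketch of the pattern this message describes: compare the node pointer against NULL first, and only then convert it back to the containing structure. The types and names below (`struct node`, `struct item`, `item_from_node`, `item_or_null`) are invented for illustration and are not the `rb_tree` API.

```c
#include <stddef.h>

/* Hypothetical intrusive link, embedded inside a larger structure. */
struct node {
    struct node *next;
};

struct item {
    int value;
    struct node link;   /* deliberately not the first member */
};

/* Convert a non-NULL node pointer back to its containing item. */
#define item_from_node(n) \
    ((struct item *)((char *)(n) - offsetof(struct item, link)))

/* Do the NULL comparison on the node itself, before any pointer
 * arithmetic, instead of testing &item->link against NULL afterwards. */
static struct item *
item_or_null(struct node *n)
{
    return n != NULL ? item_from_node(n) : NULL;
}

/* Iterate over a NULL-terminated chain of embedded nodes. */
static int
sum_items(struct node *head)
{
    int sum = 0;
    for (struct item *it = item_or_null(head); it != NULL;
         it = item_or_null(it->link.next))
        sum += it->value;
    return sum;
}
```

Because the loop condition never depends on the address of a member computed from a possibly-NULL pointer, the compiler has no undefined behavior to exploit when optimizing it.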
Jason Ekstrand	f18aad6dc0	util/rb_tree: Also test _safe iterators Acked-by: Michel Dänzer <mdaenzer@redhat.com> Tested-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-26 20:36:41 +00:00
Eric Anholt	3338d6e5f8	freedreno/a3xx: Mostly fix min-vs-mag filtering decisions on non-mipmap tex. This is based on the fix I used for the same problem on V3D. In this case, it fixes all but the dEQP-GLES2.functional.texture.filtering.2d.*_npot cases of dEQP-GLES2.functional.texture.filtering.2d.*'s failures. Acked-by: Rob Clark <robdclark@chromium.org>	2019-09-26 11:27:31 -07:00
Maya Rashish	e16fadd545	intel/compiler: avoid truncating int64_t to int Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Maya Rashish <maya@netbsd.org>	2019-09-26 17:46:26 +00:00
Icenowy Zheng	a1ff8dbb1e	lima: support rectangle texture As Vasily discovered, the bit 7 of the word 1 of the texture descriptor is set when reloading the framebuffer, to use framebuffer-based offset rather than normalized one. This bit also works for regular textures to enable accessing with non-normalized offset. Add support for rectangle texture by setting this bit for PIPE_TEXTURE_RECT. Suggested-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-26 16:52:22 +00:00
Michel Dänzer	eb03141f52	loader: Avoid use-after-free / use of uninitialized local variables Per the valgrind output below, we were returning the pointer to freed memory if none of the later conditional pointer assignments were executed. This caused dEQP CI jobs to crash on certain runners, presumably due to a double-free down the line. Also, we were skipping to the out: label before the vendor_id & chip_id variables used by it were initialized, resulting in broken LIBGL_DEBUG=verbose output such as libGL: pci id for fd 4: 51108f00:51108f00, driver radeonsi Fixes: `5a545e355b` "loader: always map the "amdgpu" kernel driver name to radeonsi (v2)" ==403== Invalid read of size 1 ==403== at 0x4AFD576: surfaceless_probe_device (platform_surfaceless.c:316) ==403== by 0x4AFD915: dri2_initialize_surfaceless (platform_surfaceless.c:391) ==403== by 0x4AF5EEA: dri2_initialize (egl_dri2.c:984) ==403== by 0x4AF5EEA: dri2_initialize (egl_dri2.c:958) ==403== by 0x4AF1EEC: _eglMatchAndInitialize (egldriver.c:75) ==403== by 0x4AF1F3B: _eglMatchDriver (egldriver.c:96) ==403== by 0x4AE9367: eglInitialize (eglapi.c:617) ==403== by 0x1D99C9: tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) [clone .constprop.57] (in /deqp/modules/gles2/deqp-gles2) ==403== by 0x1DABB0: tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (in /deqp/modules/gles2/deqp-gles2) ==403== by 0x53EBD1: glu::createRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::RenderConfig const&, glu::RenderContext const*) (in /deqp/modules/gles2/deqp-gles2) ==403== by 0x53EFE9: glu::createDefaultRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::ApiType) (in /deqp/modules/gles2/deqp-gles2) ==403== by 0x1DE07A: deqp::gles2::Context::Context(tcu::TestContext&) (in /deqp/modules/gles2/deqp-gles2) ==403== by 0x1DB5EF: deqp::gles2::TestPackage::init() (in /deqp/modules/gles2/deqp-gles2) ==403== Address 0x56bd340 is 0 bytes inside a block of size 4 free'd ==403== at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==403== by 0x4B01767: loader_get_driver_for_fd (loader.c:464) ==403== by 0x4AFD553: surfaceless_probe_device (platform_surfaceless.c:308) ==403== by 0x4AFD915: dri2_initialize_surfaceless (platform_surfaceless.c:391) ==403== by 0x4AF5EEA: dri2_initialize (egl_dri2.c:984) ==403== by 0x4AF5EEA: dri2_initialize (egl_dri2.c:958) ==403== by 0x4AF1EEC: _eglMatchAndInitialize (egldriver.c:75) ==403== by 0x4AF1F3B: _eglMatchDriver (egldriver.c:96) ==403== by 0x4AE9367: eglInitialize (eglapi.c:617) ==403== by 0x1D99C9: tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) [clone .constprop.57] (in /deqp/modules/gles2/deqp-gles2) ==403== by 0x1DABB0: tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (in /deqp/modules/gles2/deqp-gles2) ==403== by 0x53EBD1: glu::createRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::RenderConfig const&, glu::RenderContext const*) (in /deqp/modules/gles2/deqp-gles2) ==403== by 0x53EFE9: glu::createDefaultRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::ApiType) (in /deqp/modules/gles2/deqp-gles2) ==403== Block was alloc'd at ==403== at 0x483577F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==403== by 0x4EE5E09: strndup (strndup.c:43) ==403== by 0x4B010B1: loader_get_kernel_driver_name (loader.c:101) ==403== by 0x4B016AF: loader_get_driver_for_fd (loader.c:462) ==403== by 0x4AFD553: surfaceless_probe_device (platform_surfaceless.c:308) ==403== by 0x4AFD915: dri2_initialize_surfaceless (platform_surfaceless.c:391) ==403== by 0x4AF5EEA: dri2_initialize (egl_dri2.c:984) ==403== by 0x4AF5EEA: dri2_initialize (egl_dri2.c:958) ==403== by 0x4AF1EEC: _eglMatchAndInitialize (egldriver.c:75) ==403== by 0x4AF1F3B: _eglMatchDriver (egldriver.c:96) ==403== by 0x4AE9367: eglInitialize (eglapi.c:617) ==403== by 0x1D99C9: tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) [clone .constprop.57] (in /deqp/modules/gles2/deqp-gles2) ==403== by 0x1DABB0: tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (in /deqp/modules/gles2/deqp-gles2) Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-26 18:00:34 +02:00
Adam Jackson	8b6d3f2c78	Revert "glx: Lift sending the MakeCurrent request to top-level code" Apparently this provokes crashes elsewhere in code unrelated to MakeCurrent. I hate GLX so very very much. This reverts commit `999c2aed88`. Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1207	2019-09-26 11:07:42 -04:00
Adam Jackson	a14e3b43be	Revert "glx: Implement GLX_EXT_no_config_context" This reverts commit `0d635ccc91`. Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1207	2019-09-26 11:07:13 -04:00
Timur Kristóf	30f0c0ea7d	radv: Add debug option to dump meta shaders. This new option can help debug shader compiler problems when there are issues with the meta shaders. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-26 13:36:49 +00:00
Timur Kristóf	a4fd8ba7e3	amd/common: Introduce ac_get_fs_input_vgpr_cnt. Add a function called ac_get_fs_input_vgpr_cnt which will return the number of input VGPRs used by an AMD shader. Previously, radv and radeonsi had the same code duplicated; this commit allows them to share it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-26 13:36:49 +00:00
Timur Kristóf	83eebdb507	radv: Set shared VGPR count in radv_postprocess_config. This commit allows RADV to set the shared VGPR count according to the shader config. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-26 13:36:49 +00:00
Timur Kristóf	7bde4ddaf7	amd/common: Add num_shared_vgprs to ac_shader_config for GFX10. In GFX10 wave64 mode, shared VGPRs allow the two wave halves to share some data with each other. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-26 13:36:49 +00:00
Timur Kristóf	db1fddcf0f	amd/common: Extract some helper functions to ac_shader_util. This commit moves ac_get_tbuffer_format, ac_get_sampler_dim and ac_get_image_dim into ac_shader_util, thus enabling them to be used by compilers other than LLVM. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-26 13:36:49 +00:00
Timur Kristóf	d8b46f8964	amd/common: Move ac_export_mrt_z to ac_llvm_build. The aim of this commit is to keep ac_shader_util LLVM-free, since we would like to use it in ACO later. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-26 13:36:49 +00:00
Rhys Perry	06ea3325c3	aco: CSE readlane/readfirstlane/permute/reduce with the same exec mask v2: rename pass_temp to pass_flags v2: also CSE reductions v3: add ds_swizzle_b32 support v3: check gds/offset0/offset1 fields Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-09-26 13:19:51 +01:00
Rhys Perry	86ecf92c23	aco: don't CSE v_readlane_b32/v_readfirstlane_b32 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-09-26 13:19:51 +01:00
Rhys Perry	3c966fd688	aco,radv: rename record_llvm_ir/llvm_ir_string to record_ir/ir_string Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-26 11:08:47 +01:00
Rhys Perry	ec8ced9123	radv/aco: return a correct name and description for the backend IR Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-26 11:08:43 +01:00
Rhys Perry	15ea1c5cff	aco: store printed backend IR in binary Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-26 11:08:31 +01:00
Rhys Perry	6613b81327	aco,radv/aco: get disassembly for release builds if requested Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-26 11:08:09 +01:00
Rhys Perry	0aef1a230e	radv/aco: actually disable ACO when unsupported We were setting this twice. The second time, we weren't later disabling it if unsupported. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-26 11:04:45 +01:00
Tapani Pälli	031752798b	mesa/st: calculate texture size based on EGLImage miplevel Fixes issues with 'egl-gl_oes_egl_image' Piglit test. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-26 07:55:24 +03:00
Dylan Baker	fafd20f67d	meson: fix logic for generating .pc files with old glvnd We want to generate PC files for non-glvnd builds and for builds with old glvnd, but the current logic doesn't do that, it builds them unconditionally, and for GLES it builds the shared libraries, which is also not what we want. This does not generate .pc files for gles1 or gles2, which we weren't doing before either, so this is not a regression but a return to the status quo. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1838 Fixes: `93df862b6a` ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility") Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-09-25 23:25:27 +00:00
Ian Romanick	7e53bebcb5	nir/range-analysis: Use types to provide better ranges from bcsel and mov Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16328255 -> 16315391 (-0.08%) instructions in affected programs: 218318 -> 205454 (-5.89%) helped: 988 HURT: 0 helped stats (abs) min: 1 max: 72 x̄: 13.02 x̃: 10 helped stats (rel) min: 0.33% max: 16.04% x̄: 6.27% x̃: 4.88% 95% mean confidence interval for instructions value: -13.69 -12.35 95% mean confidence interval for instructions %-change: -6.55% -5.99% Instructions are helped. total cycles in shared programs: 363683977 -> 363615417 (-0.02%) cycles in affected programs: 1475193 -> 1406633 (-4.65%) helped: 923 HURT: 36 helped stats (abs) min: 1 max: 624 x̄: 75.78 x̃: 48 helped stats (rel) min: 0.08% max: 13.89% x̄: 5.20% x̃: 5.08% HURT stats (abs) min: 1 max: 179 x̄: 38.58 x̃: 4 HURT stats (rel) min: 0.06% max: 16.56% x̄: 3.33% x̃: 0.29% 95% mean confidence interval for cycles value: -75.88 -67.10 95% mean confidence interval for cycles %-change: -5.10% -4.66% Cycles are helped. Sandy Bridge total instructions in shared programs: 10785779 -> 10785654 (<.01%) instructions in affected programs: 13855 -> 13730 (-0.90%) helped: 67 HURT: 0 helped stats (abs) min: 1 max: 15 x̄: 1.87 x̃: 1 helped stats (rel) min: 0.20% max: 3.45% x̄: 0.97% x̃: 0.78% 95% mean confidence interval for instructions value: -2.47 -1.26 95% mean confidence interval for instructions %-change: -1.13% -0.81% Instructions are helped. 
total cycles in shared programs: 153704799 -> 153704481 (<.01%) cycles in affected programs: 101509 -> 101191 (-0.31%) helped: 38 HURT: 13 helped stats (abs) min: 1 max: 38 x̄: 12.53 x̃: 16 helped stats (rel) min: 0.07% max: 2.69% x̄: 0.87% x̃: 0.53% HURT stats (abs) min: 1 max: 36 x̄: 12.15 x̃: 7 HURT stats (rel) min: 0.06% max: 2.53% x̄: 0.73% x̃: 0.44% 95% mean confidence interval for cycles value: -10.24 -2.24 95% mean confidence interval for cycles %-change: -0.75% -0.17% Cycles are helped. LOST: 2 GAINED: 0 No shader-db change on Iron Lake or GM45.	2019-09-25 15:37:05 -07:00
Ian Romanick	99ddb41e2d	nir/range-analysis: Use types in the hash key This allows the result of mov and bcsel to be separately interpreted as float or int depending on the use. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-25 15:37:01 -07:00
Ian Romanick	018d2b524a	nir/range-analysis: Bail if the types don't match Some shaders are hurt by this change because now a load_const(0x00000000) is not recognized as eq_zero when loaded as a float. This behavior is restored in a later patch (nir/range-analysis: Use types to provide better ranges from bcsel and mov). v2: Add a comment about reinterpretation of int/uint/bool. Suggested by Caio. Rewrite the condition to check for types being float rather than checking for types not being all the things that aren't float. Fixes: `405de7ccb6` ("nir/range-analysis: Rudimentary value range analysis pass") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16327543 -> 16328255 (<.01%) instructions in affected programs: 55928 -> 56640 (1.27%) helped: 0 HURT: 208 HURT stats (abs) min: 1 max: 16 x̄: 3.42 x̃: 3 HURT stats (rel) min: 0.33% max: 6.74% x̄: 1.31% x̃: 1.12% 95% mean confidence interval for instructions value: 3.06 3.79 95% mean confidence interval for instructions %-change: 1.17% 1.46% Instructions are HURT. total cycles in shared programs: 363682759 -> 363683977 (<.01%) cycles in affected programs: 325758 -> 326976 (0.37%) helped: 44 HURT: 133 helped stats (abs) min: 1 max: 179 x̄: 33.61 x̃: 5 helped stats (rel) min: 0.06% max: 14.21% x̄: 2.47% x̃: 0.29% HURT stats (abs) min: 1 max: 157 x̄: 20.28 x̃: 14 HURT stats (rel) min: 0.07% max: 14.44% x̄: 1.42% x̃: 0.73% 95% mean confidence interval for cycles value: 0.38 13.39 95% mean confidence interval for cycles %-change: -0.06% 0.96% Inconclusive result (%-change mean confidence interval includes 0). 
Sandy Bridge total instructions in shared programs: 10787433 -> 10787443 (<.01%) instructions in affected programs: 1842 -> 1852 (0.54%) helped: 0 HURT: 10 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.33% max: 1.85% x̄: 0.73% x̃: 0.49% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %-change: 0.36% 1.10% Instructions are HURT. total cycles in shared programs: 153724543 -> 153724563 (<.01%) cycles in affected programs: 8407 -> 8427 (0.24%) helped: 1 HURT: 3 helped stats (abs) min: 18 max: 18 x̄: 18.00 x̃: 18 helped stats (rel) min: 0.98% max: 0.98% x̄: 0.98% x̃: 0.98% HURT stats (abs) min: 4 max: 18 x̄: 12.67 x̃: 16 HURT stats (rel) min: 0.21% max: 0.75% x̄: 0.56% x̃: 0.72% 95% mean confidence interval for cycles value: -21.31 31.31 95% mean confidence interval for cycles %-change: -1.11% 1.46% Inconclusive result (value mean confidence interval includes 0). No shader-db changes on Iron Lake or GM45.	2019-09-25 15:37:01 -07:00
Lionel Landwerlin	e5ddbd7a3c	intel: Add new Comet Lake PCI-ids Commit bfc4c359b282 ("drm/i915/cml: Add Missing PCI IDs") in i915 added 3 new CML PCI ids. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-26 01:13:28 +03:00
Lionel Landwerlin	813f3460e7	intel: use proper label for Comet Lake skus Fixes: `82f6a746e8` ("intel: Add support for Comet Lake") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-26 01:13:28 +03:00
Kristian H. Kristensen	06d207a2fa	freedreno/a6xx: Move instrlen and obj_start writes to fd6_emit_shader Consolidate a few more generic shaders setup regs in fd6_emit_shader. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-25 21:39:08 +00:00
Kristian H. Kristensen	cf695ad2ec	freedreno/a6xx: Emit const and texture state for HS/DS/GS Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-25 21:39:08 +00:00
Kristian H. Kristensen	87d234d968	freedreno/ir3: Add HS/DS/GS to shader key and cache Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-25 21:39:08 +00:00
Kristian H. Kristensen	d9c2ceddd2	freedreno/a6xx: Add generic program stateobj support for HS/DS/GS This adds generic stage state setup for HS/DS/GS to the program state object. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-25 21:39:08 +00:00
Kristian H. Kristensen	64bc833f32	freedreno: Move fs functions after geometry pipeline stages Let's try to always order the stages in the pipeline order. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-25 21:39:08 +00:00
Kristian H. Kristensen	00cbb6db09	freedreno: Add state binding functions for HS/DS/GS Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-25 21:39:08 +00:00
Kristian H. Kristensen	2dc4d6c692	freedreno: Rename vp and fp to vs and fs in fd_program_stateobj We're using vs and fs now, and adding hs, ds and gs soon. It's confusing enough that we have both HS/TCS and DS/TES. At least for VS and FS there doesn't have to be multiple names. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-25 21:39:08 +00:00
Kristian H. Kristensen	c99ecf7f96	freedreno/a6xx: Factor out const state setup We'll be sharing this logic for new shader stages soon. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-25 21:39:08 +00:00
Eric Engestrom	b3e3af0e37	glsl: turn runtime asserts of compile-time value into compile-time asserts Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-09-25 21:14:52 +00:00
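As a rough illustration of the idea behind this change (not the code touched by the commit), the sketch below checks a compile-time property with `static_assert` instead of a runtime `assert`. The `Stage` enum, the `stage_names` table and `stage_name()` are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical enum and lookup table, used only for this illustration.
enum Stage { STAGE_VERTEX, STAGE_FRAGMENT, STAGE_COUNT };

static const char *const stage_names[] = { "vertex", "fragment" };

// The table size is known at compile time, so a mismatch can be rejected by
// the compiler. The runtime equivalent,
//   assert(sizeof(stage_names) / sizeof(stage_names[0]) == STAGE_COUNT);
// would only fire when the code path runs, and only in builds with asserts on.
static_assert(sizeof(stage_names) / sizeof(stage_names[0]) ==
                  static_cast<std::size_t>(STAGE_COUNT),
              "stage_names must have exactly one entry per Stage");

// The argument is a runtime value, so a runtime assert is still the right
// tool for the range check here.
inline const char *stage_name(Stage s) {
    assert(s < STAGE_COUNT);
    return stage_names[s];
}
```

Adding an enumerator without extending the table then fails the build rather than a debug run.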
Eric Engestrom	ae8a7d5c8f	docs/release-calendar: add missing <td> and </td> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-09-25 22:13:07 +01:00
Eric Engestrom	f9bb5cd105	docs/release-calendar: fix bugfix release numbers Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-09-25 22:13:07 +01:00
Lionel Landwerlin	da2d67fc3b	anv: gem-stubs: return a valid fd for anv_gem_userptr() Fixes invalid close(-1) in the unit tests. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-25 22:02:51 +03:00
Danylo Piliaiev	2d8f77db83	st/nine: Ignore D3DSIO_RET if it is the last instruction in a shader RET as a last instruction could be safely ignored. Remove it to prevent crashes/warnings in case underlying driver doesn't implement arbitrary returns. A better way would be to remove the RET after the whole shader is parsed which will handle a possible case when the last RET is followed by a comment. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Axel Davy <davyaxel0@gmail.com>	2019-09-25 18:24:01 +00:00
Dylan Baker	60861388e7	bin/get-pick-list: use --oneline=pretty instead of --oneline --oneline shortens hashes, while --oneline=pretty doesn't, otherwise they are the same. Having full hashes is convenient as that is the format that the bin/.cherry-ignore script requires to work correctly. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-09-25 17:50:19 +00:00
Dylan Baker	c8fa996dcf	release: Push 19.3 back two weeks The main reason to do this is that 19.2 has slipped by two weeks, and as such the 19.3 branch is due to happen extremely close to the release of 19.2.0. I think it would be better to have a little more time between releases for developers and for packagers. This would still have the 19.3 release out before December, even if it slips by 1 week. Acked-By: Karol Herbst <kherbst@redhat.com> Acked-by: Juan A. Suarez <jasuarez@igalia.com>	2019-09-25 10:46:59 -07:00
Dylan Baker	666a5a2230	docs: update calendar, add news item, and link release notes for 19.2.0	2019-09-25 10:42:17 -07:00
Dylan Baker	582421285b	docs: add SHA256 sum for 19.2.0	2019-09-25 10:42:17 -07:00
Dylan Baker	8302eb7a8f	docs: Add release notes for 19.2.0	2019-09-25 10:42:17 -07:00
Andreas Baierl	0c199808bc	lima/ppir: Add various varying fetch sources to disassembler Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-25 16:57:31 +00:00
Eric Engestrom	93df862b6a	meson: re-add incorrect pkg-config files with GLVND for backward compatibility This is a bit counter-intuitive, but the issue is that GLVND is broken in versions <= 1.1.1, so we need to keep wrongly providing these files to cover up their mistake, otherwise the rest of the world ends up broken. Suggested-by: Dylan Baker <dylan@pnwbakers.com> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-09-25 17:27:54 +01:00
Rhys Perry	db2ca45102	aco: check for duplicate opcode numbers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-25 15:28:44 +00:00
Rhys Perry	101f47fdd7	aco: fix opcode for s_mul_hi_i32 Fixes dEQP-VK.glsl.builtin.function.integer.imulextended.*_compute Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-25 15:28:44 +00:00
Rhys Perry	2faaf04c62	aco: fix v_subrev_co_u32_e64 opcode Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-25 15:28:44 +00:00
Rhys Perry	00aa413bae	aco: fix GFX9 opcode for v_xad_u32 Fixes various dEQP-VK.image.store.* tests. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-25 15:28:44 +00:00
Rhys Perry	b125dc4839	aco: implement 64-bit ineg We currently lower them, but nir_opt_algebraic() can add new ones because lower_sub=true. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-09-25 15:27:48 +00:00
Rhys Perry	641eac953c	aco: run nir_lower_int64() before nir_lower_idiv() nir_lower_idiv() asserts on 64-bit integers. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-09-25 15:27:48 +00:00
Connor Abbott	36e000d832	nir: Fix overlapping vars in nir_assign_io_var_locations() When handling two variables with overlapping locations, we process the one with lower location first, and then extend the location -> driver_location map to guarantee that it's contiguous for the second variable too. But the loop had the wrong bound, so we weren't extending the map 100%, which could lead to problems later such as an incorrect num_inputs. The loop index i is an index into the slots of the variable, so we need to stop at the final slot of the variable (var_size) instead of the number of unassigned slots. This fixes spec@arb_enhanced_layouts@execution@component-layout@vs-fs-array-interleave-range on radeonsi NIR. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-25 15:53:50 +02:00
Karol Herbst	66456b8d49	clover: eliminate "ignoring attributes on template argument" warning Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <dev@pmoreau.org>	2019-09-25 10:39:58 +00:00
Karol Herbst	4f044c38e2	clover/codegen: remove unused get_symbol_offsets function Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <dev@pmoreau.org>	2019-09-25 10:39:58 +00:00
Karol Herbst	2859c49f7b	clover/llvm: remove harmful std::move call both clang and gcc warn with: "moving a local object in a return statement prevents copy elision" Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <dev@pmoreau.org>	2019-09-25 10:39:58 +00:00
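The warning quoted in this commit can be reproduced with a small sketch. This is an illustration of the general pattern, not the clover code; `Tracked`, `g_move_count`, `make_plain()` and `make_with_move()` are hypothetical names. Returning a local by name lets the compiler elide the copy (or fall back to an implicit move), while wrapping it in `std::move` forces a move construction.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Counts move constructions so the difference is observable (illustration only).
static int g_move_count = 0;

struct Tracked {
    std::vector<int> data;
    Tracked() : data(3, 7) {}
    Tracked(const Tracked &other) : data(other.data) {}
    Tracked(Tracked &&other) noexcept : data(std::move(other.data)) {
        ++g_move_count;
    }
};

// Preferred form: the local is returned by name, so the compiler may build it
// directly in the caller's storage (NRVO) or move it implicitly.
Tracked make_plain() {
    Tracked t;
    return t;
}

// Pessimizing form: std::move turns the operand into an xvalue expression,
// which rules out copy elision and always runs the move constructor.
// Compilers typically warn about this line; it is kept only for comparison.
Tracked make_with_move() {
    Tracked t;
    return std::move(t);
}
```

Both functions return an equivalent object; only the second is guaranteed to pay for a move.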
Tapani Pälli	f4d9169204	iris: disable aux on first get_param if not created with aux This moves the fix from commit `361f3d19f1` to happen in get_param (used now instead of get_handle by st/dri). This fixes artifacts seen with Xorg and CCS_E. Fixes: `fc12fd05f5` "iris: Implement pipe_screen::resource_get_param" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-25 08:28:45 +03:00
Erik Faye-Lund	88f909eb37	glsl: correct bitcast-helpers Without this, we'll incorrectly round off huge values to the nearest representable double instead of keeping it at the exact value as we're supposed to. Found by inspecting compiler-warnings. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `85faf5082f` ("glsl: Add 64-bit integer support for constant expressions") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-25 04:52:54 +00:00
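The rounding described here is what happens when a large 64-bit integer goes through a value conversion to `double` instead of a bit-preserving reinterpretation. The sketch below shows that general distinction only; it is not the patch itself, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Value conversion: the result is the nearest representable double, so
// integers above 2^53 can lose their low bits.
inline double convert_u64_to_double(uint64_t u) {
    return static_cast<double>(u);
}

// Bit-preserving reinterpretation: the 64 bits are copied unchanged.
inline double bitcast_u64_to_double(uint64_t u) {
    static_assert(sizeof(double) == sizeof(uint64_t),
                  "this sketch assumes a 64-bit double");
    double d;
    std::memcpy(&d, &u, sizeof(d));
    return d;
}

// Inverse of the helper above: recover the raw bits of a double.
inline uint64_t bitcast_double_to_u64(double d) {
    uint64_t u;
    std::memcpy(&u, &d, sizeof(u));
    return u;
}
```

For example, `(uint64_t{1} << 53) + 1` does not survive a round trip through `convert_u64_to_double`, but it does survive `bitcast_u64_to_double` followed by `bitcast_double_to_u64`.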
Vasily Khoruzhick	678ebda8b7	lima/ppir: add support for indirect load of uniforms and varyings Utgard PP supports indirect load of uniforms and varyings, so let's enable it. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-24 20:33:27 -07:00
Vasily Khoruzhick	780985d1b8	lima/ppir: add node dependency types Currently we add dependencies in 3 cases: 1) One node consumes value produced by another node 2) Sequence dependencies 3) Write after read dependencies 2) and 3) only affect scheduler decisions since we still can use pipeline register if we have only 1 dependency of type 1). Add 3 dependency types and mark dependencies as we add them. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-24 20:33:13 -07:00
Vasily Khoruzhick	4fcfed426a	lima/ppir: don't attempt to clone tex coords if it's not varying It makes no sense to clone texture coords if it's not varying, moreover we don't support cloning ALU nodes. Fixes: `1c1890fa70` ("lima/ppir: clone uniforms and load_coords into each successor") Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-25 03:07:16 +00:00
Timothy Arceri	0e1310e59f	radeonsi/nir: lower load constants to scalar We call nir_lower_load_const_to_scalar in the state tracker's linker; however, some later passes can reintroduce constant vectors. Here we lower these to scalar and perform optimisations. The Intel drivers do a similar call in their backend. shader-db results VEGA 64: Totals from affected shaders: SGPRS: 152168 -> 151976 (-0.13 %) VGPRS: 135224 -> 135112 (-0.08 %) Spilled SGPRs: 4027 -> 4163 (3.38 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 10670028 -> 10654776 (-0.14 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 13122 -> 13135 (0.10 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-25 02:42:55 +00:00
Jonathan Marek	e353fd096d	turnip: use image tile_mode for gmem configuration Fixes at least this deqp test: dEQP-VK.api.smoke.triangle Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-24 22:32:09 -04:00
Jonathan Marek	f510901dc2	turnip: fix binning shader compilation ir3 segfaults if nonbinning is NULL for the binning pass shader. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-24 22:32:09 -04:00
Rhys Perry	12372d60ff	nir/opt_remove_phis: handle phis with no sources This can happen with loops with unreachable exits which are later optimized away. Fixes assertion in dEQP-VK.graphicsfuzz.unreachable-loops with RADV. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-25 00:58:30 +00:00
Michel Dänzer	67d930d64b	radeonsi: fix VAAPI segfault due to various bugs Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111236	2019-09-24 19:23:30 -04:00
Marek Olšák	f52afdf672	gallium/vl: don't set PIPE_HANDLE_USAGE_EXPLICIT_FLUSH because vl doesn't call flush_resource and I wasn't able to find all places where flush_resource needs to be called. This fixes corrupted / unflushed surfaces with fullscreen videos on Raven. Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>	2019-09-24 19:23:30 -04:00
Marek Olšák	783fae2a1f	radeonsi: initialize displayable DCC using the retile blit to prevent hangs Cc 19.2 <mesa-stable@lists.freedesktop.org>	2019-09-24 19:23:30 -04:00
Connor Abbott	270fe55256	nir/opt_large_constants: Handle store writemasks This fixes some piglit tests on radeonsi NIR where a varying is initialized to a constant array in the vertex shader. Varying packing after nir_lower_io_to_temporaries creates writemasked stores which persist after pulling the constant initialization down into the fragment shader. While we're here, rewrite handle_constant_store() to do the loop over components outside the switch, so that we don't have to duplicate the writemask checking for every bitsize. Fixes: `1235850522` ("nir: Add a large constants optimization pass") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-24 20:59:58 +00:00
Eric Engestrom	da496d4e30	meson: split more compiler options to their own line Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-09-24 19:39:24 +01:00
Eric Engestrom	3fd0afd5e3	meson: drop -Wno-foo bug workaround for Meson < 0.46 This was a workaround for a bug in Meson that was fixed in 0.46 [1]. [1] https://github.com/mesonbuild/meson/pull/2284 Fixes: `f7b6a8d12f` ("meson: bump required version to 0.46") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-09-24 19:39:24 +01:00
Eric Engestrom	30f639c181	radv: fix s/load/store/ copy-paste typo Fixes: `cdc6efddf9` ("radv: implement all depth/stencil resolve modes using graphics") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-24 19:18:54 +01:00
Stephen Barber	8c3ace6991	nouveau: add idep_nir_headers as dep for libnouveau Fixes a compilation error when building libnouveau: In file included from ../src/gallium/drivers/nouveau/nv50/nv50_program.c:25: ../src/compiler/nir/nir.h:1115:10: fatal error: nir_intrinsics.h: No such file or directory #include "nir_intrinsics.h" ^~~~~~~~~~~~~~~~~~ compilation terminated. Fixes: `f014ae3c7c` ("nouveau: add support for nir") Signed-off-by: Stephen Barber <smbarber@chromium.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-09-24 17:27:20 +00:00
Bas Nieuwenhuizen	780182f0a0	radv: Add workaround for hang in The Surge 2. Released today and hangs on RADV. We don't have the root cause yet, but this should unblock people playing the game. No drirc because the radv debugflags are not usable from drirc and I want this backported. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-09-24 09:51:40 +00:00
Andres Gomez	5e87f48f1d	i965/fs: set rounding mode when emitting the flrp instruction flrp was forgotten when already adding the rounding mode for other instructions. Fixes: `ba1e25e1aa` ("i965/fs: set rounding mode when emitting fadd, fmul and ffma instructions") Suggested-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2019-09-24 12:06:59 +03:00
Andres Gomez	6f1468c371	i965/fs: add a comment about how the rounding mode in fmul is set After `1711bf6cf2` ("intel/fs: Generate better code for fsign multiplied by a value"), the conflicts resolution for setting the rounding mode after the fused fmul and fsign optimization is non obvious. Basically, the optimization doesn't really result in a MUL, or any other operation which would need to have the rounding mode set. Hence, we set it just before the actual MUL in the treatment of fmul. Fixes: `ba1e25e1aa` ("i965/fs: set rounding mode when emitting fadd, fmul and ffma instructions") Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2019-09-24 11:24:15 +03:00
Juan A. Suarez Romero	b3c25e6f99	bin/get-pick-list.sh: sha1 commits can be smaller than 8 chars The script only handles commits with "Fixes: <sha1>" where <sha1> is equal to or greater than 8 chars. But <sha1> can be smaller, like 7 chars. This commit relaxes the restriction to handle a <sha1> of 4 or more chars. Fixes: `533fead423` ("bin/get-pick-list.sh: tweak the commit sha matching pattern") Acked-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-09-24 07:36:45 +00:00
Connor Abbott	fed5b605f0	lima/gpir: Fix 64-bit shift in scheduler spilling There are 64 physical registers so the shift must be 64 bits. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-24 08:44:54 +02:00
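A minimal sketch of why the shift width matters when tracking 64 registers in a bitmask. It illustrates the general pitfall and is not the scheduler code; `phys_reg_bit()` and `is_live()` are hypothetical helpers. With a plain `1 << reg` the left operand is a 32-bit `int`, so bits 32..63 cannot be produced (and shifting by 32 or more is undefined behaviour); the 1 has to be a 64-bit value before shifting.

```cpp
#include <cassert>
#include <cstdint>

// Bit for physical register `reg` in a 64-entry mask. The literal is widened
// to 64 bits *before* the shift; `1 << reg` would shift a 32-bit int instead.
inline uint64_t phys_reg_bit(unsigned reg) {
    assert(reg < 64);
    return uint64_t{1} << reg;
}

// True if `reg` is marked live in `live_mask`.
inline bool is_live(uint64_t live_mask, unsigned reg) {
    return (live_mask & phys_reg_bit(reg)) != 0;
}
```

With the widened shift, registers 32 through 63 each get a distinct bit, so `is_live(phys_reg_bit(63), 63)` holds while register 31 stays clear in the same mask.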
Connor Abbott	ef38a659fb	lima/gpir: Don't emit movs when translating from NIR The scheduler doesn't expect them. To do this, I had to refactor the registration part of gpir_node_create_dest() to be separate from creating and inserting the node, since the last two now aren't done when handling moves. This adds more code but creates the possibility of automatically inserting input dependencies when inserting nodes, similar to what's done in NIR with the use-def lists (this isn't done yet). Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-24 08:43:48 +02:00
Connor Abbott	96c31d9a55	lima/gpir: Fix postlog2 fixup handling We guarantee that a complex1 op is always used by postlog2 directly by rewriting the postlog2 op to be a move when there would be a move inserted between them. But we weren't doing this in all circumstances where there might be a move. Move the logic to place_move() so that it always happens. Fixes a few log tests that happened to start failing due to changes in the register allocator leading to a different scheduling order. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-24 08:43:06 +02:00
Connor Abbott	1cd1cce035	lima/gpir: Use registers for values live in multiple blocks This commit adds the framework for cross-basic-block register allocation. Like ARM's compiler, we assume that the value registers aren't usable across branches, which means we have to use physical registers to store any value that crosses a basic block. There are three parts to this: 1. When translating from NIR, we rely on the NIR out-of-ssa pass to coalesce values into registers. We insert store_reg instructions for values used in more than one basic block, and load_reg instructions for values not defined in the same basic block (or defined after their use, for loops). So by the time we've translated out of NIR we've already split things into values (which are only used in the same basic block) and registers (which are only used in different basic blocks than where they're defined). 2. We allocate the registers at the same time that we allocate the values, before the final scheduler. Unlike the values, where the assigned color is fake, we assign the actual physical index & component to physregs at this stage. load_reg and store_reg are treated as moves in the allocator and when creating write-after-read dependencies. 3. Finally, in the main scheduler we have to avoid overwriting existing live physregs when spilling. First, we have to tell the scheduler which physical registers are live at the end of each block, to avoid overwriting those. If a register is only live at the beginning, we can reuse it for spilling after the last original use in the final program happens, i.e. before any original use is scheduled, but we have to be careful to add the proper dependencies so that the spill write is scheduled before the original reads. To handle this we repurpose reg_link for uses to be used by the scheduler. A few register-related things copied over from NIR or from other drivers can be dropped. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-24 08:37:37 +02:00
Connor Abbott	7594ef6eb0	lima/gpir: Support branch instructions Because branch conditions have to be in the pass slot, there is no unconditional branch, and realistically the pass slot has to contain a move when branching (there's nothing it does that would be useful for operating on booleans, so we can't use it for anything when computing the branch condition), we put the branch instruction in the pass slot and at codegen time turn it into a move of the branch condition. This means that it doesn't have to be special-cased like store instructions are in the scheduler. Because of this decision we can remove the half-implemented BRANCH codegen slot. Finally, we (ab)use the existing schedule_first mechanism to make sure that branches are always last in the basic block. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-24 08:35:47 +02:00
Connor Abbott	2df2e081fd	lima/gpir: Only try to place actual children When picking a node to be scheduled, we try to schedule its children as well. But we shouldn't try to schedule nodes which only have a fake dependency on the original node, since this isn't the point of scheduling children at the same time and can break some expectations of the rest of the code. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-24 08:35:26 +02:00
Connor Abbott	f989a024b4	lima/gpir: Fix compiler warning Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-24 08:33:56 +02:00
Adam Jackson	0d635ccc91	glx: Implement GLX_EXT_no_config_context This is the GLX counterpart to EGL_KHR_no_config_context. Contexts may now be created without reference to an fbconfig, in which case it is treated as compatible with any fbconfig (and thus any GLX drawable). Khronos: https://github.com/KhronosGroup/OpenGL-Registry/pull/102 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-09-23 20:39:01 -04:00
Adam Jackson	999c2aed88	glx: Lift sending the MakeCurrent request to top-level code Somewhat terrifyingly, we never sent this for direct contexts, which means the server never knew the context/drawable bindings. To handle this sanely, pull the request code up out of the indirect backend, and rewrite the context switch path to call it as appropriate. This attempts to preserve the existing behavior of not calling unbind() on the context if its refcount would not drop to zero. Of course, you can't just do this indiscriminately, because this is GLX and extant X servers have bugs and everything is terrible. To wit: - For 1.20.x prior to 1.20.6, you can bind a direct context once, but the second time you try to modify the context's binding you will get GLXBadContextTag. This includes unbinding the context. And "deleting" the context will leak memory, because it will still appear to be current. - For 1.19 and earlier, glXMakeCurrent(dpy, None, ctx) should be legal for GL 3.0+ contexts, but the server will throw BadMatch. To guard against this, we only send the request for indirect contexts unless the server is known good, and only mention one context at a time in such a request; if switching between contexts, we first unbind the old, and then bind the new. Note that the second VendorRelease() version is to catch XFree86 4.x and Xorg [67].x, which almost certainly have the above bugs. Other servers might report different version numbers here, but we can't do direct rendering against them, so this should be safe. Fixes glx-make-context, glx-multi-window-single-context and glx-query-drawable-glx_fbconfig_id-window. Sufficiently old piglit will regress on glx-make-glxdrawable-current (throwing BadMatch), which is fixed by mesa/piglit!116.	2019-09-23 20:39:01 -04:00
Adam Jackson	01e437988d	glx: Move vertex array protocol state into the indirect backend Only relevant for indirect contexts, so let's get that code out of the common path.	2019-09-23 20:21:01 -04:00
Kenneth Graunke	b9e93db208	intel: Increase Gen11 compute shader scratch IDs to 64. From the MEDIA_VFE_STATE docs: "Starting with this configuration, the Maximum Number of Threads must be set to (#EU * 8) for GPGPU dispatches. Although there are only 7 threads per EU in the configuration, the FFTID is calculated as if there are 8 threads per EU, which in turn requires a larger amount of Scratch Space to be allocated by the driver." It's pretty clear that we need to increase this for scratch address calculations, because the FFTID has a certain bit-pattern. The quote above seems to indicate that we should increase the actual thread count programmed in MEDIA_VFE_STATE as well, but we think the intention is to only bump the scratch space. Fixes GPU hangs in Bioshock Infinite and Synmark's CSDof on Icelake 8x8. Fixes: `5ac804bd9a` ("intel: Add a preliminary device for Ice Lake") Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-09-23 16:59:40 -07:00
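The scratch sizing arithmetic quoted from the MEDIA_VFE_STATE docs above can be sketched as follows. This is only an illustration: the function name is hypothetical, and the figure of 8 EUs per subslice is an assumption chosen to match the "64" in the commit title, not a value taken from the driver.

```c
#include <assert.h>

/* Hypothetical helper, not Mesa code: number of scratch IDs to reserve per
 * subslice. The FFTID is computed as if every EU had
 * `fftid_threads_per_eu` threads, so scratch space must be sized for that
 * count even when fewer hardware threads actually exist. */
unsigned
sketch_scratch_ids_per_subslice(unsigned eus_per_subslice,
                                unsigned fftid_threads_per_eu)
{
   assert(eus_per_subslice > 0 && fftid_threads_per_eu > 0);
   return eus_per_subslice * fftid_threads_per_eu;
}
```

Under the assumed 8 EUs per subslice, sizing by the 7 real threads would give 56 IDs, while sizing by the 8 threads the FFTID assumes gives 64.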
Kenneth Graunke	50c0dd8621	Revert "intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM" This reverts commit `729de1488f`. It turns out that, although the register is in the logical context, it isn't whitelisted, so we can't actually write it from userspace batch buffers. The write just becomes a noop, which is why we saw no performance changes. I manually whitelisted it, and still observed no performance gains, but it did regress KHR-GL46.texture_cube_map_array.color_depth_attachments on the iris driver. So we might need to fix something before enabling this. To prevent it randomly getting turned on should the kernel ever whitelist this register, we revert the patch for now.	2019-09-23 16:31:23 -07:00
Jason Ekstrand	03911195a3	util/rb_tree: Replace useless ifs with asserts Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-09-23 22:38:30 +00:00
Kenneth Graunke	a733423da5	broadcom/genxml: Stop manually scrubbing 'α' -> "alpha" 'α' has never appeared in any genxml files, so there's no need to replace it with the word "alpha". Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-23 20:24:54 +00:00
Kenneth Graunke	8489206e9d	intel/genxml: Stop manually scrubbing 'α' -> "alpha" 'α' has never appeared in any genxml files, so there's no need to replace it with the word "alpha". Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-09-23 20:24:54 +00:00
Rob Clark	d8cbf1adc1	freedreno/a6xx: do streamout only in binning pass Use VPC_SO_OVERRIDE to control whether we do streamout in binning or draw pass. Normally we want to do streamout in binning pass, except when there is a single tile and binning pass is skipped. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-23 20:02:34 +00:00
Rob Clark	b9bf374512	freedreno/a6xx: fix binning pass vs. xfb We could be doing streamout from binning pass. In this case we want to use the full VS which doesn't have (potentially streamed out) varyings stripped out. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-23 20:02:34 +00:00
Rob Clark	331f89a971	freedreno/a6xx: un-open-code PC_PRIMITIVE_CNTL_1.PSIZE Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-23 20:02:34 +00:00
Marek Olšák	05d32850ff	ac/nir: force unnormalized coordinates for RECT This fixes VAAPI. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-23 15:34:54 -04:00
Marek Olšák	500181b2ba	ac/nir: port Z compare value clamping from radeonsi This fixes some dEQP tests. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-23 15:34:54 -04:00
Marek Olšák	09447ccc78	tgsi_to_nir: fix 2-component system values like tess_level_inner_default Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-23 15:34:56 -04:00
Marek Olšák	3906fce88b	tgsi_to_nir: fix masked out image loads This caused a failure in NIR validation. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-23 15:34:54 -04:00
Marek Olšák	780eeaf2f1	nir: define 8-byte size and alignment for bindless variables Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-23 15:34:22 -04:00
Marek Olšák	f5c103ce1d	nir: don't add bindless variables to num_textures and num_images It confuses radeonsi. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-23 15:34:05 -04:00
Marek Olšák	150f6ffb4c	amd: remove all PCI IDs supported by amdgpu Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-23 15:15:35 -04:00
Jiang, Sonny	5a545e355b	loader: always map the "amdgpu" kernel driver name to radeonsi (v2) v2: cleanup Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-23 15:14:11 -04:00
Marek Olšák	9429714233	ac: stop using PCI IDs for chip identification PCI IDs for amdgpu will be removed from Mesa. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-23 15:14:11 -04:00
Marek Olšák	48742de601	ac/addrlib: fix chip identification for Vega10, Arcturus, Raven2, Renoir Cc: 19.2 <mesa-stable@lists.freedesktop.org> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-23 15:14:11 -04:00
Marek Olšák	65b698136c	amd: add more PCI IDs for Navi14 trivial and urgent Cc: 19.2 <mesa-stable@lists.freedesktop.org>	2019-09-23 15:12:34 -04:00
Eric Engestrom	c29c410182	meson: split compiler warnings one per line Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-09-23 17:56:22 +01:00
Jason Ekstrand	d63162cff0	nir/repair_ssa: Replace the unreachable check with the phi builder In `a3268599f3`, I attempted to fix nir_repair_ssa for unreachable blocks. However, that commit missed the possibility that the use is in a block which, itself, is unreachable. In this case, we can end up in an infinite loop trying to replace a def with itself. Even though a no-op replacement is a fine operation, it keeps extending the end of the uses list as we're walking it. Instead of explicitly checking for the group of conditions, just check if the phi builder gives us a different def. That's guaranteed to be 100% reliable and, while it lacks symmetry with the is_valid checks, should be more reliable. Fixes: `a3268599` "nir/repair_ssa: Repair dominance for unreachable..." Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-09-23 16:19:24 +00:00
Daniel Schürmann	2c050b49b3	aco: only emit waitcnt on loop continues if there was some load or export Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-09-23 13:39:33 +02:00
Karol Herbst	70e39294d7	nv50/ir/nir: comparison of integer expressions of different signedness warning Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2019-09-23 13:27:32 +02:00
Karol Herbst	61ccca12f5	nv50/ir: fix unnecessary parentheses warning Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2019-09-23 13:27:32 +02:00
Erico Nunes	ab49a0e746	lima: remove partial clear support from pipe->clear() pipe->clear() is not called for partial clears, which mesa emulates by drawing a quad. Furthermore, drivers should not use rasterizer state information for scissor information (which was being used to handle the partial clears). So, remove the partial clear support since it was not supposed to be handled by pipe->clear() anyway. This fixes issues with clearing after switching to different sized framebuffers. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-09-23 12:19:10 +02:00
Boris Brezillon	0c6ca0a647	dEQP-GLES2.functional.buffer.write.use.index_array.* are passing now. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-09-23 09:48:38 +02:00
Boris Brezillon	055497fa84	panfrost: Fix indexed draws ->padded_count should be large enough to cover all vertices pointed by the index array. Use the local vertex_count variable that contains the updated vertex_count value for the indexed draw case. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-23 09:47:41 +02:00
Karol Herbst	697eb8f973	clover/nir: fix compilation with g++-5.5 and maybe earlier fixes "sorry, unimplemented: non-trivial designated initializers not supported" Fixes: `deb04adf2a` ("clover: add support for passing kernels as nir to the driver") Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-09-23 07:09:41 +00:00
Kenneth Graunke	ec81f19b44	st/mesa: Bail on incomplete attachments in discard_framebuffer Incomplete attachments don't have an associated pipe_surface, so this would crash. Fixes a WebGL conformance test that uses incomplete attachments: https://www.khronos.org/registry/webgl/sdk/tests/conformance2/renderbuffers/invalidate-framebuffer.html?webglVersion=2&quiet=0&quick=1 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111756 Reviewed-By: Tapani Pälli <tapani.palli@intel.com>	2019-09-22 21:03:16 -07:00
Vasily Khoruzhick	d214778753	lima: implement BO cache Allocating BOs is expensive, so we should avoid doing that by caching freed BOs. BO cache is modelled after one in v3d driver and works as follows: - in lima_bo_create() check if we have matching BO in cache and return it if there's one, allocate new BO otherwise. - in lima_bo_unreference() (renamed from lima_bo_free()): put BO in cache instead of freeing it and remove all stale BOs from cache Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-22 19:20:59 -07:00
Vasily Khoruzhick	9f897a2b4c	lima: use 0 to poll if BO is busy in lima_bo_wait() os_time_get_absolute_timeout(0) returns current time, while kernel driver expects 0 as value to poll BO status and return immediately. Fix it by setting abs_timeout to 0 if timeout_ns is 0 Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-22 19:20:59 -07:00
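The timeout handling described above can be sketched as follows. This is a minimal illustration, not the driver's code: the function name is hypothetical, the current time is passed in explicitly instead of being read from a clock, and the overflow saturation is an added assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not lima code: turn a relative timeout into the
 * absolute timeout handed to the kernel. A relative timeout of 0 stays 0,
 * which the kernel treats as "poll and return immediately"; converting it
 * to the current time would turn the poll into a wait. */
uint64_t
sketch_abs_timeout(uint64_t now_ns, uint64_t timeout_ns)
{
   if (timeout_ns == 0)
      return 0;                       /* poll the BO status */

   if (timeout_ns > UINT64_MAX - now_ns)
      return UINT64_MAX;              /* saturate instead of wrapping */

   return now_ns + timeout_ns;
}
```

Keeping 0 as a distinct value means a caller can check whether a BO is busy without ever blocking.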
Qiang Yu	7f7ac21088	lima: move damage bound build to resource Reviewed-and-Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-09-23 09:48:55 +08:00
Qiang Yu	4ed569eed7	lima: don't use damage system when full damage Sometimes weston sets a full damage region. It is more efficient to use the cached pp stream instead of dynamically creating one. Reviewed-and-Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-09-23 09:48:50 +08:00
Qiang Yu	afbaed906d	lima: implement EGL_KHR_partial_update This extension sets a damage region for each buffer swap, which can be used to reduce buffer reload cost by feeding only the damage region's tile buffer addresses to the PP. Reviewed-and-Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-09-23 09:48:15 +08:00
Icenowy Zheng	8278b236b0	lima: fix PLBU viewport configuration The PLBU expects the viewport's 4 borders' coordinates, however currently we're feeding the coordinate of the left-bottom point and the size to it, which leads to misrendering when the left-bottom point is not (0,0). Change the macros for the viewport PLBU command, and the data fed to it. The code to calculate the 4 borders is ported from Panfrost. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-09-22 15:22:38 +08:00
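The difference between "left-bottom point plus size" and "four borders" can be sketched as follows. The struct and function names are hypothetical, and the sketch starts from an origin and a size for simplicity; the calculation actually ported from Panfrost may derive the borders from other viewport state.

```c
#include <assert.h>

/* Hypothetical types and names, for illustration only. */
struct sketch_viewport_borders {
   float left, right, bottom, top;
};

/* Convert a viewport given as left-bottom point (x, y) plus size into the
 * four border coordinates. Passing (x, y, width, height) straight through
 * as if they were borders only happens to work when x == 0 and y == 0. */
struct sketch_viewport_borders
sketch_viewport_to_borders(float x, float y, float width, float height)
{
   assert(width >= 0.0f && height >= 0.0f);

   struct sketch_viewport_borders b = {
      .left = x,
      .right = x + width,
      .bottom = y,
      .top = y + height,
   };
   return b;
}
```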
Bas Nieuwenhuizen	40087ffc5b	amd: Build aco only if radv is enabled ACO depends on C++14, but radeonsi/radv with LLVM 8,9 do not. Let us only require it for RADV, since that is the only user. Fixes: `a70a998718` "radv/aco: Setup alternate path in RADV to support the experimental ACO compiler" Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-21 14:06:41 +02:00
Karol Herbst	7955fabcf8	nvc0: expose spirv support required for OpenCL v2: adjust to changes in previous commits v3: properly convert to NIR in nvc0_cp_state_create Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> (v1)	2019-09-21 08:28:32 +00:00
Karol Herbst	deb04adf2a	clover: add support for passing kernels as nir to the driver v2: minor formatting fixes v3: call glsl_type_singleton_init_or_ref and glsl_type_singleton_decref v4: capitalize and punctuate comments fix text_executable -> text_intermediate in TODO make glsl_type_singleton wrapper static v5: rewrite how we run the nir passes v6: fix unhandled case switch warning in st/mesa Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> (v4)	2019-09-21 08:28:32 +00:00
Karol Herbst	1befaf4417	clover: prepare supporting multiple IRs v2: rework arguments to compiler::compile_program add assert to device::ir_format v3: remove PIPE_SHADER_IR_SPIRV change title Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> (v2) Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-09-21 08:28:32 +00:00
Karol Herbst	c8cd8e279d	clover: add support for drivers having no proper binary format Most drivers have actually no binary format and just store the IR directly as a single entry point blob. v2: add a cap to switch between single or multi entry point binaries v3: remove the entry_point field v4: remove PIPE_CAP_MULTI_ENTRY_POINT_BINARIES v5: remove supports_multiple_entry_points Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-09-21 08:28:32 +00:00
Karol Herbst	1982ac6d6b	clover/functional: add id_equals helper v2: pass argument by value Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-09-21 08:28:32 +00:00
Karol Herbst	f3ba98cb18	rename pipe_llvm_program_header to pipe_binary_program_header We want to use it for other formats as well, so give it a more generic name Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-09-21 08:28:32 +00:00
Karol Herbst	b6c47abe3e	gallium: add blob field to pipe_llvm_program_header makes it easier to consume a IR_NATIVE binary Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-09-21 08:28:32 +00:00
Pierre Moreau	2043c5f37c	clover/llvm: Add functions for compiling from source to SPIR-V Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-09-21 08:28:32 +00:00
Pierre Moreau	975a3c6ad3	clover/llvm: Add options for dumping SPIR-V binaries Reviewed-by: Karol Herbst <kherbst@redhat.com> Acked-by: Francisco Jerez <currojerez@riseup.net>	2019-09-21 08:28:32 +00:00
Pierre Moreau	2147386505	clover/spirv: Add functions for parsing arguments, linking programs, etc. v2 (Karol Herbst): silence warnings about unhandled enum values v3 (Karol Herbst): added back array size parsing (needed for structs passed by value) Acked-by: Francisco Jerez <currojerez@riseup.net> (v2)	2019-09-21 08:28:32 +00:00
Pierre Moreau	939a7e9a5c	clover/spirv: Add functions for validating SPIR-V binaries Changes since: * v12: - remove autotools (Karol Herbst) - Remove the callback in format_validation_msg. (Francisco Jerez) - Removed is_binary_spirv. (Francisco Jerez) - Pass a string reference to is_valid_spirv instead of the notification callback. (Francisco Jerez) * v11: Fix compilation error introduced in v11. * v10: - Reuse format_validation_msg in is_valid_spirv. - Remove LVL2STR macro in format_validation_msg. * v9: Add `clover_cpp_std` to the overrides of the `libclspirv` target in Meson. * v7: Add DEFINES to libclspirv and libclover, in autotools, as they would otherwise never know whether CLOVER_ALLOW_SPIRV has been defined (Dave Airlie) * v6: Update the dependency name (meson) and the libs variable (Makefile) due to the replacement of llvm-spirv to the new official SPIRV-LLVM-Translator. * v5: Changed to match the updated “clover/llvm: Allow translating from SPIR-V to LLVM IR” in the v6. Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-09-21 08:28:32 +00:00
Pierre Moreau	866f6f11d9	meson: Check for SPIRV-Tools and llvm-spirv Changes since: * v12 (Karol Herbst): - rename CLOVER_ALLOW_SPIRV to HAVE_CLOVER_SPIRV * v11 (Karol Herbst): - only set new defines for clover to speed up recompilation - remove autotools * v10: - Add a new flag (`--enable-opencl-spirv` for autotools, and `-Dopencl-spirv=true` for meson) for enabling SPIR-V support in clover, and never automagically enable it without that flag. (Dylan Baker) - When enabling the SPIR-V support, the SPIRV-Tools and SPIRV-LLVM-Translator libraries are now required dependencies. * v7: - Properly align LLVMSPIRVLib comment (Dylan Baker) - Only define CLOVER_ALLOW_SPIRV when both dependencies are found: autotools was only requiring one or the other. * v6: Replace the llvm-spirv repository by the new official SPIRV-LLVM-Translator. * v4: Add a comment saying where to find llvm-spirv (Karol Herbst). * v3: - make SPIRV-Tools and llvm-spirv optional (Francisco Jerez); - bump requirement for llvm-spirv to version 0.2 * v2: - Bump the required version of SPIRV-Tools to the latest release; - Add a dependency on llvm-spirv. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v10) Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-09-21 08:28:32 +00:00
Kenneth Graunke	aa7ac32976	isl: Drop WaDisableSamplerL2BypassForTextureCompressedFormats on Gen11 Gen11 doesn't require us to bypass the L2 cache for BC* images anymore. The documentation is a bit hard to follow on this point, but the Windows driver clearly only applies this workaround on Gen9, and their commit history indicates that this was an intentional change to drop the workaround for Gen11+. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-20 15:35:17 -07:00
Hal Gentz	57c894334e	gallium/osmesa: Fix the inability to set no context as current. Currently there is no way to make no context current w/gallium + osmesa. The non-gallium version of osmesa does this if the context and buffer passed to `OSMesaMakeCurrent` are both null. This small change makes it so that this is also the case with the gallium version. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Hal Gentz <zegentzy@protonmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-20 14:04:12 -06:00
Adam Jackson	6e4fd14b0f	libgbm: Wire up getCapability for the image loader	2019-09-20 19:10:31 +00:00
Adam Jackson	55a1b583d9	egl/surfaceless: Add FP16 format support Reviewed-by: Kevin Strasser <kevin.strasser@intel.com>	2019-09-20 19:10:31 +00:00
Adam Jackson	d01406133d	egl/wayland: Implement getCapability for the dri2 and image loaders Reviewed-by: Kevin Strasser <kevin.strasser@intel.com>	2019-09-20 19:10:31 +00:00
Adam Jackson	e74c947359	egl/wayland: Add FP16 format support Reviewed-by: Kevin Strasser <kevin.strasser@intel.com>	2019-09-20 19:10:31 +00:00
Adam Jackson	cb8bbbef31	egl/wayland: Reindent the format table No idea how these ended up with 3-then-2-space indents. Reviewed-by: Kevin Strasser <kevin.strasser@intel.com>	2019-09-20 19:10:31 +00:00
Jason Ekstrand	7d861ab812	anv: Advertise VK_KHR_shader_subgroup_extended_types Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-20 18:02:15 +00:00
Jason Ekstrand	03255da225	intel/fs: Do 8-bit subgroup scan operations in 16 bits Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-20 18:02:15 +00:00
Jason Ekstrand	651725f7a1	intel/fs: Allow CLUSTER_BROADCAST to do type conversion We can't really handle it in the little-core 64-bit case but it's not really needed there. Where we really want this is for when we need to do 16 -> 8-bit conversions. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-20 18:02:15 +00:00
Jason Ekstrand	3515c0e9cf	intel/fs: Allow UB, B, and HF types in brw_nir_reduction_op_identity Because byte immediates aren't a thing on GEN hardware, we return a signed or unsigned word immediate in the byte case. Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-20 18:02:15 +00:00
Paulo Zanoni	10532c6831	intel/fs: don't forget the stride at generate_shuffle During generate_shuffle(), when we use byte sized registers we end up with a destination stride of 2. We don't take the stride into consideration when selecting the group offset for the last MOV operation, which means we end up moving things to the wrong place, leaving the last few channels untouched. Take the destination stride in consideration so we don't miss the last channels. v2: Assert this is not necessary for the IVB special case (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-20 10:57:05 -07:00
Jason Ekstrand	dae33052db	util/rb_tree: Reverse the order of comparison functions The new order matches that of the comparison functions accepted by the C standard library qsort() functions. Being consistent with qsort will hopefully help avoid developer confusion. The only current user of the red-black tree is aub_mem.c which is pretty easy to fix up. Reviewed-by: Lionel Landwerlin <lionel.g.lndwerlin@intel.com>	2019-09-20 17:37:25 +00:00
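For reference, the qsort() comparison convention that the commit above aligns with can be sketched as follows. The names are hypothetical and the example uses plain qsort() on integers; it does not show the rb_tree API itself.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort()-style comparator: receives (a, b) in that order and returns a
 * negative value if a sorts before b, zero if they are equal, and a
 * positive value if a sorts after b. */
int
sketch_cmp_int(const void *a, const void *b)
{
   int x = *(const int *)a;
   int y = *(const int *)b;
   return (x > y) - (x < y);   /* avoids the overflow of x - y */
}

/* Sort an array of ints in ascending order using the comparator above. */
void
sketch_sort_ints(int *values, size_t count)
{
   qsort(values, count, sizeof(*values), sketch_cmp_int);
}
```

A comparator written to this convention can be read the same way wherever it is used, which is the confusion the commit message aims to avoid.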
Jason Ekstrand	d35d7346d2	util/rb_tree: Add the unit tests When I wrote the red-black tree implementation, I wrote tests for it but they never got imported into mesa. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-20 17:37:25 +00:00
Eric Engestrom	3c1a24de07	anv: implement ICD interface v4 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-20 08:31:58 +00:00
Eric Engestrom	19db95e78e	anv: split instance dispatch table This effectively breaks the instance dispatch table in 2 with entry points using a physical device as first argument getting their own dispatch table. As a result we now have to check instance & physical device dispatch table instead of just the instance dispatch table before. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-20 08:31:58 +00:00
Adam Jackson	88b8922f57	glx: Fix drawable lookup bugs in glXUseXFont We were using the current drawable of the context to name the appropriate screen for creating the bitmaps. But one, the current drawable can be None, and two, it can be a GLXDrawable. Passing either one as the second argument to XCreatePixmap will throw BadDrawable. Use the root window of the context's screen instead. Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/89 LOLed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-19 21:06:01 -04:00
Adam Jackson	b4fe0b3ffd	glx: Avoid atof() when computing the server's GLX version atof() is locale-dependent (sigh), which means 1.3 becomes 1.0 if the locale's decimal separator isn't a full-stop. Just use the protocol major/minor instead. This would be slightly broken if the server generically implements 1.3+ but a particular screen is only capable of less, but in practice no such servers exist. Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/74 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-19 20:50:01 -04:00
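Comparing the protocol major/minor as integers, as the commit above suggests, avoids any dependence on the locale's decimal separator. A minimal sketch, with a hypothetical function name that is not part of the GLX code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true when version major.minor is at least
 * want_major.want_minor. Only integer comparisons are used, so the result
 * does not depend on how the locale parses "1.3". */
bool
sketch_version_at_least(int major, int minor, int want_major, int want_minor)
{
   assert(minor >= 0 && want_minor >= 0);
   return major > want_major ||
          (major == want_major && minor >= want_minor);
}
```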
Ian Romanick	317a88b920	nir/algebraic: Additional D3D Boolean optimization I observed this pattern in several shaders in Hand of Fate 2 while investigating bugzilla #111490. This also led to the related bugzilla #111578. The shaders from HoF2 are not in shader-db. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Skylake and Ice Lake had similar results. (Ice Lake shown) total instructions in shared programs: 16222621 -> 16205419 (-0.11%) instructions in affected programs: 798418 -> 781216 (-2.15%) helped: 548 HURT: 0 helped stats (abs) min: 2 max: 158 x̄: 31.39 x̃: 35 helped stats (rel) min: 0.45% max: 28.64% x̄: 2.83% x̃: 2.09% 95% mean confidence interval for instructions value: -33.22 -29.56 95% mean confidence interval for instructions %-change: -3.11% -2.56% Instructions are helped. total cycles in shared programs: 364676209 -> 363345763 (-0.36%) cycles in affected programs: 112810504 -> 111480058 (-1.18%) helped: 546 HURT: 7 helped stats (abs) min: 2 max: 118913 x̄: 2439.77 x̃: 2340 helped stats (rel) min: 0.08% max: 37.56% x̄: 1.46% x̃: 1.08% HURT stats (abs) min: 2 max: 770 x̄: 238.00 x̃: 43 HURT stats (rel) min: 0.02% max: 11.24% x̄: 3.71% x̃: 0.35% 95% mean confidence interval for cycles value: -2884.33 -1927.41 95% mean confidence interval for cycles %-change: -1.59% -1.21% Cycles are helped. total spills in shared programs: 8870 -> 8514 (-4.01%) spills in affected programs: 1230 -> 874 (-28.94%) helped: 161 HURT: 0 total fills in shared programs: 21901 -> 21348 (-2.52%) fills in affected programs: 2120 -> 1567 (-26.08%) helped: 155 HURT: 5 Broadwell and Haswell had similar results. 
(Broadwell shown) total instructions in shared programs: 14994910 -> 14975495 (-0.13%) instructions in affected programs: 839033 -> 819618 (-2.31%) helped: 548 HURT: 0 helped stats (abs) min: 2 max: 299 x̄: 35.43 x̃: 49 helped stats (rel) min: 0.39% max: 19.89% x̄: 2.91% x̃: 2.22% 95% mean confidence interval for instructions value: -37.46 -33.40 95% mean confidence interval for instructions %-change: -3.12% -2.70% Instructions are helped. total cycles in shared programs: 386032453 -> 384450722 (-0.41%) cycles in affected programs: 117807357 -> 116225626 (-1.34%) helped: 547 HURT: 6 helped stats (abs) min: 2 max: 22096 x̄: 2892.01 x̃: 3926 helped stats (rel) min: 0.17% max: 10.34% x̄: 1.56% x̃: 1.31% HURT stats (abs) min: 4 max: 60 x̄: 32.83 x̃: 29 HURT stats (rel) min: 0.38% max: 12.79% x̄: 5.86% x̃: 4.65% 95% mean confidence interval for cycles value: -3060.28 -2660.27 95% mean confidence interval for cycles %-change: -1.59% -1.37% Cycles are helped. total spills in shared programs: 23372 -> 21869 (-6.43%) spills in affected programs: 11730 -> 10227 (-12.81%) helped: 352 HURT: 0 total fills in shared programs: 34747 -> 35351 (1.74%) fills in affected programs: 11013 -> 11617 (5.48%) helped: 3 HURT: 347 Ivy Bridge and Sandybridge had similar results. (Ivy Bridge shown) total instructions in shared programs: 11956420 -> 11956126 (<.01%) instructions in affected programs: 14898 -> 14604 (-1.97%) helped: 98 HURT: 0 helped stats (abs) min: 3 max: 3 x̄: 3.00 x̃: 3 helped stats (rel) min: 1.30% max: 3.57% x̄: 2.08% x̃: 2.00% 95% mean confidence interval for instructions value: -3.00 -3.00 95% mean confidence interval for instructions %-change: -2.18% -1.98% Instructions are helped. 
total cycles in shared programs: 178791217 -> 178790792 (<.01%) cycles in affected programs: 149763 -> 149338 (-0.28%) helped: 91 HURT: 7 helped stats (abs) min: 3 max: 107 x̄: 20.63 x̃: 16 helped stats (rel) min: 0.13% max: 6.91% x̄: 1.40% x̃: 1.18% HURT stats (abs) min: 3 max: 322 x̄: 207.43 x̃: 322 HURT stats (rel) min: 0.14% max: 19.85% x̄: 12.73% x̃: 17.41% 95% mean confidence interval for cycles value: -18.94 10.27 95% mean confidence interval for cycles %-change: -1.28% 0.49% Inconclusive result (value mean confidence interval includes 0).	2019-09-19 14:22:22 -07:00
Ian Romanick	92f70df8c3	nir/algebraic: Do not apply late DPH optimization in vertex processing stages Some shaders do not use 'invariant' in vertex and (possibly) geometry shader stages on some outputs that are intended to be invariant. For various reasons, this optimization may not be fully applied in all shaders used for different rendering passes of the same geometry. This can result in Z-fighting artifacts (at best). For now, disable this optimization in these stages. In tessellation stages applications seem to use 'precise' when necessary, so allow the optimization in those stages. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111490 Fixes: `09705747d7` ("nir/algebraic: Reassociate fadd into fmul in DPH-like pattern") All Gen8+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16194726 -> 16344745 (0.93%) instructions in affected programs: 2855172 -> 3005191 (5.25%) helped: 6 HURT: 20279 helped stats (abs) min: 1 max: 3 x̄: 1.33 x̃: 1 helped stats (rel) min: 0.44% max: 1.00% x̄: 0.54% x̃: 0.44% HURT stats (abs) min: 1 max: 32 x̄: 7.40 x̃: 7 HURT stats (rel) min: 0.14% max: 42.86% x̄: 8.58% x̃: 6.56% 95% mean confidence interval for instructions value: 7.34 7.45 95% mean confidence interval for instructions %-change: 8.48% 8.67% Instructions are HURT. total cycles in shared programs: 364471296 -> 365014683 (0.15%) cycles in affected programs: 32421530 -> 32964917 (1.68%) helped: 2925 HURT: 16144 helped stats (abs) min: 1 max: 403 x̄: 18.39 x̃: 5 helped stats (rel) min: <.01% max: 22.61% x̄: 1.97% x̃: 1.15% HURT stats (abs) min: 1 max: 18471 x̄: 36.99 x̃: 15 HURT stats (rel) min: 0.02% max: 52.58% x̄: 5.60% x̃: 3.87% 95% mean confidence interval for cycles value: 21.58 35.41 95% mean confidence interval for cycles %-change: 4.36% 4.52% Cycles are HURT.	2019-09-19 14:21:31 -07:00
Andres Gomez	bcd9224728	docs/features: Update VK_KHR_display_swapchain status It was set as done by mistake. Fixes: `bc15d74529` ("docs/features: Mark some Vulkan extensions as done") Signed-off-by: Andres Gomez <agomez@igalia.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-19 23:45:17 +03:00
Andres Gomez	53c24cfd8a	docs/features: Update status list of Vulkan extensions To get the extension list: $ git grep -hE "extension name=\"VK_KHR" src/vulkan/registry/vk.xml | grep -v disabled | awk '{print $2}' | sed -E 's/(name=)?"//g' | sort To find anv(il) and radv supported extensions: $ git grep -hE "'VK_([A-Z]+)_[a-z,0-9]" src/intel/ $ git grep -hE "'VK_([A-Z]+)_[a-z,0-9]" src/amd/ v2: - Keep VK_KHR_device_group and VK_KHR_device_group_creation as not started (Jason). Signed-off-by: Andres Gomez <agomez@igalia.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-19 23:39:26 +03:00
Jason Ekstrand	0c4e89ad5b	Move blob from compiler/ to util/ There's nothing whatsoever compiler-specific about it other than that's currently where it's used. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-19 19:56:22 +00:00
Boris Brezillon	fc5a87715a	Revert "panfrost: Rework midgard_pair_load_store() to kill the nested foreach loop" There's a missing prev_ldst = NULL; assignment in the new logic, but even with this fixed it seems to regress some applications, so let's revert the change until we find the real problem. This reverts commit `c9bebae287`.	2019-09-19 21:01:27 +02:00
Caio Marcelo de Oliveira Filho	fa080f03d3	intel/fs: Add Fall-through comment Reviewed-by: Andres Gomez <agomez@igalia.com>	2019-09-19 10:02:16 -07:00
Samuel Iglesias Gonsálvez	5ed5e76741	nir/algebraic: refactor inexact opcode restrictions Refactor the code to avoid calling auxiliary functions many times when it is not really needed. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-09-19 18:57:27 +02:00
Adam Jackson	5b5c5bf833	docs: Update bug report URLs for the gitlab migration Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-19 16:37:36 +00:00
Bas Nieuwenhuizen	ec76232785	glx: Remove redundant null check. Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/64 Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-09-19 15:11:10 +00:00
Kenneth Graunke	706c9f2d60	iris: Skip double-disabling TCS/TES/GS after BLORP operations BLORP always turns off TCS/TES/GS. If regular drawing also has them disabled (the overwhelmingly common case), then leaving them disabled is just fine by us and we can skip dirtying them, as that would just re-disable them a second time on the next draw. If they are actually enabled, however, we do need to flag them. Cuts 52% of the 3DSTATE_HS packets in an Aztec Ruins trace.	2019-09-19 07:56:15 -07:00
Erik Faye-Lund	7f7060dc73	.mailmap: add an alias for Frank Binns Reviewed-by: Frank Binns <frank.binns@imgtec.com>	2019-09-19 16:41:10 +02:00
Erik Faye-Lund	c1b1e0e875	.mailmap: add an alias for Bas Nieuwenhuizen Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-19 16:39:10 +02:00
Arcady Goldmints-Orlov	5ec5fecc26	anv: fix descriptor limits on gen8 Later generations support bindless for samplers, images, and buffers and thus per-stage descriptors are not limited by the binding table size. However, gen8 doesn't support bindless images and thus needs to report a lower per-stage limit so that all combinations of descriptors that fit within the advertised limits are reported as supported by vkGetDescriptorSetLayoutSupport. Fixes test dEQP-VK.api.maintenance3_check.descriptor_set Fixes: `79fb0d27f3` ("anv: Implement SSBOs bindings with GPU addresses in the descriptor BO") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-19 09:10:40 -05:00
Daniel Schürmann	8b78cce433	radv: remove dead shared variables LLVM does this anyway, but for ACO we need to do it in NIR. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-19 12:10:00 +02:00
Daniel Schürmann	281262281b	radv/aco: enable VK_EXT_shader_demote_to_helper_invocation For now, this extension will only be enabled for ACO. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-19 12:10:00 +02:00
Daniel Schürmann	e01b522a72	radv: enable clustered reductions These work with both LLVM and ACO. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-19 12:10:00 +02:00
Daniel Schürmann	a70a998718	radv/aco: Setup alternate path in RADV to support the experimental ACO compiler LLVM remains default and ACO can be enabled with RADV_PERFTEST=aco. Co-authored-by: Daniel Schürmann <daniel@schuermann.dev> Co-authored-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-19 12:10:00 +02:00
Daniel Schürmann	93c8ebfa78	aco: Initial commit of independent AMD compiler ACO (short for AMD Compiler) is a new compiler backend with the goal to replace LLVM for Radeon hardware for the RADV driver. ACO currently supports only VS, PS and CS on VI and Vega. There are some optimizations missing because of unmerged NIR changes which may decrease performance. Full commit history can be found at https://github.com/daniel-schuermann/mesa/commits/backend Co-authored-by: Daniel Schürmann <daniel@schuermann.dev> Co-authored-by: Rhys Perry <pendingchaos02@gmail.com> Co-authored-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Co-authored-by: Connor Abbott <cwabbott0@gmail.com> Co-authored-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com> Co-authored-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-19 12:10:00 +02:00
Tapani Pälli	99cbec0a5f	egl: check for NULL value like eglGetSyncAttribKHR does Commit `d1e1563bb6` added a NULL check for eglGetSyncAttribKHR but eglGetSyncAttrib does not do this. This patch adds the same check to eglGetSyncAttrib. Fixes crashes in (when exposing EGL 1.5): dEQP-EGL.functional.fence_sync.invalid.get_invalid_value Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Cc: mesa-stable@lists.freedesktop.org	2019-09-19 06:39:33 +00:00
Kenneth Graunke	a16975e615	iris: Rework iris_update_draw_parameters to be more efficient This improves a couple of things: 1. We now only update anything if the shader actually cares. Previously, is_indexed_draw was causing us to flag dirty vertex buffers, elements, and SGVs every time the shader switched between indexed and non-indexed draws. This is a very common situation, but we only need that information if the shader uses gl_BaseVertex. We were also flagging things when switching between indirect/direct draws as well, and now we only bother if it matters. 2. We upload new draw parameters only when necessary. When we detect that the draw parameters have changed, we upload a new copy, and use that. Previously we were uploading it every time the vertex buffers were dirty (for possibly unrelated reasons) and the shader needed that info. Tying these together also makes the code a bit easier to follow. In Civilization VI's benchmark, this code was flagging dirty state many times per frame (49 average, 16 median, 614 maximum). Now it occurs exactly once for the entire run.	2019-09-18 22:50:52 -07:00
Kenneth Graunke	6841f11d14	iris: Use state_refs for draw parameters. iris_state_ref is a <resource, offset> tuple, which is exactly what we need here.	2019-09-18 22:50:52 -07:00
Timothy Arceri	ddd314f0ce	util/disk_cache: make use of the total job size limiting feature This makes use of the total job size limiting feature added in the previous patch. The idea is to avoid an excessive build up in memory use due to the use of both the UTIL_QUEUE_INIT_RESIZE_IF_FULL and UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY flags. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-19 15:03:27 +10:00
Timothy Arceri	896885025f	util/u_queue: track job size and limit the size of queue growth When both UTIL_QUEUE_INIT_RESIZE_IF_FULL and UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY are set, we can get into a situation where the queue never executes and grows to a huge size due to all other threads being busy. This is the case with the shader cache when attempting to compile a huge number of shaders up front. If all threads are busy compiling shaders the cache queue's memory use can climb into the many GBs very fast. The use of these two flags with the shader cache is intended to allow shaders compiled at runtime to be compiled as fast as possible. To avoid huge memory use but still allow the queue to perform optimally in the runtime compilation case, we now add the ability to track memory consumed by the jobs in the queue and limit it to a hardcoded 256MB which should be more than enough. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-19 15:03:27 +10:00
Timothy Arceri	a2ee29c3da	util/disk_cache: bump thread count assigned to disk cache queue Since we set the UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY flag this should have little impact on low core systems. However just about all modern CPUs currently available that run Mesa have at least 4 cores. For these CPUs allowing more threads can result in the queue being processed faster and avoid excessive memory use due to a backlog of cache entries building up in the queue. This change helps avoid a huge build up of cache entries in the queue due to using both the UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY and UTIL_QUEUE_INIT_RESIZE_IF_FULL flags. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-19 15:03:27 +10:00
Paulo Zanoni	8e614c7a29	intel/fs: fix SHADER_OPCODE_CLUSTER_BROADCAST for SIMD32 The current code can create functions with a width of 32, which is not supported by our hardware. Add some code to simplify how we express what we want and prevent such cases. For some unknown reason, all the tests I could run seem to work even with these unsupported MOVs. Fixes: `b0858c1cc6` "intel/fs: Add a couple of simple helper opcodes" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-19 02:48:27 +00:00
Paulo Zanoni	c99df52873	intel/fs: the maximum supported stride width is 16 There are cases where we try to generate registers with a stride of 32, while the hardware maximum is just 16. This happens, for example, when using 8 bit integers on SIMD32. This results in a crash because the variable 'width' has a value of 32: ../../src/intel/compiler/brw_reg.h:550: brw_reg brw_vecn_reg(unsigned int, brw_reg_file, unsigned int, unsigned int): Assertion `!"Invalid register width"' failed. This change prevents the crash and makes the tests pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-19 02:48:27 +00:00
Paulo Zanoni	cebf447d16	intel/fs: roll the loop with the <0,1,0> additions in emit_scan() IMHO the code is easier to understand this way, being explicit that we're doing exactly the same thing every time. No functional changes. v2: Adjust the loop breaking condition (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-19 02:47:17 +00:00
Paulo Zanoni	d9ddf5076d	intel/fs: make scan/reduce work with SIMD32 when it fits 2 registers When dealing with uint16_t and uint8_t on SIMD32 we can do all the operations using just 2 registers, so we don't hit the recursion at the beginning of emit_scan(). Because of that, we need to actually compute scan/reduce for channels 31:16. v2: Still missed instructions (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-09-19 02:47:17 +00:00
Kristian H. Kristensen	7f07046dbc	freedreno/regs: A couple of tess updates Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-18 16:59:10 -07:00
Kristian H. Kristensen	a2031a117c	freedreno/regs: Fix CP_DRAW_INDX_OFFSET command On A5xx+ the INDX_BASE pointer is 64 bit. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-18 16:59:10 -07:00
Kristian H. Kristensen	2251a4345b	freedreno/a6xx: Write multiple regs for SP_VS_OUT_REG and SP_VS_VPC_DST_REG Compute the number of writes up front. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-18 16:59:10 -07:00
Kristian H. Kristensen	cc4fe81145	freedreno/a6xx: Turn on vectorize_io We want this for tessellation eventually, but we can turn it on now. Shader-db results: total instructions in shared programs: 8612905 -> 8611387 (-0.02%) instructions in affected programs: 164952 -> 163434 (-0.92%) total dwords in shared programs: 11952000 -> 11950560 (-0.01%) dwords in affected programs: 68096 -> 66656 (-2.11%) total full in shared programs: 315019 -> 315009 (<.01%) full in affected programs: 1642 -> 1632 (-0.61%) total constlen in shared programs: 2463654 -> 2463654 (0.00%) constlen in affected programs: 0 -> 0 total (ss) in shared programs: 152379 -> 152409 (0.02%) (ss) in affected programs: 1503 -> 1533 (2.00%) total (sy) in shared programs: 96473 -> 96525 (0.05%) (sy) in affected programs: 654 -> 706 (7.95%) total max_sun in shared programs: 1172454 -> 1172472 (<.01%) max_sun in affected programs: 104 -> 122 (17.31%) Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-18 16:59:10 -07:00
Kristian H. Kristensen	1cb9534434	freedreno/a6xx: Share shader state constructor and destructor Also, swap vs and fs constructor order so fs comes first. Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-18 16:59:10 -07:00
Kristian H. Kristensen	be38480064	freedreno/a6xx: Track location of gl_Position out as we link it When using xfb and rasterizing, the fragment shader may have fewer inputs than the vertex shader outputs. We can't rely on gl_Position to be placed at fs->total_in, but have to instead remember where we add it in the link map and use that location. Fixes 100+ tessellation dEQPs under dEQP-GLES31.functional.tessellation.primitive_discard.* dEQP-GLES31.functional.tessellation.user_defined_io.* Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-18 16:59:10 -07:00
Caio Marcelo de Oliveira Filho	d38e0a6326	spirv: Add missing break for capability handling Newly added cases "stole" the previous break. Fixes: `420ad0a1a3` ("spirv: check support for SPV_KHR_float_controls capabilities") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-18 15:49:14 -07:00
Kenneth Graunke	3da8a8a3d6	iris: Avoid uploading SURFACE_STATE descriptors for UBOs if possible If we can entirely push uniform data, we don't need a SURFACE_STATE descriptor for pulling data. Since constant uploads are a very common operation, and being able to push all data is also very common, we would like to avoid the overhead in this case. This patch defers uploading new descriptors. Instead of handling that at iris_set_constant_buffer, we do it at iris_update_compiled_shaders, where we can see the currently bound shader variants. If any need pull descriptors, and descriptors are missing, we update them and flag that the binding table also needs to be refreshed. Improves performance in GFXBench5 gl_driver2 on an i7-6770HQ by 31.9774% +/- 1.12947% (n=15). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-18 15:44:22 -07:00
Kenneth Graunke	0e4a75f917	intel/compiler: Record whether any pull constant loads occur I would like for iris to be able to avoid setting up SURFACE_STATE for UBOs in the common case where all constants are pushed. Unfortunately, we don't know up front whether everything will be pushed: the backend is allowed to demote pushed UBOs to pull loads fairly late in the process. This is probably desirable though, as we'd like the backend to be able to re-pull pushed data to break up long live ranges in response to register pressure. Here we simply add a "are there any pull loads at all" boolean to prog_data, which is a bit crude but at least allows us to skip work in the common "everything pushed" case. We could skip more work by tracking exactly which UBO surfaces are pulled in a bitmask, but I wanted to avoid bringing back the old mark_surface_used() mechanism. Finer-grained tracking could allow us to skip a bit more work when multiple UBOs are in use and /some/ are 100% pushed, but others are accessed via pulls. However, I'm not sure how common this is and it would save at most 4 pull descriptors, so we defer that for now. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-18 15:44:22 -07:00
Kenneth Graunke	dd83ef0d1a	iris: Track per-stage bind history, reduce work accordingly We now track per-stage bind history for constant and shader buffers, shader images, and sampler views by adding an extra res->bind_stages field to go with res->bind_history. This lets us flag IRIS_DIRTY_CONSTANTS for only the specific stages involved, and also skip some CPU overhead in iris_rebind_buffer. Cuts 4% of 3DSTATE_CONSTANT_XS packets in a Shadow of Mordor trace on Icelake. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-18 15:44:22 -07:00
Kenneth Graunke	1e7daaa6c9	iris: Don't flag IRIS_DIRTY_BINDINGS for constant usage history The underlying buffer isn't changing - so we don't need to update any SURFACE_STATE descriptors - we just might have new constants, meaning we need to re-emit 3DSTATE_CONSTANT_XS. On Gen9, this means we need to update 3DSTATE_BINDING_TABLE_POINTERS_XS too, but that's now handled by the explicit check in the previous patch. On Gen9, this should cause us to re-emit the binding table /pointer/ on writing to a buffer with PIPE_BIND_CONSTANT_BUFFER, rather than emitting a whole new /table/. On Gen8 and Gen11, this avoids binding table churn altogether. Cuts 61% of 3DSTATE_BINDING_TABLE_POINTERS_XS packets in a Shadow of Mordor trace on Icelake. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-18 15:44:22 -07:00
Kenneth Graunke	e7db3577f8	iris: Explicitly emit 3DSTATE_BTP_XS on Gen9 with DIRTY_CONSTANTS_XS Right now, we usually flag both IRIS_DIRTY_{CONSTANTS,BINDINGS}_XS, because we have SURFACE_STATE for constant buffers in case the shaders access them via pull mode. But this flagging is overkill in many cases. Gen8 and Gen11 don't need it at all. Gen9 doesn't need that large of a hammer in all cases. Just handle it explicitly so the right thing happens. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-18 15:44:22 -07:00
Kenneth Graunke	caa0aebd01	iris: Flag IRIS_DIRTY_BINDINGS_XS on constant buffer rebinds We upload a new SURFACE_STATE for the UBO/SSBO in question, which means that we need new binding tables as well. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-18 15:44:22 -07:00
Bas Nieuwenhuizen	4b7e7956f0	radv: Add DFSM support. Apparently we already enabled it without having support ... Not sure if we also need to set disable_start_of_prim when the PS has memory writes, but this mirrors radeonsi. Doubles fillrate in my dual_quad_bench from ~16 pixels/cycle to ~32 pixels/cycle on a Raven. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-09-18 21:28:51 +00:00
Bas Nieuwenhuizen	0fa2740059	radv: Disable dfsm by default even on Raven. When actually implementing it, Talos on low is still 3% slower. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-09-18 21:28:51 +00:00
Bas Nieuwenhuizen	f2dffb395f	radv: Only break batch on framebuffer change with dfsm. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-09-18 21:28:51 +00:00
Connor Abbott	57e0bb8ccc	nir/opt_if: Fix undef handling in opt_split_alu_of_phi() The pass assumed that "Most ALU ops produce an undefined result if any source is undef" which is completely untrue. Due to how we lower if statements to selects and then optimize on those selects later, we simply cannot make that assumption. In particular this pass tried to replace an ior of undef and true, which had been generated by optimizing a select which itself came from flattening an if statement, to undef causing a miscompilation for a CTS test with radeonsi NIR. We fix this by always doing what the non-undef path did, i.e. duplicate the instruction twice. If there are cases where the instruction before the loop can be folded away due to having an undef source, we should add these to opt_undef instead. The comment above the pass says that if the phi source from before the loop is undef, and we can fold the instruction before the loop to undef, then we can ignore sources of the original instruction that don't dominate the block before the loop because we don't need them to create the instruction before the loop. This is incorrect, because the instruction at the bottom of the loop would get those sources from the wrong loop iteration. The code never actually did what the comment said, so we only have to update the comment to match what the pass actually does. We also update the example to more closely match what most actual loops look like after vtn and peephole_select. There are no shader-db changes with i965, radeonsi NIR, or radv. 
With anv and my vkpipeline-db there's only one change: total instructions in shared programs: 14125290 -> 14125300 (<.01%) instructions in affected programs: 2598 -> 2608 (0.38%) helped: 0 HURT: 1 total cycles in shared programs: 2051473437 -> 2051473397 (<.01%) cycles in affected programs: 36697 -> 36657 (-0.11%) helped: 1 HURT: 0 Fixes KHR-GL45.shader_subroutine.control_flow_and_returned_subroutine_values_used_as_subroutine_input with radeonsi NIR.	2019-09-18 17:18:34 -04:00
Eric Engestrom	a1de3011f3	gl: drop incorrect pkg-config file for glvnd Akin to `1a25980c46` ("egl: drop incorrect pkg-config file for glvnd") and `b01524fff0` ("meson: don't build libGLES*.so with GLVND"), removes a pkg-config file that shouldn't have been there in the first place, but was needed because of that GLVND bug. Now that the glvnd bug has been fixed, it became apparent that removing this gl.pc pkg-config file had been forgotten, so let's do just that :) Suggested-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-09-18 22:16:51 +01:00
Andres Gomez	66f2aa6ccd	docs: Add the maximum implemented Vulkan API version in 19.3 rel notes Currently, Vulkan 1.1. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-09-19 00:03:55 +03:00
Andres Gomez	41b0e0d7e0	docs: Add the maximum implemented Vulkan API version in 19.2 rel notes Currently, Vulkan 1.1. Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-09-19 00:03:50 +03:00
Andres Gomez	d2db43fcad	docs: Add the maximum implemented Vulkan API version in 19.1 rel notes Currently, Vulkan 1.1. Cc: 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-09-19 00:03:39 +03:00
Andres Gomez	d9760f8935	nir/opcodes: Clear variable names confusion Having Python and C variables share a name in the same block of code makes it a bit confusing to understand. Make it explicit that the Python bit_size variable refers to the destination bit size. Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-18 23:59:07 +03:00
Rhys Perry	b3f71685d9	radv: never kill a NGG GS shader Seems to fix a hang with excessive vertex emissions when NGG is used for GS. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-18 19:26:58 +00:00
Samuel Pitoiset	99c186fbbe	radv/gfx10: fix VK_KHR_pipeline_executable_properties with NGG GS No GS copy shader if a pipeline enables NGG GS. This fixes dEQP-VK.pipeline.executable_properties.graphics.geometry_stage. Fixes: `86864eedd2` ("radv: Implement radv_GetPipelineExecutablePropertiesKHR.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-18 21:19:28 +02:00
Marek Olšák	fe7aa271a9	radeonsi: include drm_fourcc.h to fix the build	2019-09-18 14:52:25 -04:00
Marek Olšák	00e29816e7	radeonsi: implement pipe_screen::resource_get_param v2: return DRM_FORMAT_MOD_INVALID from the function Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2019-09-18 14:43:01 -04:00
Marek Olšák	d307aa56f9	gallium: extend resource_get_param to be as capable as resource_get_handle Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-18 14:41:30 -04:00
Marek Olšák	aae35fbd3a	ac: move ac_get_num_physical_vgprs into radeon_info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-09-18 14:39:06 -04:00
Marek Olšák	0692ae34e9	ac: move ac_get_num_physical_sgprs into radeon_info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-09-18 14:39:06 -04:00
Marek Olšák	ca43006fd2	ac: move ac_get_max_wave64_per_simd into radeon_info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-09-18 14:39:06 -04:00
Marek Olšák	deab3a23f6	ac: move num_sdp_interfaces into radeon_info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-09-18 14:39:06 -04:00
Marek Olšák	2c62b461e9	ac: move PBB MAX_ALLOC_COUNT into radeon_info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-09-18 14:39:06 -04:00
Jonathan Marek	05da025f35	etnaviv: fix two-sided stencil * Set missing STENCIL_CONFIG_EXT2 bits * Swap stencil sides when rendering CCW Fixes following deqp tests (which were 99% failing): dEQP-GLES2.functional.fragment_ops.depth_stencil.* Note: deqp tests require --deqp-gl-config-name=rgba8888d24s8ms0 Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-09-18 12:49:10 -04:00
Samuel Pitoiset	68820007fd	radv: fix loading 64-bit GS inputs We have to load two 32-bit integers and cast them correctly. This fixes crashes with gs-double-interpolator.vk_shader_test. Cc: 19.2 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111734 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-18 17:16:36 +02:00
Bas Nieuwenhuizen	7999e10cab	tu: Set up glsl types. Addresses this assert: deqp-vk: ../mesa-freedreno-9999/src/compiler/glsl_types.cpp:1244: static const glsl_type *glsl_type::get_interface_instance(const glsl_struct_field *, unsigned int, enum glsl_interface_packing, bool, const char *): Assertion `glsl_type_users > 0' failed. running dEQP-VK.api.smoke.triangle. Fixes: `624789e370` "compiler/glsl: handle case where we have multiple users for types" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-18 16:51:18 +02:00
Andres Gomez	f833b4cada	docs: Update to OpenGL 4.6 in the release notes After `41549a18e6` ("i965: Enable OpenGL 4.6 for Gen8+"), Mesa implements the OpenGL 4.6 API. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-09-18 12:28:05 +00:00
Erik Faye-Lund	ea74b1b9aa	.mailmap: add an alias for Eric Engestrom Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-09-18 14:05:05 +02:00
Erik Faye-Lund	ed91eacf71	.mailmap: add an alias for Michel Dänzer Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-09-18 14:04:40 +02:00
Samuel Pitoiset	46b7512b0a	radv: fix writing depth/stencil clear values to image Use the fastest way only if both aspects are used. Oops. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111728 Fixes: `218ce34962` ("radv: add mipmap support for the clear depth/stencil values") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-18 13:27:46 +02:00
Michel Dänzer	88e5796daa	gitlab-ci: Merge scons-nollvm and scons-llvm jobs The new job tests scons without LLVM and with all LLVM versions >= 6.0. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-18 10:36:48 +00:00
Michel Dänzer	baa5024e24	gitlab-ci: Test scons with all LLVM versions Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-18 10:36:48 +00:00
Michel Dänzer	0374aacac0	gitlab-ci: Move scons build/test commands to a separate shell script Preparatory, no functional change intended. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-18 10:36:48 +00:00
Michel Dänzer	8a8388ca67	gitlab-ci: Use crossbuild-essential-* packages They are convenience packages which pull in everything needed for cross-building via dependencies. Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-18 10:36:48 +00:00
Michel Dänzer	a01230e73a	gitlab-ci: Use newer packages from backports by default This is needed in particular to get a recent enough version of meson in the stretch image, but should be generally beneficial. Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-18 10:36:48 +00:00
Michel Dänzer	8a19992869	gitlab-ci: Create separate docker images for Debian stretch & buster Pros: * Less fragile due to not mixing packages from stretch and buster * No longer need to use third-party LLVM packages * The buster image now uses GCC 8 for C++ as well (previously 6 for C++, 8 for C), allowing to drop some hacks Con: * The stretch image now only uses GCC 6 for C as well as C++ * Need separate jobs for testing old LLVM versions Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-18 10:36:48 +00:00
Michel Dänzer	26fcc8baba	gitlab-ci: Pass --no-remove to apt-get where possible If installing new packages would require removing previously installed ones, this flag causes apt-get to abort with an error instead, preventing later obscure failures due to the missing packages. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-18 10:36:48 +00:00
Michel Dänzer	2259b45174	gitlab-ci: Reference full ci-templates commit hash 8 digits might become ambiguous at some point. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-18 10:36:48 +00:00
Haihao Xiang	8a9b81ab9d	i965: support AYUV/XYUV for external import only Fixes: `89785e2d56` ("i965: add support for sampling from AYUV") Fixes: `7cab8d3661` ("i965: Add support for sampling from XYUV images") Cc: Vivek Kasireddy <vivek.kasireddy@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Haihao Xiang <haihao.xiang@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-09-18 12:07:23 +03:00
Boris Brezillon	1e483a87bc	panfrost: Allocate tiler and scratchpad BOs per-batch If we want to execute several batches in parallel they need to have their own tiler and scratchpad BOs. Let's move those objects to panfrost_batch and allocate them on a per-batch basis. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:40:17 +02:00
Boris Brezillon	0eec73a800	panfrost: Add FBO BOs to batch->bos earlier If we want the batch dependency tracking to work correctly we must make sure all BOs are added to the batch->bos set early enough. Adding FBO BOs when generating the fragment job is clearly too late. Add a panfrost_batch_add_fbo_bos helper and call it in the clear/draw path. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:37:56 +02:00
Boris Brezillon	5a4d095f9b	panfrost: Add the panfrost_batch_create_bo() helper This helper automates the panfrost_bo_create()+panfrost_batch_add_bo()+ panfrost_bo_unreference() sequence that's done for all per-batch BOs. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:37:31 +02:00
Boris Brezillon	9af4aeaaf7	panfrost: Don't return imported/exported BOs to the cache We don't know who else is using the BO in that case, and thus shouldn't re-use it for something else. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:35:52 +02:00
Boris Brezillon	90b8934547	panfrost: Add panfrost_bo_{alloc,free}() Thanks to that we avoid the recursive call into panfrost_bo_create() and we can get rid of panfrost_bo_release() by inlining the code in panfrost_bo_unreference(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:35:29 +02:00
Boris Brezillon	cb71ae5572	panfrost: Stop using panfrost_bo_release() outside of pan_bo.c panfrost_bo_unreference() should be used instead. The only difference caused by this change is that the scratchpad, tiler_heap and tiler_dummy BOs are now returned to the cache instead of being freed when a context is destroyed. This is only a problem if we care about context isolation, which apparently is not the case since transient BOs are already returned to the per-FD cache (and all contexts share the same address space anyway, so enforcing context isolation is almost impossible). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:35:06 +02:00
Boris Brezillon	e15ab939fd	panfrost: Stop passing screen around for BO operations Store a screen pointer in panfrost_bo so we don't have to pass a screen object to all functions manipulating the BO. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:34:27 +02:00
Boris Brezillon	10ce751726	panfrost: Don't check if BO is mmaped before calling panfrost_bo_mmap() panfrost_bo_mmap() already takes care of that. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:34:08 +02:00
Boris Brezillon	a06e08def9	panfrost: Stop exposing panfrost_bo_cache_{fetch,put}() They are not expected to be called directly, users should use panfrost_bo_{create,release}() instead. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:33:51 +02:00
Boris Brezillon	154cb725d4	panfrost: Move the BO API to its own header Right now, the BO API is spread over pan_{allocate,resource,screen}.h. Let's move all BO related definitions to a separate header file. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:29:13 +02:00
Boris Brezillon	34efaafc93	panfrost: s/PAN_ALLOCATE_/PAN_BO_/ Change the prefix for BO allocation flags to make it consistent with the rest of the BO API. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:28:55 +02:00
Boris Brezillon	29d0e5c177	panfrost: Move panfrost_bo_{reference,unreference}() to pan_bo.c This way we have all BO related functions placed in the same source file. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:28:39 +02:00
Boris Brezillon	0500c9e514	panfrost: Get rid of pan_drm.c pan_drm.c was only meaningful when we were supporting 2 kernel drivers (mali_kbase, and the drm one). Now that there's no kernel-driver abstraction we're better off moving those functions where they belong: * BO related functions in pan_bo.c * fence related functions + query_gpu_version() in pan_screen.c * submit related functions in pan_job.c While at it, we rename the functions according to the place they're being moved to. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:28:22 +02:00
Boris Brezillon	1e47c3ee7b	panfrost: Stop passing has_draws to panfrost_drm_submit_vs_fs_batch() has_draws can be inferred directly from the batch->last_job value, no need to pass it around. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:28:03 +02:00
Boris Brezillon	07085fe8a4	panfrost: Kill a useless memset(0) in panfrost_create_context() ctx is allocated with rzalloc() which takes care of zero-ing the memory region. No need to call memset(0) on top. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:27:47 +02:00
Boris Brezillon	4eac1b2008	panfrost: Add polygon_list to the batch BO set at allocation time That's what we do for other per-batch BOs, and we'll soon add a helper to automate this create_bo()+add_bo()+bo_unreference() sequence, so let's prepare the code to ease this transition. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:27:30 +02:00
Boris Brezillon	c16fb1f48d	panfrost: Add missing panfrost_batch_add_bo() calls Some BOs are used by batches but never explicitly added to the BO set. This is currently not a problem because we wait for the execution of a batch to be finished before releasing a BO, but we will soon relax this rule. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:27:09 +02:00
Boris Brezillon	a94d028065	panfrost: Use the correct type for the bo_handle array The DRM driver expects an array of u32, let's use the correct type, even if using an int works in practice because it's still a 32-bit integer. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:26:49 +02:00
Boris Brezillon	2b771b8424	panfrost: Stop exposing internal panfrost_*_batch() functions panfrost_{create,free,get}_batch() are only called inside pan_job.c. Let's make them static. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-18 10:26:21 +02:00
Christian Gmeiner	8d5f905faa	etnaviv: disable ARB_shadow Looks like only HALT2 GPUs have support for it but that is not yet implemented so disable ARB_shadow for now. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-18 06:47:26 +02:00
Christian Gmeiner	dcc0e23438	Revert "gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP" There are GPUs that do not support this feature. This reverts commit `e871abe452` Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-18 06:47:21 +02:00
Lepton Wu	417d602fda	virgl: Remove wrong EAGAIN handling for drmIoctl drmIoctl handles EAGAIN itself and actually it always returns -1 on errors. Remove the wrong handling of its return value. Also, print a warning when it fails. v2: - use _debug_printf instead of fprintf (Gurchetan Singh) Signed-off-by: Lepton Wu <lepton@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> (v1)	2019-09-18 03:36:10 +00:00
Kenneth Graunke	f8c44e4ed7	iris: Skip allocating a null surface when there are 0 color regions. The compiler now sets the "Null Render Target" bit in the RT write extended message descriptor, causing it to write to an implicit null surface without us needing to set one up in the binding table. Together with the last patch, this improves performance in Car Chase on an Icelake 8x8 (locked to 700Mhz) by 0.0445526% +/- 0.0132736% (n=832). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 14:27:51 -07:00
Kenneth Graunke	f76a724e06	intel/compiler: Set "Null Render Target" ex_desc bit on Gen11 When there are no color regions (i.e. a depth only pass), we can set the "Null Render Target" bit in the Gen11 RT write extended message descriptor to indicate that it should behave as if it's writing to a null render target, without the need for a binding table entry. This lets drivers avoid setting up that null RT binding table entry, but more importantly means the HW doesn't actually have to bother looking up the surface state. Together with the next patch, this improves performance in Car Chase on an Icelake 8x8 (locked to 700Mhz) by 0.0445526% +/- 0.0132736% (n=832). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 14:27:51 -07:00
Samuel Iglesias Gonsálvez	f55f7b199d	docs/relnotes: add support for VK_KHR_shader_float_controls on Intel v2: - Move to 19.2.0 release notes (Andres). v3: - Move to 19.3.0 release notes (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez	f5dd6dfe01	anv: enable VK_KHR_shader_float_controls and SPV_KHR_float_controls This adds support for VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT_CONTROLS_PROPERTIES_KHR and enables the Vulkan and SPIR-V extensions. Also, notice that this includes the updates applied to the VkPhysicalDeviceFloatControlsPropertiesKHR structure in the extension VK_KHR_shader_float_controls v4 and Vulkan 1.1.116. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez	9b07020a4f	i965/fs: add support for shader float control to remove_extra_rounding_modes() The remove_extra_rounding_modes() optimization will remove duplicated rounding mode changes. v2: - Fix bug in the rounding mode change (Alejandro). v3: - Fix rounding modes. v4: - Updated to renamed shader info member and enum values (Andres). v5: - Simplify flags logic operations (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez	9bd88d10d8	i965/fs: set rounding mode when emitting nir_op_f2f32 or nir_op_f2f16 v2: - Consider nir_op_f2f16 case too (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez	ba1e25e1aa	i965/fs: set rounding mode when emitting fadd, fmul and ffma instructions v2: - Updated to renamed shader info member (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez	9da56ffc52	i965/fs: add emit_shader_float_controls_execution_mode() and aux functions We need this function to emit code that sets up the control register later with the defined execution mode for the shader. Therefore, we emit it as the first instruction. v2: - Fix bug in setting the default mode mask in brw_rnd_mode_from_nir(). - Fix support for rounding modes in brw_rnd_mode_from_nir(). v3: - Updated to renamed shader info member and enum values (Andres). v4: - Add actual emission as first instruction of emit_nir_code (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez	8a6507b6fe	i965/fs/generator: add new opcode to set float controls modes in control register Before this commit, we had only FPRoundingMode decoration (the per instruction one) that is applied during the SPIR-V handling. In vtn_alu we find out the rounding mode, and generate the code accordingly that later will be used to look for the respective nir_op_f2f16_{rtz,rtne}. Per-instruction gets prioritized because we make them explicit conversions (with RTZ or RTNE nir opcodes) and they will override the default execution mode defined with float controls. However, we need to come back to the mode defined by float controls after the execution of the FP Rounding instruction. Therefore, the new SHADER_OPCODE_FLOAT_CONTROL_MODE opcode will be used to set the default rounding mode and denorms treatment in the whole shader while the pre-existent SHADER_OPCODE_RND_MODE, will be used as prioritized rounding mode on a per-instruction basis. v2: - Fix bug in defining BRW_CR0_FP_MODE_MASK. v3: - Update comment (Caio). v4: - Split the patch into the helper and the new opcode (this one) (Caio). v5: - Add an explanation on the actual purpose and priority of the newly introduced opcode in the commit log (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez	28da9558f5	i965/fs/generator: refactor rounding mode helper in preparation for float controls v2: - Fix bug in defining BRW_CR0_FP_MODE_MASK. v3: - Update comment (Caio). v4: - Split the patch into the helper (this one) and the new opcode (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:19 +03:00
Samuel Iglesias Gonsálvez	cdace5b0c6	i965/fs/nir: add nir_op_unpack_half_2x16_split_*_flush_to_zero The denorm mode is set in the control register, no need to do something else. v2: - Add an assert to make sure that we realize if this assumption is broken in the future (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	3c474f8513	intel/nir: do not apply the fsin and fcos trig workarounds for consts If we have fsin or fcos trigonometric operations with constant values as inputs, we will multiply the result by 0.99997 in brw_nir_apply_trig_workarounds, making the result wrong. By adjusting the rules so they do not apply to const values, we let a later constant fold deal with it. v2: - Do not early constant fold but only apply the trig workaround for non constants (Caio). - Add fixes tag to commit log (Caio). Fixes: `bfd17c76c1` "i965: Port INTEL_PRECISE_TRIG=1 to NIR." Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	dba4d0a319	nir: fix fmin/fmax support for doubles Until now, it was using the floating point version of fmin/fmax, instead of the double version. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	2abc6299bf	nir: fix denorm flush-to-zero in sqrt's lowering at nir_lower_double_ops v2: - Replace hard coded value with DBL_MIN (Connor). v3: - Take into account the FLOAT_CONTROLS_DENORM_PRESERVE_FP64 flag (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2]	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	1e0e3ed15a	nir: fix denorms in unpack_half_1x16() According to VK_KHR_shader_float_controls: "Denormalized values obtained via unpacking an integer into a vector of values with smaller bit width and interpreting those values as floating-point numbers must: be flushed to zero, unless the entry point is declared with the code:DenormPreserve execution mode." v2: - Add nir_op_unpack_half_2x16_flush_to_zero opcode (Connor). v3: - Adapt to use the new NIR lowering framework (Andres). v4: - Updated to renamed shader info member and enum values (Andres). v5: - Simplify flags logic operations (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2]	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	f097247dd8	nir/algebraic: disable inexact optimizations depending on float controls execution mode If FLOAT_CONTROLS_SIGNED_ZERO_INF_NAN_PRESERVE or FLOAT_CONTROLS_DENORM_FLUSH_TO_ZERO are enabled, do not apply the inexact optimizations so the VK_KHR_shader_float_controls execution mode is respected. v2: - Do not apply inexact optimizations if SHADER_DENORM_FLUSH_TO_ZERO is enabled (Andres). v3: - Updated to renamed shader info member (Andres). v4: - Directly access execution mode instead of dragging it by parameter (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v1]	2019-09-17 23:39:18 +03:00
Andres Gomez	3f782cdd25	nir/algebraic: mark float optimizations returning one parameter as inexact With the arrival of VK_KHR_shader_float_controls algebraic optimizations for float types of the form (('fop', a, b), a) become inexact depending on the execution mode. For example, if we have activated SHADER_DENORM_FLUSH_TO_ZERO, in case of a denorm value for the "a" parameter, we cannot return it still as a denorm, it needs to be flushed to zero. Therefore, we mark now all those operations as inexact. Suggested-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	5e22f3e29a	nir/constant_expressions: mind rounding mode converting from float to float16 destinations v2: - Move the op-code specific knowledge to nir_opcodes.py even if it means a round trip conversion (Connor). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	ef681cf971	nir/opcodes: make sure f2f16_rtz and f2f16_rtne behavior is not overridden by the float controls execution mode Suggested-by: Connor Abbott <cwabbott0@gmail.com> Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	7580707345	nir: mind rounding mode on fadd, fsub, fmul and fma opcodes According to Vulkan spec, the new execution modes affect only correctly rounded SPIR-V instructions, which includes fadd, fsub and fmul. v2: - Fix fmul, fsub and fadd round-to-zero definitions, they should use auxiliary functions to calculate the proper value because Mesa uses round-to-nearest-even rounding mode by default (Connor). v3: - Do an actual fused multiply-add at ffma (Connor). v4: - Simplify fadd and fmul for bit sizes < 64 (Connor). - Do not use double ffma for 32 bits float (Connor). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v3]	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	0ac07c7ca7	nir: add support for round to zero rounding mode to nir_op_f2f32 f2f16's rounding modes are already handled and f2f64 doesn't need it as there is not a floating point type with higher bit size than 64 for now. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	5308333e78	util: add fp64 -> fp32 conversion support for RTNE and RTZ rounding modes In order to be coherent with the pre-existent API for half floats, this new API for double is the one meant to be used when doing double to float conversions. It is no more than a wrapper for the softfloat.h API but we meant to keep that one private. v2: - Fix bug in _mesa_double_to_float_rtz() in the inf/nan detection using the exponent value. v3: - Replace custom f64 -> f32 implementations with the softfloat one (Andres). v4: - Added API usage clarifying comments (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	733ede8ff6	util: add float to float16 conversions with RTZ and RTNE In order to be coherent with the pre-existent functions, this new API is the one meant to be used when doing half float to float conversions. It is no more than a wrapper for the softfloat.h API but we meant to keep that one private. v2: - Replace custom f32 -> f16 RTZ implementation with the softfloat one (Andres). v3: - Added API usage clarifying comments (Caio). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	153c714f2a	util: add softfloat functions to operate with doubles and floats Implemented fadd, fsub, fmul and ffma for doubles and ffma for floats, rounding to zero, using a modified implementation from Berkeley SoftFloat 3e Library. Their implementation correctness has been checked with the Berkeley TestFloat Release 3e tool for x86_64. v2: - Reuse util_last_bit64() in _mesa_count_leading_zeros64() implementation (Connor). v3: - Add a specific ffma for floats version (Connor). - Implement the ffma for doubles version (Andres). - Lots of fixes in fadd, fsub and fmul (Andres). - Improved documentation (Andres). v4: - Added f64 -> f32 conversion function (Andres). - Added f32 -> f16 RTZ conversion function (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Tested-by: Andres Gomez <agomez@igalia.com> Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	f7d73db353	nir: add support for flushing to zero denorm constants v2: - Refactor conditions and shared function (Connor). - Move code to nir_eval_const_opcode() (Connor). - Don't flush to zero on fquantize2f16 From Vulkan spec, VK_KHR_shader_float_controls section: "3) Do denorm and rounding mode controls apply to OpSpecConstantOp? RESOLVED: Yes, except when the opcode is OpQuantizeToF16." v3: - Fix bit size (Connor). - Fix execution mode on nir_loop_analyze (Connor). v4: - Adapt after API changes to nir_eval_const_opcode (Andres). v5: - Simplify constant_denorm_flush_to_zero (Caio). v6: - Adapt after API changes and to use the new constant constructors (Andres). - Replace MAYBE_UNUSED with UNUSED as the first is going away (Andres). v7: - Adapt to newly added calls (Andres). - Simplified the auxiliary to flush denorms to zero (Caio). - Updated to renamed supported capabilities member (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v4] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	45668a8be1	nir: add auxiliary functions to detect if a mode is enabled v2: - Added more functions. v3: - Simplify most of the functions (Caio). v4: - Updated to renamed enum values (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v3]	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	84781e1f1d	spirv/nir: keep track of SPV_KHR_float_controls execution modes v2: - Add support for rounding modes for each floating point bit size. v3: - Commit `e68871f6a4` ("spirv: Handle constants and types before execution modes") changed when the execution modes are handled, which affects the result of the floating point constants when the rounding mode is set in the execution mode. Moved the handling of the rounding modes before we handle the constants. v4: - Rename vtn_decoration "literals" to "operands" (Andres). - Simplify execution mode parsing util function (Caio). - Extend the comment about the timing of the handling of the rounding modes (Caio). v5: - Correct extension name (Caio). - Rename shader info member (Andres). - Rename float controls enum (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v3] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez	420ad0a1a3	spirv: check support for SPV_KHR_float_controls capabilities v2: - Correct extension name (Caio). - Rename supported capabilities member (Andres). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v1] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 23:39:18 +03:00
Adam Jackson	320c36ed3a	gallium/xlib: Fix glXMakeCurrent(dpy, None, None, ctx) This is entirely legal in GL 3.0+. I wonder how many more times I'll need to fix this specific bug.	2019-09-17 20:16:00 +00:00
Adam Jackson	a693f98e17	gallium/xlib: Remove MakeCurrent_PrevContext As the comment notes, this is not thread-safe. You can just as easily use GetCurrentContext instead, so, do that.	2019-09-17 20:16:00 +00:00
Adam Jackson	db8be355d1	gallium/xlib: Remove drawable caching from the MakeCurrent path AFAICT this only exists to avoid hitting XMesaFindBuffer, which is a linear search. But you don't have that many GLX drawables, so whatever.	2019-09-17 20:16:00 +00:00
Marek Olšák	83f195414a	radeonsi: add Navi12 PCI ID trivial and urgent Cc: 19.2 <mesa-stable@lists.freedesktop.org>	2019-09-17 16:00:33 -04:00
Adam Jackson	6ec1259423	ci: Run tests on i386 cross builds Yes, some tests fail, but we can turn those into XFAILs at meson time. Better to keep the things that work working than not cover them at all. Unfortunately XPASS results will not cause the build to fail until we update CI to meson 0.51 or newer. Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-09-17 14:53:57 -04:00
Jon Turney	dd1dba80b9	Fix timespec_from_nsec test for 32-bit time_t Since struct timespec's tv_sec member is of type time_t, adjust the expected value to allow for the truncation which will occur with 32-bit time_t. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-09-17 12:17:53 -04:00
Tapani Pälli	631255387f	iris: close screen fd on iris_destroy_screen Otherwise it never gets closed, this fixes errors seen with deqp-egl where we end up opening 1024 files. Fixes: `2dce0e94` ("iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-17 14:46:45 +03:00
Juan A. Suarez Romero	34d51f931b	docs: update calendar, add news item and link release notes for 19.1.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-09-17 12:55:41 +02:00
Juan A. Suarez Romero	a216395831	docs: add sha256 checksums for 19.1.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `b9d7244035`)	2019-09-17 12:53:24 +02:00
Juan A. Suarez Romero	d9d4c1be62	docs: add release notes for 19.1.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `f632aac938`)	2019-09-17 12:53:23 +02:00
Michel Dänzer	aed3babef7	ac: Remove DEBUG workaround As of version 7, LLVM uses LLVM_DEBUG instead of just DEBUG. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-17 10:24:29 +00:00
Michel Dänzer	2c278602d8	swr: Limit DEBUG workaround to LLVM < 7 As of version 7, LLVM uses LLVM_DEBUG instead of just DEBUG. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-17 10:24:29 +00:00
Michel Dänzer	8218f6e22d	gallivm: Limit DEBUG workaround to LLVM < 7 As of version 7, LLVM uses LLVM_DEBUG instead of just DEBUG. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-17 10:24:29 +00:00
Erik Faye-Lund	26351f1ee3	st/mesa: remove always-true expression In case the GLSL version is 130 or higher, we've already enabled ARB_shader_bit_encoding a bit earlier in this same function. So this condition will always be true. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-17 07:26:42 +00:00
Christian Gmeiner	1c34d19f90	etnaviv: a bit of micro-optimization Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-09-17 05:50:37 +00:00
Icenowy Zheng	d61b67b41d	lima: reset scissor state if scissor test is disabled The PLBU seems to preserve scissor state between draws, and since lima doesn't emit PLBU_CMD_SCISSORS() if scissor test is disabled, it uses state from previous draw. Fix it by emitting PLBU_CMD_SCISSORS() for full fb if scissor test is disabled. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-09-17 04:13:24 +00:00
Jason Ekstrand	533987b5f4	vulkan: Update the XML and headers to 1.1.123 Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-17 04:11:05 +00:00
Caio Marcelo de Oliveira Filho	9cf1bfcdd7	spirv: Handle ShaderLayer and ShaderViewportIndex capabilities SPIR-V 1.5 incorporated the SPV_EXT_shader_viewport_index_layer but split it into the two capabilities above. Just handle them as we support the extension already. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-16 19:18:01 -07:00
Caio Marcelo de Oliveira Filho	f6392e38d8	spirv: Update JSON and headers to 1.5 Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-16 19:17:26 -07:00
Eric Anholt	eea6f21cbd	freedreno: Fix invalid read when a block has no instructions. We can't deref list_(first/last)_entries unless we know we have at least one. Instead, just use our IP we've been tracking as we go to set up the start ip, and fill in the end IP as we walk instructions. Fixes a complaint in valgrind on dEQP-GLES3.functional.transform_feedback.* which sometimes has an empty main (non-END) block when the VS inputs are just directly mapped to outputs without any ALU ops. Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-09-16 22:02:43 +00:00
Kenneth Graunke	d9d6305b80	st/mesa: Increase GL_POINT_SIZE_RANGE minimum to 1.0 Table 23.54 of the OpenGL 4.5 spec lists the minimum values for GL_POINT_SIZE_RANGE as [1, 1]. So zero is not allowed (even though arguably this could be useful for MSAA rendering, where a sub-1px point might cover only some samples...) This fixes the WebGL 2.0 conformance suite's state.gl-get-calls test on Chromium on Linux, which uses desktop OpenGL. The test checks that the minimum value of GL_ALIASED_POINT_SIZE_RANGE is 1. Unfortunately, that query doesn't exist in desktop GL, so it checks POINT_SIZE_RANGE, which is the anti-aliased value. There's not really anything better for Chromium to do here, unfortunately. When running Chromium with --api=es3, it maps it to the correct query and the test already works. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-16 13:40:41 -07:00
Kenneth Graunke	fbad42bbb9	st/mesa: Prefer 5551 formats for GL_UNSIGNED_SHORT_5_5_5_1. Previously, internalformat GL_RGBA and type GL_UNSIGNED_SHORT_5_5_5_1 was promoted to RGBA8888 as the table entry with the 5551 formats is listed below the 8888 entry, and it also doesn't have GL_RGBA as a possible internalformat. Using actual 5551 fixes the following dEQP-EGL test: - dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8 Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-16 13:17:23 -07:00
Rhys Perry	ffabcbba60	radv: always emit a position export in gs copy shaders Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `f8d0337299` ('radv: add multiple streams support for the GS copy shader') Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-09-16 19:42:30 +00:00
Rhys Perry	0f29c9df31	radv: keep GS threads with excessive emissions which could write to memory Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-09-16 19:42:30 +00:00
Lionel Landwerlin	dcf13fbac9	drirc: include unreal engine version 0 to 23 This was meant to include up to version 23. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `0616b7ac90` ("vulkan: add vk_x11_strict_image_count option") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111522 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-16 21:47:21 +03:00
Lionel Landwerlin	10206ba17b	util/xmlconfig: fix regexp compile failure check This is embarrassing... Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `04dc6074cf` ("driconfig: add a new engine name/version parameter") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-16 21:47:21 +03:00
Erik Faye-Lund	9c57b54994	gallium/gdi: use GALLIUM_FOO rather than HAVE_FOO This matches what other targets do, and makes it easier to port to meson. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-16 17:54:00 +00:00
Dylan Baker	9e1f49aae1	scons: Make scons and meson agree about path to glapi generated headers Currently scons puts them in src/mapi/glapi, while meson puts them in src/mapi/glapi/gen. This results in some things being compilable only by one or the other. Put them in the same place so that everyone is happy. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-16 17:54:00 +00:00
Vasily Khoruzhick	ca5782f0ee	lima: add standalone disassembler with primitive MBS parser It's useful for analyzing shader binaries produced by the ARM Mali offline compiler, which outputs files in MBS format. MBS stands for Mali binary shader; currently the parser just extracts the shader binary and ignores everything else. Reviewed-and-tested-by: Connor Abbott <cwabbott0@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-16 09:29:55 -07:00
Heinrich Fink	df8602f4b5	mesa/gl: Sync with Khronos registry Update GL headers and xml API from upstream Khronos registry (commit 3d0c3eb). Keep `BUILDING_MESA` quirk in glext.h. mesa/extensions: Expose EXT_EGL_sync instead of MESA_EGL_sync to reflect Khronos request of changing this extension's scope from MESA to EXT. EXT_EGL_sync is also the name of the extension that has been merged into the upstream Khronos GL registry. Remove MESA_EGL_sync spec txt from Mesa tree as it is now published as EXT by Khronos. v1: Remove MESA_EGL_sync spec and squash commits (Eric E) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-09-16 16:50:43 +01:00
Sergii Romantsov	2bfcf04345	nir/large_constants: pass after lowering copy_deref v2: at J. Ekstrand's suggestion, moved the lowering of large constants to after the lowering of copy_deref is done. CC: Jason Ekstrand <jason@jlekstrand.net> CC: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111450 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>	2019-09-16 11:23:48 +00:00
Michel Dänzer	e536446b60	gitlab-ci: Move up meson-arm64 job definition This might allow the arm64 tests to start running earlier. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-16 12:51:35 +02:00
Michel Dänzer	cccb68b407	gitlab-ci: Move dependencies/needs for meson-main job to .deqp-test Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-16 12:51:35 +02:00
Michel Dänzer	128581d0d8	gitlab-ci: Simplify some job definitions by extending more similar jobs v2: * Preserve setting NIR_VALIDATE=0 for all arm64_* jobs * Preserve setting DEQP_SKIPS=deqp-default-skips.txt for arm64_a306_gles2 jobs Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> # v1 Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-16 12:51:34 +02:00
Michel Dänzer	e426f40097	gitlab-ci: Use multiple inheritance instead of YAML references Support for multiple inheritance was added to GitLab recently. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-16 12:51:34 +02:00
Michel Dänzer	0173e9b1ca	gitlab-ci: Add needs stanza to arm64_a306_gles2 job definition This allows the arm64_a306_gles2 jobs to run as soon as the meson-arm64 job has finished. Fixes: `6f0dc087b7` "freedreno: Introduce gitlab-based CI." Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-16 12:51:34 +02:00
Timothy Arceri	741cff91d3	radeonsi/nir: fix number of used samplers Commit `f3e978db` incorrectly assumed the maximum number of samplers was equal to the max number of defined samplers, which is wrong e.g. where bindings skip slots. This fixes an assert in si_nir_load_sampler_desc() for an Enemy Territory: Quake Wars shader. It also fixes potential bugs with incorrect bounds limiting in the same code for production builds of mesa. Fixes: `f3e978db` ("radeonsi/nir: Remove uniform variable scanning") Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-16 10:14:48 +00:00
Samuel Pitoiset	c5010e72b6	radv/gfx10: disable unsupported transform feedback features for NGG Mostly multiple streams and queries which have to be fixed/implemented. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-16 12:08:22 +02:00
Samuel Pitoiset	d0fd82b502	radv/gfx10: implement NGG streamout It's still disabled by default because transform feedback randomly hangs and it seems like it's related to GDS (cf. RadeonSI). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-16 12:08:22 +02:00
Samuel Pitoiset	63b20fb0cf	radv/gfx10: make sure to wait for idle before clearing GDS Otherwise the next streamout operation will overwrite GDS. This can be improved by tracking if there is a streamout operation in flight. Currently the driver unconditionally flushes but that doesn't matter much as NGG streamout is disabled by default. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-16 12:08:22 +02:00
Samuel Pitoiset	7314f6ef97	radv/gfx10: make GDS idle when leaving the IB NGG streamout uses GDS and we have to make sure that another process isn't going to overwrite GDS while our shaders are busy. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-16 12:08:22 +02:00
Samuel Pitoiset	2d89d8f333	radv/gfx10: enable NGG_WAVE_ID_EN for NGG streamout Otherwise the wave IDs are probably 0 and it hangs. NGG_WAVE_ID_EN generates wave IDs for GDS OA. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-16 12:08:22 +02:00
Samuel Pitoiset	a72344efa3	radv/gfx10: gather GS output for VS as NGG For streamout we have to know the number of streamout outputs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-16 12:08:22 +02:00
Samuel Pitoiset	b617156621	radv/gfx10: compute the correct buffer size for NGG streamout It's used to determine the max emit per buffer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-16 12:08:22 +02:00
Samuel Pitoiset	d81100d307	radv/gfx10: fix unnecessary LDS overallocation for NGG GS Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-16 12:08:22 +02:00
Samuel Pitoiset	c415c58b4a	radv/gfx10: adjust the LDS size for VS/TES NGG streamout It should account for the number of streamout outputs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-16 12:08:22 +02:00
Samuel Pitoiset	67093ed3a3	radv/gfx10: unconditionally declare scratch space for NGG streamout without GS Streamout outputs are stored in the ESGS ring. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-16 12:08:22 +02:00
Samuel Pitoiset	5ebc76471c	radv/gfx10: adjust the GS NGG scratch size for streamout It needs more space for multiple streams. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-16 12:08:22 +02:00
Samuel Pitoiset	e1dc3ab753	radv/gfx10: allocate GDS/OA buffer objects for NGG streamout This allocates two BOs for GFX10 NGG streamout. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-16 12:08:22 +02:00
Samuel Pitoiset	957c3436fa	radv/gfx10: implement NGG streamout begin/end functions Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-16 12:08:22 +02:00
Samuel Pitoiset	a15b3bcf1a	radv/gfx10: add an option to switch from legacy to NGG streamout This internal option is turned off by default because NGG streamout still hangs. It seems like it's related to GDS, as in RadeonSI. That option will be turned on once all issues are resolved. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-16 12:08:22 +02:00
Samuel Pitoiset	c5a00c3068	radv/winsys: add support for GDS and OA domains For NGG streamout which uses GDS. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-16 12:08:22 +02:00
Danylo Piliaiev	6f5a8617b4	iris: Fix fence leak in iris_fence_flush Documentation for pipe_context::flush states: "NOTE: use screen->fence_reference() (or equivalent) to transfer new fence ref to **fence, to ensure that previous fence is unref'd" Hence we need to unref previous out_fence. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-16 08:47:37 +00:00
Sergii Romantsov	c7b2a2fd36	nir/large_constants: more careful data copying The nir_variable.location field may be equal to -1. That may cause copying to an invalid address of a list node, corrupting some internal fields. This patch fixes a segfault during context freeing caused by a corrupted address of ralloc_header.destructor. v2: copy data if var is constant (Connor Abbott) CC: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Fixes: `b6d4753568` (nir/large_constants: De-duplicate constants) Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111676 Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-16 07:58:49 +00:00
Juan A. Suarez Romero	237e6f4fed	docs: extend 19.1.x releases As 19.2 got some delays, let's extend 19.1 at least in one extra release. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-09-16 06:51:20 +00:00
Lionel Landwerlin	0616b7ac90	vulkan: add vk_x11_strict_image_count option This option strictly allocates the minImageCount given by the application at swapchain creation. This works around applications that do not deal with the fact that the implementation allocates more images than the minimum specified. v2: Add values in default drirc (Bas) v3: specify engine name/version (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111522 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Cc: 19.2 <mesa-stable@lists.freedesktop.org>	2019-09-15 15:37:02 +03:00
Lionel Landwerlin	04dc6074cf	driconfig: add a new engine name/version parameter Vulkan applications can register with the following structure : typedef struct VkApplicationInfo { VkStructureType sType; const void* pNext; const char* pApplicationName; uint32_t applicationVersion; const char* pEngineName; uint32_t engineVersion; uint32_t apiVersion; } VkApplicationInfo; This enables the Vulkan implementations to apply workarounds based off matching this description. Here we add a new parameter for matching the driconfig options with the following : <device driver="anv"> <application engine_name_match="MyOwnEngine.*" engine_versions="10:12,40:42"> <option name="blaaah" value="true" /> </application> </device> v2: switch engine name match to use regexps v3: Verify that the regexec returns REG_NOMATCH for match failure (Eric) v4: Add missing bit that went to the following commit (Eric) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: 19.2 <mesa-stable@lists.freedesktop.org>	2019-09-15 15:37:02 +03:00
Lionel Landwerlin	6d5f11ab34	radv: store engine name We'll use this later for a new driconfig matching parameter. v2: Avoid leak in device creation error case (Bas) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: 19.2 <mesa-stable@lists.freedesktop.org>	2019-09-15 15:37:02 +03:00
Christian Gmeiner	9466e4cfab	gallium: util_set_vertex_buffers_mask(..): make use of u_bit_consecutive(..) Also move the clearing of the bits out of if/else. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-14 17:45:47 +00:00
Rob Clark	53a38e3015	gitlab-ci/a630: skip dEQP-GLES3.functional.fbo.msaa.2_samples.stencil_index8 Seen a couple flakes on this one so far. Not sure if it is a real driver problem or not, but skip it to unblock things. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-09-14 10:22:55 -07:00
Lepton Wu	ac175fb168	virgl: replace fprintf with _debug_printf Signed-off-by: Lepton Wu <lepton@chromium.org> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-09-14 00:14:41 +00:00
Kenneth Graunke	c9fb704f72	iris: Initialize ice->state.prim_mode to an invalid value It was calloc'd to 0 which is PIPE_PRIM_POINTS, which means that we fail to notice an initial primitive of points being new, and fail at updating the "primitive is points or lines" field. We do not need to reset this on device loss because we're tracking the last primitive mode sent to us on the CPU via draw_vbo, not the last primitive mode sent to the GPU. Fixes several tests: - dEQP-GLES3.functional.clipping.point.wide_point_clip - dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center - dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner Fixes: `dcfca0af7c` ("iris: Set XY Clipping correctly.")	2019-09-13 16:31:29 -07:00
Eric Anholt	7859eb1390	gitlab-ci: Make the test job fail when bugs are unexpectedly fixed. If people fix bugs without updating the expected-fails list, then we end up with a lack of coverage of those failures in the future. Also, some day down the line another developer ends up trying to figure out if the bug was actually fixed or their environment is just failing to reproduce it. Suggested-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Acked-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-13 13:50:56 -07:00
Eric Anholt	1190c81f8e	gitlab-ci/a630: Drop the MSAA expected failure. This hasn't failed for me in ~5 minutes of looping over dEQP-GLES3.functional.fbo.msaa.* Reviewed-by: Adam Jackson <ajax@redhat.com> Acked-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-13 13:50:54 -07:00
Eric Anholt	7a0fd10ffc	gitlab-ci/a630: Drop remaining dEQP-GLES3.functional.draw.random.* xfails. These haven't failed for me in ~10 minutes of looping over draw.random.*. Reviewed-by: Adam Jackson <ajax@redhat.com> Acked-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-13 13:50:01 -07:00
Andreas Baierl	4b1a14fd47	lima/ppir: Add undef handling Add a ppir dummy node for nir_ssa_undef_instr, create a reg for it and mark it as undefined, so that regalloc can set it non-interfering to avoid register pressure. Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com>	2019-09-13 19:41:32 +00:00
Andreas Baierl	4ddadd6370	lima/ppir: Rename ppir_op_dummy to ppir_op_undef Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com>	2019-09-13 19:41:32 +00:00
John Stultz	3976c86e70	Android.mk: Fix missing \ from recent llvm change Building w/ AOSP, I was hitting the following error: external/mesa3d/src/amd/Android.common.mk:95: error: missing separator. Which was due to the changes to mesa-build-with-llvm missing a line continuation. Fixes: `96b592696f` Signed-off-by: John Stultz <john.stultz@linaro.org>	2019-09-13 19:11:10 +00:00
Boris Brezillon	6ddfd37c7e	panfrost: Move the batch submission logic to panfrost_batch_submit() We are about to patch panfrost_flush() to flush all pending batches, not only the current one. In order to do that, we need to move the 'flush single batch' code to panfrost_batch_submit(). While at it, we get rid of the existing pipelining logic, which is currently unused and replace it by an unconditional wait at the end of panfrost_batch_submit(). A new pipeline logic will be introduced later on. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-13 16:25:06 +02:00
Boris Brezillon	2fc91a16ab	panfrost: Move the fence creation in panfrost_flush() panfrost_flush() is about to be reworked to flush all pending batches, but we want the fence to block on the last one. Let's move the fence creation logic in panfrost_flush() to prepare for this situation. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-13 16:25:06 +02:00
Boris Brezillon	835439b84f	panfrost: Delay payloads[].offset_start initialization panfrost_draw_vbo() might call the primeconvert/without_prim_restart helpers which will enter the ->draw_vbo() again. Let's delay payloads[].offset_start initialization so we don't initialize them twice. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-13 16:25:06 +02:00
Boris Brezillon	4166ca92e2	panfrost: Prepare things to avoid flushes on FB switch panfrost_attach_vt_xxx() functions are now passed a batch, and the generated FB desc is kept in panfrost_batch so we can switch FBs without forcing a flush. The postfix->framebuffer field is restored on the next attach_vt_framebuffer() call if the batch already has an FB desc. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-13 16:25:06 +02:00
Boris Brezillon	e5c7701a0a	panfrost: Pass a batch to panfrost_set_value_job() So we can emit SET_VALUE jobs for a batch that's not currently bound to the context. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-13 16:25:06 +02:00
Boris Brezillon	bc0f6c0b15	panfrost: Use ctx->wallpaper_batch in panfrost_blit_wallpaper() We'll soon be able to flush a batch that's not currently bound to the context, which means ctx->pipe_framebuffer will not necessarily be the FBO targeted by the wallpaper draw. Let's prepare for this case and use ctx->wallpaper_batch in panfrost_blit_wallpaper(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-13 16:25:06 +02:00
Boris Brezillon	aa851a62b9	panfrost: Pass a batch to functions emitting FB descs So we can emit such jobs to a batch that's not currently bound to the context. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-13 16:25:06 +02:00
Boris Brezillon	07a68835a1	panfrost: Pass a batch to panfrost_{allocate,upload}_transient() We need that if we want to upload transient buffers to a batch that's not currently bound to the context, which in turn will be needed if we want to relax the batch serialization we have right now (only flush batches when we need to: on a flush request, or when one batch depends on the result of other batches). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-13 16:25:06 +02:00
Boris Brezillon	e46d95d51b	panfrost: Allow testing if a specific batch is targeting a scanout FB Rename panfrost_is_scanout() into panfrost_batch_is_scanout(), pass it a batch instead of a context and move the code to pan_job.c. With this in place, we can now test if a batch is targeting a scanout FB even if this batch is not bound to the context. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-13 16:25:06 +02:00
Boris Brezillon	40e20324e0	panfrost: Get rid of the unused 'flush jobs accessing res' infra Will be replaced by something similar but using BOs as keys instead of resources. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-13 16:25:06 +02:00
Boris Brezillon	1b5873b73c	panfrost: Use a pipe_framebuffer_state as the batch key This way we have all the fb_state information directly attached to a batch and can pass only the batch to functions emitting CMDs, which is needed if we want to be able to queue CMDs to a batch that's not currently bound to the context. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-13 16:25:06 +02:00
Indrajit Das	92765f85e1	radeon/vcn: exclude raven2 from vcn 2.0 encode initialization Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2019-09-13 09:18:43 -04:00
Eric Engestrom	a0f8a07308	gitlab-ci: rename stages to something simpler Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>	2019-09-13 13:26:09 +01:00
Boris Brezillon	c9bebae287	panfrost: Rework midgard_pair_load_store() to kill the nested foreach loop mir_foreach_instr_in_block_safe() is based on list_for_each_entry_safe() which is designed to protect against removal of the current entry, but removing the entry placed just after the current one will lead to a use-after-free situation. Luckily, the midgard_pair_load_store() logic guarantees that the instruction being removed (if any) is never placed just after ins, which in turn guarantees that the hidden __next variable always points to a valid object. Took me a bit of time to realize that this code was safe, so I suggest getting rid of the inner mir_foreach_instr_in_block_from() loop and reworking the code so that the removed instruction is always the current one (which is what the list_for_each_entry_safe() API was initially designed for). While at it, we also get rid of the unnecessary insert(ins)/remove(ins) dance by simply moving the instruction around. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-13 12:03:47 +02:00
Boris Brezillon	0e513ccca4	panfrost: Fix a list_assert() in schedule_block() list_for_each_entry() does not allow modifying the current item pointer. Let's rework the skip-instructions logic in schedule_block() to not break this rule. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-09-13 11:01:40 +02:00
Iago Toral Quiroga	2eace10c62	v3d: fix TF primitive counts for resume without draw The V3D documentation states that primitive counters are reset when we emit Tile Binning Mode Configuration items, which we do at the start of each draw call. However, in the actual hardware this doesn't seem to take effect when transform feedback is not active (this doesn't happen in the simulator). This causes a problem in the following scenario: glBeginTransformFeedback(), glDrawArrays(), glPauseTransformFeedback(), glDrawArrays(), glResumeTransformFeedback(), glEndTransformFeedback(). The TF pause will trigger a flush of the primitive counters, which results in a correct number of primitives up to that point. In theory, the counter should then be reset when we execute the draw after pausing TF, but that doesn't happen, and since TF is enabled again by the resume command before we end recording, by the time we end the transform feedback recording we again check the counters, but instead of reading 0, we read again the same value we read at the time we paused, incorrectly accumulating that value again. In theory, we should be able to avoid this by using the other method to reset the primitive counters: using operation 1 instead of 0 when we flush the counts to the buffer at the time we pause, but again, this doesn't seem to work and we still see obsolete counts by the time we end transform feedback. This patch fixes the problem by not accumulating TF primitive counts unless we know we have actually queued draw calls during transform feedback, since that seems to effectively reset the counters. This should also be more performant, since it saves unnecessary stalls for the primitive counters to be updated when we know there haven't been any new primitives drawn. Fixes CTS tests: dEQP-GLES3.functional.transform_feedback.* Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-13 06:53:26 +00:00
Iago Toral Quiroga	ded6ea9209	v3d: remove redundant update of queued draw calls This was updating the counter for the indexed draw path only, but we are already updating the counter for all paths a bit later, so this is only duplicating counts for indexed paths. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-13 06:53:26 +00:00
Iago Toral Quiroga	b9a07eed00	v3d: make sure we have enough space in the CL for the primitive counts packet Fixes: `0f2d1dfe65` ("v3d: use the GPU to record primitives written to transform feedback") Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-13 06:53:26 +00:00
Iago Toral Quiroga	b69f51a5ef	v3d: add missing line break for performance debug message Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-13 06:53:26 +00:00
Tomeu Vizoso	bc79e5c437	panfrost/ci: Use releases for Volt dEQP So we can better correlate different results to versions of the runner. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-13 08:35:36 +02:00
Tomeu Vizoso	c301fc027a	panfrost/ci: Update kernel to 5.3-rc8 We haven't updated in a long time, so better do it now and again when 5.3 is released. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-13 08:35:36 +02:00
Tomeu Vizoso	ca4e6637d0	panfrost/ci: Run dEQP with the surfaceless platform Instead of running it with the Wayland platform, which introduces unwanted dependencies and complexity. Makes tests run 30% faster, as well. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-13 08:35:36 +02:00
Samuel Pitoiset	8137df3a46	radv: fix allocating number of user sgprs if streamout is used streamout_buffers is assigned after that function, so the previous fix was completely wrong. This probably fixes something when streamout buffers and push constants are used/inlined in the same shader. Fixes: `378e2d2414` ("radv: fix computing number of user SGPRs for streamout buffers") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-13 07:55:30 +02:00
Jason Ekstrand	acfa2340e6	intel/fs: Handle UNDEF in split_virtual_grfs When the UNDEF instruction was added, we didn't do anything special in split_virtual_grfs. This meant that anything with an UNDEF wasn't getting split, which causes problems for the compiler. Among other things, it makes RA harder because things are in bigger chunks. It also meant that dvec4s weren't getting split, which means that they are larger than the maximum register size. Shader-db results on Kaby Lake: total instructions in shared programs: 14959202 -> 14960035 (<.01%) instructions in affected programs: 96197 -> 97030 (0.87%) helped: 140 HURT: 128 helped stats (abs) min: 1 max: 17 x̄: 1.62 x̃: 1 helped stats (rel) min: 0.09% max: 6.15% x̄: 0.65% x̃: 0.45% HURT stats (abs) min: 1 max: 825 x̄: 8.28 x̃: 1 HURT stats (rel) min: 0.13% max: 139.83% x̄: 1.70% x̃: 0.50% 95% mean confidence interval for instructions value: -2.96 9.18 95% mean confidence interval for instructions %-change: -0.56% 1.51% Inconclusive result (value mean confidence interval includes 0). total loops in shared programs: 4372 -> 4372 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 352646771 -> 352840997 (0.06%) cycles in affected programs: 218600800 -> 218795026 (0.09%) helped: 21167 HURT: 21411 helped stats (abs) min: 1 max: 2924 x̄: 36.89 x̃: 10 helped stats (rel) min: <.01% max: 41.90% x̄: 2.97% x̃: 0.98% HURT stats (abs) min: 1 max: 26027 x̄: 45.54 x̃: 10 HURT stats (rel) min: <.01% max: 324.46% x̄: 3.88% x̃: 1.06% 95% mean confidence interval for cycles value: 2.87 6.26 95% mean confidence interval for cycles %-change: 0.40% 0.55% Cycles are HURT. 
total spills in shared programs: 8840 -> 8953 (1.28%) spills in affected programs: 126 -> 239 (89.68%) helped: 1 HURT: 2 total fills in shared programs: 21782 -> 21914 (0.61%) fills in affected programs: 431 -> 563 (30.63%) helped: 1 HURT: 3 LOST: 0 GAINED: 5 Shader-db results on Haswell: total instructions in shared programs: 13320918 -> 13320769 (<.01%) instructions in affected programs: 40998 -> 40849 (-0.36%) helped: 146 HURT: 56 helped stats (abs) min: 1 max: 8 x̄: 2.73 x̃: 2 helped stats (rel) min: 0.16% max: 8.60% x̄: 2.52% x̃: 2.22% HURT stats (abs) min: 2 max: 23 x̄: 4.45 x̃: 4 HURT stats (rel) min: 0.21% max: 10.26% x̄: 6.83% x̃: 10.26% 95% mean confidence interval for instructions value: -1.26 -0.21 95% mean confidence interval for instructions %-change: -0.62% 0.77% Inconclusive result (%-change mean confidence interval includes 0). total loops in shared programs: 4373 -> 4373 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 374518258 -> 374384193 (-0.04%) cycles in affected programs: 231101954 -> 230967889 (-0.06%) helped: 21427 HURT: 19438 helped stats (abs) min: 1 max: 2035 x̄: 31.09 x̃: 8 helped stats (rel) min: <.01% max: 40.95% x̄: 2.42% x̃: 0.86% HURT stats (abs) min: 1 max: 20875 x̄: 27.38 x̃: 8 HURT stats (rel) min: <.01% max: 59.09% x̄: 2.49% x̃: 0.80% 95% mean confidence interval for cycles value: -4.49 -2.07 95% mean confidence interval for cycles %-change: -0.14% -0.04% Cycles are helped. total spills in shared programs: 23406 -> 23411 (0.02%) spills in affected programs: 3 -> 8 (166.67%) helped: 0 HURT: 2 total fills in shared programs: 34845 -> 34850 (0.01%) fills in affected programs: 3 -> 8 (166.67%) helped: 0 HURT: 2 LOST: 0 GAINED: 0 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111566 Fixes: `f4ef34f207` "intel/fs: Add an UNDEF instruction to avoid..." Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-09-13 04:12:24 +00:00
Jiadong Zhu	33aa039acf	mesa: fix texStore for FORMAT_Z32_FLOAT_S8X24_UINT _mesa_texstore_z32f_x24s8 calculates the source rowStride in 64-bit units, which yields an inaccurate offset if the width of the src image is an odd number. Modify src pointer to int_32* as source image format is gl_float which is 32-bit per pixel. Reviewed by Ilia Mirkin Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-09-12 23:28:28 -04:00
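The stride problem described in the commit above can be illustrated with a little arithmetic. This is a minimal sketch and not the Mesa code: the helper `row_start_bytes` is hypothetical, and it assumes the byte stride is converted to whole pointer-sized elements by integer division, as arithmetic on a typed pointer would do.

```python
def row_start_bytes(row, width, bytes_per_pixel, elem_size):
    """Byte offset of the start of `row` when the byte stride is first
    converted to whole elements of `elem_size` bytes (integer division)."""
    stride_bytes = width * bytes_per_pixel
    stride_elems = stride_bytes // elem_size  # truncates if not divisible
    return row * stride_elems * elem_size

width = 3  # odd width; 32-bit float source, so 4 bytes per pixel

# Reference offsets computed directly in bytes.
correct = [r * width * 4 for r in range(3)]

# Stepping with 64-bit elements loses 4 bytes per row for an odd width.
as_64bit = [row_start_bytes(r, width, 4, 8) for r in range(3)]

# Stepping with 32-bit elements matches the byte-exact offsets.
as_32bit = [row_start_bytes(r, width, 4, 4) for r in range(3)]

print(correct, as_64bit, as_32bit)
```

With an even width the two element sizes agree, which is presumably why the issue only shows up for odd widths.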
Rob Clark	b4df115d3f	freedreno/a6xx: pre-calculate userconst stateobj size The AnTuTu "garden" benchmark overflows the fixed size constbuffer stateobject, so let's be more clever and calculate (a potentially slightly pessimistic) actual size. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-09-12 18:07:20 -07:00
Adam Jackson	5a9dec7534	gallium: Restore VSX for llvm >= 4 Accidentally dropped in `4fdd455eeb`. Fixes: `4fdd455e` ("gallium: Require LLVM >= 3.4") Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-12 20:09:12 -04:00
Eric Anholt	2efc804892	egl/android: Fix build since the DRI fourcc removal. Fixes: `272f9cfe6a` ("dri: Use DRM_FORMAT_* instead of defining our own copy.") Reviewed-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-12 21:54:30 +00:00
Eric Anholt	89e840ec59	gitlab-ci/a630: Disable flappy layout_binding.ssbo.fragment_binding_array It started showing up as unreliable post-merge. There's a valgrind complaint, but even fixing that doesn't make it stable.	2019-09-12 14:16:21 -07:00
Rob Clark	966b7c3ed2	freedreno: fix compiler warning fd6_blitter.c:724:31: warning: passing argument 1 of ‘fd_resource_level_linear’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers] Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-12 12:52:04 -07:00
Eric Anholt	6f0dc087b7	freedreno: Introduce gitlab-based CI. Since freedreno's kernel and GPU reset seem to be totally solid, we don't need to have the complexity of the LAVA setup that panfrost has. Instead, we can register some boards as shared gitlab runners and have the jobs run out of a docker container just like we do for llvmpipe. Just make sure that the DRI device node is passed through to the containers in the gitlab config ('devices = ["/dev/dri"]' under runners.docker). If a runner fails (networking dies, kernel panic, etc.) it'll take out one build but the rest can keep going since gitlab-runner is what pulls jobs. Since the runner pulls jobs, it also means that they can live behind firewalls instead of needing some public address to be accessed by gitlab.fd.o. For now, enable it just on db410c (A307) and cheza (A630) as those are the hardware that I have plenty of. A307 is only testing GLES2 since running all of GLES3 takes too long for the number of boards I've brought up. Acked-by: Rob Clark <robdclark@chromium.org> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-12 10:55:42 -07:00
Eric Anholt	0b6b0c09f4	gitlab-ci: Log the driver version that got tested. Sometimes you just want confirmation that dEQP really picked up the driver you thought you built. This is not as good as one might like, because git isn't present in the cross-build image. Acked-by: Rob Clark <robdclark@chromium.org> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-12 10:55:42 -07:00
Eric Anholt	8d4742fe49	gitlab-ci: Disable dEQP's watchdog timer. A handful of tests on freedreno have been close to the watchdog timeout, and now sporadically fail since range analysis has slowed down the compiler for them. Acked-by: Rob Clark <robdclark@chromium.org> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-12 10:55:42 -07:00
Caio Marcelo de Oliveira Filho	f479878ce6	mesa/st: Fallback to name lookup when the variable has no Parameter This brings back the fallback previously present in st_nir_lookup_parameter_index(): if there's no parameter associated with the variable, use a parameter from a variable with the same prefix. We'll have to sort out something for SPIR-V, but in the meantime let's fix GLSL. Fixes: `b6384e57f5` ("mesa/st: Lookup parameters without using names") Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Eric Anholt <eric@anholt.net>	2019-09-12 17:53:54 +00:00
Adam Jackson	ad9c1838e0	glx: Remove unused indirection for glx_context->fillImage This slot is always filled in with __glFillImage. Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-12 13:23:32 -04:00
Eric Engestrom	f812cbfd88	meson/v3d: replace partial list of nir dep files with idep_nir_headers "partial" because `nir_intrinsics_h` was missing. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-09-12 13:18:36 +01:00
Eric Engestrom	f418de5490	meson/iris: replace partial list of nir dep files with idep_nir_headers "partial" because `nir_intrinsics_h` was missing. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-09-12 13:18:36 +01:00
Jose Maria Casanova Crespo	068c8889dd	v3d: flag dirty state when binding compute states As introduced in "v3d: flag dirty state when binding new sampler states" we need to add support for compute states. New flags VC5_DIRTY_COMPTEX and VC5_DIRTY_UNCOMPILED_CS are introduced. Reaching 33 flags in the dirty field forces us to change the type to uint64_t. Flags are reordered and empty contiguous bits are available for future pipeline stages. v2: Update flag conditions to compile cs shader. (Eric Anholt) Now dirty flags use uint64_t and flags are reordered. Added VC5_DIRTY_UNCOMPILED_CS flag. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-12 12:20:17 +01:00
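Why a 33rd dirty flag forces a wider field can be shown with a short sketch. Fixed-width unsigned storage is emulated here with masks; the names `flag`, `store`, `UINT32_MASK` and `UINT64_MASK` are illustrative and are not taken from the v3d driver.

```python
UINT32_MASK = (1 << 32) - 1
UINT64_MASK = (1 << 64) - 1

def flag(bit):
    """Bit flag occupying bit index `bit` (the first flag is bit 0)."""
    return 1 << bit

def store(value, mask):
    """Emulate storing `value` into a fixed-width unsigned field."""
    return value & mask

flag_33rd = flag(32)  # the 33rd flag needs bit index 32

lost = store(flag_33rd, UINT32_MASK)  # truncated away in a 32-bit field
kept = store(flag_33rd, UINT64_MASK)  # preserved in a 64-bit field

print(hex(lost), hex(kept))
```

A 32-bit field can hold bit indices 0 through 31 only, so the 33rd flag would be silently dropped.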
Danylo Piliaiev	175c32e9bd	tgsi_to_nir: Translate TGSI_INTERPOLATE_COLOR as INTERP_MODE_NONE Translating TGSI_INTERPOLATE_COLOR as INTERP_MODE_SMOOTH made it impossible for drivers to have flatshaded color inputs. Translate it to INTERP_MODE_NONE which drivers interpret as smooth or flat depending on flatshading state. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111467 Fixes: `770faf54` ("tgsi_to_nir: Improve interpolation modes.") Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-12 10:17:39 +00:00
Iago Toral Quiroga	544b156968	nir/lower_point_size: assume scalar PSIZ Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-12 06:40:04 +00:00
Iago Toral Quiroga	7f0b4a803c	gallium/ttn: VARYING_SLOT_PSIZ and VARYING_SLOT_FOGC are scalar Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-12 06:40:04 +00:00
Iago Toral Quiroga	ab341e61f0	prog_to_nir: VARYING_SLOT_PSIZ is a scalar v2: remove stray change (Erik Faye-Lund) Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-12 06:40:04 +00:00
Lepton Wu	8b1912c20b	egl/android: Only keep BGRA EGL configs as fallback Stock Android code actually doesn't support BGRA format EGL configs. It's hard coded to use RGBA_8888 as window format for BGRA EGL configs here: https://android.googlesource.com/platform/frameworks/native/+/1eb32e2/opengl/libs/EGL/eglApi.cpp#608 So just remove it from EGL configs if RGBA is supported. Signed-off-by: Lepton Wu <lepton@chromium.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-09-12 06:38:59 +00:00
renchenglei	e2485bb023	egl/android: Enable HAL_PIXEL_FORMAT_RGBA_1010102 format The patch adds support for HAL_PIXEL_FORMAT_RGBA_1010102 on Android platform. Fixes android.media.cts.DecoderTest#testVp9HdrStaticMetadata which failed in egl due to "Unsupported native buffer format 0x2b" on Android. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Chenglei Ren <chenglei.ren@intel.com>	2019-09-12 05:59:56 +00:00
Kenneth Graunke	6a82a374b4	iris: trivial whitespace fixes	2019-09-11 21:33:41 -07:00
Jonathan Marek	3690a53608	u_format: float type for R11G11B10_FLOAT/R9G9B9E5_FLOAT Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-11 22:39:19 -04:00
Jonathan Marek	8829f9ccb0	u_format: add ETC2 to util_format_srgb/util_format_linear Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-11 22:39:07 -04:00
Vinson Lee	8d286776b6	meson: Add coroutines component to llvmpipe build. Fixes: `d32690b43c` ("gallivm: add coroutine pass manager support") Suggested-by: Gert Wollny <gert.wollny@collabora.com> Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-09-11 15:14:17 -07:00
Eric Anholt	272f9cfe6a	dri: Use DRM_FORMAT_* instead of defining our own copy. We have only two defines that aren't from DRM_FORMAT_*: SARGB and SABGR. Keep only those as __DRI_IMAGE_FOURCC and garbage collect the rest. While this header is also used from the X server, the X server doesn't use any __DRI_IMAGE enums. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-11 13:05:10 -07:00
Eric Anholt	c18b1f0e71	uapi: Update drm_fourcc.h Taken from drm-misc-next 268de6530aa1 ("drm: mst: Fix query_payload ack reply struct") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-11 13:04:53 -07:00
Kenneth Graunke	73e4f974b8	st/mesa: Only pause queries if there are any active queries to pause. Previously, ReadPixels, PBO upload/download, and clears would call cso_save_state with CSO_PAUSE_QUERIES, causing cso_context to call pipe->set_active_query_state() twice for each operation. This can potentially cause driver work to enable/disable statistics counters. But often, there are no queries happening which need to be paused. By keeping a simple tally of active queries, we can skip this work. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-11 19:47:57 +00:00
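The tally idea in the commit above can be sketched as follows. This is a simplified model under stated assumptions, not the state tracker code: `FakePipe` and `StateTracker` are hypothetical classes, and only the method name `set_active_query_state` comes from the commit message.

```python
class FakePipe:
    """Stand-in for a driver context; counts potentially costly calls."""
    def __init__(self):
        self.calls = 0

    def set_active_query_state(self, enable):
        self.calls += 1

class StateTracker:
    def __init__(self, pipe):
        self.pipe = pipe
        self.active_queries = 0  # simple tally of queries in flight

    def begin_query(self):
        self.active_queries += 1

    def end_query(self):
        self.active_queries -= 1

    def meta_operation(self):
        """Models ReadPixels, a PBO transfer or a clear."""
        pause = self.active_queries > 0
        if pause:
            self.pipe.set_active_query_state(False)
        # ... the operation itself would run here ...
        if pause:
            self.pipe.set_active_query_state(True)

pipe = FakePipe()
st = StateTracker(pipe)
st.meta_operation()           # no active queries: driver is not called
calls_without_queries = pipe.calls
st.begin_query()
st.meta_operation()           # one active query: pause, then resume
calls_with_query = pipe.calls
st.end_query()
```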
Jean Hertel	2c1983f757	Fix missing dri2_load_driver on platform_drm Signed-off-by: Jean Hertel <jean.hertel@hotmail.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-09-11 19:28:09 +00:00
Anuj Phogat	729de1488f	intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM Initial benchmarking didn't show any performance benefits. But it might eventually. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-11 11:29:37 -07:00
Anuj Phogat	ee2bde5232	genxml/gen11+: Add COMMON_SLICE_CHICKEN4 register Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-11 11:29:37 -07:00
Adam Jackson	7e0e53a077	egl/dri2: Refuse to add EGLConfigs with no supported surface types For example, the surfaceless platform only supports pbuffers. If the driver supports MSAA, we would still create a config, but it would have no supported surface types. That's meaningless, so don't do it. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-11 14:11:40 -04:00
Adam Jackson	96b592696f	gallium: Require LLVM >= 3.9 To go any further than this would be to break the current version of Android. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-09-11 17:00:43 +00:00
Adam Jackson	585d095610	gallium: Require LLVM >= 3.8 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-09-11 17:00:43 +00:00
Adam Jackson	59f18f2159	gallium: Require LLVM >= 3.7 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-09-11 17:00:43 +00:00
Adam Jackson	9abf7d5755	gallium: Require LLVM >= 3.6 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-09-11 17:00:43 +00:00
Adam Jackson	3c553d9cff	gallium: Require LLVM >= 3.5 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> [ Michel Dänzer: Dropped jessie line from debian-install.sh again ]	2019-09-11 17:00:43 +00:00
Michel Dänzer	57855ff8aa	gitlab-ci: Keep g++ from stretch when installing foreign toolchains Upgrading to a newer g++ causes older LLVM/clang packages to be removed. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-09-11 17:00:43 +00:00
Michel Dänzer	3be7c67bbe	gitlab-ci: Explicitly install linux-libc-dev for foreign architectures Something seems to have changed in Debian buster causing installation of the other foreign packages to fail without this. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-09-11 17:00:43 +00:00
Adam Jackson	4fdd455eeb	gallium: Require LLVM >= 3.4 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-09-11 17:00:43 +00:00
Dylan Baker	a1ebbc3225	Docs: mark that 19.2.0-rc3 has been released Also update -rc4 to me.	2019-09-11 09:47:45 -07:00
Brian Paul	d714415208	st/nir: fix illegal designated initializer in st_glsl_to_nir.cpp IIRC, designated initializers are not legal C++. Fixes the MSVC build. Fixes: `83fd1e58` ("glsl/nir: Add and use a gl_nir_link() function") Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-09-11 09:38:07 -06:00
Dylan Baker	52cf2d05a7	meson: don't generate file into subdirs This is unsupported by meson and may become a hard error in the future. Fixes: `5adfc8602c` ("lima/ppir: move sin/cos input scaling into NIR") Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-11 08:35:05 -07:00
Kenneth Graunke	73b70b4952	iris: Set bo->reusable = false in iris_bo_make_external_locked This fixes a missing bo->reusable = false in iris_bo_export_gem_handle. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2019-09-11 08:10:47 -07:00
Kenneth Graunke	06370c3167	iris: Finish initializing the BO before stuffing it in the hash table Other threads may pick it up once it's in the hash table. Not known to fix anything currently. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2019-09-11 08:10:47 -07:00
Marek Olšák	9a59ad87df	radeonsi/gfx9: honor user stride for imported buffers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-11 11:03:31 -04:00
Marek Olšák	b97c5edd7a	prog_to_nir, tgsi_to_nir: make sure kill doesn't discard NaNs Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-11 10:59:27 -04:00
Marek Olšák	1bb2656276	ac: replace HAVE_LLVM with LLVM_VERSION_MAJOR for atomic-optimizations trivial	2019-09-11 10:56:46 -04:00
Vasily Khoruzhick	32ea4c2c5e	lima: set .out_sync field of req in lima_submit_start() Looks like .out_sync wasn't set in lima_submit_start(); as a result, the submit completion fence was never signalled. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-10 21:49:53 -07:00
Anuj Phogat	cb18046073	intel: Add few Ice Lake brand strings Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-10 15:59:30 -07:00
Kenneth Graunke	c6d40b5182	gallium: Fix util_format_get_depth_only This is a pipe format, not a boolean. Fixes: `5849e0612c` ("gallium/auxiliary: Add util_format_get_depth_only() helper.") Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-09-10 15:49:29 -07:00
Rob Clark	6c19d37331	freedreno/a6xx: fix 3d tex layout Fixes dEQP-GLES3.functional.texture.specification.texstorage3d.size.3d_2x2x2_2_levels Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-10 22:08:33 +00:00
Rob Clark	85a23a8991	freedreno/a6xx: don't tile things that are too small If the lowest (largest) mipmap level is too small to tile, then don't bother pretending. Note that this requires initializing pipe->screen before fd_resource_level_linear() is called. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-09-10 22:08:33 +00:00
Caio Marcelo de Oliveira Filho	15e439071d	iris: Enable ARB_gl_spirv and ARB_spirv_extensions This will also "unlock" OpenGL 4.6 for Iris! v2: Also enable PIPE_CAP_GL_SPIRV_VARIABLE_POINTERS. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1]	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	83fd1e58d8	glsl/nir: Add and use a gl_nir_link() function Perform all the NIR linking steps in order. Change iris and i965 to use it. Suggested by Alejandro. v2: Add gl_nir_linker_options struct. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1]	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	eca8032f20	gallium: Add ARB_gl_spirv support The PIPE_CAP_GL_SPIRV capability enables ARB_gl_spirv and ARB_spirv_extensions, and will make sure the corresponding SPIR-V capabilities and extensions lists are initialized. The additional PIPE_CAP_GL_SPIRV_VARIABLE_POINTERS capability enables the support for Variable Pointers in SPIR-V shaders. This depends on the driver and is not mandatory for ARB_gl_spirv support. v2: Add a PIPE_CAP for Variable Pointers. (Marek) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1]	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	dccd179ba1	mesa/spirv: Set a few more extensions Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	1a12b0fe36	mesa/st: Don't expect prog->nir to already exist There's no such case, if we load prog->nir from the shader cache, we shouldn't hit this path. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	b4b39d9859	mesa/st: Add support for SPIR-V shaders The SPIR-V codepath uses NIR linking, so we have to preprocess after the linking steps, which makes things slightly different than GLSL. To make it clearer when the preprocess is happening, I've ended up inlining st_nir_get_mesa_program() into its caller. The goal was to make both GLSL and SPIR-V use the same preprocess function, the exceptions are: - SPIR-V codepath doesn't support NIR state slots yet; - GLSL lowers shared memory early, so we don't do the deref lowering for those. For now I didn't bother to rename other functions and files (now that many of them apply to both GLSL and SPIR-V), but we should do this in further patches. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	18e79e97e5	mesa/st: Extract preprocessing NIR steps Refactor to split the glsl_to_nir conversion from the preprocessing NIR passes into separate functions, so we can use them in SPIR-V. Unlike in GLSL, there we'll need to perform a few passes with the NIR linker before doing the individual preprocess calls. No behavior should change with this patch. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	b6384e57f5	mesa/st: Lookup parameters without using names Use the new MainUniformStorageIndex field in Parameter instead. It was added so we could match those in the SPIR-V case, where names are optional. v2: Use MainUniformStorageIndex for all cases. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v1] Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	d40978f396	mesa/program: Associate uniform storage without using names Use the new UniformStorageIndex field in Parameter instead. This mechanism was added so we could match those in the SPIR-V case, where names are optional. v2: Use UniformStorageIndex for all cases. (Timothy) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	4dd1ef9d0a	mesa: Fill Parameter storage indices even when not using SPIR-V When creating Parameters, fill in the associated uniform storage indices, like it is done with the NIR linker used for SPIR-V. This will allow later code to not rely on names (which would never work for SPIR-V where names are optional). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	664e4a610d	glsl/nir: Fill in the Parameters in NIR linker The parameter lists were not being created nor filled since i965 doesn't use them. In Gallium they are used for uniform handling, so add a way to fill them. The gl_uniform_storage struct got two new fields that let us go - from a Parameter to the matching UniformStorage and, - from the variable to the first UniformStorage without relying on names -- since they are optional for ARB_gl_spirv. Later patches will make use of them. v2: Do not fill parameters for i965. (Timothy) Use uint32_t for the new attributes. (Marek) v3: Serialize the new fields. (Timothy) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	eea3aa25aa	mesa: Pack gl_program_parameter struct The gl_register_file doesn't need 16 bits, so shorten it and use the extra room for 'Padded' (also mark it as a single bit). This shrinks the struct size from 32 bytes to 24 bytes. See also `4794fbc86e` ("mesa: reduce the size of gl_program_parameter") that shrank it from 40 to 24 and later `7536af670b` ("glsl: fix shader cache for packed param list") that added `Padded`. v2: Use just 5 bits for gl_register_file. (Timothy) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	eda596d64b	compiler: Add glsl_contains_opaque() helper Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	1a96811fe1	mesa/st: Do not rely on name to identify special uniforms Every uniform that has the "gl_" name also has some state slots. So use the state_slots like we did in `57b6184931` ("i965: account for NIR uniforms without name"). This removes the dependency on names, which are optional when using ARB_gl_spirv. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-09-10 14:36:46 -07:00
Caio Marcelo de Oliveira Filho	4f33f96c45	glsl/nir: Avoid overflow when setting max_uniform_location Don't use the UNMAPPED_UNIFORM_LOC (-1) to set the unsigned max_uniform_location. Those unmapped uniforms don't have to be accounted at this point. Fixes: `7a9e5cdfbb` ("nir/linker: Add gl_nir_link_uniforms()") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-09-10 14:36:46 -07:00
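The overflow mentioned in the commit above comes from reinterpreting -1 as an unsigned value. A minimal sketch, assuming a 32-bit unsigned `max_uniform_location`; the two helper functions are hypothetical and only `UNMAPPED_UNIFORM_LOC` (-1) is taken from the commit message.

```python
UNMAPPED_UNIFORM_LOC = -1

def as_uint32(value):
    """Reinterpret an integer as a 32-bit unsigned value."""
    return value & 0xFFFFFFFF

def max_location_naive(locations):
    """Feeds every location, including unmapped ones, into the maximum."""
    result = 0
    for loc in locations:
        result = max(result, as_uint32(loc))
    return result

def max_location_skipping_unmapped(locations):
    """Ignores unmapped uniforms when computing the maximum."""
    result = 0
    for loc in locations:
        if loc == UNMAPPED_UNIFORM_LOC:
            continue
        result = max(result, as_uint32(loc))
    return result

locations = [0, 3, UNMAPPED_UNIFORM_LOC]
naive = max_location_naive(locations)                # -1 wraps to UINT32_MAX
fixed = max_location_skipping_unmapped(locations)
print(naive, fixed)
```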
Dylan Baker	3047199931	meson: don't allow glvnd on windows Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-10 20:36:47 +00:00
Dylan Baker	e1e2388f06	meson: don't build glx or dri by default on windows v5: - Move is windows check down to make code more robust Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-10 20:36:47 +00:00
Dylan Baker	70cac06bbf	meson: Add a platform for windows This mirrors the haiku build which uses a platform. v2: - Fix some rebase problems Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-10 20:36:47 +00:00
Dylan Baker	f680cc62f8	meson: build getopt when using msvc v4: - Don't wrap a single file in a list to match mesa style - Use null_dep instead of empty list Reviewed-by: Eric Anholt <eric@anholt.net> (v3) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-10 20:36:47 +00:00
Dylan Baker	0caa229dcb	meson: fix dl detection on non cygwin windows v4: - Don't run checks on Windows that will always fail Reviewed-by: Eric Anholt <eric@anholt.net> (v3) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-10 20:36:47 +00:00
Dylan Baker	2595b7c997	glapi: export glapi_destroy_multithread when building shared-glapi on windows Which will allow meson to build a shared glapi build with mingw. v2: - Add symbol to symbol check test Reviewed-by: Eric Anholt <eric@anholt.net> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-10 20:36:47 +00:00
Dylan Baker	b9fa7ec4fa	meson: add a expat subproject For Windows Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-10 20:36:47 +00:00
Dylan Baker	8ba86ad55c	meson: add a zlib subproject To help windows build Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-10 20:36:47 +00:00
Dylan Baker	49fe44fe5a	add a git ignore for subprojects Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-10 20:36:47 +00:00
Dylan Baker	af444d84a3	meson: don't build glapi_static_check_table on windows It doesn't compile due to undefined symbols, which are in libglapi_static, so I don't understand the problem. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-10 20:36:47 +00:00
Dylan Baker	8424209a42	meson: Make shared-glapi a combo So it can auto off for windows, but on elsewhere. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-10 20:36:47 +00:00
Dylan Baker	a1a8703199	meson: don't try to generate i18n translations on windows Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-10 20:36:47 +00:00
Dylan Baker	26961e2cb5	glsl/tests: Handle windows \r\n new lines Currently the parser for s expressions assumes that newlines will be \n, resulting in incorrect parsing on windows, where the newline is \r\n. This patch just adds \r? to the regular expression used to parse the s expressions, which fixes at least 1 test on windows. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-10 20:36:46 +00:00
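The effect of adding `\r?` before a newline in a regular expression can be seen in a small sketch. The patterns below are hypothetical stand-ins and are not the expressions used in the Mesa test suite.

```python
import re

# Hypothetical patterns: each matches one "(name number)" item per line.
STRICT = re.compile(r'\(([a-z]+) ([0-9]+)\)\n')
TOLERANT = re.compile(r'\(([a-z]+) ([0-9]+)\)\r?\n')

unix_text = "(a 1)\n(b 2)\n"
windows_text = "(a 1)\r\n(b 2)\r\n"

# The strict pattern fails on Windows line endings because the closing
# parenthesis is followed by \r rather than \n.
strict_unix = STRICT.findall(unix_text)
strict_windows = STRICT.findall(windows_text)

# The tolerant pattern accepts both line-ending conventions.
tolerant_unix = TOLERANT.findall(unix_text)
tolerant_windows = TOLERANT.findall(windows_text)

print(strict_windows, tolerant_windows)
```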
Kenneth Graunke	077a1952cc	iris: Fix constant buffer sizes for non-UBOs Since the system value refactor, we've accidentally only been setting cbuf->buffer_size in the UBO case, and not in the uploaded-constants case. We use cbuf->buffer_size to fill out the SURFACE_STATE entry, so it needs to be initialized in both cases. Fixes: `3b6d787e40` ("iris: move sysvals to their own constant buffer")	2019-09-10 10:53:15 -07:00
Lionel Landwerlin	341034a73d	intel: update product names for WHL Documentation lists all of those as "UHD". Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111629 BSpec: 33266 Acked-by: Tapani Pälli <tapani.palli@intel.com>	2019-09-10 19:21:38 +03:00
Samuel Pitoiset	538766792d	radv/gfx10: declare a LDS symbol for the NGG emit space This fixes some interactions when NGG GS is enabled. It fixes: - dEQP-VK.clipping.user_defined.clip_cull_distance_dynamic_index.geom - dEQP-VK.tessellation.geometry_interaction.passthrough.* For some reason, using the computed ESGS ring size randomly hangs with CTS. For now, just use the maximum LDS size for ESGS. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-10 09:27:01 +02:00
Samuel Pitoiset	168f8dbafa	radv: calculate GFX9 GS and GFX10 NGG states before compiling shader variants Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-10 09:26:58 +02:00
Samuel Pitoiset	e7ee9a6387	radv: store the ESGS ring size as part of gfx10_ngg_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-10 09:26:53 +02:00
Samuel Pitoiset	7eba5666fa	radv: store GFX10 NGG state as part of the shader info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-10 09:26:51 +02:00
Samuel Pitoiset	349caedee0	radv: store GFX9 GS state as part of the shader info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-10 09:26:47 +02:00
Samuel Pitoiset	a9af11f1fa	radv: fill shader info for all stages in the pipeline This shouldn't be in NIR->LLVM because ACO also needs the shader info. This will also help for computing some NGG values that are necessary for declaring LDS symbols. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-10 09:26:45 +02:00
Samuel Pitoiset	8cf297c7b1	radv: do not pass all compiler options to the shader info pass Only the pipeline layout and the shader keys are needed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-10 09:26:42 +02:00
Marek Olšák	ef919d8dcb	radeonsi: remove redundant si_texture offset and size fields Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-09 23:43:03 -04:00
Marek Olšák	e4c84d8678	radeonsi: move texture storage allocation outside of radeonsi possible code sharing with radv Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-09 23:43:03 -04:00
Marek Olšák	58ccadfc5c	radeonsi: move HTILE allocation outside of radeonsi ac_surface computes it for amdgpu. radeon_drm_surface computes it for radeon. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-09 23:43:03 -04:00
Marek Olšák	30a1dd0ee6	radeonsi: handle NO_DCC early Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-09 23:43:03 -04:00
Marek Olšák	7d4a10a29f	ac/surface: add RADEON_SURF_NO_FMASK This controls FMASK and CMASK computation for MSAA. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-09 23:43:03 -04:00
Marek Olšák	6633863150	r300,r600,radeonsi: set winsys_handle::stride,offset in drivers, not winsyses Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-09 23:43:03 -04:00
Marek Olšák	5ac6908263	r300,r600,radeonsi: read winsys_handle::stride,offset in drivers, not winsyses Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-09 23:43:03 -04:00
Marek Olšák	d95afd8b9e	radeonsi/gfx10: fix wave occupancy computations Cc: 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-09 23:43:03 -04:00
Marek Olšák	42ea0b7b52	radeonsi: only support at most 1024 threads per block LLVM 10 won't support 2048. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-09 23:43:03 -04:00
Marek Olšák	c1e08cb6d5	radeonsi: disable DCC when importing a texture from an incompatible driver and unify the code. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-09 23:43:03 -04:00
Marek Olšák	28adf0d00c	radeonsi/gfx10: don't call gfx10_destroy_query with compute-only contexts This fixes a crash. Cc: 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-09 23:43:03 -04:00
Marek Olšák	2f42d4cacc	radeonsi/gfx10: use fma for TGSI_OPCODE_FMA Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-09 23:43:03 -04:00
Marek Olšák	d64593e3c4	ac: use fma on gfx10 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-09-09 23:43:03 -04:00
Marek Olšák	d979e5bfab	ac: enable LLVM atomic optimizations	2019-09-09 23:43:03 -04:00
Lepton Wu	263136fb5d	virgl: Fix pipe_resource leaks under multi-sample. Fixes: `900a80f9e4` ("virgl: virgl_transfer should own its virgl_resource") Signed-off-by: Lepton Wu <lepton@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-09-10 03:42:55 +00:00
Kenneth Graunke	410894c643	iris: Avoid flushing for cache history on transfer range flushes The VBO module maps a buffer with GL_MAP_FLUSH_EXPLICIT, and keeps appending data, and calling glFlushMappedBufferRange(). We were invalidating the VF cache each time it flushed a new range, which results in a ton of VF flushes. If the contents of the destination in the target range are undefined (never even possibly written), this patch makes us assume that it's likely not in the cache and so no cache invalidations are required. If the destination range is defined, we continue cache flushing as we may need to expunge stale data. This eliminates 88% of the VF cache invalidates on Manhattan 3.0. Improves performance in Manhattan 3.0 on my Icelake 8x8 with the GPU frequency locked to 700MHz by 0.376724% +/- 0.0989183% (n=10).	2019-09-09 15:08:22 -07:00
Kenneth Graunke	7d28e9ddd6	iris: Optimize out redundant sampler state binds This cuts roughly 85% of the 3DSTATE_SAMPLER_STATE_POINTERS_PS calls in the J2DBench images test. For some reason, the state tracker is calling bind_sampler_state with the same sampler state in a bunch of cases.	2019-09-09 11:55:27 -07:00
Kenneth Graunke	325e25d689	iris: Add support for the always_flush_cache=true debug option. This can be useful for debugging missing flushes.	2019-09-09 11:55:27 -07:00
Adam Jackson	366b2e5c19	mesa: Eliminate gl_config::rgbMode Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-09 14:12:57 -04:00
Adam Jackson	78e0fa6bb2	mesa: Eliminate gl_config::have{Accum,Depth,Stencil}Buffer Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-09 14:12:57 -04:00
Adam Jackson	c4990b7b19	mesa: Remove unused gl_config::indexBits Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-09 14:12:57 -04:00
Adam Jackson	04bef9a0a6	gallium/xlib: Fix an obvious thinko x == !GLX_DIRECT_COLOR is a fancy way of writing x == 0, which is clearly not what was meant.	2019-09-09 14:12:57 -04:00
Kenneth Graunke	9173459b95	iris: Ignore line stipple information if it's disabled The line stipple pattern and factor only matter if line stippling is actually enabled. Otherwise, we can safely ignore it. PBO upload may give us zero for line stipple information, while normal drawing tends to give us an actual stipple pattern such as 0xffff. This was causing us to flag IRIS_DIRTY_LINE_STIPPLE way too often, leading to useless 3DSTATE_LINE_STIPPLE commands, which are non-pipelined and thus very expensive. Improves performance in Manhattan 3.0 on Skylake GT4e by 0.149261% +/- 0.0380796% (n=210). On an Icelake 8x8 with the GPU frequency locked at 700Mhz, improves by 0.423756% +/- 0.222843% (n=3).	2019-09-09 10:55:20 -07:00
Vasily Khoruzhick	fbd5d9ebb5	lima/ppir: drop fge/flt/feq/fne options These are supposed to be lowered into sge/slt/seq/sne equivalents. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-09 10:25:30 -07:00
Vasily Khoruzhick	576341324d	lima: run opt_algebraic between int_to_float and bool_to_float for vs int_to_float emits ftrunc and ftrunc lowering generates bool ops. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-09 10:25:30 -07:00
Vasily Khoruzhick	996f1b6174	lima/gpir: fix warning in gpir disassembler Fixes following warning: ../src/gallium/drivers/lima/ir/gp/disasm.c: In function ‘print_src’: ../src/gallium/drivers/lima/ir/gp/disasm.c:241:20: warning: array subscript 28 is above array bounds of ‘char[5]’ [-Warray-bounds] 241 \| "xyzw"[src - gpir_codegen_src_attrib_x]); Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-09 10:25:30 -07:00
Vasily Khoruzhick	e6dbf6d948	lima/gpir: lower fceil GP doesn't support fceil so we need to lower it. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-09 10:25:30 -07:00
Connor Abbott	c64f30546d	lima/gpir: Disallow moves for schedule_first nodes The entire point of schedule_first is that the node has to be scheduled as soon as possible without any moves because it doesn't produce a proper floating-point value, or its value changes depending on where you read it. We were still introducing a move for preexp2 in some cases though, even if it got scheduled as soon as possible, which broke some exp() tests. Fix that. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-09 17:42:19 +07:00
Connor Abbott	8c7ad22adb	lima/gpir: Fix fake dep handling for schedule_first nodes The whole point of schedule_first nodes is that they need to be scheduled as soon as possible, so if a schedule_first node is the successor in a fake dependency that prevents it from being scheduled after its parent, that can cause problems. We need to add these fake dependencies to the parent as well, and we need to guarantee that the pre-RA scheduler puts schedule_first nodes right before their parents in order to prevent this from adding cycles to the dependency graph. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-09 17:42:00 +07:00
Connor Abbott	2955875381	lima/gpir: Fix schedule_first insertion logic The idea was to make sure schedule_first nodes were always first in the ready list. I made sure they were inserted first, but not that other nodes wouldn't later be scheduled ahead of them. Fixes spec@glsl-1.10@execution@built-in-functions@vs-exp-float and probably others. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-09 17:41:35 +07:00
Connor Abbott	63acdb5ce6	lima/gpir: Ignore unscheduled successors in can_use_complex() The point of the function is to avoid creating a complex move which is used by certain slots in the next instruction, but unscheduled successors will never be in the next instruction. Found while debugging a crash that the previous commit fixed. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-09 17:40:58 +07:00
Connor Abbott	ee8cc90e55	lima/gpir: Do all lowerings before rsched The scheduler assumes that load nodes are always duplicated so that they can always be scheduled eventually and therefore they never need to be spilled. But some lowerings were running after the pre-RA scheduler, whereas duplication has to happen before then since it's needed for the scheduler to do a better job reducing register pressure. This meant that lowerings were introducing multiple uses of a load instruction, which broke the scheduler's expectation and resulted in infinite loops in situations where the only nodes available to spill were load nodes. Spilling load nodes would be silly, so we want to fix the lowerings rather than the scheduler. Just do all lowerings before the pre-RA scheduler, which also helps with reducing pressure since the scheduler can more accurately compute the pressure. Fixes lima/mesa#104. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-09 17:39:20 +07:00
Mauro Rossi	ae5ac26dfa	android: anv: libmesa_vulkan_common: add libmesa_util static dependency Change needed to fix the following building error: In file included from external/mesa/src/intel/vulkan/anv_device.c:43: external/mesa/src/util/xmlpool.h:115:10: fatal error: 'xmlpool/options.h' file not found ^~~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `4dcb1ff` ("anv: add support for driconf") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-09-08 20:07:56 +02:00
Boris Brezillon	3ce03374b3	panfrost: Rename pan_bo_cache.c into pan_bo.c So we can move all the BO logic into this file instead of having it spread over pan_resource.c, pan_drm.c and pan_bo_cache.c. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-08 16:24:54 +02:00
Boris Brezillon	14bfb0cb67	panfrost: Get rid of the now unused SLAB allocator The last users have been converted to use plain BOs. Let's get rid of this abstraction. We can always consider adding it back if we need it at some point. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-08 16:24:19 +02:00
Boris Brezillon	2c90045cf2	panfrost: Get rid of unused panfrost_context fields Some fields in panfrost_context are unused (probably leftovers from previous refactor). Let's get rid of them. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-08 16:23:34 +02:00
Boris Brezillon	76274bcb5e	panfrost: Convert ctx->{scratchpad, tiler_heap, tiler_dummy} to plain BOs ctx->{scratchpad,tiler_heap,tiler_dummy} are allocated using panfrost_drm_allocate_slab() but they never use any of the SLAB-based allocation logic. Let's convert those fields to plain BOs. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-08 16:22:59 +02:00
Boris Brezillon	a2bba567ae	panfrost: Make transient allocation rely on the BO cache Right now, the transient memory allocator implements its own BO caching mechanism, which is not really needed since we already have a generic BO cache. Let's simplify things a bit. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-08 16:22:26 +02:00
Boris Brezillon	12d8a17957	panfrost: Stop passing a ctx to functions being passed a batch The context can be retrieved from batch->ctx. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-08 16:21:44 +02:00
Boris Brezillon	beb18c6172	panfrost: Pass a batch to panfrost_drm_submit_vs_fs_batch() Given the function name it makes more sense to pass it a job batch directly. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-08 16:20:59 +02:00
Boris Brezillon	2c526993bc	panfrost: s/job/batch/ What we currently call a job is actually a batch containing several jobs all attached to a rendering operation targeting a specific FBO. Let's rename structs, functions, variables and fields to reflect this fact. Suggested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-09-08 16:19:56 +02:00
Heinrich Fink	3aa4f3a442	egl: Add GL_MESA_EGL_sync support This commit follows OES_EGL_sync to universally enable use of EGL sync objects with desktop OpenGL contexts. Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-08 08:01:55 +00:00
Heinrich Fink	8c933c9d96	headers: Add GL_MESA_EGL_sync token to GL Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-08 08:01:55 +00:00
Heinrich Fink	17470c4aaa	registry: update gl.xml with GL_MESA_EGL_sync token As added by upstream GL registry changes Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-08 08:01:55 +00:00
Heinrich Fink	f4327ce06e	specs: Add GL_MESA_EGL_sync Adds GL_MESA_EGL_sync as defined in upstream OpenGL registry Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-08 08:01:55 +00:00
Tapani Pälli	f83f9d7daa	android: fix linking issues with liblog Fixes Android build errors observed in Intel CI. Fixes: `f9f7cbc1aa` "util: android logging support" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-09-07 13:16:29 +03:00
Kenneth Graunke	dfb86405cf	iris: Support the disable_throttling=true driconf option.	2019-09-06 18:35:24 -07:00
Jason Ekstrand	c832820ce9	nir/dead_cf: Repair SSA if the pass makes progress The dead_cf pass calls into the CF manipulation helpers which attempt to keep NIR's SSA form sane. However, when the only break is removed from a loop, dominance gets messed up anyway because the CF SSA clean-up code only looks at phis and doesn't consider the case of code becoming unreachable. One solution to this would be to put the loop into LCSSA form before we modify any of its contents. Another (and the approach taken by this pass) is to just run the repair_ssa pass afterwards because the CF manipulation helpers are smart enough to keep all the use/def stuff sane; they just don't always preserve dominance properties. While we're here, we clean up some bogus indentation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111405 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111069 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-06 23:39:01 +00:00
Jason Ekstrand	1005272a2b	nir/repair_ssa: Insert deref casts when needed Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-06 23:39:01 +00:00
Jason Ekstrand	a3268599f3	nir/repair_ssa: Repair dominance for unreachable blocks NIR currently assumes that unreachable blocks are trivially dominated by everything. However, when considering well-formed SSA, there is no path from any block to an unreachable block. Therefore, we can break any use-def chains where the use is in an unreachable block. This removes any dependencies on code created by uses in unreachable blocks and lets DCE do a better job of cleaning it up. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-06 23:39:01 +00:00
Jason Ekstrand	f81a2623d8	nir: Add a block_is_unreachable helper Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-06 23:39:01 +00:00
Jason Ekstrand	517142252f	nir: Don't infinitely recurse in lower_ssa_defs_to_regs_block Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-06 23:39:01 +00:00
Jason Ekstrand	37cdb7fc44	nir: Handle complex derefs in nir_split_array_vars We already bail and don't split the vars but we were passing a NULL to _mesa_hash_table_search which is not allowed. Fixes: `f1cb3348f1` "nir/split_vars: Properly bail in the presence of ..." Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-09-06 23:39:01 +00:00
Jason Ekstrand	34541be7b0	intel/blorp: Use wide formats for nicely aligned stencil clears In the case where the stencil clear is nicely aligned, we can clear stencil much more efficiently by mapping it as a wide format (say RGBA32_UINT) and blasting out the stencil clear value with a repclear. On Unigine Heaven, this makes one stencil clear go from non-trivial to unnoticeable when looking at per-draw timings. In order for this change to work properly, ANV needs to do a bit more flushing around depth and stencil clears. i965 and iris already have the cache tracking logic to handle this so no changes are required there. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-06 23:35:09 +00:00
Jason Ekstrand	d62ca48c31	intel/blorp: Expose surf_fake_interleaved_msaa internally	2019-09-06 23:35:09 +00:00
Jason Ekstrand	caa786e029	intel/blorp: Expose surf_retile_w_to_y internally Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-06 23:35:09 +00:00
Jason Ekstrand	a90b1cbe73	blorp: Memset surface info to zero when initializing it This isn't known to fix any current bugs but it does prevent a regression in a subsequent commit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-06 23:35:09 +00:00
Jason Ekstrand	c15b197d74	intel/tools: Decode PS kernels on SNB Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-06 23:35:09 +00:00
Jason Ekstrand	7f5cb5fd6d	intel/tools: Decode 3DSTATE_BINDING_TABLE_POINTERS on SNB Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-06 23:35:09 +00:00
Rhys Perry	6b8cb08756	nir/lower_io_to_vector: don't merge compact varyings Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: 02bc4aabb48 ('nir/lower_io_to_vector: allow FS outputs to be vectorized') Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-06 15:38:10 -07:00
Eric Engestrom	27339fe9a7	drirc: override minImageCount=2 for gfxbench Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110765 Fixes: `4689e98fe8` ("vulkan/wsi: Set X11 minImageCount to 3.") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-06 23:16:05 +01:00
Eric Engestrom	5eb7d48b58	radv: add support for vk_x11_override_min_image_count Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-06 23:16:05 +01:00
Eric Engestrom	4ad99ee961	amd: move adaptive sync to performance section, as it is defined in xmlpool Fixes: `3844ed8d44` ("radv: Add adaptive_sync driconfig option and enable it by default.") Fixes: `e260493f2a` ("radeonsi: Enable adaptive_sync by default for radeon") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-06 23:16:05 +01:00
Eric Engestrom	037b5b567f	anv: add support for vk_x11_override_min_image_count Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-06 23:16:05 +01:00
Eric Engestrom	a72cdd00ab	wsi: add minImageCount override Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-06 23:16:05 +01:00
Eric Engestrom	4dcb1fff19	anv: add support for driconf No option is supported yet, this is just the boilerplate. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-06 23:16:05 +01:00
Eric Engestrom	ba73564b52	gallivm: drop LLVM<3.3 code paths as no build system allows that Suggested-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-06 22:26:29 +01:00
Eric Engestrom	1b8764638a	meson/scons/android: drop now-unused HAVE_LLVM Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-09-06 22:26:29 +01:00
Eric Engestrom	2406b35151	llvmpipe: replace more complex 3.x version check with LLVM_VERSION_MAJOR/MINOR Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-09-06 22:26:29 +01:00
Eric Engestrom	ba1e085587	clover: replace more complex 3.x version check with LLVM_VERSION_MAJOR/MINOR Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-09-06 22:26:29 +01:00
Eric Engestrom	1c1c477470	gallivm: replace more complex 3.x version check with LLVM_VERSION_MAJOR/MINOR Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-09-06 22:26:29 +01:00
Eric Engestrom	7527144383	clover: replace major llvm version checks with LLVM_VERSION_MAJOR Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-09-06 22:26:29 +01:00
Eric Engestrom	08890068c5	gallivm: replace major llvm version checks with LLVM_VERSION_MAJOR Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-09-06 22:26:29 +01:00
Eric Engestrom	6120c442ee	swr: replace major llvm version checks with LLVM_VERSION_MAJOR Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-09-06 22:26:29 +01:00
Eric Engestrom	19d9e57f2c	amd: replace major llvm version checks with LLVM_VERSION_MAJOR Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-09-06 22:26:29 +01:00
Eric Engestrom	bce9c05ca8	svga: replace binary HAVE_LLVM checks with LLVM_AVAILABLE Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-09-06 22:19:01 +01:00
Eric Engestrom	cf7d186be6	r600: replace binary HAVE_LLVM checks with LLVM_AVAILABLE Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-09-06 22:19:01 +01:00
Eric Engestrom	28cb16b6f8	aux/draw: replace binary HAVE_LLVM checks with LLVM_AVAILABLE Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-09-06 22:19:01 +01:00
Eric Engestrom	ef434fbc25	meson/scons/android: add LLVM_AVAILABLE binary flag Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-09-06 22:19:01 +01:00
Eric Engestrom	5aebe37b53	gallivm: replace `0x` version print with actual version string Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Michel Dänzer <mdaenzer@redhat.com>	2019-09-06 22:19:01 +01:00
Jordan Justen	9790cfcefa	anv,iris: L3ALLOC register replaces L3CNTLREG for gen12 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-06 13:11:25 -07:00
Anuj Phogat	414cae0fd6	intel/gen12: Add L3 configurations Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-06 13:11:22 -07:00
Rhys Perry	5a7fe0ae99	util: include u_endian.h in u_math.h u_endian.h needs to be included, otherwise PIPE_ARCH_BIG_ENDIAN might not be defined on big-endian architectures and the endian conversion macros will be incorrect. I don't think anything is broken because of this, I just noticed this when looking at the file. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 19:52:50 +00:00
Jason Ekstrand	3b1a7e5333	anv: Bump maxComputeWorkgroupSize Fixes: `9a129510f5` "anv: Bump maxComputeWorkgroupInvocations" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111552 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-06 18:26:55 +00:00
Kenneth Graunke	0d0ae16e8f	intel: Stop redirecting state cache to command streamer cache section This bit redirects the state cache from the unified/RO sections of the L3 cache to the "CS command buffer" section of the cache, which would be set up via TCCNTLREG. The documentation says: "Additionally, this redirection should be enabled only if there is a non-zero allocation for the CS command buffer section." We don't allocate any cache to the CS command buffer section, so enabling this redirection effectively disabled the state cache. The Windows driver only sets up that section when using POSH, which we do not currently use. So, leave it unallocated and disable the redirection to get a functional state cache again. Improves performance in Civilization VI by 18%, Manhattan 3.0 by 6%, and Car Chase by 2%.	2019-09-06 10:57:55 -07:00
Kenneth Graunke	68be5ff8d0	iris: Invalidate state/texture/constant caches after STATE_BASE_ADDRESS Jason pointed out that the caches likely refer to offsets from dynamic and surface state base addresses, so when we change those, we need to invalidate the caches. Comment borrowed from src/intel/vulkan/genX_cmd_buffer.c.	2019-09-06 10:57:55 -07:00
Kristian H. Kristensen	30ab3e39fd	freedreno/a6xx: Implement primitive count queries on GPU The driver can't determine PIPE_QUERY_PRIMITIVES_GENERATED or PIPE_QUERY_PRIMITIVES_EMITTED once we support geometry or tessellation, since these stages add primitives at runtime. Use the WRITE_PRIMITIVE_COUNTS event to write back the primitive counts and implement a hw query for this. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-09-06 09:53:28 -07:00
Kristian H. Kristensen	1acf8d2354	freedreno/a6xx: Let the GPU track streamout offsets The GPU writes out streamout offsets as it goes to the FLUSH_BASE pointer. We use that value with CP_MEM_TO_REG when appending to the stream so that we don't have to track the offsets with the CPU in the driver. This ensures that streamout continues to work once we enable geometry and tessellation shader stages that add geometry. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-09-06 09:53:28 -07:00
Roland Scheidegger	de1c89fd93	llvmpipe: fix CALLOC vs. free mismatches Should fix some issues we're seeing. And use REALLOC instead of realloc. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-09-06 18:31:34 +02:00
Samuel Pitoiset	0bf51b6941	radv/gfx10: determine the number of vertices per primitive for TES This doesn't fix anything known but it's correct now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-06 17:36:49 +02:00
Rhys Perry	bcd14756ee	nir/lower_io_to_vector: add flat mode This has lower_io_to_vector try to turn variables into arrays of 4-sized vectors when possible and fall back to the old approach when that isn't possible. This is so that lower_io_to_vector can guarantee that only one variable is used for each fragment shader output. v2: handle dual-source blending v3: don't try to merge structs and non-32-bit types in get_flat_type() v3: fix per-vertex inputs v3: fix and cleanup location advancement in get_flat_type() and its calling code v4: prioritize the original mode over the flat mode v4: don't create flat variables to merge only one variable v5: don't skip an entire slot when encountering structs in the old mode Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-06 15:38:04 +00:00
Rhys Perry	300e758b7c	nir/lower_io_to_vector: allow FS outputs to be vectorized v2: handle dual-source blending v3: use a higher MAX_SLOTS Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-06 15:38:04 +00:00
Samuel Pitoiset	c6be5cefba	radv/gfx10: make use of the output usage mask when exporting NGG GS params It shouldn't matter much because output varyings should have been compacted during NIR shader linking but it mirrors what the driver does when emitting NGG GS vertex parameters. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-06 17:25:28 +02:00
Samuel Pitoiset	b1a872f0c0	radv/gfx10: account for the subpass view for the NGG GS storage If the fragment shader needs the layer index, we have to allocate one more dword in the NGG GS storage. Found by inspection. This doesn't fix anything known. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-06 17:25:28 +02:00
Tomeu Vizoso	0efc0f8edc	panfrost/ci: Increase timeouts Sometimes LAVA jobs will time out due to transient issues, and the Gitlab job will fail in that case. Increase the timeouts to reduce the likelihood of that happening and reduce false positives. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-06 16:35:16 +02:00
Tomeu Vizoso	8a5dd61828	panfrost/ci: Use special runner for LAVA jobs So repositories don't need to be specially configured with a token to access LAVA, store this token in a bind volume for a special runner. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-06 16:35:16 +02:00
Tomeu Vizoso	10b60dbd2c	panfrost/ci: Re-add support for armhf Now that Volt supports armhf, build again images and submit to LAVA for RK3288. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-06 16:35:16 +02:00
Samuel Pitoiset	f31fb33432	radv: calculate esgs_itemsize in the shader info pass Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-06 15:52:24 +02:00
Samuel Pitoiset	7fa00e178f	radv: calculate the GSVS vertex size in the shader info pass Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-06 15:52:22 +02:00
Samuel Pitoiset	3e8bda66ae	radv: gather primitive ID in the shader info pass Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-06 15:52:20 +02:00
Samuel Pitoiset	1877e87f1e	radv: gather layer in the shader info pass Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-06 15:52:19 +02:00
Samuel Pitoiset	84b346eda9	radv: gather viewport in the shader info pass Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-06 15:52:17 +02:00
Samuel Pitoiset	d21489d415	radv: gather pointsize in the shader info pass Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-06 15:52:09 +02:00
Samuel Pitoiset	a99d2d5564	radv: gather clip/cull distances in the shader info pass Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-06 15:52:07 +02:00
Samuel Pitoiset	b16cf6c4c6	radv: move ac_fill_shader_info() to radv_nir_shader_info_pass() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-06 15:52:05 +02:00
Samuel Pitoiset	83499ac765	radv: merge radv_shader_variant_info into radv_shader_info Having two different structs is useless. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-06 15:52:03 +02:00
Zhu, James	878439bba3	radeon: Fix mjpeg issue for ARCTURUS ARCTURUS mjpeg is using direct register access. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2019-09-06 08:53:52 -04:00
Leo Liu	a3074370d9	radeon/vcn: add RENOIR VCN decode support It has same VCN2.x block as Navi1x Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2019-09-06 08:53:52 -04:00
Danylo Piliaiev	aabde02f2f	glsl: Fix unroll of do{} while(false) like loops For loops whose condition is false on the first iteration, the iteration count was incorrectly calculated under the assumption that the loop's condition is true until it becomes false, meaning it's true at least one time. Now such loops are reported as having 0 iterations. Similar to the fix `e71fc7f2` done in NIR. Fixes tests/shaders/glsl-fs-loop-while-false-02.shader_test Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-06 10:27:33 +00:00
Timur Kristóf	3debd0ef15	tgsi_to_nir: Remove dependency on libglsl. This commit removes the GLSL dependency in TTN by manually recording the textures used and calling nir_lower_samplers instead of its GL counterpart. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-06 12:20:53 +03:00
Timur Kristóf	610cc3089c	nir: Carve out nir_lower_samplers from GLSL code. Lowering samplers is needed to produce NIR that can actually be consumed by some gallium drivers, so it doesn't make sense to keep it only in the GLSL code. This commit introduces nir_lower_samplers to compiler/nir, while maintaining the GL-specific function too. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-09-06 12:20:20 +03:00
Gert Wollny	9b9e1de90e	radeonsi: Release storage for smda_uploads when the context is destroyed This fixes a memory leak in the flush code: Direct leak of 128 byte(s) in 1 object(s) allocated from: #0 in __interceptor_realloc .../gcc-8.3.0/libsanitizer/asan/asan_malloc_linux.cc:105 #1 in si_buffer_do_flush_region src/gallium/drivers/radeonsi/si_buffer.c:573 #2 in si_buffer_flush_region src/gallium/drivers/radeonsi/si_buffer.c:608 #3 in si_buffer_flush_region src/gallium/drivers/radeonsi/si_buffer.c:597 Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-06 09:44:24 +02:00
Mauro Rossi	7a6e7803a7	android: mesa: revert "Enable asm unconditionally" This patch partially reverts `20294dc` ("mesa: Enable asm unconditionally, ...") Android makefile build logic needs to disable assembler optimization in 32bit builds to avoid text relocations for libglapi.so shared Fixes the following build error with Android x86 32bit target: [ 0% 4/477] target SharedLib: libglapi (out/target/product/x86/obj/SHARED_LIBRARIES/libglapi_intermediates/LINKED/libglapi.so) FAILED: out/target/product/x86/obj/SHARED_LIBRARIES/libglapi_intermediates/LINKED/libglapi.so ... prebuilts/gcc/linux-x86/x86/x86_64-linux-android-4.9/x86_64-linux-android/bin/ld: warning: shared library text segment is not shareable prebuilts/gcc/linux-x86/x86/x86_64-linux-android-4.9/x86_64-linux-android/bin/ld: error: treating warnings as errors clang-6.0: error: linker command failed with exit code 1 (use -v to see invocation) Fixes: `20294dc` ("mesa: Enable asm unconditionally, now that gen_matypes is gone.") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-09-06 08:48:28 +02:00
Samuel Pitoiset	fa13b2f002	radv/gfx10: always set ballot_mask_bits to 64 The codegen handles it and it adds the correct casts. This fixes a bunch of LLVM validation errors when enabling Wave32 for compute. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-06 08:11:43 +02:00
Caio Marcelo de Oliveira Filho	c0c55bd84f	nir/lower_explicit_io: Handle 1 bit loads and stores Load a 32-bit value then convert to 1-bit. Convert 1-bit to 32-bit value, then Store it. These cases started to appear when we changed Anvil to use derefs for shared memory. v2: Use `bit_size` in a couple of places we were missing. (Jason) Reassign `value` instead of `src[0]`. (Jason) Fixes: `024a46a407` ("anv: use derefs for shared memory access") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-09-05 22:24:09 -07:00
Jason Ekstrand	d15fe8ca82	Revert "intel/fs: Move the scalar-region conversion to the generator." This reverts commit `c0504569ea`. Now that we're doing interpolation lowering in NIR, we can continue to stride the FS input registers directly in the brw_fs_nir code like we did before. This fixes SIMD32 fragment shaders which broke because lower_simd_width depended on the 0 stride to split PLN instructions correctly. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-09-06 03:58:09 +00:00
Jason Ekstrand	47e9743547	intel/fs: Fix FB write inst groups This commit does two things. First, it simplifies the way we compute the FB write group bit. There's no reason to use a ternary because inst->group / 16 can only be 0 or 1. Second, it fixes an order-of-operations bug where the ternary wasn't selecting between (1 << 11) and 0 but between (1 << 11) and 0 | brw_dp_write_desc(...). Fixes: `0d9648416` "intel/compiler: Use generic SEND for Gen7+ FB writes" Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-06 03:58:09 +00:00
Vasily Khoruzhick	aa77fc309a	lima/ppir: don't lower phis to scalar Utgard PP is vec4 architecture, so lowering phis to scalars increases instruction count and potentially interferes with spilling. Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-05 19:29:16 -07:00
Jonathan Marek	feea5986a9	freedreno/a2xx: formats update For render formats, update fd2_pipe2color to only work with HW supported render formats, and remove the format whitelist is_format_supported. This patch enables float render formats (which work). For vertex/texture formats, use a generic function which translates using the bitsize of the channels. Since we fake support for some vertex formats, check for these in is_format_supported to avoid enabling them as sampler formats. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-09-06 02:24:29 +00:00
Jonathan Marek	21dfa8e486	freedreno/a2xx: fix depth gmem restore Use fd_gmem_restore_format() to avoid trying to use unsupported Z24S8/Z16 render formats for gmem restore. Also apply this change to gmem2mem so it doesn't depend on fd2_pipe2color working with depth formats. gmem2mem/mem2gmem also doesn't need to use the swap/swizzle, since dst/src formats are the same. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-09-06 02:24:29 +00:00
Jonathan Marek	88ca73bcd0	freedreno/a2xx: implement polygon offset Fixes failures in the following deqp tests: dEQP-GLES2.functional.polygon_offset.* Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 02:24:29 +00:00
Jonathan Marek	ac4ca24c32	freedreno/a2xx: fix SRC_ALPHA_SATURATE for alpha blend function Fixes failures in the following deqp tests: dEQP-GLES2.functional.fragment_ops.src_alpha_saturate Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 02:24:29 +00:00
Jonathan Marek	80906a12d9	freedreno/a2xx: ir2: update register state in scalar insert Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-09-06 02:24:29 +00:00
Jonathan Marek	588cfe4a2b	freedreno/a2xx: ir2: fix incorrect instruction reordering Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-09-06 02:24:29 +00:00
Jonathan Marek	a6ebd4ab08	freedreno/a2xx: ir2: check opcode on the right instruction in export cp Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 02:24:29 +00:00
Jonathan Marek	19e62fec60	freedreno/a2xx: ir2: fix saturate in cp Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 02:24:29 +00:00
Jonathan Marek	c5e6961a58	freedreno/a2xx: ir2: set lower_fdph The fdph opcode is not supported. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 02:24:29 +00:00
Jonathan Marek	22799787b5	freedreno/a2xx: ir2: remove pointcoord y invert Fixes the following deqp test: dEQP-GLES2.functional.shaders.builtin_variable.pointcoord Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 02:24:29 +00:00
Jonathan Marek	3516a90ab4	freedreno/a2xx: ir2: fix lowering of instructions after float lowering Some instructions generated by int/bool float lowering need to be lowered by opt_algebraic. Fixes: `43dbd7d6` Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 02:24:29 +00:00
Vasily Khoruzhick	517b60dc13	lima/ppir: don't lower vector {b,f}csel to scalar if condition is scalar Utgard PP has vector fcsel operation, but its condition is scalar. Add filtering callback that checks whether {b,f}csel condition is not scalar to lower {b,f}csel to scalar only in this case. Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-06 01:51:28 +00:00
Vasily Khoruzhick	9367d2ca37	nir: allow specifying filter callback in lower_alu_to_scalar Set of opcodes doesn't have enough flexibility in certain cases. E.g. Utgard PP has vector conditional select operation, but condition is always scalar. Lowering all the vector selects to scalar increases instruction number, so we need a way to filter only those ops that can't be handled in hardware. Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-06 01:51:28 +00:00
Rob Clark	f9f7cbc1aa	util: android logging support In particular, it would be nice for failed debug_assert() msgs to show up in logcat. Signed-off-by: Rob Clark <robdclark@chromium.org> Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-06 00:45:11 +00:00
Rob Clark	9baa72b7fc	freedreno/ir3: allow copy propagation for relative This appears to work fine (with the additional constraint of keeping the indirect load in the same block that a0.x was loaded). We can probably lift this restriction on earlier gens after testing. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 00:13:44 +00:00
Rob Clark	d9ad6f54dc	freedreno/ir3: fix cp cmps.s opt Need to use ir3_instr_set_address(), otherwise the instruction might not get added to the indirects table. This becomes a problem when we turn on copy propagation for relative accesses, as check_instr() in the sched pass won't realize there is an indirect consumer of address register load that is ready to be scheduled. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 00:13:44 +00:00
Rob Clark	e59bfc820b	freedreno/ir3: assert that only single address An instruction can reference only a single address register value. Add an assert to catch bugs. Also, address value should also be local to the same block as the instruction. (The one spot where changing the instruction address is actually legit needs to clear the address first.) Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 00:13:44 +00:00
Rob Clark	f94f22e87a	freedreno/ir3: fix mad copy propagation special case After the next patch enabling copy propagation for relative sources, we'll need to dereference the n'th src in valid_flags(), so we actually need to swap the sources before calling valid_flags(). But the logic was already a bit cumbersome, so move it into a helper function. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 00:13:44 +00:00
Rob Clark	1fd6a91d4a	freedreno/ir3: fix addr/pred spilling The live_values and use_count were not being properly updated. This starts triggering problems with the next patch, where we allow copy propagation for RELATIV access. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 00:13:44 +00:00
Rob Clark	50a91fbf87	freedreno/ir3: cleanup "partially const" ubo srcs Move the constant part of the indirect offset into nir intrinsic base. When we have multiple indirect accesses with different constant offsets, this lets other opt passes clean up things to use a single address register value. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 00:13:44 +00:00
Erico Nunes	17bb437ac2	lima/ppir: improve regalloc spill cost calculation Now that spilling ops can be inserted into existing instructions, it makes sense to increase cost to spill registers that would cause the creation of a new instruction. Experimental results showed that penalizing too much due to this caused worse results, however it is beneficial as a tie resolver between registers with the same number of components. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-05 23:29:24 +00:00
Erico Nunes	7b2f195d0b	lima/ppir: optimizations in regalloc spilling code Avoid creating unnecessary instructions for the load/store temp nodes when not required, to further reduce register pressure. The store_temp operation seems to be unable to do any spilling. At least the offline shader seems to never output instructions accessing swizzled components, and attempting to output that in ppir results in errors. So, force spilled registers to allocate a full vec4 register. This seems to be the optimal way as it is possible to always keep stores and temps in a single instruction that can be pipelined. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-05 23:29:24 +00:00
Erico Nunes	f9bf1a95ec	lima/ppir: mark regalloc created ssa unspillable One ssa created in the spilling code in ppir_update_spilled_src was not properly being marked 'spilled', which made it a candidate for future spilling attempts. Since it was being inserted by the spilling code itself, let's mark it unspillable to avoid an infinite spilling loop. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-05 23:29:24 +00:00
Jose Maria Casanova Crespo	a5df0fa0b1	v3d: writes to magic registers aren't RF writes after THREND Shaders must not attempt to write to the register files in the last three instructions, but that doesn't include the magic registers: nop ; nop ; thrsw; ldtmu.- * ERROR * nop ; nop nop ; nop v2: Simplify validation rules. (Eric Anholt) v3: Adjust validation even more. (Eric Anholt) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-05 22:54:13 +01:00
Sergii Romantsov	1dce75c183	intel/dri: finish proper glthread KWin was able to get a NULL context in the call to intelUnbindContext, but the call to _mesa_glthread_finish is not resistant to such a case. The case can be caught with these steps: 1. Create both glx and egl contexts 2. Make glx as current 3. Make egl as current 4. Reset glx context 5. Make egl as current The solution adds proper finishing of the glthread context (the context is taken from the dri-context requested for unbinding, not from the saved current context). Piglit-test: https://gitlab.freedesktop.org/mesa/piglit/merge_requests/87 Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110814 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111271 Fixes: `dca36d5516` (i965: Implement threaded GL support) Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-09-05 09:04:12 -07:00
Connor Abbott	3f5b541fc8	radv: Call nir_propagate_invariant() Without this, invariant qualifiers don't do anything. Together with a fix to the game, this fixes flickering in No Man's Sky. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-09-05 14:05:46 +02:00
Connor Abbott	2f5783bc2b	radeonsi/nir: Don't lower constant arrays to uniforms shader-db results: Totals: SGPRS: 3955968 -> 3954960 (-0.03 %) VGPRS: 2220220 -> 2220092 (-0.01 %) Spilled SGPRs: 11387 -> 11325 (-0.54 %) Spilled VGPRs: 97 -> 97 (0.00 %) Private memory VGPRs: 2528 -> 2528 (0.00 %) Scratch size: 2656 -> 2656 (0.00 %) dwords per thread Code Size: 76002204 -> 75994988 (-0.01 %) bytes LDS: 740 -> 740 (0.00 %) blocks Max Waves: 772776 -> 772787 (0.00 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 16840 -> 15832 (-5.99 %) VGPRS: 16452 -> 16324 (-0.78 %) Spilled SGPRs: 1416 -> 1354 (-4.38 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 2016 -> 2016 (0.00 %) Scratch size: 2040 -> 2040 (0.00 %) dwords per thread Code Size: 953624 -> 946408 (-0.76 %) bytes LDS: 303 -> 303 (0.00 %) blocks Max Waves: 1622 -> 1633 (0.68 %) Wait states: 0 -> 0 (0.00 %) There were a large number of regressions in code size, but they seem to be because NIR unrolls some loop which results in the table being replaced by a bunch of immediates on multiplies etc. -- this bloats code size since the table size is now included, but means that there are fewer loads so it's still a net positive. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-05 12:39:26 +02:00
Connor Abbott	2af431cf7f	gallium: Plumb through a way to disable GLSL const lowering For radeonsi, we will prefer the NIR pass as it'll generate better code (some index calculation and a single load vs. a load, then index calculation, then another load) and oftentimes NIR optimization can kick in and make all the access indices constant. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-05 12:38:46 +02:00
Connor Abbott	49503ae74e	st/nir: Don't lower indirects when linking I believe this was stuck here early because otherwise nir_opt_copy_prop_vars could undo what lower_io_to_temporaries does. However that has since been fixed. Also, we now use scratch for large variables so the comment is stale. On radeonsi these are the shader-db results: Totals: SGPRS: 3955968 -> 3955968 (0.00 %) VGPRS: 2220208 -> 2220220 (0.00 %) Spilled SGPRs: 11387 -> 11387 (0.00 %) Spilled VGPRs: 97 -> 97 (0.00 %) Private memory VGPRs: 2528 -> 2528 (0.00 %) Scratch size: 2656 -> 2656 (0.00 %) dwords per thread Code Size: 76002108 -> 76002204 (0.00 %) bytes LDS: 740 -> 740 (0.00 %) blocks Max Waves: 772779 -> 772776 (-0.00 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 176 -> 176 (0.00 %) VGPRS: 144 -> 156 (8.33 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 12104 -> 12200 (0.79 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 28 -> 25 (-10.71 %) Wait states: 0 -> 0 (0.00 %) The few small regressions are due to nir_opt_large_constants kicking in when indirect lowering happens to result in smaller code after optimization since the array is very simple. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-05 12:38:22 +02:00
Connor Abbott	7d2d7b5d5f	st/nir: Call nir_remove_unused_variables() in the opt loop This prevents regressions when disabling indirect lowering. Sometimes the only use of an input array was copying it to the array created by nir_lower_io_to_temporaries, and without lowering indirects we wouldn't have eliminated the temporary array until after linking, which was too late to remove unused code in the producer. No shader-db changes with radeonsi NIR. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-05 12:37:28 +02:00
Connor Abbott	71a6794200	ac/nir: Enable nir_opt_large_constants vkpipeline-db numbers: Totals: SGPRS: 1740306 -> 1741322 (0.06 %) VGPRS: 1331124 -> 1331712 (0.04 %) Spilled SGPRs: 21201 -> 21316 (0.54 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 256 -> 256 (0.00 %) dwords per thread Code Size: 79022628 -> 78694788 (-0.41 %) bytes LDS: 6500 -> 6500 (0.00 %) blocks Max Waves: 301413 -> 301302 (-0.04 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 53633 -> 54649 (1.89 %) VGPRS: 53000 -> 53588 (1.11 %) Spilled SGPRs: 3454 -> 3569 (3.33 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 5284232 -> 4956392 (-6.20 %) bytes LDS: 2 -> 2 (0.00 %) blocks Max Waves: 4239 -> 4128 (-2.62 %) Wait states: 0 -> 0 (0.00 %) (The biggest VGPR and max wave regression is due to unrolling a loop, which made the scheduler more aggressive, but in this case it's able to effectively hide latency so it's actually probably a win.) shader-db numbers with radeonsi NIR: Totals: SGPRS: 3526496 -> 3526512 (0.00 %) VGPRS: 2198576 -> 2198576 (0.00 %) Spilled SGPRs: 10463 -> 10463 (0.00 %) Spilled VGPRs: 86 -> 86 (0.00 %) Private memory VGPRs: 3182 -> 2528 (-20.55 %) Scratch size: 3308 -> 2640 (-20.19 %) dwords per thread Code Size: 74117280 -> 74106140 (-0.02 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 775846 -> 775844 (-0.00 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 856 -> 872 (1.87 %) VGPRS: 680 -> 680 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 654 -> 0 (-100.00 %) Scratch size: 668 -> 0 (-100.00 %) dwords per thread Code Size: 49652 -> 38512 (-22.44 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 182 -> 180 (-1.10 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-05 12:21:46 +02:00
Connor Abbott	91626d0865	ac/nir: Support load_constant intrinsics Setup a constant global variable that LLVM will stick in a .rodata section and generate PC-relative loads for. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-05 12:21:42 +02:00
Connor Abbott	5dadbabb47	radv/radeonsi: Don't count read-only data when reporting code size We usually use these counts as a simple way to figure out if a change reduces the number of instructions or shrinks an instruction. However, since .rodata sections aren't executed, we shouldn't be counting their size for this analysis. Make the linker return the total executable size, and use it to report the more useful size in both drivers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-05 12:21:35 +02:00
Heinrich Fink	5cc7cc5f17	headers: remove redundant GL token from GL wrapper Removing GL_FRAMEBUFFER_FLIP_Y_MESA token from glheader.h as it is now provided by glext.h Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-09-05 09:26:35 +02:00
Heinrich Fink	e2c88b7cd6	specs: Sync framebuffer_flip_y text with GL registry Sync extension spec of MESA_framebuffer_flip_y to what has been merged upstream in the GL registry. Update now carries the accepted GL extension no. v2: split GL headers update off to separate commit Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-09-05 09:26:30 +02:00
Heinrich Fink	c9a3f4fe40	include: sync GL headers with registry Integrating headers from upstream registry [0] master branch. Effective GL registry commit integrated: 9d534f9312e56c72df763207e449c6719576fd54 Keeping the following quirks local to Mesa: - glext.h: BUILDING_MESA guard (see !1492) - glxext.h: glXQueryGLXPbufferSGIX: 'int' return type (Mesa) vs. 'void' (GL registry) - glxext.h: GLX_RENDERER_ID_MESA is still expected by some mesa tests, even though its token has been removed from the spec (see docs/specs/MESA_query_renderer.spec) - glxext.h: glXGetTransparentIndexSUN / PFNGLXGETTRANSPARENTINDEXSUNPROC argument pTransparentIndex has type 'unsigned long *' (Mesa) vs. 'long *' (GL registry) [0] https://github.com/KhronosGroup/OpenGL-Registry Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-09-05 09:26:15 +02:00
Hal Gentz	55c912883c	clover: Fix build after clang r370122. ../mesa/src/gallium/state_trackers/clover/llvm/invocation.cpp: In function ‘std::unique_ptr<clang::CompilerInstance> {anonymous}::create_compiler_instance(const clover::device&, const std::vector<std::__cxx11::basic_string<char> >&, std::string&)’: ../mesa/src/gallium/state_trackers/clover/llvm/invocation.cpp:203:81: error: no matching function for call to ‘clang::CompilerInvocation::CreateFromArgs(clang::CompilerInvocation&, const char* const*, const char* const*, clang::DiagnosticsEngine&)’ 203 | c->getInvocation(), copts.data(), copts.data() + copts.size(), diag)) | ^ In file included from /opt/llvm64/include/clang/Frontend/CompilerInstance.h:15, from ../mesa/src/gallium/state_trackers/clover/llvm/codegen.hpp:37, from ../mesa/src/gallium/state_trackers/clover/llvm/invocation.cpp:49: /opt/llvm64/include/clang/Frontend/CompilerInvocation.h:157:15: note: candidate: ‘static bool clang::CompilerInvocation::CreateFromArgs(clang::CompilerInvocation&, llvm::ArrayRef<const char*>, clang::DiagnosticsEngine&)’ 157 | static bool CreateFromArgs(CompilerInvocation &Res, | ^~~~~~~~~~~~~~ /opt/llvm64/include/clang/Frontend/CompilerInvocation.h:157:15: note: candidate expects 3 arguments, 4 provided Signed-off-by: Hal Gentz <zegentzy@protonmail.com> Reviewed-by: Aaron Watry <awatry@gmail.com>	2019-09-04 22:29:52 -05:00
Vinson Lee	e716a9e213	scons: Add coroutines component to build. Fixes: `d32690b43c` ("gallivm: add coroutine pass manager support") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-09-04 20:05:43 -07:00
Eric Anholt	cc3c217ce0	gallium/osmesa: Move 565 format selection checks where the rest are. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-04 16:43:36 -07:00
Eric Anholt	9e7eb9780a	gallium/osmesa: Fix a race in creating the stmgr. Noticed while looking at other OSMesa bugs. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-04 16:43:36 -07:00
Eric Anholt	281466332b	gallium/osmesa: Introduce a test. Given that we occasionally touch this code and probably nobody really wants to think about it, introduce a minimal test so that we know we haven't completely broken OSMesa. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-04 16:43:36 -07:00
Dylan Baker	d89d075589	docs: Mark 19.2.0-rc2 as done and push back rc3 and rc4/final	2019-09-04 16:00:02 -07:00
Hal Gentz	1591d1fee5	glx: Fix SEGV due to dereferencing a NULL ptr from XCB-GLX. When run in optirun, applications that linked to `libGLX.so` and then proceeded to querying Mesa for extension strings caused a SEGV in Mesa. `glXQueryExtensionsString` was calling a chain of functions that eventually led to `__glXQueryServerString`. This function would call `xcb_glx_query_server_string` then `xcb_glx_query_server_string_reply`. The latter for some unknown reason returned `NULL`. Passing this `NULL` to `xcb_glx_query_server_string_string_length` would cause a SEGV as the function tried to dereference it. The reason behind the function returning `NULL` is yet to be determined, however, simply checking that the ptr is not `NULL` resolves this. A similar check has been added to `__glXGetString` for completeness sake, although not immediately necessary. In addition to that, we stumbled into a similar problem in `AllocAndFetchScreenConfigs` which tries to access the configs to free them if `__glXQueryServerString` fails. This, of course, SEGVs, because the configs are yet to have been allocated. Simply continuing past the configs if their config ptrs are `NULL` resolves this. We also switch to `calloc` to make sure that the config ptrs are `NULL` by default, and not some uninitialized value. Cc: mesa-stable@lists.freedesktop.org Fixes: `24b8a8cfe8` "glx: implement __glXGetString, hide __glXGetStringFromServer" Fixes: `cb3610e37c` "Import the GLX client side library, formerly from xc/lib/GL/glx. Build it " Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Hal Gentz <zegentzy@protonmail.com>	2019-09-04 16:00:10 +00:00
Adam Jackson	9acb94b623	egl: Enable 10bpc EGLConfigs for platform_{device,surfaceless} It's somewhat annoying that these are so similar for so little benefit. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-09-04 11:39:57 -04:00
Neil Roberts	95927c414f	glsl: Store the precision for a function return type The precision for a function return type is now stored in ir_function_signature. This will later be useful to implement mediump to float16 lowering. In the meantime it is also useful to catch errors where a function is redeclared with a different precision. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-09-04 12:41:20 +02:00
Dave Airlie	3a7e92dac5	docs: add llvmpipe features for fb_no_attach and compute shaders Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	c0521ecffb	llvmpipe: enable compute shaders if LLVM has coroutines Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	6453a22612	llvmpipe: add local memory allocation path Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	4e70970507	llvmpipe: add compute shader parameter fetching support Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	0b51e73de2	llvmpipe: add compute shader images support Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	45a8cf95f2	llvmpipe: add ssbo support to compute shaders Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	6ea8e9b415	llvmpipe: add compute sampler + sampler view support. This is ported from the fragment shader code. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	4ca40cc3dc	llvmpipe: add support for compute constant buffers. This is mostly ported from the fragment shader code. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	775fa81d7b	llvmpipe: add compute pipeline statistics support. This just adds the CS invocations counter. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	50fde5b208	llvmpipe: add grid launch This adds the dispatch code. It creates a job for the number of blocks in the grid, and dispatches them to the threadpool implementation. The threadpool then calls the JIT code to execute the coroutines. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	b320830bbd	llvmpipe: add compute shader generation. This creates the coroutine execution environment and the main compute shaders that get executed inside it. Each compute shader block is executed in its own coroutine execution shader, with each "thread" being a coroutine executed inside it in sequence. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	6ea41df94c	llvmpipe: introduce variant building infrastructure. This doesn't actually build any of the shaders yet, but just builds up the framework necessary to start building the shaders and variants. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	fc01fafdbc	llvmpipe: introduce new state dirty tracking for compute. Compute doesn't share dirty state with the fragment pipeline so create a separate path for it. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	a6f6ca37c8	llvmpipe: add initial shader create/bind/destroy variants framework. This is mostly a port of the fragment shader framework Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	a792c5ae3e	llvmpipe: add compute debug option Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	25f46ae9aa	gallivm: add compute jit interface. This adds the jit interface for compute shaders, it's based on the fragment shader one. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	3879f69b50	llvmpipe: add initial compute state structs These mirror the fragment shader structs, this is just a framework. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	add0b151f5	llvmpipe: introduce compute shader context The compute shader will need its own context like the frag shader has, this just introduces the framework struct and allocates/frees for it in the right places. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	83597ad3f2	gallivm: add barrier support for compute shaders. When the code is executing and hits a barrier, it will suspend the coroutine and return control to the coroutine dispatcher. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	1b24e3ba75	llvmpipe: add compute threadpool + mutex Reviewed-by: Roland Scheidegger <sroland@vmware.com> In order to efficiently run a number of compute blocks, use a threadpool that just allows for jobs with unique sequential ids to be dispatched.	2019-09-04 15:22:20 +10:00
Dave Airlie	e5bf6b7013	gallivm: add support for compute shared memory Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	db6c78f9c8	gallivm: add new compute related intrinsics Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	3312bed7b0	llvmpipe: reorganise jit pointer ordering In order to share the texture/image/sampler code with compute shaders we need to reorg them to be at the front of context same as draw does for vs/gs sharing. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	d32690b43c	gallivm: add coroutine pass manager support coroutines require a proper pass manager, so add the passes to the correct places Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	9cf1340e4f	gallivm: add coroutine support files to gallivm. These wrap the coroutine intrinsics and also add some higher level wrappers around coroutine begin, end and suspend procedures Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	f3f0cbf4f4	gallivm/flow: add counter reset for loops This allows the counter value to be forced to a certain value Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Dave Airlie	6b3c6b91a8	llvmpipe: enable fb no attach Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-09-04 15:22:20 +10:00
Kenneth Graunke	f8887909c6	iris: Report correct number of planes for planar images We were only handling the modifiers case and not counting the number of planes in actual planar images. Fixes Piglit's ext_image_dma_buf_import-export. Fixes: `fc12fd05f5` ("iris: Implement pipe_screen::resource_get_param") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111509 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-09-03 21:55:23 -07:00
Ilia Mirkin	32d458fdff	teximage: ensure that TexSubImage checks format We were previously not doing at least some of the checks. This uses the same logic that is used in glTexImage*. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-04 00:35:45 -04:00
Jan Beich	8e92ce9ba5	gallium/hud: add CPU usage support for DragonFly/NetBSD/OpenBSD Each BSD has slightly different sysctl for retrieving per-CPU times. FreeBSD returns long while NetBSD returns uint64_t. On OpenBSD return type differs between summation and per-CPU times. DragonFly is compatible with FreeBSD. Signed-off-by: Jan Beich <jbeich@FreeBSD.org>	2019-09-03 22:53:15 -04:00
Roman Stratiienko	ef621a73f7	lima: Return fence unconditionally Based on the vc4 implementation. Fixes Android RenderEngine::flush() routine: android.googlesource.com/platform/frameworks/native/+/refs/tags/android-o-mr1-iot-release-smart-clock-fcs/services/surfaceflinger/RenderEngine/RenderEngine.cpp#225 Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-09-04 00:32:04 +00:00
Vasily Khoruzhick	1c1890fa70	lima/ppir: clone uniforms and load_coords into each successor Try more aggressive approach with cloning uniform and coord loads. Uniform load can be inserted into any instruction, so let's do that. ARM site claim that penalty for cache miss is one clock, so we don't lose anything if we merge it into instruction that uses the result. As side effect we can also pipeline it and thus decrease reg pressure. Do the same for varyings that hold texture coords, but for different reason: looks like there's a special path for coords that increases precision if varying that holds it is pipelined. If we don't pipeline it and load coords from a register its precision is fp16 and thus only 10 bits which is not enough to accurately sample textures of size 1024 or larger. Since instruction can hold only one uniform load and one varying load, node_to_instr now creates a move using helper introduced in previous commit if slot is already taken. As side effect of this change we can also try to pipeline texture loads and create a move if attempt fails. Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-04 00:02:13 +00:00
Vasily Khoruzhick	e23fd2c375	lima/ppir: don't assume that load coords gets value from register It can load value from varying directly as well. Also load_regs is the only op that has a source, so add src_num field to load node and set it accordingly. Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-04 00:02:13 +00:00
Vasily Khoruzhick	bd77d19300	lima/ppir: add common helper for creating movs Introduce common helper for creating movs to avoid code duplication Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-04 00:02:13 +00:00
Eric Engestrom	7659c6197f	nir: fix memleak in error path Fixes: `2cf59861a8` ("nir: Add partial redundancy elimination for compares") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-09-04 00:31:53 +01:00
Eric Engestrom	c4969b0a25	freedreno/drm-shim: fix mem leak Fixes: `494ecef6b4` ("freedreno: Add support for drm-shim.") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-04 00:18:37 +01:00
Eric Engestrom	7abf65aedc	anv: fix format string in error message Fixes: `9775894f10` ("anv: Move size check from anv_bo_cache_import() to caller (v2)") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-04 00:13:20 +01:00
Eric Engestrom	1667360f7d	util/os_file: fix double-close() Fixes: `955c63d364` ("util/os_file: resize buffer to what was actually needed") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-09-04 00:11:51 +01:00
Eric Engestrom	43d470404c	egl: fix deadlock in malloc error path Fixes: `cb0980e69a` ("egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-09-04 00:10:18 +01:00
Eric Engestrom	3afe9d798a	ttn: fix 64-bit shift on 32-bit `1` Fixes: `4d0b2c7aaa` ("ttn: Update shader->info as we generate code.") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-09-04 00:01:08 +01:00
Rob Clark	1ef459297c	freedreno/ir3: use uniform base When lowering from ubo, use the constant base field in the load_uniform instruction for the constant part of the offset. Doesn't change much for constant indexing, but this will help for indirect indexing because constant-folding can't completely clean up the result. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-03 14:10:57 -07:00
Rob Clark	305bcdf992	freedreno/drm: fix 64b iova shifts Should shift before splitting 64b iova into dwords Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-03 14:10:57 -07:00
Rob Clark	5ccd5871ed	nir: remove unused constant_fold_state Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-03 14:10:57 -07:00
Eric Anholt	79a5ebe045	freedreno: Fix the type of single-component scaled vertex attrs. This looks like clear copy-and-pasteos, and fixes: dEQP-GLES2.functional.draw.random.40 (on A307 and A630, both tested in the new CI farm) Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-09-03 19:34:09 +00:00
Connor Abbott	f3e978db4d	radeonsi/nir: Remove uniform variable scanning We can get all the information we need from NIR. It's slightly less accurate, but radeonsi doesn't use the extra information. The old code also overcounted atomic counters, which led to problems when everything was used at once. Fixes KHR-GL45.compute_shader.resources-max. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-03 15:55:02 +02:00
Connor Abbott	96c2a2832f	ttn: Fill out more info fields We'll use these in radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-03 15:54:57 +02:00
Connor Abbott	dcc64fcfed	nir: Fix num_ssbos when lowering atomic counters Otherwise it's impossible to know the maximum SSBO index for both internal TGSI shaders from TTN (which don't have any notion of atomic counters and no offset) as well as shaders from GLSL. I fixed everything I could find while grepping for num_ssbos and num_abos, which hopefully is everything (iris was the only user I could find that uses it in a meaningful way). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-09-03 15:54:54 +02:00
Connor Abbott	2abf62d348	ac/nir: Fix gather4 integer wa with unnormalized coordinates This adds a bit of unnecessary code on radeonsi, since whether unnormalized coordinates are used is known at compile time with GL, but I wasn't sure if it was worth the few instructions to plumb everything through, especially for something so rare -- my shader-db doesn't have any instances where this changes anything. Fixes CTS tests I created at https://github.com/cwabbott0/VK-GL-CTS/tree/unnorm-gather-tests Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-03 13:50:54 +00:00
Connor Abbott	c63ccf90df	ac/nir: Rewrite gather4 integer workaround based on radeonsi The workaround was originally written based on amdgpu-pro traces, but since then radeonsi has got its own slightly different version. Use the radeonsi version instead, to be consistent and because it'll be slightly more convenient for handling unnormalized coordinates. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-03 13:50:54 +00:00
Eric Engestrom	5f7d90f2ff	egl: warn user if they set an invalid EGL_PLATFORM Technically, the user might have set EGL_DISPLAY instead of EGL_PLATFORM, but since the former is deprecated let's just mention the latter in the warning message. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-09-03 14:41:43 +01:00
Alyssa Rosenzweig	5cdfccf8a6	panfrost: Remove panfrost_upload This routine was made obsolete over a series of reworks of memory allocation; Tomeu's changes to shader memory allocation finally made this unused as cppcheck noted. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-03 13:55:29 +02:00
Alyssa Rosenzweig	42f0aae874	panfrost: Fix misc. issues flagged by cppcheck Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-03 13:55:29 +02:00
Alyssa Rosenzweig	6bd18bb264	panfrost: Mark (1 << 31) as unsigned I was not aware this incurred undefined behaviour; thank you cppcheck. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-03 13:55:29 +02:00
Alyssa Rosenzweig	a058e90138	pan/midgard: Remove mir_rewrite_index_*_tag These helpers are unused, as flagged by cppcheck. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-03 13:55:29 +02:00
Alyssa Rosenzweig	41ebac638a	pan/midgard: Remove mir_print_bundle In practice, the new post-schedule print is just as useful. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-03 13:55:29 +02:00
Alyssa Rosenzweig	d34e3f7e0a	pan/midgard: Remove cppwrap.cpp It has not been used in a long time; I forgot this file even existed. Flagged by cppcheck. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-03 13:55:21 +02:00
Alyssa Rosenzweig	1a4153b24c	pan/midgard: Fix cppcheck issues Miscellaneous minor issues flagged by cppcheck. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-03 13:54:21 +02:00
Alyssa Rosenzweig	032e21b33e	pan/midgard: Correct issues in disassemble.c cppcheck. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-03 13:54:05 +02:00
Alyssa Rosenzweig	23376c2d35	pan/decode: Add missing format specifier Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-03 13:42:08 +02:00
Alyssa Rosenzweig	dc342aaac3	pan/decode: Use portable format specifier for 64-bit Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-03 13:42:04 +02:00
Alyssa Rosenzweig	bcfcb7e624	pan/decode: Use %zu instead of %d Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-03 13:41:59 +02:00
Alyssa Rosenzweig	d6d6d6327a	pan/decode: Fix uninitialized variables Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-09-03 13:41:34 +02:00
Juan A. Suarez Romero	c1c0386676	docs: update calendar, add news item and link release notes for 19.1.6 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-09-03 13:06:56 +02:00
Juan A. Suarez Romero	b3763dab18	docs: add sha256 checksums for 19.1.6 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `4ec2325dd0`)	2019-09-03 13:04:49 +02:00
Juan A. Suarez Romero	4151947583	docs: add release notes for 19.1.6 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `85c8f88a49`)	2019-09-03 13:04:45 +02:00
Lionel Landwerlin	320b0f66c2	vulkan/overlay: bounce image back to present layout Once we write the overlay to an image to be presented, we must not forget to put it back into present layout. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111401 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-09-03 07:11:58 +00:00
Zhaowei Yuan	9db06a5350	broadcom/vc4: Expand width of dst surface Four bytes of src_surf will be compressed into a 32-bits data and stored into dst_surf, and dst_surf is read as z-order, so its width must be aligned to multiples of 8(4x2) before divided by 2. Signed-off-by: Zhaowei Yuan <zhaowei.yuan@samsung.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111266 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-09-03 08:47:43 +02:00
Vinson Lee	538820ff5f	swr: Fix make_unique build error. swr_shader.cpp: In function ‘void (* swr_compile_gs(swr_context, swr_jit_gs_key&))(HANDLE, HANDLE, SWR_GS_CONTEXT)’: swr_shader.cpp:732:44: error: ‘make_unique’ was not declared in this scope ctx->gs->map.insert(std::make_pair(key, make_unique<VariantGS>(builder.gallivm, func))); ^~~~~~~~~~~ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2019-09-02 14:52:23 -07:00
nia	1900b82dbf	loader: include limits.h for PATH_MAX This is needed to build on illumos. The location of the PATH_MAX definition in limits.h seems to be fairly standard: https://pubs.opengroup.org/onlinepubs/009695399/basedefs/limits.h.html Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-09-02 15:49:34 +00:00
Erik Faye-Lund	2f82d972ab	util: only allow _BitScanReverse64 on 64-bit cpus While the documentation for _BitScanReverse64 on MSDN says that it's available on ARM, this isn't true. It's only available on ARM64. So let's match reality. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com>	2019-09-02 12:45:45 +00:00
Erik Faye-Lund	1de9ba33a2	mesa/x86: improve SSE-checks for MSVC This enables some more SSE optimizations on MSVC builds. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-09-02 12:45:45 +00:00
Erik Faye-Lund	06099d0e0c	util: do not assume MSVC implies SSE This is not true for MSVC on ARM. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-09-02 12:45:45 +00:00
Erik Faye-Lund	2ade1c5cf7	util: fix SSE-version needed for double opcodes This code generates CVTSD2SI, which requires SSE2. So let's fix the required SSE-version. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `5de29ae` (util: try to use SSE instructions with MSVC and 32-bit gcc) Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-09-02 12:45:45 +00:00
Erik Faye-Lund	ee2bc11cc7	mesa/main: remove unused include This has been unused since `183db3a645` ("glsl: move half<->float convertion to util"), Oct 10 2015. Let's drop needlessly including it. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-09-02 12:45:45 +00:00
Samuel Pitoiset	966a455bb9	nir: do not assume that the result of fexp2(a) is always an integral It's only correct when 'a' is an integral greater or equal to 0. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111493 Fixes: `5544b2cbbd` ("nir/algebraic: Use value range analysis to eliminate useless unary ops") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-09-02 09:00:37 +02:00
Lionel Landwerlin	6775a52400	egl: fix platform selection Add missing "device" platform v2: Add the missing platform (Eric) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Jean Hertel <jean.hertel@hotmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111529 Fixes: `d6edccee8d` ("egl: add EGL_platform_device support") Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-09-02 06:28:06 +03:00
Kenneth Graunke	87fa8d9ebc	iris: Lessen texture cache hack flush for blits/copies on Icelake. Lionel found actual documentation for this at long last. Apparently it actually is a sampler cache limitation that was mostly fixed on Icelake. Unfortunately, it seems there are still issues with ASTC and non-ASTC sampler views. Still, we can lessen the flush condition from "format mismatch" to "ASTC mismatch", which eliminates most of the flushing here. We also update the documentation to refer to the workaround name.	2019-08-31 20:17:55 -07:00
Vinson Lee	4771f6bccc	util: Define strchrnul on macOS. strchrnul is not available on macOS. pipe_loader.c:141:14: error: implicit declaration of function 'strchrnul' is invalid in C99 [-Werror,-Wimplicit-function-declaration] next = strchrnul(library_paths, ':'); ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-31 13:26:10 -07:00
Erik Faye-Lund	52af1427c6	gallium/auxiliary/indices: consistently apply start only to input The majority of these only apply the start argument to the input, but a few of them also apply it to the output array. util_primconvert, the only user of this argument, passes a non-zero start argument and does not expect it to be applied to the output; if it is, it will write outside of allocated memory, leading to VRAM corruption. The reason this doesn't seem to have been noticed before is that no driver currently uses util_primconvert to convert a primitive type to itself, which is the case where this was broken. But for Zink, this will no longer be true, because we need to eliminate the use of 8-bit index-buffers. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `28f3f8d413` ("gallium/auxiliary/indices: add start param") Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-08-31 19:45:52 +00:00
Vinson Lee	029b07b2ad	travis: Fail build if any command in if statement fails. Travis is checking the exit code of the entire if statement. Fixes: `64ffc289be` ("travis: add MacOS Scons build") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-08-31 12:20:04 -07:00
Vinson Lee	3664a6600e	swr: Fix build with llvm-9.0 again. Commit `6f7306c029` ("swr/rast: Refactor memory API between rasterizer core and swr") unintentionally removed changes for llvm-9.0. Fixes: `6f7306c029` ("swr/rast: Refactor memory API between rasterizer core and swr") Fixes: `5dd9ad1570` ("swr/rasterizer: Better implementation of scatter") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>	2019-08-31 00:20:40 -07:00
Alyssa Rosenzweig	20237166b6	pan/midgard: Use shared psiz clamp pass We already had a perfectly cromulent pass for this, but one landed in common NIR code so let's switch and lighten our tree. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 16:06:09 -07:00
Alyssa Rosenzweig	0b225f1892	pan/midgard: Remove mir_opt_post_move_eliminate This optimization depended on RA running before scheduling. It therefore no longer applies and is now unused. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:28 -07:00
Alyssa Rosenzweig	d699a17475	pan/midgard: Schedule before RA This is a tradeoff. Scheduling before RA means we don't do RA on what-will-become pipeline registers. Importantly, it means the scheduler is able to reorder instructions, as registers have not been decided yet. Unfortunately, it also complicates register spilling, since the spills themselves won't get bundled optimally and we can only spill twice per ALU bundle (only one spill per bundle allowed here). It also prevents us from eliminating dead moves introduced by register allocation, as they are not dead before RA. The shader-db regressions are from poor spilling choices introduced by the new bundling requirements. These could be solved by the combination of a post-scheduler (to combine adjacent spills into bundles) with a VLIW-aware spill cost calculation. Nevertheless, the change is small enough that I feel it's worth it to eat a tiny shader-db regression for the sake of flexibility. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:28 -07:00
Alyssa Rosenzweig	5e06d90c45	pan/midgard: Handle fragment writeout in RA Rather than using a pile of hacks and awkward constructs in MIR to ensure the writeout parameter gets written into r0, let's add a dedicated shadow register class for writeout (interfering with work register r0) so we can express the writeout condition succinctly and directly. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:27 -07:00
Alyssa Rosenzweig	116b17d2d1	pan/midgard: Do not propagate swizzles into writeout There's no slot for it; you'll end up writing into the void and clobbering stuff. Don't do it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:27 -07:00
Alyssa Rosenzweig	eb3cc20f42	pan/midgard: Fix misc. RA issues When running the register allocator after scheduling, the MIR looks a little different, so we need to extend the RA to handle a few of these extra cases correctly. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:27 -07:00
Alyssa Rosenzweig	e5ba016d3a	pan/midgard: Print MIR by the bundle After scheduling, we still have valid MIR, but we have additional bundling annotations which we would like to keep debug, so print these. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:27 -07:00
Alyssa Rosenzweig	f42cebdd84	pan/midgard: Print branches in MIR Rather than a vague "br.??" line, annotate the branch with its target type (useful for disambiguating discards) and whether it was inverted. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig	59f2cfcbc7	pan/midgard: Remove texture_index This is deadcode. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig	76529836ec	pan/midgard: Cleanup fragment writeout branch I'm not sure if this is strictly necessary but it makes debugging easier and minimizes the diff with the experimental scheduler. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig	cc2ba8efe9	pan/midgard: Add scheduling barriers Scheduling occurs on a per-block basis, strongly assuming that a given block contains at most a single branch. This does not always map to the source NIR control flow, particularly when discard intrinsics are involved. The solution is to allow scheduling barriers, which will terminate a block early in code generation and open a new block. To facilitate this, we need to move some post-block processing to a new pass, rather than relying hackily on the current_block pointer. This allows us to cleanup some logic analyzing branches in other parts of the driver as well, now that the MIR is much more well-formed. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig	19bceb5812	pan/midgard: Track shader quadword count while scheduling This allows multiblock blend shaders to compute constant colour offsets correctly. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig	72cbd2d4e7	pan/midgard: Allow NULL argument in mir_has_arg It's sometimes convenient to call this with no instruction specified. By definition, a missing instruction cannot reference any argument, so let's check for NULL and short-circuit to false. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig	bcc59ff04d	pan/midgard: Improve mir_mask_of_read_components Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig	5377d70292	pan/midgard: Extend mir_special_index to writeout The branch has the writeout specified in its source list, making this special even if it's not explicitly part of r0. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:26 -07:00
Alyssa Rosenzweig	b56399fcd2	pan/midgard: csel_swizzle with mir get swizzle Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:25 -07:00
Alyssa Rosenzweig	28622f9088	pan/midgard: Add mir_insert_instruction*scheduled helpers In order to run register allocation after scheduling, it is sometimes necessary to be able to insert instructions into an already-scheduled program. This is suboptimal, since it forces us to do a worst-case scheduling, but it is nevertheless required for correct handling of spills/fills. Let's add helpers to insert instructions as standalone bundles for use in spilling code. These helpers are minimal -- they only work on load/store ops or moves. They should not be used for anything but register spilling; any other instructions should be added prior to the schedule. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:25 -07:00
Alyssa Rosenzweig	8e369966d7	pan/midgard: Track csel swizzle While it doesn't matter with an unconditional move to the conditional register (r31), when we try to elide that move we'll need to track the swizzle explicitly, and there is no slot for that yet since ALU ops are normally binary. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig	a8eafb0b74	pan/midgard: Ensure fragment writeout is in the final block This ensures the block only has exactly one branch, which makes scheduling happy. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig	cfd5bd2c7d	pan/midgard: Document Midgard scheduling requirements Oh boy. Midgard scheduling is crazy... These are all just the requirements, not even the algorithm yet. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig	d6e4e36566	pan/midgard: Include condition in branch->src[0] This will allow us to reference the condition while scheduling. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig	bd79cddafa	pan/midgard: Add post-schedule iteration helpers Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig	f29d03a1f9	pan/midgard: Fix corner case in RA It doesn't really matter but... meh. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig	d722b60191	pan/midgard: Add OP_IS_CSEL_V helper ..to distinguish from scalar csel. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig	01316719cf	pan/midgard: Expose mir_get/set_swizzle The scheduler would like to use these. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig	3f757425a4	pan/midgard: Extract instruction sizing helper The scheduler shouldn't need to worry about this. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:23 -07:00
Alyssa Rosenzweig	bbe2914967	pan/midgard: Factor out mir_is_scalar This helper doesn't need to be in the giant loop. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:23 -07:00
Alyssa Rosenzweig	67909c8ff2	pan/midgard: Count shader-db stats by bundled instructions This does not affect shaders in any way. Rather, it makes the shader-db instruction count recorded in the compiler accurate with the in-order scheduler, matching up with what we calculate from pandecode. Though shaders are the same, instruction counts cannot be compared across this commit for this reason. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 15:50:22 -07:00
Alyssa Rosenzweig	3f9dc97124	freedreno/ir3: Link directly to Sethi-Ullman paper Allow a direct link to the PDF itself from the authors themselves, rather than a paywall splash page. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Rob Clark <robdclark@chromium.org>	2019-08-30 15:50:22 -07:00
Adam Jackson	da5ebe3010	Revert "glx: Unset the direct_support bit for GLX_EXT_import_context" The GLX extension strings are independent of any context, so abusing the direct_support bit to control this extension's visibility is wrong. This reverts commit 079d0717fc896bc8086b037d0ed22642274986c7. Reported-by: Michel Dänzer <michel@daenzer.net> Reviewed-by: Michel Dänzer <michel@daenzer.net>	2019-08-30 17:50:45 -04:00
Boris Brezillon	9087cf7015	panfrost: Add transient BOs to job batches Memory allocated through panfrost_allocate_transient() is likely to come from the transient pool. Let's add the BO backing the allocated memory region to the job batch so the kernel can retain this BO while jobs are executed. In practice that has never been a problem because the transient pool is never shrinked, and even if it was, we still control the lifetime of the job, so there's no reason for this BO to be freed before the GPU is done executing the batch. But it still make sense to add the BO for debugging purpose. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-30 22:13:41 +02:00
Rohan Garg	b2ff2dfc2a	panfrost: protect access to shared bo cache and transient pool Both the BO cache and the transient pool are shared across contexts. Protect access to these with mutexes. Signed-off-by: Rohan Garg <rohan.garg@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-08-30 22:10:49 +02:00
Rohan Garg	6b0dc3d530	panfrost: Jobs must be per context, not per screen Jobs _must_ only be shared across the same context; having the last_job tracked in a screen causes use-after-free issues and memory corruption. Signed-off-by: Rohan Garg <rohan.garg@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-08-30 22:06:54 +02:00
Lepton Wu	bd98470a46	st/mesa: Allow zero as [level|layer]_override This fixes two dEQP tests for virgl: dEQP-EGL.functional.image.create.gles2_cubemap_positive_x_rgba_texture dEQP-EGL.functional.image.render_multiple_contexts.gles2_cubemap_positive_x_rgba8_texture Signed-off-by: Lepton Wu <lepton@chromium.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-30 17:30:53 +00:00
Khaled Emara	6926f56d5b	freedreno/a3xx: fix sysmem <-> gmem tiles transfer Tiling mode was missing from fd3_emit_gmem_restore_tex(). emit_gmem2mem_surf() used LINEAR exclusively. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-08-30 08:54:30 -07:00
Khaled Emara	ed1954ced3	freedreno/a3xx: fix texture tiling parameters * Fix 2D/2DArray/3D tiling parameters: There is a bottom threshold for width and height. * Re-enable tiling for Cubemap, after setting the right parameters. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-08-30 08:54:30 -07:00
Michel Dänzer	8de25ecd6b	gitlab-ci: Use new needs: keyword This way, the test jobs can start running before all build+test jobs have finished, once the meson-main job has. Idea suggested by Daniel Stone on IRC. See https://docs.gitlab.com/ce/ci/directed_acyclic_graph/ and https://docs.gitlab.com/ce/ci/yaml/README.html#needs for details. v2: * Improve commit log (Daniel Stone, Eric Engestrom) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-30 11:27:00 +02:00
Michel Dänzer	42f8d5a531	gitlab-ci: Move up meson-main job definition In order to increase the chance of it running early. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-30 11:25:26 +02:00
Dave Stevenson	873b092e91	broadcom/v3d: Allow importing linear BOs with arbitrary offset/stride. Equivalent of `0c1dd9dee` "broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride." for v3d. Allows YUV buffers with a single buffer and plane offsets to be passed in. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-30 10:53:05 +02:00
Jan Zielinski	2263e6a895	swr/rasterizer: Fix GS attributes processing Input to GS is just a set of attributes, so remove explicit setup of 'position' which is meaningless for GS input processing. Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-08-30 07:31:45 +00:00
Samuel Pitoiset	6b96c94b5a	radv: keep a pointer to a NIR shader into radv_shader_context This avoids multiple copies for nothing and it's more elegant. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-30 09:33:30 +02:00
Samuel Pitoiset	7b1655ccf3	radv: move setting can_discard to ac_fill_shader_info() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-30 09:33:27 +02:00
Samuel Pitoiset	081561de16	radv: replace ac_nir_build_if by ac_build_ifcc Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-30 09:33:25 +02:00
Samuel Pitoiset	cc3d36b5dd	radv: remove radv_init_llvm_target() helper RADV no longer uses specific LLVM options compared to the common code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-30 09:33:21 +02:00
Samuel Pitoiset	dc27a54c84	radv: remove useless ac_llvm_util.h include from the WSI code Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-30 09:33:19 +02:00
Samuel Pitoiset	6cb455c418	radv: remove unused shader_info parameter in ac_compile_llvm_module() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-30 09:33:17 +02:00
Samuel Pitoiset	9aaca90123	radv: remove some unused fields from radv_shader_context Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-30 09:33:15 +02:00
Samuel Pitoiset	8d44f83844	radv: move lowering PS inputs/outputs at the right place At shader creation, just after NIR linking. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-30 09:29:31 +02:00
Samuel Pitoiset	151d6990ec	radv: gather info about PS inputs in the shader info pass It's the right place to do that. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-30 09:29:29 +02:00
Samuel Pitoiset	9f2fd23f99	ac: drop now useless lookup_interp_param from ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-30 08:23:56 +02:00
Samuel Pitoiset	a63719db6a	ac: import linear/perspective PS input parameters from radv/radeonsi Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-30 08:23:54 +02:00
Krzysztof Raszkowski	8be51061ec	util: Add unreachable() definition for clang compiler. Without an unreachable() definition, clang throws return-type errors in many places in mesa code. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-30 05:50:21 +00:00
Nataraj Deshpande	e3f54cb0c1	egl/android: Enable HAL_PIXEL_FORMAT_RGBA_FP16 format The patch adds support for 64 bit HAL_PIXEL_FORMAT_RGBA_FP16 for android platform. Fixes android.graphics.cts.BitmapColorSpaceTest#test16bitHardware which failed in egl due to "Unsupported native buffer format 0x16" on chromebooks. Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-08-29 23:16:08 +00:00
Dave Airlie	a69ae76cc8	gallivm: disable accurate cube corner for integer textures. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111511 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-30 08:27:16 +10:00
Pierre-Eric Pelloux-Prayer	47cc660d9c	glsl: replace 'x + (-x)' with constant 0 This fixes a hang in shadertoy for radeonsi where a buffer was initialized with: value -= value with value being undefined. In this case LLVM replaces the operation with an assignment to NaN. Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111241 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-29 17:48:49 -04:00
Thong Thai	8d03a6b700	radeonsi: add JPEG decode support for VCN 2.0 devices Signed-off-by: Thong Thai <thong.thai@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2019-08-29 17:27:35 -04:00
Thong Thai	2a3a560407	Revert "radeonsi: don't emit PKT3_CONTEXT_CONTROL on amdgpu" This reverts commit `5a2e65be89`. Even though CONTEXT_CONTROL is emitted by the kernel, CONTEXT_CONTROL still needs to be emitted by the UMD, or else the driver will hang. Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Thong Thai <thong.thai@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-29 17:27:15 -04:00
Ian Romanick	9ad4a2eac5	nir/range-analysis: Add a lot more assertions about the contents of tables v2: Update several of the comments. Drop some redundant uses of ASSERT_UNION_OF_OTHERS_MATCHES_UNKNOWN_*_SOURCE source. Suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-29 13:15:53 -07:00
Ian Romanick	636da12433	nir/range-analysis: Range tracking for fpow One shader from Metro Last Light and the rest from Rochard. In the Rochard cases, something like: min(1.0, max(pow(saturate(x), y), z)) was transformed to saturate(max(pow(saturate(x), y), z)) because the result of the pow must be >= 0. The Metro Last Light case was similar. An instance of min(pow(abs(x), y), 1.0) became saturate(pow(abs(x), y)) v2: Fix some comments. Suggested by Caio. v3: Fix setting is_integral when the exponent might be negative. See also Mesa MR !1778. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16280670 -> 16280659 (<.01%) instructions in affected programs: 1130 -> 1119 (-0.97%) helped: 11 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.72% max: 1.43% x̄: 1.03% x̃: 0.97% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -1.19% -0.86% Instructions are helped. total cycles in shared programs: 367168430 -> 367168270 (<.01%) cycles in affected programs: 10281 -> 10121 (-1.56%) helped: 10 HURT: 1 helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17 helped stats (rel) min: 1.31% max: 2.43% x̄: 1.79% x̃: 1.70% HURT stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10 HURT stats (rel) min: 3.10% max: 3.10% x̄: 3.10% x̃: 3.10% 95% mean confidence interval for cycles value: -20.06 -9.04 95% mean confidence interval for cycles %-change: -2.36% -0.32% Cycles are helped.	2019-08-29 13:15:53 -07:00
Ian Romanick	7dba7df5e5	nir/range-analysis: Handle constants in nir_op_mov just like nir_op_bcsel I discovered this while looking at a shader that was hurt by some other work I'm doing. When I examined the changes, I was confused that one instance of a comparison that was used in a discard_if was (incorrectly) eliminated, while another instance used by a bcsel was (correctly) not eliminated. I had to use NIR_PRINT=true to see exactly where things went wrong. A bunch of shaders in Goat Simulator, Dungeon Defenders, Sanctum 2, and Strike Suit Zero were impacted. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Fixes: `405de7ccb6` ("nir/range-analysis: Rudimentary value range analysis pass") All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16280659 -> 16281075 (<.01%) instructions in affected programs: 21042 -> 21458 (1.98%) helped: 0 HURT: 136 HURT stats (abs) min: 1 max: 9 x̄: 3.06 x̃: 3 HURT stats (rel) min: 1.16% max: 6.12% x̄: 2.23% x̃: 2.03% 95% mean confidence interval for instructions value: 2.93 3.19 95% mean confidence interval for instructions %-change: 2.08% 2.37% Instructions are HURT. total cycles in shared programs: 367168270 -> 367170313 (<.01%) cycles in affected programs: 172020 -> 174063 (1.19%) helped: 14 HURT: 111 helped stats (abs) min: 2 max: 80 x̄: 21.21 x̃: 9 helped stats (rel) min: 0.10% max: 4.47% x̄: 1.35% x̃: 0.79% HURT stats (abs) min: 2 max: 584 x̄: 21.08 x̃: 5 HURT stats (rel) min: 0.12% max: 17.28% x̄: 1.55% x̃: 0.40% 95% mean confidence interval for cycles value: 5.41 27.28 95% mean confidence interval for cycles %-change: 0.64% 1.81% Cycles are HURT.	2019-08-29 13:15:53 -07:00
Ian Romanick	0b4782fccd	nir/range-analysis: Fix incorrect fadd range result for (ne_zero, ne_zero) Found by inspection. I tried really, really hard to make a test case that would trigger this problem, but I was unsuccessful. It's very hard to get an instruction to produce a ne_zero result without ne_zero sources. The most plausible way is using bcsel. That proves problematic because bcsel interprets its sources as integers, so it cannot currently be used to "clean" values for floating point instructions. No shader-db changes on any Intel platform. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Fixes: `405de7ccb6` ("nir/range-analysis: Rudimentary value range analysis pass")	2019-08-29 13:15:53 -07:00
Ian Romanick	ef2e235252	nir/range-analysis: Adjust result range of multiplication to account for flush-to-zero Fixes piglit tests (new in piglit!110): - fs-underflow-fma-compare-zero.shader_test - fs-underflow-mul-compare-zero.shader_test v2: Add back part of comment accidentally deleted. Noticed by Caio. Remove is_not_zero function as it is no longer used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308 Fixes: `fa116ce357` ("nir/range-analysis: Range tracking for ffma and flrp") Fixes: `405de7ccb6` ("nir/range-analysis: Rudimentary value range analysis pass") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16278465 -> 16279492 (<.01%) instructions in affected programs: 16765 -> 17792 (6.13%) helped: 0 HURT: 23 HURT stats (abs) min: 7 max: 275 x̄: 44.65 x̃: 8 HURT stats (rel) min: 1.15% max: 17.51% x̄: 4.23% x̃: 1.62% 95% mean confidence interval for instructions value: 9.57 79.74 95% mean confidence interval for instructions %-change: 1.85% 6.61% Instructions are HURT. total cycles in shared programs: 367135159 -> 367154270 (<.01%) cycles in affected programs: 279306 -> 298417 (6.84%) helped: 0 HURT: 23 HURT stats (abs) min: 13 max: 6029 x̄: 830.91 x̃: 54 HURT stats (rel) min: 0.17% max: 45.67% x̄: 7.33% x̃: 0.49% 95% mean confidence interval for cycles value: 100.89 1560.94 95% mean confidence interval for cycles %-change: 0.94% 13.71% Cycles are HURT. total spills in shared programs: 8870 -> 8869 (-0.01%) spills in affected programs: 19 -> 18 (-5.26%) helped: 1 HURT: 0 total fills in shared programs: 21904 -> 21901 (-0.01%) fills in affected programs: 81 -> 78 (-3.70%) helped: 1 HURT: 0 LOST: 0 GAINED: 1 On Broadwell, a shader was hurt for spills / fills instead of helped. No changes on any earlier platforms.	2019-08-29 13:15:53 -07:00
Ian Romanick	33ad2bab4b	nir/range-analysis: Adjust result range of exp2 to account for flush-to-zero Fixes piglit tests (new in piglit!110): - fs-underflow-exp2-compare-zero.shader_test Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308 Fixes: `405de7ccb6` ("nir/range-analysis: Rudimentary value range analysis pass") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Most of the shaders affected are, unsurprisingly, in Unigine Heaven. All Gen6+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16278207 -> 16278465 (<.01%) instructions in affected programs: 11374 -> 11632 (2.27%) helped: 0 HURT: 58 HURT stats (abs) min: 2 max: 13 x̄: 4.45 x̃: 4 HURT stats (rel) min: 0.54% max: 4.11% x̄: 2.42% x̃: 2.82% 95% mean confidence interval for instructions value: 3.77 5.13 95% mean confidence interval for instructions %-change: 2.19% 2.64% Instructions are HURT. total cycles in shared programs: 367134284 -> 367135159 (<.01%) cycles in affected programs: 81207 -> 82082 (1.08%) helped: 17 HURT: 36 helped stats (abs) min: 6 max: 356 x̄: 90.35 x̃: 6 helped stats (rel) min: 0.69% max: 21.45% x̄: 5.71% x̃: 0.78% HURT stats (abs) min: 4 max: 235 x̄: 66.97 x̃: 16 HURT stats (rel) min: 0.35% max: 27.58% x̄: 5.34% x̃: 1.09% 95% mean confidence interval for cycles value: -20.36 53.38 95% mean confidence interval for cycles %-change: -1.08% 4.67% Inconclusive result (value mean confidence interval includes 0). No changes on any earlier platforms.	2019-08-29 13:15:53 -07:00
Ian Romanick	e07248d2a8	nir/algebraic: Clean up value range analysis-based optimizations Fix the a / b ordering in some compares. Delete duplicate patterns. Add a table explaining things. While I was cleaning this up, I managed to confuse myself. The table helped sort that out. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-29 13:15:52 -07:00
Ian Romanick	ccb236d1bc	nir/algebraic: Mark some value range analysis-based optimizations imprecise This didn't fix bug #111308, but it was found while trying to find the actual cause of that bug. Fixes piglit tests (new in piglit!110): - fs-fract-of-NaN.shader_test - fs-lt-nan-tautology.shader_test - fs-ge-nan-tautology.shader_test No shader-db changes on any Intel platform. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308 Fixes: `b77070e293` ("nir/algebraic: Use value range analysis to eliminate tautological compares") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-29 13:15:52 -07:00
Kenneth Graunke	30b9ed92ea	iris: Fix partial fast clear checks to account for miplevel. We enabled fast clears at level > 0, but didn't minify the dimensions when comparing the box size, so we always thought it was a partial clear and as a result never actually enabled any. This eliminates some slow clears in Civilization VI, but they are mostly during initialization and not the main rendering. Thanks to Dan Walsh for noticing we had too many slow clears. Fixes: `393f659ed8` ("iris: Enable fast clears on other miplevels and layers than 0.") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-08-29 11:27:16 -07:00
Rohan Garg	394192fcee	panfrost: Remove unused argument from panfrost_drm_submit_vs_fs_job() is_scanout is not used anywhere and can be inferred within panfrost_drm_submit_vs_fs_job() if required. Signed-off-by: Rohan Garg <rohan.garg@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-08-29 19:03:17 +02:00
Kenneth Graunke	fda9fb8dcd	iris: Actually describe bo_reuse driconf option Otherwise it doesn't exist and can't be parsed, so everything dies at screen init time. Fixes: `6dc4ddc5f8` ("iris: use driconf for 'bo_reuse' parameter")	2019-08-29 09:40:34 -07:00
Tomeu Vizoso	aace7d3500	panfrost/ci: Print only regressions Some functionality has been added to deqp-volt to only print regressions, so update our version of it and use the new options. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-29 17:12:04 +02:00
Roland Scheidegger	332b21db55	gallivm: use fallback code for mul_hi with llvm >= 7.0 LLVM 7.0 ditched the pmulu intrinsics. This is only a trivial patch to use the fallback code instead. It'll likely produce atrocious code since the pattern doesn't match what llvm itself uses in its autoupgrade paths, hence the pattern won't be recognized. Should fix https://bugs.freedesktop.org/show_bug.cgi?id=111496 Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-29 16:55:49 +02:00
Samuel Pitoiset	b650ecfe31	radv/gfx10: compute the LDS size for exporting PrimID for VS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-29 16:08:37 +02:00
Jan Zielinski	e64091ebd4	swr/rasterizer: Enable ARB_fragment_layer_viewport Added loading gl_Layer and gl_ViewportIndex variables to Pixel Shader context. Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-08-29 12:09:05 +02:00
Tapani Pälli	6dc4ddc5f8	iris: use driconf for 'bo_reuse' parameter Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-29 09:33:52 +03:00
Tapani Pälli	b65de51dcf	i965: initialize bo_reuse when creating brw_bufmgr Fixes a possible data race spotted while debugging on other EGL related failures where glFinish and eglCreateContext are going on at the same time: ==11558== Possible data race during read of size 1 at 0x5E78CD0 by thread #23 ==11558== Locks held: 1, at address 0x5E77CA8 ==11558== at 0x61B71D4: bo_alloc_internal (brw_bufmgr.c:639) ==11558== by 0x61B7328: brw_bo_alloc (brw_bufmgr.c:669) ==11558== by 0x61EF975: recreate_growing_buffer (intel_batchbuffer.c:231) ==11558== by 0x61EFAAE: intel_batchbuffer_reset (intel_batchbuffer.c:255) ==11558== by 0x61EFB85: intel_batchbuffer_reset_and_clear_render_cache (intel_batchbuffer.c:280) ==11558== by 0x61F0507: brw_new_batch (intel_batchbuffer.c:551) ==11558== by 0x61F12C1: _intel_batchbuffer_flush_fence (intel_batchbuffer.c:888) ==11558== by 0x61BDD6B: intel_glFlush (brw_context.c:296) ==11558== by 0x61BDDB9: intel_finish (brw_context.c:307) ==11558== by 0x623831B: _mesa_Finish (context.c:1906) ==11558== by 0x46D556: deqp::egl::GLES2ThreadTest::Operation::execute(tcu::ThreadUtil::Thread&) ==11558== by 0x721502: tcu::ThreadUtil::Thread::run() ==11558== ==11558== This conflicts with a previous write of size 1 by thread #26 ==11558== Locks held: 1, at address 0x5D09878 ==11558== at 0x61B98A9: brw_bufmgr_enable_reuse (brw_bufmgr.c:1541) ==11558== by 0x61BF09D: brw_process_driconf_options (brw_context.c:854) ==11558== by 0x61BF6CA: brwCreateContext (brw_context.c:993) ==11558== by 0x621181F: driCreateContextAttribs (dri_util.c:473) ==11558== by 0x53FE87B: dri2_create_context (egl_dri2.c:1388) ==11558== by 0x53EE7BE: eglCreateContext (eglapi.c:807) ==11558== by 0x5C8AB9: eglw::FuncPtrLibrary::createContext(void, void, void, int const) const ==11558== by 0x46E027: deqp::egl::GLES2ThreadTest::CreateContext::exec(tcu::ThreadUtil::Thread&) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-29 09:33:13 +03:00
Kenneth Graunke	90ca709f6d	iris: Don't auto-flush/dirty on transfer unmap for coherent buffers When u_upload_mgr fills up a buffer, it unmaps and destroys it. Our unmap function was automatically performing the equivalent of a FlushMappedBufferRange call in this case. Because the buffer mapping is persistent and coherent, we don't actually do any flushing when we do the rest of the writes to the buffer - we were just doing one final one at the end. But we would be using the uploaded contents on the GPU the whole time. This certainly shouldn't be necessary for streaming buffers, and if such flushing and dirtying is necessary for coherent buffers, this is wildly insufficient. Drops a small number of constant packets and PIPE_CONTROL flushes from most benchmarks that I've looked at. Doesn't seem to make much of an impact on performance, however. Thanks to Felix Degrood for noticing that we were emitting more 3DSTATE_CONSTANT_* packets than we needed to.	2019-08-28 22:11:05 -07:00
Timur Kristóf	5f3eb6ef29	st/nine: Properly initialize GLSL types for NIR shaders. NIR shaders use GLSL types (note: these live outside libglsl), and nine needs to properly initialize these just like the other state trackers. This fixes an assertion failure when TTN is used. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>	2019-08-28 23:31:34 +00:00
Rob Clark	6167a63839	freedreno/ir3: do better job of marking convergence points Fixes: dEQP-GLES3.functional.shaders.switch.switch_in_do_while_loop_dynamic_vertex dEQP-GLES3.functional.shaders.switch.switch_in_do_while_loop_dynamic_fragment Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-28 15:25:27 -07:00
Rob Clark	6af70aa2b4	freedreno/ir3: maintain predecessors/successors While resolving jumps to skip intermediate jumps from the structured CFG, maintain the successors and predecessors correctly. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-28 15:25:25 -07:00
Rob Clark	06bc4875ff	freedreno/ir3: convert block->predecessors to set Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-28 15:25:19 -07:00
Jordan Justen	cfbde3282d	pci_id_driver_map: Support preferring iris over i965 This adds the ability for Intel devices to: * Only load on i965 * Only load on iris * First attempt i965, and try iris next * First attempt iris, and try i965 next Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-28 13:38:34 -07:00
Jordan Justen	107c22945f	i965: Exit with error if gen12+ is detected For OpenGL support on gen12, the iris driver should be used. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-28 13:38:34 -07:00
Tapani Pälli	d8dd9a245e	anv: build libanv for gen12 in android build Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-28 13:38:34 -07:00
Jordan Justen	181be14d43	anv: Build for gen12 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-28 13:38:34 -07:00
Tapani Pälli	da603c066e	iris: build android libmesa_iris for gen12 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-28 13:38:34 -07:00
Jordan Justen	44ab7c265f	iris: Build for gen12 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-28 13:38:33 -07:00
Jordan Justen	4d2e390a65	intel/l3: Don't assert on gen12 (use gen11 config temporarily) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-28 13:38:33 -07:00
Jordan Justen	bdeb498070	intel/compiler: Disable compaction on gen12 for now Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-28 13:38:33 -07:00
Tapani Pälli	d7a1140c45	intel/isl: build android libmesa_isl for gen12 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-28 13:38:33 -07:00
Jordan Justen	6d63fd8a69	intel/isl: Build gen12 using gen11 code paths Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-28 13:38:33 -07:00
Tapani Pälli	7319003a74	intel/genxml: generate pack files for gen12 on android builds Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-28 13:38:33 -07:00
Jordan Justen	b42a05b436	intel/genxml: Build gen12 genxml Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-28 13:38:33 -07:00
Jordan Justen	531563b64b	intel/genxml: Add gen12.xml as a copy of gen11.xml Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-28 13:38:32 -07:00
Jordan Justen	2323536ee7	intel/genxml: Run sort_xml.sh to tidy gen9.xml and gen11.xml Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-28 13:38:32 -07:00
Jordan Justen	70566a87eb	intel/genxml/gen11: Add spaces in EnableUnormPathInColorPipe Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-28 13:38:32 -07:00
Jordan Justen	acce7d3460	intel/genxml: Handle field names with different spacing/hyphen If a field name differs slightly between two generations then this change will still add the fields into the same group. For example, these will be treated as equal: * "Software Exception" and "Software Exception" * "Per Thread" and "Per-Thread" Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-28 13:38:28 -07:00
Eric Anholt	973b49386c	freedreno/a6xx: Fix non-mipmap filtering selection. We were clamping the LOD to force non-mipmap filtering, but that means that the HW doesn't get to select between the min and mag filters. Setting MIPFILTER_LINEAR_FAR appears to force non-mipmap filtering. Fixes all failures in dEQP-GLES2.functional.texture.filtering.2d.* Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-08-28 13:14:41 -07:00
Ian Romanick	b418269d7d	intel/compiler: Request bitfield_reverse lowering on pre-Gen7 hardware See the previous commit for the explanation of the Fixes tag. Hurts 21 shaders in shader-db. All of the hurt shaders are in Unreal Engine 4 tech demos. Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `7afa26d4e3` ("nir: Add lowering for nir_op_bitfield_reverse.")	2019-08-28 11:39:29 -07:00
Ian Romanick	d3fd1c761a	nir/algebraic: Don't optimize open-coded bitfield reverse when lowering is enabled This caused a problem on Sandybridge where an open-coded bitfieldReverse() function could be optimized to a nir_op_bitfield_reverse that would generate an unsupported BFREV instruction in the backend. This was encountered in some Unreal4 tech demos in shader-db. The bug was not previously noticed because we don't actually try to run those demos on Sandybridge. The fixes tag is a bit of a lie. The actual bug was introduced about 26,000 commits earlier in `371c4b3c48` ("nir: Recognize open-coded bitfield_reverse."). Without the NIR lowering pass, the flag needed to avoid the optimization does not exist. Hopefully nobody will care to fix this on an earlier Mesa release. Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `7afa26d4e3` ("nir: Add lowering for nir_op_bitfield_reverse.")	2019-08-28 11:38:51 -07:00
Eric Anholt	4662b70d23	gallium: Don't emit identical endian-dependent pack/unpack code. Reduces the size of the u_format_table.c file by 140k (out of 1.64M) and makes me less confused about endianness in gallium. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-28 10:39:36 -07:00
Eric Anholt	d17ff2f7f1	gallium: Fix big-endian addressing of non-bitmask array formats. The formats affected are: - LA x (16_FLOAT, 32_FLOAT, 32_UINT, 32_SINT) - R8G8B8 x (UNORM, SNORM, SRGB, USCALED, SSCALED, UINT, SINT) - RG/RGB/RGBA x (64_FLOAT, 32_FLOAT, 16_FLOAT, 32_UNORM, 32_SNORM, 32_USCALED, 32_SSCALED, 32_FIXED, 32_UINT, 32_SINT) - RGB/RGBA x (16_UNORM, 16_SNORM, 16_USCALED, 16_SSCALED, 16_UINT, 16_SINT) - RGBx16 x (UNORM, SNORM, FLOAT, UINT, SINT) - RGBx32 x (FLOAT, UINT, SINT) - RA x (16_FLOAT, 32_FLOAT, 32_UINT, 32_SINT) The updated st_formats.c unit test checks that the formats affected by this change are all array formats in the equivalent Mesa format (if any). Mesa's array format definition is clear: the value stored is an array (increasing memory address) of values of the channel's type. It's also the only thing that makes sense for the RGB types, or very large types like RGBA64_FLOAT (A should not move to the low address because the cpu is BE). Acked-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Adam Jackson <ajax@redhat.com> Tested-by: Matt Turner <mattst88@gmail.com> (unit tests on BE) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-28 10:39:36 -07:00
Eric Anholt	0547fdd7ee	gallium: Drop a bit of dead code from the pack/unpack python. Nothing used this var. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-28 10:39:36 -07:00
Eric Anholt	309ef968cd	gallium: Drop the useless union wrapper on pack/unpack. Nothing accessed the .value field, just the .chan. Unwrap all the code from the union, for clarity (and 13k less generated code). Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-28 10:39:36 -07:00
Eric Anholt	174240c5e4	gallium: Skip generating the pack/unpack union if we don't use it. Shaves 30k off of the 1.6M .c file, and makes for less noise for me trying to understand how gallium formats actually work. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-28 10:39:36 -07:00
Eric Anholt	7c8cdee0b2	gallium: Fix mesa format name in unit test failure path. We clearly wanted the mesa format here. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-28 10:39:36 -07:00
Boris Brezillon	8709b865ce	panfrost: Reset the damage area on imported resources Reset the damage area in the resource_from_handle() path (as done in panfrost_resource_create()). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-28 17:50:44 +02:00
Boris Brezillon	938c5b0148	panfrost: Use ralloc() to allocate instructions to avoid leaking those objs Instructions attached to blocks are never explicitly freed. Let's use ralloc() to attach those objects to the compiler context so that they are automatically freed when the ctx object is freed. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-28 17:50:01 +02:00
Jose Fonseca	6e01575b68	scons: Make GCC builds stricter. Uses some of the same -Werror options used by Meson, as suggested by Michel Dänzer. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Michel Dänzer <michel@daenzer.net> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-08-28 15:52:07 +01:00
Jose Fonseca	6b2bc8f25e	util: Prevent strcasecmp macro redefinition. MinGW headers already define it. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-08-28 15:52:07 +01:00
Jose Fonseca	46f7b3662f	util: Prevent implicit declaration of function getenv. With MinGW cross compilation. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-08-28 15:52:07 +01:00
Jose Fonseca	7029556398	glx: Fix incompatible function pointer types. I don't know how Meson didn't hit this issue, when it too already uses -Werror=incompatible-pointer-types Fixes: `3dd299c3d5` ("glx: Sync <GL/glxext.h> with Khronos") Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-28 15:52:07 +01:00
Vasily Khoruzhick	200859f45c	lima: fix texture descriptor issues Looks like the initial RE was wrong and some fields have a different purpose. I.e. there's no "disable_mipmap" field, it's actually part of another field that selects mipmap filtering. Also fix layout position. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-28 00:28:38 +00:00
Kenneth Graunke	7e095a4fbf	iris: Drop swizzling parameter from s8_offset. This is always false on Gen8+, no need for dead code and parameters.	2019-08-27 17:11:32 -07:00
Kenneth Graunke	e18cd5452a	mesa: Fix _mesa_float_to_unorm() on 32-bit systems. This fixes the following CTS test on 32-bit systems: GTF-GL46.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_init It does glGetTexImage of a 16-bit SNORM image, requesting 32-bit UNORM data. In get_tex_rgba_uncompressed, we round trip through float to handle image transfer ops for clamping. _mesa_format_convert does: _mesa_float_to_unorm(0.571428597f, 32) which translated to: _mesa_lroundevenf(0.571428597f * 0xffffffffu) which produced different results on 64-bit and 32-bit systems: 64-bit: result = 0x92492500 32-bit: result = 0x80000000 This is because the size of "long" varies between the two systems, and 0x92492500 is too large to fit in a signed 32-bit integer. To fix this, we switch to the new _mesa_i64roundevenf function which always does the 64-bit operation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104395 Fixes: `594fc0f859` ("mesa: Replace F_TO_I() with _mesa_lroundevenf().") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-27 23:57:02 +00:00
Kenneth Graunke	b59914e179	util: Add a _mesa_i64roundevenf() helper. This always returns a int64_t, translating to _mesa_lroundevenf on systems where long is 64-bit, and llrintf where "long long" is needed. Fixes: `594fc0f859` ("mesa: Replace F_TO_I() with _mesa_lroundevenf().") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-27 23:57:02 +00:00
Adam Jackson	163fc11f27	glx: Unset the direct_support bit for GLX_EXT_import_context GLX_EXT_import_context operates only on indirect contexts, a direct context cannot possibly support it. Without this change the extension will appear in the combined GLX extension string even if it is missing from the server string, indicating a lack of required server support.	2019-08-27 22:34:46 +00:00
Daniel Kolesa	1b9fce56c4	util: add auxv based PowerPC AltiVec/VSX detection At least on Linux, we can use the ELF auxiliary vector to detect the presence of AltiVec, VSX and other CPU features without having to go through handling SIGILL, which has various problems of its own. A similar thing is already being done for ARM to detect NEON. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Daniel Kolesa <daniel@octaforge.org>	2019-08-27 14:55:37 -07:00
Kenneth Graunke	23f42f8dcf	intel/compiler: Use new Gen11 headerless RT writes for MRT cases Gen11 adds support for specifying the render target index and src0 alpha present bits in the extended message descriptor. Previously, we had to use a message header for this, requiring extra instructions to write the fields, and two registers of extra payload. Improves performance on my ICL 8x8 frequency locked to 700MHz, on iris: GfxBench5 Manhattan 3.0: 2.13635% +/- 0.159859% (n=5) GfxBench5 Aztec Ruins: 1.57173% +/- 0.128749% (n=5) Synmark2 OglDeferred: 2.86914% +/- 0.191211% (n=10) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-27 14:20:07 -07:00
Kenneth Graunke	0d96484165	intel/compiler: Use generic SEND for Gen7+ FB writes This takes care of generate_fb_write/fire_fb_write/brw_fb_WRITE's stuff earlier in the visitor. It will also make it easier to generate SENDSC messages with indirect extended descriptors in a few patches. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-27 14:20:07 -07:00
Kenneth Graunke	86a63b1098	intel/compiler: Refactor FB write message control setup into a helper. This will be used by visitor code to convert directly to SEND in a bit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-27 14:20:07 -07:00
Kenneth Graunke	b6fe25c7f5	intel/compiler: Handle bits 15:12 in brw_send_indirect_split_message() Annoyingly, these bits exist in some extended message descriptors (in particular render target writes), but they don't have any corresponding bits in the ISA encoding. So we can't use an immediate and have to fall back to an indirect extended descriptor. Thanks to Jason Ekstrand for reminding me that you can still set these bits via an indirect descriptor, even if they don't exist in the ISA. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-27 14:20:07 -07:00
Kenneth Graunke	c8c9c48684	intel/compiler: Fix src0/desc setter ordering src0 vstride and type overlap with bits of the extended descriptor. brw_set_desc() also sets the extended descriptor to 0. So by setting the descriptor, then setting src0, we were accidentally setting a bunch of extended descriptor bits. When using this infrastructure for framebuffer writes (in a future patch), this ended up setting the extended descriptor bit 20, which is "Null Render Target" on Icelake, causing nothing to be written to the framebuffer. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-27 14:20:07 -07:00
Marek Olšák	360cf3c4b0	radeonsi: fix scratch buffer WAVESIZE setting leading to corruption Cc: 19.2 19.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:52:32 -04:00
Marek Olšák	f95a28d361	radeonsi: unbind blend/DSA/rasterizer state correctly in delete functions Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111414 Fixes: `b758eed9c3` ("radeonsi: make sure that blend state != NULL and remove all NULL checking") Cc: 19.2 <mesa-stable@lists.freedesktop.org> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:52:30 -04:00
Marek Olšák	40e5ac45ae	radeonsi: align scratch and ring buffer allocations for faster memory access Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:52:28 -04:00
Marek Olšák	d8f27552f4	radeonsi: consolidate determining VGPR_COMP_CNT for API VS Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	4dde40908f	radeonsi/gfx10: set PA_CL_VS_OUT_CNTL with CONTEXT_REG_RMW to fix edge flags We need two different values of the register, one for NGG and one for legacy, in order to fix edge flags for the legacy pipeline. Passing the ngg flag to emit_clip_regs would be too complicated, so CONTEXT_REG_RMW is used for partial register updates. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	1426acf9e7	radeonsi/gfx10: remove incorrect ngg/pos_writes_edgeflag variables It varies depending on si_shader_key::as_ngg. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	2e94cb6693	radeonsi: add PKT3_CONTEXT_REG_RMW Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	d9a453c747	winsys/amdgpu+radeon: process AMD_DEBUG in addition to R600_DEBUG Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	467df4b90a	radeonsi/gfx10: add AMD_DEBUG=nongg Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	6229b5a058	radeonsi/gfx10: finish up Navi14, add PCI ID Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	73bde2b029	radeonsi/gfx10: always use the legacy pipeline for streamout The best way to prevent GDS hangs is not to use GDS. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	f251fd7bf5	radeonsi/gfx10: don't initialize VGT_INSTANCE_STEP_RATE_0 Only gfx9 and older use it to get InstanceID in VGPR1. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	28f44ee533	radeonsi/gfx10: fix InstanceID for legacy VS+GS Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	e121d75de9	radeonsi/gfx10: add as_ngg variant for VS as ES to select Wave32/64 Legacy GS only works with Wave64. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	f34d023f1a	radeonsi/gfx10: create the GS copy shader if using legacy streamout Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	776f05a307	radeonsi/gfx10: fix the PRIMITIVES_GENERATED query if using legacy streamout Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	cab5b3861d	radeonsi/gfx10: fix tessellation for the legacy pipeline ported from PAL Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	a9bb566955	radeonsi: move some global shader cache flags to per-binary flags Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Marek Olšák	810846e157	radeonsi/gfx10: fix the legacy pipeline by storing as_ngg in the shader cache It could load an NGG shader when we want a legacy shader and vice versa. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-27 16:16:08 -04:00
Kenneth Graunke	6342d43ae9	iris: Delete dead prototype	2019-08-27 13:15:02 -07:00
Boris Brezillon	2734a4951e	Revert "panfrost: Free all block/instruction objects before leaving midgard_compile_shader_nir()" This reverts commit `5882e0def9`. This commit causes a segfault. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-08-27 20:07:28 +02:00
Boris Brezillon	0142dcb990	panfrost: Make sure bundle.instructions[] contains valid instructions Add an assert() in schedule_bundle() to make sure all instruction pointers in bundle.instructions[] are valid. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-27 16:50:52 +02:00
Boris Brezillon	5882e0def9	panfrost: Free all block/instruction objects before leaving midgard_compile_shader_nir() Right now we're leaking all block and instruction objects allocated by the compiler. Let's clean things up before leaving midgard_compile_shader_nir(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-27 16:50:52 +02:00
Boris Brezillon	3ac49f135a	panfrost: Free the instruction object in mir_remove_instruction() To avoid memory leaks. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-27 16:50:52 +02:00
Eric Engestrom	239f7f1c0a	scons: add support for MAJOR_IN_{MKDEV,SYSMACROS} src/gallium/winsys/svga/drm/vmw_screen.c: In function ‘vmw_dev_compare’: src/gallium/winsys/svga/drm/vmw_screen.c:48:12: warning: implicit declaration of function ‘major’ [-Wimplicit-function-declaration] 48 \| return (major((dev_t )key1) == major((dev_t )key2) && \| ^~~~~ src/gallium/winsys/svga/drm/vmw_screen.c:49:12: warning: implicit declaration of function ‘minor’ [-Wimplicit-function-declaration] 49 \| minor((dev_t )key1) == minor((dev_t )key2)) ? 0 : 1; \| ^~~~~ That file (and many others) already has the proper #include with their respective guards, but scons wasn't defining them, resulting in implicit functions being used instead (and an always-true check that's probably breaking something down the line). Note that I'm cheating a bit here because Scons doesn't seem to have a clean way to detect the existence of major() et al. as functions or macros, so I'm taking the shortcut of just detecting the presence of the header and assuming its contents is what we expect. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-By: Jose Fonseca <jfonseca@vmware.com>	2019-08-27 14:03:46 +01:00
Samuel Pitoiset	49f5ddd3ae	radv: make use of has_ls_vgpr_init_bug Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-27 08:04:51 +02:00
Samuel Pitoiset	fd54fc85aa	ac: add has_ls_vgpr_init_bug to ac_gpu_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-27 08:04:47 +02:00
Samuel Pitoiset	1bf2572dff	ac: add has_msaa_sample_loc_bug to ac_gpu_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-27 08:04:44 +02:00
Samuel Pitoiset	021feb1bf6	ac: add rbplus_allowed to ac_gpu_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-27 08:04:41 +02:00
Samuel Pitoiset	20c5db02b5	ac: add has_tc_compat_zrange_bug to ac_gpu_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-27 08:04:36 +02:00
Samuel Pitoiset	b55919cf2a	ac: add has_gfx9_scissor_bug to ac_gpu_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-27 08:04:32 +02:00
Samuel Pitoiset	2b9c371575	ac: add cpdma_prefetch_writes_memory to ac_gpu_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-27 08:04:29 +02:00
Samuel Pitoiset	b027ad66d7	ac: add has_out_of_order_rast to ac_gpu_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-27 08:04:26 +02:00
Samuel Pitoiset	ed720af46d	ac: add has_load_ctx_reg_pkt to ac_gpu_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-27 08:04:22 +02:00
Samuel Pitoiset	63c0b89b8f	ac: add has_rbplus to ac_gpu_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-27 08:04:19 +02:00
Samuel Pitoiset	44a46c09de	ac: add has_dcc_constant_encode to ac_gpu_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-27 08:04:16 +02:00
Samuel Pitoiset	c08401f035	ac: add has_distributed_tess to ac_gpu_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-27 08:04:11 +02:00
Samuel Pitoiset	d62d2840c4	ac: add has_clear_state to ac_gpu_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-27 08:04:05 +02:00
Samuel Pitoiset	af65f9431e	ac: drop llvm8 from some load/store helpers Cleanup. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-27 08:04:00 +02:00
Dave Airlie	e6eb444554	gallivm: fix appveyor build after images changes	2019-08-27 13:36:03 +10:00
Dave Airlie	c501c2cef6	docs: add shader image extensions for llvmpipe v1.1: fix typo in llvmpipe name (ajax) Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:30:25 +10:00
Dave Airlie	b7468f7831	llvmpipe: enable ARB_shader_image_load_store Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:30:22 +10:00
Dave Airlie	6c2fa01b9c	llvmpipe: flush on api memorybarrier. Until we have somewhere we can do better, just hit it with a hammer. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:30:16 +10:00
Dave Airlie	b9bf236c71	gallivm: add memory barrier support Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:30:13 +10:00
Dave Airlie	abfb633968	gallivm: add support for fences api on older llvm Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:30:10 +10:00
Dave Airlie	8b7295f281	llvmpipe: bind vertex/geometry shader images Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:30:06 +10:00
Dave Airlie	2909c654b0	llvmpipe: add fragment shader image support Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:30:04 +10:00
Dave Airlie	dc2357070c	draw: add vs/gs images support Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:30:01 +10:00
Dave Airlie	ceb8d0ac5a	gallivm: add image load/store/atomic support Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:58 +10:00
Dave Airlie	15f7688ac9	gallivm/tgsi: add image interface to tgsi builder This adds the callbacks for the driver/gallium binding for image operations. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:55 +10:00
Dave Airlie	b2be174be2	llvmpipe: introduce image jit type to fragment shader jit. This adds the image type to the fragment shader jit context Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:51 +10:00
Dave Airlie	039a2e3630	draw: add jit image type for vs/gs images. This introduces the jit image type into the jit interface for vertex/geom shaders Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:49 +10:00
Dave Airlie	3c2c232059	llvmpipe: move the fragment shader variant key to dynamic length. This mirrors the vs/gs keys, and will be needed when adding images support. The const changes also mirror how the draw code works (as is needed when we add images) Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:42 +10:00
Dave Airlie	d0381ea149	gallivm: add a basic image limit Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:39 +10:00
Dave Airlie	cf84b46a1c	llvmpipe: handle early test property. Also handle setting late for shaders that use stores Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:33 +10:00
Dave Airlie	a1e8fcef47	gallivm: move first/last level jit texture members. This lets us create an image structure with the same basic types as the texture one. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:31 +10:00
Dave Airlie	e8a445d8b5	gallivm: handle helper invocation (v2) Just invert the exec_mask to get if this is a helper or not. v2: get the bld mask (Roland) Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:28 +10:00
Dave Airlie	fb34369eb5	gallivm: make lp_build_float_to_r11g11b10 take a const src This allows using it with a const src later. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:25 +10:00
Dave Airlie	a8ef6b5755	llvmpipe: refactor jit type creation This just cleans the code up so the texture/sampler type creation can be reused. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:29:21 +10:00
Dave Airlie	1eda49cc3d	gallivm: fix atomic compare-and-swap Not sure how I missed this before, but compswap was hitting an assert here as it is its own special case. Fixes: `b5ac381d8f` ("gallivm: add buffer operations to the tgsi->llvm conversion.") Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-27 12:28:17 +10:00
Paulo Zanoni	848d5e444a	intel/fs: grab fail_msg from v32 instead of v16 when v32->run_cs fails Looks like a copy/paste error. This patch prevents a segfault when running the following on BDW: INTEL_DEBUG=no8,no16,do32 ./deqp-vk -n \ dEQP-VK.subgroups.arithmetic.compute.subgroupmin_dvec4 For the curious, the message we're getting is: CS compile failed: Failure to register allocate. Reduce number of live scalar values to avoid this. Fixes: `864737ce6c` ("i965/fs: Build 32-wide compute shader when needed.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-08-26 14:54:16 -07:00
Alyssa Rosenzweig	c30116a2fa	pan/midgard: Fix invert fusing with r26 The invert wasn't applying (correctly) due to the issues addressed here. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-26 13:43:04 -07:00
Alyssa Rosenzweig	75b6be2435	pan/midgard: Fold ssa_args into midgard_instruction This is just a bit of refactoring to simplify MIR. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-26 13:43:04 -07:00
Eric Anholt	0309fb82ec	gallium: Add the ASTC 3D formats. No driver implements them yet, but this is a long way toward gallium having matching format enums for Mesa formats. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-26 19:44:00 +00:00
Eric Anholt	9d988f9291	gallium: Add block depth to the format utils. I decided not to update nblocks() with a depth arg as the callers wouldn't be doing ASTC 3D. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-26 19:44:00 +00:00
Eric Anholt	530f424735	gallium: Add a block depth field to the u_formats table. To add ASTC 3D compression formats, we need to be able to express the block depth. While I'm touching every line, line up the columns of the CSV again as they've drifted over time. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-26 19:44:00 +00:00
Alyssa Rosenzweig	9c328ea66e	pan/midgard: Add imov->fmov optimization When moving constants, if switching to a floating-point representation doesn't break anything, we'd rather have an fmov than an imov, permitting inlining the constant in many circumstances. total quadwords in shared programs: 3408 -> 3366 (-1.23%) quadwords in affected programs: 1188 -> 1146 (-3.54%) helped: 41 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.02 x̃: 1 helped stats (rel) min: 0.19% max: 25.00% x̄: 9.65% x̃: 11.11% 95% mean confidence interval for quadwords value: -1.07 -0.98 95% mean confidence interval for quadwords %-change: -11.38% -7.93% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-26 11:42:33 -07:00
Alyssa Rosenzweig	0acb5c1774	pan/midgard: Switch constants to uint32 Storing constants as float doesn't make sense when we have integer instructions; better to switch to be integer natively and coerce to/from float rather than the opposite. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-26 11:42:32 -07:00
Kenneth Graunke	2e1be771e4	isl: Don't set UnormPathInColorPipe for integer surfaces. This fixes dEQP-GLES3.functional.texture.specification subtests on iris: - texsubimage3d_depth.depth24_stencil8_2d_array - texsubimage3d_depth.depth32f_stencil8_2d_array - texsubimage3d_depth.depth_component32f_2d_array - texsubimage3d_depth.depth_component24_2d_array - texstorage2d.format.depth24_stencil8_2d - texstorage2d.format.depth32f_stencil8_2d - texstorage2d.format.depth_component24_2d - texstorage2d.format.depth_component32f_2d - texstorage3d.format.depth24_stencil8_2d_array - texstorage3d.format.depth32f_stencil8_2d_array - texstorage3d.format.depth_component24_2d_array - texstorage3d.format.depth_component32f_2d_array Here, something appears to be going wrong with having this bit set during blorp_copy operations for texture upload, which override the format to R8G8B8A8_UINT. AFAICT this bit should have no effect for integer surfaces, as it has to do with blending, and integer blending is not a thing. So it should be harmless to disable it. The Windows driver appears to be setting this bit universally, so I am unclear why we would need to. Perhaps they simply haven't run into this issue. Fixes: `f741de236b` ("isl: Enable Unorm Path in Color Pipe") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-26 16:54:20 +00:00
Kenneth Graunke	1b090f065e	isl: Drop UnormPathInColorPipe for buffer surfaces. Jason suggested I remove this in review, and he's right. AFAICT this affects blending, and that just isn't going to happen on buffers. Fixes: `f741de236b` ("isl: Enable Unorm Path in Color Pipe") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-26 16:54:20 +00:00
Alyssa Rosenzweig	85cc78a624	pan/midgard, bifrost: Set lower_fdph = true fdph instructions show up in some desktop GL shaders. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-26 07:47:01 -07:00
Samuel Pitoiset	218ce34962	radv: add mipmap support for the clear depth/stencil values Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 15:56:59 +02:00
Samuel Pitoiset	e36e260c42	radv: add mipmap support for the TC-compat zrange bug Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 15:56:55 +02:00
Samuel Pitoiset	9db0dc6b8e	radv: allocate metadata space for mipmapped depth/stencil images For each mipmap, the driver will store the clear values (8 bytes) and the TC-compat zrange value (4 bytes). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 15:56:51 +02:00
Samuel Pitoiset	76812339f7	radv: decompress mipmapped depth/stencil images during transitions Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 15:56:48 +02:00
Samuel Pitoiset	81c6473b7f	radv: add mipmaps support for decompress/resummarize Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 15:56:45 +02:00
Samuel Pitoiset	18ccde4d68	radv: add radv_process_depth_image_layer() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 15:56:42 +02:00
Connor Abbott	b7acf38073	ac/nir: Remove gfx9_stride_size_workaround_for_atomic The workaround was entirely in common code, and it's needed in radeonsi too so just always do it when necessary. Fixes KHR-GL45.shader_image_load_store.advanced-allStages-oneImage on gfx9 with LLVM 8. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-26 11:00:49 +02:00
Connor Abbott	4849276ea8	ac/nir: add a workaround for viewing a slice of 3D as a 2D image GL and Vulkan allow you to bind a single layer of a 3D texture to a 2D image, and we weren't implementing a workaround for that on gfx9 that TGSI was. Copy it over. Fixes KHR-GL45.shader_image_load_store.non-layered_binding with radeonsi NIR. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 11:00:44 +02:00
Samuel Pitoiset	89671ef205	radv: fix getting the index type size for uint8_t 16-bit and 32-bit values match hardware values but 8-bit doesn't. This fixes dEQP-VK.pipeline.input_assembly.* with 8-bit index. Fixes: `372c3dcfdb` ("radv: implement VK_EXT_index_type_uint8") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-26 09:23:23 +02:00
Dave Airlie	bba4d2f442	virgl: fix format conversion for recent gallium changes. The virgl formats are fixed in time snapshots of the gallium ones, we just need to provide a translation table between them when we enter the hardware. This fixes a regression since Eric renumbered the gallium table. Fixes: `c45c33a5a2` (gallium: Remove manual defining of PIPE_FORMAT enum values.) Bugzilla: https://bugs.freedesktop.org/111454 v1 by Dave Airlie <airlied@redhat.com> v2: virgl: Add a number of formats to the table that are used, e.g. for vertex attributes v3: cover some more missing formats from a piglit run Signed-off-by: Gert Wollny <gert.wollny@collabora.com>	2019-08-26 06:35:00 +00:00
Dave Airlie	035cd6cdf9	virgl: drop unused format field	2019-08-26 06:35:00 +00:00
Erico Nunes	4379dcc12d	lima/ppir: enable vectorize optimization pp has vector units and some operations can be optimized when bundled together. Benchmarking this with piglit shaders shows that the instruction count can be greatly reduced on many examples with vectorize. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-08-25 18:29:12 +00:00
Erico Nunes	2a8a81d109	lima/ppir: lower selects to scalars nir vec4 fcsel assumes that each component of the condition will be used to select the same component from the options, but pp can't implement that since it only has 1 component for the condition. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-08-25 18:29:12 +00:00
Erico Nunes	27e7603c34	lima: fix ppir spill stack allocation The previous spill stack was fixed and too small, and caused instability in programs requiring spilling for roughly more than one value. This patch adds a dynamic calculation of the buffer size based on stack utilization and switches it to a separate allocation at flush time that will fit the shader that requires the largest buffer. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-08-25 20:08:59 +02:00
Jason Ekstrand	f58e0405b6	intel/fs: Drop the gl_program from fs_visitor It's not used by anything anymore now that so much lowering has been moved into NIR. Sadly, we still need one in brw_compile_gs() for geometry shaders on Sandy Bridge. Short of a lot of pointless work, that one's probably not going away. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-25 01:02:52 -05:00
Qiang Yu	5ff41b9fc5	lima: move format handling to unified place Create a unified table to handle pipe format to texture and render target format lookup. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-08-25 11:52:29 +08:00
Alex Smith	fe0ec41c4d	radv: Change memory type order for GPUs without dedicated VRAM Put the uncached GTT type at a higher index than the visible VRAM type, rather than having GTT first. When we don't have dedicated VRAM, we don't have a non-visible VRAM type, and the property flags for GTT and visible VRAM are identical. According to the spec, for types with identical flags, we should give the one with better performance a lower index. Previously, apps which follow the spec guidance for choosing a memory type would have picked the GTT type in preference to visible VRAM (all Feral games will do this), and end up with lower performance. On a Ryzen 5 2500U laptop (Raven Ridge), this improves average FPS in the Rise of the Tomb Raider benchmark by up to ~30%. Tested a couple of other (Feral) games and saw similar improvement on those as well. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: 19.2 <mesa-stable@lists.freedesktop.org> (Bas: CCing this to 19.2-rc due to high impact and limited complexity)	2019-08-24 17:37:47 +02:00
Vasily Khoruzhick	681e99d11c	lima/ppir: print register index and components number for spilled register It can be useful for debugging purposes Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-24 08:17:31 -07:00
Vasily Khoruzhick	28d4b456a5	lima/ppir: add control flow support This commit adds support for nir_jump_instr, if and loop nir_cf_nodes. Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-24 08:17:31 -07:00
Vasily Khoruzhick	1cdf585613	lima/ppir: add better liveness analysis Add better liveness analysis that was modelled after one in vc4. It uses live ranges and is aware of multiple blocks which is prerequisite for adding CF support Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-24 08:17:31 -07:00
Vasily Khoruzhick	d30a98c896	lima/ppir: validate shader outputs Mali4x0 supports only gl_FragColor. gl_FragDepth is not supported. Check that we don't get anything but gl_FragColor in shader outputs. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-24 08:17:25 -07:00
Vasily Khoruzhick	8dd195e865	lima/ppir: turn store_color into ALU node We don't have a special OP to store color in PP, all we need to do is to store gl_FragColor into reg0, thus it's just a mov and therefore ALU node. Yet we still need to indicate that it's store_color op so regalloc ignores its destination. Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:47 -07:00
Vasily Khoruzhick	7f814d2b46	lima/ppir: create ppir block for each corresponding NIR block Create ppir block for each corresponding NIR block and populate its successors. It will be used later in liveness analysis and in CF support Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:47 -07:00
Vasily Khoruzhick	4e695489df	lima/ppir: add dummy op We can get following from NIR: (1) r1 = r2 (2) r2 = ssa1 Note that r2 is read before it's assigned, so there's no node for it in comp->var_nodes. We need to create a dummy node in this case which sole purpose is to hold ppir_dest with reg in it. Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:47 -07:00
Vasily Khoruzhick	d11e1b7909	lima/ppir: add write after read deps for registers For cases like: (1) r1 = r2 (2) r2 = ssa1 We need to add (1) as dependency of (2), otherwise scheduler may reorder them. Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:47 -07:00
Vasily Khoruzhick	cd8c569ced	lima/ppir: fix ordering deps There can be several root nodes, i.e.: (1) r0 = r1 (2) r2 = r3 (3) branch if (ssa1) We need to make (3) depend on (1) and (2), old code added dependency only for (2), and (1) was kept as root node since there is no branch/discard or store color between two movs. Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:47 -07:00
Vasily Khoruzhick	bf2872eeb2	lima/ppir: set write mask for texture loads if dest is reg Destination for texture load can be a reg, so we need to set write mask in this case Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:47 -07:00
Vasily Khoruzhick	fd129817f0	lima/ppir: add support for unconditional branches and condition negation We need 'negate' modifier for branch condition to minimize branching. Idea is to generate following: current_block: { ...; if (!statement) branch else_block; } then_block: { ...; branch after_block; } else_block: { ... } after_block: { ... } Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:46 -07:00
Vasily Khoruzhick	e15af23b73	lima/ppir: clone ld_{uni,tex,var} into each block ppir_lower_load() and ppir_lower_load_texture() assume that node is in the same block as its successors, fix it by cloning each ld_uni and ld_tex to every block. It also reduces register pressure since values never cross block boundaries and thus never appear in live_in or live_out of any block, so do it for varyings as well. Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:46 -07:00
Vasily Khoruzhick	172f2ad805	lima/ppir: refactor const lowering Const nodes are now cloned for each user, i.e. const is guaranteed to have exactly one successor, so we can use ppir_do_one_node_to_instr() and drop insert_to_each_succ_instr() Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-23 18:19:46 -07:00
Rafael Antognolli	2b7ba9f239	anv: Only re-emit non-dynamic state that has changed. On commit `f6e7de41d7`, we started emitting 3DSTATE_LINE_STIPPLE as part of the non-dynamic state. That gets re-emitted every time we bind a new VkPipeline. But that instruction is non-pipelined, and it caused a perf regression of about 9-10% on Dota2. This commit makes anv_dynamic_state_copy() return a mask with only the state that has changed when copying it. 3DSTATE_LINE_STIPPLE won't be emitted anymore unless it has changed, fixing the problem above. v2: Improve commit message and add documentation about skipped checks (Jason) Fixes: `f6e7de41d7` ("anv: Implement VK_EXT_line_rasterization") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-23 15:55:18 -07:00
Alyssa Rosenzweig	21a85fd7d8	pan/decode: Validate and quiet helper invocation flag We can statically determine from the disassembly if helper invocations will be needed, so we can validate the corresponding bit in the cmdstream and thus avoid printing the bit itself in the decode. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-23 15:51:25 -07:00
Alyssa Rosenzweig	20ac0b8e4e	pan/midgard: Analyze helper invocations We check for texture ops which calculate derivatives (either explicitly via dFd* or implicitly) and mark the shader as requiring helper invocations. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-23 15:51:25 -07:00
Lionel Landwerlin	9d3fc737af	util: fix compilation on macos timespec_get() is not available on macos, we need to pull in the include/c11/threads_posix.h helper. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103674 Fixes: `e2d761de03` ("util: drop final reference to p_compiler.h") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-23 23:45:25 +03:00
Caio Marcelo de Oliveira Filho	bfac462d92	i965: Silence brw_blorp uninitialized warning The variables level and start_layer are not initialized, then initialized if we have a BUFFER_BIT_DEPTH set. We assert on them later using the same check. This should be enough but GCC 9.1.1 is not convinced, so let's initialize the variables. Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-23 13:25:27 -07:00
Caio Marcelo de Oliveira Filho	5fac7c55f7	tgsi: Remove unused local Code that used it was removed in `4ebe6b2e72` ("tgsi: Drop the SSE2 constants setup that's been dead code since 2011.") Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-23 13:25:27 -07:00
Caio Marcelo de Oliveira Filho	63f0259aeb	iris: Guard GEN9-only function in Iris state to avoid warning Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-23 13:25:27 -07:00
Caio Marcelo de Oliveira Filho	412ed1338f	intel/decoders: Avoid uninitialized variable warnings Initialize `next_batch_addr` and `second_level`. If the batch is well formed, those values will be overridden, if not, they are as good as uninitialized garbage. Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-23 13:25:27 -07:00
Caio Marcelo de Oliveira Filho	0661480029	compiler/glsl: Fix warning about unused function The helper check_node_type() is only used when DEBUG is set (in the function below), but ASSERTED macro uses NDEBUG. So just guard the helper with #ifdef. If we see more such cases we might consider an ASSERTED-like macro for the DEBUG case. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-23 13:25:27 -07:00
Caio Marcelo de Oliveira Filho	eac8a3b9af	anv: Drop unused local variable Leftover from `021fa28163` ("intel/nir: Add a helper for getting BRW_AOP from an intrinsic"). Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-23 13:25:27 -07:00
Caio Marcelo de Oliveira Filho	f7d90c67c7	intel/compiler: Silence maybe-uninitialized warning in GCC 9.1.1 Compiler can't see that d is initialized. ../src/intel/compiler/brw_vec4_nir.cpp: In function ‘int brw::try_immediate_source(const nir_alu_instr*, brw::src_reg*, bool, const gen_device_info*)’: ../src/intel/compiler/brw_vec4_nir.cpp:984:12: warning: ‘d’ may be used uninitialized in this function [-Wmaybe-uninitialized] 984 | d = MAX2(-d, d); Assert that we expect at least one component -- hence d going to be set. That by itself is not enough, so also zero initialize the variable. Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-23 13:25:27 -07:00
Andres Rodriguez	a410823b3e	radv: additional query fixes Make sure we read the updated data from the gpu in cases where WAIT_BIT is not set. Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-17 05:53:51 -04:00
Kenneth Graunke	7ee7b0ecbc	iris: Fix large timeout handling in rel2abs() ...by copying the implementation of anv_get_absolute_timeout(). Appears to fix a CTS test with 32-bit builds: GTF-GL46.gtf32.GL3Tests.sync.sync_functionality_clientwaitsync_flush Fixes: `f459c56be6` ("iris: Add fence support using drm_syncobj") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-08-23 10:32:01 -07:00
Kenneth Graunke	9310ae6f68	iris: Set MOCS in all STATE_BASE_ADDRESS commands Rafael Antognolli tracked down a performance gap between i965 and iris in Synmark2's OglCSDof microbenchmark, noting that iris was performing substantially more memory reads and writes, with substantially fewer L3 hits. He suggested that something might be wrong with MOCS, or L3 configs, at which point I came up with a theory... It would appear that the STATE_BASE_ADDRESS command updates the MOCS settings for various base addresses even if you don't specify the "Modify Enable" bit for that address. Until now, we had been setting only the MOCS for bases we intended to change, leaving the others "blank" which is MOCS table entry 0, which is uncached. Most data access has a more specific MOCS (e.g. in SURFACE_STATE), but scratch access uses the Stateless Data Port Access MOCS from STATE_BASE_ADDRESS. So this meant all scratch access was uncached. Improves performance in Synmark2's OglCSDof by 2x, bringing iris on par with the existing i965 driver. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-23 10:21:48 -07:00
Vinson Lee	b05166e3d2	glx: Fix up glXQueryGLXPbufferSGIX on macOS. Fix this build error on macOS. ../src/glx/apple/glx_empty.c:158:4: error: void function 'glXQueryGLXPbufferSGIX' should not return a value [-Wreturn-type] return 0; ^ ~ Fixes: `3dd299c3d5` ("glx: Sync <GL/glxext.h> with Khronos") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-08-23 11:05:23 -04:00
Juan A. Suarez Romero	6f137ed901	docs: update calendar, add news item and link release notes for 19.1.5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-08-23 12:40:40 +02:00
Juan A. Suarez Romero	23f1741996	docs: add sha256 checksums for 19.1.5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `ae2a676cd1`)	2019-08-23 12:38:28 +02:00
Juan A. Suarez Romero	152dd6ed19	docs: add release notes for 19.1.5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `a384fe0ceb`)	2019-08-23 12:38:27 +02:00
Connor Abbott	f59076f8a7	radeonsi/nir: Rewrite output scanning Similarly to before, this didn't properly handle varying structs with doubles in them. This doesn't fix any tests, but was noticed while looking at the code. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-23 11:05:31 +02:00
Connor Abbott	9395277972	radeonsi/nir: Rewrite store intrinsic gathering The old version wasn't as accurate as it could be, and didn't handle double variables inside structs correctly. Walk the path to compute the actual components affected. In combination with the previous commit fixes KHR-GL45.enhanced_layouts.varying_structure_locations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-23 11:05:31 +02:00
Connor Abbott	87cca891c3	radeonsi/nir: Add const_index when loading GS inputs This fixes loading GS inputs in structures or arrays. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-23 11:05:31 +02:00
Connor Abbott	82589d3ffd	radeonsi/nir: Don't add const offset to indirect This is already done in get_deref_offset() in the common code. We were adding it twice accidentally. Fixes KHR-GL45.enhanced_layouts.varying_array_locations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-23 11:05:31 +02:00
Connor Abbott	400db1852b	ac/nir: Assert GS input index is constant If it's not we silently ignore indir_index which is definitely a bug. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-23 11:05:31 +02:00
Connor Abbott	bb42c896fe	ac/nir: Handle const array offsets in get_deref_offset() Some users of this function (e.g. GS inputs) currently only work with constant offsets. We got lucky since all the tests used an array index of 0, so the non-constant part was always 0. But we still need to handle this. This doesn't fix any CTS test, but was noticed while debugging one. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-23 11:05:31 +02:00
Connor Abbott	97d592c855	radeonsi/nir: Don't recompute num_inputs and num_outputs Don't repeat what mesa/st already does. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-23 11:05:31 +02:00
Connor Abbott	3eb4aeed60	st/nir: Fix num_inputs for VS inputs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-23 11:05:31 +02:00
Samuel Pitoiset	a4e6e59db8	radv/gfx10: do not use NGG with NAVI14 Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-23 09:54:08 +02:00
Samuel Pitoiset	0813c27d8d	radv/gfx10: don't initialize VGT_INSTANCE_STEP_RATE_0 Only gfx9 and older use it to get InstanceID in VGPR1. Ported from RadeonSI. Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-23 09:54:06 +02:00
Samuel Pitoiset	7d1c091143	gitlab-ci: bump LLVM to 8 for meson-vulkan and meson-clover To fix pipeline builds. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-23 08:13:31 +02:00
Samuel Pitoiset	1fd60db4a1	ac,radv,radeonsi: remove LLVM 7 support Now that LLVM 9 will be released soon, we will only support LLVM 8, 9 and master (10). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-23 08:12:34 +02:00
Tapani Pälli	3e03a3fc53	egl: reset blob cache set/get functions on terminate Fixes errors seen with eglSetBlobCacheFuncsANDROID on Android when running dEQP that terminates and reinitializes a display. Fixes: `6f5b57093b` "egl: add support for EGL_ANDROID_blob_cache" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-23 08:14:08 +03:00
Kenneth Graunke	2d79925034	iris: Avoid unnecessary resolves on transfer maps We were always resolving the buffer as if we were accessing it via CPU maps, which don't understand any auxiliary surfaces. But we often copy to a temporary using BLORP, which understands compression just fine. So we can avoid the resolve, and accelerate the copy as well. Fixes: `9d1334d2a0` ("iris: Use copy_region and staging resources to avoid transfer stalls") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-08-22 18:31:17 -07:00
Kenneth Graunke	136629a1e3	iris: Drop copy format hacks from copy region based transfer path. This doesn't work for compressed formats, as the source texture and temporary texture would have different block sizes. (Forcing the driver to always take the GPU path would expose the bug.) Instead, just use the source format for the temporary, and let blorp_copy deal with overrides. The one case where we can't do this is ASTC, because isl won't let us create a linear ASTC surface. Fall back to the CPU paths there for now. Fixes: `9d1334d2a0` ("iris: Use copy_region and staging resources to avoid transfer stalls") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-08-22 18:31:17 -07:00
Kenneth Graunke	1cd13ccee7	iris: Update fast clear colors on Gen9 with direct immediate writes. Gen11 stores the fast clear color in an "indirect clear buffer", as a packed pixel value. Gen9 hardware stores it as a float or integer value, which is interpreted via the format. We were trying to store that in a buffer, for similarity with Icelake, and MI_COPY_MEM_MEM it from there to the actual SURFACE_STATE bytes where it's stored. This unfortunately doesn't work for blorp_copy(), which does bit-for-bit copies, and overrides the format to a CCS-compatible UINT format. This causes the clear color to be interpreted in the overridden format. Normally, we provide the clear color on the CPU, and blorp_blit.c:2611 converts it to a packed pixel value in the original format, then unpacks it in the overridden format, so the clear color we use expands to the bits we originally desired. However, BLORP doesn't support this pack/unpack with an indirect clear buffer, as it would need to do the math on the GPU. On Gen11+, it isn't necessary, as the hardware does the right thing. This patch changes Gen9 to stop using an indirect clear buffer and simply do PIPE_CONTROLs with post-sync write immediate operations to store the new color over the surface states for regular drawing. BLORP continues streaming out surface states, and handles fast clear colors on the CPU. Fixes: `53c484ba8a` ("iris: blorp using resolve hooks") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-08-22 18:31:14 -07:00
Kenneth Graunke	117a0368b0	iris: Fix broken aux.possible/sampler_usages bitmask handling For renderable surfaces, we allocate SURFACE_STATEs for each bit in res->aux.possible_usages. Sampler views use res->aux.sampler_usages. When pinning buffers, we call surf_state_offset_for_aux() to calculate the offset to the desired surface state. surf_state_offset_for_aux() took an aux_modes parameter, which should be one of those two fields. However...it was not using that parameter. It always used the broader res->aux.possible_usages field directly. One of the callers, update_clear_value(), was passing incorrect masks for this parameter. It iterated through the bits in order, using u_bit_scan(), which destructively modifies the mask. So each time we called it, the count of bits before our selected mode was 0, which would cause us to always update the SURFACE_STATE for ISL_AUX_USAGE_NONE, rather than updating each in turn. This was hidden by the earlier bug where surf_state_offset_for_aux() ignored the parameter. Fixes: `7339660e80` ("iris: Add aux.sampler_usages.") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-08-22 18:31:14 -07:00
Kenneth Graunke	f6c44549ee	iris: Replace devinfo->gen with GEN_GEN This is genxml, we can compile out this code. Fixes: `2660667284` ("iris/gen8: Re-emit the SURFACE_STATE if the clear color changed.") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-08-22 18:31:14 -07:00
Alyssa Rosenzweig	272ce6f5a7	pan/midgard: Fix writeout combining shader-db regression in the scheduler. Fixes: `dff4986b1a` ("pan/midgard: Emit store_output branch just-in-time") total bundles in shared programs: 2055 -> 2019 (-1.75%) bundles in affected programs: 1055 -> 1019 (-3.41%) helped: 36 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.35% max: 20.00% x̄: 6.71% x̃: 5.16% 95% mean confidence interval for bundles value: -1.00 -1.00 95% mean confidence interval for bundles %-change: -8.45% -4.97% Bundles are helped. total quadwords in shared programs: 3444 -> 3408 (-1.05%) quadwords in affected programs: 1897 -> 1861 (-1.90%) helped: 36 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.19% max: 14.29% x̄: 3.97% x̃: 2.99% 95% mean confidence interval for quadwords value: -1.00 -1.00 95% mean confidence interval for quadwords %-change: -5.08% -2.86% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 14:03:23 -07:00
Alyssa Rosenzweig	2c5ba2ee6e	panfrost: Implement gl_FragCoord correctly Rather than passing through the transformed gl_Position, we can use the hardware-level varying for this, which will correctly handle gl_FragCoord.w Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 13:31:39 -07:00
Alyssa Rosenzweig	eeebf5c2df	panfrost: Remove vertex buffer offset from its size The offset is added to the base address, so we need to subtract it from the size to maintain the same end address and thus prevent a buffer overflow: end_address = start_address + size start_address' = start_address + offset size' = size - offset end_address' = start_address' + size' = (start_address + offset) + (size - offset) = (start_address + size) + (offset - offset) = start_address + size = end_address QED. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 13:31:39 -07:00
Alyssa Rosenzweig	f4678f3c62	pan/decode: Handle special varyings We need a special path for special varyings so we parse them correctly instead of throwing an error when they inevitably point to bad memory. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 13:31:39 -07:00
Alyssa Rosenzweig	caec0b3232	pan/decode: Remove size/stride divisibility check The hardware doesn't care, and a lot of Panfrost code relies on an oversized buffer. The important part is that (stride * padded_num_vertices) is no greater than size, which we'll need to check once we validate instancing. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 13:31:39 -07:00
Alyssa Rosenzweig	ed464e05c8	pan/decode: Decouple attribute/meta printing They are independent fields, so the parser should reflect that. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 13:31:39 -07:00
Alyssa Rosenzweig	ae84f16786	pan/decode: Print stub for uniforms We don't need to dump the contents necessarily, but having the stub with the address is useful. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 13:31:06 -07:00
Alyssa Rosenzweig	26ed431ea9	pan/decode: Decode actual varying_meta address I don't know who thought this mask was a good idea but unfortunately it must have been me. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:56:49 -07:00
Alyssa Rosenzweig	f48136e9c5	pan/decode: Downgrade shader property mismatch to warning If we permit more $whatever through than the shader needs, that's a bit of a waste, but it isn't an error. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:56:35 -07:00
Alyssa Rosenzweig	f38ce6ea8c	pan/decode: Validate, but do not print, index buffer We don't actually care about the contents of the index buffer, but we would rather like to ensure it is present and of the correct size. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:56:04 -07:00
Alyssa Rosenzweig	cbbf75424a	pan/decode: Validate mali_shader_meta stats We can infer these stats in many cases from the disassembly, so we should try to sanity check where we can. We may need to be fuzzy about analysis, since analysis gives us a bound but we don't mind if it's not used fully by the shader. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:55:49 -07:00
Alyssa Rosenzweig	9b067d96f7	pan/decode: Disassemble before printing shader descriptor This allows the shader descriptor to access the disassembled stats. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:55:27 -07:00
Alyssa Rosenzweig	5f9a1c74ae	pan/decode: Promote <no shader> to an error There is no reason this should happen to an in-spec program, as far as I know. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:55:00 -07:00
Alyssa Rosenzweig	d7473e2e01	pan/decode: Fix uniform printing Lazypasting from UBOs. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:54:35 -07:00
Alyssa Rosenzweig	139708bbab	pan/decode: Validate blend shaders don't access I/O We could do better by forcing the checks to equal zero (right now, an indeterminate answer will pass the checks), but this is a start to guard against some egregious cases. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:54:16 -07:00
Alyssa Rosenzweig	ded9a68d8f	pan/decode: Validate and simplify FRAGMENT payloads There are a number of conditions we need to test for to statically check for TILE_RANGE_FAULTs, but once these checks are in order, we can print as-is. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:53:44 -07:00
Alyssa Rosenzweig	f06e8f7fe9	pan/decode: Validate MFBD tags These tags need to match up with what's actually described by the MFBD, so check this. Once this is checked, since the type and contents of the FBD are obvious from printing above, there's no need to explicitly mark off the framebuffer line. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:53:10 -07:00
Alyssa Rosenzweig	0c313419a0	pan/decode: Eliminate non-FBD dumped case We don't need more cases to deal with. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:52:52 -07:00
Alyssa Rosenzweig	6ec33b4f34	pan/decode: Removing uniform buffer framing We can do single line prints: ubuf_0[192] = memory_161f5000 + 896; Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:52:37 -07:00
Alyssa Rosenzweig	a68fe4baec	pan/decode: Remove mali_attr(_meta) framing It doesn't give any real added value. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:52:18 -07:00
Alyssa Rosenzweig	f162adc32b	pan/midgard: Disassemble integer constants in hex It's usually easier to parse mentally. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:51:55 -07:00
Alyssa Rosenzweig	b89cb0dba6	pan/midgard: Explain ffma Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:51:39 -07:00
Alyssa Rosenzweig	19d58a299b	pan/midgard: Analyze simple loads/store For shaders using exclusively direct attribute/varyings, we can work this out statically. For shaders with indirect access, we just set an upper bound of 16 (the max attributes/varyings we support) and the actual count will be reported regardless. We proceed similarly for textures/samplers, as well as for UBOs. While UBOs can be indexed indirectly, the UBO itself -- which is what we count in the shader descriptor (rather than the UBO descriptors) -- is statically determinable. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:51:21 -07:00
Alyssa Rosenzweig	a89e368c7f	pan/midgard: Compute work_count via writes This is exact. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:50:57 -07:00
Alyssa Rosenzweig	b9fb63859e	pan/midgard: Sketch static analysis to uniform count This one is a little tricky, but the idea is that: r16-r23 are always uniforms r8-r15 are sometimes work, sometimes uniforms... ...but as work, they are always written before use ...and as uniforms, they are never written before use So we use that heuristic to determine the count to feed the machine. We'll record work register use in the next commit. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:50:40 -07:00
Alyssa Rosenzweig	58fc260312	pan/decode: Hoist shader-db stats to shared decode We'll want all this information to validate the shader descriptor. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-22 12:50:14 -07:00
Alyssa Rosenzweig	a8f86fcb51	nir: Remove nir_const_load_to_arr There are no remaining users in-tree. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-22 12:24:13 -07:00
Alyssa Rosenzweig	3c01a6928a	pan/midgard,bifrost: Expand nir_const_load_to_arr Panfrost is the only user of the macro; we are better off expanding than having random stuff in nir.h. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-22 12:24:13 -07:00
Adam Jackson	a7a9d958bc	glx: Make __glXGetDrawableAttribute return true sometimes Right now it always returns zero, but as of: commit `a48a6b8a40` Author: Adam Jackson <ajax@redhat.com> Date: Tue Nov 14 15:13:05 2017 -0500 glx: Prepare driFetchDrawable for no-config contexts We were hoping it would return true if the drawable could actually be looked up. It wasn't, so that didn't go very well. With the most recent update to <GL/glxext.h> glXQueryGLXPbufferSGIX (correctly) returns void, so there's no longer anything else besides driFetchDrawable that depends on the return value from __glXGetDrawableAttribute. Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-22 13:29:06 -04:00
Adam Jackson	3dd299c3d5	glx: Sync <GL/glxext.h> with Khronos Minor fixups required to keep the prototypes matching and to remove mention of retired enums. Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-22 13:29:04 -04:00
Adam Jackson	5ebd333c6c	glx: Whitespace cleanups Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-22 13:28:39 -04:00
Eric Engestrom	6db1dfe347	swr: use LLVM version string instead of re-computing it Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-22 16:08:09 +01:00
Eric Engestrom	7f5ef97a07	llvmpipe: use LLVM version string instead of re-computing it Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-22 16:08:09 +01:00
Eric Engestrom	3ea83f4c9b	scons: define MESA_LLVM_VERSION_STRING like the other build systems do Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-22 16:08:09 +01:00
Bas Nieuwenhuizen	c037fe5ad1	radv: Disable NGG for geometry shaders. A bunch of remaining issues including some that affect users. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111248 Fixes: `ee21bd7440` "radv/gfx10: implement NGG support (VS only)" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-22 12:47:32 +02:00
Lionel Landwerlin	5833f43305	util/timespec: use unsigned 64 bit integers for nsec values We added this utility for vulkan where all timeouts are given as uint64_t values. We can switch from signed to unsigned as this is the only user and if we ever deal with signed integers somewhere else we'll have to be careful to use the corresponding timespec_(add|sub)_msec and always pass absolute values. v2: Forgot to drop the test calling add_nsec() with a negative number Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Juan A. Suarez Romero <jasuarez@igalia.com> Fixes: `d2d70c3bb5` ("util: add a timespec helper") Acked-by: Daniel Stone <daniels@collabora.com>	2019-08-22 09:35:57 +02:00
Tapani Pälli	728ebcdec2	iris/android: fix build and link with libmesa_intel_perf Fixes: `0fd4359733` "iris/perf: implement routines to return counter info" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-22 10:01:14 +03:00
Samuel Pitoiset	2d9f401a83	ac: fix exclusive scans on GFX8-GFX9 This fixes a regression introduced with scan&reduce operations on GFX10. Note that some subgroups CTS still fail on GFX10 but I assume it's a different issue. This fixes dEQP-VK.subgroups.arithmetic..subgroupexclusive. Fixes: `227c29a80d` "amd/common/gfx10: implement scan & reduce operations" Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-22 08:43:15 +02:00
Tapani Pälli	ce8fd042a5	util: fix os_create_anonymous_file on android Commit fixes current crashes with Vulkan applications on Android. Fixes: `c0376a1234` "util: add anon_file.h for all memfd/temp file usage" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-08-22 08:27:43 +03:00
Lionel Landwerlin	ac5bda374a	i965: honor scanout requirement from DRI Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-08-21 23:52:07 +00:00
Kenneth Graunke	bc844d92ce	gallium/noop: Implement resource_get_param v2: Pass through to oscreen rather than faking it (review from Marek). Fixes: `0346b70083` ("gallium/screen: Add pipe_screen::resource_get_param") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-21 22:18:22 +00:00
Kenneth Graunke	f02d1a0b75	gallium/rbug: Wrap resource_get_param if available Fixes: `0346b70083` ("gallium/screen: Add pipe_screen::resource_get_param") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-21 22:18:22 +00:00
Kenneth Graunke	c43a44791b	gallium/trace: Wrap resource_get_param if available Fixes: `0346b70083` ("gallium/screen: Add pipe_screen::resource_get_param") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-21 22:18:22 +00:00
Kenneth Graunke	0e6b573ae5	gallium/ddebug: Wrap resource_get_param if available Fixes: `0346b70083` ("gallium/screen: Add pipe_screen::resource_get_param") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-21 22:18:22 +00:00
Jose Maria Casanova Crespo	74a7e3ed3b	mesa: recover target_check before get_current_tex_objects At compressed_tex_sub_image we only can obtain the tex_object after compressed_subtexture_target_check is validated for TEX_MODE_CURRENT. So if the target is wrong the error is raised to the user. This completes the fix for the regression introduced on "mesa: refactor compressed_tex_sub_image function" of the pending failing tests: dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage3d dEQP-GLES31.functional.debug.negative_coverage.get_error.texture.compressedtexsubimage3d v2: Fix warning that texObj might be used uninitialized (Gert Wollny) Fixes: `7df233d68d` ("mesa: refactor compressed_tex_sub_image function") Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2019-08-21 21:26:48 +01:00
Kevin Strasser	5baff5dd3c	gallium: Add buffer and configs handling for fp16 formats Expose configs when allow_fp16_configs has been enabled and DRI_LOADER_CAP_FP16 is set in the loader. Also, make kms_swrast_dri respect format bpp, to allow for allocating buffers wider than 32 bpp. Make fp16 opt-in for gallium. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2019-08-21 18:36:57 +00:00
Kevin Strasser	f4703f1c10	i965: Add handling for fp16 configs Expose configs when allow_fp16_configs has been enabled and DRI_LOADER_CAP_FP16 is set in the loader. Also, define a new dri configuration option so users can disable exposure of fp16 formats. Make fp16 opt-in for i965. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2019-08-21 18:36:57 +00:00
Kevin Strasser	4861d2a395	gbm: Add buffer handling and visuals for fp16 formats Define and set a new loader cap DRI_LOADER_CAP_FP16, indicating that gbm can handle fp16 formats. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-08-21 18:36:57 +00:00
Kevin Strasser	a427c20080	dri: Add fp16 formats Add dri formats for RGBA ordered 64 bpp IEEE 754 half precision floating point. Leverage existing offscreen render support for MESA_FORMAT_RGBA_FLOAT16 and MESA_FORMAT_RGBX_FLOAT16. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-08-21 18:36:57 +00:00
Kevin Strasser	482ed4347d	egl: Handle dri configs with floating point pixel data In the case that __DRI_ATTRIB_FLOAT_BIT is set in the dri config, set EGL_COLOR_COMPONENT_TYPE_FLOAT_EXT in the egl config. Add a field to the platform driver visual to indicate if it has components that are in floating point form. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-08-21 18:36:57 +00:00
Kevin Strasser	86d31c2c12	dri: Handle configs with floating point pixel data In order to handle pixel formats that consist of floating point data, enable floatMode field in the dri config, and set __DRI_ATTRIB_FLOAT_BIT in the render type attribute. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-08-21 18:36:57 +00:00
Kevin Strasser	d4a9010338	glx: Add fields for color shifts glx doesn't read the masks from the dri config directly, but for consistency add shifts to the glxconfig. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2019-08-21 18:36:57 +00:00
Kevin Strasser	7b4ed2b513	egl: Convert configs to use shifts and sizes instead of masks Change dri2_add_config to take arrays of shifts and sizes, and compare with those set in the dri config. Convert all platform driver masks to shifts and sizes. In order to handle older drivers, where shift attributes aren't available, we fall back to the mask attributes and compute the shifts with ffs. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2019-08-21 18:36:57 +00:00
Kevin Strasser	3562f48c9d	util: move bitcount to bitscan.h bitcount is free from the pipe header dependencies that make u_math.h hard to include by non-gallium specific code, so move it to bitscan.h. bitscan.h is included by u_math.h so existing references will continue working. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2019-08-21 18:36:57 +00:00
Kevin Strasser	5a747306ce	dri: Add config attributes for color channel shift The existing mask attributes can only support up to 32 bpp. Introduce per-channel SHIFT attributes that indicate how many bits, from lsb towards msb, the bit field is offset. A shift of -1 will indicate that there is no bit field set for the channel. As old loaders will still be looking for masks, we set the masks to 0 for any formats wider than 32 bpp. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2019-08-21 18:36:57 +00:00
Kevin Strasser	9328e7c04c	gallium: Use consistent approach for config format filtering rgb10 uses an 'if(allowed) continue' approach, do the same for rgba_ordering. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2019-08-21 18:36:57 +00:00
Kevin Strasser	4fb71604b7	i965: Add helper function for allowed config formats The driver checks dri config options and loader caps to filter out certain formats during config creation. Fold 4 call sites under a single helper function. Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2019-08-21 18:36:57 +00:00
Kevin Strasser	d07a56dbc0	drm-uapi: Update headers for fp16 formats From drm-next commit 88ab9c76d191ad8645b483f31e2b394b0f3e280e Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-08-21 18:36:57 +00:00
Andres Rodriguez	bd960390bb	radv: add RADV_DEBUG=allentrypoints This debug option allows vkGet[Instance/Device]ProcAddr() to succeed even if the extension associated with the requested entrypoint was not enabled. This has come in handy in a few instances when debugging VR applications, so I thought it would be good to have a cleaned up version upstreamed. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-21 17:47:35 +00:00
Alyssa Rosenzweig	0ae72df013	panfrost: Fix PIPE_BUFFER spacing Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:44:45 -07:00
Alyssa Rosenzweig	d4542f8cb5	panfrost: Implement depth range clipping This should fix glDepthRangef issues. Eventually, something similar should allow implementing the depth bounds test. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:44:45 -07:00
Alyssa Rosenzweig	5e268a01d2	panfrost: Don't bail on PIPE_BUFFER We can handle some of it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:43:02 -07:00
Alyssa Rosenzweig	7f14916372	pan/midgard: Identify and disassemble indirect texture/sampler A pair of special flags can turn the texture/sampler handle fields into register selects. This means code like: texture(uTextures[hr28.w], ...) can be compiled to something like: texture ..., fsampler[hr28.w], texture[hr28.w] Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:41:15 -07:00
Alyssa Rosenzweig	8c1bc3c000	pan/midgard: Breakout texture reg select printer This data structure is shared in other parts of the texture word, so let's streamline printing. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:41:15 -07:00
Alyssa Rosenzweig	aa404120e1	panfrost: Pass stream_output_info by reference It's a large structure, apparently. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	27b6264630	panfrost: Guard against NULL rasterizer explicitly Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	5ebdd10eaf	pan/bifrost: Correct file size signedness Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	87afc2e2da	panfrost: Fix missing ret assignment in DRM code Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	c43fa6b320	panfrost: Hoist bo != NULL check before dereference Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	a3c1ab2e9a	panfrost: Hoist job != NULL check Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	9cee21f0c9	panfrost: Prevent potential integer overflow in instancing Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	5bdc9096b7	panfrost: Clarify intention with PIPE_SWIZZLE_X check Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	8fba6ab03d	panfrost: Pay attention to framebuffer dimension sign These are unsigned so the clamp-positive is redundant. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	14a2032f0f	pan/midgard: Mark fallthrough explicitly Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	ed58fd63b4	panfrost: Don't check reads_point_coord Useless check. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	d0b9f094fd	pan/midgard: Simplify contradictory check. Coverity. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	91a5b2657d	pan/midgard: Reorder bits check to fix 8-bit masks Coverity. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	6189274f57	pan/midgard: Represent unused nodes by ~0 This allows nodes to be unsigned and prevents a class of weird signedness bugs identified by Coverity. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	cda0ec67e6	pan/bifrost: Avoid buffer overflow in disassembler This path shouldn't be possible for in-spec shaders, but let's be defensive. (Because security, right? Mostly because Coverity.) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	9ce45ac808	pan/decode: Remove all_zero The checks confuse Coverity, so let's make it explicit what's going on. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig	1060c48d46	pan/decode: Don't leak FBD pointer Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:30 -07:00
Alyssa Rosenzweig	52ac7dc5d0	pan/midgard: Allocate `dependencies` on stack It's small; this way we don't leak memory. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:30 -07:00
Alyssa Rosenzweig	bf036e127f	pan/midgard: Free liveness info Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 10:38:30 -07:00
Jason Ekstrand	c9a4793de8	v3d: Use the correct opcodes for signed image min/max Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-21 17:19:55 +00:00
Jason Ekstrand	021fa28163	intel/nir: Add a helper for getting BRW_AOP from an intrinsic So many duplicated switch statements.... Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-21 17:19:55 +00:00
Jason Ekstrand	951cf94521	nir: Add explicit signs to image min/max intrinsics This better matches all the other atomic intrinsics such as those for SSBOs and shared variables where the sign is part of the intrinsic opcode. Both generators (GLSL and SPIR-V) know the sign from the type of the image variable or handle. In SPIR-V, signed min/max are separate opcodes from unsigned. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-21 17:19:55 +00:00
Alyssa Rosenzweig	fc69a5cf73	pan/decode: Cleanup mali_attr printing We can smush this into one-line per record as per usual. We still need more validation and cleaning this up, especially around instancing. But for LINEAR records, it works okay already. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:56 -07:00
Alyssa Rosenzweig	62e6673908	pan/decode: Validate attribute/varying buffer pointer Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig	be5e30c46b	pan/decode: Include address in union mali_attr No need to break it out into extra lines. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig	68b9030db7	pan/decode: Use concise texture printing This consolidates texture format and dimensionality into something simple: tiled rgba8_unorm.rgb1: 512x512 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig	9f15f4d8e9	panfrost: Break up usage2 field This is another bit field describing layout. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig	9b203950ec	pan/decode: Pretty-print sRGB format We can just stick an "s" in if it's sRGB. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig	47af32b15e	panfrost: Remove ancient TODO Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig	96f6b8a707	panfrost: nr_mipmap_levels -> levels No need to be so verbose. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig	024f9cf24f	pan/decode: Validate texture dimensionality Textures of a smaller dimension don't need higher dimensions printed. This allows us to be more compact, while enforcing verification that higher dimensions must be zero. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig	8fc4ca82e3	pan/decode: Break out pandecode_texture function It's massive and hugely nested indentation -- break it out so it's legible. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig	fa536ece04	pan/decode: Guard texture unknowns as zero trips unknown3A I think I've actually seen on T6xx but.. we'll see what happens in traces going forward. We don't want the zero noise normally, and if they show up in the wild, we want to draw attention to them. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:55 -07:00
Alyssa Rosenzweig	e09392fc27	pan/decode: Use GLSL style formats/swizzles This dramatically reduces visual clutter: now an entire attribute/varying record looks something like: rgba32f attribute_0[16].bgra; which is equivalent to the raw structure: { .index = 0, .format = MALI_FORMAT_RGBA32F, .swizzle = (MALI_CHANNEL_BLUE << 9) | ...., .src_offset = 16, } Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig	ac090b365f	pan/decode: Don't print the default swizzle It's just noise. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig	2208eb9b72	pan/decode: Validate swizzles against format We want to make sure we don't access a component in the swizzle that doesn't exist in the format, since that is (as far as I know) undefined. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig	b233012d44	pan/decode: Treat RESERVED swizzles as errors We've never seen them, so if they come up in trace, we want to draw attention to that. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig	a94fb781c2	pan/decode: Handle VARYING_DISCARD Varying discard is not used by Panfrost, but the blob uses it sometimes to have some padding in the varyings table, probably to minimize per-draw overhead. (...We should maybe consider this ourselves!) Let's check for this and ensure the rest of the record is consistent with a discarded varying. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig	c0642ebca1	panfrost: Don't trip the prefix magic field What is this? Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig	00be5d7b82	pan/decode: Guard attribute unknowns One should be zero. The other has always been seen as set, so check this. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig	9836c26ac1	panfrost: Don't crash on GL_CLAMP It's a legacy GL thing... we don't really need to handle it right now, but we shouldn't crash. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:54 -07:00
Alyssa Rosenzweig	e4eaa730dd	panfrost: Do not expose PIPE_CAP_TEXTURE_MIRROR_CLAMP This CAP controls a desktop-only extension. If the corresponding support exists in the hardware, we don't know how to use it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig	be81711c0a	panfrost: Fix scoreboarding with dependency on job #0 Subtle issue masked by how we emitted SET_VALUE jobs, but this case can and does occur, so let's fix it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig	3752f76715	pan/decode: Normalize final instances of XXX Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig	dcde5bd157	pan/decode: Normalize case matching XXX format Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig	89c5370118	pan/decode: Mark tripped zeroes with XXX This normalizes the printed format. It also makes it easier for the future when we may introduce semantic _warn and _error handlers. A tripped zero is essentially a hazard to check for. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig	e49204c878	pan/decode: Check for MFBD preload chicken bit If this bit is clear, MFBD preload will be enabled, and you.. don't want that. (At least, when the bit is clear, the old contents of the framebuffer will be preserved. I'm assuming this is what "MFBD preload" refers to in kbase.) Validate that this bit is always set. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig	c9b6233558	pan/decode: Validate AFBC fields are zero when AFBC is disabled There is no "chunknown" structure; that part of the union is an artefact from falsely believing vertex/tiler MFBDs could have render targets attached (they can't). These are just plain old AFBC fields, and if there is no AFBC, it's an error to set these fields. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig	4aeb694462	pan/decode: Do not print uniform/buffers explicitly For our purposes of driver debugging, the contents of uniform buffers are rarely interesting; we're more concerned about the metadata setting them up. We do need to be careful to validate the sizes of both uniforms and uniform buffers. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig	4391c65f10	pan/decode: Add static bounds checking utility Many structures in the command stream have a GPU address and size determined statically. We should check that the pointers we are passed are valid and the buffers they point to are big enough for the given size. If they're not, an MMU fault would be raised. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig	9dfbc8dc03	pan/decode: Don't print unreferenced attribute memory This is a source of uninitialized memory leaking into the traces. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig	897110a566	pan/decode: Check for a number of potential issues Verify sizes / masks / etc against something logical to cull down the trace space and automatically guard against a number of potential hazards. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig	f5c293425f	panfrost: Correct polygon size computations While the algorithm for computing the header size has been correct for a while, we used a major hack to conservatively guess the body size. Let's scrap that and figure out the algorithm we actually need to use to be bit-identical with what the hardware expects. We do have to be careful to add the header size to the total computed BO size. It's not clear how big the polygon list needs to be in practice -- but it has to be somewhat bigger than the polygon list itself. This needs more investigation. If we size the polygon list exactly based on the polygon_list_size field, we get faults like: [ 1224.219886] panfrost ff9a0000.gpu: Unhandled Page fault in AS0 at VA 0x000000001BDE8000 Reason: TODO raw fault status: 0x660003C3 decoded fault status: SLAVE FAULT exception type 0xC3: TRANSLATION_FAULT_LEVEL3 access type 0x3: WRITE source id 0x6600 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig	f6e41f30d0	panfrost: Remove DRY_RUN Nobody uses this anymore anyway. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig	b4a214207c	pan/decode: Print "just right" count of texture pointers The other commented lines just add noise/entropy we don't want, and can in fact crash the trace due to asserts failing. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig	a8bd3ad470	pan/decode: Verify and omit polygon size The polygon sizes are computed from the width/height/flags, so we can reverse the computation and use our computation to verify the two computation algorithms are bit-identical. If they are, we can omit the computed fields. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:53 -07:00
Alyssa Rosenzweig	b45eb2775e	panfrost: Move pan_tiler.c outside of Gallium The routines in this file may be shared with Vulkan. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig	13d07978ff	pan/decode: Bounds check polygon list and tiler heap We have the BOs available; ensure that the bounds specified in the command stream are actually the correct bounds. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig	b072d0357b	pan/decode: Allow updating mmaps This allows the caller to call track_mmap multiple times for the same gpu_va for the purpose of updating the mmap. This is used to trace invisible BOs with kbase and doesn't apply to native traces. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig	52101e48f8	pan/decode: Express tiler structures as offsets This allows us to catch a class of errors (for negative offsets, etc) automatically. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig	e918dd8a6c	pan/decode: Don't print zero exception_status Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig	2a8d776884	pan/decode: Fix missing NULL terminator Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig	6c67bd05a6	pan/decode: Silence workgroups_x_shift_2 Since we're bit-identical we can compare the computed value. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:52 -07:00
Alyssa Rosenzweig	3752566584	panfrost: Implement workgroups_x_shift_2 quirk I'm not sure why this is done this way, but let's follow the blob. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig	25ed930c4a	pan/decode: Don't print canonical workgroup encoding The on-the-wire representation of workgroups is not 1:1 to the decoded Gallium-level workgroups (there are multiple valid encodings; see the previous commit). Nevertheless, since we're now bit-identical in packing vs the blob, we can check for a canonical form and only print the verbose trace if we fail the canonical form. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig	fb56a162a9	panfrost: Set workgroups z to 32 for non-instanced graphics This is a blob quirk; as far as I know, the hardware doesn't care. But we're trying to be bit-identical to take as much entropy out of traces as possible, so let's introduce the quirk. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig	39b226cfb3	panfrost: Move pan_invocation to shared panfrost/ The routines in this file have no dependency on Gallium. Let's share them so they can be used for a theoretical future Vulkan driver or, more immediately, consulted when tracing. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig	d9f33951df	pan/decode: Don't print MALI_DRAW_NONE Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:51 -07:00
Alyssa Rosenzweig	740f86c9ee	pan/decode: Eliminate DYN_MEMORY_PROP It's obvious that it's linked by virtue of us printing the struct it links against. No need to repeat ourselves. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 08:40:51 -07:00
Alejandro Piñeiro	41549a18e6	i965: Enable OpenGL 4.6 for Gen8+ The last remaining stuff was ARB_gl_spirv and ARB_spirv_extensions. Note that it is really likely that we can enable it for some Gen7 (as 4.5 was), but it was not tested yet. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-21 17:29:42 +02:00
Alejandro Piñeiro	7dab76014a	mesa/version: uncomment SPIR-V extensions As they are implemented on i965, so we can expose 4.6. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-21 17:29:42 +02:00
Alejandro Piñeiro	2e8565bead	i965: enable ARB_gl_spirv extension and ARB_spirv_extensions for gen7+ v2: squashed the two enable patches with the docs one (Jason) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-21 17:29:42 +02:00
Tomeu Vizoso	4109a2f612	panfrost/ci: Print load stats To help make sure we are running tests in the ideal number of threads, print load stats to make it obvious when there's a problem with utilization. This will be especially useful when we run tests on a wider variety of devices. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 16:41:56 +02:00
Tomeu Vizoso	3794652385	panfrost/ci: Install qemu-arm-static into chroot Some runners may be configured such that the qemu binary might not be available by the time we need to start running commands within the chroot. So make sure that it's there to avoid surprising problems in that case. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 16:41:56 +02:00
Tomeu Vizoso	8496045adc	panfrost/ci: Build kernel with CONFIG_DETECT_HUNG_TASK There's lots of locking changes going into the Panfrost kernel driver, so better be prepared. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 16:41:56 +02:00
Tomeu Vizoso	a074513dc2	panfrost/ci: Print bootstrap log A number of things can go wrong when building the rootfs from within a non-native chroot, so make sure to print the bootstrap.log so we can tell what's going on. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 16:41:56 +02:00
Tomeu Vizoso	76af465e57	panfrost/ci: Use Volt-based runner for dEQP tests It's able to run tests in parallel, fully utilizing the HW and considerably shortening the time it takes. Needed to disable tests in RK3288 for now because Volt doesn't support armhf yet, though this should be fixed soon. Tests are now run with --deqp-gl-config-name=rgba8888d24s8ms0, so we are hitting a few more failures in tests that previously were being skipped. The time to run the tests decreases from around 8 minutes to 1:45 minutes, allowing for extending coverage without increasing CI times too much. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-21 16:41:56 +02:00
Samuel Pitoiset	29834fe8a2	radv: implement VK_AMD_shader_core_properties2 Trivial extension that matches PAL. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-21 15:14:29 +02:00
Samuel Pitoiset	a6ad9e8ccf	radv: force enable VK_AMD_shader_ballot for Wolfenstein Youngblood This gives a nice boost, +20% at this time on my Vega 56. Shader ballot should be enabled by default at some point but it reduces performance a bit (-6%) with Wolfenstein II. Enable it only for Youngblood at the moment, like what we did for Talos in the past. As a bonus point, it gets rid of some minor artifacts that only happen when ballot is disabled for some reason. Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-21 15:14:29 +02:00
Samuel Pitoiset	f202ac27a9	radv: add a new debug option called RADV_DEBUG=noshaderballot Shader ballot will be enabled by default for Wolfenstein Youngblood. This follows what we did for sisched. Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-21 15:14:29 +02:00
Samuel Pitoiset	e73d863a66	radv: allow to enable VK_AMD_shader_ballot only on GFX8+ Scans aren't implemented on SI/CIK. Cc: 19.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-21 15:14:29 +02:00
Danylo Piliaiev	e71fc7f238	nir/loop_analyze: Treat do{}while(false) loops as 0 iterations Loops like: block block_0: vec1 32 ssa_2 = load_const (0x00000020) vec1 32 ssa_3 = load_const (0x00000001) loop { vec1 32 ssa_7 = phi block_0: ssa_3, block_4: ssa_9 vec1 1 ssa_8 = ige ssa_2, ssa_7 if ssa_8 { break } else { } vec1 32 ssa_9 = iadd ssa_7, ssa_1 } were treated as having more than 1 iteration and produced wrong results after unrolling; however, such a loop will exit during the first iteration if not unrolled. So we check whether the loop will actually loop. Fixes tests/shaders/glsl-fs-loop-while-false-02.shader_test Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-08-21 11:01:15 +00:00
Danylo Piliaiev	84b3ef6a96	nir/loop_unroll: Prepare loop for unrolling in wrapper_unroll Without loop_prepare_for_unroll loops are losing phis. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111411 Fixes: `5db98195` "nir: add loop unroll support for wrapper loops" Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-08-21 10:43:27 +00:00
Danylo Piliaiev	8869f44e9a	nir/loop_unroll: Update the comments for loop_prepare_for_unroll The comments say that we should remove continue if it is the last instruction in a loop; however, we remove any kind of jump. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-08-21 10:43:27 +00:00
Bas Nieuwenhuizen	e04761d0f9	radv: Emit VGT_GS_ONCHIP_CNTL for tess on GFX10. Otherwise hangs are possible. This register was already set for GS and NGG. Fixes: `5eaed7ecfc` "radv/gfx10: enable support for NAVI10, NAVI12 and NAVI14" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-21 09:51:47 +00:00
Bas Nieuwenhuizen	2e763f7c87	radv: Use correct vgpr_comp_cnt for VS if both prim_id and instance_id are needed. Should take the max of the 2. Fixes: `ea337c8b7e` "radv/gfx10: fix VS input VGPRs with the legacy path" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-21 09:38:46 +00:00
Daniel Schürmann	7fa1740035	nir/algebraic: some subtraction optimizations Changes with RADV/ACO: Totals from affected shaders: SGPRS: 444087 -> 455543 (2.58 %) VGPRS: 436468 -> 436768 (0.07 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 13448928 -> 13353520 (-0.71 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 68060 -> 67979 (-0.12 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-08-21 08:51:49 +00:00
Lionel Landwerlin	8a2465e3f3	radeonsi: take reference glsl types for compile threads An application quitting before destroying its GL context and binding a NULL context might still have a radeonsi compiler thread running and potentially still accessing the types. Therefore take a reference for the duration of the threads' lifetime. v2: Only ref the glsl types, the builtins should be used by the time shader data gets to a gallium driver. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-21 09:44:10 +02:00
Lionel Landwerlin	e4da8b9c33	mesa/compiler: rework tear down of builtin/types The issue we're running into when running CTS is that glsl types are deleted while builtins depending on them are not. This happens because on one hand we have glsl types ref counted, but builtins are not. Instead builtins are destroyed when unloading libGL or explicitly calling glReleaseShaderCompiler(). This change removes almost entirely any dealing with glsl types ref/unref by letting the builtins deal with it instead. In turn we introduce a builtin ref count mechanism. Each GL context takes a reference on the builtins when compiling a shader for the first time. It releases the reference when the context is destroyed. It can also explicitly release those when glReleaseShaderCompiler() is called. Finally we also take a reference on the glsl types when loading libGL to avoid recreating glsl types too often. v2: Ensure we take a reference if we don't have one in link step (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110796 Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-08-21 09:44:10 +02:00
Lionel Landwerlin	9f37bc419c	compiler: ensure glsl types are not created without a reference We want to detect invalid refcounting so assert we have at least one use before creating types. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-08-21 09:44:10 +02:00
Lionel Landwerlin	8b913bd1ce	nir/tests: take reference on glsl types Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-08-21 09:44:10 +02:00
Lionel Landwerlin	3ade8f0040	glsl/tests: take refs on glsl types Much like each driver, tests as standalone entities must take references on the glsl types. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-08-21 09:44:10 +02:00
Samuel Pitoiset	41d9873459	radv/gfx10: hardcode some depth+stencil formats in the format table The script doesn't handle them correctly and D16_UNORM_S8_UINT isn't supported by the hardware, mark it as invalid. This fixes warning when generating gfx10_format_table.h. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111393 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-21 08:17:40 +02:00
Samuel Pitoiset	1650e747c6	radv/gfx10: tidy up gfx10_format_table.py Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-21 08:17:38 +02:00
Ilia Mirkin	958390a9bf	gallium/vl: use compute preference for all multimedia, not just blit The compute paths in vl are a bit AMD-specific. For example, they (on nouveau), try to use a BGRX8 image format, which is not supported. Fixing all this is probably possible, but since the compute paths aren't in any way better, it's difficult to care. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213 Fixes: `9364d66cb7` (gallium/auxiliary/vl: Add video compositor compute shader render) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-20 23:51:39 -04:00
Emil Velikov	cca442f3ba	docs: update calendar for 19.2.x Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2019-08-20 23:14:53 +01:00
Emil Velikov	a3d42ad248	docs: add 19.3.0-devel release notes template Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2019-08-20 22:39:25 +01:00
Emil Velikov	e6c0b493d2	mesa: bump version to 19.3.0-devel Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2019-08-20 22:33:49 +01:00
Erico Nunes	71fb721ca5	lima/ppir: use ra_get_best_spill_node to select spill node ra_get_best_spill_node is what other users of the mesa register allocator use. Switching to it now also fixes an infinite loop issue with ppir regalloc with the ppir control flow patchset, and also provides a small gain over the previous heuristic on number of spilled nodes testing with shader-db. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-20 21:16:02 +00:00
Eric Anholt	c1dc84e71d	tgsi: Remove unused tgsi_check_soa_dependencies(). Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2019-08-20 13:31:13 -07:00
Eric Anholt	4ebe6b2e72	tgsi: Drop the SSE2 constants setup that's been dead code since 2011. The SSE2 executor was removed in `4eb3225b38` ("Remove tgsi_sse2.") Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2019-08-20 13:31:13 -07:00
Eric Anholt	98c58355d3	tgsi: drop a stale comment This was fixed in `912ed84f83` ("tgsi: move to using vector for system values.") Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2019-08-20 13:31:13 -07:00
Eric Anholt	553cd82d64	gitlab-ci: Enable the GLES2/3 CTS on softpipe. The GLES2 CTS takes about 8 minutes of total runtime (at parallel 4 is ~2 minutes in the test stage if runners are free), while GLES3 takes about 25. Since the GLES3 run is pretty expensive, just do a cheap touch test of 1 out of every 10 tests in the test list on MRs, until we can get the runtime down. v2: Drop the full run for now until we can bring runtime down or bring up a dedicated mesa runner. Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v1) Reviewed-By: Gert Wollny <gert.wollny@collabora.com> (v1)	2019-08-20 13:31:13 -07:00
Jose Maria Casanova Crespo	6c904773fe	mesa: reverse no_error on compressed_tex_sub_image for TEX_MODE_CURRENT This fixes the regression introduced on "mesa: refactor compressed_tex_sub_image function" that started to crash KHR-GLES2.texture_3d.compressed_texture.negative_compressed_tex_sub_image Fixes: `7df233d68d` ("mesa: refactor compressed_tex_sub_image function") Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-20 20:45:21 +01:00
Adam Jackson	b283919398	glx: Eliminate glx_config::{rgb,float,colorIndex}Mode These are redundant with glx_config::renderType, let's just use that consistently.	2019-08-20 14:05:07 -04:00
Adam Jackson	74ca87e4bc	glx: Remove unused glx_config::pixmapMode Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-20 14:05:03 -04:00
Adam Jackson	35fc7bdf0e	glx: convert glx_config_create_list to one big calloc Simpler, less failure prone, less malloc overhead, what's not to like. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-20 14:05:01 -04:00
Adam Jackson	97d58eabcc	glx: convert a malloc+memset to calloc Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-20 14:04:59 -04:00
Adam Jackson	cabd09c9e7	glx: Fix parameter documentation of glx_config_create_list 'minimum_size' is not, in fact, an argument to this function. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-20 14:04:56 -04:00
Arcady Goldmints-Orlov	3835535537	anv: inline uniforms blocks don't count toward descriptor set limits In a descriptor set, inline uniform blocks don't use up any bindings. However, the presence of any inline uniform blocks does require the use of the descriptor buffer, which takes up one binding. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-20 16:48:45 +00:00
Daniel Schürmann	df86c5ffb3	nir: add divergence analysis pass. This pass expects the shader to be in LCSSA form. The algorithm is based on 'The Simple Divergence Analysis' from Diogo Sampaio, Rafael De Souza, Sylvain Collange, Fernando Magno Quintão Pereira. Divergence Analysis. ACM Transactions on Programming Languages and Systems (TOPLAS) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-08-20 17:40:13 +02:00
Rhys Perry	7b07034931	nir/subgroups: Lower clustered reductions with cluster_size >= subgroup_size into reductions The behavior for reductions with cluster_size >= subgroup_size is implementation defined. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-20 17:40:10 +02:00
Rhys Perry	911a1dfad2	nir/lcssa: allow to create LCSSA phis for loop-invariant booleans ACO depends on LCSSA phis for divergent booleans to work correctly. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-20 17:40:05 +02:00
Daniel Schürmann	9c40ad49d5	nir/lcssa: Skip loop invariant variables when converting to LCSSA. Co-authored-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-20 17:40:01 +02:00
Rhys Perry	8a6cfaa15a	nir: make nir_to_lcssa() a general NIR pass. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-20 17:39:54 +02:00
Daniel Schürmann	204846ad06	nir/lcssa: handle deref instructions properly Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: `414148cdc1` "nir: Support deref instructions in loop_analyze"	2019-08-20 17:39:52 +02:00
Jose Maria Casanova Crespo	7c56a68c8b	tgsi_to_nir: only update TGSI properties of the current shader stage The implementation introduced in "tgsi_to_nir: be careful about not losing any TGSI properties silently (v2)" updates all the TGSI properties, but it didn't take into account that the shader_info structure uses a union to store the different attributes for each shader stage. Now we only update the attributes if they affect the current shader stage, which avoids overwriting members of the union that should not be overwritten. This has created hundreds of regressions in v3d. For example, TGSI_PROPERTY_VS_BLIT_SGPRS_AMD was overwriting the same position used by TGSI_PROPERTY_CS_FIXED_BLOCK_DEPTH. Fixes: `e300365197` ("tgsi_to_nir: be careful about not losing any TGSI properties silently (v2)") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-20 10:30:21 +00:00
Samuel Pitoiset	83a63a5b12	radv/gfx10: do not emit PA_SC_TILE_STEERING_OVERRIDE twice CLEAR_STATE emits it for us. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-20 12:13:44 +02:00
Samuel Pitoiset	2ca8629fa9	radv: do not emit PKT3_CONTEXT_CONTROL with AMDGPU 3.6.0+ It's emitted by the kernel. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-20 12:13:41 +02:00
Gert Wollny	6a09405368	mesa/program: Take ARB_framebuffers_no_attachments into account in wpos correction If a drawbuffer is an fbo without an attachment then its 'Height' will be zero, and we have to take its 'DefaultGeometry.Height' into account. Fixes on softpipe (with the exception of tests that use multisample): dEQP-GLES31.functional.fbo.no_attachments.* Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-20 10:04:24 +02:00
Sagar Ghuge	fe0e9db797	iris: Enable non coherent framebuffer fetch on broadwell v2: Use GEN_GEN in iris_state (Kenneth Graunke) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-20 00:50:58 -07:00
Sagar Ghuge	57ce422e20	iris: Free resource if failed to allocate surface state Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-20 00:50:55 -07:00
Sagar Ghuge	02244bc515	iris: Pass isl_surf to fill_surface_state Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Suggested-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-20 00:50:45 -07:00
Sagar Ghuge	638a157e02	iris: Add infrastructure to support non coherent framebuffer fetch Create separate SURFACE_STATE for render target read in order to support non coherent framebuffer fetch on broadwell. Also we need to resolve framebuffer in order to support CCS_D. v2: Add outputs_read check (Kenneth Graunke) v3: 1) Import Curro's comment from get_isl_surf 2) Rename get_isl_surf method 3) Clean up allocation in case of failure Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-20 00:50:44 -07:00
Sagar Ghuge	61c0637afb	iris: Add helper functions to get tile offset All helper functions are ported from i965 driver. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-20 00:50:43 -07:00
Sagar Ghuge	7e816991cc	iris: Add helper function to get isl dim layout v2: Add missing space (Caio) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-20 00:50:41 -07:00
Sagar Ghuge	58471e20d2	iris: Add render target read entry in binding table This will be used in next patches for supporting non coherent framebuffer fetch on Broadwell. v2: Fix comment (Kenneth Graunke) v3: 1) Fix a few nits (Caio) 2) Add comment (Caio) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-20 00:50:31 -07:00
Kai Wasserbäch	1abe87383e	build: Bump C++ standard requirement to C++14 to fix FTBFS with LLVM 10 When building Mesa against a recent LLVM 10 with C++11, the build fails if the AMD common code is built as well due to "std::index_sequence" being undeclared. LLVM requires a minimum of C++14. Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-08-20 05:39:19 +00:00
Rob Herring	d0ec5d38f6	panfrost: Add madvise support to BO cache The kernel now supports madvise ioctl to indicate which BOs can be freed when there is memory pressure. Mark BOs purgeable when they are in the BO cache. The BOs must also be munmapped when they are in the cache or they cannot be purged. We could optimize avoiding the madvise ioctl on older kernels once the driver version bump lands, but probably not worth it given the other driver features also being added. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Signed-off-by: Rob Herring <robh@kernel.org>	2019-08-19 19:33:20 -05:00
Rob Herring	c45c2d7960	panfrost: Sync UAPI header from kernel Sync the panfrost_drm.h UAPI header with the latest from the kernel. This adds madvise ioctl and GPU feature params. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Signed-off-by: Rob Herring <robh@kernel.org>	2019-08-19 19:33:20 -05:00
Pierre-Eric Pelloux-Prayer	0f07d18e48	mesa: add ext_dsa GetMultiTexLevelParameterEXT Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-19 18:50:08 -04:00
Pierre-Eric Pelloux-Prayer	e8c5dc9c24	mesa: add EXT_dsa glCompressedMultiTex* functions display list support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-19 18:50:07 -04:00
Pierre-Eric Pelloux-Prayer	1cb8e12717	mesa: add EXT_dsa glCompressedMultiTex* functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-19 18:50:05 -04:00
Pierre-Eric Pelloux-Prayer	a886025ef5	mesa: add EXT_dsa glCompressedTex* functions display list support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-19 18:50:03 -04:00
Pierre-Eric Pelloux-Prayer	8c76221886	mesa: add EXT_dsa glCompressedTexture(Sub)Image1D/2D/3D functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-19 18:49:57 -04:00
Pierre-Eric Pelloux-Prayer	7df233d68d	mesa: refactor compressed_tex_sub_image function Combine compressed_tex_sub_image, compressed_tex_sub_image_error and compressed_tex_sub_image_no_error in a single function. The added "enum tex_mode mode" parameter allows implementing the DSA / non-DSA variants and their error/no_error combinations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-19 18:49:43 -04:00
Bas Nieuwenhuizen	6c5d983865	radv: Add Renoir support. Took the freedom to enable dfsm even though I don't have benchmark results yet, but it seems Raven-like. Rest is from radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-19 22:34:11 +00:00
Marek Olšák	223b3174bd	radeonsi/nir: always lower ballot masks as 64-bit, codegen handles it This fixes KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks. This solution is better, because the IR isn't dependent on wave32.	2019-08-19 17:23:38 -04:00
Marek Olšák	5d37194d43	radeonsi: remove the unsafemath debug option unlikely to be used in the future Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-19 17:23:38 -04:00
Marek Olšák	5586411de4	radeonsi/nir: fix counting shader inputs & outputs	2019-08-19 17:23:38 -04:00
Marek Olšák	452cb7055f	radeonsi/nir: fix assertion in si_nir_load_sampler_desc	2019-08-19 17:23:38 -04:00
Marek Olšák	1f8a661748	radeonsi: clean up si_llvm_context_set_tgsi Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-19 17:23:38 -04:00
Marek Olšák	43f8b5642b	radeonsi: allocate and resize global_buffers as needed Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-19 17:23:38 -04:00
Marek Olšák	c315cb509d	radeonsi/gfx10: don't set PA_SC_TILE_STEERING_OVERRIDE if CLEAR_STATE sets it Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-19 17:23:38 -04:00
Marek Olšák	5a2e65be89	radeonsi: don't emit PKT3_CONTEXT_CONTROL on amdgpu Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-19 17:23:38 -04:00
Marek Olšák	8d0d753bd0	radeonsi: fix an assertion failure: assert(!res->b.is_shared) This only appears to happen on Raven2. Possible way to reproduce: resource_get_handle(WINSYS_HANDLE_TYPE_KMS) --> sets is_shared = true resource_get_handle(WINSYS_HANDLE_TYPE_DMABUF) --> fail Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>	2019-08-19 17:23:38 -04:00
Marek Olšák	bdcbac9459	radeonsi: handle the use_ngg_streamout flag in si_update_ngg	2019-08-19 17:23:38 -04:00
Marek Olšák	a6b3ca1c70	radeonsi: move the tess factor ring size assertion to a place where it matters Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-19 17:23:38 -04:00
Marek Olšák	21217efdfe	ac/nir: set image=true when loading FMASK for images Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-19 17:23:38 -04:00
Christian Gmeiner	f52b9218ff	etnaviv: rs: add support for 64bpp clears Starting with HALTI2 the RS supports 64bpp clears. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-08-19 22:36:45 +02:00
Christian GMEINER	7492685b1b	etnaviv: update headers from rnndb Update to etna_viv commit c51353e. Signed-off-by: Christian GMEINER <christian.GMEINER@bachmann.info> Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-08-19 22:36:45 +02:00
Eric Anholt	1395503424	swrast: Make the fetch funcs table sparse. This shrinks the table, avoids needing to update the table with NULL entries on every MESA_FORMAT addition, and removes a surprising, non-unit-tested format number ordering dependency. Acked-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2019-08-19 11:48:03 -07:00
Eric Anholt	c45c33a5a2	gallium: Remove manual defining of PIPE_FORMAT enum values. Now that SVGA doesn't have a table that has to be in PIPE_FORMAT order, we can let the enums have whatever values they naturally would without worrying about holes. Acked-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2019-08-19 11:48:01 -07:00
Eric Anholt	84db6ba740	svga: Drop unsupported formats from the format table. Now that we're using the array initializers, we don't need to manually fill out all these stub entries. Produced with "sed -i '/.INVALID.INVALID.*INVALID/d' src/gallium/drivers/svga/svga_format.c" Acked-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2019-08-19 11:43:02 -07:00
Eric Anholt	ef37da52c0	svga: Remove duplication in the format table. By using the [ ] = {} array initializer syntax, we no longer need the entries to be listed in PIPE_FORMAT_* value order. This means that people adding new gallium formats don't need to cargo-cult changes to this driver or regress that non-unit-tested requirement. While I'm here, drop the lines for formats that no longer exist (the numbered ones in the table). Acked-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2019-08-19 11:42:55 -07:00
Eric Anholt	42efa789b5	svga: Factor out the format conversion table entry lookup. Seemed like a sensible cleanup, while I was looking at whether I could make the table sparse. To make the svga table not require fixups on every new gallium format, we may want to change how it's populated. Acked-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2019-08-19 11:42:36 -07:00
Jason Ekstrand	5167e94f23	nir: Add more source types to nir_tex_instr_src_type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 17:03:34 +00:00
Alyssa Rosenzweig	2bb4dc4054	pan/midgard: Compute liveness per-block Rather than using a regalloc based on live internals, computed hastily with repeated invocations of a forward-analysis pass, we switch to compute liveness information on a per-block basis. Within a given basic block, we compute liveness backwards with a linear-time algorithm; for common shaders, this may help RA terminate quicker. Across blocks, we use a work list (really a work set) and check if we're making progress. This isn't terribly efficient, but it gets the job done. Point is, we get the live_in/live_out for each block. From there, it's simple to rerun the linear-time update algorithm to compute the interference graph. The benefit of this technique is the ability to ignore "gaps" in liveness across intermediate blocks that are never executed. On simple shaders like the loops in glmark, this results in a minor reduction in register pressure. The motivation was a complex shader in Krita that failed register allocation due to an unfortunate interaction between texture pipeline registers and control flow. This shader now compiles successfully. total instructions in shared programs: 3439 -> 3438 (-0.03%) instructions in affected programs: 22 -> 21 (-4.55%) helped: 1 HURT: 0 total bundles in shared programs: 2077 -> 2076 (-0.05%) bundles in affected programs: 12 -> 11 (-8.33%) helped: 1 HURT: 0 total quadwords in shared programs: 3457 -> 3456 (-0.03%) quadwords in affected programs: 20 -> 19 (-5.00%) helped: 1 HURT: 0 total registers in shared programs: 341 -> 338 (-0.88%) registers in affected programs: 9 -> 6 (-33.33%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 33.33% max: 33.33% x̄: 33.33% x̃: 33.33% Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	24c91bb54b	pan/midgard: Analyze load/store for swizzle propagation If there's a nontrivial swizzle fed into an extra (shortened) argument, we bail on copyprop. No glmark changes (since it doesn't use fancy texturing/loads). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	9ae4d3653e	pan/midgard: Treat cubemaps "stores" as loads It's always been ambiguous which they are, but their primary register is their output, not their input; therefore, they are loads. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	20dd482668	pan/midgard: Clamp cubemap swizzle to XYXX Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	2788721cc4	pan/midgard: Clamp st_vary swizzle by number of components Same issue with liveness analysis. If we store out a vec3, we should not reference the .w component. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	edc8e41566	pan/midgard: Use type-appropriate swizzle for texture coordinate The texture coordinate for a 2D texture could be a vec2 or a vec3, depending if it's an array texture or not. If it's vec2 (non-array texture), we should not reference the z component; otherwise, liveness analysis will get very confused when z is never written. v2: Fix typo (Ilia). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	2bcb3d9226	pan/midgard: Set mask for lowered read-hazard moves If we need to lower a move for a read from a vec2 texture coordinate, we shouldn't write zw, even incidentally. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	739e09c297	pan/midgard: Fix texw lowering with complex control flow Fixes shaders with control flow like: out = 0; if (A) { if (B) out = texture(A, ...) } else { out = texture(B, ...) } Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	6f1c8c148d	pan/midgard: Add mir_rewrite_index_dst_single helper Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	d68019ad1f	pan/midgard: Print predecessors in MIR Just as a sanity check. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	e3a418fe86	pan/midgard: Index blocks for printing Better than having pointers flying about. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	2f92479ffc	pan/midgard: Add mir_foreach_src This is repeated often enough. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	84580c6dbc	pan/midgard: Add mir_foreach_instr_in_block_rev Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	c8c4471a92	pan/midgard: Add mir_foreach_successor helper Now we should be able to walk the control-flow graph naturally. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	b8e526c520	pan/midgard: Add mir_foreach_predecessor utility It's ugly, but c'est la vie. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	b4b2e111f8	pan/midgard: Link exit block The exit block has been 'dangling' in the successors graph, so let's ensure it's linked in. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	07c960cac0	pan/midgard: Add mir_exit_block helper The exit block is guaranteed to be empty, signaling the end of the program. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	aeeeef1242	pan/midgard: Maintain block predecessor set While we already compute the successors array, for backwards data flow analysis, it is useful to walk the control flow graph backwards based on predecessors, so let's compute that information as well. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	4fa09329c1	pan/midgard: Use ralloc on ctx/blocks This will allow us to get some level of automatic memory management. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig	b59b1793b8	pan/midgard: Shrink successors[] to 2 length A block can't have more. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-19 08:32:17 -07:00
Roman Stratiienko	fdd6151612	nir: Add missing dependency in Android.nir.gen.mk Fixes incremental build with Android Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-08-19 09:53:18 +03:00
Erico Nunes	99d5bdcfa5	meson: build lima tools as part of 'all' tools This is primarily so that this build gets tested in CI and we don't break it again. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-08-18 22:27:55 +02:00
Connor Abbott	c550d367a7	ac/nir: Fix store_scratch with a non-full writemask By adding one more helper to ac_llvm_build, we can also easily keep vector stores together. Fixes the tests/spec/glsl-1.30/execution/fs-large-local-array-vec4.shader_test piglit test. Fixes: `74470baebb` ("ac/nir: Lower large indirect variables to scratch") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-18 15:15:45 +02:00
Vasily Khoruzhick	0e394cda0d	glsl/standalone: init shader stage in init_gl_program() Otherwise lima standalone compiler fails when trying to compile fragment shader with: lima_compiler: ../src/compiler/nir/nir.c:55: nir_shader_create: Assertion `si->stage == stage' failed Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-17 11:14:40 -07:00
Jason Ekstrand	16edd02bfa	iris: Only request an input mask if the shader needs it Fixes: `aebca3961b` "iris: Fix handling of SIMD32 fragment shaders" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-16 19:59:42 -05:00
Xiong, James	dcad15ff54	gallium: add back YVU support PIPE_FORMAT_YV12 is not handled so switching to PIPE_FORMAT_IYUV and adding back YVU support. Signed-off-by: James Xiong <james.xiong@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-16 13:24:49 -07:00
Erico Nunes	7a51abab42	lima: actually wait for bo in lima_bo_wait PIPE_TIMEOUT_INFINITE is unsigned and gets assigned to signed fields where it ends up as -1. When this reaches the kernel as a timeout it gets translated as no timeout, which causes the waiting functions to return immediately and not actually wait for a completion. This seems to cause unstable results with lima where even piglit tests randomly fail. Handle this by setting the signed max value in case of infinite timeout. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-08-16 16:31:29 +02:00
Rhys Perry	0a790c3019	nir/algebraic: add a few masking-before-unpack optimizations Helps some Dawn of War 3 and F1 2017 shaders with ACO: Totals from affected shaders: SGPRS: 2136 -> 2128 (-0.37 %) VGPRS: 1624 -> 1628 (0.25 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 168068 -> 164332 (-2.22 %) bytes LDS: 44 -> 44 (0.00 %) blocks Max Waves: 222 -> 221 (-0.45 %) Wait states: 0 -> 0 (0.00 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-16 12:13:01 +01:00
Vasily Khoruzhick	861c2b8d31	lima: fix compilation of standalone compiler Fixes: e0aeee946004("lima: add summary report for shader-db") Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-15 16:59:51 -07:00
Bas Nieuwenhuizen	b9fb90e6d3	Revert "radv/gfx10: Enable DCC for storage images." Quite useless without DCC for LAYOUT_GENERAL. Fixes: `b4dad3afaa` Revert "radv: Do not decompress on LAYOUT_GENERAL." Acked-by: Dave Airlie <airlied@redhat.com>	2019-08-16 01:22:54 +02:00
Bas Nieuwenhuizen	b4dad3afaa	Revert "radv: Do not decompress on LAYOUT_GENERAL." Causes issues with a bunch of games with DXVK. Fixes: `50add1b33a` "radv: Do not decompress on LAYOUT_GENERAL." Acked-by: Dave Airlie <airlied@redhat.com>	2019-08-16 01:22:35 +02:00
Dave Airlie	f3af7886fe	mesa: add support for CET to x86/x86-64 asm files. Control-flow enforcement technology is a set of new instructions on x86 processors to denote where indirect jumps can land. Gcc automatically adds the instruction (which encodes as a NOP on older CPUs) to entrypoints, but assembler files need it added manually. This adds it to all the entry points in the mesa x86/x86-64 assembler files. This will only happen if mesa is built with the -fcf-protection flag to gcc, as some distros want to do. Acked-by: Eric Anholt <eric@anholt.net>	2019-08-16 09:00:35 +10:00
Alyssa Rosenzweig	78eda70892	pan/bifrost: Manually constant fold register class Fixes errors for some people building Mesa: ../src/panfrost/bifrost/bifrost_sched.c:32:31: error: initializer element is not constant const unsigned max_vec2_reg = max_primary_reg / 2; ../src/panfrost/bifrost/bifrost_sched.c:33:31: error: initializer element is not constant const unsigned max_vec3_reg = max_primary_reg / 4; // XXX: Do we need to align vec3 to vec4 boundary? ../src/panfrost/bifrost/bifrost_sched.c:34:31: error: initializer element is not constant const unsigned max_vec4_reg = max_primary_reg / 4; ../src/panfrost/bifrost/bifrost_sched.c:35:32: error: initializer element is not constant const unsigned max_registers = max_primary_reg + ../src/panfrost/bifrost/bifrost_sched.c:40:28: error: initializer element is not constant const unsigned vec2_base = primary_base + max_primary_reg; ../src/panfrost/bifrost/bifrost_sched.c:41:28: error: initializer element is not constant const unsigned vec3_base = vec2_base + max_vec2_reg; ../src/panfrost/bifrost/bifrost_sched.c:42:28: error: initializer element is not constant const unsigned vec4_base = vec3_base + max_vec3_reg; ../src/panfrost/bifrost/bifrost_sched.c:43:27: error: initializer element is not constant const unsigned vec4_end = vec4_base + max_vec4_reg; Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-15 19:06:35 +00:00
Erik Faye-Lund	18ab42644b	gallium/util: widen type before multiplication This method returns size_t, but the multiplication multiplies two integers, leading to overflow rather than type widening. Noticed by compiling with MSVC, which emits a warning. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-15 20:23:53 +02:00
Erik Faye-Lund	0091f62978	mesa: avoid warning on Windows On Windows, p_atomic_inc_return returns an unsigned long long rather than the type the pointer refers to, so let's make sure we cast the result to the right type. Otherwise, we'll trigger a warning about the wrong format-string for the type. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-15 20:23:49 +02:00
Erik Faye-Lund	544b088616	win32: unify strcasecmp definitions There were two incompatible definitions of strcasecmp, which led to a compiler warning. Let's clean this up by only leaving one of them, and using that one all the time. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-15 20:23:44 +02:00
Erik Faye-Lund	ecd312be96	mesa/main: avoid warning when casting offset to pointer This generates a warning on some 64-bit systems, so let's cast to a properly sized integer first. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-15 20:23:39 +02:00
Erik Faye-Lund	c646cd4bac	nir: avoid warning when casting bogus pointer This intentionally-bogus pointer generates a warning on some 64-bit systems, so let's cast to a properly-sized integer first. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-15 20:23:35 +02:00
Erik Faye-Lund	b355eef920	glsl: fixup u64-warning Similarly to the unsigned-version, we need to first cast the result to a suitable integer type before negating the number, otherwise we'll trigger a warning. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-15 20:23:13 +02:00
Kenneth Graunke	f741de236b	isl: Enable Unorm Path in Color Pipe Improves performance on my Icelake 8x8 locked to 700Mhz. For example, some GfxBench5 subtests have the following results: - [i965] gl_manhattan: ................ 7.01119% +/- 0.180971% (n=5) - [i965] gl_4 (Car Chase): 4.24351% +/- 0.175622% (n=5) - [i965] gl_blending: ................ 3.36327% +/- 0.180267% (n=5) - [i965] gl_5_normal (Aztec Ruins): 1.67962% +/- 0.243534% (n=10) - [iris] gl_manhattan: ................ 3.92357% +/- 0.073965% (n=25) - [iris] gl_4 (Car Chase): 2.17746% +/- 0.0826858% (n=5) - [iris] gl_blending: ................ 2.79599% +/- 0.803652% (n=15) - [iris] gl_5_normal (Aztec Ruins): 1.30930% +/- 0.106523% (n=25) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-15 10:39:09 -07:00
Rafael Antognolli	ceeaf93c8e	anv: Properly initialize device->slice_hash. When subslices_delta == 0 and we take the early return, device->slice_hash is not initialized on GEN11. It then causes a segfault when going through anv_DestroyDevice, if compiled with valgrind. Fixes: `7bc022b4bb` ("anv/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-15 09:42:48 -07:00
Danylo Piliaiev	72354d43d4	intel/compiler: Fix resource leak in error path CID: 1452261 Fixes: `04a99515` "intel/compiler: add ability to override shader's assembly" Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-08-15 08:17:36 +00:00
Alyssa Rosenzweig	44a6c38bd6	panfrost: Implement native RECT textures We started honouring the normalized_coords flag in the texture descriptor, but a bisection revealed that broke RECT textures -- since we were also lowering them in the shader. So just remove the shader-based lowering, use native RECT textures, and enjoy the nominal reduction in complexity and performance boost. Fixes: `3e47a1181b` ("panfrost: Add MALI_SAMP_NORM_COORDS flag") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:57:42 -07:00
Alyssa Rosenzweig	6fe4822cca	panfrost: Add R10G10B10A2_SSCALED vertex format Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig	e823a47f02	pan/midgard: Disassemble UBO index explicitly It's a bit of a special case but that's fine. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig	3d54ed2488	pan/midgard: Account for unaligned UBOs when promoting uniforms We only know how to promote aligned accesses, although theoretically we should be able to promote unaligned to swizzles in the future. Check this. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig	03350eb8b8	pan/midgard: Add mir_ubo_shift helper Different UBO reads have different shift requirements. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig	cf3bb10f51	pan/midgard: Address emit_ubo_read offset in bytes We'll want to be smarter about unaligned reads, so let's get this code all in one place. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig	65e6cb4eb0	pan/midgard: Wire writemask into UBO reads Helps the disassembly be clearer and maybe regalloc be smarter. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig	ec2f0b580f	pan/midgard: Identify UBO/SSBO op symmetry It's the same thing, just shifted. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig	375d4c2c74	panfrost: Extend blending to MRT Our hardware supports independent (per-RT) blending, but we need to route those settings through from Gallium. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig	dff4986b1a	pan/midgard: Emit store_output branch just-in-time We'll need multiple branches for MRT, so we can't defer. Also, we need to track dependencies to ensure r0 is set to the correct value for each store_output. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig	2fc44c4dc8	pan/midgard: Add dont_eliminate flag We need to treat fragment writes specially. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig	6ed3843224	pan/mfbd: Stuff in RT count Fixes DATA_INVALID_FAULTs with multiple render targets. We do always allocate space for 4 cbufs just to keep things sane. This may not be strictly necessary. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig	716be7862e	pan/decode: Dump FBD tagged pointer Turns out the rt count is stuffed in here.. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig	358372b256	pan/decode: Decode invalid access type upon fault We don't have a good way to confirm this, but it parallels the kernel definitions for MMU faults nicely. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:42:39 -07:00
Alyssa Rosenzweig	f5cc5ef404	pan/decode: Fix duplicate heap_end property This was supposed to read heap_start. It's the same value but still, better get this right. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:42:39 -07:00
Alyssa Rosenzweig	b78e04c17b	panfrost: Note "MFBD preload disable" bit It's a chicken bit, as far as I can tell. Buck buck. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 16:39:57 -07:00
Alyssa Rosenzweig	64720d1e9e	pan/bifrost: Link in compiler We enable the standalone compiler, build the new files, and let it blast. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig	b93fa7d232	pan/bifrost: Check in remainder of the Bifrost compiler What it says on the tin. Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig	0e126aa0f0	pan/bifrost: Add bifrost_print.c/h IR printers. Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig	d8d8b08fe5	pan/bifrost: Style format the disassembler $ astyle .c .h --style=linux -s8 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig	fca491c0e1	pan/bifrost: Stub out standalone compiler We don't actually have a standalone compiler in-tree yet, but let's get prepared for when we do. Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig	62bbc23da5	pan/bifrost: Sync disassembler with Ryan's tree The disassembler was updated to move common code with the compiler into a shared header. Additionally, some new ops and control registers relating to rounding were added. Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig	b73cbd6880	panfrost: Remove standalone pandecode tool Now that panwrap has gained the ability to trace directly without dumping to the filesystem, there's no need to lug around this tool. I can assure you nobody will miss it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig	6f4d796911	pan/midgard: Fix disassembly termination condition Fixes: `863bdd1f8d` ("pan/midgard: Break, not return, in disassembler") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig	de2efd5ea7	panfrost: Ensure we upload at least 1 blend RT Otherwise we'll get memory junk. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig	54438267c3	panfrost: Zero tripipe on initialize I don't think the hardware cares, but this adds a lot of noise to traces that we would rather not need to look at. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig	1ab6290746	pan/midgard: Improve disassembler robustness Some memory corruption / etc issues led to an accidental "fuzzing" of the disassembler ;) This uncovered some issues leading to a disassembler hang, so let's fix that. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig	9c4c7211a3	pan/decode: Split public.h out We want a defined ABI for tracing; this set of functions should be as small as strictly necessary to minimize panwrap shenanigans. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig	4f03728fb7	pan/decode: Prefer uint64_t to mali_ptr This removes an unwanted dependency on panfrost-job.h Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig	6c84a2665c	pan/midgard: Allocate spill_slot once Multiple spill moves share a single spill slot. Issue found in Krita. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 14:58:34 -07:00
Alyssa Rosenzweig	2a9031ea44	pan/midgard: Use hint on midgard_instruction for spill_move This allows us to have multiple spill moves, whereas otherwise for N spill moves, the first N-1 would be clobbered. Issue found in Krita. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 14:58:34 -07:00
Alyssa Rosenzweig	3e6f2e7aba	panfrost: Remove panfrost_add_dependency asserts It doesn't... make a ton of sense to need to assert and this routine is hotter than you might expect. Doesn't matter for release builds, of course. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 14:58:34 -07:00
Marek Olšák	aafc95ceb6	radeonsi: add support for Renoir Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-14 17:31:04 -04:00
Eric Engestrom	a3d6024199	meson: add nir tests to the compiler/nir test suite Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-14 22:17:06 +01:00
Eric Engestrom	d0916edfcb	EGL: sync headers with Khronos Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-14 21:48:23 +01:00
Christian Gmeiner	2c4fe6af78	relnotes: Add new ext on etnaviv for 19.2. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-08-14 21:47:35 +02:00
Christian Gmeiner	17200bb67a	etnaviv: fix weird indentation Fixes: `797a2e4fd0` ("etnaviv: update logic to determine uniform limits") Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-08-14 21:29:48 +02:00
Ian Romanick	0e6581b87d	nir/algebraic: Reassociate shift-by-constant of shift-by-constant v2: After some review discussion with Alyssa, the replacements now correctly account for cases where (b+c) >= bitsize. v3: Use a temporary to simplify the Python code quite a bit. Suggested by Jason. Haswell and all Gen8+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16251155 -> 16249576 (<.01%) instructions in affected programs: 232627 -> 231048 (-0.68%) helped: 547 HURT: 1 helped stats (abs) min: 1 max: 15 x̄: 2.89 x̃: 3 helped stats (rel) min: 0.04% max: 7.84% x̄: 1.14% x̃: 1.06% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12% 95% mean confidence interval for instructions value: -3.12 -2.65 95% mean confidence interval for instructions %-change: -1.20% -1.06% Instructions are helped. total cycles in shared programs: 365924392 -> 365372103 (-0.15%) cycles in affected programs: 59207053 -> 58654764 (-0.93%) helped: 497 HURT: 34 helped stats (abs) min: 1 max: 29300 x̄: 1118.16 x̃: 16 helped stats (rel) min: <.01% max: 10.59% x̄: 1.82% x̃: 1.82% HURT stats (abs) min: 2 max: 424 x̄: 101.03 x̃: 63 HURT stats (rel) min: 0.07% max: 46.17% x̄: 4.72% x̃: 2.06% 95% mean confidence interval for cycles value: -1426.41 -653.77 95% mean confidence interval for cycles %-change: -1.66% -1.15% Cycles are helped. total spills in shared programs: 8870 -> 8871 (0.01%) spills in affected programs: 104 -> 105 (0.96%) helped: 0 HURT: 1 Ivy Bridge and all pre-Gen7 platforms had similar results. (Ivy Bridge shown) total instructions in shared programs: 11956236 -> 11955635 (<.01%) instructions in affected programs: 94110 -> 93509 (-0.64%) helped: 106 HURT: 0 helped stats (abs) min: 1 max: 14 x̄: 5.67 x̃: 4 helped stats (rel) min: 0.12% max: 4.71% x̄: 1.96% x̃: 0.76% 95% mean confidence interval for instructions value: -6.62 -4.72 95% mean confidence interval for instructions %-change: -2.27% -1.64% Instructions are helped. total cycles in shared programs: 179296340 -> 178788044 (-0.28%) cycles in affected programs: 51009603 -> 50501307 (-1.00%) helped: 82 HURT: 7 helped stats (abs) min: 5 max: 27820 x̄: 6199.00 x̃: 16 helped stats (rel) min: 0.30% max: 8.16% x̄: 2.58% x̃: 3.11% HURT stats (abs) min: 2 max: 8 x̄: 3.14 x̃: 2 HURT stats (rel) min: 0.02% max: 1.40% x̄: 0.34% x̃: 0.10% 95% mean confidence interval for cycles value: -7649.38 -3773.00 95% mean confidence interval for cycles %-change: -2.71% -1.99% Cycles are helped. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> [v2] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-14 11:15:37 -07:00
Ian Romanick	73aaeac0a3	nir/algebraic: Reassociate add-and-shift to be shift-and-add A common thing in many shaders: uniform vs { vec4 bones[...]; }; ... x = some_calculation(bones[i + 0]); y = some_calculation(bones[i + 1]); z = some_calculation(bones[i + 2]); This turns into stuff like vec1 32 ssa_12 = iadd ssa_11, ssa_0 vec1 32 ssa_13 = ishl ssa_12, ssa_3 vec1 32 ssa_14 = intrinsic load_ssbo (ssa_7, ssa_13) (16, 4, 0) vec1 32 ssa_15 = iadd ssa_11, ssa_1 vec1 32 ssa_16 = ishl ssa_15, ssa_3 vec1 32 ssa_17 = intrinsic load_ssbo (ssa_7, ssa_16) (16, 4, 0) vec1 32 ssa_18 = iadd ssa_11, ssa_2 vec1 32 ssa_19 = ishl ssa_18, ssa_3 vec1 32 ssa_20 = intrinsic load_ssbo (ssa_7, ssa_19) (16, 4, 0) By reassociating the shift and the add, we can reduce this to vec1 32 ssa_12 = ishl ssa_11, ssa_3 vec1 32 ssa_13 = iadd ssa_12, ssa_0 vec1 32 ssa_14 = intrinsic load_ssbo (ssa_7, ssa_13) (16, 4, 0) vec1 32 ssa_16 = iadd ssa_12, ssa_1 vec1 32 ssa_17 = intrinsic load_ssbo (ssa_7, ssa_16) (16, 4, 0) vec1 32 ssa_19 = iadd ssa_12, ssa_2 vec1 32 ssa_20 = intrinsic load_ssbo (ssa_7, ssa_19) (16, 4, 0) v2: Add some commentary from Rhys Perry's nearly identical patch. All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16277758 -> 16250704 (-0.17%) instructions in affected programs: 1440284 -> 1413230 (-1.88%) helped: 4920 HURT: 6 helped stats (abs) min: 1 max: 69 x̄: 5.50 x̃: 4 helped stats (rel) min: 0.10% max: 18.33% x̄: 2.21% x̃: 1.79% HURT stats (abs) min: 1 max: 12 x̄: 4.50 x̃: 3 HURT stats (rel) min: 0.18% max: 3.23% x̄: 1.91% x̃: 2.55% 95% mean confidence interval for instructions value: -5.67 -5.31 95% mean confidence interval for instructions %-change: -2.26% -2.16% Instructions are helped. total cycles in shared programs: 367118526 -> 365895358 (-0.33%) cycles in affected programs: 93504145 -> 92280977 (-1.31%) helped: 2754 HURT: 1269 helped stats (abs) min: 1 max: 47039 x̄: 460.66 x̃: 16 helped stats (rel) min: <.01% max: 34.93% x̄: 3.77% x̃: 1.12% HURT stats (abs) min: 1 max: 1500 x̄: 35.85 x̃: 9 HURT stats (rel) min: 0.01% max: 17.35% x̄: 2.18% x̃: 0.75% 95% mean confidence interval for cycles value: -387.31 -220.78 95% mean confidence interval for cycles %-change: -2.11% -1.68% Cycles are helped. LOST: 1 GAINED: 1 Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-14 11:15:32 -07:00
Andrii Simiklit	ff2225cf88	nir/find_array_copies: Reject copies with mismatched lengths copy_deref for wildcard dereferences requires the same arrays lengths otherwise it leads to a crash in optimizations like 'nir_opt_copy_prop_vars' because these optimizations expect 'copy_deref' just for arrays with the same lengths. v2: check was moved to 'try_match_deref' to fix aoa cases (Jason Ekstrand <jason@jlekstrand.net>) v3: -fixed comment -the condition merged with other one (Jason Ekstrand <jason@jlekstrand.net>) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111286 Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2019-08-14 18:11:31 +00:00
Alyssa Rosenzweig	c4a4f3db5a	pan/midgard: Prefix blobber-db output for grepping Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 10:31:09 -07:00
Alyssa Rosenzweig	5f0f9e1333	pan/midgard: Implement blobber-db We wire through some shader-db-style stats on the current shader in the disassemble so we can get a quick estimate of shader complexity from a trace. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Suggested-by: Rob Clark <robdclark@chromium.org>	2019-08-14 10:31:09 -07:00
Alyssa Rosenzweig	863bdd1f8d	pan/midgard: Break, not return, in disassembler We'll want to dump some stats after the shader, and I refuse to use one teensy little goto. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-14 10:31:09 -07:00
Ian Romanick	f2965fde9b	nir/range-analysis: Fail gracefully on non-SSA sources Tested-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-14 09:02:38 -07:00
Christian Gmeiner	1290cc3e27	etnaviv: split destroy_shader Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-08-14 15:10:07 +02:00
Christian Gmeiner	f90b23b8c4	etnaviv: split link_shader Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-08-14 15:10:07 +02:00
Christian Gmeiner	0765a1dd0e	etnaviv: split dump_shader Also this adds the missing impl for etna_dump_shader_nir(..). Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-08-14 15:10:07 +02:00
Christian Gmeiner	a36d04daa1	etnaviv: mv etnaviv_compiler.c etnaviv_compiler_tgsi.c Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-08-14 15:10:07 +02:00
Christian Gmeiner	b2da8a8357	etnaviv: correct PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE handling Have a correct answer to GL_MAX_FRAGMENT_UNIFORM_VECTORS and GL_MAX_VERTEX_UNIFORM_VECTORS. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach l.stach@pengutronix.de	2019-08-14 12:29:56 +02:00
Christian Gmeiner	797a2e4fd0	etnaviv: update logic to determine uniform limits Taken 1:1 from the header file. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach l.stach@pengutronix.de	2019-08-14 12:29:56 +02:00
Christian Gmeiner	45cb5eee5d	etnaviv: put uniform limit determination into own function Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach l.stach@pengutronix.de	2019-08-14 12:29:56 +02:00
Marek Vasut	8f97262cdd	etnaviv: Use reentrant screen lock around flush The flush callback may be called on the same pipe context, and thus the same stream, from two different threads of execution. However, etna_cmd_stream_flush{,2}() must not be called on the same stream from two different threads of execution as that would mess up the etna_bo refcounting and likely have other ugly side effects. Fix this by using a reentrant screen lock around the flush callback. Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-08-14 10:36:36 +02:00
Marek Vasut	6bb4b6d078	etnaviv: Add valgrind support Add Valgrind support for etnaviv to track BO leaks. Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-08-14 10:36:20 +02:00
Marek Vasut	cf92074277	etnaviv: Use hash table to track BO indexes Use hash table instead of ad-hoc arrays. Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-08-14 10:36:04 +02:00
Marek Vasut	23f5f126d5	etnaviv: Fix double-free in etna_bo_cache_free() The following situation can happen in a multithreaded OpenGL application. A BO is submitted from etna_cmd_stream #1 with flags set for read. A BO is submitted from etna_cmd_stream #2 with flags set for write. This triggers a flush on stream #1 and clears the BO's current_stream pointer. If, at this point, stream #2 attempts to queue the BO again, which does happen, the BO will be added to the submit list twice. The Linux kernel driver correctly detects this and warns about it with the "BO at index %u already on submit list" kernel message. However, when cleaning the BO cache in etna_bo_cache_free(), the BO which was submitted twice will also be free()d twice, thus triggering the glibc double-free detector. The fix is easy: even if the BO does not have current_stream set, iterate over the current stream's list of BOs before adding the BO to it and verify that the BO is not yet there. Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-08-14 10:35:48 +02:00
Roman Stratiienko	1ea95e37cc	kmsro: Add missing definitions to Android.mk Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com> Reviewed-by: Rob Herring robh@kernel.org	2019-08-14 07:39:53 +00:00
Gert Wollny	742d3c918f	softpipe: Add support for ARB_derivative_control Enables and passes piglits: spec/ARB_derivative_control/ dfdx-coarse dfdx-dfdy dfdx-fine dfdy-coarse dfdy-fine Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-14 07:03:15 +00:00
Vasily Khoruzhick	b579af77f3	lima/ppir: print srcs and dests in ppir_node_print_prog() Now that we have accessors for ppir src, it's possible to easily print all srcs and dests while dumping the ppir representation. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-13 22:44:07 -07:00
Vasily Khoruzhick	6920710af5	lima/ppir: use src accessors in ppir regalloc Get rid of most switch/case by using src accessors Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-13 22:44:07 -07:00
Vasily Khoruzhick	a5e7c12ced	lima/ppir: add ppir_node to ppir_src We'll need it if we want to walk through node sources Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-13 22:43:58 -07:00
Vasily Khoruzhick	afa64a2105	lima/ppir: introduce accessors for ppir_node sources Sometimes we need to walk through ppir_node sources, common accessor for all node types will simplify code a lot. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-13 22:38:07 -07:00
Jordan Justen	0f5be81edd	iris: Expose aux buffer as 2nd plane w/modifiers Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-13 15:20:47 -07:00
Jordan Justen	246eebba4a	iris: Export and import surfaces with modifiers that have aux data The DRI interface for modifiers with aux data treats the aux data as a separate plane of the main surface. When the dri layer requests the plane associated with the aux data, we save the required information into the dri aux plane image. Later when the image is used, the dri plane image will be available in the pipe_resource structure's `next` field. Therefore in iris, we reconstruct the aux setup from this separate dri plane image when the image is used. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-13 15:20:47 -07:00
Kenneth Graunke	99c8eb997d	iris: Do proper format checks for Y+CCS modifier support We need to ensure that the DRI image format supports CCS. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-08-13 15:20:47 -07:00
Jordan Justen	51f941c20c	iris: Create single bo for surfaces with modifiers and aux data Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-13 15:20:47 -07:00
Jordan Justen	2c7b577e13	iris: Split iris_resource_alloc_aux to enable aux modifiers Reworks: * If the aux-state is not ISL_AUX_STATE_AUX_INVALID, then use memset even when memset_value is zero. The hiz buffer initial aux-state will be set to invalid, and therefore we can skip the memset. But, for CCS it will be set to ISL_AUX_STATE_PASS_THROUGH, and therefore the aux data must be cleared to 0 with the memset. Previously we would use BO_ALLOC_ZEROED with the CCS aux data, so this memset wasn't required. Now, the CCS aux data may be part of the main surface. We prefer to not use BO_ALLOC_ZEROED excessively, so the memset is needed for the CCS case. (Nanley) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-13 15:20:46 -07:00
Jordan Justen	aad36dfd16	iris: Add aux offset into hiz_address This is not currently required because the hiz buffer is in a separate buffer, and therefore the offset is 0. If we combine the aux buffer with the main surface buffer, then the hiz offset may become non-zero. Suggested-by: Nanley Chery <nanley.g.chery@intel.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-13 15:20:39 -07:00
Marek Olšák	f5e1f9ccef	tgsi_to_nir: add assertions for max varying slots Nine uses GENERIC slots > 31. Trivial.	2019-08-13 18:15:53 -04:00
Marek Olšák	fad962eddc	tgsi_to_nir: expand vec3 system values to vec4 for nir_intrinsic_load_work_group_id Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-13 18:15:53 -04:00
Marek Olšák	88a511bd42	tgsi_to_nir: fix incorrect number of image src1 components Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-13 18:15:53 -04:00
Mauro Rossi	37841f52b2	i965/gen11: fix genX_bits.h include path Instead of "genX_bits.h" use "genxml/genX_bits.h", as already done in other similar cases. Besides being more correct, it also fixes a build error on Android. Fixes: `f0d2923` ("i965/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced.") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-08-13 23:58:25 +02:00
Alyssa Rosenzweig	0c56330361	panfrost: Workaround bug in partial update implementation We can't intersect with empty regions. Fixes: `65ae86b854` ("panfrost: Add support for KHR_partial_update()") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-13 11:13:48 -07:00
Eric Anholt	46daaca55e	gitlab-ci: Run the GLES2 CTS on llvmpipe. This is the start of doing CTS tests on merges to Mesa master. We use the surfaceless platform so that we don't need to bother bringing up weston or X11. The surface size is kept low to reduce runtime, but this comes at the cost of many rendering tests skipping due to too-small render targets (as we see the impact of Mesa on the shared runner pool, we can reevaluate this and what set of CTS tests we want to run). We split the job up across 4 runners (each at 4 llvmpipe threads), so that the job can load-balance across our shared runners and finish sooner (since dEQP is very single-thread-performance bound). Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-13 10:30:01 -07:00
Eric Anholt	ab49873b44	gitlab-ci: Switch the meson-main build type to debugoptimized. Now that we're running the drivers we build, building with optimization is important for keeping our runtime down. Shaves about 4 minutes of runtime off of GLES2 CTS of llvmpipe at 64x64. v2: Only switch meson-main until we enable CTS for other builds on request by Michel. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-13 10:30:01 -07:00
Eric Anholt	9605749f99	gitlab-ci: Set the prefix to ./install instead of the DESTDIR. If we don't set DESTDIR, then the DEFAULT_DRIVER_DIR built into the libraries is correct and we don't need to use LIBGL_DRIVERS_PATH and friends for CI usage. Incidentally, this moves our installed paths from /builds/anholt/mesa/install/usr/local/lib (for example) to /builds/anholt/mesa/install/lib for simplicity. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-13 10:30:01 -07:00
Eric Anholt	f417ced5cc	gitlab-ci: Build the CTS in the debian build image. This will let us reuse the image for test runs. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-13 10:30:01 -07:00
Eric Anholt	86ae3c2186	surfaceless: Fix swrast-path segfault when loader doesn't know driver name. If we're hitting the swrast fallback path here, it's probably because we stumbled across a KMS-only device (such as the ASpeed that some of our CI runners have) that will then return a NULL driver_name. Don't crash in that case. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-13 10:30:01 -07:00
Eric Anholt	6a8d39dccd	surfaceless: Fix swrast path. We get a getDrawableInfo() call in the MakeCurrent path, which platform_device was handling correctly by returning the pbuffer's width/height but platform_surfaceless segfaulted for. Reuse platform_device's implementation. Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-08-13 10:29:34 -07:00
Eric Anholt	030aa6e184	gitlab-ci: Move around which builds cover which swrast. I want to enable CI of llvmpipe out of the meson-main build. So, kick classic swrast/osmesa to meson-i386, then promote llvmpipe to meson-main (along with nine, now that classic osmesa isn't keeping it out of there). Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-13 10:29:34 -07:00
Eric Anholt	b816edcbf4	meson: Don't require DRI classic swrast for OSMesa. OSMesa doesn't care about this build option, it links against src/mesa/swrast regardless. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2019-08-13 10:29:34 -07:00
Alyssa Rosenzweig	29cfd154e3	panfrost: Implement transform feedback Midgard has no hardware support for transform feedback, so we simulate it in software. Lucky us. What Midgard does do is write out vertex shader outputs to main memory unconditionally. Fragment shaders read varyings back from main memory; there's no on-chip storage for varyings. Whether this was a reasonable design is a question I will not be engaging in this commit message. What that does mean is that, in some sense, Midgard always does transform feedback unconditionally, and there's no way to turn off transform feedback. Normally, we would allocate some scratch memory every frame to store the varyings in an arbitrary format (interleaved for simplicity), and then feed that scratch to the fragment shader and discard when the rendering completes. The only difference now is that sometimes, for some buffers, we use a BO provided to us by Gallium and a format provided by Gallium, instead of allocating the memory and choosing the format ourselves. This has some limitations -- in particular, it only works at vec4 granularity, so a corresponding GLSL linkage patch is needed to correctly implement transform feedback for non-vec4 types. Nevertheless, given the hardware already works in this admittedly-bizarre fashion, transform feedback is "free". Or, at least, it's no more expensive than any other rendering. Specifically not implemented is dynamically-sized transform feedback (i.e. with geometry/tessellation shaders). Spoiler alert: Midgard has no support for geometry or tessellation shaders, despite advertising support. They get compiled to massive compute shaders. How's that for checkbox compliance? Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-08-13 09:43:41 -07:00
Alyssa Rosenzweig	7c29588c07	panfrost: Increment offsets[] per draw We have to maintain the internal offset ourselves. Per v3d. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-08-13 09:43:39 -07:00
Alyssa Rosenzweig	e7a05a601e	panfrost: Fixup stream out information per variant We could probably get away with doing this once per pipe_shader_state but let's not jump down that rabbit hole quite yet. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-08-13 09:43:32 -07:00
Alyssa Rosenzweig	5b0a1a4e49	panfrost: Route outputs_written through the compiler It's there in shader_info, but we need to access it from pan_context.c Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-08-13 09:43:17 -07:00
Alyssa Rosenzweig	f714eab882	panfrost: Import stream out utility from iris We'll need this in a moment. Ken's implementation, lightly edited for Panfrost. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-08-13 09:43:14 -07:00
Alyssa Rosenzweig	9b2514d6c6	panfrost: Flush when using transform feedback This is a huge hack to workaround incomplete BO flushing logic, but it's enough for the dEQP transform feedback tests, and doing the resource management to get this right is out-of-scope for this patch series. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-08-13 09:43:11 -07:00
Alyssa Rosenzweig	4b0001c42d	panfrost: Set PIPE_CAP_TGSI_TEXCOORD It doesn't really make sense, since we don't have special texture coordinate varyings, but it'll make some code simpler for XFB and it doesn't hurt us, even if I lose a bit of my soul setting it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-08-13 09:43:09 -07:00
Alyssa Rosenzweig	72fc06df9c	panfrost: Wire up statistics for primitives GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN should now be handled. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-08-13 09:43:04 -07:00
Alyssa Rosenzweig	7c224c1008	panfrost: Implement callbacks for PRIMITIVES queries We're just going to compute them in the driver but let's get the structures setup to handle them. Implementation from v3d. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-08-13 09:42:48 -07:00
Rob Clark	72d086fc36	freedreno/a6xx: move SSBO/image consts to IBO stateobj Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:26 -07:00
Rob Clark	ab01ab4d4f	freedreno/a6xx: move VS driverparams to its own stateobj If driver-params are required, we really should emit them on every draw for correctness. And if not required, we should emit a DISABLE so that un-applied state updates from previous draws don't corrupt the const state. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:26 -07:00
Rob Clark	882d53d8e3	freedreno/ir3+a6xx: same VBO state for draw/binning Worth ~+20% on gl_driver2 Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:26 -07:00
Rob Clark	4b82d1bbb7	freedreno/a6xx: add fd_emit_take_group() Which takes ownership of the stateobj. Useful for streaming stateobjs, to avoid an extra ref/unref. Worth ~5% at gl_driver2 Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:26 -07:00
Rob Clark	4a188e4215	freedreno/ir3: track # of driver params To avoid emitting unneeded const state. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:26 -07:00
Rob Clark	7f1e3391c6	freedreno/a6xx: move immediates to program stateobj Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:26 -07:00
Rob Clark	f0b91730a1	freedreno/a6xx: stop using ir3_emit_{vs,fs}_consts() Should be no functional change. Next step is to re-arrange various const state into different stateobjs. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:26 -07:00
Rob Clark	53667a43c4	freedreno/ir3: push ctx further up call chain Move more of the code to deal just w/ screen, without requiring ctx. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:26 -07:00
Rob Clark	4080dfb8af	freedreno/ir3: move ring_wfi() further up call chain Hoist them out of code-paths that will eventually be called directly for various a6xx+ const related stateobjs. This ends up duplicating one constlen check in ir3_emit_vs_consts(), to avoid what could otherwise be an unnecessary WFI on older gens. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:26 -07:00
Rob Clark	c6fab232c8	freedreno/all: move more emit helpers to screen framebuffer_barrier() still depends on the ctx, but the rest can move to screen. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:26 -07:00
Rob Clark	684f4b5843	freedreno/a3xx-a6xx+ir3: move emit_const* to screen These don't need to be in context, and we'll need them in screen in a later patch. Plus it's a good cleanup. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:26 -07:00
Rob Clark	566f2281c5	freedreno/a6xx: add fd6_emit_init_screen() Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:26 -07:00
Rob Clark	e89255b0a5	freedreno/a5xx: add fd5_emit_init_screen() Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:25 -07:00
Rob Clark	d256e3f34a	freedreno/a3xx: add fd3_emit_init_screen() Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:25 -07:00
Rob Clark	b9d3f39728	freedreno/a2xx: add fd2_emit_init_screen() Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:08:07 -07:00
Rob Clark	ec0ec641d8	freedreno/a4xx: add fd4_emit_init_screen() Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:08:07 -07:00
Rob Clark	2f94de2372	freedreno/a2xx: call fd2_emit_ib() directly from fd2 Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:08:07 -07:00
Rob Clark	eb45422c5f	freedreno/a5xx: call fd5_emit_ib() directly from fd5 Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:08:07 -07:00
Rob Clark	50e15e1c6f	freedreno/a4xx: call fd4_emit_ib() directly from fd4 Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:08:07 -07:00
Rob Clark	4326eeac97	freedreno/a3xx: call fd3_emit_ib() directly from fd3 No reason for the indirection when called from a3xx specific code. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:08:07 -07:00
Rob Clark	32014afa44	freedreno/ir3: move VS driver-param emit Move DP emit to its own function. No functional change, just code motion to prepare for splitting up const state into multiple stateobjs on a6xx. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:08:07 -07:00
Rob Clark	5722149bf1	freedreno/ir3: drop unneeded ir3_ra() args Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:08:07 -07:00
Boris Brezillon	65ae86b854	panfrost: Add support for KHR_partial_update() Implement ->set_damage_region() to support partial updates. This is a dummy implementation in that it does not try to merge damage rects. It also does not deal with distinct regions and instead picks the largest quad as the only damage rect and generates up to 4 reload rects out of it (the left/right/top/bottom regions surrounding the biggest damage rect). We also do not try to reduce the number of draws by passing all quad vertices to the blit request (would require extending u_blitter). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-13 14:41:10 +02:00
Daniel Stone	492ffbed63	st/dri2: Implement DRI2bufferDamageExtension Add a pipe_screen->set_damage_region() hook to propagate set-damage-region requests to the driver, it's then up to the driver to decide what to do with this piece of information. If the hook is left unassigned, the buffer-damage extension is considered unsupported. Signed-off-by: Daniel Stone <daniels@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-13 14:40:45 +02:00
Harish Krupo	a4a8ebe156	egl/dri: Use __DRI2_BUFFER_DAMAGE extension for KHR_partial_update Use the DRI2 interface callback to pass the damage rects to the driver. Signed-off-by: Harish Krupo <harishkrupo@gmail.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-13 14:40:31 +02:00
Daniel Stone	bd08a83b09	dri_interface: add DRI2_BufferDamage interface Add a new DRI2_BufferDamage interface to support the EGL_KHR_partial_update extension, informing the driver of an overriding scissor region for a particular drawable. Based on a commit originally authored by: Harish Krupo <harish.krupo.kps@intel.com> renamed extension, retargeted at DRI drawable instead of context, rewritten description Signed-off-by: Daniel Stone <daniels@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-13 14:40:14 +02:00
Harish Krupo	b4345da876	egl/android: Delete set_damage_region from egl dri vtbl The intention of KHR_partial_update was not to send the damage back to the platform but to send the damage to the driver to ensure that the following rendering could be restricted to those regions. This patch removes set_damage_region from the egl_dri vtbl and all the platform_*.c files. Upcoming patches then add a new dri2 interface for the drivers to implement. Signed-off-by: Harish Krupo <harishkrupo@gmail.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-13 14:39:38 +02:00
Jordan Justen	fc12fd05f5	iris: Implement pipe_screen::resource_get_param Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-13 01:12:30 -07:00
Jordan Justen	3198c5b7bf	gallium/dri2: Use pipe_screen::resource_get_param in image queries Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2019-08-13 01:12:29 -07:00
Jordan Justen	2decad495f	gallium/dri2: Support images with multiple planes for modifiers Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2019-08-13 01:12:29 -07:00
Jordan Justen	6e749a6b2b	gallium/dri2: Refactor image property queries This refactor will let us more easily use pipe_screen::resource_get_param as an alternative to pipe_screen::resource_get_handle. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2019-08-13 01:12:29 -07:00
Jordan Justen	c5c2365455	state_tracker/winsys_handle: Add plane input field Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2019-08-13 01:12:29 -07:00
Jordan Justen	2066966c10	gallium/dri2: Support creating multi-planar modifier images Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2019-08-13 01:12:29 -07:00
Jordan Justen	fe06655e86	gallium/dri2: Implement dri2ImageExtension.queryDmaBufFormatModifierAttribs Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2019-08-13 01:12:29 -07:00
Jordan Justen	0346b70083	gallium/screen: Add pipe_screen::resource_get_param This function retrieves individual parameters selected by enum pipe_resource_param. It can be used as a more direct alternative to pipe_screen::resource_get_handle. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2019-08-13 01:12:24 -07:00
Iago Toral Quiroga	2353f7f7ef	vc4: clamp gl_PointSize to a minimum of 1.0 The OpenGL ES spec requires that the value of gl_PointSize is clamped to an implementation-dependent range matching what is advertised by GL_ALIASED_POINT_SIZE_RANGE. For VC4 this is [1.0, 512.0], but the hardware won't clamp to the minimum side of the range and won't render points with a size strictly smaller than 1.0 either, so we need to clamp manually. For points larger than the maximum size of the range the hardware clamps automatically. Fixes piglit test: spec/!opengl 2.0/vs-point_size-zero Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-13 09:44:54 +02:00
Iago Toral Quiroga	3539bd63dd	v3d: clamp gl_PointSize to a minimum of 1.0 The OpenGL ES spec requires that the value of gl_PointSize is clamped to an implementation-dependent range matching what is advertised by GL_ALIASED_POINT_SIZE_RANGE. For V3D this is [1.0, 512.0], but the hardware won't clamp to the minimum side of the range and won't render points with a size strictly smaller than 1.0 either, so we need to clamp manually. For points larger than the maximum size of the range the hardware clamps automatically. Fixes piglit test: spec/!opengl 2.0/vs-point_size-zero Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-13 09:44:54 +02:00
Iago Toral Quiroga	48f5c34301	nir: add a pass to clamp gl_PointSize to a range The OpenGL and OpenGL ES specs require that implementations clamp the value of gl_PointSize to an implementation-dependent range. This pass is useful for any GPU hardware that doesn't do this automatically for either one or both sides of the range, such as V3D. v2: - Turn into a generic NIR pass (Eric). - Make the pass work before lower I/O so we can use the deref variable to inspect if we are writing to gl_PointSize (Eric). - Make the pass take the range to clamp as parameter and allow it to clamp to both sides of the range or just one side. - Make the pass report progress. v3: - Fix copyright header (Eric) - use fmin/fmax instead of bcsel to clamp (Eric) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-13 09:44:12 +02:00
Iago Toral Quiroga	62e0ca3064	v3d: line length style fixes Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-13 08:38:19 +02:00
Iago Toral Quiroga	99e9809cab	v3d: honor the write mask on store operations v2: - Fix incremental update of the const offset when we need to emit a sequence with more than one write because of the writemask. - Do not move the tmu write emission to a separate helper. v3: - Get the store writemask before the loop, use ffs to get the first component to write and clear writemask bits as we process the components (Eric). - Simplified the code that figured out the number of components for the TMU config based on the number of tmu writes for stores and atomics. v4: - Code clean-ups (Eric). Fixes: KHR-GLES31.core.shader_image_load_store.advanced-cast-cs KHR-GLES31.core.shader_image_load_store.advanced-cast-fs KHR-GLES31.core.shader_storage_buffer_object.advanced-switchBuffers-cs KHR-GLES31.core.shader_storage_buffer_object.advanced-switchPrograms-cs KHR-GLES31.core.shader_storage_buffer_object.basic-operations-case1-cs Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-13 08:38:19 +02:00
Iago Toral Quiroga	3d65d2a488	v3d: refactor ntq_emit_tmu_general() slightly When we implement write masks on store operations we might need to emit multiple write sequences for a given store intrinsic. To make that easier, let's split the emission of the tmud instructions to their own block after we are done with the code that only needs to run once no matter how many write sequences we need to emit. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-13 08:38:19 +02:00
Iago Toral Quiroga	b594796f1b	v3d: do not automatically flush current job for SSBOs and shader images If the current job has a sequence of draw calls involving SSBOs and/or shader images, we would flush the job in between each draw call. With this change, we won't flush the current job and we rely on the application inserting correct barriers by issuing glMemoryBarrier() when needed. v2 (Eric): - When mapping a buffer for writing, we always need to flush. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-13 08:25:15 +02:00
Iago Toral Quiroga	f1cf1153e8	v3d: only process glMemoryBarrier() for SSBOs and images PIPE_BARRIER_UPDATE is defined as: PIPE_BARRIER_UPDATE_BUFFER \| PIPE_BARRIER_UPDATE_TEXTURE Which means we were flushing for any flags other than these two, but this was intended to only flush for ssbos and images. Actually, the driver automatically flushes jobs as we need, including writes/reads involving SSBOs and images, so we don't really need to flush anything when the program emits a barrier. However, this may lead to excessive flushing in some cases, so we will soon change this to avoid automatic flushing of the current job for SSBOs and images, meaning that we will rely on the application to emit correct memory barriers for these that we should make sure to process here. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-13 08:25:15 +02:00
Iago Toral Quiroga	f1559ca922	v3d: fix flushing of SSBOs and shader images If the current draw call includes SSBOs, then we must flush any jobs that are writing to the same SSBOs (so that our SSBOs reads are correct), as well as jobs reading from the same SSBO (so that our SSBO writes don't stomp previous SSBO reads). The exact same logic applies to shader images. In this case we were already flushing previous writes, but we should also flush previous reads. Note that we don't need to call v3d_flush_jobs_reading_resource() and v3d_flush_jobs_writing_resource() separately though, since flushing jobs that read a resource also flushes those writing to it. Suggested-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-13 08:25:15 +02:00
Caio Marcelo de Oliveira Filho	1021abab07	intel/tools: Fix aub_file initialization in intel_dump_gpu The `device` can be set earlier either by a command line or by intercepting an ioctl call to get the I915_PARAM_CHIPSET_ID done by the application early. In both cases `aub_file` and `devinfo` would not be initialized. Fix by splitting the conditions - `device == 0`: use the FD to get both device and devinfo. - Or `devinfo.gen == 0`: use `device` to initialize it. And separately, initialize aub_file the first time it is needed. Fixes: `d594d2a052` ("intel/tools: use device info initializer") Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-12 19:18:26 -07:00
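The mask problem described in the commit above can be shown with a small sketch. The flag names and values below are hypothetical stand-ins rather than Gallium's definitions, and the "before" condition is a reconstruction from the commit text, not a copy of the driver code.

```c
#include <stdbool.h>

/* Hypothetical barrier flags; the values are arbitrary for this sketch. */
enum {
   BARRIER_SHADER_BUFFER  = 1 << 0,
   BARRIER_IMAGE          = 1 << 1,
   BARRIER_UPDATE_BUFFER  = 1 << 2,
   BARRIER_UPDATE_TEXTURE = 1 << 3,
   BARRIER_VERTEX_BUFFER  = 1 << 4,
   BARRIER_UPDATE = BARRIER_UPDATE_BUFFER | BARRIER_UPDATE_TEXTURE,
};

/* Assumed old behaviour: flush for any flag outside the UPDATE pair. */
static bool
flushes_before(unsigned flags)
{
   return (flags & ~(unsigned)BARRIER_UPDATE) != 0;
}

/* Intended behaviour: flush only for SSBO and image barriers. */
static bool
flushes_after(unsigned flags)
{
   return (flags & (BARRIER_SHADER_BUFFER | BARRIER_IMAGE)) != 0;
}
```

With these definitions a vertex-buffer barrier triggers a flush under the old test but not under the new one, while SSBO and image barriers flush in both.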
Rafael Antognolli	f0d29238df	i965/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced. If the pixel pipes have a different number of subslices, emit a slice hashing table that will ensure proper workload distribution. v2: Set Mask field to 0xffff for workaround (Ken).	2019-08-12 16:19:08 -07:00
Rafael Antognolli	7bc022b4bb	anv/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced. If the pixel pipes have a different number of subslices, emit a slice hashing table that will ensure proper workload distribution. v2: Don't need to set the mask - it's mbo (Ken).	2019-08-12 16:19:08 -07:00
Rafael Antognolli	a1a499e7fe	iris/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced. If the pixel pipes have a different number of subslices, emit a slice hashing table that will ensure proper workload distribution. v2: Don't need to set the mask - it's mbo (Ken). v3: Don't keep a reference to the resource used for emitting the table (Ken).	2019-08-12 16:19:08 -07:00
Rafael Antognolli	ad513fd386	intel: Get information about pixel pipes subslices. v2: Use 1 instead of 1UL (Ken).	2019-08-12 16:19:08 -07:00
Rafael Antognolli	32344dc581	intel/gen_decoder: Decode SLICE_HASH_TABLE.	2019-08-12 16:19:08 -07:00
Rafael Antognolli	e1cb71c182	intel/genxml: Update 3D_MODE and add SLICE_HASH_TABLE. Add these fields and the 3DSTATE_SLICE_TABLE_STATE_POINTERS instruction so we can properly configure the slice and subslice hashing on ICL+ v2: Make 'Mask' field a mbo (Ken).	2019-08-12 16:19:08 -07:00
Jason Ekstrand	d787a2d05e	anv: Implement VK_KHR_pipeline_executable_properties Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Jason Ekstrand	67cb55ad11	anv: Add a ralloc context to anv_pipeline Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Jason Ekstrand	fec4bdff40	anv: Force a full re-compile when CAPTURE_INTERNAL_REPRESENTATION_TEXT is set Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Jason Ekstrand	651fbbf9b8	anv/pipeline: Split setting up per-stage keys into its own loop Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Jason Ekstrand	78f3dfb4a2	anv: Record shader compile stats in the pipeline cache We're going to want these to be available regardless of caching. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Jason Ekstrand	2af380d20f	anv/pipeline: Stash generated code in the pipeline stage Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Jason Ekstrand	8d3cbd0393	intel/fs: Add SLM size to brw_cs_prog_data We don't need it for state setup but it's a useful statistic we want to pass on to developers. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Jason Ekstrand	134607760a	intel/compiler: Fill a compiler statistics struct This commit is all annoying plumbing work which just adds support for a new brw_compile_stats struct. This struct provides a binary driver readable form of the same statistics we dump out to stderr when INTEL_DEBUG is set with a shader stage. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Khaled Emara	2720ad5fd9	freedreno: disable tiling for cubemaps Tiling doesn't work quite well with cubemaps. Revert to linear textures, until it's fixed.	2019-08-12 22:30:54 +00:00
Khaled Emara	0ae16fb565	freedreno: add tiling parameters for 2D/2DArray/3D	2019-08-12 22:30:54 +00:00
Khaled Emara	aeaba3e4a6	freedreno: simplified slices setup for a3xx a3xx doesn't support ASTC and layout_first always returns false	2019-08-12 22:30:54 +00:00
Khaled Emara	e11a239e8c	freedreno: enable tiled textures for debug builds	2019-08-12 22:30:54 +00:00
Paulo Zanoni	866bb775de	intel/fs: add 64 bit integer multiplication lowering While NIR's lower_imul64() solves the case of 64 bit integer multiplications generated early, we don't have a way to lower such instructions when they are generated by our own backend, such as the scan/reduce intrinsics. We'll need this soon, so implement it now. An easy way to test this is to simply disable nir_lower_imul64 to let those operations reach the backend. v2: - Fix Q/UQ copy/paste errors (Caio). - Transform an 'if' into 'else if' (Caio). - Add an extra comment to clarify the need for 64b = 32b * 32b (Caio). - Make private functions private (Caio). v3: - Remove ambiguity with 'b' and 'd' variables (Caio). - Allocate potentially less regs for the dwords (Caio). Cc: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Matt Turner <matt.turner@intel.com> Cc: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-08-12 15:16:23 -07:00
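The arithmetic behind lowering a 64-bit multiply to 32-bit pieces can be sketched in plain C. This illustrates the general decomposition only, not the backend code: `mul64_from_dwords` is a hypothetical name, and the "64b = 32b * 32b" step refers to the single product that needs a full 64-bit result.

```c
#include <stdint.h>

/* Compute the low 64 bits of a * b from 32-bit halves.
 * With a = a_hi:a_lo and b = b_hi:b_lo,
 *   a * b mod 2^64 = a_lo*b_lo + ((a_lo*b_hi + a_hi*b_lo) << 32). */
static uint64_t
mul64_from_dwords(uint64_t a, uint64_t b)
{
   uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
   uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

   /* The only product that needs a full 64-bit result. */
   uint64_t lo_lo = (uint64_t)a_lo * b_lo;

   /* Cross terms only affect the upper dword, so wrapping 32-bit
    * multiplies and adds are sufficient. */
   uint32_t cross = a_lo * b_hi + a_hi * b_lo;

   uint32_t res_lo = (uint32_t)lo_lo;
   uint32_t res_hi = (uint32_t)(lo_lo >> 32) + cross;

   return ((uint64_t)res_hi << 32) | res_lo;
}
```

The `a_hi * b_hi` term is dropped because it only contributes to bits 64 and above, which a 64-bit result discards.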
Paulo Zanoni	9217cf3b5e	intel/compiler: invert the logic of lower_integer_multiplication() Invert the logic of how progress is handled: remove the continue statements and mark progress inside the places where it actually happens. We're going to add a new lowering that also looks for BRW_OPCODE_MUL, so inverting the logic here makes the resulting code much easier to follow. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-08-12 15:16:23 -07:00
Paulo Zanoni	6ba4717924	intel/compiler: don't instantiate a builder for each instruction Don't instantiate a builder for each instruction during lower_integer_multiplication(). Instantiate one only when needed. On the other hand, these unneeded builders don't seem to cost much to init, so I don't expect any significant difference in performance: this is mostly about code organization. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-08-12 15:16:23 -07:00
Paulo Zanoni	75b3868dcc	intel/compiler: extract subfunctions of lower_integer_multiplication() The lower_integer_multiplication() function is already a little too big. I want to add more to it, so let's reorganize the existing code first. Let's start with just extracting the current code to subfunctions. Later we'll change them a little more. v2: Make private functions private (Caio). v3: Fix typo (Caio). Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-08-12 15:16:23 -07:00
Rhys Perry	7740149852	nir: merge and extend nir_opt_move_comparisons and nir_opt_move_load_ubo v2: add to series v3: update Makefile.sources v4: don't remove a comment and break statement v4: use nir_can_move_instr Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-12 22:01:30 +00:00
Rhys Perry	da8ed68aca	nir: replace nir_move_load_const() with nir_opt_sink() This is mostly the same as nir_move_load_const() but can also move undef instructions, comparisons and some intrinsics (being careful with loops). v2: actually delete nir_move_load_const.c v3: fix nir_opt_sink() usage in freedreno v3: update Makefile.sources v4: replace get_move_def with nir_can_move_instr and nir_instr_ssa_def v4: handle if uses v4: fix handling of nested loops v5: re-write adjust_block_for_loops v5: re-write setting of use_block for if uses Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Co-authored-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-12 22:01:30 +00:00
Francisco Jerez	c2fe7a0fb8	anv/gen9: Optimize slice and subslice load balancing behavior. See "i965/gen9: Optimize slice and subslice load balancing behavior." for the rationale. According to Jason, improves Aztec Ruins performance by 2.7%. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) v2: Undo CPU performance micro-optimization done in i965 and iris due to lack of data justifying it on anv. Use cmd_buffer_apply_pipe_flushes wrapper instead of emitting pipe control command directly. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-12 14:40:21 -07:00
Andreas Baierl	1c45541c7f	lima/ppir: Add fddx and fddy Lower fddx and fddy and set the right bits in codegen. Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com>	2019-08-12 23:20:04 +02:00
Bas Nieuwenhuizen	f1da129220	radv: Enable VK_KHR_pipeline_executable_properties. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen	afad67cd7a	radv: Implement radv_GetPipelineExecutableStatisticsKHR. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen	35302f0189	radv: Implement radv_GetPipelineExecutableInternalRepresentationsKHR. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen	86864eedd2	radv: Implement radv_GetPipelineExecutablePropertiesKHR. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen	8874af8ef4	radv: Keep shader info when needed. This allows enabling the shader info keeping on a per shader basis. Also disables the cache on a per shader basis. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen	e8a256eb54	radv: Add VK_KHR_pipeline_executable_properties in disabled state. So we can add the functions. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen	5444d3e0c2	radv: Use string for nir dumping. Reviewed-by: Dave Airlie <airlied@redhat.com> Allows us to easily dump all nir shaders for combined variants in vega and simplifies ownership.	2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen	739a2880f5	radv: Get max workgroup size without nir. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-12 23:00:24 +02:00
Bas Nieuwenhuizen	290ca0c4dd	radv: Add utility function to calculate max waves. Not AC because a lot of it is data extraction out of radv structs. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-12 23:00:24 +02:00
Francisco Jerez	026773397b	iris/gen9: Optimize slice and subslice load balancing behavior. See "i965/gen9: Optimize slice and subslice load balancing behavior." for the rationale. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-12 13:17:58 -07:00
Francisco Jerez	03cba9f5d9	intel/genxml: Add GT_MODE hashing defs for Gen9. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-12 13:17:58 -07:00
Francisco Jerez	9406b3a5c1	i965/gen9: Optimize slice and subslice load balancing behavior. The default pixel hashing mode settings used for slice and subslice load balancing are far from optimal under certain conditions (see the comments below for the gory details). The top-of-the-line GT4 parts suffer from a particularly severe performance problem currently due to a subslice load balancing issue. Fixing this seems to improve graphics performance across the board for most of the benchmarks in my test set, up to ~20% in some cases, e.g. from SKL GT4: unigine/valley: 3.44% ±0.11% gfxbench/gl_manhattan31: 3.99% ±0.13% gputest/pixmark_piano: 7.95% ±0.33% synmark/OglTexFilterAniso: 15.22% ±0.07% synmark/OglTexMem128: 22.26% ±0.06% Lower-end platforms are also affected by some subslice load imbalance to a lesser degree, especially during CCS resolve and fast clear operations, which are handled specially here due to rasterization occurring in reduced CCS coordinates, which changes the semantics of the pixel hashing mode settings. No regressions seen during my tests on some SKL, KBL and BXT configurations. Additional benchmark reports welcome on any Gen9 platforms (that includes anything with Skylake, Broxton, Kabylake, Geminilake, Coffeelake, Whiskey Lake, Comet Lake or Amber Lake in your renderer string). P.S.: A similar problem is likely to be present on other non-Gen9 platforms, especially for CCS resolve and fast clear operations. Will follow-up with additional patches fixing the hashing mode for those once I have enough performance data to justify it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-12 13:17:58 -07:00
Alyssa Rosenzweig	b1965831e4	pan/midgard: Handle 64-bit address in mir_mask_of_read_components This is a bit of a hack, but it'll hold us over until we have 64-bit support wired through. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:03 -07:00
Alyssa Rosenzweig	41e68094f8	pan/midgard: Allocate separate spill indices for lowered moves This helps RA be slightly more reasonable. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:03 -07:00
Alyssa Rosenzweig	14b5b9ac38	pan/midgard: Extend liveness analysis to trinary ops Fixes RA fails with multiple indirect SSBO writes. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:03 -07:00
Alyssa Rosenzweig	c690b37d76	pan/midgard: Fix load/store pairing This used a delicate hack to try to find indirect inputs and skip them as candidates for pairing. Let's use a better criterion -- no sources -- and pair based on that. We could do better, but that would require more complex data flow analysis than we're interested in doing here. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:02 -07:00
Alyssa Rosenzweig	15954ab6ca	pan/midgard: Implement nir_intrinsic_load_num_work_groups Just a sysval to route through. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:02 -07:00
Alyssa Rosenzweig	7229af794b	pan/midgard: Implement some compute builtins We implement gl_WorkGroupID and gl_LocalInvocationID, which map to ld_compute_id with special sources. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:02 -07:00
Alyssa Rosenzweig	2b4e579585	pan/midgard: Rename ld_global_id -> ld_compute_id It's used for more general loads within a compute shader. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:02 -07:00
Alyssa Rosenzweig	a5059f2cba	pan/midgard: Handle partial writes in liveness analysis This allows liveness analysis within a loop to be more fine grained, fixing RA failures with partial spilled movs within a loop, as well as enabling a slight reduction of register pressure more generally: total registers in shared programs: 350 -> 347 (-0.86%) registers in affected programs: 12 -> 9 (-25.00%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 25.00% max: 25.00% x̄: 25.00% x̃: 25.00% Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig	e333bf606f	pan/midgard: Dump "no spill"? Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig	cc3df917d3	pan/midgard: Absorb nonexistent sources Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig	0a7cc239bd	pan/midgard: Pretty-print destinations They're not "sources" but they follow the same conventions. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig	ba8ec19a64	pan/midgard: Pretty-print units Since we are seeing some use of MIR post-scheduling, let's get this printed right. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig	73f54f286a	pan/midgard: Print mask in dumped MIR Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig	2ec4f9a74b	pan/midgard: Add no_spill flag Hint for the RA to avoid infinite spilling loops. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig	7090971f2f	pan/midgard: Generalize mir_mask_of_read_components This now works for load/store and texture instructions as well as ALU. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig	419ddd63b0	pan/midgard: Implement SSBO access Just laying the groundwork. Reads and writes should be supported (both direct and indirect, either int or float, vec1/2/3/4), but no bounds checking is done at the moment. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig	a8639b91b5	pan/midgard: Pipe uniform mask through when spilling This is a corner case that happens a lot with SSBOs. Basically, if we only read a few components of a uniform, we need to only spill a few components or otherwise we try to spill what we spilled and RA hangs. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:43:00 -07:00
Alyssa Rosenzweig	63e240dd05	pan/midgard: Clamp sysval component count We don't want to load a 128-bit sysval when 64-bits will do. Fixes RA failures with SSBO indirect writes. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:42:59 -07:00
Alyssa Rosenzweig	e7ac46be7a	pan/midgard: Pass uploaded midgard_instruction through We want to edit it after emission in some cases. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:42:59 -07:00
Alyssa Rosenzweig	fa68740187	pan/midgard: Allow sysval destination override Sometimes a sysval is used to facilitate an instruction but is not the instruction itself. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:42:59 -07:00
Alyssa Rosenzweig	60d80157d1	panfrost: Force flush every compute job This is of course suboptimal for performance, forcing each glDispatchCompute call to be submitted separately to the kernel and finish to completion. However, for the initial bring-up of compute jobs, this simplifies quite a bit. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:42:59 -07:00
Alyssa Rosenzweig	2efa025b05	panfrost: Add SSBO system value For each SSBO index we get from Gallium/NIR, we need two pieces of information in the shader: 1. The address of the SSBO in GPU memory. Within the shader, we'll be accessing it with raw memory load/store, so we need the actual address, not just an index. 2. The size of the SSBO. This is not strictly necessary, but at some point, we may like to do bounds checking on SSBO accesses. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-12 12:42:59 -07:00
Alyssa Rosenzweig	e881aa8c12	gallium/util: Add u_stream_outputs_for_vertices helper This u_prim.h helper determines the number of outputs for stream output, given a particular primitive type and a vertex count. This is useful for statically calculating sizes of stream output buffers (i.e. when there is no geometry/tessellation shader in use). This helper will be used in Panfrost's transform feedback implementation, as you can probably guess since why else would I be submitting it.... See also dEQP's getTransformFeedbackOutputCount routine. v2: Simplify definition using new helpers, which also extends to non-ES2 primitive types (Eric). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-12 12:22:54 -07:00
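The counting that such a helper performs can be sketched as follows: decompose the draw into base primitives, then multiply by the vertices per base primitive. The enum and function below are hypothetical and do not reproduce the `u_prim.h` implementation; the per-primitive formulas are the standard ones for points, lines and triangles.

```c
/* Hypothetical primitive types for this sketch. */
enum sketch_prim {
   SKETCH_POINTS,
   SKETCH_LINES,
   SKETCH_LINE_STRIP,
   SKETCH_LINE_LOOP,
   SKETCH_TRIANGLES,
   SKETCH_TRIANGLE_STRIP,
   SKETCH_TRIANGLE_FAN,
};

/* Number of vertices written to a stream output buffer when 'nr'
 * input vertices are drawn with primitive type 'prim'. */
static unsigned
stream_outputs_for_vertices(enum sketch_prim prim, unsigned nr)
{
   switch (prim) {
   case SKETCH_POINTS:
      return nr;
   case SKETCH_LINES:
      return (nr / 2) * 2;                 /* incomplete line dropped */
   case SKETCH_LINE_STRIP:
      return nr >= 2 ? (nr - 1) * 2 : 0;   /* nr - 1 line segments */
   case SKETCH_LINE_LOOP:
      return nr >= 2 ? nr * 2 : 0;         /* closing segment included */
   case SKETCH_TRIANGLES:
      return (nr / 3) * 3;                 /* incomplete triangle dropped */
   case SKETCH_TRIANGLE_STRIP:
   case SKETCH_TRIANGLE_FAN:
      return nr >= 3 ? (nr - 2) * 3 : 0;   /* nr - 2 triangles */
   }
   return 0;
}
```

For example, a 5-vertex triangle strip decomposes into 3 triangles and therefore 9 output vertices, which is the figure needed to size the buffer up front.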
Marek Olšák	8ce4f9bbc3	radeonsi: remove the always_nir option tgsi_to_nir is no longer optional if NIR is enabled.	2019-08-12 14:52:17 -04:00
Marek Olšák	4e545f934f	radeonsi/nir: implement default tess level system values Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-08-12 14:52:17 -04:00
Marek Olšák	9c7746ceae	compiler: add SYSTEM_VALUE_TESS_LEVEL_OUTER/INNER_DEFAULT TCS system values for internal passthru TCS, needed by radeonsi NIR support Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-08-12 14:52:17 -04:00
Marek Olšák	5167ca27fa	gallium: add TGSI_SEMANTIC_DEFAULT_OUTER/INNER_LEVEL for radeonsi NIR support.	2019-08-12 14:52:17 -04:00
Marek Olšák	f8d4198998	tgsi_to_nir: handle tess level inner/outer varyings for internal radeonsi shaders Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-08-12 14:52:17 -04:00
Marek Olšák	8ac2583cd8	tgsi_to_nir: add support for the stencil FS output Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-08-12 14:52:17 -04:00
Marek Olšák	f3f1d0dfd0	tgsi_to_nir: add support for TEX_LZ Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-12 14:52:17 -04:00
Marek Olšák	1b881852bc	compiler: add SYSTEM_VALUE_USER_DATA_AMD for internal radeonsi shaders	2019-08-12 14:52:17 -04:00
Marek Olšák	f0ccc5457a	compiler: add shader_info.cs.user_data_components_amd	2019-08-12 14:52:17 -04:00
Marek Olšák	155789c8e7	tgsi_to_nir: add basic compute shader support Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-08-12 14:52:17 -04:00
Marek Olšák	5a0adfd9f0	tgsi_to_nir: add support for LOAD & STORE with SSBOs and images Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-08-12 14:52:17 -04:00
Marek Olšák	0b121cb89a	tgsi_to_nir: make setup_texture_info reusable Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-12 14:52:17 -04:00
Marek Olšák	70fd85172b	tgsi_to_nir: add support for TXF_LZ Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-08-12 14:52:17 -04:00
Marek Olšák	028dbd35ba	compiler: add shader_info.vs.blit_sgprs_amd for internal radeonsi shaders	2019-08-12 14:52:17 -04:00
Marek Olšák	e300365197	tgsi_to_nir: be careful about not losing any TGSI properties silently (v2) v2: squash with Timur Kristof's commit Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-08-12 14:52:17 -04:00
Marek Olšák	8b6814211a	tgsi/scan: don't set GS_INVOCATIONS for all shader stages Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-08-12 14:52:17 -04:00
Marek Olšák	9fb2fd0b43	compiler: add ACCESS_STREAM_CACHE_POLICY radeonsi will use this. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-08-12 14:52:17 -04:00
Marek Olšák	902dd50cf0	gallium: add AMD-specific compute TGSI enums for tgsi_to_nir	2019-08-12 14:52:17 -04:00
Marek Olšák	6a2bdb8d01	gallium: add TGSI_PROPERTY_VS_BLIT_SGPRS_AMD for tgsi_to_nir needed by radeonsi NIR support	2019-08-12 14:52:17 -04:00
Marek Olšák	d1ad4fda31	st/mesa: don't allocate mipmapped texture for NEAREST_MIPMAP_LINEAR Reviewed-by: Brian Paul <brianp@vmware.com>	2019-08-12 14:52:17 -04:00
Kenneth Graunke	5180a222c0	glsl: Optimize the SoftFP64 shader when first creating it. By optimizing the shader before inlining, we avoid having to redo this work for each inlined copy of a function. It should also reduce the memory consumption a bit. This cuts the KHR-GL46.arrays_of_arrays_gl.SubroutineFunctionCalls2 runtime by 25% on my Icelake. That test compiles many shaders, which contain large types (dmat4) and division (expensive operations). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-12 10:42:32 -07:00
Christian Gmeiner	914ecc9384	etnaviv: fix compile warnings in release build [27/31] Compiling C object 'src/gallium/drivers/etnaviv/df32d18@@etnaviv@sta/etnaviv_compiler_nir.c.o'. In file included from ../../src/gitlab_mesa/src/gallium/drivers/etnaviv/etnaviv_compiler_nir.c:552: ../../src/gitlab_mesa/src/gallium/drivers/etnaviv/etnaviv_compiler_nir_emit.h: In function 'ra_assign': ../../src/gitlab_mesa/src/gallium/drivers/etnaviv/etnaviv_compiler_nir_emit.h:903:9: warning: unused variable 'ok' [-Wunused-variable] bool ok = ra_allocate(g); ^~ ../../src/gitlab_mesa/src/gallium/drivers/etnaviv/etnaviv_compiler_nir.c: In function 'etna_compile_shader_nir': ../../src/gitlab_mesa/src/gallium/drivers/etnaviv/etnaviv_compiler_nir.c:663:9: warning: unused variable 'ok' [-Wunused-variable] bool ok = emit_shader(c->nir, &options, &v->num_temps, &num_consts); ^~ Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-08-12 16:58:13 +00:00
Bas Nieuwenhuizen	e040c1b274	radv: Do not setup attachments without a framebuffer. Test that found this: dEQP-VK.geometry.layered.1d_array.secondary_cmd_buffer Fixes: `49e6c2fb78` "radv: Store color/depth surface info in attachment info instead of framebuffer." Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-12 17:19:24 +02:00
Jason Ekstrand	14c96a6300	anv: Implement VK_EXT_subgroup_size_control version 2 The version bump adds a proper features struct. Fixes: `d10de25309` "anv: Implement VK_EXT_subgroup_size_control" Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-12 14:56:33 +00:00
Jason Ekstrand	8aef89cc2d	vulkan: Update the XML and headers to 1.1.119 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 14:56:33 +00:00
Bas Nieuwenhuizen	d062bec48d	radv: Hash Wave32 settings in shader key. Can result in different shaders. Fixes: `8a86908e9a` "radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders" Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-12 13:32:18 +00:00
Bas Nieuwenhuizen	ba8d3c362b	radv: Properly use Wave64 for non-NGG GS and copy shader. Fixes: `8a86908e9a` "radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders" Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-12 13:32:18 +00:00
Bas Nieuwenhuizen	035406ecf7	radv: Put wave size in shader options/info. Instead of having the three values everywhere. This is also more future proof if we want the driver to make those decisions eventually. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-12 13:32:18 +00:00
Bas Nieuwenhuizen	71621e877f	relnotes: Make entries for radv more consistent. Always use 'on' as for the rest of the drivers. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-12 13:29:27 +00:00
Bas Nieuwenhuizen	38961729a8	relnotes: Add new exts on radv for 19.2. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-12 13:29:27 +00:00
Tapani Pälli	d4b574f26a	iris: reorder arguments as expected by the function CID: 1452262 Fixes: `b4c54894bb` "iris: Handle vertex shader with window space position" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>	2019-08-12 13:08:26 +03:00
Tapani Pälli	590ba15d6e	iris/android: move iris_query.c to 'per gen' LIBIRIS_SRC_FILES Fixes Iris build on Android. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-08-12 10:06:36 +03:00
Kenneth Graunke	0f3768bc5d	iris: Free query on error path CID: 1452276	2019-08-11 14:04:31 -07:00
Kenneth Graunke	661be3fef9	iris: Add missing 'break' We don't want to fall through to unreachable(). CID: 1452277	2019-08-11 14:04:31 -07:00
Caio Marcelo de Oliveira Filho	5ed4e31c08	spirv: Drop lower_workgroup_access_to_offsets Intel drivers are not using this anymore, and turnip still doesn't have Compute Shaders, so it won't make a difference to stop using this option. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Rob Clark <robdclark@chromium.org>	2019-08-10 22:15:35 -07:00
Caio Marcelo de Oliveira Filho	925e9142bd	i965/spirv: Lower shared memory later Instead of asking spirv_to_nir to lower the workgroup (shared memory) to offsets, keep them as derefs longer, then lower it later on. Because Workgroup memory doesn't have explicit offsets, we need to set those using nir_lower_vars_to_explicit_types before calling the I/O lowering pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-10 22:15:35 -07:00
Danylo Piliaiev	61d6be84f3	i965: Use force_compat_profile driconf option Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-10 11:39:29 -07:00
Eric Engestrom	d7eb40962b	i965: fix mem leak in error path Fixes: `8ae6667992` ("intel/perf: move query_object into perf") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2019-08-10 12:14:56 +01:00
Eric Engestrom	1c82fa0a92	gitlab-ci: simplify $CROSS option Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-08-10 12:11:28 +01:00
Kenneth Graunke	f1dba99639	iris: minor restyling	2019-08-10 00:16:45 -07:00
Mark Janes	9c597514d4	iris/query: enable amd performance monitors Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-09 19:28:34 -07:00
Mark Janes	469af7fdc9	iris/perf: get monitor results Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-09 19:28:32 -07:00
Mark Janes	1cb4fc184f	iris/perf: add begin/end hooks Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-09 19:28:24 -07:00
Mark Janes	8c4c346665	iris/perf: add delete query Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-09 19:28:17 -07:00
Mark Janes	aca42759ff	iris/perf: implement iris_create_monitor_object This is the first call that provides the iris context to the monitor implementation. On the first call, use the iris context to initialize the monitor context. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-09 19:28:14 -07:00
Mark Janes	0fd4359733	iris/perf: implement routines to return counter info With this commit, Iris will report that AMD_performance_monitor is supported, and will allow the caller to query the available metrics. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-09 19:28:03 -07:00
Eric Engestrom	e4aa0fc63a	anv: add missing `break` Fixes: `f6e7de41d7` ("anv: Implement VK_EXT_line_rasterization") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-09 23:34:31 +01:00
Lionel Landwerlin	e2d761de03	util: drop final reference to p_compiler.h Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-09 22:59:43 +03:00
Lionel Landwerlin	85bf1dc2de	util: os_misc: drop p_compiler.h include Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-09 22:59:43 +03:00
Lionel Landwerlin	c44c3948c7	util: u_math: drop p_compiler.h include This file was moved from gallium so drop depending on gallium headers. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-09 22:59:43 +03:00
Lionel Landwerlin	8818db8f2c	vc4: prepare for p_compiler.h dependency removal Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-09 22:59:43 +03:00
Lionel Landwerlin	8a884a25c5	amd: prepare dropping include of p_compiler.h Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-09 22:59:43 +03:00
Lionel Landwerlin	a233a3a74e	mesa: be consistent on GL_TRUE/GL_FALSE & TRUE/FALSE Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-09 22:59:43 +03:00
Lionel Landwerlin	8f4dea20fc	mesa: drop some p_compiler.h types Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-09 22:50:29 +03:00
Lionel Landwerlin	7abac7a8bf	mesa: add stddef include in preparation for dropping p_compiler.h Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-09 22:50:17 +03:00
Lionel Landwerlin	6637395073	panfrost: prepare for p_compiler.h dependency removal Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-09 22:50:03 +03:00
Lionel Landwerlin	351c2ad157	i965: don't use p_compiler.h types Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-09 22:49:48 +03:00
Eric Engestrom	9be5ce1d73	gitlab-ci: generate meson cross-files earlier Suggested-by: Michel Dänzer <michel@daenzer.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-09 20:07:50 +01:00
Alyssa Rosenzweig	9bc99e60a8	panfrost: Assign varying buffers dynamically Rather than hardcoding certain varying buffer indices "by convention", work it out at draw time. This added flexibility is needed for futureproofing and will enable streamout. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-09 11:53:21 -07:00
Alyssa Rosenzweig	46dae9ef58	panfrost: Assign indices at draw-time This will allow us to shuffle buffers. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-09 11:53:21 -07:00
Alyssa Rosenzweig	af6d3f7cb5	panfrost: Break out pan_varyings.c This code is fairly self-contained, so let's factor it out of the giant pan_context.c monster. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-09 11:53:21 -07:00
Alyssa Rosenzweig	4dba493fd7	panfrost: Enable PIPE_CAP_STREAM_OUTPUT_INTERLEAVE_BUFFERS Just as easy/hard as the rest of XFB. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-09 11:53:21 -07:00
Alyssa Rosenzweig	5ff7973560	panfrost: Import streamout data structures Pretty much copypasted from v3d to jumpstart us. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-09 11:53:21 -07:00
Alyssa Rosenzweig	c82672c9c1	pan/midgard: Account for swizzle/mask in st_vary Register allocation for varying stores is a bit different, since the instructions ignore the writemask (varyings are normalized packed/vectorized..) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-09 11:50:45 -07:00
Alyssa Rosenzweig	5ad83015cd	pan/decode: Resolve crash with NULL attr/varyings This case needs more investigation, but this was found with geometry shaders. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-09 11:50:45 -07:00
Krzysztof Raszkowski	c0ab268f9c	gallium/swr: Fix glClear when it's used with glEnable/glDisable GL_SCISSOR_TEST When GL_SCISSOR_TEST is enabled glClear is handled by state tracker and there is no need to do this in gallium driver. Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-08-09 18:56:13 +02:00
Gurchetan Singh	d6f8ce1c96	util: Revert "util: added missing headers in anon-file" This reverts commit `c73988300f`. Reason: Made a fix for this, then saw @eric's change ("util/anon_file: add missing"), but some sequence of events I don't really remember caused this to get merged. So revert ;-) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-09 09:13:45 -07:00
Marek Vasut	bb47bedc85	etnaviv: Remove etna_bo_from_handle() prototype Remove etna_bo_from_handle() as there are no known users. Signed-off-by: Marek Vasut <marex@denx.de> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2019-08-09 17:21:55 +02:00
Lionel Landwerlin	cefb4341b7	anv: drop unused code We stopped using this when we moved to Jason's mi_builder. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-09 17:01:38 +03:00
Christian Gmeiner	889e752965	etnaviv: fix typo Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-08-09 13:08:20 +00:00
Christian Gmeiner	de5070ea8d	etnaviv: add gpu_supports_texture_target(..) Currently I am seeing a handful of the following debug message: translate_texture_target:495: Unhandled texture target: 0 PIPE_BUFFER is not handled in translate_texture_target(..) which makes sense as it is used to translate from PIPE_XXX to GPU specific value during etna_create_sampler_view_state(..). To fix this problem introduce gpu_supports_texture_target(..) which just checks if the texture target is supported. Fixes: `dfe048058f` ("etnaviv: support 3D and 2D array textures") Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-08-09 13:08:20 +00:00
Jon Turney	0141b7c6b2	util: Cygwin has linux-style pthread_setname_np Fixes: `dcf9d91a` ("util: Handle differences in pthread_setname_np")	2019-08-09 12:46:43 +00:00
Tapani Pälli	5e38db0c47	anv/android: disable shared presentable image support explicitly Android 9 loader conditionally advertises VK_KHR_shared_presentable_image extension based on this property and it looks like it does not initialize the struct before query. Pragmas are added to ignore warnings with Android specific structure types in same manner as commit `8d386e6eef` did. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-09 08:53:54 +03:00
Vasily Khoruzhick	39a90749af	lima: introduce a struct describing texture descriptor Use a struct with bitfields to construct texture descriptor instead of poking bits in array of uint32_t. It improves code readability and makes it easier to experiment with unknown fields. Also fix mipmapping while we're at it - Utgard can have up to 13 levels, but 64 bytes is enough only for 10. Calculate descriptor size dynamically to account for extra levels if we need them. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-08 19:17:20 -07:00
Vasily Khoruzhick	edf008c04e	lima: add texel format table Introduce a table for supported texel formats and use it to check whether format is supported and for converting pipe format to lima texel format. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-08 19:17:20 -07:00
Gurchetan Singh	c73988300f	util: added missing headers in anon-file Otherwise I get: ../src/util/anon_file.c: In function ‘create_tmpfile_cloexec’: ../src/util/anon_file.c:75:9: error: implicit declaration of function ‘mkostemp’ [-Werror=implicit-function-declaration] fd = mkostemp(tmpname, O_CLOEXEC); ^~~~~~~~ ../src/util/anon_file.c:133:7: error: implicit declaration of function ‘asprintf’ [-Werror=implicit-function-declaration] asprintf(&name, "%s/mesa-shared-%s-XXXXXX", path, debug_name); ^~~~~~~~ ../src/util/anon_file.c:141:4: error: implicit declaration of function ‘free’ [-Werror=implicit-function-declaration] free(name) Fixes: c0376a ("util: add anon_file.h for all memfd/temp file usage")	2019-08-08 16:21:57 -07:00
Gurchetan Singh	42759dc986	virgl: check scanout mask Otherwise, virgl will report renderable or texturable formats as also scan-out formats. v2: drop host feature check (@kusma) Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-08-08 16:21:57 -07:00
Gurchetan Singh	3da029ac1a	virgl: fixup_readback_format --> fixup_formats This function is generalizable. Suggested-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-08-08 16:21:57 -07:00
Gurchetan Singh	bf0ca99ec7	virgl: access caps in a less verbose way in virgl_is_format_supported Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-08-08 16:21:57 -07:00
Alyssa Rosenzweig	5a898e2a65	pan/midgard: Disassemble load/store barrel shift Arm assembly intensifies. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-08 15:49:12 -07:00
Eric Engestrom	525a917c6c	util/anon_file: const string param Fixes: `c0376a1234` ("util: add anon_file.h for all memfd/temp file usage") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Eric Anholt <eric@anholt.net> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>	2019-08-08 22:02:54 +01:00
Eric Engestrom	8a028b0df2	util/anon_file: drop unused #include Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Eric Anholt <eric@anholt.net> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>	2019-08-08 22:02:54 +01:00
Eric Engestrom	60af7f5a81	util/anon_file: add missing #include Fixes: `c0376a1234` ("util: add anon_file.h for all memfd/temp file usage") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Eric Anholt <eric@anholt.net> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>	2019-08-08 22:02:54 +01:00
Greg V	ac1561088d	intel/perf: use MAJOR_IN_SYSMACROS/MAJOR_IN_MKDEV Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Fixes: `134e750e16` ("i965: extract performance query metrics")	2019-08-08 21:44:33 +01:00
Greg V	0233372581	util: fix cpuset support on FreeBSD Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-08 21:44:33 +01:00
Greg V	c00ee00031	i965/tiled_memcpy: avoid creating bswap32 if it exists as a macro (e.g. on FreeBSD) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-08 21:44:33 +01:00
Greg V	7b520dc74f	anv: add MAP_POPULATE fallback define for portability FreeBSD does not have MAP_POPULATE Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-08 21:44:33 +01:00
Greg V	2be3f16600	anv: remove unused Linux-specific include Fixes: `4201cc2dd3` ("anv: Implement VK_KHX_external_semaphore_fd") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-08 21:44:33 +01:00
Greg V	c0dc5c1859	meson: define ETIME to ETIMEDOUT if not present Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-08 21:44:33 +01:00
Roman Stratiienko	28061e0ab0	lima: Fix Android.mk 1. Update LOCAL_SRC_FILES according to commit `54434fe670` ("lima/gpir: Rework the scheduler"). 2. Add libpanfrost_shared.a dependency. 3. Generate lima_nir_algebraic.c with Android.mk Fixes Android build error introduced by commit `5adfc8602c` ("lima/ppir: move sin/cos input scaling into NIR") Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Acked-by: Qiang Yu <yuq825@gmail.com>	2019-08-08 17:47:22 +00:00
Roman Stratiienko	26a01a6797	Add libpanfrost_shared to Android build 1. Add missing directory to ./Android.mk 2. Fix ./src/panfrost/Android.shared.mk Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com> Reviewed-by: Icenowy Zheng <icenowy@aosc.io> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Acked-by: Qiang Yu <yuq825@gmail.com>	2019-08-08 17:47:22 +00:00
Rhys Perry	c52c54a746	anv,i965,iris: deduplicate setting of total_shared v5: add patch Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-08 12:10:39 -05:00
Rhys Perry	024a46a407	anv: use derefs for shared memory access vkpipeline-db for my Skylake GPU: total instructions in shared programs: 8847602 -> 8847896 (<.01%) instructions in affected programs: 10165 -> 10459 (2.89%) helped: 8 HURT: 2 total cycles in shared programs: 1606273555 -> 1606251634 (<.01%) cycles in affected programs: 2201803 -> 2179882 (-1.00%) helped: 7 HURT: 3 The shaders with more instructions are due to a loop over a shared array in Three Kingdoms being unrolled (and creating a lot of nested ifs). Not sure if that's good or bad. One of the shaders with worse cycles is only worse by 0.04% and the other two are the shaders with loops unrolled. v2: add patch v4: don't set spirv_options.shared_addr_format v4: move comment concerning the shared address format used and NULL v4: add vkpipeline-db results v5: rename to nir_lower_vars_to_explicit_types v5: move setting of total_shared to outside brw_compile_cs v6: set shared_addr_format v6: formatting changes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (v5) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-08 12:10:39 -05:00
Rhys Perry	fd73ed1bd7	nir: add nir_lower_to_explicit() v2: use glsl_type_size_align_func v2: move get_explicit_type() to glsl_types.cpp/nir_types.cpp v2: use align() instead of util_align_npot() v2: pack arrays a bit tighter v2: rename mem_* to field_* v2: don't attempt to handle when struct offsets are already set v2: use column_type() instead of recreating it v2: use a branch instead of \|= in nir_lower_to_explicit_impl() v2: assign locations to variables and update shared_size and num_shared v2: allow the pass to be used with nir_var_{shader_temp,function_temp} v4: rebase v5: add TODO v5: small formatting changes v5: remove incorrect assert in get_explicit_type() v5: rename to nir_lower_vars_to_explicit_types v5: correctly update progress when only variables are updated v5: rename get_explicit_type() to get_explicit_shared_type() v5: add comment explaining how get_explicit_shared_type() is different v5: update cast strides v6: update progress when lowering nir_var_function_temp variables v6: formatting changes v6: add more detailed documentation comment for get_explicit_shared_type v6: rename get_explicit_shared_type to get_explicit_type_for_size_align v7: fix comment in nir_lower_vars_to_explicit_types_impl() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (v5) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-08 12:10:39 -05:00
Rhys Perry	8bd2e138f5	nir/lower_explicit_io: add nir_var_mem_shared support v2: require nir_address_format_32bit_offset instead v3: don't call nir_intrinsic_set_access() for shared atomics Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-08 12:10:39 -05:00
Erik Faye-Lund	1e21bb4123	mesa: avoid warning on Windows On Windows, p_atomic_inc_return returns an unsigned long long rather than the type the pointer refers to, so let's make sure we cast the result to the right type. Otherwise, we'll trigger a warning about the wrong format-string for the type. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-08-08 18:20:29 +02:00
Erik Faye-Lund	e0a740c633	mesa/main: cast away constness This avoids a warning about implicitly casting away the constness of the pointer. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-08-08 18:20:29 +02:00
Erik Faye-Lund	75097114d9	spirv: fixup signature This avoids a warning on some compilers, complaining about implicitly casting the function-pointer. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `d482a8f` "spirv: Update the OpenCL.std.h header" Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-08-08 18:20:29 +02:00
Lucas Stach	68c24b09c2	etnaviv: remember data offset into BO Imported resources might not start at offset 0 into the buffer object. Make sure to remember the offset that is provided with the handle on import. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-08-08 16:11:34 +02:00
Danylo Piliaiev	b8842bc312	i965: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D There is an object-level preemption workaround which requires this. However, even without object-level preemption, we seem to have issues with geometry flickering when 3D and compute are combined in the same batch and this appears to fix it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110395 Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2019-08-08 13:39:15 +00:00
Bas Nieuwenhuizen	23a9d20997	radv: Avoid VEGA/RAVEN scissor bug in binning. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-08 14:08:21 +02:00
Bas Nieuwenhuizen	4a3f987afd	radv: Avoid binning RAVEN hangs. Mirroring radeonsi. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-08 14:08:21 +02:00
Bas Nieuwenhuizen	66ecc3eac8	radv: Fix off by one for S_028C48_MAX_ALLOC_COUNT. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-08 14:08:21 +02:00
Jan Zielinski	207026d29e	swr/rasterizer: modernize thread TLB Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-08-08 12:33:21 +02:00
Jan Zielinski	387599a661	swr/rasterizer: Refactor events collection mechanism Several improvements and cleanups in events and statistics mechanisms Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-08-08 11:15:07 +02:00
Jan Zielinski	ff75c35846	swr/rasterizer: improvements in simdlib 1. fix build issues with MSVC 2019 compiler The MSVC 2019 compiler seems to have an issue with optimized code-gen when using the _mm256_and_si256() intrinsic. Only disable use of integer vpand on buggy versions of MSVC 2019. Otherwise allow use of integer vpand intrinsic. 2. Remove unused vec/matrix functionality Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-08-08 10:53:47 +02:00
Jan Zielinski	b55a93fdd4	swr/rasterizer: Events are now grouped and enabled by knobs All events are now grouped as follows: -Framework (i.e. ThreadStart) [always ON] -Api (i.e. SwrSync) [always ON] -Pipeline [default ON] -Shader [default ON] -SWTag [default OFF] -Memory [default OFF] Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-08-08 10:33:25 +02:00
Jan Zielinski	982d99490f	swr/rasterizer: do not mark tiles dirty until actually rendered Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-08-08 10:16:20 +02:00
Jan Zielinski	4f04f260d9	swr/rasterizer: enable size accumulation in mem stats Small refactoring is also performed Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-08-08 10:16:20 +02:00
Jan Zielinski	365ad367f1	swr/rasterizer: enable using AOS vertex data format Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-08-08 10:16:20 +02:00
Iago Toral Quiroga	fb9f7872e7	v3d: handle wait requirement when retrieving query results correctly Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-08 08:36:52 +02:00
Iago Toral Quiroga	0f2d1dfe65	v3d: use the GPU to record primitives written to transform feedback We can use the PRIMITIVE_COUNTS_FEEDBACK packet to write various primitive counts to a buffer, including the number of primitives written to transform feedback buffers, which will handle buffer overflow correctly. There are a couple of caveats with this: Primitive counters are reset when we emit a 'Tile Binning Mode Configuration' packet, which can happen in the middle of a primitives query, so we need to read the buffer when we submit a job and accumulate the counts in the context so we don't lose them. We also need to do the same when we switch primitive type during transform feedback so we can compute the correct number of recorded vertices from the number of primitives. This is necessary so we can provide an accurate vertex count for draw from transform feedback. v2: - When computing the number of vertices for a primitive, pass in the base primitive, since that is what the hardware will count. - No need to update primitive counts when switching primitive types if the base primitives are the same. - Log perf warning when mapping the primitive counts BO for readback (Eric). - Only emit the primitive counts packet once at job end (Eric). - Use u_upload mechanism for the primitive counts buffer (Eric). - Use the XML to generate indices into the primitive counters buffer (Eric). 
Fixes piglit tests: spec/ext_transform_feedback/overflow-edge-cases spec/ext_transform_feedback/query-primitives_written-bufferrange spec/ext_transform_feedback/query-primitives_written-bufferrange-discard spec/ext_transform_feedback/change-size base-shrink spec/ext_transform_feedback/change-size base-grow spec/ext_transform_feedback/change-size offset-shrink spec/ext_transform_feedback/change-size offset-grow spec/ext_transform_feedback/change-size range-shrink spec/ext_transform_feedback/change-size range-grow spec/ext_transform_feedback/intervening-read prims-written Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-08 08:36:52 +02:00
Iago Toral Quiroga	cf8986bce0	gallium/util: add a helper to compute vertex count from primitive count v2: - Only compute vertex counts for base primitives. - Add a unit test (Eric) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-08 08:36:52 +02:00
Iago Toral Quiroga	9eb8699e0f	v3d: be more explicit about the query types supported Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-08 08:36:52 +02:00
Iago Toral Quiroga	9b316ab57a	v3d: generate packet unpack functions These were not being compiled because of the lack of __gen_unpack_address. v2: - Shift raw address correctly (Eric). Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-08 08:36:52 +02:00
Iago Toral Quiroga	5ffb8b1716	v3d: add header guards in v3d_packet_helpers.h Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-08 08:36:52 +02:00
Tomeu Vizoso	e7eac8a1e8	panfrost: Print errors from kernel Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-08 07:42:52 +02:00
Tomeu Vizoso	7c8434889d	panfrost: Mark buffers as PANFROST_BO_HEAP What we call GROWABLE in Mesa corresponds to the HEAP BO flag in the kernel. These buffers cannot be memory mapped in the CPU side at the moment, so make sure they are also marked INVISIBLE. This allows us to allocate a big heap upfront (16MB) without actually reserving space unless it's needed. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-08 07:42:52 +02:00
Tomeu Vizoso	19afd41e65	panfrost: Mark BOs as NOEXEC Unless a BO has the EXECUTABLE flag, mark it as NOEXEC. v2: - Rework version detection (Alyssa). Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-08 07:42:52 +02:00
Tomeu Vizoso	9398932c2d	panfrost: Take into account flags when looking up in the BO cache This will be useful right now so we avoid retrieving a non-executable buffer when an executable one is needed. As we support more flags, this logic will need to be extended to consider the different trade-offs to be made when matching BO specifications to BOs in the cache. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-08 07:42:52 +02:00
Tomeu Vizoso	950b5fc596	panfrost: Allocate shaders in their own BOs Instead of all shaders being stored in a single BO, have each shader in its own. This removes the need for a 16MB allocation per context, and allows us to place transient blend shaders in BOs marked as executable (before they were allocated in the transient pool, which shouldn't be executable). v2: - Store compiled blend shaders in a malloc'ed buffer, to avoid reading from GPU-accessible memory when patching (Alyssa). - Free struct panfrost_blend_shader (Alyssa). - Give the job a reference to regular shaders when emitting (Alyssa). v3: - Split out the allocation flags change (Rob). Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-08 07:42:52 +02:00
Tomeu Vizoso	5804d75b9c	util/hash_table: Fix hashing in clears on 32-bit Some hash functions (eg. key_u64_hash) will attempt to dereference the key, causing an invalid access when passed DELETED_KEY_VALUE (0x1) or FREED_KEY_VALUE (0x0). On a 32-bit arch, a 64-bit key value doesn't fit into a pointer, so hash_table_u64 internally uses a pointer to a struct containing the 64-bit key value. Fix _mesa_hash_table_u64_clear() to handle the 32-bit case by creating a temporary hash_key_u64 to pass to the hash function. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-08-08 07:42:52 +02:00
Tapani Pälli	aba57b11ee	anv: support GetSwapchainGrallocUsage2ANDROID for Android New function supports gralloc1 usage flags that get set separately for producer and consumer. As we still need to support old method too, let's share common code and use android_convertGralloc0To1Usage helper. Bump the VK_ANDROID_native_buffer version to indicate support for the new call. Changes were tested on Android Celadon P with Basemark GPU and various Sascha Willems Vulkan demos. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-08 05:08:01 +00:00
Mark Janes	51c3ab618b	st/mesa: eliminate unnecessary redirection Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	61c54a8878	intel/perf: fix debug typo Misspelling was seen with INTEL_DEBUG=perfmon. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	2df1ab4d48	intel/perf: make gen_perf_query_object private Encapsulate the details of this structure within the perf implementation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	deea3798b6	intel/perf: make perf context private Encapsulate the details of this data structure. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	1f4f421ce0	intel/perf: print debug information INTEL_DEBUG=perfmon will iterate over the perf queries, printing information about the state of each query. Some of this information will be private to intel/perf, and needs a dump routine that can be called from i965. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	a663c8c26e	intel/perf: make internal methods private Now that all references from i965 have been moved to perf, we can make internal methods private again. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	be8b466cff	intel/perf: make oa_sample_buffers private All references to this data structure have been moved inside the perf subsystem. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	f2a049b4e3	intel/perf: expose method to create query By encapsulating this implementation within perf, we can eventually make struct gen_perf_ctx private. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	9f5c160d82	intel/perf: move initialization of pipeline statistics metrics to gen_perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	9f84efb452	intel/perf: move get_query_data into gen_perf This refactor moves several helper functions for get_query_data as well: - accumulate_oa_reports - read_gt_frequency - get_pipeline_stats_data - get_oa_counter_data Functions which are no longer referenced in brw_performance_query.c have been removed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	73eccdc4a5	intel/perf: move delete_query to gen_perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	8c9eac1234	intel/perf: move is_query_ready to gen_perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	a9be292722	intel/perf: move wait_query to perf The following methods have duplicate implementation of read_oa_samples_until in brw_performance_query.c: - read_oa_samples_for_query - read_oa_samples_until They are still referenced by other methods in the file and will be removed in a subsequent commit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	3c8ed58486	intel/perf: create a vtable entry for bo_busy Iris and i965 variants of this method need to be called by perf routines. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	6fed756388	intel/perf: create a vtable entry for bo_wait_rendering Iris and i965 variants of this method need to be called by perf routines. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	511bb15d4b	intel/perf: create a vtable entry for batch_references Iris and i965 variants of this method need to be called by perf routines. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	3ecb23092e	intel/perf: refactor gen_perf_end_query into gen_perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	018f9b81e5	intel/perf: refactor gen_perf_begin_query into gen_perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	52d3db9ab6	intel/perf: move perf-related state into gen_perf_context To move more operations into intel/perf, several state items are needed. Save references to that state in the perf_ctxt, rather than passing them in for every operation. This commit includes an initializer for gen_perf_context, to set those references and also encapsulate the initialization of the sample buffer state. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	df18acee78	intel/perf: create a vtable entries for buffer object map/unmap These operations are needed to refactor subsequent methods into perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	a330d759c5	intel/perf: move client reference counts into perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	4d0d4aa1b5	intel/perf: move open_perf into perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	79ded7cc8f	intel/perf: move close_perf into perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	f57c8a6dc1	intel/perf: create a vtable entry for emit_mi_flush This method is needed to move subsequent methods into perf. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	52f7a0bff7	intel/perf: use temporary pointers to simplify access to perf state Most accesses to perf state were made through repeated dereferences of brw_context members. Preferring temporary variables of perf_ctx and perf_cfg has the following advantages: - more concise implementation - easier refactor when moving subsequent methods to perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	a157f5acb1	intel/perf: move snapshot_statistics_registers into perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	8ae6667992	intel/perf: move query_object into perf Query objects can now be encapsulated within the perf subsystem. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	7e890ed476	intel/perf: create a vtable entry for store_register_mem64 This method is needed to move subsequent methods into perf. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	4b2c885207	intel/perf: move free_sample_bufs into perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	2f712d21b9	intel/perf: move reap_old_sample_buffers into perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	31758bd36c	intel/perf: move get_free_sample_buf into perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	e08a69b7f4	intel/perf: move the perf context into perf The "context" that is necessary to submit and process perf commands to the hardware was previously present in the brw_context.perfquery struct. This commit moves it into perf and provides a more understandable name. The intention is for this struct to be private, when all methods that access it are migrated into perf. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	fb622054f7	intel/perf: move get_metric_id to perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	b14e15e26a	intel/perf: move oa_sample_buf structure to perf oa_sample_buf holds the data provided by the kernel that will be collated into performance metrics. Since this functionality will be implemented in perf, the struct needs to be defined there. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	e091f33990	intel/perf: enumerate query-based metrics in perf Iris and i965 both need to enumerate the available metrics, so these routines must be located in perf. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	2446f5cfd8	intel/perf: move perf-related constants to common location The perf subsystem needs several macro definitions that were duplicated in Iris and i965 headers. Place these macros within perf, if the perf implementation contains the only references to the values. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	67675a5802	intel/perf: create a vtable entry for capture_frequency_stat_register In preparation for calling both Iris and i965 implementations from perf. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	ae3fac851d	intel/perf: create a vtable entry for batchbuffer_flush In preparation for calling both Iris and i965 implementations from perf. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	a921b215dd	intel/perf: create a vtable entry for emit_report_count In preparation for calling both Iris and i965 implementations from perf. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	9a2a2e8bea	intel/perf: create a vtable entry for bo_unreference In preparation for calling both Iris and i965 implementations from perf. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	439d5a3eff	intel/perf: create a vtable for low-level driver functions Performance metrics collection requires several actions (e.g. bo_map()) that have different implementations for Iris and i965. The perf subsystem needs a vtable for each of these actions, so it can invoke the corresponding implementation for each driver. The first call to be added to the table is bo_alloc. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	ea66484e86	intel/perf: use common ioctl wrapper There were multiple ioctl-wrapper functions, so a common implementation was put in gen_gem.h. With a common implementation, perf no longer needs the caller to configure one for it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	07d3bd5c46	intel/perf: rename gen_perf to gen_perf_config This structure contains the configurations of the metrics for the current platform, and the settings needed for the perf subsystem to query that configuration from the device. This data is available without a rendering context, and needed to support MDAPI metrics for Vulkan. A gen_perf_context struct will be added later, which holds additional state from the rendering context necessary for metric data collection. The gen_perf struct needs a more precise name to reduce confusion. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Ilia Mirkin	9ff8da0e50	nvc0: fix program dumping, use _debug_printf This debug situation is unfortunate. debug_printf only does something with DEBUG set, but in practice all that needs to be moved to !NDEBUG. For now, use _debug_printf which always prints. However the whole function is guarded by !NDEBUG. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-08-07 22:32:02 -04:00
Ilia Mirkin	f6af104340	nvc0: add support for ATOMC_WRAP TGSI operations Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-08-07 22:32:02 -04:00
Ilia Mirkin	a2bb7b26a1	gallium: redefine ATOMINC_WRAP to be more hardware-friendly Both AMD and NVIDIA hardware define it this way. Instead of replicating the logic everywhere, just fix it up in one place. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-07 22:31:56 -04:00
Ilia Mirkin	582c86346d	st/mesa: relax EXT_shader_image_load_store enable There's no reason to bring format-less load requirement into this extension. It requires a size to be provided, and a compatible format is computed from the size + data type. For example layout(size1x32) uniform iimage1D image; becomes DCL IMAGE[0], 1D, PIPE_FORMAT_R32_SINT, WR whereas PIPE_CAP_IMAGE_LOAD_FORMATTED is designed to allow PIPE_FORMAT_NONE to be provided as a format and still enable LOAD operations to be performed. So the shader has all the information it needs about the format. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-07 22:31:38 -04:00
Mark Janes	a29bc3a3ad	i965/perf: restore mdapi statistics query metrics Registration of mdapi metrics based on statistics query registers was inadvertently removed in the commit that checks for OA kernel support. The statistics queries are not dependent on OA. Fixes: `96e1c945f2` ("i965: Move device info initialization to common code") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-07 17:20:04 -07:00
Greg V	c0376a1234	util: add anon_file.h for all memfd/temp file usage Move the Weston os_create_anonymous_file code from egl/wayland into util, add support for Linux memfd and FreeBSD SHM_ANON, use that code in anv/aubinator instead of explicit memfd calls for portability. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-07 22:57:55 +00:00
Pierre-Eric Pelloux-Prayer	519bebdb40	radeonsi: limit DPBB context_states_per_bin batches when using gfx9 workaround It seems that using 'context_states_per_bin = 1' for DPBB fixes the reported issue. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110214 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-07 18:45:24 -04:00
Pierre-Eric Pelloux-Prayer	120d0ef937	radeonsi: reduce DPBB persistent_states_per_bin value for APUs Fixes some reported GPU hangs on RAVEN. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111231 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-07 18:45:22 -04:00
Pierre-Eric Pelloux-Prayer	6bda9ca062	radeonsi: fix typo in DPBB register field Also only set FLUSH_ON_BINNING_TRANSITION for GPU families that need it (matches what si_emit_dpbb_disable is doing). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-07 18:45:20 -04:00
Pierre-Eric Pelloux-Prayer	90bded140e	radeonsi: fix S_028C48_MAX_ALLOC_COUNT value This field uses "value minus 1" encoding. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-07 18:45:09 -04:00
Christian Gmeiner	323cda475b	etnaviv: drop struct etna_3d_state Also drop #if 0 code block. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com>	2019-08-07 22:12:00 +02:00
Yevhenii Kolesnikov	0325860e90	mesa: Use _mesa_delete_transform_feedback_object in drivers Function _mesa_delete_transform_feedback_object is now called from within drivers once driver-specific clean-up has been done. Brings this into conformity with how other GL objects are handled. CC: Eric Anholt <eric@anholt.net> CC: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-07 17:25:22 +00:00
Yevhenii Kolesnikov	4f767ded6e	mesa: use _mesa_delete_query in drivers Now drivers can call _mesa_delete_query once driver-specific clean-up has been done. Brings into conformity with how other GL objects are handled. CC: Eric Anholt <eric@anholt.net> CC: Kenneth Graunke <kenneth@whitecape.org> Suggested-by: Eric Anholt <eric@anholt.net> Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-07 17:25:22 +00:00
Juan A. Suarez Romero	4619535ab7	docs: update calendar, add news item and link release notes for 19.1.4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-08-07 18:51:32 +02:00
Juan A. Suarez Romero	a19d43ebd5	docs: add sha256 checksums for 19.1.4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `7fcb69a33c`)	2019-08-07 18:49:25 +02:00
Juan A. Suarez Romero	8484fafc78	docs: add release notes for 19.1.4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `b84ffa028d`)	2019-08-07 18:49:23 +02:00
Bas Nieuwenhuizen	5a26f528cb	meson,i965: Link with android deps when building for android. The DBG macro in brw_blorp.c ends up calling an android log function: error: undefined reference to '__android_log_print' v2: On suggestion from Lionel, hang the Android dependency onto a new libintel_common dependency. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-07 15:34:46 +02:00
Erik Faye-Lund	da9e2958ec	gallium/dump: add missing query-type to short-list Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `3f6b3d9db7` ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-07 12:03:24 +00:00
Erik Faye-Lund	70a93922db	gallium/dump: add missing query-type to short-list Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `a677799e51` ("gallium: add PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE and corresponding cap") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-07 12:03:24 +00:00
Eric Engestrom	32ce010951	gitlab-ci: don't install autotools deps These could've been deleted a long time ago, but apparently we forgot. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2019-08-07 10:18:25 +01:00
Eric Engestrom	5b10ddf358	util: fix mem leak of program path Fixes: `759b940389` ("util: Get program name based on path when possible") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-08-07 08:42:42 +01:00
Eric Engestrom	991137144a	meson: build intel-ui tools as part of `all` tools Reported-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111289 Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-07 08:19:31 +01:00
Eric Engestrom	c32ebfe003	gitlab-ci: add gtk3 dev files for `-D tools=intel-ui` We also need to update wayland-protocols and libXrandr (and randrproto), as they are too old for gdk3 (which gtk3 depends on). Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-07 08:19:30 +01:00
Jan Vesely	6b8269d0bb	clover: Fix build after clang r367864 v2: Drop special case of llvm-9 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Aaron Watry <awatry@gmail.com>	2019-08-06 23:33:55 -04:00
Timothy Arceri	d81e11332b	mesa: remove super old TODOs from shaderapi.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-07 13:31:40 +10:00
John Stultz	fcfa2d1447	mesa: freedreno: Android.registers.mk: Fix up register xml.h file generation The current Android.registers.mk file causes build failures that look like: FAILED: external/mesa3d/src/freedreno/Android.registers.mk:49: error: implicit rules are obsolete: out/target/product/linaro_db845c/gen/STATIC_LIBRARIES/libfreedreno_registers_intermediates/registers/%.xml.h Caused by the following Android build rule change: https://android.googlesource.com/platform/build/+/HEAD/Changes.md#implicit_rules I tried to replace this with something similar to the static pattern suggested in the URL above, but ended up getting all the xml.h files generated using only the first a2xx.xml source file. So I've fallen back to explicitly defining the make rules for each. Additionally, we needed to provide the proper LOCAL_EXPORT_C_INCLUDE_DIRS and add the defined static library to the components that depend on the register headers. Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: John Stultz <john.stultz@linaro.org>	2019-08-07 02:18:38 +00:00
John Stultz	96baf052b2	mesa: Add ir3/ir3_nir_imul.c generation to Android.mk With current master we're seeing build failures with AOSP: error: undefined symbol: ir3_nir_lower_imul This is due to the ir3_nir_imul.c file not being generated in the Android.mk files. This patch simply adds it to the Android build, after which things build and boot ok on db410c. Cc: Rob Clark <robdclark@chromium.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Amit Pundir <amit.pundir@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Alistair Strachan <astrachan@google.com> Cc: Greg Hartman <ghartman@google.com> Cc: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: John Stultz <john.stultz@linaro.org>	2019-08-07 02:18:19 +00:00
Rohan Garg	16edd56fcc	panfrost: Take into account an index_bias for glDrawElementsBaseVertex calls Midgard does not accept an index_bias directly and relies instead on a bias correction offset (offset_bias_correction) in order to calculate the unbiased vertex index. We need to make sure we adjust offset_start and vertex_count in order to take into account the index_bias as required by a glDrawElementsBaseVertex call and then supply an additional offset_bias_correction to the hardware. Signed-off-by: Rohan Garg <rohan.garg@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-06 17:18:19 -07:00
Bas Nieuwenhuizen	4bb17c08ae	radv/gfx10: Enable DCC for storage images. v2: Hide it behind a perftest flag. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen	3a5950f501	radv: Add device argument for dcc compression check. Because it is about to be generation dependent. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen	8c63ffe54d	radv: Disable compression for compute DCC decompress store. Previously we relied on stores not using DCC but that is going to change, so disable compression explicitly. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen	216a9d8871	radv: Add extra struct to image view creation. For extra args. Unlike image creation, I'm not embedding the vk struct in there, so all the inline structs can be kept. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen	50add1b33a	radv: Do not decompress on LAYOUT_GENERAL. We handle render loops properly now and STORAGE still disables DCC/TC-compat HTILE in general. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen	66131ceb8b	radv: Pass through render loop detection to internal layout decisions. And do nothing with it yet. Everything outside a renderpass has no render loop. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen	a171a6663d	radv: Add render loop detection in renderpass. VK spec 7.3: "Applications must ensure that all accesses to memory that backs image subresources used as attachments in a given renderpass instance either happen-before the load operations for those attachments, or happen-after the store operations for those attachments." So the only render loops we can have are with input attachments. Detect these. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-07 02:13:07 +02:00
Timothy Arceri	a5b9394b87	drirc: Add vendor workaround for Divinity: Original Sin EE Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93551	2019-08-07 10:12:49 +10:00
Timothy Arceri	dca119f12c	mesa/gallium: add drirc option to allow overriding GL vendor string Will be used in the following patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93551	2019-08-07 10:12:49 +10:00
Marek Olšák	c95e2a1c6b	relnotes/19.2: document EXT_texture_shadow_lod	2019-08-06 20:10:15 -04:00
Bas Nieuwenhuizen	04c6feb12c	radv: Fix config reg assert. Using the wrong bounds Fixes: "219d6939df8 radv: add more assertions to make sure packets are correctly emitted" Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-08-07 08:58:23 +10:00
Marek Olšák	16577f5002	tgsi_to_nir: add a few needed double opcodes for internal radeonsi shaders v2 (Connor): - Split out prep work from adding opcodes, and rewrite the former Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 18:03:26 -04:00
Marek Olšák	2207daf549	tgsi_to_nir: implement a few needed 64-bit integer opcodes for internal radeonsi shaders v2 (Connor): - Split this out from the prep work, and rework the former - Add support for U64SNE Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 18:03:24 -04:00
Connor Abbott	37f6350c1d	ttn: Prepare for 64-bit sources and destinations v2: Properly handle 32->64 bit conversions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 18:03:22 -04:00
Connor Abbott	4b10949482	ttn: Use 1-bit NIR comparison opcodes We shouldn't be using the versions that output a 32-bit boolean, since nir_opt_algebraic won't optimize them as well. Drivers will lower these to the 32-bit versions after optimizing, if appropriate. Also, this will make implementing 64-bit comparisons easier. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 18:03:19 -04:00
Connor Abbott	e7fd90e8ef	nir/builder: Add nir_b2i Same as nir_b2f but for integers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 18:03:10 -04:00
Pierre-Eric Pelloux-Prayer	f84c9ad17a	radeonsi: enable EXT_shader_image_load_store This depends on LLVM 10 because this needs https://reviews.llvm.org/D65283 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:41:07 -04:00
Pierre-Eric Pelloux-Prayer	25fff591c1	radeonsi: add support for nir atomic_inc_wrap/atomic_dec_wrap Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:41:06 -04:00
Pierre-Eric Pelloux-Prayer	8789248541	radeonsi: add support for tgsi ATOMDEC_WRAP / ATOMINC_WRAP opcodes Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:41:04 -04:00
Pierre-Eric Pelloux-Prayer	704a6b5948	ac: add ac_atomic_inc_wrap / ac_atomic_dec_wrap support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:41:03 -04:00
Pierre-Eric Pelloux-Prayer	a9ec718652	nir: add atomic_inc_wrap/atomic_dec_wrap image intrinsics Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:41:02 -04:00
Pierre-Eric Pelloux-Prayer	fc0a2e5d01	glsl: add EXT_shader_image_load_store new image functions This extension has 2 functions that are missing from the ARB versions: - imageAtomicIncWrap - imageAtomicDecWrap Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:41:00 -04:00
Pierre-Eric Pelloux-Prayer	70a47fb032	glsl: add EXT_shader_image_load_store keywords to lexer All of them already existed for ARB_shader_image_load_store. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:40:58 -04:00
Pierre-Eric Pelloux-Prayer	cfba168b6c	glsl: add size qualifiers from EXT_shader_image_load_store Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:40:56 -04:00
Pierre-Eric Pelloux-Prayer	cd45d09226	glsl: handle differences between ARB/EXT versions of shader_image_load_store Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:40:55 -04:00
Pierre-Eric Pelloux-Prayer	5db28b0cf7	mesa: add EXT_shader_image_load_store glBindImageTextureEXT function The implementation is almost identical to glBindImageTexture except for error checking. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:40:53 -04:00
Pierre-Eric Pelloux-Prayer	71e619a825	glapi: add EXT_shader_image_load_store Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:40:52 -04:00
Pierre-Eric Pelloux-Prayer	91924453ee	gallium: add PIPE_CAP_TGSI_ATOMINC_WRAP to indicate support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:40:51 -04:00
Pierre-Eric Pelloux-Prayer	8b6bfed3d2	tgsi: add ATOMICINC_WRAP/ATOMICDEC_WRAP opcode Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:40:34 -04:00
Marek Olšák	1d8a71af57	radeonsi/gfx10: enable all CUs for GS if NGG is never used Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:09:03 -04:00
Marek Olšák	91227a1e17	radeonsi/gfx10: add global use_ngg and use_ngg_streamout flags Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:09:02 -04:00
Marek Olšák	f064b530f6	radeonsi/gfx10: remove an obsolete VGT_REUSE_OFF workaround Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:09:01 -04:00
Marek Olšák	37dd8ebcf7	radeonsi/gfx10: disable LATE_ALLOC_GS on Navi14 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:59 -04:00
Marek Olšák	c5a6ecf61a	radeonsi/gfx10: implement a bug workaround for GE_PC_ALLOC Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:58 -04:00
Marek Olšák	8f8c28767e	radeonsi/gfx10: implement a bug workaround for NGG -> legacy transitions Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:57 -04:00
Marek Olšák	cb9d95623b	radeonsi/gfx10: implement a GE bug workaround Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:56 -04:00
Marek Olšák	e08b0d7ac4	radeonsi/gfx10: set GE_CNTL for tessellation correctly to match PAL Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:54 -04:00
Marek Olšák	71b53020b7	radeonsi/gfx10: simplify NGG code in si_update_shaders Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:53 -04:00
Marek Olšák	a232f5e07c	radeonsi/gfx10: fix input VGPRs for legacy VS Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:51 -04:00
Marek Olšák	8d90157d49	radeonsi: make sure that rasterizer state != NULL and remove all NULL checking Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:39 -04:00
Marek Olšák	8b8819e88a	radeonsi: make sure that DSA state != NULL and remove all NULL checking Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:39 -04:00
Marek Olšák	b758eed9c3	radeonsi: make sure that blend state != NULL and remove all NULL checking Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:39 -04:00
Marek Olšák	8b68511ebc	radeonsi: DCC MSAA blending bug - include logic op, limit to Navi14 and older Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:50 -04:00
Marek Olšák	e69c1c8b8f	radeonsi: determine accurately whether logic op is enabled Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:48 -04:00
Marek Olšák	b38f5eb17a	radeonsi: skip draw calls with 0-sized index buffers Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:39 -04:00
Marek Olšák	e777720173	radeonsi/nir: lower PS inputs before scanning the shader Lowering PS inputs can eliminate some of them, which messes up persp/linear barycentric coord usage info. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:46 -04:00
Marek Olšák	f818d9ae3c	radeonsi/nir: handle key.mono.u.ps.interpolate_at_sample_force_center Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:39 -04:00
Marek Olšák	b3eed3cff9	radeonsi: add missing prints into si_dump_shader_key Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:15 -04:00
Marek Olšák	6b3ee86989	radeonsi: disable SDMA image copies on dGPUs to fix corruption in games Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-08-06 17:08:08 -04:00
Pierre-Eric Pelloux-Prayer	0556932f4a	mesa: add EXT_dsa glMultiTexCoordPointerEXT function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:03:22 -04:00
Pierre-Eric Pelloux-Prayer	e364ddece3	mesa: add EXT_dsa glMultiTexGen* functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:03:21 -04:00
Pierre-Eric Pelloux-Prayer	e8e0de6a8f	mesa: add EXT_dsa glCopyMultiTexImage* and glCopyMultiTexSubImage* Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:03:19 -04:00
Pierre-Eric Pelloux-Prayer	f28d9ab1a3	mesa: add EXT_dsa glGetMultiTexParameteriv/fvEXT Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:03:18 -04:00
Pierre-Eric Pelloux-Prayer	989c375852	mesa: add EXT_dsa glMultiTexSubImage1D/2D/3DEXT Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:03:16 -04:00
Pierre-Eric Pelloux-Prayer	aac6578732	mesa: add EXT_dsa glMultiTexImage1D/2D/3DEXT + glGetMultiTexImageEXT Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:03:15 -04:00
Pierre-Eric Pelloux-Prayer	885dbe2e84	mesa: add glBindMultiTextureEXT display list support Fixes: `0972b0b059` ("mesa: add support for glBindMultiTextureEXT") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:03:13 -04:00
Pierre-Eric Pelloux-Prayer	d9e26c3483	mesa: add EXT_dsa glMultiTexParameter* functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:03:12 -04:00
Pierre-Eric Pelloux-Prayer	e04f95057f	mesa: add EXT_dsa (Get)MultiTexEnv functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:03:10 -04:00
Pierre-Eric Pelloux-Prayer	04b8e50bb8	mesa: add _mesa_(get)texenvi(f)v_indexed helpers They are exactly like _mesa_GetTexEnvfv/_mesa_GetTexEnviv except they take a GLuint texunit parameter instead of relying on ctx->Texture.CurrentUnit. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:03:08 -04:00
Pierre-Eric Pelloux-Prayer	0e595326c4	mesa: add new helper _mesa_get_texobj_by_target_and_texunit Based on the 'static get_texobj_by_target' function from texparam.c, but extended to also take the texunit as a parameter. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:03:06 -04:00
Pierre-Eric Pelloux-Prayer	58030d2b3d	mesa: replace _mesa_get_current_fixedfunc_tex_unit with _mesa_get_fixedfunc_tex_unit The new function implements the same feature but doesn't depend on ctx->Texture.CurrentUnit. This change allows it to be used from indexed functions. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-06 17:02:52 -04:00
Danylo Piliaiev	b4c54894bb	iris: Handle vertex shader with window space position Iris advertises support for PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION so let's actually implement it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110657 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-06 20:25:35 +00:00
Erico Nunes	b783f9f77e	lima: fix pipe_debug_callback warnings Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-06 20:29:53 +02:00
Vasily Khoruzhick	5adfc8602c	lima/ppir: move sin/cos input scaling into NIR Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-06 17:49:22 +00:00
Antia Puentes	954224b714	nir/spirv: Fix gl_BaseVertex for non-indexed draws for OpenGL Lowers BaseVertex to the correct system value for OpenGL. v2: use options->environment rather than adding a new flag to spirv_to_nir_options Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-06 09:11:27 -07:00
Kenneth Graunke	382f92a814	iris: Increase BATCH_SZ to 64kB This seems to improve performance by roughly 1% across the board. Thanks to Rafael Antognolli and Dan Walsh for their help tuning.	2019-08-06 09:09:26 -07:00
Bas Nieuwenhuizen	2af00b1fdd	ac/nir: Use correct cast for readfirstlane and ptrs. Fixes: `028ce527` "radv: Add non-uniform indexing lowering." Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-06 15:48:50 +00:00
Bas Nieuwenhuizen	2301b2e029	radv: Do non-uniform lowering before bool lowering. Since it can introduce comparisons. Fixes: `028ce52739` "radv: Add non-uniform indexing lowering." Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-06 15:48:50 +00:00
Jonathan Marek	dfe048058f	etnaviv: support 3D and 2D array textures Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-08-06 10:37:36 -04:00
Jonathan Marek	3508f2fb18	etnaviv: fix 3d texture upload Fix uploading of 3D textures and 2D array textures: * Remove asserts in BLT and RS checking z * Use box->z/box->depth in etna_copy_resource_box and CPU tile/untile * Track mip level depth and use it in etna_copy_resource Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-08-06 10:37:36 -04:00
Jonathan Marek	ed7a27719a	etnaviv: add alternative NIR compiler enable with ETNA_MESA_DEBUG=nir Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>	2019-08-06 10:33:17 -04:00
Jonathan Marek	ee1ed59458	etnaviv: prep for UBOs Allow UBO relocs and only emitting uniforms that are actually used. GC7000Lite has no address register, so upload uniforms to a UBO object to LOAD from. I removed the code to check for changes to individual uniforms and just reupload the entire uniform state when the state is dirty. I think there was very limited benefit to it and it isn't compatible with relocs. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-08-06 10:33:17 -04:00
Jonathan Marek	ca58c1120e	etnaviv: disasm: add dual16 bits, immediate decoding, and some opcodes Also use structs from etnaviv_asm since they hold the same information. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-08-06 10:33:17 -04:00
Jonathan Marek	e9a5181ad6	etnaviv: asm: new features * Dual16 bits * Halti5 disable multiple uniform src * write_mask compose * Halti2+ immediates Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-08-06 10:33:17 -04:00
Jonathan Marek	98e59f0a0a	etnaviv: update headers from rnndb Update to etna_viv commit f38ba2d. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-08-06 10:33:17 -04:00
Erico Nunes	e0aeee9460	lima: add summary report for shader-db Very basic summary, loops and gpir spills:fills are not updated yet and are only there to comply with the strings to shader-db report.py regex. For now it can be used to analyze the impact of changes in instruction count in both gpir and ppir. The LIMA_DEBUG=shaderdb setting can be useful to output stats on applications other than shader-db. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-08-06 15:43:31 +02:00
Erico Nunes	9e41a514a8	lima: add support for debug callback This adds support for glDebugMessageCallback which is required to support shader-db reports. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-08-06 15:43:26 +02:00
Tomeu Vizoso	67f4e1e787	panfrost/ci: Remove two tests from list of failures These tests have been fixed by: `b514f41183` ("glcpp: use pre-expansion line number for __LINE__") Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-08-06 15:19:43 +02:00
Jon Turney	84fae8e649	st/dri: Move dri2_format_mapping table and its accessors from dri2.c to dri_helpers.c `8af1990a` exposed dri2_get_mapping_by_fourcc() in dri_helpers.h, so it could be used by dri_get_egl_image(), but didn't move it. This breaks the build in the with_dri=false case (e.g. when building for a target which doesn't have libdrm, so swrast is the only dri driver built)	2019-08-06 12:21:56 +00:00
Jonathan Marek	b514f41183	glcpp: use pre-expansion line number for __LINE__ Fixes the following deqp tests: dEQP-GLES2.functional.shaders.preprocessor.predefined_macros.line_2_* I don't see the spec requiring this, but it seems to be better, as the clang preprocessor for example has this behavior. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-06 11:27:04 +00:00
Jason Ekstrand	bc612536eb	anv: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D There is an object-level preemption workaround which requires this. However, even without object-level preemption, we seem to have issues with geometry flickering when 3D and compute are combined in the same batch and this appears to fix it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109630 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111267 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-06 05:46:28 +00:00
Ian Romanick	5544b2cbbd	nir/algebraic: Use value range analysis to eliminate useless unary ops Sandy Bridge is the big winner because it lies at something of a crossroads. It supports a fairly high OpenGL version, and it still has the old style math box. The high OpenGL version means a lot more shaders can run on it. The old style math box means extra moves are necessary to resolve source modifiers on operands to complex math instructions like COS, SQRT, and RCP. v2: Remove a couple patterns that are now redundant. All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16282006 -> 16278207 (-0.02%) instructions in affected programs: 174555 -> 170756 (-2.18%) helped: 661 HURT: 0 helped stats (abs) min: 1 max: 36 x̄: 5.75 x̃: 3 helped stats (rel) min: 0.06% max: 23.68% x̄: 2.81% x̃: 1.94% 95% mean confidence interval for instructions value: -6.16 -5.34 95% mean confidence interval for instructions %-change: -3.02% -2.60% Instructions are helped. total cycles in shared programs: 367168597 -> 367134284 (<.01%) cycles in affected programs: 1105276 -> 1070963 (-3.10%) helped: 460 HURT: 150 helped stats (abs) min: 1 max: 568 x̄: 96.60 x̃: 82 helped stats (rel) min: 0.02% max: 32.50% x̄: 7.99% x̃: 4.27% HURT stats (abs) min: 1 max: 901 x̄: 67.49 x̃: 39 HURT stats (rel) min: 0.07% max: 20.00% x̄: 4.90% x̃: 4.22% 95% mean confidence interval for cycles value: -65.68 -46.82 95% mean confidence interval for cycles %-change: -5.59% -4.05% Cycles are helped. Sandy Bridge total instructions in shared programs: 10824272 -> 10802557 (-0.20%) instructions in affected programs: 1237988 -> 1216273 (-1.75%) helped: 8199 HURT: 0 helped stats (abs) min: 1 max: 41 x̄: 2.65 x̃: 2 helped stats (rel) min: 0.12% max: 20.00% x̄: 2.04% x̃: 1.73% 95% mean confidence interval for instructions value: -2.70 -2.59 95% mean confidence interval for instructions %-change: -2.07% -2.00% Instructions are helped. 
total cycles in shared programs: 154009894 -> 153843598 (-0.11%) cycles in affected programs: 10650486 -> 10484190 (-1.56%) helped: 4973 HURT: 1533 helped stats (abs) min: 1 max: 3904 x̄: 40.20 x̃: 20 helped stats (rel) min: 0.02% max: 41.72% x̄: 2.63% x̃: 1.67% HURT stats (abs) min: 1 max: 453 x̄: 21.94 x̃: 8 HURT stats (rel) min: 0.02% max: 41.91% x̄: 1.54% x̃: 0.58% 95% mean confidence interval for cycles value: -28.02 -23.10 95% mean confidence interval for cycles %-change: -1.74% -1.56% Cycles are helped. LOST: 0 GAINED: 2 GM45 and Iron Lake had similar results. (Iron Lake shown) total instructions in shared programs: 8135196 -> 8134888 (<.01%) instructions in affected programs: 31920 -> 31612 (-0.96%) helped: 169 HURT: 0 helped stats (abs) min: 1 max: 12 x̄: 1.82 x̃: 2 helped stats (rel) min: 0.43% max: 3.23% x̄: 1.23% x̃: 1.16% 95% mean confidence interval for instructions value: -2.01 -1.64 95% mean confidence interval for instructions %-change: -1.32% -1.15% Instructions are helped. total cycles in shared programs: 188575724 -> 188574092 (<.01%) cycles in affected programs: 406840 -> 405208 (-0.40%) helped: 169 HURT: 0 helped stats (abs) min: 4 max: 72 x̄: 9.66 x̃: 10 helped stats (rel) min: 0.07% max: 2.16% x̄: 0.57% x̃: 0.47% 95% mean confidence interval for cycles value: -10.72 -8.59 95% mean confidence interval for cycles %-change: -0.63% -0.50% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:14 -07:00
Ian Romanick	8d14380971	nir/algebraic: Use value range analysis to convert fmin to fsat All Gen8+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16297320 -> 16282006 (-0.09%) instructions in affected programs: 2434498 -> 2419184 (-0.63%) helped: 8091 HURT: 1 helped stats (abs) min: 1 max: 51 x̄: 1.89 x̃: 2 helped stats (rel) min: 0.04% max: 14.29% x̄: 0.98% x̃: 0.95% HURT stats (abs) min: 7 max: 7 x̄: 7.00 x̃: 7 HURT stats (rel) min: 0.28% max: 0.28% x̄: 0.28% x̃: 0.28% 95% mean confidence interval for instructions value: -1.94 -1.85 95% mean confidence interval for instructions %-change: -0.99% -0.96% Instructions are helped. total cycles in shared programs: 367221624 -> 367168597 (-0.01%) cycles in affected programs: 126409635 -> 126356608 (-0.04%) helped: 5612 HURT: 1023 helped stats (abs) min: 1 max: 2332 x̄: 31.11 x̃: 16 helped stats (rel) min: <.01% max: 30.31% x̄: 1.69% x̃: 1.42% HURT stats (abs) min: 1 max: 2372 x̄: 118.84 x̃: 16 HURT stats (rel) min: <.01% max: 46.98% x̄: 1.46% x̃: 0.35% 95% mean confidence interval for cycles value: -11.52 -4.46 95% mean confidence interval for cycles %-change: -1.26% -1.14% Cycles are helped. total spills in shared programs: 8868 -> 8870 (0.02%) spills in affected programs: 28 -> 30 (7.14%) helped: 0 HURT: 1 total fills in shared programs: 21903 -> 21904 (<.01%) fills in affected programs: 42 -> 43 (2.38%) helped: 0 HURT: 1 Haswell total instructions in shared programs: 13353925 -> 13338728 (-0.11%) instructions in affected programs: 2265850 -> 2250653 (-0.67%) helped: 8127 HURT: 5 helped stats (abs) min: 1 max: 51 x̄: 1.88 x̃: 2 helped stats (rel) min: 0.04% max: 20.00% x̄: 1.13% x̃: 1.07% HURT stats (abs) min: 5 max: 16 x̄: 9.00 x̃: 6 HURT stats (rel) min: 0.19% max: 0.52% x̄: 0.35% x̃: 0.28% 95% mean confidence interval for instructions value: -1.91 -1.83 95% mean confidence interval for instructions %-change: -1.15% -1.11% Instructions are helped. 
total cycles in shared programs: 375535444 -> 375536343 (<.01%) cycles in affected programs: 131206582 -> 131207481 (<.01%) helped: 5590 HURT: 1055 helped stats (abs) min: 1 max: 2844 x̄: 34.15 x̃: 16 helped stats (rel) min: <.01% max: 21.57% x̄: 2.08% x̃: 1.60% HURT stats (abs) min: 1 max: 2487 x̄: 181.78 x̃: 21 HURT stats (rel) min: <.01% max: 40.66% x̄: 1.96% x̃: 0.37% 95% mean confidence interval for cycles value: -4.74 5.01 95% mean confidence interval for cycles %-change: -1.51% -1.37% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 23401 -> 23407 (0.03%) spills in affected programs: 248 -> 254 (2.42%) helped: 2 HURT: 5 total fills in shared programs: 34850 -> 34845 (-0.01%) fills in affected programs: 383 -> 378 (-1.31%) helped: 2 HURT: 5 Ivy Bridge total instructions in shared programs: 11975423 -> 11968117 (-0.06%) instructions in affected programs: 845703 -> 838397 (-0.86%) helped: 4071 HURT: 0 helped stats (abs) min: 1 max: 51 x̄: 1.79 x̃: 1 helped stats (rel) min: 0.08% max: 8.21% x̄: 1.04% x̃: 0.93% 95% mean confidence interval for instructions value: -1.87 -1.71 95% mean confidence interval for instructions %-change: -1.06% -1.02% Instructions are helped. total cycles in shared programs: 179674318 -> 179635552 (-0.02%) cycles in affected programs: 5100065 -> 5061299 (-0.76%) helped: 2650 HURT: 611 helped stats (abs) min: 1 max: 900 x̄: 21.85 x̃: 16 helped stats (rel) min: <.01% max: 21.55% x̄: 2.39% x̃: 1.40% HURT stats (abs) min: 1 max: 1841 x̄: 31.33 x̃: 6 HURT stats (rel) min: <.01% max: 58.71% x̄: 1.64% x̃: 0.37% 95% mean confidence interval for cycles value: -14.14 -9.64 95% mean confidence interval for cycles %-change: -1.75% -1.52% Cycles are helped. 
LOST: 3 GAINED: 7 Sandy Bridge total instructions in shared programs: 10828844 -> 10824272 (-0.04%) instructions in affected programs: 525678 -> 521106 (-0.87%) helped: 2386 HURT: 0 helped stats (abs) min: 1 max: 51 x̄: 1.92 x̃: 2 helped stats (rel) min: 0.11% max: 7.96% x̄: 1.05% x̃: 0.94% 95% mean confidence interval for instructions value: -2.04 -1.80 95% mean confidence interval for instructions %-change: -1.08% -1.03% Instructions are helped. total cycles in shared programs: 154024591 -> 154009894 (<.01%) cycles in affected programs: 4005766 -> 3991069 (-0.37%) helped: 1245 HURT: 506 helped stats (abs) min: 1 max: 585 x̄: 21.07 x̃: 16 helped stats (rel) min: 0.02% max: 11.57% x̄: 1.98% x̃: 0.83% HURT stats (abs) min: 1 max: 639 x̄: 22.81 x̃: 6 HURT stats (rel) min: 0.01% max: 26.21% x̄: 1.07% x̃: 0.26% 95% mean confidence interval for cycles value: -10.57 -6.21 95% mean confidence interval for cycles %-change: -1.23% -0.97% Cycles are helped. GM45 and Iron Lake had similar results. (Iron Lake shown) total instructions in shared programs: 8137248 -> 8135196 (-0.03%) instructions in affected programs: 148322 -> 146270 (-1.38%) helped: 992 HURT: 0 helped stats (abs) min: 1 max: 32 x̄: 2.07 x̃: 2 helped stats (rel) min: 0.41% max: 9.73% x̄: 1.74% x̃: 1.51% 95% mean confidence interval for instructions value: -2.16 -1.98 95% mean confidence interval for instructions %-change: -1.80% -1.67% Instructions are helped. total cycles in shared programs: 188583424 -> 188575724 (<.01%) cycles in affected programs: 4409620 -> 4401920 (-0.17%) helped: 956 HURT: 6 helped stats (abs) min: 2 max: 168 x̄: 8.09 x̃: 8 helped stats (rel) min: 0.04% max: 6.76% x̄: 0.27% x̃: 0.18% HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.10% max: 0.10% x̄: 0.10% x̃: 0.10% 95% mean confidence interval for cycles value: -8.41 -7.60 95% mean confidence interval for cycles %-change: -0.29% -0.25% Cycles are helped. 
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:14 -07:00
Ian Romanick	b77070e293	nir/algebraic: Use value range analysis to eliminate tautological compares It's only one application on one platform (Haswell) that's affected, but spills and fills increase quite dramatically. :( All Gen8+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16320850 -> 16297320 (-0.14%) instructions in affected programs: 448012 -> 424482 (-5.25%) helped: 1938 HURT: 0 helped stats (abs) min: 2 max: 264 x̄: 12.14 x̃: 10 helped stats (rel) min: 0.35% max: 43.75% x̄: 5.85% x̃: 5.38% 95% mean confidence interval for instructions value: -12.80 -11.48 95% mean confidence interval for instructions %-change: -5.99% -5.72% Instructions are helped. total cycles in shared programs: 367496943 -> 367221624 (-0.07%) cycles in affected programs: 8557232 -> 8281913 (-3.22%) helped: 1907 HURT: 26 helped stats (abs) min: 4 max: 12802 x̄: 147.21 x̃: 48 helped stats (rel) min: 0.03% max: 75.85% x̄: 5.55% x̃: 3.94% HURT stats (abs) min: 4 max: 1870 x̄: 208.23 x̃: 20 HURT stats (rel) min: 0.16% max: 32.11% x̄: 8.31% x̃: 0.79% 95% mean confidence interval for cycles value: -165.38 -119.48 95% mean confidence interval for cycles %-change: -5.68% -5.04% Cycles are helped. LOST: 1 GAINED: 0 Haswell total instructions in shared programs: 13374211 -> 13353925 (-0.15%) instructions in affected programs: 349868 -> 329582 (-5.80%) helped: 1669 HURT: 1 helped stats (abs) min: 1 max: 264 x̄: 12.57 x̃: 10 helped stats (rel) min: 0.12% max: 46.81% x̄: 6.86% x̃: 6.49% HURT stats (abs) min: 700 max: 700 x̄: 700.00 x̃: 700 HURT stats (rel) min: 64.34% max: 64.34% x̄: 64.34% x̃: 64.34% 95% mean confidence interval for instructions value: -13.25 -11.04 95% mean confidence interval for instructions %-change: -7.01% -6.63% Instructions are helped. 
total cycles in shared programs: 375763544 -> 375535444 (-0.06%) cycles in affected programs: 6932686 -> 6704586 (-3.29%) helped: 1622 HURT: 48 helped stats (abs) min: 2 max: 12229 x̄: 148.31 x̃: 68 helped stats (rel) min: 0.06% max: 74.03% x̄: 5.94% x̃: 4.12% HURT stats (abs) min: 3 max: 7451 x̄: 259.44 x̃: 41 HURT stats (rel) min: 0.05% max: 54.99% x̄: 8.52% x̃: 2.88% 95% mean confidence interval for cycles value: -159.86 -113.31 95% mean confidence interval for cycles %-change: -5.86% -5.18% Cycles are helped. total spills in shared programs: 23258 -> 23401 (0.61%) spills in affected programs: 54 -> 197 (264.81%) helped: 4 HURT: 2 total fills in shared programs: 34775 -> 34850 (0.22%) fills in affected programs: 52 -> 127 (144.23%) helped: 4 HURT: 1 LOST: 5 GAINED: 0 Ivy Bridge total instructions in shared programs: 11996051 -> 11977964 (-0.15%) instructions in affected programs: 346679 -> 328592 (-5.22%) helped: 1508 HURT: 0 helped stats (abs) min: 2 max: 198 x̄: 11.99 x̃: 10 helped stats (rel) min: 0.26% max: 19.83% x̄: 5.73% x̃: 5.43% 95% mean confidence interval for instructions value: -12.65 -11.34 95% mean confidence interval for instructions %-change: -5.86% -5.60% Instructions are helped. total cycles in shared programs: 179891389 -> 179691339 (-0.11%) cycles in affected programs: 7869479 -> 7669429 (-2.54%) helped: 1485 HURT: 23 helped stats (abs) min: 1 max: 12615 x̄: 136.16 x̃: 54 helped stats (rel) min: 0.02% max: 71.84% x̄: 4.69% x̃: 3.49% HURT stats (abs) min: 1 max: 403 x̄: 93.48 x̃: 6 HURT stats (rel) min: 0.04% max: 34.01% x̄: 8.68% x̃: 0.81% 95% mean confidence interval for cycles value: -154.59 -110.73 95% mean confidence interval for cycles %-change: -4.79% -4.19% Cycles are helped. 
Sandy Bridge total instructions in shared programs: 10829247 -> 10828844 (<.01%) instructions in affected programs: 21258 -> 20855 (-1.90%) helped: 88 HURT: 0 helped stats (abs) min: 2 max: 17 x̄: 4.58 x̃: 5 helped stats (rel) min: 0.52% max: 3.92% x̄: 2.05% x̃: 2.21% 95% mean confidence interval for instructions value: -5.03 -4.13 95% mean confidence interval for instructions %-change: -2.21% -1.89% Instructions are helped. total cycles in shared programs: 154035437 -> 154024591 (<.01%) cycles in affected programs: 430176 -> 419330 (-2.52%) helped: 78 HURT: 10 helped stats (abs) min: 2 max: 4649 x̄: 143.06 x̃: 32 helped stats (rel) min: 0.05% max: 6.02% x̄: 2.03% x̃: 1.07% HURT stats (abs) min: 3 max: 265 x̄: 31.30 x̃: 6 HURT stats (rel) min: 0.10% max: 8.67% x̄: 1.03% x̃: 0.21% 95% mean confidence interval for cycles value: -232.53 -13.97 95% mean confidence interval for cycles %-change: -2.13% -1.23% Cycles are helped. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8137402 -> 8137248 (<.01%) instructions in affected programs: 2280 -> 2126 (-6.75%) helped: 10 HURT: 0 helped stats (abs) min: 12 max: 19 x̄: 15.40 x̃: 15 helped stats (rel) min: 3.90% max: 11.73% x̄: 7.19% x̃: 6.95% 95% mean confidence interval for instructions value: -17.69 -13.11 95% mean confidence interval for instructions %-change: -8.99% -5.39% Instructions are helped. total cycles in shared programs: 188538716 -> 188583424 (0.02%) cycles in affected programs: 69326 -> 114034 (64.49%) helped: 0 HURT: 10 HURT stats (abs) min: 2068 max: 7686 x̄: 4470.80 x̃: 4870 HURT stats (rel) min: 27.20% max: 173.66% x̄: 69.55% x̃: 59.41% 95% mean confidence interval for cycles value: 2830.86 6110.74 95% mean confidence interval for cycles %-change: 39.18% 99.91% Cycles are HURT. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Ian Romanick	96fcb3f95b	nir/algebraic: Use value range analysis to eliminate tautological compares not used by if-statements This just eliminates tautological / contradictory compares that are used for bcsel and other non-if-statement cases. If-statements are not affected because removing flow control can cause the i965 instruction scheduler to create some very long live ranges resulting in unnecessary spilling. This causes some shaders to fall off a performance cliff. Since many small if-statements are already flattened to bcsel, this optimization covers more than 68% of the possible cases (2417 shaders helped for instructions on Skylake vs. 3554). v2: Reorder and add whitespace to make the relationship between the patterns more obvious. Suggested by Caio. All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16333474 -> 16322028 (-0.07%) instructions in affected programs: 438559 -> 427113 (-2.61%) helped: 1765 HURT: 0 helped stats (abs) min: 1 max: 275 x̄: 6.48 x̃: 4 helped stats (rel) min: 0.20% max: 36.36% x̄: 4.07% x̃: 1.82% 95% mean confidence interval for instructions value: -6.87 -6.10 95% mean confidence interval for instructions %-change: -4.30% -3.84% Instructions are helped. total cycles in shared programs: 367608554 -> 367511103 (-0.03%) cycles in affected programs: 8368829 -> 8271378 (-1.16%) helped: 1541 HURT: 129 helped stats (abs) min: 1 max: 4468 x̄: 66.78 x̃: 39 helped stats (rel) min: 0.01% max: 45.69% x̄: 4.10% x̃: 2.17% HURT stats (abs) min: 1 max: 973 x̄: 42.25 x̃: 10 HURT stats (rel) min: 0.02% max: 64.39% x̄: 2.15% x̃: 0.60% 95% mean confidence interval for cycles value: -64.90 -51.81 95% mean confidence interval for cycles %-change: -3.89% -3.36% Cycles are helped. 
total spills in shared programs: 8867 -> 8868 (0.01%) spills in affected programs: 18 -> 19 (5.56%) helped: 0 HURT: 1 total fills in shared programs: 21900 -> 21903 (0.01%) fills in affected programs: 78 -> 81 (3.85%) helped: 0 HURT: 1 All Gen6 and earlier platforms had similar results. (Sandy Bridge shown) total instructions in shared programs: 10829877 -> 10829247 (<.01%) instructions in affected programs: 30240 -> 29610 (-2.08%) helped: 177 HURT: 0 helped stats (abs) min: 1 max: 15 x̄: 3.56 x̃: 3 helped stats (rel) min: 0.37% max: 17.39% x̄: 2.68% x̃: 1.94% 95% mean confidence interval for instructions value: -3.93 -3.18 95% mean confidence interval for instructions %-change: -3.04% -2.32% Instructions are helped. total cycles in shared programs: 154036580 -> 154035437 (<.01%) cycles in affected programs: 352402 -> 351259 (-0.32%) helped: 96 HURT: 28 helped stats (abs) min: 1 max: 128 x̄: 14.73 x̃: 6 helped stats (rel) min: 0.03% max: 24.00% x̄: 1.51% x̃: 0.46% HURT stats (abs) min: 1 max: 117 x̄: 9.68 x̃: 4 HURT stats (rel) min: 0.03% max: 2.24% x̄: 0.43% x̃: 0.23% 95% mean confidence interval for cycles value: -13.40 -5.03 95% mean confidence interval for cycles %-change: -1.62% -0.53% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Ian Romanick	fa116ce357	nir/range-analysis: Range tracking for ffma and flrp A similar technique could be used for fmin3, fmax3, and fmid3. This could be squashed with the previous commit. I kept it separate to ease review. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Ian Romanick	586602c5d9	nir/range-analysis: Range tracking for bcsel This could be squashed with the previous commit. I kept it separate to ease review. v2: Add some missing cases. Use nir_src_is_const helper. Both suggested by Caio. Use a table for mapping source ranges to a result range. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Ian Romanick	3009cbed50	nir/range-analysis: Tighten the range of fsat based on the range of its source This could be squashed with the previous commit. I kept it separate to ease review. v2: Use a switch statement and add more comments. Both suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Ian Romanick	405de7ccb6	nir/range-analysis: Rudimentary value range analysis pass Most integer operations are omitted because dealing with integer overflow is hard. There are a few things that could be smarter if there was a small amount more tracking of ranges of integer types (i.e., operands are Boolean, operand values fit in 16 bits, etc.). The changes to nir_search_helpers.h are included in this patch to simplify reordering the changes to nir_opt_algebraic.py. v2: Memoize range analysis results. Without this, some shaders appear to get stuck in infinite loops. v3: Rebase on many months of Mesa changes, including 1-bit Boolean changes. v4: Rebase on "nir: Drop imov/fmov in favor of one mov instruction". v5: Use nir_alu_srcs_equal for detecting (a*a). Previously just the SSA value was compared, and this incorrectly matched (a.x*a.y). v6: Many code improvements including (but not limited to) better names, more comments, and better use of helper functions. All suggested by Caio. Rework the handling of several opcodes to use a table for mapping source ranges to a result range. This change fixed a bug that caused fmax(gt_zero, ge_zero) to be incorrectly recognized as ge_zero. Slightly tighten the range of fmul by recognizing that x*x is gt_zero if x is gt_zero. Add similar handling for -x*x. v7: Use _______ in the tables as an alias for unknown. Suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Ian Romanick	d24edb4b8c	nir/algebraic: Simplify some comparisons like a+constant < constant v2: Remove unsafe integer versions of the optimizations. This change had no effect on shader-db results. Suggested by Caio. All Gen6+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16333713 -> 16332631 (<.01%) instructions in affected programs: 258112 -> 257030 (-0.42%) helped: 1275 HURT: 407 helped stats (abs) min: 1 max: 7 x̄: 1.17 x̃: 1 helped stats (rel) min: 0.20% max: 8.33% x̄: 1.33% x̃: 0.86% HURT stats (abs) min: 1 max: 2 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.11% max: 2.94% x̄: 0.98% x̃: 0.98% 95% mean confidence interval for instructions value: -0.70 -0.59 95% mean confidence interval for instructions %-change: -0.84% -0.70% Instructions are helped. total cycles in shared programs: 367596791 -> 367601268 (<.01%) cycles in affected programs: 3420062 -> 3424539 (0.13%) helped: 1553 HURT: 783 helped stats (abs) min: 1 max: 742 x̄: 24.36 x̃: 6 helped stats (rel) min: 0.05% max: 21.12% x̄: 1.47% x̃: 0.65% HURT stats (abs) min: 1 max: 557 x̄: 54.04 x̃: 14 HURT stats (rel) min: 0.01% max: 33.66% x̄: 3.36% x̃: 1.43% 95% mean confidence interval for cycles value: -1.60 5.43 95% mean confidence interval for cycles %-change: -0.03% 0.33% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 2 Iron Lake total instructions in shared programs: 8137992 -> 8137874 (<.01%) instructions in affected programs: 17501 -> 17383 (-0.67%) helped: 104 HURT: 2 helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1 helped stats (rel) min: 0.25% max: 2.63% x̄: 0.87% x̃: 0.72% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45% 95% mean confidence interval for instructions value: -1.22 -1.00 95% mean confidence interval for instructions %-change: -0.94% -0.76% Instructions are helped. 
total cycles in shared programs: 188540038 -> 188539650 (<.01%) cycles in affected programs: 704574 -> 704186 (-0.06%) helped: 125 HURT: 84 helped stats (abs) min: 2 max: 96 x̄: 6.45 x̃: 4 helped stats (rel) min: <.01% max: 3.47% x̄: 0.42% x̃: 0.25% HURT stats (abs) min: 2 max: 58 x̄: 4.98 x̃: 4 HURT stats (rel) min: 0.01% max: 2.75% x̄: 0.36% x̃: 0.33% 95% mean confidence interval for cycles value: -3.20 -0.52 95% mean confidence interval for cycles %-change: -0.19% -0.03% Cycles are helped. GM45 total instructions in shared programs: 5008889 -> 5008830 (<.01%) instructions in affected programs: 8824 -> 8765 (-0.67%) helped: 52 HURT: 1 helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1 helped stats (rel) min: 0.25% max: 2.38% x̄: 0.86% x̃: 0.72% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45% 95% mean confidence interval for instructions value: -1.27 -0.95 95% mean confidence interval for instructions %-change: -0.96% -0.71% Instructions are helped. total cycles in shared programs: 128969426 -> 128969128 (<.01%) cycles in affected programs: 399798 -> 399500 (-0.07%) helped: 74 HURT: 30 helped stats (abs) min: 2 max: 22 x̄: 6.76 x̃: 6 helped stats (rel) min: <.01% max: 1.83% x̄: 0.46% x̃: 0.29% HURT stats (abs) min: 2 max: 58 x̄: 6.73 x̃: 6 HURT stats (rel) min: 0.06% max: 2.75% x̄: 0.42% x̃: 0.21% 95% mean confidence interval for cycles value: -4.60 -1.14 95% mean confidence interval for cycles %-change: -0.32% -0.08% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Ian Romanick	7c64cbf49d	nir/algebraic: Recognize (a < 0 || 0 < b) as min(a, -b) < 0 Similar to commit `97e6c1b9` and `f5cf74d8ba`. First apply 0 < b => -b < 0 to get (a < 0 || -b < 0), then apply some pre-existing rules to get min(a, -b) < 0. v2: Substantially update the comment explaining the use of is_used_once and the duplication of patterns. Suggested by Caio. Also, while flt and fge are not commutative, ior and iand are. Half of the original patterns were redundant, so delete them. As alternate justification for deleting them, fmin(a, -b) < 0 <=> 0 < fmax(-a, b). Proof left as an exercise for the reader. All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16333789 -> 16333713 (<.01%) instructions in affected programs: 11424 -> 11348 (-0.67%) helped: 32 HURT: 0 helped stats (abs) min: 1 max: 7 x̄: 2.38 x̃: 2 helped stats (rel) min: 0.20% max: 1.67% x̄: 0.76% x̃: 0.69% 95% mean confidence interval for instructions value: -3.03 -1.72 95% mean confidence interval for instructions %-change: -0.89% -0.62% Instructions are helped. total cycles in shared programs: 367598295 -> 367596791 (<.01%) cycles in affected programs: 141414 -> 139910 (-1.06%) helped: 23 HURT: 6 helped stats (abs) min: 3 max: 386 x̄: 72.52 x̃: 20 helped stats (rel) min: 0.15% max: 4.86% x̄: 1.01% x̃: 0.76% HURT stats (abs) min: 4 max: 88 x̄: 27.33 x̃: 12 HURT stats (rel) min: 0.22% max: 3.95% x̄: 1.08% x̃: 0.59% 95% mean confidence interval for cycles value: -93.51 -10.21 95% mean confidence interval for cycles %-change: -1.10% -0.05% Cycles are helped. 
total instructions in shared programs: 10830836 -> 10830779 (<.01%) instructions in affected programs: 6895 -> 6838 (-0.83%) helped: 12 HURT: 0 helped stats (abs) min: 1 max: 14 x̄: 4.75 x̃: 1 helped stats (rel) min: 0.14% max: 1.61% x̄: 0.65% x̃: 0.33% 95% mean confidence interval for instructions value: -8.46 -1.04 95% mean confidence interval for instructions %-change: -1.03% -0.27% Instructions are helped. total cycles in shared programs: 154028477 -> 154032740 (<.01%) cycles in affected programs: 178433 -> 182696 (2.39%) helped: 3 HURT: 9 helped stats (abs) min: 3 max: 20 x̄: 11.00 x̃: 10 helped stats (rel) min: 0.07% max: 0.20% x̄: 0.12% x̃: 0.09% HURT stats (abs) min: 27 max: 1415 x̄: 477.33 x̃: 262 HURT stats (rel) min: 0.22% max: 6.45% x̄: 2.49% x̃: 1.76% 95% mean confidence interval for cycles value: 28.68 681.82 95% mean confidence interval for cycles %-change: 0.37% 3.30% Cycles are HURT. Iron Lake total instructions in shared programs: 8137966 -> 8137992 (<.01%) instructions in affected programs: 3281 -> 3307 (0.79%) helped: 0 HURT: 6 HURT stats (abs) min: 3 max: 7 x̄: 4.33 x̃: 3 HURT stats (rel) min: 0.63% max: 1.01% x̄: 0.76% x̃: 0.64% 95% mean confidence interval for instructions value: 2.17 6.50 95% mean confidence interval for instructions %-change: 0.56% 0.96% Instructions are HURT. total cycles in shared programs: 188539386 -> 188540038 (<.01%) cycles in affected programs: 103826 -> 104478 (0.63%) helped: 0 HURT: 7 HURT stats (abs) min: 16 max: 218 x̄: 93.14 x̃: 80 HURT stats (rel) min: 0.14% max: 0.95% x̄: 0.53% x̃: 0.46% 95% mean confidence interval for cycles value: 10.26 176.02 95% mean confidence interval for cycles %-change: 0.24% 0.81% Cycles are HURT. 
GM45 total instructions in shared programs: 5008876 -> 5008889 (<.01%) instructions in affected programs: 1645 -> 1658 (0.79%) helped: 0 HURT: 3 HURT stats (abs) min: 3 max: 7 x̄: 4.33 x̃: 3 HURT stats (rel) min: 0.63% max: 1.00% x̄: 0.76% x̃: 0.63% total cycles in shared programs: 128968950 -> 128969426 (<.01%) cycles in affected programs: 64854 -> 65330 (0.73%) helped: 0 HURT: 4 HURT stats (abs) min: 18 max: 218 x̄: 119.00 x̃: 120 HURT stats (rel) min: 0.14% max: 0.95% x̄: 0.60% x̃: 0.66% 95% mean confidence interval for cycles value: -62.92 300.92 95% mean confidence interval for cycles %-change: -0.05% 1.26% Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
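The two equivalences quoted in the nir/algebraic commit message above can be spot-checked numerically. The following is only a minimal sketch in plain Python, not NIR pattern syntax; the function names and sample values are made up for illustration, and NaN inputs are deliberately left out because Python's `min`/`max` need not match the hardware `fmin`/`fmax` semantics there.

```python
import itertools

def original(a, b):
    # The pattern being recognized: (a < 0 || 0 < b)
    return a < 0 or 0 < b

def rewritten(a, b):
    # The replacement: min(a, -b) < 0
    return min(a, -b) < 0

def alternate(a, b):
    # The "exercise for the reader": 0 < max(-a, b)
    return 0 < max(-a, b)

# Illustrative finite samples, including signed zero.
samples = [-2.5, -1.0, -0.0, 0.0, 0.5, 3.0]

for a, b in itertools.product(samples, repeat=2):
    # All three forms must agree on every pair of finite inputs.
    assert original(a, b) == rewritten(a, b) == alternate(a, b)
```

A brute-force check like this is not a proof, but it is a cheap way to catch a sign error before writing the rule down.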
Ian Romanick	92b75c126b	nir/algebraic: Replace checks that a value is between (or not) [0, 1] v2: Add an extra line to one of the proofs. Suggested by Caio. All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16329772 -> 16329427 (<.01%) instructions in affected programs: 41980 -> 41635 (-0.82%) helped: 110 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 3.14 x̃: 2 helped stats (rel) min: 0.19% max: 5.56% x̄: 1.12% x̃: 0.94% 95% mean confidence interval for instructions value: -4.10 -2.17 95% mean confidence interval for instructions %-change: -1.28% -0.96% Instructions are helped. total cycles in shared programs: 367551273 -> 367549979 (<.01%) cycles in affected programs: 492462 -> 491168 (-0.26%) helped: 76 HURT: 25 helped stats (abs) min: 1 max: 400 x̄: 42.86 x̃: 12 helped stats (rel) min: 0.06% max: 10.72% x̄: 1.23% x̃: 0.75% HURT stats (abs) min: 2 max: 730 x̄: 78.52 x̃: 16 HURT stats (rel) min: 0.17% max: 6.89% x̄: 2.08% x̃: 1.23% 95% mean confidence interval for cycles value: -37.79 12.16 95% mean confidence interval for cycles %-change: -0.90% 0.07% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 2 Sandy Bridge total instructions in shared programs: 10831115 -> 10830836 (<.01%) instructions in affected programs: 37830 -> 37551 (-0.74%) helped: 70 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 3.99 x̃: 2 helped stats (rel) min: 0.33% max: 7.14% x̄: 1.21% x̃: 0.97% 95% mean confidence interval for instructions value: -5.47 -2.50 95% mean confidence interval for instructions %-change: -1.49% -0.92% Instructions are helped. 
total cycles in shared programs: 154029323 -> 154028477 (<.01%) cycles in affected programs: 247909 -> 247063 (-0.34%) helped: 52 HURT: 6 helped stats (abs) min: 2 max: 254 x̄: 25.81 x̃: 4 helped stats (rel) min: 0.07% max: 4.39% x̄: 0.81% x̃: 0.19% HURT stats (abs) min: 4 max: 403 x̄: 82.67 x̃: 8 HURT stats (rel) min: 0.18% max: 1.60% x̄: 0.71% x̃: 0.53% 95% mean confidence interval for cycles value: -34.83 5.65 95% mean confidence interval for cycles %-change: -0.98% -0.32% Inconclusive result (value mean confidence interval includes 0). Iron Lake total instructions in shared programs: 8138007 -> 8137966 (<.01%) instructions in affected programs: 4060 -> 4019 (-1.01%) helped: 31 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.32 x̃: 1 helped stats (rel) min: 0.68% max: 8.33% x̄: 1.45% x̃: 0.90% 95% mean confidence interval for instructions value: -1.50 -1.15 95% mean confidence interval for instructions %-change: -2.11% -0.79% Instructions are helped. total cycles in shared programs: 188539492 -> 188539386 (<.01%) cycles in affected programs: 26280 -> 26174 (-0.40%) helped: 25 HURT: 0 helped stats (abs) min: 2 max: 8 x̄: 4.24 x̃: 4 helped stats (rel) min: 0.08% max: 2.11% x̄: 0.54% x̃: 0.50% 95% mean confidence interval for cycles value: -5.08 -3.40 95% mean confidence interval for cycles %-change: -0.70% -0.37% Cycles are helped. GM45 total instructions in shared programs: 5008897 -> 5008876 (<.01%) instructions in affected programs: 2096 -> 2075 (-1.00%) helped: 16 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.31 x̃: 1 helped stats (rel) min: 0.68% max: 7.69% x̄: 1.41% x̃: 0.89% 95% mean confidence interval for instructions value: -1.57 -1.06 95% mean confidence interval for instructions %-change: -2.32% -0.49% Instructions are helped. 
total cycles in shared programs: 128969020 -> 128968950 (<.01%) cycles in affected programs: 18490 -> 18420 (-0.38%) helped: 15 HURT: 0 helped stats (abs) min: 2 max: 8 x̄: 4.67 x̃: 4 helped stats (rel) min: 0.08% max: 2.11% x̄: 0.51% x̃: 0.48% 95% mean confidence interval for cycles value: -6.03 -3.30 95% mean confidence interval for cycles %-change: -0.78% -0.24% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Jonathan Marek	a44b4200f3	tgsi_to_nir: fix nir_gather_ssa_types for TGSI->NIR shaders Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-08-05 22:09:47 -04:00
Jason Ekstrand	f6e7de41d7	anv: Implement VK_EXT_line_rasterization Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-06 02:05:28 +00:00
Jason Ekstrand	f03512f90b	genxml: Rename 3DSTATE_SF::Anti-Aliasing Enable This makes it consistent with the new name when it's moved to 3DSTATE_RASTER. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-06 02:05:28 +00:00
Jason Ekstrand	abf9e10488	anv: Use dirty bits for dynamic state tracking Previously, we assumed that the dirty bit was always 1 << VK_DYNAMIC_* and this assumption is about to be false. Extensions which define new VK_DYNAMIC_* enums won't be nice and tightly packed which this really requires. Instead, add functions to do the conversions and rework the bits a bit. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-06 02:05:28 +00:00
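To illustrate why `1 << VK_DYNAMIC_*` stops working once extension enums appear, here is a rough sketch. The enum values are the ones the Vulkan headers assign; the table, the bit positions and both helper names are hypothetical and are not anv's actual code.

```python
# Enum values as assigned by the Vulkan headers.
VK_DYNAMIC_STATE_VIEWPORT = 0
VK_DYNAMIC_STATE_SCISSOR = 1
VK_DYNAMIC_STATE_LINE_WIDTH = 2
VK_DYNAMIC_STATE_LINE_STIPPLE_EXT = 1000259000  # extension enums are sparse

# Hypothetical dense dirty-bit assignment (an assumption for this sketch).
DIRTY_BIT_FOR_STATE = {
    VK_DYNAMIC_STATE_VIEWPORT: 1 << 0,
    VK_DYNAMIC_STATE_SCISSOR: 1 << 1,
    VK_DYNAMIC_STATE_LINE_WIDTH: 1 << 2,
    VK_DYNAMIC_STATE_LINE_STIPPLE_EXT: 1 << 3,
}

def dirty_bit_for_dynamic_state(state):
    """Explicit conversion from an enum value to a dirty bit."""
    return DIRTY_BIT_FOR_STATE[state]

def naive_bit_fits_in_u32(state):
    """Whether the old `1 << state` scheme would fit in a 32-bit mask."""
    return state < 32
```

With a lookup of this kind the mask stays small regardless of how sparse the enum values are, whereas the naive shift would need a bit position of 1000259000 for the line stipple state.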
Jason Ekstrand	aa13f75f01	anv: Advertise the right line width range on gen9 and CHV Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-06 02:05:28 +00:00
Alyssa Rosenzweig	77295b1fdc	meson: Add panfrost to the --auto list Look ma, we're a real driver now! I was waiting until Panfrost stabilises a bit for this, but now that 19.2 is almost here, let's make us official :) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-08-05 17:42:05 -07:00
Erico Nunes	360bda0b1d	lima/ppir: enable lower_vector_cmp to lower fall_equal Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-05 23:36:46 +02:00
Erico Nunes	9e8f8dbcd1	lima: re-run nir_opt_algebraic after int lowering nir_lower_int_to_float is currently only meant to run once, and some ops must be lowered after being converted from int ops to be implementable, so re-run nir_opt_algebraic after lowering ints to floats. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-05 23:36:35 +02:00
Alyssa Rosenzweig	3db4949197	pan/midgard: Extend SSA concurrency checks to other args No glmark changes, but this seems like a good idea. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-05 11:22:49 -07:00
Alyssa Rosenzweig	2869758355	pan/midgard: Rewrite bidirectionally when eliminating moves Symptom: the sky is black in SuperTuxKart (flashbacks to SMB/NES emulation intensify). Essentially, what happened is a fixed (special) move to r0 was eliminated but scheduling did not factor this in, so can_run_concurrent_ssa returned true even when there was a logical data dependency that needed to be resolved. Fixes: `20771ede1c` ("pan/midgard: Add post-RA move elimination") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-05 10:58:39 -07:00
Danylo Piliaiev	04a9951580	intel/compiler: add ability to override shader's assembly When dumping shader's assembly with INTEL_DEBUG=vs,tcs,... sha1 of the resulting assembly is also printed, having environment variable INTEL_SHADER_ASM_READ_PATH present driver will try to load a "%sha1%.bin" file from the path and substitute current assembly with the one from the file. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-05 17:19:09 +00:00
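The override lookup described in the commit above could look roughly like the sketch below. Only the `INTEL_SHADER_ASM_READ_PATH` variable and the `<sha1>.bin` naming come from the commit message; the function name, the choice of hashing the raw assembly bytes, and the fallback behaviour are assumptions made for illustration, not the driver's implementation.

```python
import hashlib
import os

def maybe_override_assembly(assembly, read_path):
    """Return the contents of <read_path>/<sha1>.bin if it exists,
    otherwise return the original assembly unchanged.

    Hypothetical helper: the real driver is written in C and may hash
    and validate the data differently.
    """
    if not read_path:
        return assembly
    sha1 = hashlib.sha1(assembly).hexdigest()
    candidate = os.path.join(read_path, sha1 + ".bin")
    try:
        with open(candidate, "rb") as f:
            return f.read()
    except OSError:
        # No override file (or it is unreadable): keep the compiled code.
        return assembly
```

A caller would pass `os.environ.get("INTEL_SHADER_ASM_READ_PATH")` as `read_path`, so that an unset variable leaves the assembly untouched.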
Danylo Piliaiev	430823c96b	intel/tools: add binary output type to i965_asm Add '-t,--type' command line option to specify the output type which can be 'bin', 'c_literal' or 'hex'. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-08-05 17:19:09 +00:00
Alyssa Rosenzweig	1f8b653acb	panfrost: Add app blacklist In preparation for an initial 19.2 release, add a blacklist for apps known to be buggy under Panfrost to protect users. Panfrost is NOT a conformant implementation at this time. Distros: please do not revert this patch. If blacklisted apps are run using Panfrost, dragons will bite you. Thanks :) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-08-05 16:04:47 +00:00
Kenneth Graunke	64b73b770b	iris: Fix bad external BO hash table and zombie list interactions A while ago, we started deferring GEM object closure and VMA release until buffers were idle. This had some unforeseen interactions with external buffers. We keep imported buffers in hash tables, so if we have repeated imports of the same GEM object, we map those to the same iris_bo structure. This is critical for several reasons. Unfortunately, we broke this assumption. When freeing a non-idle external buffer, we would drop it from the hash tables, then move it to the zombie list. If someone reimported the same GEM object, we would not find it in the hash tables, and go ahead and make a second iris_bo for that GEM object. But the old iris_bo would still be in the zombie list, and so we would eventually call GEM_CLOSE on it - closing a BO that should have still been live. To work around this, we defer removing a BO from the hash tables until it's actually fully closed. This has the strange effect that an external BO may be on the zombie list, and yet be resurrected before it can be properly cleaned up. In this case, we remove it from the list so it won't be freed. Fixes severe instability in Weston, which was hitting EINVALs and ENOENTs from execbuf2, due to batches referring to a GEM object that had been closed, or at least had its VMA torched. Fixes: `457a55716e` ("iris: Defer closing and freeing VMA until buffers are idle.")	2019-08-05 08:53:41 -07:00
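The fix described in the iris commit above can be modelled in a few lines. This is a toy sketch under simplifying assumptions: the class and method names are invented, `busy` stands in for "the GPU may still be using the buffer", and `closed` stands in for GEM_CLOSE plus the VMA release. It is not iris code.

```python
class Bo:
    def __init__(self, handle):
        self.handle = handle
        self.refcount = 1
        self.busy = False    # stand-in for "not yet idle on the GPU"
        self.closed = False  # stand-in for GEM_CLOSE having happened

class BufMgr:
    def __init__(self):
        self.handle_table = {}  # GEM handle -> Bo, for imported buffers
        self.zombies = []       # unreferenced but still busy buffers

    def import_bo(self, handle):
        bo = self.handle_table.get(handle)
        if bo is not None:
            # Resurrect a zombie so it will not be closed behind our back.
            if bo in self.zombies:
                self.zombies.remove(bo)
            bo.refcount += 1
            return bo
        bo = Bo(handle)
        self.handle_table[handle] = bo
        return bo

    def unreference(self, bo):
        bo.refcount -= 1
        if bo.refcount > 0:
            return
        if bo.busy:
            # Defer the close; the hash table entry is kept on purpose.
            self.zombies.append(bo)
        else:
            self._close(bo)

    def _close(self, bo):
        # Only now does the buffer leave the hash table.
        del self.handle_table[bo.handle]
        bo.closed = True

    def reap_zombies(self):
        still_busy = []
        for bo in self.zombies:
            if bo.busy:
                still_busy.append(bo)
            else:
                self._close(bo)
        self.zombies = still_busy
```

Because the table entry outlives the last reference, a re-import of the same handle finds the existing object rather than creating a second one that a later deferred close would invalidate.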
Kenneth Graunke	48e5a99d86	iris/bufmgr: Move iris_bo_reference into hash_find_bo, rename it Everybody importing an external buffer was looking it up in the hash table, then referencing it. We can just do that in the helper instead, which also gives us a convenient spot to stash extra code shortly.	2019-08-05 08:53:07 -07:00
Ahmad Fatoum	4f75ea57c2	gallium: add stm DRM entry point The STM32MP157 features a Vivante GC400 GPU supported by etnaviv. Add a DRM entry point for the STM display controller, so mesa can be used with it. Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-05 14:53:31 +00:00
Eric Engestrom	c251e2e662	gitlab-ci: don't remove a package we don't install anymore Fixes: `85dace1c0b` ("gitlab-ci: remove software-properties-common") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-08-05 15:43:26 +01:00
Andrii Simiklit	dc471f2ef8	etnaviv: fix a null pointer dereference This issue was found by cppcheck Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-08-05 15:31:43 +02:00
Connor Abbott	74470baebb	ac/nir: Lower large indirect variables to scratch results from radeonsi NIR: Totals from affected shaders: SGPRS: 704 -> 464 (-34.09 %) VGPRS: 2056 -> 672 (-67.32 %) Spilled SGPRs: 24 -> 0 (-100.00 %) Spilled VGPRs: 28406 -> 0 (-100.00 %) Private memory VGPRs: 0 -> 3182 (0.00 %) Scratch size: 1064 -> 3228 (203.38 %) dwords per thread Code Size: 935260 -> 40180 (-95.70 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 28 -> 70 (150.00 %) Wait states: 0 -> 0 (0.00 %) results from radv: Totals from affected shaders: SGPRS: 80 -> 48 (-40.00 %) VGPRS: 204 -> 108 (-47.06 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 256 (0.00 %) dwords per thread Code Size: 15792 -> 9504 (-39.82 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 1 -> 2 (100.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-05 11:45:18 +02:00
Timothy Arceri	3c9144f9e5	drirc: Add discard workaround for Divinity: Original Sin EE This adds an additional work around for the game to fix the blocky shadows as reported in bug 105282 Acked-by: Eric Engestrom <eric.engestrom@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105282	2019-08-05 15:35:00 +10:00
Erico Nunes	486b33558a	lima/ppir: simplify load uni/temp op lowering and scheduling The load uniform/temporary operations output only to a pipeline register, which must be consumed by another op in the same instruction later. The current implementation delays the decision of who will consume this result to until the scheduling step. If the consumer node is not able to use the pipeline register, a mov node may have to be created, during the scheduler step. As part of the ppir scheduler simplification, and now that the ppir scheduler supports pipeline register dependencies, this can be simplified by always creating a single mov node outputting to a normal register that can be used directly by all consumers. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-08-04 13:38:19 +02:00
Erico Nunes	fd29c4d6c5	lima/ppir: simplify select op lowering and scheduling The select operation relies on the select condition coming from the result of the alu scalar mult slot, in the same instruction. The current implementation creates a mov node to be the predecessor of select, and then relies on an exception during scheduling to ensure that both ops are inserted in the same instruction. Now that the ppir scheduler supports pipeline register dependencies, this can be simplified by making the mov explicitly output to the fmul pipeline register, and the scheduler can place it without an exception. Since the select condition can only be placed in the scalar mult slot, differently than a regular mov, define a separate op for it. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-08-04 13:38:18 +02:00
Erico Nunes	eb82637c2f	lima/ppir: support pipeline registers in scheduler The ppir scheduler grew to be rather complicated and containing many exceptions as it also has to take care of inserting additional nodes when it is mandatory for nodes to be in the same instruction. As such, the lima lowering and scheduling process can be difficult to understand and maintain. The ppir lowering step created nodes hoping that the scheduler would notice the exception and do the right thing. This proposal adds a simple refactor to the scheduler so that it places nodes with pipeline registers in the same instruction. With the scheduler handling this in a general way, it is possible to create same-instruction dependencies by using pipeline registers during the lowering stage. This is simpler to maintain because now we can make these dependencies explicit in a single place (lowering), and we can drop exceptions from scheduling. Reducing the complexity of the scheduler is also useful as preparatory work to support control flow in ppir. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-08-04 13:38:11 +02:00
Eric Engestrom	a1da8eccbe	docs: fix "empty array" meson syntax On recent versions of Meson (0.47+) these are synonymous, but we still support older versions than that, so let's use the correct syntax to avoid confusing users of old Meson versions. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-04 12:21:19 +01:00
Eric Engestrom	1361ab3c82	egl: drop unnecessary function deref Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-08-04 11:26:20 +01:00
Eric Engestrom	e7e3fd5c03	glx: drop unnecessary pointer deref for function calls Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-08-04 11:26:20 +01:00
Eric Engestrom	9668d7f539	introduce c11_compat.h to provide C11 things in C99 Right now, all it does is provide the new standard `static_assert()` name. Fixes: `fbf7c38da3` ("egl/wayland: use bitset.h for `formats` bit set") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Bhushan Shah <bshah@kde.org>	2019-08-04 11:14:25 +01:00
Eric Engestrom	64ffc289be	travis: add MacOS Scons build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-08-04 11:11:32 +01:00
Eric Engestrom	8f1cdac793	symbols-check: fix `nm` invocation on MacOS According to Mac OSX's man page [1], this is how we should get the list of exported symbols: nm -g -P foo.dylib -g to only show the exported symbols -P to show it in a "portable" format, ie. readable by a script Since this is supported by GNU nm as well, let's use that everywhere, although some care needs to be taken as there are some differences in the output. [1] https://www.unix.com/man-page/osx/1/nm/ Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-08-04 11:06:27 +01:00
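As a rough illustration of consuming `nm -g -P` output from a script, the sketch below parses the portable format, in which each line starts with the symbol name followed by a one-letter type. The function name and the sample text are invented for this example, and the real symbols-check script may filter differently, especially given the output differences the commit mentions.

```python
def exported_symbols(nm_output):
    """Extract defined symbol names from `nm -g -P` style output.

    Assumes whitespace-separated fields with the name first and the
    type letter second; undefined symbols (type 'U') are skipped.
    """
    symbols = []
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or unexpected line
        name, sym_type = fields[0], fields[1]
        if sym_type == "U":
            continue  # imported, not exported
        symbols.append(name)
    return symbols

# Made-up sample in the portable layout: name, type, value, size.
sample = """\
eglGetDisplay T 0000000000001130 0000000000000042
malloc U
_edata D 0000000000004010
"""
```

In practice the text would come from running `nm -g -P` on the library, for instance through Python's `subprocess.run` with `capture_output=True, text=True`.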
Eric Engestrom	59f8809f3c	symbols-check: discard platform symbols early (as the comment there already claimed) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-08-04 11:06:27 +01:00
Eric Engestrom	81b3d141b3	symbols-check: skip test if we can't get the symbols list Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-08-04 11:06:27 +01:00
Vasily Khoruzhick	c780af7771	lima/ppir: move alu vec to scalar lowering into NIR Utgard PP is vec4, but some operations are scalar, utilize NIR vec to scalar lowering pass and indicate operations that we want to lower. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-08-04 02:17:12 +00:00
Jason Ekstrand	aebca3961b	iris: Fix handling of SIMD32 fragment shaders The brw_wm_prog_data_dispatch_grf_start_reg and _prog_offset helpers read the _NPixelDispatchEnable fields from 3DSTATE_PS to figure out which bits to pull out of the prog data and stuff where. Therefore, they need to be called with the final set of _NPixelDispatchEnable bits after we've done the workaround for SIMD32 and 16x MSAA. Otherwise, if you end up with a somewhat odd combination of enables, the GRF start reg and KSP data ends up in the wrong slots. In particular, running SIMD32-only is broken but several other combinations are as well. Fixes: `5445c176e2` "iris: Disable SIMD32 when using a 16x MSAA..." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-03 22:24:40 +00:00
Bas Nieuwenhuizen	9f37c9903b	mesa: Rename GLX_USE_TLS to USE_ELF_TLS. These days it is not GLX only and it does not work with all TLS implementations. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-03 20:18:17 +02:00
Bas Nieuwenhuizen	d7ca1efc6c	meson: Do not use GLX_USE_TLS on Android. The asm code expects a specific kind of implementation, but Android uses something different (emutls). Turns out mesa has a fallback with pthread_getspecific, with an optimization if only a single thread is used. emutls also uses getspecific, so let's just use the optimized mesa implementation. Fixes: `20294dceeb` "mesa: Enable asm unconditionally, now that gen_matypes is gone." Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-03 18:40:04 +02:00
Christian Gmeiner	2dd598c129	etnaviv: s/boolean/bool Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com>	2019-08-03 12:32:28 +02:00
Andreas Baierl	5254e53deb	lima/ppir: Add gl_FrontFace handling Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-08-03 08:04:12 +00:00
Jason Ekstrand	b62b0cfa71	intel/nir: Add 1-bit opcodes to brw_cmod_for_nir_comparison_op Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-03 00:35:48 +00:00
Jason Ekstrand	c02c3ff612	intel/nir: Add a common nir comparison -> cmod helper We already had one in the vec4 code, we just had to move it. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-03 00:35:48 +00:00
Eric Engestrom	2fd30e3722	util: fix pointer type on NetBSD NetBSD expects a `void *` argument [1] as the printf-style arguments to the formatting string, so we need to cast the `const` away. [1] https://netbsd.gw.com/cgi-bin/man-cgi?pthread_setname_np++NetBSD-current Suggested-by: Kamil Rytarowski <n54@gmx.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-03 00:20:21 +00:00
Eric Engestrom	b558fa4dfe	meson: remove unused field Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Eric Anholt <eric@anholt.net> Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-08-03 00:08:37 +00:00
Eric Engestrom	9a07606b84	meson: replace last uses of libxmlconfig with idep_xmlconfig Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Eric Anholt <eric@anholt.net> Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-08-03 00:08:37 +00:00
Eric Engestrom	178811d8f6	meson: drop unused dep_{thread,dl} Unused as of last commit. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Eric Anholt <eric@anholt.net> Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-08-03 00:08:37 +00:00
Eric Engestrom	d2d85b950d	meson: replace libmesa_util with idep_mesautil This automates the include_directories and dependencies tracking so that all users of libmesa_util don't need to add them manually. Next commit will remove the ones that were only added for that reason. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Eric Anholt <eric@anholt.net> Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-08-03 00:08:37 +00:00
Alyssa Rosenzweig	8ddb38209d	pan/midgard: Print texture outmod I have no idea who thought this was a good idea. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 16:54:53 -07:00
Alyssa Rosenzweig	ad864a0bbb	pan/midgard: Promote all 16 uniforms Now that register spilling is in place, this is reasonable. It turns out for some shaders, it's actually better to cap at 8 work registers and extra >8 uniform registers and tolerate the spilling, since the extra resulting threads make up for the spillage. So incidentally, the shader that spills here is in -bterrain, which jumps from 19fps to 21fps as a result of this change. total instructions in shared programs: 3513 -> 3448 (-1.85%) instructions in affected programs: 776 -> 711 (-8.38%) helped: 20 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 3.25 x̃: 2 helped stats (rel) min: 3.57% max: 16.00% x̄: 8.37% x̃: 7.19% 95% mean confidence interval for instructions value: -4.28 -2.22 95% mean confidence interval for instructions %-change: -10.02% -6.73% Instructions are helped. total bundles in shared programs: 2067 -> 2024 (-2.08%) bundles in affected programs: 515 -> 472 (-8.35%) helped: 19 HURT: 1 helped stats (abs) min: 1 max: 6 x̄: 2.37 x̃: 2 helped stats (rel) min: 2.13% max: 17.86% x̄: 10.19% x̃: 11.11% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23% 95% mean confidence interval for bundles value: -3.01 -1.29 95% mean confidence interval for bundles %-change: -12.13% -6.91% Bundles are helped. total quadwords in shared programs: 3468 -> 3426 (-1.21%) quadwords in affected programs: 764 -> 722 (-5.50%) helped: 19 HURT: 1 helped stats (abs) min: 1 max: 5 x̄: 2.26 x̃: 2 helped stats (rel) min: 1.41% max: 12.50% x̄: 6.76% x̃: 7.14% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.08% max: 1.08% x̄: 1.08% x̃: 1.08% 95% mean confidence interval for quadwords value: -2.83 -1.37 95% mean confidence interval for quadwords %-change: -8.08% -4.65% Quadwords are helped. 
total registers in shared programs: 383 -> 360 (-6.01%) registers in affected programs: 112 -> 89 (-20.54%) helped: 19 HURT: 0 helped stats (abs) min: 1 max: 3 x̄: 1.21 x̃: 1 helped stats (rel) min: 12.50% max: 27.27% x̄: 20.63% x̃: 20.00% 95% mean confidence interval for registers value: -1.47 -0.95 95% mean confidence interval for registers %-change: -22.39% -18.87% Registers are helped. total threads in shared programs: 432 -> 451 (4.40%) threads in affected programs: 19 -> 38 (100.00%) helped: 11 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.73 x̃: 2 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for threads value: 1.41 2.04 95% mean confidence interval for threads %-change: 100.00% 100.00% Threads are [helped]. total loops in shared programs: 4 -> 4 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 0 -> 4 spills in affected programs: 0 -> 4 helped: 0 HURT: 2 total fills in shared programs: 0 -> 7 fills in affected programs: 0 -> 7 helped: 0 HURT: 2 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 16:52:21 -07:00
Alyssa Rosenzweig	e94239b9a4	pan/midgard: Break mir_spill_register into its function No functional changes, just breaks out a megamonster function and fixes the indentation. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 16:52:21 -07:00
Alyssa Rosenzweig	d4bcca19da	pan/midgard: Switch sources to an array for trinary sources We need three independent sources to support indirect SSBO writes (as well as textures with both LOD/bias and offsets). Now is a good time to make sources just an array so we don't have to rewrite a ton of code if we ever needed a fourth source for some reason. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 16:48:54 -07:00
Alyssa Rosenzweig	513d02cfeb	pan/midgard: Remove "r27-only" register class As far as I know, there's no such thing as a load/store op that only takes its argument in r27. We just need to set the appropriate arg_1 field in the RA to specify other registers if we want them. To facilitate this, various RA-related changes are needed across the compiler ; this should also fix indirect offsets which were implicitly interpreted as "r27-only" despite not even passing through RA yet. One ripple effect change is switching the move insertion point and adjusting the liveness analysis accordingly, so while this was intended as a purely functional change, there are some shader-db changes: total instructions in shared programs: 3511 -> 3498 (-0.37%) instructions in affected programs: 563 -> 550 (-2.31%) helped: 12 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.08 x̃: 1 helped stats (rel) min: 0.93% max: 5.00% x̄: 2.58% x̃: 2.33% 95% mean confidence interval for instructions value: -1.27 -0.90 95% mean confidence interval for instructions %-change: -3.23% -1.93% Instructions are helped. total bundles in shared programs: 2067 -> 2067 (0.00%) bundles in affected programs: 398 -> 398 (0.00%) helped: 7 HURT: 4 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.54% max: 10.00% x̄: 5.04% x̃: 5.56% HURT stats (abs) min: 1 max: 2 x̄: 1.75 x̃: 2 HURT stats (rel) min: 2.13% max: 4.26% x̄: 3.72% x̃: 4.26% 95% mean confidence interval for bundles value: -0.95 0.95 95% mean confidence interval for bundles %-change: -5.21% 1.50% Inconclusive result (value mean confidence interval includes 0). 
total quadwords in shared programs: 3464 -> 3454 (-0.29%) quadwords in affected programs: 1199 -> 1189 (-0.83%) helped: 18 HURT: 4 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.03% max: 5.26% x̄: 2.44% x̃: 1.79% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 2.56% max: 2.82% x̄: 2.63% x̃: 2.56% 95% mean confidence interval for quadwords value: -0.98 0.07 Inconclusive result (value mean confidence interval includes 0). total registers in shared programs: 383 -> 373 (-2.61%) registers in affected programs: 56 -> 46 (-17.86%) helped: 12 HURT: 2 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 9.09% max: 33.33% x̄: 29.58% x̃: 33.33% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 20.00% max: 50.00% x̄: 35.00% x̃: 35.00% 95% mean confidence interval for registers value: -1.13 -0.29 95% mean confidence interval for registers %-change: -35.07% -5.63% Registers are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig	5d9b7a8ddb	pan/midgard: Handle get/set_swizzle for load/store arguments Load/store's main "argument 0" already has its swizzle handled correctly (for stores, that is). But the tinier arguments, the compact ones with a component select but not a full swizzle, those are not yet handled. Let's do something about that!	2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig	9aeb726045	pan/midgard: Fix block successors Rather than an ersatz thing that sort of looks like successors but is in fact just the source order traversal with some backward jumps hacked in for loops... construct an actual flow graph so we can do analysis sanely. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig	1a116037d8	pan/midgard: Add helper to pack load/store registers Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig	e112d9d333	pan/midgard: Decode register/component in load/store argument 3-bits out of 8 down! Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig	5a572f4b55	pan/midgard: Fix REGISTER_OFFSET r27 isn't the special one, usually. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig	c908772ee4	pan/midgard: Split ld/st unknown to arg_1/arg_2 fields The 16-bit field can be decomposed to two independent 8-bit fields, each representing a single (additional) argument to the load/store op, generally used for encoding registers. Addressable registers here are substantially limited compared to the main register in a load/store op. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 14:20:02 -07:00
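The decomposition of the 16-bit load/store field into two 8-bit arguments is plain bit slicing, sketched below. Which half corresponds to `arg_1` and which to `arg_2` is not stated in the commit message, so the low-byte-first order used here is an assumption, as are the helper names.

```python
def split_ld_st_args(field16):
    """Split a 16-bit value into two independent 8-bit arguments.

    Assumption for this sketch: arg_1 is the low byte, arg_2 the high byte.
    """
    if not 0 <= field16 <= 0xFFFF:
        raise ValueError("expected a 16-bit value")
    arg_1 = field16 & 0xFF
    arg_2 = (field16 >> 8) & 0xFF
    return arg_1, arg_2

def join_ld_st_args(arg_1, arg_2):
    """Inverse of split_ld_st_args under the same byte-order assumption."""
    return ((arg_2 & 0xFF) << 8) | (arg_1 & 0xFF)
```

Treating the halves separately is what lets each one be decoded as its own register or component selector instead of one opaque 16-bit "unknown".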
Bas Nieuwenhuizen	2d54fdb563	radv: Expose VK_KHR_imageless_framebuffer. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-02 22:35:25 +02:00
Bas Nieuwenhuizen	9475782eac	radv: Implement VK_KHR_imageless_framebuffer. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-02 22:35:19 +02:00
Bas Nieuwenhuizen	a7041f3b4e	radv: Store image view also outside framebuffer. So we can use it with imageless framebuffers. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-02 22:19:16 +02:00
Bas Nieuwenhuizen	49e6c2fb78	radv: Store color/depth surface info in attachment info instead of framebuffer. That way we can use it for imageless framebuffers. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-02 22:18:51 +02:00
Alyssa Rosenzweig	cd98d94516	panfrost: Allocate polygon lists on-demand Rather than allocating a huge (64MB) polygon list on context creation and sharing it across framebuffers, we instead allocate polygon lists as BOs (which consistently hit the cache) sized appropriately; for about a month, we've known how to calculate the polygon list size so this has only recently become possible. The good news is we can render to truly massive framebuffers without crashing and, more importantly, we eliminate the 64MB upfront overhead. If a list that size isn't actually needed, it's not allocated. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-08-02 21:54:58 +02:00
Boris Brezillon	ed501c00cb	panfrost: Handle the bo == NULL case in panfrost_bo_[un]reference() Allows us to pass BOs without checking if they're NULL or not. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 21:54:58 +02:00
Boris Brezillon	12f72175f3	panfrost: Get rid of the skippable param in attach_vt_framebuffer() The only user of this function always passes true. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 21:54:58 +02:00
Boris Brezillon	8227d284f7	panfrost: Don't emit a new FB desc when setting a new FB state The FB desc will be emitted/attached on the first draw targeting this new FB. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 21:54:58 +02:00
Boris Brezillon	95507a3dd4	panfrost: Bail out early when doing a wallpaper blit The wallpaper blit is a bit special in that the operation is targeting the current FB, but the u_blitter logic creates a new surface for it which makes util_framebuffer_state_equal() return false. In that case we don't want a new FB descriptor to be emitted/attached, so let's just copy the new state into ctx->pipe_framebuffer and exit the function. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 21:54:58 +02:00
Boris Brezillon	8645afce4c	panfrost: Bail out early when new and current FB states are equal If the current FB matches the new one there's nothing to be done in panfrost_set_framebuffer_state(). By bailing out early in that case we avoid emitting new FB descriptors (the old ones are still valid). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 21:54:58 +02:00
Boris Brezillon	17d6ee2bd1	panfrost: Delay FB descriptor allocation No need to emit SFBD/MFBD at frame invalidation. They can be emitted when the framebuffer is attached, which saves us a potential FB desc re-allocation if a new FB is bound after the swap. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 21:54:58 +02:00
Boris Brezillon	b5ca1e5458	panfrost: Remove job from ctx->jobs at submission time This guarantees that new draws targeting the same framebuffer will get a new job instance. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 21:54:58 +02:00
Boris Brezillon	20b00e1ff2	panfrost: Make ctx->job useful ctx->job is supposed to serve as a cache to avoid a hash table lookup every time we access the job attached to the currently bound FB, except it was never assigned to anything but NULL. Fix that by adding the missing assignment in panfrost_get_job_for_fbo(). Also add a missing NULL assignment in the ->set_framebuffer_state() path. While at it, add extra assert()s to make sure ctx->job is consistent. Fixes: `59c9623d0a` ("panfrost: Import job data structures from v3d") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 21:52:56 +02:00
Bas Nieuwenhuizen	72e7b7a00b	ac/nir,radv: Optimize bounds check for 64 bit CAS. When the application does not ask for robust buffer access. Only implemented the check in radv. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-02 21:21:55 +02:00
Roland Scheidegger	74baeacafc	gallivm: fix issue with AtomicCmpXchg wrapper on llvm 3.5-3.8 These versions still need wrapper but already have both success and failure ordering. (Compile tested on llvm 3.3, 3.7, 3.8.) v2: don't duplicate whole function (suggested by Brian). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111102 Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-08-02 20:16:17 +02:00
Matt Turner	dcf9d91a80	util: Handle differences in pthread_setname_np There are a lot of unfortunate differences in the implementation of this function. NetBSD and Mac OS X in particular require different arguments. https://stackoverflow.com/questions/2369738/how-to-set-the-name-of-a-thread-in-linux-pthreads/7989973#7989973 provides a good overview of the differences. Fixes: `9c411e020d` ("util: Drop preprocessor guards for glibc-2.12") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111264 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> [Eric: use DETECT_OS_* instead of PIPE_OS_*] Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-02 18:38:52 +01:00
Eric Engestrom	55eadf971a	util/os_time: use detect_os.h to uncouple from gallium Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-02 18:38:52 +01:00
Eric Engestrom	bffa23313a	util/u_debug: use detect_os.h Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-02 18:38:52 +01:00
Eric Engestrom	7f12a66ad5	util/os_misc: use detect_os.h to start uncoupling from gallium Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-02 18:38:52 +01:00
Eric Engestrom	87adc898b3	util/os_memory: use detect_os.h to uncouple it from gallium While at it, remove p_compiler.h as well as it is unused. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-02 18:38:52 +01:00
Eric Engestrom	9a5148190a	gallium: deduplicate os detection logic by using detect_os.h This allows us to avoid having to rename all the PIPE_OS_* at once while still making sure PIPE_OS_* and DETECT_OS_* are always in sync. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-02 18:38:52 +01:00
Eric Engestrom	8c52bca112	gallium/utils: drop PIPE_SUBSYSTEM_WINDOWS_USER This is basically just an alias for PIPE_OS_WINDOWS. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-02 18:38:52 +01:00
Eric Engestrom	e740e7a6f0	scons: rename PIPE_SUBSYSTEM_EMBEDDED to EMBEDDED_DEVICE It has nothing to do with the PIPE_SUBSYSTEM_* stuff from gallium. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-02 18:38:52 +01:00
Eric Engestrom	8c63348c94	gallium: remove never-used PIPE_SUBSYSTEM_DRI PIPE_SUBSYSTEM_DRI was introduced in `dacfef1589` ("gallium: New configuration header.") 11 years ago, and was never used. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-02 18:38:52 +01:00
Eric Engestrom	bfb70032d4	util: fix typo in comment Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-02 18:38:52 +01:00
Eric Engestrom	362e9d8682	util: introduce detect_os.h Mostly copied from src/gallium/include/pipe/p_config.h, so I kept its copyright and authorship. Other than the obvious rename, the big difference is that these are always defined, to be used as `#if DETECT_OS_LINUX`. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-02 18:38:52 +01:00
Rob Clark	9d5beab441	freedreno/batch: fix dependency loop detection We can have a scenario like: A -> B A -> C -> B When adding the A->C dependency, it doesn't really matter that C depends on something that A depends on, that isn't a necessary condition for a dependency loop. Instead what we want to know is that nothing C depends on, directly or indirectly, depends on A. We can detect this by recursively OR'ing the dependents_mask of C and all its dependencies. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-02 10:24:14 -07:00
Rob Clark	e1790c532a	freedreno/a6xx: add missing flush/invalidates for blit Various things we were missing for multiple blits in a single batch. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-02 10:24:14 -07:00
Rob Clark	d8379da19e	freedreno/a6xx: skip tiles with no geometry If no clear, and no geometry according to VSC_STATE[pipe] we can skip the tile entirely. If there is a fast-clear, we can't skip restore (clear) or resolve IBs, but we can still skip draw IB. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-02 10:24:14 -07:00
Rob Clark	de3e130fc9	freedreno/a6xx: VSC overflow detection/handling Check VSC_SIZE/VSC_SIZE2 regs from cmdstream to detect overflow, and skip use of VSC visibility stream when overflow is detected, to avoid GPU hangs. This is done w/ introduction of some CP_REG_TEST/CP_COND_REG_EXEC packet pairs. In addition, eventually (after a frame or two) detect the condition and resize the VSC buffers until overflow no longer happens. Note that this significantly reduces the initial size of the VSC buffers, backing out a previous hack to make them 16x larger than what should be typically required (the previous "solution" for VSC overflow). Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-02 10:24:14 -07:00
Rob Clark	401f532bea	freedreno/a6xx: remove USE/IGNORE_VISIBILITY draw patching Seems this isn't needed anymore on a6xx to control whether visibility stream is used. And it would be hard to deal with if it was, for disabling use of VSC stream in draw pass. So just remove it and simplify things. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-02 10:24:14 -07:00
Rob Clark	146d6e6463	freedreno/a6xx: cleanup "blit_mem" Rename to "control_mem", and switch to using a struct to manage the layout, rather than just ad-hoc hard-coded offsets. For recovering from VSC stream overflow, we'll need to add more, but best to clean it up first. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-02 10:24:14 -07:00
Rob Clark	1cbb7f7601	freedreno: refresh tile debug Fix some #ifdef'd bitrot, and get rid of #ifdef so it doesn't bitrot again. And add prints for per-tile state. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-02 10:24:14 -07:00
Rob Clark	44f3c1cf01	freedreno: update registers Pull in some updates of VSC regs Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-02 10:24:14 -07:00
Rob Clark	c179ded9cb	freedreno/gmem: small cleanup Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-02 10:24:14 -07:00
Rob Clark	e2bb3e84ab	freedreno/drm: convert ring_pool to child_pool Worth another couple percent at driver2 Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-02 10:24:14 -07:00
Rob Clark	9ac23794c9	freedreno/drm: remove idx_lock Since it ends up contended, it is a bit of a bottleneck for workloads with high driver overhead. Worth nearly +10% at gfxbench driver2. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-02 10:24:14 -07:00
Rob Clark	e439f63467	freedreno/batch: always update last_fence Not all flush paths come thru fd_context_flush(), so we should also set last_fence in the batch flush path. This avoids some no-op flushes just to get a fence. For example when pctx->flush_resource() triggers a flush. We should probably keep the last_fence update in fd_context_flush() as well to handle deferred flush case. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-02 10:24:14 -07:00
Rob Clark	c93eae7f10	freedreno: drop unused fd_fence_ref param The pscreen param was just there to satisfy pipe_screen::fence_reference But some of the internal uses passed NULL for screen. Which is a bit ugly. Instead drop the param and add a shim function to plug into the screen. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-02 10:24:14 -07:00
Alyssa Rosenzweig	1637a53890	pan/midgard: Print invert modifier Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 09:57:15 -07:00
Alyssa Rosenzweig	62a5ee3bb4	pan/midgard: Flip conditionals We would like to flip ops to have a constant in the second place to enable inlining of the constant. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 09:57:15 -07:00
Alyssa Rosenzweig	d066ca3575	pan/midgard: Add bitwise src/invert fusing De Morgan's Laws and some special ops basically. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 09:57:15 -07:00
Alyssa Rosenzweig	620c2717cf	pan/midgard: Add .not propagation pass Essentially .pos propagation but for bitwise. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 09:57:15 -07:00
Alyssa Rosenzweig	b821e1b85e	pan/midgard: Fuse invert into bitwise ops We use the new invert flag to produce ops like inand. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 09:57:15 -07:00
Jonathan Marek	d8584c5cf2	freedreno: a2xx: implement texture tiling Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-08-02 15:58:22 +00:00
Jonathan Marek	fb5c3db0ab	freedreno: a2xx: use nir_lower_alu_to_scalar instead of lowering pass nir_lower_alu_to_scalar can now be used to only lower certain ops, so we don't need the custom pass. And we can lower fall_equal/fany_nequal with lower_vector_cmp instead. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-08-02 15:58:22 +00:00
Jonathan Marek	e652ca4e0b	freedreno: a2xx: fix HW binning for batches with >256K vertices Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-08-02 15:58:22 +00:00
Jonathan Marek	257957b026	freedreno: a2xx: fix fneg/fabs/fsat opcodes Previously we would get a fmov with modifiers, but now that mov has no type these opcodes need to be supported. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-08-02 15:58:22 +00:00
Jonathan Marek	43dbd7d603	freedreno: a2xx: fix order of NIR opts int_to_float needs to come after bool_to_float, and lower_to_source_mods needs to come after both, since they don't deal with source mods. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-08-02 15:58:22 +00:00
Jonathan Marek	57e980a4fb	freedreno: a2xx: fix non-etc1 cubemaps Not sure how this happened, but apparently all cubemaps need swapped XY. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-08-02 15:58:22 +00:00
Jonathan Marek	2e029acbe2	freedreno: a2xx: fix fast clear not being used for Z24X8 buffers Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-08-02 15:58:22 +00:00
Jonathan Marek	e25388c97b	freedreno: align renderonly scanout buffers Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-08-02 15:58:22 +00:00
Eric Engestrom	6125c93e00	gitlab-ci: just build all the tools This line was mistakenly added while there is already a `-D tools=all` a few lines below. Fixes: `f60defa72d` ("gitlab-ci: Add a shader-db run using v3d on drm-shim.") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-02 16:41:19 +01:00
Sergii Romantsov	a86eccfb78	i965/clear: clear_value better precision Test-case with depth-clear 0.5 and format MESA_FORMAT_Z24_UNORM_X8_UINT fails due to inconsistent clear-value of 0.4999997. Maybe it's better to improve? CC: Jason Ekstrand <jason.ekstrand@intel.com> Fixes: `0ae9ce0f29` (i965/clear: Quantize the depth clear value based on the format) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111113 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-02 14:25:34 +00:00
Samuel Pitoiset	e8110e51c6	radv: fix image_has_{cmask,fmask}() helpers The driver should now rely on cmask_offset because CMASK can be disabled by the driver for some reasons (eg. mipmaps). Apply the same change for FMASK, although it should be useless. Fixes: `ad1bc8621d` ("radv: remove radv_get_image_fmask_info()") Fixes: `10d08da52c` ("radv/gfx10: add missing dcc_tile_swizzle tweak") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-02 14:00:50 +02:00
Samuel Pitoiset	ad1bc8621d	radv: remove radv_get_image_fmask_info() It's unnecessary to duplicate fields in another struct. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-02 13:34:46 +02:00
Samuel Pitoiset	10d08da52c	radv/gfx10: add missing dcc_tile_swizzle tweak Fixes: `c90f46700d` ("radv/gfx10: mask DCC tile swizzle by alignment") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-02 13:34:43 +02:00
Samuel Pitoiset	9c9745e8dd	radv: remove radv_get_image_cmask_info() It's unnecessary to duplicate fields in another struct. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-02 13:34:41 +02:00
Samuel Pitoiset	856487a280	radv: only account for tile_swizzle for color surfaces with DCC It's 0 for depth surfaces with TC compat HTILE enabled. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-02 13:34:39 +02:00
Bas Nieuwenhuizen	e1c5d8a364	radv: Enable VK_KHR_shader_atomic_int64 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-02 12:26:32 +02:00
Bas Nieuwenhuizen	a17f2206d3	ac/nir: Implement LLVM9 64-bit buffer compare & exchange. LLVM 9 does not have a 64-bit buffer compswap intrinsic, so this extracts the ptr, does a bound check and then uses a cmpxchg LLVM instruction. Not ideal, but the earliest release we're going to get a proper intrinsic is LLVM 10. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-08-02 12:26:11 +02:00
Connor Abbott	73274c9ec2	Revert "ac/nir: handle negate modifier" This reverts commit `bfea7e4d29`.	2019-08-02 11:14:50 +02:00
Connor Abbott	4a382d66ee	Revert "ac/nir: handle abs modifier" This reverts commit `d3c80733cd`. These were only appearing due to memory corruption.	2019-08-02 11:14:08 +02:00
Timothy Arceri	06ec14d692	iris: bump compat profile support to 4.6 All of the current piglit compat profile tests pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-02 18:56:53 +10:00
Timothy Arceri	74f96b06d6	egl: fix OpenGL 3.1 context creation From the EGL_KHR_create_context spec: "* If OpenGL 3.1 is requested, the context returned may implement any of the following versions: * Version 3.1. The GL_ARB_compatibility extension may or may not be implemented, as determined by the implementation. * The core profile of version 3.2 or greater." Fixes CTS tests: dEQP-EGL.functional.create_context_ext.gl_31.rgb888_depth_stencil dEQP-EGL.functional.create_context_ext.robust_gl_31.rgb888_depth_stencil dEQP-EGL.functional.create_context_ext.gl_31.rgb888_depth_no_stencil dEQP-EGL.functional.create_context_ext.robust_gl_31.rgb888_depth_no_stencil dEQP-EGL.functional.create_context_ext.gl_31.rgba8888_depth_no_stencil dEQP-EGL.functional.create_context_ext.gl_31.rgb888_no_depth_no_stencil dEQP-EGL.functional.create_context_ext.robust_gl_31.rgba8888_depth_no_stencil dEQP-EGL.functional.create_context_ext.robust_gl_31.rgb888_no_depth_no_stencil dEQP-EGL.functional.create_context_ext.gl_31.rgba8888_no_depth_no_stencil dEQP-EGL.functional.create_context_ext.robust_gl_31.rgba8888_no_depth_no_stencil dEQP-EGL.functional.create_context_ext.gl_31.rgba8888_depth_stencil dEQP-EGL.functional.create_context_ext.robust_gl_31.rgba8888_depth_stencil Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-02 18:56:53 +10:00
Connor Abbott	f41516bdb5	nir/find_array_copies: Reject copies with mismatched type When we detect a scalar/vector copy through load_deref/store_deref, we have to be careful since those can bitcast an int to a float and vice-versa even though copy_deref can't. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111251 Fixes: `156306e5e6` ("nir/find_array_copies: Handle wildcards and overlapping copies") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 10:34:29 +02:00
Samuel Pitoiset	7368000868	radv: re-apply "Optimize rebinding the same descriptor set." This makes it cheaper to just change the dynamic offsets with the same descriptor sets. This optimization has been reverted a while back because of random GPU hangs on GFX9, now it looks fine, at least CTS no longer hangs on GFX9 and it doesn't hang on GFX10 as well. It fixes a performance problem with Wolfenstein Youngblood. Suggested-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>	2019-08-02 09:56:55 +02:00
Samuel Pitoiset	96a5445559	radv/gfx10: use the correct target machine for Wave32 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-02 09:37:38 +02:00
Samuel Pitoiset	8a86908e9a	radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders It can be enabled with RADV_PERFTEST=gewave32. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-02 09:37:36 +02:00
Samuel Pitoiset	953bbacc23	radv/gfx10: add Wave32 support for fragment shaders It can be enabled with RADV_PERFTEST=pswave32. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-08-02 09:37:34 +02:00
Kenneth Graunke	18c2e09dc7	gallium: Implement GL_EXT_shader_samples_identical via a new capability This exposes the textureSamplesIdenticalEXT function in GLSL. We enable it for iris and radeonsi, because their compilers already have support for this. Tested on Intel Kabylake and AMD Vega 64. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-01 23:38:54 -07:00
Kenneth Graunke	adcc0a8fdc	intel/tools: Fix aubinator_viewer build. This function was recently renamed and not all callers were updated. Fixes: `086c486a75` ("intel/device: rename gen_get_device_info")	2019-08-01 23:36:41 -07:00
Francisco Jerez	54fbc625ea	intel/ir: Fix CFG corruption in opt_predicated_break(). Specifically the optimization of a conditional BREAK + WHILE sequence into a conditional WHILE seems pretty broken. The list of successors of "earlier_block" (where the conditional BREAK was found) is emptied and then re-created with the same edges for no apparent reason. On top of that the list of predecessors of the block immediately after the WHILE loop is emptied, but only one of the original edges will be added back, which means that potentially several blocks that still have it on their list of successors won't be on its list of predecessors anymore, causing all sorts of hilarity due to the inconsistency in the control flow graph. The solution is to remove the code that's removing valid edges from the CFG. cfg_t::remove_block() will already clean up after itself. The assert in bblock_t::combine_with() also needs to be removed since we will be merging a block with multiple children into the first one of them. Found the issue on a hardware enabling branch originally, but apparently somebody reproduced the same problem independently on master in the meantime. Fixes: `d13bcdb3a9` ("i965/fs: Extend predicated break pass to predicate WHILE.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111009 Cc: jiradet.jd@gmail.com Cc: Sergii Romantsov <sergii.romantsov@globallogic.com> Cc: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org Tested-by: Paul Chelombitko <qamonstergl@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-08-01 16:56:48 -07:00
Mark Janes	ddb59cd20e	intel/device: make internal functions private The device info initializer makes several functions internal: - handling of device override - updating topology from kernel information The implementation file is slightly reordered due to the renamed functions being static. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-01 16:40:03 -07:00
Mark Janes	086c486a75	intel/device: rename gen_get_device_info Rename the original device info initialization routine so callers don't mistakenly call the wrong one: gen_get_device_info_from_fd: Queries kernel for full device info, including topology details. gen_get_device_info_from_pci_id: Partially initializes device info based on PCI ID lookup, when the kernel is not available. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-01 16:39:56 -07:00
Mark Janes	d594d2a052	intel/tools: use device info initializer Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-01 16:39:54 -07:00
Mark Janes	e4a0070db4	anv: use initialization routine for gen_device_info Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-01 16:39:51 -07:00
Mark Janes	49465f1330	iris/screen: use initialization routine for gen_device_info Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-01 16:39:48 -07:00
Mark Janes	96e1c945f2	i965: Move device info initialization to common code With perf queries, initializing the device info is much more complex than just getting a PCI ID and calling gen_get_device_info. This commit adds a new gen_get_device_info_from_fd helper in common code which does all of the requisite kernel queries to get device info including all of the topology information. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-01 16:39:44 -07:00
Mark Janes	1186f6ea69	i965/perf: verify kernel support before registering OA metrics When gen_device_info updates the topology in its initializer, the kernel queries will fail silently. Iris and anv have minimum kernel requirements that support the queries. i965 must verify kernel support before reporting OA metrics. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-01 16:39:41 -07:00
Mark Janes	7852fe5415	intel/common: provide common ioctl routine i965 links against libdrm for drmIoctl, but anv and iris both re-implement this routine to avoid the dependency. intel/dev also needs an ioctl wrapper, so let's share the same implementation everywhere. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-01 16:38:40 -07:00
Alyssa Rosenzweig	b40ba2db6c	panfrost: Remove unused argument A relic from when we didn't have an online compiler, hah. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	ff345d4a01	panfrost: Handle MESA_SHADER_COMPUTE in compile callback Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	73c40d6bbb	pan/midgard: Use standard list traversal to find initial tag Fixes a hang (and abort) on empty shaders, which you shouldn't have anyway but better safe than sorry. DCE going on the fritz is no reason to freeze the system. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	4647999327	panfrost: Use gl_shader_stage directly for compiles No need to add a third set of enums to the mix. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	d9eb65c60c	panfrost: Emit "draw" info for compute jobs Important fields relating to shader state and UBOs are filled out from this (misnomer) function. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	22a8f6de61	panfrost: Feed compute shaders into the compiler The path for compute shader compiles resembles the graphic shader compile path, although it is substantially simpler as we don't need any shader keying. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	1b284628ef	panfrost: Expose compute shaders as panfrost_shader_variants Whether variants are packed by graphics or compute is irrelevant. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	8b53230d47	panfrost: Remove shader state *base It is now unused. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	c228046b4b	panfrost: Remove CSO dependency from shader_compile We want this routine to be generic across graphics and compute, so let the caller deal with the typing. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	428bed3bde	panfrost: Generalize UBO upload for other shader stages Now that everything is unified, this generalization is nice and easy. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	a34370e855	panfrost: Guard vertex upload by ctx->vertex != NULL This is irrelevant for graphics but matters for compute workloads. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	3bfdb878aa	panfrost: Generalize vertex shader upload This allows us to reuse the same code path for compute. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	3b7224190e	panfrost: Share gl_enables between VERTEX/COMPUTE Catch-all for magic bits. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	871c02b12e	panfrost: Invoke compute shader according to grid info We already have helpers for packing invocations (due to its role in instanced vertex shaders), so we can reuse this drop in for compute shaders. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	748ccbc808	panfrost: Explain and include compute FBD Squint at it hard enough and you realize it's the beginning of an SFBD... I guess... A compute shader with register spilling would be able to confirm this, but we would expect to see the first field | 1 and an address splattered later, setting up TLS. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	3113be3127	panfrost: Unify-driven cleanup Again, now that stages are unified some logic goes away. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	ac6aa93f9e	panfrost: Unify ctx->vs and ctx->fs It's a little verbose, but this way we can support other shader stages without too much contortion. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig	4b93152c29	panfrost: Flesh out launch_grid stub It's still incomplete, but we're able to hook into launch_grid to create a stub COMPUTE job. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:02 -07:00
Alyssa Rosenzweig	cd1be4605c	panfrost: Cleanup via payload unification Since these are now indexable, quite a bit of code cleans up. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:02 -07:00
Alyssa Rosenzweig	0da52015a1	panfrost: Unify payload_vertex/payload_tiler Rather than disparate variables, let's use an array of payloads indexed by the shader stage. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:02 -07:00
Alyssa Rosenzweig	902115f94f	panfrost: Only wallpaper if we drew something last_tiler.gpu may be NULL at flush time despite no clear and existing jobs -- if we executed a compute-only workload. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:02 -07:00
Alyssa Rosenzweig	2d86828243	panfrost: Adjust shader CAPs to expose dEQP compute Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:01 -07:00
Alyssa Rosenzweig	39fe9f5e2f	panfrost: Expose NIR as our PIPE_SHADER_CAP_SUPPORTED_IRS We could expose TGSI as well -- we pipe it through tgsi_to_nir for Gallium-internal shaders anyway -- but we'd rather not. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:01 -07:00
Alyssa Rosenzweig	1697760e05	panfrost: Copy freedreno's panfrost_get_compute_param Values reported here aren't remotely correct, but it's a start to just get the entrypoint stubbed out. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:01 -07:00
Alyssa Rosenzweig	c8bc664447	panfrost: Expose COMPUTE-related caps for GLES3.1 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:01 -07:00
Alyssa Rosenzweig	5a8b83ca0b	panfrost: Stub out launch_grid Just dumps some information about the invocation for later debug. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:01 -07:00
Alyssa Rosenzweig	a8fc40aaf5	panfrost: Stub out compute CSO Doesn't do anything, just gets the functions there. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:23:01 -07:00
Alyssa Rosenzweig	e913986868	panfrost: Implement gl_FrontFacing Interestingly, this requires no compiler changes. It's just exposed as a special varying. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:15:03 -07:00
Alyssa Rosenzweig	f3e15122d4	panfrost: Add support for decoding gl_FrontFacing Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:15:03 -07:00
Alyssa Rosenzweig	9e66ff3ea9	pan/decode: Use max varying index as varying buffer count This allows us to decode asymmetric varyings correctly, which occurs with e.g. gl_FrontFacing. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-01 16:15:03 -07:00
Timothy Arceri	2afedfaf9a	iris: add support for gl_ClipVertex in tess eval shaders Required for OpenGL compat support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-01 16:12:37 -07:00
Timothy Arceri	00b5bf2d72	iris: add support for gl_ClipVertex in geometry shaders This will enable us to support the OpenGL compat profile. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-01 16:12:27 -07:00
Jason Ekstrand	70dc017aec	nir: Stop whacking gl_FrontFacing to a system value We have a cap bit for gallium and a GLSL compiler flag to control this. Just trust what GLSL gives us and stop forcing it. In order for this to be safe, we have to advertise another cap in some of the gallium drivers. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-01 21:59:37 +00:00
Alyssa Rosenzweig	4e736b88f3	panfrost: Implement panfrost_set_shader_buffers callback Just copy over the passed SSBO for now. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-01 14:32:08 -07:00
Alyssa Rosenzweig	898a18ea89	gallium/util: Add util_set_shader_buffers_mask helper Conceptually follows util_set_vertex_buffers_mask but for SSBOs. v2: Fix missing ~ when clearing mask. Adjust mask behaviour to match freedreno/v3d when buffer == NULL. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-01 14:31:56 -07:00
Jonathan Marek	3e33173200	kmsro: move entry points from etnaviv to kmsro These drivers are kmsro drivers so they should be part of the kmsro #if This fixes missing imx_drm driver when building with only freedreno+kmsro Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-01 16:31:51 -04:00
Emil Velikov	85dace1c0b	gitlab-ci: remove software-properties-common Currently we use the python package to manage repositories. At the same time we also do that by hand - since it's a trivial echo to a file. Stay consistent, remove the package and manage things manually. Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2019-08-01 16:16:15 +00:00
Brian Paul	3307c85a7d	st/mesa: fix MSVC compile breakage Trivial.	2019-08-01 09:07:21 -06:00
Gert Wollny	9de00e74fe	virgl: Enable depth_clamp by lowering if the host is new enough. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-01 05:58:53 +00:00
Gert Wollny	b2e92c45ce	gallium: Make PIPE_CAP_DEPTH_CLIP_DISABLE a tri-state value and use it Use value "2" to signal that lowering is needed and supported and enable it accordingly. v2: - Note in CAP description that this lowering currently requires TGSI - use "true" instead of GL_TRUE (both Erik) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-01 05:58:53 +00:00
Gert Wollny	616f320745	mesa/st: Signal state changes when depth_clamp is emulated v1 implemented by Erik Faye-Lund <erik.faye-lund@collabora.com> v2: - Add GS and TES - fix constants state update flags (Erik) v3: don't update rasterizer when depth_clamp is lowered (Erik) v4: Correct NewDepthClamp and also set flags for NewClipControl (Erik) v5: Also set shader_has_one_variant property according to possible depth_clamp lowering (Marek) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-01 05:58:53 +00:00
Gert Wollny	d004fcc04a	mesa/st: Add depth clamping to rasterizer code implemented by Erik Faye-Lund <erik.faye-lund@collabora.com> v2: Use current depth range values for clamping (Erik) v3: fix scons-win64 build Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-01 05:58:53 +00:00
Gert Wollny	57361d89fa	mesa/st: Tie depth_clamp code into other shaders (GS and TES) v2: Use file scope defined depth_range_state in common v3: - don't use the one_shader_variant property, as this is not correct (Marek) - also use tests on available shader stages to enable depth_clamp lowering v4: Don't use key.st, use st directly (Marek) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-01 05:58:53 +00:00
Gert Wollny	d81ba38b02	mesa/st: Tie depth_clamp lowering into the FS v1 implemented by Erik Faye-Lund <erik.faye-lund@collabora.com> v2: Use different call for FS v3: Use file scope defined depth_range_state Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-01 05:58:53 +00:00
Gert Wollny	fefb152067	mesa/st: Tie depth clamp lowering in to the VP code v1: implemented by Erik Faye-Lund <erik.faye-lund@collabora.com> v2: Add handling of the ARB_clip_control depth mode v3: Move depth_range_state to file scope and remove trailing zeros (Erik) v4: - don't use the one_shader_variant property, as this is not correct (Marek) - also use tests on available shader stages to enable depth_clamp lowering v5: Don't use key.st, use st directly (Marek) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-01 05:58:53 +00:00
Erik Faye-Lund	b048d8bf8f	mesa/st: add tgsi-lowering code for depth-clamp This is a TGSI pass that lowers depth-clamping into shader-operations, by replacing the depth-value with 0 (a z-coordinate of zero will always pass the OpenGL depth test conditions), and using a dedicated varying to interpolate the real depth-value instead. Finally we replace the depth-output in the fragment shader. v1 implemented by Erik Faye-Lund <erik.faye-lund@collabora.com> v2: Add support for handling depth clip mode, and refactor code v3: - Rename _vs functions to _last_vertex_stage (Erik) - Use 0.0 depth to avoid clipping (Erik) v4: Fix inversion of bool value for clip control property Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-01 05:58:53 +00:00
Gert Wollny	78ba12f40f	mesa/st: replace boolean declarations by bool Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-08-01 05:58:53 +00:00
Gert Wollny	7fb47195d8	Revert "softpipe: Don't draw when rasterizer_discard is set" This was too aggressive and breaks TF (Ilia) This reverts commit `4ee638cd78`. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-08-01 05:57:41 +00:00
Eric Engestrom	a563bb9e28	docs: reword meson instructions Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-01 00:42:02 +01:00
Eric Engestrom	8a1e803643	travis: drop unnecessary Meson option for MacOS Those are already their default values on MacOS. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-01 00:25:20 +01:00
Jason Ekstrand	b539157504	intel/vec4: Drop all of the 64-bit varying code Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 18:14:09 -05:00
Jason Ekstrand	d03ec807a4	intel/fs: Drop all of the 64-bit varying code Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 18:14:09 -05:00
Jason Ekstrand	942c759059	intel: Use NIR to lower 64-bit varying access Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 18:14:09 -05:00
Jason Ekstrand	078dcb7ccd	nir/lower_io: Add an option to lower 64-bit varyings Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 18:14:09 -05:00
Jorge Natz	a63e82deb5	docs: Update Platforms and Drivers page with more comprehensive information. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-31 22:50:43 +00:00
Dave Airlie	7ad6ec80d9	nir: use common deref has indirect code in scratch lowering. This doesn't seem to need its own copy here. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-01 08:32:12 +10:00
Eric Engestrom	5d7bcac4e7	nir: remove explicit nir_intrinsic_index_flag values These were left after a rebase and happen to make NIR_INTRINSIC_SWIZZLE_MASK == NIR_INTRINSIC_SRC_ACCESS, which is how it was noticed. Fixes: `6f20643b47` ("nir: Allow qualifiers on copy_deref and image instructions") Cc: Connor Abbott <cwabbott0@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-31 23:28:20 +01:00
Yevhenii Kolesnikov	830a8e6c47	state_tracker: Free Labels for query and transform_feedback Memory leaks were observed on iris with GL_KHR_debug. Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-31 22:16:42 +00:00
Kenneth Graunke	b61f17d362	iris: Skip emitting 3DSTATE_INDEX_BUFFER if possible We were emitting 3DSTATE_INDEX_BUFFER on every indexed draw, even if back-to-back draws referred to the same index buffer. This improves drawoverhead scores in the DrawElements cases by about 10%, by giving us even more minimal batches.	2019-07-31 15:14:10 -07:00
Mike Blumenkrantz	8af1990ad7	st/dri: simplify dri_get_egl_image by reusing dri2_format_table this makes dri2_get_mapping_by_fourcc accessible from dri_helpers.h and does a direct lookup on the fourcc id to match the pipe format v2 (Ken): Allow map to be NULL, use img->texture->format. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-31 15:11:15 -07:00
Erico Nunes	82bf5a8aac	lima: enable lower_bitops in ppir The mali pp doesn't support integers and some nir_algebraic optimizations may result in ops that are not easily lowerable to floats, so disable optimizations resulting in bitops. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-07-31 23:06:26 +02:00
Erico Nunes	b3676a6548	nir/algebraic: rename lower_bitshift to lower_bitops Optimizations that insert bitshift or bitwise operations should not be applied on GPUs that don't support integer operations. The .lower_bitshift could be used to control the bitshift related ones, but there was also one bitwise optimization uncovered. Since only lima and freedreno use this option and the use case is that no bit operations are wanted, let's rename it to .lower_bitops and use it to control all bitops related optimizations. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-07-31 23:06:04 +02:00
Erico Nunes	99c956fb47	lima/ppir: lower fdot in nir_opt_algebraic Now that we have fsum in nir, we can move fdot lowering there. This helps reduce ppir complexity and enables the lowered ops to be part of other nir optimizations in the optimization loop. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-07-31 21:35:58 +02:00
Erico Nunes	4a407df682	nir/algebraic: add new fsum ops and fdot lowering The Mali400 pp doesn't implement fdot but has fsum3 and fsum4, which can be used to optimize fdot lowering. fsum2 is not implemented and can be further lowered to an add with the vector components. Currently lima ppir handles this lowering internally, however this happens in a very late stage and requires a big chunk of code compared to a nir_opt_algebraic lowering. By having fsum in nir, we can reduce ppir complexity and enable the lowered ops to be part of other nir optimizations in the optimization loop. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-31 21:35:58 +02:00
Erico Nunes	7f8ff686b7	lima/ppir: refactor texture code to simplify scheduler The 'varying fetch' pp instruction deals only with coordinates, and 'texture fetch' deals only with the sampler index. Previously it was not possible to clearly map ppir_op_load_coords and ppir_op_load_texture to pp instructions as the source coordinates were kept in the ppir_op_load_texture node, making this harder to maintain. The refactor is made with the attempt to clearly map ppir_op_load_coords to the 'varying fetch' and ppir_op_load_texture to the 'texture fetch'. The coordinates are still temporarily kept in the ppir_op_load_texture node as nir has both sampler and coordinates in a single instruction and it is only possible to output one ppir node during emit. But now after lowering, the sources are transferred to the (always) created ppir_op_load_coords node, and it should be possible to directly map them to their pp instructions from there onwards. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-07-31 21:22:41 +02:00
Erico Nunes	d2901de09e	lima/ppir: lower texture projection Lower texture projection in ppir using nir_lower_tex. This will insert a mul with the coordinate division before the load varying. Even though the lima pp supports projection in the load varying instruction while loading the coordinates (from a register or a varying), it requires that both the coordinates and projector be components in a single register. nir currently handles them in separate ssa, and attempting to merge them manually may end up in worse code than just doing the coordinate division manually. So for now let's just lower the projection to add support for it in lima. In the future, an optimization pass may be implemented in lima to ensure that both coords and projector come in the same register, then this lowering may be disabled and in this case lima may use the built-in projection and save the mul instruction from lowering. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-07-31 21:22:41 +02:00
Vinson Lee	412e1b51fe	scons: Fix random_r check. Fixes: `597bddad47` ("scons: Test for random_r()") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 18:23:55 +00:00
Kenneth Graunke	3f9012839e	Revert "st/dri: simplify dri_get_egl_image by reusing dri2_format_table" This reverts commit `c47af8b95f`. It causes dEQP-EGL regressions. (I think there is an easy fix, but we'll have it go through review again.)	2019-07-31 11:06:32 -07:00
Alyssa Rosenzweig	91c4acedaf	pan/midgard: Don't special case inline_constant Another constant source of bugs. Ain't that special. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-31 10:59:19 -07:00
Alyssa Rosenzweig	29416a8599	pan/midgard: De-special-case branching It's not that special. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-31 10:59:18 -07:00
Alyssa Rosenzweig	3e47a1181b	panfrost: Add MALI_SAMP_NORM_COORDS flag Corresponds to the normalized coordinates? flag on images in OpenCL and evidently also shows up in GL, so let's wire it in. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-31 10:56:11 -07:00
Alyssa Rosenzweig	cf6cad3922	panfrost: Simplify filter_mode definition It's just a bit field containing some flags; there's no need for all the macro magic. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-31 10:56:11 -07:00
Alyssa Rosenzweig	160795429d	pan/midgard: Shrink "compute FBD" We still don't know what it is, but from a newer trace we now know it's half the size we thought it was. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-31 10:56:11 -07:00
Alyssa Rosenzweig	194b49ee28	panfrost: Flip texture/sampler fields We had them backwards in both the command stream and the Midgard stack. In OpenGL ES 2.0, they're always the same, but in Vulkan/later-GL/CL they diverge so we can fix this. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-31 10:56:11 -07:00
Alyssa Rosenzweig	a692126c93	panfrost: Add MALI_ATTR_IMAGE value Images are implemented (in part) as special attributes, so include support for decoding this. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-31 10:56:11 -07:00
Mike Blumenkrantz	c47af8b95f	st/dri: simplify dri_get_egl_image by reusing dri2_format_table this makes dri2_get_mapping_by_fourcc accessible from dri_helpers.h and does a direct lookup on the fourcc id to match the pipe format Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-31 09:50:06 -07:00
Mike Blumenkrantz	7404833c2e	gallium: add handling for YUV planar surfaces st/dri: this adds a table (similar to the one in i965) which provides mappings for turning various planar formats into multiple sampler views. whereas only NV12 and IYUV were supported, now many more formats are supported here: * P0XX * YUV4XX * YVU4XX * AYUV * XYUV * YUYV * UYVY the table is used directly to handle image creation, simplifying a lot of code and resolving related TODO/FIXME items where workarounds were previously in place to manage NV12 and IYUV formats exclusively st/mesa: the changes here relate to setting up samplers for the planar formats. this requires: * checking for driver support for all the sampler formats * creating the samplers with the corresponding formats and swizzling * running nir_lower_tex with the appropriate options to trigger the lowering for each plane->sampler fixes kwg/mesa#36 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-31 09:50:06 -07:00
Mike Blumenkrantz	338a29b08f	gallium: add AYUV and XYUV formats this only adds the PIPE_FORMAT members, not any direct handling for them Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-31 09:50:06 -07:00
Alyssa Rosenzweig	7f75b2b5af	pan/midgard: Simplify discard logic The "branch offset" is, in fact, ignored. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-31 09:39:16 -07:00
Alyssa Rosenzweig	27524d1462	pan/midgard: Add units for more instructions For everything but freduce, we have some sense of what units the instruction takes. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-31 09:39:16 -07:00
Alyssa Rosenzweig	64235b1ecc	pan/midgard: Fix ball/bany opcode table These were seriously messed up beyond all recognition. How we're passing shaders.random.* is a mystery. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-31 09:39:16 -07:00
Alyssa Rosenzweig	13ee87c8b9	pan/midgard: Document branch combination LUT This took way longer to figure out than it should have. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-31 09:39:16 -07:00
Kenneth Graunke	2037478702	st/mesa: Skip scissor rect updates when scissor is entirely disabled. If any scissor rectangles are enabled, then we need to set proper scissor rectangles for all viewports. But if the scissor test is entirely disabled, then we can skip updating any scissor rectangles. Without this step, we were updating the scissor rectangles based on the current framebuffer size. So if an app rendered to a variety of render targets at different sizes, with scissor test disabled each time, we'd still be continually updating the scissor rectangles, even though it's not necessary. In Civilization VI, this drops us from 310-350 set_scissor_state calls per frame to 0, as it doesn't appear to use scissor testing. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-31 08:33:50 -07:00
Emil Velikov	72b97ad9b2	egl/drm: ensure the backing gbm is set before using it Currently, if we error out before gbm_dri is set (say due to a different name of the backing GBM implementation, or otherwise) the tear down will trigger a NULL ptr deref and crash out. Move the gbm_dri initialization as early as possible. v2: Drop check in dri2_teardowm_drm (Eric) Reported-by: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-31 14:18:12 +01:00
Eric Engestrom	4bf7e7b170	docs: update required meson version Fixes: `f7b6a8d12f` ("meson: bump required version to 0.46") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-31 11:50:39 +01:00
Samuel Pitoiset	c66021069e	radv/gfx10: implement a GE bug workaround Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-31 12:14:29 +02:00
Samuel Pitoiset	9a3fc7b6fa	radv/gfx10: remove an obsolete VGT_REUSE_OFF workaround Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-31 12:14:29 +02:00
Samuel Pitoiset	bb8f25233a	radv/gfx10: disable LATE_ALLOC_GS on Navi14 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-31 12:14:29 +02:00
Samuel Pitoiset	e041a74588	radv/gfx10: implement a bug workaround for GE_PC_ALLOC Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-31 12:14:29 +02:00
Samuel Pitoiset	0e1724af61	radv/gfx10: implement a bug workaround for NGG -> legacy transitions Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-31 12:14:29 +02:00
Samuel Pitoiset	29cca5f381	radv: skip draw calls with 0-sized index buffers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-31 12:14:29 +02:00
Eric Engestrom	fed6aa2fec	autotools: delete leftover script wrapper Randomly came across this file, which was likely only used by autotools to pass arguments to the test. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-31 10:16:30 +01:00
Eric Engestrom	53b98b0185	virgl: make use of local variable Otherwise that variable is only used in an assert() and would need an ASSERTED to avoid the warning. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	20c89b060f	mesa: add an ASSERTED Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	bbeb507543	compiler/nir: add an ASSERTED Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	7e2fe85a40	intel: add a couple of ASSERTED Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	abc226cf41	tree-wide: replace MAYBE_UNUSED with ASSERTED Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	ab9c76769a	r600: replace MAYBE_UNUSED with specific #ifdef Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	745bae40ad	gallium/aux: replace MAYBE_UNUSED with UNUSED MAYBE_UNUSED is going away, so let's replace legitimate uses of it with UNUSED, which the former aliased to so far anyway. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	513e67d2e4	mesa: replace MAYBE_UNUSED with UNUSED MAYBE_UNUSED is going away, so let's replace legitimate uses of it with UNUSED, which the former aliased to so far anyway. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	c8a453a770	v3d: replace MAYBE_UNUSED with UNUSED MAYBE_UNUSED is going away, so let's replace legitimate uses of it with UNUSED, which the former aliased to so far anyway. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	d470f1acce	v3d: drop incorrect MAYBE_UNUSED While at it, use that `screen` variable everywhere. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	84b8a50540	st/tests: drop incorrect MAYBE_UNUSED Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	aed15fa799	radv: drop incorrect MAYBE_UNUSED `compressed` is clearly always used on the line right after. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	21196ec927	r600: move variable to proper scope It helps show when it's actually used. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	5febd4d575	compiler: replace MAYBE_UNUSED with UNUSED MAYBE_UNUSED is going away, so let's replace legitimate uses of it with UNUSED, which the former aliased to so far anyway. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	bac5760e7b	mesa: drop MAYBE_UNUSED var Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	e1dd6c2575	anv: drop MAYBE_UNUSED var Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	644cca65d3	i965: drop unused MAYBE_UNUSED function Added in `1b85c605a6` but never used. Cc: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	7a3fb14609	i965: replace MAYBE_UNUSED with GEN condition Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	eee70e09bf	intel: replace MAYBE_UNUSED with UNUSED MAYBE_UNUSED is going away, so let's replace legitimate uses of it with UNUSED, which the former aliased to so far anyway. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	e775b938b2	intel: drop incorrect MAYBE_UNUSED All these are actually always used. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	14be04fb2b	egl: replace MAYBE_UNUSED with UNUSED MAYBE_UNUSED is going away, so let's replace legitimate uses of it with UNUSED, which the former aliased to so far anyway. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Samuel Pitoiset	ea38565011	radv/gfx10: add Wave32 support for compute shaders It can be enabled with RADV_PERFTEST=cswave32. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-31 09:35:04 +02:00
Kenneth Graunke	3a22a8bf49	iris: Skip repeated depth buffer disables. Often times, the depth buffer is entirely disabled, but color render targets change. For example, GenerateMipmaps will change the color render target for each miplevel, but there is no depth buffer. In the Civilization VI benchmark, this drops the median number of 3DSTATE_DEPTH_BUFFER etc. packets emitted per frame from 472 to 34.	2019-07-30 19:47:41 -07:00
Marek Olšák	665989d98b	radeonsi: release NIR in the right place to fix crashes	2019-07-30 22:06:23 -04:00
Marek Olšák	9ac7d0a0e2	radeonsi: fix packing of key.mono.u.ps	2019-07-30 22:06:23 -04:00
Marek Olšák	033c39a660	ac/nir: fix incorrect Phis if callbacks use control flow inside control flow	2019-07-30 22:06:23 -04:00
Marek Olšák	d3c80733cd	ac/nir: handle abs modifier	2019-07-30 22:06:23 -04:00
Marek Olšák	efe2d8c5f9	ac: fix a memory leak in the error path of ac_build_type_name_for_intr	2019-07-30 22:06:23 -04:00
Marek Olšák	f6eca14f1b	ac: allow control flow statements in NIR callbacks This fixes a crash when compiling geometry shaders on radeonsi.	2019-07-30 22:06:23 -04:00
Marek Olšák	bfea7e4d29	ac/nir: handle negate modifier	2019-07-30 22:06:23 -04:00
Marek Olšák	33a8eab7a9	radeonsi: don't use lp_build_if for the prim discard compute shader	2019-07-30 22:06:23 -04:00
Marek Olšák	5562b6b067	radeonsi: don't use lp_build_if for the wrapping if block in the VS prolog	2019-07-30 22:06:23 -04:00
Marek Olšák	0ef4c1c04d	radeonsi: don't use lp_build_if for the wrapping if block in merged shaders	2019-07-30 22:06:23 -04:00
Marek Olšák	6ec7d603f5	radeonsi: don't use lp_build_if (in most common places)	2019-07-30 22:06:23 -04:00
Marek Olšák	3406a57ff3	radeonsi: don't use lp_build_alloca	2019-07-30 22:06:23 -04:00
Marek Olšák	9234275320	radeonsi/nir: implement FBFETCH for KHR_blend_equation_advanced	2019-07-30 22:06:23 -04:00
Marek Olšák	925161c84c	radeonsi/nir: set input_interpolate_loc for color inputs Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-07-30 22:06:23 -04:00
Marek Olšák	5787bbf90d	radeonsi/nir: set tgsi_shader_info::num_memory_instructions	2019-07-30 22:06:23 -04:00
Marek Olšák	0993dbcbef	radeonsi/nir: accurately set input_usage_mask for doubles (v2) v2: fix doubles Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-30 22:06:23 -04:00
Marek Olšák	56e3c70b56	radeonsi/nir: accurately set output_usagemask (v2) v2: fix doubles	2019-07-30 22:06:23 -04:00
Marek Olšák	37527f8a11	radeonsi/nir: accurately set reads_*_outputs for TCS	2019-07-30 22:06:23 -04:00
Marek Olšák	6697e42c3c	radeonsi/nir: clean up gather_intrinsic_load_deref_input_info	2019-07-30 22:06:23 -04:00
Marek Olšák	5f16fdefdf	radeonsi/nir: add an option to convert TGSI to NIR Use at your own risk. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-30 22:06:23 -04:00
Marek Olšák	eb43559bb8	radeonsi/nir: clean up some nir_scan_shader code Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-30 22:06:23 -04:00
Marek Olšák	34dc6ed2a5	radeonsi/gfx10: disable DCC image stores Uncompressed image stores are usually faster. Also, the driver didn't set WRITE_COMPRESS_ENABLE, so I don't know what the hw did for image stores.	2019-07-30 22:06:23 -04:00
Marek Olšák	17021efc74	radeonsi: adjust RB+ blend optimization settings based on PAL	2019-07-30 22:06:23 -04:00
Marek Olšák	27ac9a3326	ac/surface: allow linear swizzle mode automatic selection on gfx9 & 10 let addrlib make the decision to get the same result as PAL.	2019-07-30 22:06:23 -04:00
Pierre-Eric Pelloux-Prayer	a0ac0e2653	mesa: add EXT_dsa indexed generic queries Only GetPointerIndexedvEXT needs an implementation, the other functions are aliases of existing functions.	2019-07-30 22:04:26 -04:00
Pierre-Eric Pelloux-Prayer	ef84d93f3d	mesa: add EXT_dsa indexed texture commands functions Added functions: - EnableClientStateIndexedEXT - DisableClientStateIndexedEXT - EnableClientStateiEXT - DisableClientStateiEXT Implemented using the idiom provided by the spec: if (array == TEXTURE_COORD_ARRAY) { int savedClientActiveTexture; GetIntegerv(CLIENT_ACTIVE_TEXTURE, &savedClientActiveTexture); ClientActiveTexture(TEXTURE0+index); XXX(array); ClientActiveTexture(savedClientActiveTexture); } else { // Invalid enum }	2019-07-30 22:04:26 -04:00
Pierre-Eric Pelloux-Prayer	7534c536ca	mesa: add EXT_dsa (Named)Framebuffer functions These functions don't support display lists, as specified: Should the selector-free versions of various OpenGL 3.0 and EXT_framebuffer_object framebuffer object commands not be allowed in display lists [...]? RESOLVED: Yes	2019-07-30 22:04:26 -04:00
Pierre-Eric Pelloux-Prayer	e26c6764f2	mesa: add EXT_dsa NamedBuffer functions	2019-07-30 22:04:26 -04:00
Jason Ekstrand	9265e9d11a	i965/curbe: Look at SYSTEM_VALUE_FRAG_COORD instead of VARYING_SLOT_POS When transitioning gl_FragCoord over to a system value, we missed one instance of VARYING_SLOT_POS in i965. As of this commit, i965 has no references to VARYING_SLOT_POS. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111263 Fixes: `4bb6e6817e` "intel: Use a system value for gl_FragCoord" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-30 19:21:09 -05:00
Jason Ekstrand	8fd2f2c276	intel/fs: Implement quad_swap_horizontal with a swizzle on gen7 This fixes dEQP-VK.subgroups.quad.compute.subgroupquadswaphorizontal_* on all gen7 platforms. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-30 22:38:19 +00:00
Jason Ekstrand	499d760c6e	intel/fs: Use ALIGN16 instructions for all derivatives on gen <= 7 The issue here was discovered by a set of Vulkan CTS tests: dEQP-VK.glsl.derivate.*.dynamic_* These tests use ballot ops to construct a branch condition that takes the same path for each 2x2 quad but may not be uniform across the whole subgroup. They then test that derivatives work and give the correct value even when executed inside such a branch. Because the derivative isn't executed in uniform control-flow and the values coming into the derivative aren't smooth (or worse, linear), they nicely catch bugs that aren't uncovered by simpler derivative tests. Unfortunately, these tests require Vulkan and the equivalent GL test would require the GL_ARB_shader_ballot extension which requires int64. Because the requirements for these tests are so high, it's not easy to test on older hardware and the bug is only proven to exist on gen7; gen4-6 are a conjecture. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-30 22:38:19 +00:00
Eric Engestrom	bf8b5de6b9	scons+meson: suppress spammy build warning on MacOS Originally introduced in `c7f3657450` ("darwin: Suppress type conversion warnings for GLhandleARB") to fix Bugzilla #66346 [1], this workaround was never ported to Scons or Meson. [1] https://bugs.freedesktop.org/66346 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-07-30 23:21:42 +01:00
Matt Turner	46a3ea06be	i965/fs: Print the scheduler mode. Line wrap some awfully long lines while we are here. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-07-30 14:35:43 -07:00
Matt Turner	dabb5d4bee	i965/fs: Add a shader_stats struct. It'll grow further, and we'd like to avoid adding an additional parameter to fs_generator() for each new piece of data. v2 (idr): Rebase on 17 months. Track a visitor instead of a cfg. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-30 14:35:43 -07:00
Connor Abbott	11a49f289d	lima/gp: Support exp2 and log2 log2 is tricky because there cannot be a move between complex1 and postlog2. We can't guarantee that scheduling complex1 will succeed when we schedule postlog2, so we try to schedule complex1 and if it fails we back out by rewriting the postlog2 as a move and introducing a new postlog2 so that we can try again later. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Qiang Yu <yuq825@gmail.com>	2019-07-30 23:01:15 +02:00
Connor Abbott	c2f48d8f32	lima/gpir: Always schedule complex2 and _impl right after complex1 See https://gitlab.freedesktop.org/lima/mesa/issues/94 for the gory details of why this is needed. For _impl this is easy, since it never increases register pressure and it goes in the complex slot hence it never counts against max nodes. It's a bit more challenging for complex2, since it does count against max nodes, so we need to change the reservation logic to reserve an extra slot for complex2 when scheduling complex1. This second part isn't strictly necessary yet, but it will be for exp2. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Qiang Yu <yuq825@gmail.com>	2019-07-30 23:00:41 +02:00
Bas Nieuwenhuizen	2b53c49d2f	radv: Fix descriptor set allocation failure. Set all the handles to VK_NULL_HANDLE: "If the creation of any of those descriptor sets fails, then the implementation must destroy all successfully created descriptor set objects from this command, set all entries of the pDescriptorSets array to VK_NULL_HANDLE and return the error." (Vulkan 1.1.117 Spec, section 13.2) CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-30 22:33:24 +02:00
Andres Rodriguez	2b71b4e793	radv: fix queries with WAIT_BIT returning VK_NOT_READY When vkGetQueryPoolResults() is called with VK_QUERY_RESULT_WAIT_BIT set, the driver is supposed to wait for the query to become available before returning. Currently, radv returns once the query is indeed ready, but it returns VK_NOT_READY. It also fails to populate the results. The problem is a missing volatile in the secondary check for query availability. This patch removes the secondary check altogether since it is redundant with the preceding loop. This bug was found with an unreleased version of SteamVR. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-27 10:19:19 -04:00
Matt Turner	c9b86cf526	meson: Test for program_invocation_name program_invocation_name and program_invocation_short_name are both GNU extensions. I don't believe one can exist without the other, so only check for program_invocation_name. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-30 11:49:09 -07:00
Matt Turner	597bddad47	scons: Test for random_r() Suggested-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-30 11:49:09 -07:00
Matt Turner	c96407f37e	meson: Test for random_r() It's better to test for needed functions instead of using external knowledge about presence in this or that C library. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-30 11:49:09 -07:00
Matt Turner	9cc4311d86	st/nine: Drop preprocessor guards for glibc-2.12 Same rationale as the previous patch, but additionally these checks just seem entirely unnecessary. pthread_self() has been used in Mesa since at least 1999. Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-30 11:49:09 -07:00
Matt Turner	9c411e020d	util: Drop preprocessor guards for glibc-2.12 glibc-2.12 was released in 2010. No one is building new Mesa against 9 year old glibc, and removing these checks allows the code to work on other C libraries like musl. Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-30 11:49:09 -07:00
Alyssa Rosenzweig	a3c59f9f00	pan/midgard: Nothing to see here, move along folks Fixes: `dee1e18fe4` ("pan/midgard: Cleanup ops table") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-30 10:49:13 -07:00
Lionel Landwerlin	7deb5ec0e8	spirv: don't discard access set by vtn_pointer_dereference We can have an access flag already set here so just augment the existing ones. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `0fb61dfdeb` ("spirv: propagate access qualifiers through ssa & pointer") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-30 17:43:59 +00:00
Sagar Ghuge	587a497529	iris: Enable EXT_texture_shadow_lod Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-30 10:42:20 -07:00
Sagar Ghuge	adb9e18348	gallium: Add PIPE_CAP_TEXTURE_SHADOW_LOD v2: Line wrap to 80 char (Marek Olsak) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-30 10:42:20 -07:00
Sagar Ghuge	6e04bd5f13	i965: Enable EXT_texture_shadow_lod Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-30 10:42:20 -07:00
Paulo Zanoni	25b03526c4	glsl: Add builtin functions for EXT_texture_shadow_lod With the help of Sagar, Ian and Ivan. v2: Fix dependencies (Ian Romanick) v3: 1) fix function name (Marek Olsak) 2) Add check for extension enable (Marek Olsak) Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-30 10:42:20 -07:00
Paulo Zanoni	154c789ad5	glsl: Allow _textureCubeArrayShadow function to accept ir_texture_opcode This will be used to support one of the functions from the EXT_texture_shadow_lod specification. With the help of Sagar, Ian and Ivan. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-30 10:42:20 -07:00
Paulo Zanoni	d80a74fb99	mesa: extension boilerplate for EXT_texture_shadow_lod With the help of Sagar, Ian and Ivan. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-30 10:42:20 -07:00
Alyssa Rosenzweig	dee1e18fe4	pan/midgard: Cleanup ops table Hopefully this should make a few ops make more sense. No functional changes. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-30 10:35:22 -07:00
Alyssa Rosenzweig	834aeb1e52	pan/midgard: Extend copy-propagation to swizzles We can compose them when we rewrite, which is.. more code.. but helps. total instructions in shared programs: 3611 -> 3513 (-2.71%) instructions in affected programs: 672 -> 574 (-14.58%) helped: 11 HURT: 2 helped stats (abs) min: 2 max: 14 x̄: 9.09 x̃: 10 helped stats (rel) min: 5.71% max: 24.56% x̄: 17.99% x̃: 18.87% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.19% max: 2.08% x̄: 1.64% x̃: 1.64% 95% mean confidence interval for instructions value: -10.45 -4.62 95% mean confidence interval for instructions %-change: -20.07% -9.87% Instructions are helped. total bundles in shared programs: 2117 -> 2067 (-2.36%) bundles in affected programs: 356 -> 306 (-14.04%) helped: 11 HURT: 0 helped stats (abs) min: 1 max: 7 x̄: 4.55 x̃: 5 helped stats (rel) min: 4.55% max: 15.22% x̄: 13.63% x̃: 14.71% 95% mean confidence interval for bundles value: -5.64 -3.45 95% mean confidence interval for bundles %-change: -15.71% -11.55% Bundles are helped. total quadwords in shared programs: 3567 -> 3468 (-2.78%) quadwords in affected programs: 695 -> 596 (-14.24%) helped: 11 HURT: 1 helped stats (abs) min: 2 max: 14 x̄: 9.09 x̃: 10 helped stats (rel) min: 5.56% max: 21.88% x̄: 14.97% x̃: 15.15% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 2.38% max: 2.38% x̄: 2.38% x̃: 2.38% 95% mean confidence interval for quadwords value: -10.96 -5.54 95% mean confidence interval for quadwords %-change: -17.42% -9.63% Quadwords are helped. total registers in shared programs: 391 -> 383 (-2.05%) registers in affected programs: 46 -> 38 (-17.39%) helped: 9 HURT: 1 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 25.00% max: 25.00% x̄: 25.00% x̃: 25.00% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 10.00% max: 10.00% x̄: 10.00% x̃: 10.00% 95% mean confidence interval for registers value: -1.25 -0.35 95% mean confidence interval for registers %-change: -29.42% -13.58% Registers are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-30 10:35:10 -07:00
Alyssa Rosenzweig	c45487b770	pan/midgard: Extract simple source mod check Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-30 10:35:09 -07:00
Alyssa Rosenzweig	2d2abb08d0	pan/midgard: Lower texr/texw mixed registers Conceptually, r28-r29 (as used for reading) and r28-r29 (as used for writing) aren't registers at all, merely push/pull arrangements. So you can't feed a texture result back into itself without explicitly moving in the middle. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-30 10:01:20 -07:00
Alyssa Rosenzweig	2b248af43e	pan/midgard: Always set .cont for derivatives in loops We need to keep the helper invocations alive. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig	8f887329c0	pan/midgard: Implement derivatives Implement the fdd* and fdd* opcodes in the Midgard compiler. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig	982134d22e	pan/midgard: Compose original texture swizzle in RA Used for lowering derivatives. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig	79875a9a64	pan/midgard: Add new swizzles Used for derivatives. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig	81e7782e30	pan/midgard: Add OP_IS_DERIVATIVE helper Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig	ae6aea0d98	pan/midgard: Add make_compiler_temp_reg helper Corollary to make_compiler_temp (for SSA). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig	30b15a830a	pan/midgard: Move nir_*_src_index to compiler.h These helpers are useful for code emission everywhere. Share the love! Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig	c9498b3c5e	pan/midgard: Disassemble unknown texture ops as hex I'm not sure why I ever thought decimal was a good idea. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig	0714481894	pan/midgard: Add support for disassembling derivatives They're just texture ops. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-30 10:01:19 -07:00
Connor Abbott	a094928abc	nir/find_array_copies: Use correct parent array length instr->type is the type of the array element, not the type of the array being dereferenced. Rather than fishing out the parent type, just use parent->num_children which should be the length plus 1. While we're here add another assert for the issue fixed by the previous commit. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111251 Fixes: `156306e5e6` ("nir/find_array_copies: Handle wildcards and overlapping copies") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-30 17:14:33 +02:00
Connor Abbott	7788992bc6	nir: Fix comparison for nir_deref_instr_is_known_out_of_bounds() There was an off-by-one error. Fixes: `156306e5e6` ("nir/find_array_copies: Handle wildcards and overlapping copies") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-30 17:14:28 +02:00
Samuel Pitoiset	9d7ead6f9b	radv/gfx10: only compile the GS copy shader on-demand Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-30 16:51:30 +02:00
Michel Dänzer	5229f27f06	gitlab-ci: Fix scons build directory path Fixes: `dd3d0b2897` "gitlab-ci: Only keep the build logs as artifacts." Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-30 16:18:50 +02:00
Jan Zielinski	4d2890e8f7	swr/rasterizer: Add memory tracking support Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-07-30 15:58:36 +02:00
Jan Zielinski	5dd9ad1570	swr/rasterizer: Better implementation of scatter Added support for avx512 scatter instruction. Non-avx512 will now call into a C function to do the scatter emulation. This has better jit compile performance than the previous approach of jitting scalar loops. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-07-30 13:39:19 +00:00
Jan Zielinski	ad9aff5528	swr/rasterizer: cleanups for tessellation This commit introduces small fixes in preparation for tessellation support. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-07-30 13:39:18 +00:00
Jan Zielinski	c5c05979f7	rasterizer/swr: move BucketMgr to SwrContext This move gets us back to parity with global manager in that we can dump render context buckets now. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2019-07-30 13:39:18 +00:00
Alejandro Piñeiro	cda4c62893	v3d: take into account separate_stencil when checking if stencil should be cleared In most cases this is not needed because, usually, when a separate stencil is written, the parent resource is also written. This is needed if we have a separate stencil, no depth buffer, and the source and destination are the same, as in that case the stencil can be updated, but not the parent source (like if you are blitting only the stencil buffer). In that situation, the following access to the stencil buffer would clear the stencil buffer (so overwriting the previous blit) because the parent source has v3d_resource.writes set to 0. As far as I see, that situation only happens with the GL_DEPTH32F_STENCIL8 format. Note that one alternative would be to consider that if the separate_stencil has been written, the parent should also be considered written (and update its "writes" field accordingly). But I found this patch more natural. Fixes the following piglit tests: spec/arb_depth_buffer_float/fbo-stencil-gl_depth32f_stencil8-blit spec/arb_depth_buffer_float/fbo-stencil-gl_depth32f_stencil8-copypixels the latter regressed when internally glCopyPixels implementation started to use blitting. So: Fixes: `131d40cfc9` ("st/mesa: accelerate glCopyPixels(STENCIL)") Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-30 12:05:23 +02:00
Daniel Schürmann	45638e14fb	radv: Don't include radv_private.h from radv_shader.h This patch decouples radv_shader.h from any LLVM dependency. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-30 10:29:11 +02:00
Rafael Antognolli	f27908152b	i965/gen10: Remove unnecessary workaround. In fact, the description of the workaround states that the mask field doesn't work correctly on gen10, and we need to set it to 0xffff even when we only want to update a single field: "The mask bits are not implemented properly on 3DSTATE_3D_MODE. Driver must always program bits 31:16 of DW1 a value of 0xFFFF. This means if it is only updating 1 field, it must update all the fields to the correct value." So unless we want to change any of the fields of 3DSTATE_3D_MODE, there's no need to emit it. Additionally, it seems this workaround is not required on gen11. And last but not least, this workaround is not implemented on iris or anv, and it doesn't seem to be missed there. So let's just remove the whole thing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-29 16:54:17 -07:00
Kenneth Graunke	44e713eddb	iris: Fix SO offset to be 32-bit in DrawTransformFeedback handling We accidentally started copying a full 64-bit value rather than copying a 32-bit offset and zeroing the top 32-bits. This caused us to compute bogus vertex counts which could lead to GPU hangs in some cases. Thanks to Clayton Craft for catching the regressions! Fixes: `0e24d10ff5` ("iris: Use gen_mi_builder to handle CS ALU operations.")	2019-07-29 16:38:19 -07:00
Jason Ekstrand	4bb6e6817e	intel: Use a system value for gl_FragCoord It's kind-of an anomaly that the Intel drivers are still treating gl_FragCoord as an input. It also makes zero sense because we have to special-case it in the back-end. Because ANV is the only user of nir_lower_wpos_center, we go ahead and just update it to look for nir_intrinsic_load_frag_coord as part of this patch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-29 23:30:26 +00:00
Jason Ekstrand	44268b1c72	glsl: Treat gl_FragCoord as a varying even when it's a system value This fixes glsl-fcoord-invariant-pass.shader_test on drivers that set GLSLFragCoordIsSysVal which includes radeonsi among others. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-29 23:30:26 +00:00
Jason Ekstrand	169d896df2	mesa/spirv: Set frag_coord_is_sysval to GLSLFragCoordIsSysVal Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-29 23:30:26 +00:00
Jason Ekstrand	e401303597	intel/fs: Remove calculate_urb_setup from fs_visitor Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-29 23:30:26 +00:00
Rob Clark	010d255656	freedreno/a6xx: fix MSAA resolve hangs Seems like RB_BLIT_SCISSOR needs to be aligned to (minimum?) tile size. Fixes intermittent GPU hangs triggered by some of the three.js samples on https://threejs.org/ Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-07-29 15:15:31 -07:00
Rob Clark	73cc2dc084	freedreno/ir3: fix for array/reg store vs meta instructions fishgl.com has a shader which does roughly: foo = texture(...); if (bar) foo = texture(...); after lowering phi webs to regs we end up w/ a vec4 reg (array). But since it was not an indirect access, we try to skip the extra mov. As a result, the per-component fanout (split) meta instructions store directly to the reg (array), which doesn't work out in RA. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-07-29 15:15:31 -07:00
Eric Engestrom	f7b6a8d12f	meson: bump required version to 0.46 0.45 has a few annoying bugs (like the one in !358 [1]), and 0.46 is well over a year old by now, so let's move to it. [1] https://gitlab.freedesktop.org/mesa/mesa/merge_requests/358 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-07-29 23:07:30 +01:00
Leo Liu	8d7f2e2221	radeon/vcn/vp9: add Arcturus VP9 support Arcturus CHIP enum is less than Navi10, since it's still gfx9, but its VCN version belongs to VCN2.x Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-29 17:52:58 -04:00
Leo Liu	a439863918	radeon/vcn: add Arcturus decode support different internal registers offset from previous HW Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-29 17:52:56 -04:00
Marek Olšák	7708540363	amd: add support for Arcturus Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-29 17:52:54 -04:00
Marek Olšák	417ab8ef6b	radeonsi: add AMD_DEBUG=nogfx for testing Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-29 17:52:53 -04:00
Marek Olšák	19d04191c4	radeonsi: add support for compute-only chips Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-29 17:52:51 -04:00
Sonny Jiang	c82f338855	gallium/auxiliary/vl: add compute shaders for deint yuv Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-29 17:52:49 -04:00
Sonny Jiang	ef77a92bca	gallium/auxiliary/vl: don't call gfx functions on compute-only chips Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-29 17:52:46 -04:00
James Zhu	b618b65c98	gallium/auxiliary/vl: add PIPE_CAP_GRAPHICS check for vl compositor Init graphics shader only when PIPE_CAP_GRAPHICS is true. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-29 17:52:42 -04:00
Marek Olšák	187cc07d05	gallium: create multimedia contexts as compute-only if graphics is unsupported Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-29 17:52:41 -04:00
Marek Olšák	ea7646dc13	gallium: add PIPE_CAP_GRAPHICS Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-29 17:52:39 -04:00
Samuel Pitoiset	372c3dcfdb	radv: implement VK_EXT_index_type_uint8 Natively supported on VI+. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-29 23:36:53 +02:00
Lionel Landwerlin	c6196f7025	anv: implement VK_EXT_index_type_uint8 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-29 21:26:07 +00:00
Lionel Landwerlin	0d3a532a33	vulkan: Bump headers to 1.1.117 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-29 21:26:07 +00:00
Lionel Landwerlin	161b5f00db	include/vulkan: bump vk_android_native_buffer Taken off https://android.googlesource.com/platform/frameworks/native/+/refs/tags/android-9.0.0_r45/vulkan/include/vulkan/vk_android_native_buffer.h Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-29 21:26:07 +00:00
Eric Engestrom	8486dbb066	intel/mi: only resolve to a temp register if source isn't in memory aka. fix a s/||/&&/ typo Fixes: `74063ee61a` ("intel/mi: Add a new gen_mi_store_if() helper.") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-29 13:35:42 -07:00
Eric Anholt	5596038e2f	gitlab-ci: Enable freedreno shader-db runs. Now that helgrind is less upset and I've completed many successful full shader-db runs, we should be able to enable freedreno shader-db runs for Mesa checkins on the tiny public shader-db. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-07-29 12:52:39 -07:00
Eric Anholt	3c46778b75	nir: Fix helgrind complaints about data race in trivial_swizzle init. Even if the data race wasn't real (I'm not great at reasoning about this), helgrind is a nice enough tool that keeping noise out of it is probably worthwhile. Besides, typing out the numbers keeps the data in the read-only data section instead of emitting code to initialize it every time. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-07-29 12:50:49 -07:00
Eric Anholt	91986fbbdb	freedreno: Fix data race on making the shader's id. The value is only used for IR3_DBG_DISASM, but it cleans up the helgrind output. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-07-29 12:50:49 -07:00
Eric Anholt	6f0521b78c	freedreno: Take a lock around shader variant creation. Shaders are shared across contexts in gallium (part of making it so that you get shader compile at link time, for shader-db and to reduce compiles at draw time). So, we need to protect from variant creation for a shader from multiple threads at the same time. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-07-29 12:50:49 -07:00
Eric Anholt	6e3b220ad3	freedreno: Fix data races with allocating/freeing struct ir3. There is a single ir3_compiler in the screen, and each context may be compiling ir3 shaders, which call ir3_create. ralloc doesn't do any locking on its own, so eventually you can end up racing to break ralloc's linked lists. We really don't want struct ir3 to live as long as the compiler (maybe struct ir3_shader's lifetime, if anything), so you'd better be freeing it anyway. Fixes: `8fe2076243` ("freedreno/ir3: convert over to ralloc") Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-07-29 12:50:49 -07:00
Eric Anholt	65aeeae670	freedreno: Fix helgrind complaint on shader-db key setup. If the variable's going to be static, we shouldn't be memsetting it from every thread and instead just have it in the data section. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-07-29 12:50:49 -07:00
Bas Nieuwenhuizen	aac492901a	radv: Take variable descriptor counts into account for buffer entries. Fixes: `b5e04e9217` "radv: Support allocating variable size descriptor sets." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111019 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-29 20:42:53 +02:00
Jason Ekstrand	99d04a5bd6	anv: Don't claim support for 24 and 48-bit formats on IVB Cc: mesa-stable@lists.freedesktop.org	2019-07-29 11:34:30 -05:00
Jason Ekstrand	7c1b39cf18	isl/formats: R8G8B8_UNORM_SRGB isn't supported on HSW On Haswell, the format works but it doesn't properly do an sRGB decode. It appears to act identically to R8G8B8_UNORM. Only Vulkan uses this format so this only affects Vulkan on HSW. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-07-29 11:34:18 -05:00
Alyssa Rosenzweig	463164b325	pan/midgard: Fix alpha test w.r.t new indexing Fixes: `9beb3391b5` ("pan/midgard: Tag SSA/reg") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-29 08:31:03 -07:00
Gert Wollny	4ee638cd78	softpipe: Don't draw when rasterizer_discard is set Fixes: dEQP-GLES3.functional.rasterizer_discard.basic.write_depth_points dEQP-GLES3.functional.rasterizer_discard.basic.write_stencil_points dEQP-GLES3.functional.rasterizer_discard.fbo.write_depth_points dEQP-GLES3.functional.rasterizer_discard.fbo.write_stencil_points dEQP-GLES3.functional.rasterizer_discard.scissor.write_depth_points dEQP-GLES3.functional.rasterizer_discard.scissor.write_stencil_points Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-29 15:47:34 +02:00
Gert Wollny	45ac0dfad4	softpipe: Fix cube arrays layer selection To select the correct layer the z-coordinate must be rounded before it is multiplied by six. Fixes a number of tests out of dEQP-GLES31.functional.texture.filtering.cube_array.formats.* Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-29 15:47:34 +02:00
Lionel Landwerlin	6659d11ff0	vulkan/wsi/wayland: implement acquire timeout v2: Eric's nits v3: Reuse timespec utils (Daniel) Deal with ppoll being interrupted by a signal (Daniel) v4: Remove unnecessary time check v5: Deal with EAGAIN from wl_display_prepare_read_queue() (Daniel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2) Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-07-29 13:11:36 +00:00
Lionel Landwerlin	d2d70c3bb5	util: add a timespec helper Copied from Weston, upon Daniel's suggestion Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Suggested-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-07-29 13:11:36 +00:00
Eric Engestrom	ef57fb2350	intel: replace large stack buffer with heap allocation For now, this keeps the "100 bytes" allocation; we can try to figure out the correct size as a follow up. Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-29 13:58:57 +01:00
Samuel Pitoiset	58ee973e87	radv/gfx10: do not use the fast depth or stencil clear bytes path It causes issues on GFX10. This fixes rendering issues with vkmark and Wreckfest at least. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-29 14:47:13 +02:00
Samuel Pitoiset	4aa450193b	ac: do not crash when the buffer data format is invalid This might happen when a pipeline doesn't define the vertex input state, so the buffer data format is 0 (aka INVALID). This fixes crashes when compiling some shaders on GFX10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-29 13:19:32 +02:00
Rhys Perry	a9f58af454	ac/nir: fix txf_ms with an offset Seems to fix some hair artifacts in Max Payne 3: https://github.com/daniel-schuermann/mesa/issues/76 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `f4e499ec79` ('radv: add initial non-conformant radv vulkan driver') Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-29 11:50:13 +01:00
Connor Abbott	a69ab1b7d2	radv: Delete unused local variables in optimization loop Totals from affected shaders: SGPRS: 376 -> 376 (0.00 %) VGPRS: 620 -> 560 (-9.68 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 292 -> 292 (0.00 %) dwords per thread Code Size: 20024 -> 20144 (0.60 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 25 -> 25 (0.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-29 11:37:46 +02:00
Connor Abbott	156306e5e6	nir/find_array_copies: Handle wildcards and overlapping copies This commit rewrites opt_find_array_copies to be able to handle an array copy sequence with other intervening operations in between. In particular, this handles the case where we OpLoad an array of structs and then OpStore it, which generates code like: foo[0].a = bar[0].a foo[0].b = bar[0].b foo[1].a = bar[1].a foo[1].b = bar[1].b ... that wasn't recognized by the previous pass. In order to correctly handle copying arrays of arrays, and in particular to correctly handle copies involving wildcards, we need to use a tree structure similar to lower_vars_to_ssa so that we can walk all the partial array copies invalidated by a particular write, including ones where one of the common indices is a wildcard. I actually think that when factoring in the needed hashing/comparing code, a hash table based approach wouldn't be a lot smaller anyways. All of the changes come from tessellation control shaders in Strange Brigade, where we're able to remove the DXVK-inserted copy at the beginning of the shader. These are the result for radv: Totals from affected shaders: SGPRS: 4576 -> 4576 (0.00 %) VGPRS: 13784 -> 5560 (-59.66 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 8696 -> 6876 (-20.93 %) dwords per thread Code Size: 329940 -> 263268 (-20.21 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 330 -> 898 (172.12 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-29 11:36:25 +02:00
Connor Abbott	c6543efe7a	nir: Print array deref indices as decimal We print the size as decimal too, and using hex without a leading "0x" was very confusing. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-29 11:36:19 +02:00
Connor Abbott	6fc7384fd4	lima/gpir/sched: Handle more special ops in can_use_complex() We were missing handling for a few other ops that rearrange their sources somehow in codegen, namely complex2 and select. This should fix spec@glsl-1.10@execution@built-in-functions@vs-asin-vec3 and possibly other random regressions from the new scheduler which were supposed to be fixed in the commit right after. Fixes: `54434fe670` ("lima/gpir: Rework the scheduler") Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Qiang Yu <yuq825@gmail.com>	2019-07-28 23:38:31 +02:00
Connor Abbott	af95f80a24	lima/gp: Clean up lima_program_optimize_vs_nir() a little Remove an unnecessary nir_lower_regs_to_ssa as that should be done by the state tracker, and add a missing DCE pass after running copy propagation in order to remove the dead copies. This shouldn't fix anything but the second part will reduce shader sizes. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-07-28 23:38:31 +02:00
Connor Abbott	d26d8c5617	lima/gpir/sched: Don't try to spill when something else has succeeded In try_node(), we assume that the node we pick can still be scheduled successfully after speculatively trying all the other nodes. Normally we always undo every node after speculating it, so that when we finally schedule best_node the scheduler state is exactly the same and it succeeds. However, we also try to spill nodes, which can change the state and in a corner case that can make scheduling best_node fail. In particular, the following sequence of events happened with piglit shaders@glsl-vs-if-nested: a partially-ready node N was spilled and a register store node S, which is a use of N, was created and then later the other uses of N were scheduled, so that S is now ready and N is partially ready. First we try to schedule S and succeed, then we try to schedule another node M, which fails, so we try to spill the remaining uses of N. This succeeds, but scheduling M still fails so that best_node is still S. However since one of the uses of N is one cycle ago, and therefore we inserted a read dependent on S one cycle ago when spilling N, S can no longer be scheduled as read-after-write latency is three cycles. While we could ad-hoc try to catch cases like this, or (the best option but very complicated) treat the spill as speculative and roll it back if we decide not to schedule the node, a simpler solution is to just give up on spilling if we've already successfully speculatively scheduled another node. We'd give up a few cases where we discover that by spilling even harder we could schedule a more desirable node, but that seems like it would be pretty rare in practice. With this we guarantee that nothing has been touched after best_node was successfully scheduled. We also cut down on pointless spilling, since if we already scheduled a node it's unlikely that spilling harder will let us schedule an even better node, and hence any spilling at this point is probably useless. 
While we're here, clean up the code around spilling by flattening the two if's and getting rid of the second unnecessary check for INT_MIN. Fixes: `54434fe670` ("lima/gpir: Rework the scheduler") Acked-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2019-07-28 23:38:31 +02:00
Ilia Mirkin	de17922b8a	nv50/ir: don't consider the main compute function as taking arguments With OpenCL, kernels can take arguments and return values (?). However in practice, there is no more TGSI compute implementation, and even if there were, it would probably have named functions and no explicit main. This improves RA considerably for compute shaders, since temps are not kept around as return values. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-07-27 18:24:11 -04:00
Ilia Mirkin	3e468ff2fe	nv50/ir: handle insn not being there for definition of CVT arg This can happen if it's e.g. a uniform or a function argument. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Cc: mesa-stable@lists.freedesktop.org	2019-07-27 18:24:11 -04:00
Ilia Mirkin	23dfff0669	nouveau: flip DEBUG -> !NDEBUG The meson conversion chose to change the meaning of DEBUG from "used for debugging" to "used for expensive things for debugging", primarily for nir_validate. Flip things over so that we get nice things with optimizations enabled. While we're at it, also kill off nouveau_statebuf.h which is unused (and has a mention of DEBUG which is how I found it). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-07-27 18:24:11 -04:00
Ilia Mirkin	9f8ed5aa67	nvc0: allow a non-user buffer to be bound at position 0 Previously the code only handled it for positions 1 and up (as would be for UBO's in GL). It's not a lot of trouble to handle this, and vl or vdpau want this. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Cc: mesa-stable@lists.freedesktop.org	2019-07-27 18:24:11 -04:00
Ilia Mirkin	c52b057e00	nv50,nvc0: update sampler/view bind functions to accept NULL array Apparently vl (or vdpau) wants to pass that in now. Handle it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Cc: mesa-stable@lists.freedesktop.org	2019-07-27 18:24:11 -04:00
Ilia Mirkin	face27fdc5	gallium/vl: fix compute tgsi shaders to not process undefined components This caused nouveau's function handling logic to think that the MAIN function was due to receive external parameters, and cascaded some failures after that. Instead avoid having the undefined components in the first place. Fixes: `f6ac0b5d71` (gallium/auxiliary/vl: Add compute shader to support video compositor render) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-27 18:24:11 -04:00
Alyssa Rosenzweig	159abd527e	pan/midgard: Introduce invert field This will enable us to fuse inverts in various ways. Marginal hurt: total instructions in shared programs: 3610 -> 3611 (0.03%) instructions in affected programs: 67 -> 68 (1.49%) helped: 0 HURT: 1 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 13:38:41 -07:00
Alyssa Rosenzweig	9beb3391b5	pan/midgard: Tag SSA/reg Rather than putting registers after SSA in the MIR indexing, put them side-by-side, shifted 1, using the bottom bit as the SSA/reg select. This will allow us to generate SSA temps in the compiler. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 13:38:41 -07:00
Boyuan Zhang	b0626c1f30	radeon/vcn: enable rate control for hevc encoding Set cu_qp_delta_enable_flag on when rate control is enabled, and set it off when rate control is disabled (e.g. constant qp). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673 Cc: mesa-stable@lists.freedesktop.org V2: fix typo and add bugzilla info Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2019-07-26 14:33:09 -04:00
Boyuan Zhang	5115c25bb8	radeon/uvd: enable rate control for hevc encoding Set cu_qp_delta_enable_flag on when rate control is enabled, and set it off when rate control is disabled (e.g. constant qp). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673 Cc: mesa-stable@lists.freedesktop.org V2: fix typo and add bugzilla info Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2019-07-26 14:33:09 -04:00
Boyuan Zhang	9aaf3aaf5d	radeon/vcn: fix poc for hevc encode MaxPicOrderCntLsb should be at least 16 according to the spec, therefore add minimum value check. Also use poc value passed from st instead of calculation in slice header encoding. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673 Cc: mesa-stable@lists.freedesktop.org V2: Fix typo V3: Use MAX2 macro instead of coding. Also MaxPicOrderCntLsb should be power of 2 according to spec. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2019-07-26 14:33:09 -04:00
Boyuan Zhang	77cf700fa3	radeon/uvd: fix poc for hevc encode MaxPicOrderCntLsb should be at least 16 according to the spec, therefore add minimum value check. Also use poc value passed from st instead of calculation in slice header encoding. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673 Cc: mesa-stable@lists.freedesktop.org V2: Fix typo V3: Use MAX2 macro instead of coding. Also MaxPicOrderCntLsb should be power of 2 according to spec. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2019-07-26 14:33:09 -04:00
Sagar Ghuge	d5992ab134	nir: Optimize umod lowering We don't have to calculate the final quotient in order to calculate the unsigned modulo result. Once we are done with error correction, we have a partial result which can be used to find the result of the modulo operation. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-26 11:19:23 -07:00
Alyssa Rosenzweig	f8c71d7632	pan/midgard: Improve scheduling Make scalar scheduling onto vector units more aggressive (it can only help while we schedule strictly in order). Also, allow imov on VLUT. total bundles in shared programs: 2176 -> 2117 (-2.71%) bundles in affected programs: 901 -> 842 (-6.55%) helped: 24 HURT: 0 helped stats (abs) min: 1 max: 18 x̄: 2.46 x̃: 2 helped stats (rel) min: 2.08% max: 20.00% x̄: 8.68% x̃: 5.94% 95% mean confidence interval for bundles value: -3.93 -0.99 95% mean confidence interval for bundles %-change: -10.92% -6.45% Bundles are helped. total quadwords in shared programs: 3605 -> 3566 (-1.08%) quadwords in affected programs: 1984 -> 1945 (-1.97%) helped: 28 HURT: 5 helped stats (abs) min: 1 max: 3 x̄: 1.68 x̃: 2 helped stats (rel) min: 1.02% max: 14.29% x̄: 5.12% x̃: 2.94% HURT stats (abs) min: 1 max: 3 x̄: 1.60 x̃: 1 HURT stats (rel) min: 0.57% max: 9.09% x̄: 6.40% x̃: 9.09% 95% mean confidence interval for quadwords value: -1.67 -0.69 95% mean confidence interval for quadwords %-change: -5.37% -1.37% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 10:28:46 -07:00
Alyssa Rosenzweig	94e281b9e0	pan/midgard: Specialize mod checking by type when checking constants Fixes inlining of integer constants. total quadwords in shared programs: 3585 -> 3568 (-0.47%) quadwords in affected programs: 625 -> 608 (-2.72%) helped: 13 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.31 x̃: 1 helped stats (rel) min: 1.27% max: 9.52% x̄: 3.84% x̃: 2.94% 95% mean confidence interval for quadwords value: -1.60 -1.02 95% mean confidence interval for quadwords %-change: -5.60% -2.07% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 09:47:40 -07:00
Alyssa Rosenzweig	e823d33e77	pan/midgard: Use more aggressive writeout criteria We loosen the requirement of "no dependencies" to simply be "no non-pipelined dependencies", so we check for what could be pipelined. total bundles in shared programs: 2176 -> 2156 (-0.92%) bundles in affected programs: 779 -> 759 (-2.57%) helped: 20 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.33% max: 20.00% x̄: 6.47% x̃: 2.78% 95% mean confidence interval for bundles value: -1.00 -1.00 95% mean confidence interval for bundles %-change: -9.44% -3.50% Bundles are helped. total quadwords in shared programs: 3605 -> 3585 (-0.55%) quadwords in affected programs: 1391 -> 1371 (-1.44%) helped: 20 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.19% max: 14.29% x̄: 3.84% x̃: 1.64% 95% mean confidence interval for quadwords value: -1.00 -1.00 95% mean confidence interval for quadwords %-change: -5.73% -1.94% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 09:47:40 -07:00
Alyssa Rosenzweig	c7fc5f3567	pan/midgard: Pipeline non-SSA registers Rather than bailing if we see something that's not SSA, do out the analysis to check if we can pipeline and do so if we can. total registers in shared programs: 392 -> 391 (-0.26%) registers in affected programs: 3 -> 2 (-33.33%) helped: 1 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 09:40:10 -07:00
Alyssa Rosenzweig	79f0896491	pan/midgard: Add mir_mask_of_read_components helper This facilitates analysis of vec4 registers (after going out-of-SSA). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 09:37:28 -07:00
Alyssa Rosenzweig	481447cb00	pan/midgard: Add mir_is_written_before helper Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 09:20:52 -07:00
Alyssa Rosenzweig	95732cc9ef	pan/midgard: Obey fragment writeout criteria Rather than always emitting an extra move for fragments, check the actual criteria and emit accordingly. (This was lost during the RA improvements at the end of May). total bundles in shared programs: 2210 -> 2176 (-1.54%) bundles in affected programs: 501 -> 467 (-6.79%) helped: 34 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.59% max: 33.33% x̄: 13.13% x̃: 12.50% 95% mean confidence interval for bundles value: -1.00 -1.00 95% mean confidence interval for bundles %-change: -16.06% -10.21% Bundles are helped. total quadwords in shared programs: 3639 -> 3605 (-0.93%) quadwords in affected programs: 795 -> 761 (-4.28%) helped: 34 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.96% max: 33.33% x̄: 11.22% x̃: 8.33% 95% mean confidence interval for quadwords value: -1.00 -1.00 95% mean confidence interval for quadwords %-change: -14.31% -8.13% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 08:37:09 -07:00
Alyssa Rosenzweig	20771ede1c	pan/midgard: Add post-RA move elimination Think of this pass as register coalescing part 2. After RA runs, but before scheduling, we scan for code of the form: mov rN, rN and delete the move, since it's totally redundant. This pass helps already, but it'd of course be much more effective paired with register coalescing to encourage moves in general to end up in this form. Nevertheless, even by itself: total instructions in shared programs: 3665 -> 3613 (-1.42%) instructions in affected programs: 2046 -> 1994 (-2.54%) helped: 52 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.19% max: 25.00% x̄: 8.02% x̃: 4.00% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -10.26% -5.79% Instructions are helped. total bundles in shared programs: 2256 -> 2213 (-1.91%) bundles in affected programs: 1154 -> 1111 (-3.73%) helped: 43 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.33% max: 25.00% x̄: 9.10% x̃: 5.56% 95% mean confidence interval for bundles value: -1.00 -1.00 95% mean confidence interval for bundles %-change: -11.60% -6.60% Bundles are helped. total quadwords in shared programs: 3689 -> 3642 (-1.27%) quadwords in affected programs: 2025 -> 1978 (-2.32%) helped: 47 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.19% max: 25.00% x̄: 7.86% x̃: 3.85% 95% mean confidence interval for quadwords value: -1.00 -1.00 95% mean confidence interval for quadwords %-change: -10.30% -5.42% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 08:37:09 -07:00
Alyssa Rosenzweig	cb6dea6b4d	pan/midgard: Share mir_nontrivial_outmod To be used with redundant move elimination. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig	b6946d35c8	pan/midgard: Implement texture RA total instructions in shared programs: 3916 -> 3665 (-6.41%) instructions in affected programs: 1405 -> 1154 (-17.86%) helped: 35 HURT: 0 helped stats (abs) min: 1 max: 21 x̄: 7.17 x̃: 3 helped stats (rel) min: 3.00% max: 28.57% x̄: 20.11% x̃: 21.74% 95% mean confidence interval for instructions value: -9.35 -4.99 95% mean confidence interval for instructions %-change: -22.75% -17.46% Instructions are helped. total bundles in shared programs: 2472 -> 2256 (-8.74%) bundles in affected programs: 906 -> 690 (-23.84%) helped: 32 HURT: 0 helped stats (abs) min: 1 max: 18 x̄: 6.75 x̃: 3 helped stats (rel) min: 5.56% max: 32.26% x̄: 20.83% x̃: 16.67% 95% mean confidence interval for bundles value: -9.09 -4.41 95% mean confidence interval for bundles %-change: -23.77% -17.89% Bundles are helped. total quadwords in shared programs: 3965 -> 3689 (-6.96%) quadwords in affected programs: 1568 -> 1292 (-17.60%) helped: 35 HURT: 0 helped stats (abs) min: 1 max: 21 x̄: 7.89 x̃: 3 helped stats (rel) min: 2.08% max: 28.57% x̄: 19.87% x̃: 20.00% 95% mean confidence interval for quadwords value: -10.38 -5.39 95% mean confidence interval for quadwords %-change: -22.57% -17.17% Quadwords are helped. total registers in shared programs: 411 -> 392 (-4.62%) registers in affected programs: 76 -> 57 (-25.00%) helped: 15 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.27 x̃: 1 helped stats (rel) min: 9.09% max: 50.00% x̄: 30.97% x̃: 33.33% 95% mean confidence interval for registers value: -1.52 -1.01 95% mean confidence interval for registers %-change: -39.12% -22.82% Registers are helped. total threads in shared programs: 426 -> 432 (1.41%) threads in affected programs: 6 -> 12 (100.00%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig	13f61f24ea	pan/midgard: Fix backwards blend color load The source and destination were incorrectly flipped in the move, but some details of our internal regalloc made this function anyway. Now that we're changing the regalloc, we need to fix this to avoid regressing blend shaders. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig	a99ecc2b2b	pan/midgard: Fix scheduling mishap We shouldn't try to schedule onto a vmul if the last unit was a smul; that would force a break ("traveling back in time"). total bundles in shared programs: 2519 -> 2472 (-1.87%) bundles in affected programs: 791 -> 744 (-5.94%) helped: 20 HURT: 0 helped stats (abs) min: 1 max: 9 x̄: 2.35 x̃: 1 helped stats (rel) min: 1.52% max: 11.76% x̄: 7.94% x̃: 7.69% 95% mean confidence interval for bundles value: -3.47 -1.23 95% mean confidence interval for bundles %-change: -9.36% -6.51% Bundles are helped. total quadwords in shared programs: 4028 -> 3965 (-1.56%) quadwords in affected programs: 1223 -> 1160 (-5.15%) helped: 17 HURT: 0 helped stats (abs) min: 1 max: 17 x̄: 3.71 x̃: 2 helped stats (rel) min: 2.97% max: 10.64% x̄: 6.97% x̃: 7.14% 95% mean confidence interval for quadwords value: -5.71 -1.70 95% mean confidence interval for quadwords %-change: -8.03% -5.91% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig	e4038f9445	pan/midgard: Fix vector->scalar swizzles The swizzle should be taken on the masked component, rather than unconditionally X. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig	10324095d2	pan/midgard: Add dead move elimination pass This is a special case of DCE designed to run after the out-of-ssa pass to cleanup special register lowering. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig	082485d663	pan/midgard: Move DCE into its own file Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig	f9e619fa82	pan/midgard: Add mir_rewrite_dst_tag helper Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig	b3cab85606	pan/midgard: Fix flipped register bias fields We mixed up component_lo and full, which made it appear that we had less freedom in RA than we actually do. Fix this to fix some disassemblies as well as prepare for RA with the bias field. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig	be56840d5a	pan/midgard: Update RA for cubemap coords Following the RA work, we apply the same technique to eliminate the move to r27 when loading cubemaps. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-26 08:37:08 -07:00
Eric Engestrom	d2de5b6ba2	anv+tu+radv: delete unusable dev_icd.json As per previous commit, Meson doesn't support using uninstalled libs, they're simply not ready until `ninja install` is run, so delete them. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> # for anv Reviewed-by: Eric Anholt <eric@anholt.net> # for tu Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> # for radv	2019-07-26 14:47:53 +00:00
Eric Engestrom	2605e9fe46	docs: fix intel_icd.json path Meson doesn't support using uninstalled libs, they're simply not ready until `ninja install` is run, at which point one might as well use the proper icd.json file in the install folder. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-26 14:47:53 +00:00
Bas Nieuwenhuizen	9653d80de1	vulkan/wsi/x11: Increase the effective min. images for mailbox. We need 5 images: 1) CPU work 2) GPU work 3) idle 4) queued for flip 5) presenting Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-26 16:37:28 +02:00
Bas Nieuwenhuizen	5eae9bfbfc	vulkan/wsi/x11: Wait for GPU work before present with mailbox. Otherwise the wait only happens at flip time, which messes with keeping idle buffers around if the GPU work makes the image miss the next flip. I decided not to use the wait fences as those are still xshm fences, so that means we'd still have to wait in the application. Just doing it before presenting makes things simpler. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-26 16:37:28 +02:00
Bas Nieuwenhuizen	cc6a72a002	vulkan/wsi/x11: Allow using thread present-only. This allows doing a potential long blocking operation before present. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-26 16:37:28 +02:00
Bas Nieuwenhuizen	55da4e1ec2	vulkan/wsi: Use one fence per image. Much easier to work with if we want to use them in the WS-specific WSI implementation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-26 16:37:28 +02:00
Lionel Landwerlin	0fb61dfdeb	spirv: propagate access qualifiers through ssa & pointer Not only variables can be flagged as NonUniformEXT but also expressions. We're currently ignoring it in an expression such as: imageLoad(data[nonuniformEXT(rIndex)], 0) The associated SPIRV: OpDecorate %69 NonUniformEXT ... %69 = OpLoad %61 %68 This change propagates access qualifiers through ssa & pointers so that when it hits an OpLoad/OpStore style instruction, qualifiers are not forgotten. Fixes failures in the following tests: dEQP-VK.descriptor_indexing.* Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `8ed583fe52` ("spirv: Handle the NonUniformEXT decoration") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-26 14:09:55 +00:00
Lionel Landwerlin	86b53770e1	spirv: wrap push ssa/pointer values This refactor allows for common code to apply decoration on all ssa/pointer values. In particular this will allow propagating access qualifiers. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-26 14:09:55 +00:00
Lionel Landwerlin	8c330728f3	nir: add access to image_deref intrinsics SPIRV added the ability to access variables and have expressions that are not dynamically uniform, and because spirv_to_nir generates deref instructions, we'll need to have that access there. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-26 14:09:55 +00:00
Yevhenii Kolesnikov	02ecf16a70	main: unreference ATIFragmentShader program before creating new one Old program was overwritten without release of memory. Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-26 12:51:05 +00:00
Yevhenii Kolesnikov	fad848094f	state_tracker: Add destroying routine for feedback and select stages Fixes leaking memory on iris. Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-26 15:35:03 +03:00
Iago Toral Quiroga	1a99fc0fd0	v3d: fix glDrawTransformFeedback{Instanced}() This needs to take the vertex count from the provided transform feedback buffer. v2: - don't take the vertex count from the underlying buffer, instead, take it from a v3d subclass of pipe_stream_output_target (Eric). Fixes piglit tests: spec/ext_transform_feedback2/draw-auto spec/ext_transform_feedback2/draw-auto instanced Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-26 08:29:41 +02:00
Iago Toral Quiroga	47eb74ae00	v3d: subclass pipe_stream_output_target to record TF vertices written Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-26 08:29:41 +02:00
Iago Toral Quiroga	39df568ca1	v3d: refactor v3d_tf_statistics_record slightly Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-26 08:29:41 +02:00
Alyssa Rosenzweig	2f9236096a	Revert "panfrost: Don't DIY point size/coord fields" This reverts commit `4508f43eed`, which broke a bunch of dEQP tests (e.g. in dEQP-GLES2.functional.draw.draw_arrays.*) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 13:17:22 -07:00
Jason Ekstrand	295e5a17da	anv: Disable transform feedback on gen7 It's totally implementable, it's just that the plumbing is a bit different and we never hooked it up. Don't advertise a broken feature. Fixes: `36ee2fd61c` "anv: Implement the basic form of VK_EXT_transform_feedback"	2019-07-25 14:58:14 -05:00
Pierre-Eric Pelloux-Prayer	cd02f60c1e	mesa: Fix GetTextureImage error reporting, again Iago Toral Quiroga fixed this in commit `94f740e3fc`, but it recently regressed in `0d8826f723`. Quoting Iago's original commit message for the fix: GetTex*Image should return INVALID_ENUM if target is not valid, however, GetTextureImage does not receive a target, and instead should return INVALID_OPERATION if the effective target is not valid. From the OpenGL 4.6 core profile spec, section 8.11 Texture Queries: "An INVALID_OPERATION error is generated by GetTextureImage if the effective target is not one of TEXTURE_1D, TEXTURE_2D, TEXTURE_3D, TEXTURE_1D_ARRAY, TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY, TEXTURE_RECTANGLE, or TEXTURE_CUBE_MAP (for GetTextureImage only)." Note that this differs from the original ARB_direct_state_access spec. However, the EXT_direct_state_access version does take a target parameter, so it should continue reporting INVALID_ENUM. Fixes KHR-GL45.direct_state_access.textures_image_query_errors. Fixes: `0d8826f723` ("mesa: refactor get_texture_image to remove duplicate code") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-25 18:43:40 +00:00
Kenneth Graunke	0e24d10ff5	iris: Use gen_mi_builder to handle CS ALU operations. In a few cases, we switch to MI_MATH instead of MI_PREDICATE, just because we were already doing math and it's easier to chain together. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-25 18:42:55 +00:00
Kenneth Graunke	fe08aa67a8	intel/mi: Add a unit test for gen_mi_store_if(). This tests that predicated stores work. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-25 18:42:55 +00:00
Kenneth Graunke	74063ee61a	intel/mi: Add a new gen_mi_store_if() helper. This performs predicated MI_STORE_REGISTER_MEM commands, assuming that the condition is already loaded into MI_PREDICATE_DATA. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-25 18:42:55 +00:00
Kenneth Graunke	27b5817b6c	intel/mi: Add gen_mi_nz() and gen_mi_z() helpers. These provide comparisons against zero. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-25 18:42:55 +00:00
Kenneth Graunke	4e16b838ba	intel/mi: Add a gen_mi_ior() to go with gen_mi_iand() Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-25 18:42:55 +00:00
Kenneth Graunke	79b8e3c260	intel/mi: Optimize away LOAD_REGISTER_REG from a register to itself We might want to resolve something to be in a particular register, so we can access it outside of the gen_mi framework...but it may already be in that register, at which point there's no work to do. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-25 18:42:55 +00:00
Kenneth Graunke	fe7ed6b057	iris: Make iris_query.c a genxml-compiled file. This will let us use Jason's new MI-builder shortly. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-25 18:42:55 +00:00
Kenneth Graunke	975f7e4a59	iris: Move iris_resolve_conditional_render to the vtable. It's going to be in genxml code shortly. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-25 18:42:55 +00:00
Kenneth Graunke	6c4c7b600d	iris: Refactor genxml macros and inlines into iris_genx_macros.h. This will let us put the genxml boilerplate in one place, before we expand genxml to more files shortly. Like i965/genX_boilerplate.h. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-25 18:42:55 +00:00
Kenneth Graunke	204a3bb816	iris: Make an iris_genx_protos.h header for prototypes. This lets us specify the prototypes once, instead of cut and pasting them per generation. isl uses a similar approach (isl_genX_priv.h). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-25 18:42:55 +00:00
Marek Olšák	068093e84c	radeonsi: fix DAL hang due to incorrect DCC offset on Raven Set the correct relative offset. Fixes: `f8b6c5a` "radeonsi: rewrite si_get_opaque_metadata, also for gfx10 support"	2019-07-25 14:09:11 -04:00
Jason Ekstrand	9d2aa67c47	anv: Disable subgroup arithmetic on gen7 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-25 16:43:16 +00:00
Eric Anholt	f60defa72d	gitlab-ci: Add a shader-db run using v3d on drm-shim. This provides significant compiler coverage during CI at a fairly low cost in CPU time (~17s per thread for 4 threads on gst-gitlab-htz-runner3). I'm leaving wget in the docker image, as once this is in master I'm planning on having an automatic shader-db comparison between master and the branch included in the artifacts. I also haven't done freedreno yet, because it has some races when run in multithreaded mode that I'm still tracking down. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-07-25 08:56:55 -07:00
Eric Anholt	dd3d0b2897	gitlab-ci: Only keep the build logs as artifacts. On a build failure, we were tarring up the whole ccache directory, build.ninja, build products, etc. This was over 400MB compressed on a recent early meson-main build failure, which fd.o then has to hang on to for 4 weeks. The build logs are probably the interesting part, are potentially useful regardless ("how did CI's build flags differ from mine?"), and are <500k uncompressed on my personal meson build. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2019-07-25 08:56:49 -07:00
Eric Anholt	f68b987387	gitlab-ci: Always set libdir to lib/ I introduced libdir for cross-builds so we could point at the resulting drivers without per-arch dependencies, but I'd rather not have to type x86_64-linux-whatever for non-cross-builds either. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-07-25 08:56:19 -07:00
Eric Anholt	494ecef6b4	freedreno: Add support for drm-shim. I'm using this for shader-db analysis on x86_64 systems. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-07-25 08:56:19 -07:00
Eric Anholt	82bf1979d7	v3d: Introduce a DRM shim for calling out to the simulator. The goal is to enable testing of parts of drivers without depending on any particular kernel version or hardware being present. Simply set LD_PRELOAD=$PREFIX/lib/libv3d_drm_shim.so in your environment, and we'll fake a /dev/dri/renderD128 (or whatever the next available node is) using v3dv3. That node can then be used with the surfaceless or gbm EGL platforms. Acked-by: Iago Toral Quiroga <itoral@igalia.com>	2019-07-25 08:56:19 -07:00
Erik Faye-Lund	c5f1432296	glsl: report no function instead of empty candidate list When generating the error message for a missing function error where all available overloads were missing due to a too low GLSL version, we used to report something like this: ---8<--- 0:224(14): error: no matching function for call to `textureCubeLod(samplerCube, vec3, float)'; candidates are: 0:224(14): error: type mismatch ---8<--- This is a pretty confusing error message, and can throw people off when debugging. So let's instead check if any overload is available before we decide what to print. This allows us to report something like this instead: ---8<--- 0:224(14): error: no function with name 'textureCubeLod' 0:224(14): error: type mismatch ---8<--- This is arguably easier to understand for programmers, and doesn't send you on a wild goose chase to figure out what argument is wrong just because you stopped reading the message prematurely. I'm of course referring to a friend, not me. For sure. I would never do that. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-25 17:20:10 +02:00
Bas Nieuwenhuizen	7e1fe81f56	radv: Set correct metadata size for GFX9+. Without correct size, radeonsi assumes the metadata is incorrect, which can and will cause issues. Since the metadata is really incorrect without the size, let us fix that. Fixes: `e43cc3e3af` "radv/gfx9: handle GFX9 opaque metadata" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-25 17:07:53 +02:00
Arcady Goldmints-Orlov	832cedfdee	anv: report HOST_ALLOCATION as supported for images Report VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT as supported for images. It was being shown supported for buffers, but not images. Fixes: `69cc6272fb` ("anv: Implement VK_EXT_external_memory_host") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-25 09:01:26 -05:00
Samuel Pitoiset	7d11bf2155	radv/gfx10: fix intensity formats by setting ALPHA_IS_ON_MSB This fixes dEQP-VK.rasterization.primitive_size.points.point_size_* This also fixes some black squares with the Sascha SSAO demo. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-25 15:48:24 +02:00
Samuel Pitoiset	6a504ab473	radv/gfx10: use L2 for DMA copy/fill operations It's coherent and faster. GFX7-GFX9 should also support this, but for now we only use L2 on GFX10 because it's untested on previous gens. This fixes dEQP-VK.memory.pipeline_barrier.transfer_* This also fixes some missing geometry in Dawn Of War III because VBOs weren't updated correctly. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-25 15:48:21 +02:00
Alyssa Rosenzweig	9ce75826cb	pan/midgard: Optimize varying projection We add a new opt pass fusing perspective projection with varyings. Minor win..? We don't combine non-varying projections, since if we're too aggressive, the extra load/store traffic will hurt us so it's not really a win in practice. total instructions in shared programs: 3915 -> 3913 (-0.05%) instructions in affected programs: 76 -> 74 (-2.63%) helped: 1 HURT: 0 total bundles in shared programs: 2520 -> 2519 (-0.04%) bundles in affected programs: 46 -> 45 (-2.17%) helped: 1 HURT: 0 total quadwords in shared programs: 4027 -> 4025 (-0.05%) quadwords in affected programs: 80 -> 78 (-2.50%) helped: 1 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	f6438d1e15	pan/midgard: Add perspective projection recombine pass We don't use it yet, since it's actually a shader-db regression. This is primarily helpful as an intermediate step for attaching projection to varyings. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	8ddb0eda42	pan/midgard: Force perspective ops to use vec4 It doesn't make sense to use them with anything less. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	b06951d343	pan/midgard: Add R27-only op handling We use a special conflicting register class. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	f55a760d0c	pan/midgard: Add OP_R27_ONLY helper While load/store ops like st_vary can take an argument in either r26/r27, ops like those for perspective projection must specifically take their argument in r27. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	233c0faadd	pan/midgard: Enable RA for st_vary Now that all the piping is in place to do so without regressions, we flip on automatic register allocation for varyings. Hooray! total instructions in shared programs: 4025 -> 3915 (-2.73%) instructions in affected programs: 1667 -> 1557 (-6.60%) helped: 62 HURT: 0 helped stats (abs) min: 1 max: 3 x̄: 1.77 x̃: 2 helped stats (rel) min: 0.93% max: 20.00% x̄: 10.80% x̃: 10.64% 95% mean confidence interval for instructions value: -1.89 -1.66 95% mean confidence interval for instructions %-change: -12.50% -9.11% Instructions are helped. total bundles in shared programs: 2683 -> 2520 (-6.08%) bundles in affected programs: 1066 -> 903 (-15.29%) helped: 62 HURT: 0 helped stats (abs) min: 1 max: 3 x̄: 2.63 x̃: 3 helped stats (rel) min: 2.94% max: 42.86% x̄: 23.85% x̃: 22.50% 95% mean confidence interval for bundles value: -2.83 -2.43 95% mean confidence interval for bundles %-change: -27.73% -19.97% Bundles are helped. total quadwords in shared programs: 4192 -> 4027 (-3.94%) quadwords in affected programs: 1584 -> 1419 (-10.42%) helped: 62 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 2.66 x̃: 3 helped stats (rel) min: 1.85% max: 30.00% x̄: 16.49% x̃: 16.52% 95% mean confidence interval for quadwords value: -2.87 -2.46 95% mean confidence interval for quadwords %-change: -19.14% -13.84% Quadwords are helped. total registers in shared programs: 433 -> 411 (-5.08%) registers in affected programs: 67 -> 45 (-32.84%) helped: 23 HURT: 1 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 25.00% max: 50.00% x̄: 41.30% x̃: 50.00% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 14.29% max: 14.29% x̄: 14.29% x̃: 14.29% 95% mean confidence interval for registers value: -1.09 -0.74 95% mean confidence interval for registers %-change: -45.45% -32.52% Registers are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	210dbe3fc1	pan/midgard: Remove check for `class` Fixes classes defaulting to vec4 in some cases. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	8842db3a7d	pan/midgard: Move uniforms to special registers The load/store pipes can't take a uniform register in, so an explicit move is necessary here. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	ae7acde91f	pan/midgard: Emit st_vary registers in install_registers Now that we have its registers handled normally like the rest of the IR. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	c3ad7500d2	pan/midgard: Add mir_lower_special_reads helper Given the constraints on special registers, we add a helper for lowering these by inserting moves (copies) where needed to satisfy the ISA constraints. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	e169301bd8	pan/midgard: Add emit_explicit_constant helper We generalize the constant emission helper used in fragment writeout as we'll also need it for vertex outputs. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	eedd6c1dd0	pan/midgard: Add mir_rewrite_index_src_tag Specialized version of a rewrite that only rewrites a certain type of instruction. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	5d5caf10af	pan/midgard: Add class check This ensures the rules for accessing special register classes are satisfied. This is asserted as a prepass should have lowered offending uses to something satisfying these rules. Special register classes are not work registers and cannot be used for RMW operations; they are essentially 1-way pipes straight into/from fixed-function logic in the shader cores. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	91195bdff1	pan/midgard: Implement class spilling We reuse the same register spilling mechanism as for work->memory to spill special->work registers, e.g. to allow writing out more than 2 vec4 varyings (without better scheduling anyway). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	0f38f6466e	pan/midgard: Extend liveness analysis to st_vary These can consume sources now. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	dca0166ce1	pan/midgard: Implement load/store register classing This does not yet support special->work spilling, nor does it support multiclass breakup. These corner cases will be handled in succeeding commits. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	839b80aa89	pan/midgard: Allocate special register classes We'll want to also handle load/store and texture registers in our RA loop. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	480b502443	pan/midgard: Move copy propagation into its own file We also expose some utilities it uses as general MIR helpers. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig	b8caaa3000	pan/midgard: Add mir_simple_swizzle helper Checks for x/xy/xyz/xyzw style swizzles (slightly more general but you get the idea). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:21 -07:00
Alyssa Rosenzweig	63385a3fdb	pan/midgard: Add mir_single_use helper Helps as an optimization heuristic. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:37:21 -07:00
Alyssa Rosenzweig	5534fdb7bf	panfrost: Compute I/O counts from shader_info ...rather than exposing it in the vendored compiler region. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:34:21 -07:00
Alyssa Rosenzweig	4508f43eed	panfrost: Don't DIY point size/coord fields Again, it's in shader_info for us! Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:34:21 -07:00
Alyssa Rosenzweig	bab4f6c724	panfrost: Use nir_gather_info information about discards No need to track this ourselves! Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:34:21 -07:00
Alyssa Rosenzweig	48991c7a1f	panfrost: Use NIR helper invocations info We don't need to guesstimate this ourselves. This will help when we bringup derivatives. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:34:21 -07:00
Alyssa Rosenzweig	fb2fe6e7bc	panfrost/sfbd: Flesh out fragment job We include a zsbuf attachment function based on how the corresponding MFBD code works, as well as extending cbufs to mipmapped rendering while we're at it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:34:21 -07:00
Alyssa Rosenzweig	e6802af8c3	panfrost: Disable tiled formats on SFBD systems Just because we don't have the format codes to render to them yet. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:34:20 -07:00
Alyssa Rosenzweig	990e24469c	panfrost: Move require_sfbd to screen We'll need it to specialize resource creation by chip. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:34:20 -07:00
Alyssa Rosenzweig	a9c73e825a	panfrost: Reserve, but do not upload, shader padding Fixes invalid read errors reported by valgrind. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 06:34:20 -07:00
Alyssa Rosenzweig	b2a3ca6bd5	util/ra: Add a getter for a node class Complements the existing getters and the setter for node class. To be used in the Panfrost RA refactor. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-25 06:14:12 -07:00
Tomeu Vizoso	688d9b4fb7	panfrost/ci: Update kernel to 5.2 Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-25 15:08:44 +02:00
Nicolas Dufresne	08f1cefecd	egl: Also query modifiers when exporting DMABuf This fixes eglExportDMABUFImageQueryMESA() so it will report the modifiers of the underlying image. Without this information, re-importing will likely be broken as it is rare these days that no modifiers are used. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Daniel Stone <daniels@collabora.com> Fixes: `8f7338f284` ("egl: add initial EGL_MESA_image_dma_buf_export v2.4")	2019-07-25 05:14:36 +00:00
Heinrich Fink	4886924262	mesa: Enable GL_MESA_framebuffer_flip_y for GL 4.3 Extend MESA_framebuffer_flip_y to be used with OpenGL versions 4.3 and higher. OpenGL 4.3 adds FramebufferParameteri needed by this extension. Reviewed-by: Fritz Koenig <frkoenig@google.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-25 04:47:38 +00:00
Alyssa Rosenzweig	31c9fcbd0f	panfrost: Don't expose some atomic stuff even with dEQP Fixes dEQP crashes. Fixes: `2f93ecd654` ("panfrost: Fake CAPs for dEQP-GLES31") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-24 17:21:12 -07:00
Dave Airlie	16fcbb2eba	gallium: fix windows build from params change. This is why we can't have nice things. I'm sure there's some way to do this with {0} but I really don't have time for that. Fixes: `2631fd3b0b` ("gallivm: rework lp_build_tgsi_soa to take a struct") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-25 10:02:22 +10:00
Jonathan Marek	97c8314c5f	nir/algebraic: add scmp algebraic optimizations When 'x' is the result of a scmp op: x != 0.0 or x == 1.0: passthrough x == 0.0 or x != 1.0: invert Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-24 17:36:21 -04:00
Jonathan Marek	9be902097c	nir/algebraic: add option to lower fall_equalN/fany_nequalN Add generic lowerings for fall_equalN/fany_nequalN. These should be optimal for vec4 backends that don't have any special instructions for it, as long as they support saturate. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-24 17:36:21 -04:00
Jonathan Marek	397375d3f3	nir/algebraic: add fdot2 optimizations Add simple fdot2 optimizations that are missing. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-24 17:36:21 -04:00
Jonathan Marek	1e089d0575	nir/algebraic: add option to lower fdph For backends that don't have an 'fdph' instruction. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-24 17:36:21 -04:00
Jonathan Marek	bc3b6168ba	nir: replace lower_sincos with algebraic opt This version has fewer ops for the same precision. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2019-07-24 17:36:21 -04:00
Jonathan Marek	5a4e71c082	nir/algebraic: allow swizzle in nir_algebraic replace expression This is to allow optimizations in nir_opt_algebraic not otherwise possible Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2019-07-24 17:36:21 -04:00
Rob Clark	b4f4768672	gallium/u_transfer_helper: fix assert in RGTC case Previously we'd hit the unreachable() for uploading RGTC. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-24 21:11:06 +00:00
Yevhenii Kolesnikov	53730ab32c	main: Free memory allocated for gl_bitmap_atlas structure Structure itself wasn't freed during context tear-down, causing a memory leak on iris. Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-07-24 15:31:26 -04:00
Daniel Schürmann	e272fdd508	nir,intel: lower if (cond) demote() to new intrinsic demote_if(cond) This will effectively enable the optimization in anv. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-24 13:02:18 -05:00
Kenneth Graunke	517005b4cf	i965: Use NIR to lower legacy userclipping. This allows us to drop legacy userclip plane handling in both the vec4 and FS backends, and simplifies a few interfaces. v2 (Jason Ekstrand): - Move brw_nir_lower_legacy_clipping to brw_nir_uniforms.cpp because it's i965-specific. - Handle adding the params in brw_nir_lower_legacy_clipping - Call brw_nir_lower_legacy_clipping from brw_codegen_vs_prog Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-24 18:00:13 +00:00
Jason Ekstrand	d10de25309	anv: Implement VK_EXT_subgroup_size_control Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Jason Ekstrand	bcef32d49b	anv/pipeline: Plumb pipeline shader stage create flags Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Jason Ekstrand	2a236c76f8	intel/compiler: Allow for required subgroup sizes Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Jason Ekstrand	4397eb91c1	intel/compiler: Allow for varying subgroup sizes Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Jason Ekstrand	799f0f7b28	nir/lower_subgroups: Properly lower masks when subgroup_size == 0 Instead of building a constant mask (which depends on knowing the subgroup size), we build an expression. Because the pass uses the nir_shader_lower_instructions helper, subgroup lowering will be run on any newly emitted instructions as well as the previously existing instructions. In particular, if the subgroup size is known, the newly emitted subgroup_size intrinsic will get turned into a constant and a later constant folding pass will clean it up. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Jason Ekstrand	256e6c2d94	vulkan: Update the XML and headers to 1.1.116 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Jason Ekstrand	c84b8eeeac	intel/compiler: Be more conservative about subgroup sizes in GL The rules for gl_SubgroupSize in Vulkan require that it be a constant that can be queried through the API. However, all GL requires is that it's a uniform. Instead of always claiming that the subgroup size in the shader is 32 in GL like we have to do for Vulkan, claim 8 for geometry stages, the maximum for fragment shaders, and the actual size for compute. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Jason Ekstrand	1981460af2	intel/compiler: Lower gl_SubgroupSize in postprocess_nir Instead of lowering the subgroup size so early, wait until we have more information. In particular, we're going to want different subgroup sizes from different stages depending on the API. We also defer lowering of subgroup masks because the ge/gt masks require the subgroup size to generate a subgroup mask. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Jason Ekstrand	f62227f2b7	intel/nir: Make brw_nir_apply_sampler_key more generic Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Sagar Ghuge	87cef718e1	nir: Add lowering for nir_op_irem and nir_op_imod Tested on Gen > 9. v2: 1) Fix lowering 2) Keep a consistent i/u order (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-24 10:33:09 -07:00
Yevhenii Kolesnikov	882fe09a74	main: Fix memleaks in mesa_use_program Add freeing of SubroutineIndexes to the _mesa_free_shader_state. Fixes: `4566aaaa5b` ("mesa/subroutines: start adding per-context subroutine index support (v1.1)") Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-24 16:54:21 +00:00
Andrii Simiklit	fa2fc68de1	intel/compiler: don't use a keyword struct for a class fs_reg warning: struct 'fs_reg' was previously declared as a class Fixes: `e64be391` ("intel/compiler: generalize the combine constants pass") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2019-07-24 13:26:42 +00:00
Qiang Yu	280dfa02fa	lima/ppir: fix disassembler temp read/write print Temp read/write use a negative offset; also handle the alignment==1 case. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-07-24 20:39:39 +08:00
Eric Engestrom	e7e31b18d6	gallium+mesa: fix tgsi_semantic array type Fixes: `ed23335a31` ("gallium: use enums in p_shader_tokens.h (v2)") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-24 09:33:29 +01:00
Eric Engestrom	f986741a91	util: fix no-op macro (bad number of arguments) Fixes: `b8e077daee` ("util: no-op __builtin_types_compatible_p() for non-GCC compilers") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-24 09:13:58 +01:00
Samuel Pitoiset	4389e85dc9	radv/gfx10: enable VK_EXT_transform_feedback When a pipeline uses transform feedback, the driver falls back to the legacy path because NGG support for streamout is a non-trivial amount of work. AMDVLK also uses the legacy path for streamout, while RadeonSI uses the new NGG path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-24 08:23:37 +02:00
Samuel Pitoiset	a3a4fa1860	radv/gfx10: do not enable NGG if a pipeline uses XFB NGG GS for streamout requires a bunch of work, so enable it with the legacy path only for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-24 08:23:34 +02:00
Samuel Pitoiset	09abe571a2	radv/gfx10: emit streamout shader config Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-24 08:23:32 +02:00
Samuel Pitoiset	383c2e625a	radv/gfx10: declare streamout user SGPRs Required for legacy streamout. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-24 08:23:30 +02:00
Samuel Pitoiset	fd195d8085	radv/gfx10: update streamout descriptors Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-24 08:23:27 +02:00
Samuel Pitoiset	ea337c8b7e	radv/gfx10: fix VS input VGPRs with the legacy path For some reason, InstanceID is VGPR3 although StepRate0 is set to 1. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-24 08:23:21 +02:00
Dave Airlie	2631fd3b0b	gallivm: rework lp_build_tgsi_soa to take a struct The parameters were getting messy and I have to add a few more for compute shaders, so clean it up before proceeding. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-24 09:20:09 +10:00
Jason Ekstrand	9700e45463	nir/lower_io: Return SSA defs from helpers I can't find a single place where nir_lower_io is called after going out of SSA which is the only real reason why you wouldn't do this. Returning SSA defs is more idiomatic and is required for the next commit. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-23 17:48:49 -05:00
Dylan Baker	7cf50af6f5	meson: allow building all glx without any drivers Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111016 Fixes: `a47c525f32` ("meson: build glx") Acked-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-23 15:34:23 -07:00
Jan Zielinski	3d6cffffcf	swr/rasterizer: Fix 3D resource copies. Ensure constant attributes stay constant with barycentric interpolation. Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-07-23 21:55:09 +02:00
Jan Zielinski	ec4a5f5e13	swr/rasterizer: Fix return type on SIMD8 version of Clamp and Normalize utility functions Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-07-23 21:55:09 +02:00
Jan Zielinski	47cdb0ac27	swr/rasterizer: small formatting changes Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-07-23 21:55:09 +02:00
Jan Zielinski	ccc6b4f96b	swr/rasterizer: Adding support for unhandled clipEnable state Clipping is not correctly handled by the rasterizer - fixing this. Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-07-23 21:55:09 +02:00
Bas Nieuwenhuizen	e5b3f0a867	radv/gfx10: Enable binning. Numbers for Talos: gfx10 without binning: 77.0 77.7 77.2 77.6 gfx10 with binning: 82.3 82.0 82.7 82.4 Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-23 21:26:59 +02:00
Bas Nieuwenhuizen	3268c806fb	radv/gfx10: Implement bin size calculation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-23 21:26:59 +02:00
Bas Nieuwenhuizen	4b757697e9	radv/gfx9: Select between depth/color bins based on area. Mirrors radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-23 21:26:59 +02:00
Bas Nieuwenhuizen	22f2f76789	radv: Generalize binning settings. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-23 21:26:59 +02:00
Bas Nieuwenhuizen	793cbf6161	radv/gfx10: Use new scan converter. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-23 21:26:59 +02:00
Bas Nieuwenhuizen	4058b354c5	radv: Set FLUSH_ON_BINNING_TRANSITION. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-23 21:26:59 +02:00
Bas Nieuwenhuizen	906fcfccfd	radv: Use pbb_allow for framebuffer BREAK_BATCH. Ported from radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-23 21:26:59 +02:00
Marek Olšák	264ab6ffcd	radeonsi/nir: set tgsi_shader_info::uses_fbfetch for KHR_blend_equation_adv. This doesn't implement the color buffer load. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-23 15:08:37 -04:00
Marek Olšák	45556731b6	tgsi/scan: add uses_fbfetch Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-23 15:08:37 -04:00
Marek Olšák	ee858871bd	radeonsi: fail if importing a texture with incorrect last_level or samples v2: don't fail if the texture comes from an incompatible driver. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> (v1)	2019-07-23 15:08:27 -04:00
Marek Olšák	f8b6c5a1a6	radeonsi: rewrite si_get_opaque_metadata, also for gfx10 support Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-23 15:03:51 -04:00
Marek Olšák	e718f8e713	radeonsi: simplify si_get_input_prim and remove incorrect TODO comment u_vertices_per_prim(QUADS) is the same as TRIANGLES. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-23 15:03:49 -04:00
Marek Olšák	16392cc3f3	radeonsi/gfx10: fix and enable CLEAR_STATE it was a driver bug. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-23 15:03:47 -04:00
Marek Olšák	ad642d5b3a	radeonsi: stop using info.opcode_count[TGSI_OPCODE_INTERP_SAMPLE] Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-23 15:03:46 -04:00
Marek Olšák	6ac2146a98	ac/nir: implement nir_op_pack_{us}norm_2x16 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-23 15:03:44 -04:00
Pierre-Eric Pelloux-Prayer	079e5f73d7	mesa/st: rewrite src var when lowering tex_src_plane The assign_extra_samplers() adds the needed extra samplers but they need to be used in the nir_tex_instr. Otherwise the plane information is simply lost and all nir_tex_instr use the same sampler. Here's an example of the bug: NIR before st_nir_lower_tex_src_plane: vec1 32 ssa_8 = load_const (0x00000000 /* 0.000000 */) vec4 32 ssa_9 = tex ssa_0 (texture_deref), ssa_0 (sampler_deref), ssa_5 (coord), ssa_8 (plane) vec1 32 ssa_10 = load_const (0x00000001 /* 0.000000 */) vec4 32 ssa_11 = tex ssa_0 (texture_deref), ssa_0 (sampler_deref), ssa_5 (coord), ssa_10 (plane) After: vec4 32 ssa_9 = tex ssa_0 (texture_deref), ssa_0 (sampler_deref), ssa_5 (coord) vec4 32 ssa_11 = tex ssa_0 (texture_deref), ssa_0 (sampler_deref), ssa_5 (coord) This fixes the following piglit test for radeonsi + NIR: - ext_image_dma_buf_import-sample_nv12 - ext_image_dma_buf_import-sample_yuv420 - ext_image_dma_buf_import-sample_yvu420 Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-07-23 15:00:43 -04:00
Pierre-Eric Pelloux-Prayer	e9cf8c1d30	u_blitter: add a msaa parameter to util_blitter_clear Fixes: `ea5b7de138` ("radeonsi: make gl_SampleMaskIn = 0x1 when MSAA is disabled") Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-23 14:42:20 -04:00
Pierre-Eric Pelloux-Prayer	d811446e6c	u_blitter: enable msaa when dst num samples is > 1 Commit `ea5b7de138` broke some piglit tests on radeonsi (Bonaire hardware). This commit fixes half of the regression by enabling msaa if the dest surface has more than 1 sample (instead of hardcoding it to false). Fixes: `ea5b7de138` ("radeonsi: make gl_SampleMaskIn = 0x1 when MSAA is disabled") Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-23 14:42:20 -04:00
Jason Ekstrand	ae392d73c9	nir/gather_info: Look for uses of helper invocations The one obvious omission here is gl_HelperInvocation itself. However, the spec doesn't require that we generate them when gl_HelperInvocation is used, it merely mandates that we report them if they are there. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-23 13:40:41 -05:00
Jason Ekstrand	41ab92a327	nir/gather_info: Move setting uses_64bit out of the switch Otherwise, as we add things to the switch, we're going to forget and add some 64-bit op at some point in the future and it'll stop getting flagged. There's no reason why we can't do the check for derivatives. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-23 13:40:41 -05:00
Jason Ekstrand	0e6cb481fa	nir: Add a nir_tex_instr_has_implicit_derivatives helper Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-23 13:40:41 -05:00
Jason Ekstrand	7a98c7804c	nir: Move nir_alu_instr_is_comparison to the ALU section Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-23 13:40:41 -05:00
Rafael Antognolli	1f4cbc9a06	intel/genxml: Add new test for subgroups. Make sure that a <group> tag within another <group> tag works just fine. v2: rename 'halfbyte' to 'byte' to match the size (Lionel). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Rafael Antognolli	fe5ae96d66	intel/genxml: Add basic infra for encoding/decoding unit tests. Adding option to print quiet. v2: Add license header. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Rafael Antognolli	e25ebe2ec9	intel/gen_decoder: Decode <group> inside <group>. Now we can decode a <group> tag inside another <group> tag, and properly print its indices and content. v2: Use push/pop stack to fields, groups and iters (Lionel). v3: Add assert(iter->level < DECODE_MAX_ARRAY_DEPTH) (Lionel). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Rafael Antognolli	f670c2e1ff	intel/gen_decoder: Add the concept of array "levels". We currently only support one level, which is the basic level of a <group> tag. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Rafael Antognolli	618d054283	intel/gen_decoder: Add array field. We currently use the group->next pointer to iterate through the <group> tags. This changes them to be a type of field, so we can descend into them while iterating, and then go back to the original position. Will be useful when we want to decode <group>'s inside <group>'s, and when there are more <field>'s after a <group> tag. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Rafael Antognolli	21bdd51942	intel/gen_decoder: Rename internally "group" to "array". A gen_group (group in most of the code) can be of several types: - instruction - struct - register - group (?!?) The <group> tag actually represents an array of elements. So at least in our code, lets call it an array to avoid confusion with gen_group. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Rafael Antognolli	69506cbb74	intel/gen_decoder: Add gen_spec_load_filename() function. Refactor the code from gen_spec_load_from_path() into a separate function, that can be used with a xml file that doesn't fit the genX.xml filename format. Will be used soon for implementing unit tests for gen_decoder. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Rafael Antognolli	1f2b22a6bd	intel/gen_decoder: Fix parsing of small genxml file. When using gen_spec_load_from_path, only abort decoding if the read length is 0. Previously, we were aborting if finding an EOF, even if something was read from the file. Also only kill the decoded file if no commands or structs were found, and print a message in such case. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Guido Günther	85996567f5	kmsro: Extend to include mxsfb-drm This allows using the LCDIF display controllers (with the mxsfb drm modesetting driver) along with the Etnaviv render-only drivers. LCDIF is found on i.MX SoCs. Signed-off-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-23 17:12:10 +00:00
Sagar Ghuge	806e5a37ed	anv: Implement VK_KHR_imageless_framebuffer v2: Pass pointer instead of struct instance (Lionel) v3: 1) Fix small nits (Jason) 2) Add way to detect anv_framebuffer don't have attachments (Jason) 3) Get rid of unnecessary pNext chain walk (Jason) 4) Keep framebuffer instance in anv_cmd_state (Jason) v4: 1) Dump attachments from cmd_buffer (Jason) v5: 1) Fix condition check and add assertion (Lionel) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-23 10:01:45 -07:00
Alyssa Rosenzweig	840b806d64	panfrost/midgard: Allocate registers once (per-screen) This should save a lot of per-compile time by using the RA the way it's actually supposed to be used. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-23 09:06:21 -07:00
Lionel Landwerlin	772a5f9814	anv: fix use of comma operator This doesn't fix any bug at the moment because the next statement is 'true' which happens to be APIMODE_D3D, but if that changes it could. The fixes tags is as far I could go but the error predates it (2016 is probably far enough). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `8db6f2e6eb` ("anv/pipeline: Roll genX_pipeline_util.h into genX_pipeline.c") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-23 15:54:48 +00:00
Andrii Simiklit	79ab2c3e57	nir: use | instead of || operator warning: use of logical '||' with constant operand note: use '|' for a bitwise operation Fixes: `758fdce9fe` ("nir: Add some generic helpers for writing lowering passes") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2019-07-23 18:08:58 +03:00
Arnaud Patard	397f9ba69f	panfrost: Fix T6XX Support While testing kmscube with mesa master, it turns out that kmscube is not working anymore. After bisecting, commit `5a7688fdec` is the culprit. A short trial and error session allowed to find the removed bit of code making kmscube working again. This patch adds it back. Fixes: `5a7688fde` ("panfrost: Use 64-bit descriptors globally") v2: Add comment pointing out this is magic. [Alyssa, trivial] Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-23 08:04:42 -07:00
Alyssa Rosenzweig	83a1d5544a	panfrost: Use correct definition for is_t6xx Rather than anything "early Midgard", limit us specifically to T6XX, as certain workarounds only apply to genuine T6XX, not T7XX. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-23 08:04:42 -07:00
Eric Engestrom	3acc4278ad	nir: don't return void Fixes: `14531d676b` ("nir: make nir_const_value scalar") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-07-23 16:02:37 +01:00
Eric Engestrom	7797823afa	util: fix asprintf() fallback Fixes: `9607d499dc` ("util: add asprintf() wrapper for MSVC") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-23 14:59:07 +00:00
Michel Dänzer	22c7738520	st/mesa: Try re-importing resource if necessary in st_vdpau_map_surface This can be the case if the resource was obtained from st_vdpau_output/video_surface_gallium. st_vdpau_output/video_surface_dma_buf do a similar dance internally. v2: * Pass PIPE_HANDLE_USAGE_FRAMEBUFFER_WRITE instead of 0 for usage. Bugzilla: https://bugs.freedesktop.org/111099 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> # v1 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-23 16:28:02 +02:00
Michel Dänzer	7499e7362d	radeonsi: Allow PIPE_TEXTURE_2D_ARRAY in si_texture_from_handle Needed for the following st/mesa fix. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-23 16:26:04 +02:00
Alyssa Rosenzweig	2f93ecd654	panfrost: Fake CAPs for dEQP-GLES31 We still have some big ticket items left on GLES 3.0, but it's often helpful to be able to access higher dEQP levels for debugging features that just don't quite match a particular API. Plus, this opens up a whole slew of new features to poke at if boredom overtakes, ahem. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-23 06:36:48 -07:00
Mark Menzynski	7493fbf032	nvc0/ir: Fix assert accessing null pointer Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111007 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111167 Signed-off-by: Mark Menzynski <mmenzyns@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann<tobias.klausmann@freenet.de>	2019-07-23 15:08:25 +02:00
Samuel Pitoiset	d36af71f44	radv/gfx10: enable CLEAR_STATE It actually works. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-23 14:15:55 +02:00
Juan A. Suarez Romero	c41545c2f5	docs: update calendar, add news item and link release notes for 19.1.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-07-23 11:20:00 +00:00
Juan A. Suarez Romero	3843c5f77a	docs: add sha256 checksums for 19.1.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `33e57d0ace`)	2019-07-23 11:18:31 +00:00
Juan A. Suarez Romero	fd965a3330	docs: add release notes for 19.1.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `09a1b2bdba`)	2019-07-23 11:18:29 +00:00
Erico Nunes	65e6c42d27	lima/ppir: fix branch codegen register encode The branch instruction has 6 bits per register operand which allows it to specify a component in the register. Fix codegen so that it outputs the right component, otherwise it always outputs the x component. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-07-23 08:49:19 +00:00
Erico Nunes	a255b49593	lima/ppir: fix debug logs in regalloc The macros already prepend "ppir: ", remove them from the actual strings so it doesn't appear duplicated. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-07-23 08:24:19 +00:00
Erico Nunes	9254059dd8	lima/ppir: fix alignment on regalloc spilling loads The spilling code spills entire vec4 registers regardless of the components used by the spilled uses. The inserted stores code force the 4 components, but these loads were using a variable number of components, causing bugs on loading the spilled registers. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-07-23 08:24:19 +00:00
Samuel Pitoiset	9343c93e34	radv: fix dumping disassembly with RADV_DEBUG=shaders Fixes: `a20a9d0c5e` ("radv: dont store disasm string unless keep_shader_info flag set") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-23 10:22:29 +02:00
Eric Engestrom	b1c35fa6d6	st/nir: use asprintf() wrapper to fix MSVC issues Fixes: `856e84083e` ("mesa/st: add sampler uniforms") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-23 08:57:27 +01:00
Eric Engestrom	9607d499dc	util: add asprintf() wrapper for MSVC Fixes: `856e84083e` ("mesa/st: add sampler uniforms") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-23 08:57:27 +01:00
Ilia Mirkin	affb2da0f8	gallium: remove boolean from state tracker APIs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-22 22:13:51 -04:00
Ilia Mirkin	0e30c6b8a7	gallium: switch boolean -> bool at the interface definitions This is a relatively minimal change to adjust all the gallium interfaces to use bool instead of boolean. I tried to avoid making unrelated changes inside of drivers to flip boolean -> bool to reduce the risk of regressions (the compiler will much more easily allow "dirty" values inside a char-based boolean than a C99 _Bool). This has been build-tested on amd64 with: Gallium drivers: nouveau r300 r600 radeonsi freedreno swrast etnaviv v3d vc4 i915 svga virgl swr panfrost iris lima kmsro Gallium st: mesa xa xvmc xvmc vdpau va Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 22:13:51 -04:00
Dave Airlie	365f24705f	st/nir: fix arb fragment stage conversion The comment even justifies the wrongness wrongly. We should be translating to pipe values properly here or else fragment maps to tess ctrl. Fixes: `3d7611e9a6` ("st/nir: use NIR for asm programs") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-23 11:00:53 +10:00
Marek Olšák	cb9eb1834d	radeonsi: fix warning: ‘ret’ may be used uninitialized Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-22 20:57:44 -04:00
Marek Olšák	850619117e	tgsi: fix warning: ‘interp’ may be used uninitialized Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-22 20:57:44 -04:00
Marek Olšák	f257ef2bbb	gallivm: fix warning: ‘op’ may be used uninitialized Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-22 20:57:44 -04:00
Kenneth Graunke	7cdde962c5	iris: Support storage images that have matching typed formats for reads Even if we don't directly support typed reads on a format, we can often translate them to a reasonable matching format. Advertise those too.	2019-07-22 17:30:13 -07:00
Kenneth Graunke	2f1c7fae9e	iris: Stop advertising MSAA storage images by mistake st_extensions.c sets const->MaxImageSamples (GL_MAX_IMAGE_SAMPLES) by looping over [16, 15, .. 1x] MSAA modes, and RGBA/BGRA/ARGB/ABGR 8888 color formats, calling pipe->is_format_supported() for each, with the usage set to PIPE_BIND_SHADER_IMAGE. If any are supported, it selects that number of samples. We were checking if sample_count <= 1, which meant that we were getting a value of 1x MSAA, rather than the expected 0x (feature doesn't exist). But, only on Icelake because Gen11 adds support for typed read messages for R8G8B8A8_UNORM. The lack of typed read messages for these formats was tricking the check on Gen9 to say no correctly. This caused some Icelake conformance failures, because we don't implement this feature. Just check for sample_count == 0 instead.	2019-07-22 17:30:13 -07:00
Kenneth Graunke	82607f8a90	egl: Only expose 565 pbuffer configs if X can export them as DRI3 images Glamor in xorg-server 1.20 cannot expose 16bpp pixmaps when running in the usual 24bpp mode. This meant our 565 pbuffer configs would ultimately fail to create a backing pixmap, leading to crashes. To hack around this, make a 16bpp pixmap and try and export it. If it works, expose the configs. Otherwise, just skip them. This also disables them on DRI2. These configs were only added to pass conformance requirements, and I doubt anybody cares about testing out 565 pbuffer visuals on DRI2-only drivers. v2: Don't leak the fds (caught by Eric Anholt) v3: Don't free(fds), it's not malloc'd Fixes: `dacb11a585` ("egl: Add a 565 pbuffer-only EGL config under X11.") Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-22 16:58:09 -07:00
Kenneth Graunke	6ad31c4ff3	egl: Make the 565 pbuffer-only config single buffered. In commit `dacb11a585`, Eric found the first matching 565 pbuffer config, and stopped. Our double-buffered configs come first in the list, so we added that, making a pbuffer-only config that claimed to be double buffered. This doesn't make sense, since pixmaps/pbuffers are fundamentally not double buffered. When using that config, every call to eglCreatePbufferSurface would fail with EGL_BAD_MATCH. The call chain looks like this: - eglCreatePbufferSurface - dri3_create_pbuffer_surface - dri3_create_surface - dri2_get_dri_config which eventually does: const bool double_buffer = surface_type == EGL_WINDOW_BIT; and then fails to find a matching config, because it ends up looking for a single-buffered config - and there aren't any. To fix this, make the 565 pbuffer config single-buffered. This fixes at least 51 dEQP-EGL.* tests. Fixes: `dacb11a585` ("egl: Add a 565 pbuffer-only EGL config under X11.") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-22 16:58:09 -07:00
Kenneth Graunke	fc21394bc4	egl: Quiet warning about front buffer rendering for pixmaps/pbuffers pbuffer configs cause a million of these warnings to trigger, but when using pixmaps or buffers, there is only one surface, so this warning doesn't make much sense. Retain it for window surfaces for now. Fixes: `dacb11a585` ("egl: Add a 565 pbuffer-only EGL config under X11.") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-22 16:58:09 -07:00
Kenneth Graunke	78164a3a6c	mesa: Fix ReadBuffers with pbuffers pbuffers are internally single-buffered. Marek fixed DrawBuffers to handle this case, but we need to fix ReadBuffers too. Otherwise, pretty much every conformance test fails because glReadPixels breaks. v2: Refactor the switch into a helper (suggested by Eric Anholt) Fixes: `35294f2eca` ("mesa: fix pbuffers because internally they are front buffers") Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-22 16:58:09 -07:00
Marek Olšák	c37df5feaa	mesa: fix assertion failure in TexImage Check the assertion after error checking. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111194 Fixes: `9dd1f7cec0` ("mesa: pass gl_texture_object as arg to not depend on state") Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-22 14:45:57 -07:00
Jason Ekstrand	5c5f11d1dd	nir: Remove a bunch of large stack arrays Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-22 16:17:18 -05:00
Jason Ekstrand	fa63fad333	intel/fs: Stop stack allocating large arrays Normally, we haven't worried too much about stack sizes as Linux tends to be fairly friendly towards large stacks. However, when running DXVK apps under wine, we're suddenly subject to Windows' more stringent stack limitations and can run out of space more easily. In particular, some of the shaders in Elite Dangerous: Horizons have quite a few registers and the arrays in split_virtual_grfs are large enough to blow a 1 MiB stack leading to crashes during shader compilation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108662 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2019-07-22 16:16:39 -05:00
Nataraj Deshpande	0661c357c6	egl/android: Update color_buffers querying for buffer age color_buffers[] is currently hard coded to 3 for android which fails in droid_window_dequeue_buffer when ANativeWindow creates color_buffers >3 while querying buffer age during dEQP partial_update tests on chromeOS. The patch removes static color_buffers[], queries for MIN_UNDEQUEUED_BUFFERS, sets native window buffer count and allocates the correct number of color_buffers as per android. Fixes dEQP-EGL.functional.partial_update* tests on chromebooks with enabling EGL_KHR_partial_update. v2: update comment instead of removing (Eric Engestrom) v3: change static array to dynamic allocated color_buffers querying MIN_UNDEQUEUED_BUFFERS (Chia-I Wu olv@chromium.org) Fixes: `2acc69da8c` "EGL/Android: Add EGL_EXT_buffer_age extension" Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com> Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-07-22 12:31:34 -07:00
Caio Marcelo de Oliveira Filho	0345aeeb40	intel/compiler: Use nir_opt_conditional_discard anv vkpipeline-db results for SKL: total instructions in shared programs: 3622461 -> 3611281 (-0.31%) instructions in affected programs: 396452 -> 385272 (-2.82%) helped: 2062 HURT: 1 total cycles in shared programs: 1458144669 -> 1458105320 (<.01%) cycles in affected programs: 4171830 -> 4132481 (-0.94%) helped: 1874 HURT: 180 total loops in shared programs: 2437 -> 2437 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 8745 -> 8748 (0.03%) spills in affected programs: 8 -> 11 (37.50%) helped: 1 HURT: 1 total fills in shared programs: 23392 -> 23395 (0.01%) fills in affected programs: 8 -> 11 (37.50%) helped: 1 HURT: 1 LOST: 0 GAINED: 1 No changes to shader-db on i965 or iris. The glsl compiler already does a similar optimization. Improvement suggested by Daniel Schürmann. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-22 09:33:48 -07:00
Alyssa Rosenzweig	d07c846546	pan/decode: Disable magic divisor debugging Memory corruption (for both legitimate and illegitimate reasons) causes this to hang pantrace. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:34:26 -07:00
Alyssa Rosenzweig	e8dca7e1e1	pan/midgard: Report spills:fills to shader-db Route this info through so we can track how we're doing on register spilling. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	055aa9b1f4	panfrost/midgard: Reenable pipeline register creation This was disabled to permit regression-free RA work. Now that the spill code is in place, we can reenable, with some caveats about efficacy. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	f0d0061b18	panfrost/midgard: Report tls_size Pipe through the number of bytes of spilled memory used from the compiler into the main driver, where it will be used to allocate the Thread Local Storage buffer. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	f1dcaa0df6	panfrost: Set `initialized` in more cases Indirect linear writes were not being marked as initialized, causing the back blit to be dropped, breaking the listed tests. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	9e3dc703ff	panfrost/ci: Update expectations We've fixed some shader tests. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	bc741599f2	panfrost/midgard: Promote to move, not rewrite for non-SSA Fixes promoted uniform loads to registers. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	40abf11708	panfrost/midgard: Dump MIR of RA failure Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	a08e9511e3	pan/midgard: Dump successor graph when printing MIR We just use the pointers of the midgard_block*, which is crude, but it gets the point across and will help debug successor related issues. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	1aa556de2e	pan/midgard: Remove debug statement Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	21510c253c	panfrost/midgard: Implement register spilling Now that we run RA in a loop, before each iteration after a failed allocation we choose a spill node and spill it to Thread Local Storage using st_int4/ld_int4 instructions (for spills and fills respectively). This allows us to compile complex shaders that normally would not fit within the 16 work register limits, although it comes at a fairly steep performance penalty. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	533d65786f	panfrost/midgard: Add mir_has_arg helper Helps scan the MIR for uses of an index. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	076838ef0c	panfrost/midgard: Check write-before-read in liveness analysis If we write to an index before reading it, the old copy we're checking liveness for isn't live in this block, even if it does get read later. Fixes abnormally high register pressure in shaders with loops. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	997f85c136	panfrost/midgard/disasm: Check for certain tag errors Midgard bundles contain a tag, as well as a copy of the tag of the next bundle to facilitate prefetch. Do some simple static analysis to detect certain tag errors (particularly on shaders without branching). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	d168b08d62	pan/midgard: Add OP_IS_CSEL helper Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	1f297471a0	pan/midgard: Add mir_rewrite_index_src_single helper Rather than rewriting an index away across the whole block, we expose finer (per-instruction) granularity for rewrites. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	16c8c354d0	pan/midgard: Ignore inline_constant in liveness It doesn't make any sense to look at it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	d155168e6c	panfrost/midgard: Implement load/store scratch opcodes These are used to load/store from Thread Local Storage, which is memory allocated per-thread (corresponding to ctx->scratchpad in the command stream) and used for register spilling. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	3bb780ecb9	pan/midg/disasm: Check for int varying ops Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	7e052d9332	pan/midgard: Remove "aliasing" It was a crazy idea that didn't pan out. We're better served by a good copyprop pass. It's also unused now. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	3174bc9972	panfrost: Promote uniform registers late Rather than creating either a load or a uniform register read with a fixed beginning offset, we always create a load and then promote to a uniform register later. This will allow us to promote in a register pressure aware manner. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig	aa03159120	pan/midgard: Call scheduler/RA in a loop This will allow us to insert instructions as a result of register allocation, permitting spilling to be implemented. As a side effect, with the assert commented out this would fix a bunch of glamor crashes (due to RA failures) so MATE becomes useable. Ideally we'll have scheduling or RA actually sorted out before the branch point but if not this gives us a one-line out to get X working... Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:33 -07:00
Alyssa Rosenzweig	1cabb8a706	pan/midgard: Remove custom register selection callback What we have is equivalent to the default callback; let's use that. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-22 08:20:33 -07:00
Samuel Pitoiset	b5116d3cb7	radv: fix crash in vkCmdClearAttachments with unused attachment depth_stencil_attachment and/or ds_resolve attachment can be NULL. This fixes crashes with dEQP-VK.renderpass.suballocation.unused_clear_attachments.* Cc: 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-22 14:25:54 +02:00
Sergii Romantsov	253be49402	i965: free object labels when deleting Some leaks detected with GL_KHR_debug on i965. CC: Timothy Arceri <t_arceri@yahoo.com.au> Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-22 12:39:32 +03:00
Samuel Pitoiset	915abbe932	radv/gfx10: update descriptors for inline uniform blocks Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-22 09:02:42 +02:00
Samuel Pitoiset	d76746c1ff	radv/gfx10: emit the GS NGG prologue before the nested barrier Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-22 09:02:39 +02:00
Samuel Pitoiset	8c97a07967	radv/gfx10: do not allocate space for the ZPASS_DONE bug GFX10 isn't affected. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-22 09:02:35 +02:00
Samuel Pitoiset	1fb7bd046b	radv/gfx10: do not set ELEMENT_SIZE for buffer descriptors This field doesn't exist. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-22 09:02:31 +02:00
Samuel Pitoiset	1878090b68	radv: clean up fill_geom_tess_rings() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-22 09:02:28 +02:00
Samuel Pitoiset	e7c356866e	radv: change a bunch of >= GFX9 to == GFX9 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-22 09:02:26 +02:00
Samuel Pitoiset	6049745b13	ac/nir: do not clamp shadow reference on GFX10 RadeonSI only uses Z32_FLOAT_CLAMP for upgraded depth textures on GFX10 and RADV doesn't promote Z16 or Z24. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-22 09:02:22 +02:00
Daniel Schürmann	64b7386ee8	radv: move nir_opt_conditional_discard out of optimization loop This late optimization pass is only affected by nir_opt_if() and handles all cases in a single pass. It's enough to call it once after the optimization loop. No changes on vkpipeline-db. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-22 08:12:18 +02:00
Iago Toral Quiroga	dacaf7ec06	v3d: fill logicop_func in the fragment shader key when precompiling shaders Since logicop_func 0 is PIPE_LOGIOP_CLEAR, we were triggering lowering of logic ops on precompiled shaders, which we don't want to do. Also, this had the side effect of making shader-db crash, as during this lowering we would try to read the color format swizzle information from the fragment shader key that we don't populate in precompiled shaders because right now we only need it when logic operations are enabled. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-22 08:05:59 +02:00
Jose Maria Casanova Crespo	9bf0bdf776	v3d: Avoid scheduling an instruction that stalls waiting for SFU retval If we detect that a scheduling candidate will stall because one of its register sources is written by the SFU unit in the previous instruction, we reduce its priority so that any non-stalling operation is chosen instead. The latency of SFU operations is defined as 2, so they are scheduled earlier if other candidates have the same priority. Finally, we won't merge an instruction that stalls into a previously chosen one, as it would have to wait an extra cycle for the result of the previous one. Although shader-db results show that instructions are hurt with an increase of 0.35%, the sum of instructions + stalls is reduced by 0.52%, and the total of sfu-stalls is reduced by 63.51%. It also implies a small increase in the max-temps metric because SFU operations are scheduled earlier. total instructions in shared programs: 9102719 -> 9117851 (0.17%) instructions in affected programs: 4324628 -> 4339760 (0.35%) helped: 4162 HURT: 12128 helped stats (abs) min: 1 max: 10 x̄: 1.28 x̃: 1 helped stats (rel) min: 0.09% max: 4.76% x̄: 0.66% x̃: 0.51% HURT stats (abs) min: 1 max: 27 x̄: 1.69 x̃: 1 HURT stats (rel) min: 0.05% max: 7.69% x̄: 0.87% x̃: 0.68% 95% mean confidence interval for instructions value: 0.90 0.96 95% mean confidence interval for instructions %-change: 0.47% 0.50% Instructions are HURT. total max-temps in shared programs: 1327728 -> 1327812 (<.01%) max-temps in affected programs: 4730 -> 4814 (1.78%) helped: 61 HURT: 134 helped stats (abs) min: 1 max: 2 x̄: 1.08 x̃: 1 helped stats (rel) min: 2.70% max: 13.33% x̄: 4.89% x̃: 4.17% HURT stats (abs) min: 1 max: 3 x̄: 1.12 x̃: 1 HURT stats (rel) min: 1.54% max: 20.00% x̄: 6.10% x̃: 5.26% 95% mean confidence interval for max-temps value: 0.28 0.58 95% mean confidence interval for max-temps %-change: 1.80% 3.52% Max-temps are HURT. 
total sfu-stalls in shared programs: 99551 -> 36324 (-63.51%) sfu-stalls in affected programs: 95029 -> 31802 (-66.53%) helped: 25882 HURT: 0 helped stats (abs) min: 1 max: 27 x̄: 2.44 x̃: 2 helped stats (rel) min: 5.26% max: 100.00% x̄: 79.86% x̃: 100.00% 95% mean confidence interval for sfu-stalls value: -2.47 -2.42 95% mean confidence interval for sfu-stalls %-change: -80.18% -79.54% Sfu-stalls are helped. total inst-and-stalls in shared programs: 9202270 -> 9154175 (-0.52%) inst-and-stalls in affected programs: 5618516 -> 5570421 (-0.86%) helped: 22728 HURT: 855 helped stats (abs) min: 1 max: 31 x̄: 2.16 x̃: 1 helped stats (rel) min: 0.07% max: 16.67% x̄: 1.14% x̃: 0.92% HURT stats (abs) min: 1 max: 5 x̄: 1.25 x̃: 1 HURT stats (rel) min: 0.12% max: 5.26% x̄: 1.24% x̃: 0.86% 95% mean confidence interval for inst-and-stalls value: -2.07 -2.01 95% mean confidence interval for inst-and-stalls %-change: -1.07% -1.05% Inst-and-stalls are helped. v2: Rename v3d_qpu_generates_sfu_stalls to v3d_qpu_instr_is_sfu (Eric) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-22 03:00:50 +02:00
Jose Maria Casanova Crespo	c341ab7ffb	v3d: add shader-db stat to count SFU stalls SFU operations have a latency of 2 cycles, so if their results are used in the cycle following an SFU instruction, the GPU stalls for an extra cycle until the result is available. This adds the number of stalls, and the sum of instructions + stalls, to the shader-db debug mode so that we can evaluate optimizations that schedule instructions to avoid generating sfu-stalls. v2: Rename v3d_qpu_generates_sfu_stalls to v3d_qpu_instr_is_sfu (Eric) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-22 03:00:50 +02:00
Eric Engestrom	f7224014df	radv: replace memset()+strcpy() with snprintf() Just like the next line :) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-21 10:38:17 +01:00
Eric Engestrom	29e8f15bdc	radv: drop unnecessary memset() before snprintf() snprintf() always terminates the string. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-21 10:38:17 +01:00
Bas Nieuwenhuizen	451f030c06	radv: Fix uninitialized warning. For es_vgpr_comp_cnt. Fixes: `795adbbadd` "radv/gfx10: Add pipeline state support for tess." Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-21 01:39:08 +02:00
Chia-I Wu	d31d25f634	virgl: fix a sync issue in virgl_buffer_transfer_extend In virgl_buffer_transfer_extend, when no flush is needed, it tries to extend a previously queued transfer instead if it can find one. Compared to virgl_resource_transfer_prepare, it fails to check if the resource is busy. The existence of a previously queued transfer normally implies that the resource is not busy, except perhaps when the transfer is PIPE_TRANSFER_UNSYNCHRONIZED. Rather than burdening us with a lengthy comment, and potential concerns over breaking it as the transfer code evolves, this commit makes the valid_buffer_range check the only condition to take the fast path. In the real world, we hit the fast path almost only because of the valid_buffer_range check. In micro benchmarks, the condition should always be true, otherwise the benchmarks are not very representative of meaningful workloads. I think this fix is justified. The recent change to PIPE_TRANSFER_MAP_DIRECTLY usage disables the fast path. This commit re-enables it as well. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-07-19 18:04:42 -07:00
Chia-I Wu	324c20304e	virgl: rework virgl_transfer_queue_extend Do not take a transfer and do the memcpy. Add a _buffer suffix to the function name to make it clear that it is only for buffers. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-07-19 18:04:37 -07:00
Chia-I Wu	2b8ad88078	virgl: fix virgl_buffer_transfer_extend Without setting hw_res, virgl_transfer_queue_extend never finds a match and always returns NULL. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-07-19 18:04:34 -07:00
Marek Olšák	bcabf75ab7	radeonsi: initialize scissor registers etc. without clear state Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:56 -04:00
Marek Olšák	47f41af06c	radeonsi: return success from vi_dcc_clear_level to simplify callers Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:54 -04:00
Marek Olšák	7a764b963a	radeonsi: fix compute-based culling regression in `1ce52c1e37` Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:50 -04:00
Marek Olšák	c741bed6e8	radeonsi/gfx10: fix VGT_PRIMITIVE_TYPE programming Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	a0d330bedb	radeonsi/gfx10: enable Wave32 for vertex, geometry, and tessellation shaders Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	1d82240f55	radeonsi/gfx10: add debug options to enable/disable Wave32 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	8f72f137ad	radeonsi/gfx10: add as_ngg variant for TES as ES to select Wave32/64 Legacy GS has to use Wave64, so TES before GS has to use Wave64 too. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	88efb63caf	radeonsi/gfx10: implement Wave32 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	54e6900ede	radeonsi/gfx10: use 32-bit wavemasks for Wave32 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	81091a5183	ac: create the LLVM builder in ac_llvm_context_init Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	eb54b8c222	ac: create the LLVM module for Wave32 or Wave64 in ac_llvm_context_init Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	921c1d24d5	ac/rtld: add support for Wave32 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	73aa04e40d	ac: add Wave32 LLVM target machine Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	9e467d111b	ac: initial Wave32 support in LLVM build helpers Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	c35e926a81	radeonsi: assume that selector != NULL for compute shaders Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:48 -04:00
Marek Olšák	bf0f0697a1	radeonsi: remove what appears to be legacy compute code Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:47 -04:00
Marek Olšák	be67a275b5	radeonsi: remove si_program::use_code_object_v2 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:45 -04:00
Marek Olšák	fd92e65feb	radeonsi: add si_shader_selector into si_compute Now we can assume that shader->selector is always set. This will simplify some code. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:43 -04:00
Marek Olšák	e2c8ff009e	radeonsi: set threadgroup size to 0 for threadgroups with only 1 wave This has no effect on Wave64. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:39 -04:00
Marek Olšák	a8a526c5cb	radeonsi/gfx10: set as_ngg for GS prolog as_ngg is required by Wave32. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	d3a80f2dda	radeonsi/gfx10: remove the disable_ngg option because legacy VS hangs. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	0f30223cf4	radeonsi/gfx10: combine hw edgeflags with user edgeflags for correct behavior Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	bfaca7259c	radeonsi/gfx10: deduplicate code for esvert_lds_size Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	a6722285c2	radeonsi/gfx10: simplify a streamout loop in gfx10_emit_ngg_epilogue Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	2683347ba0	radeonsi/gfx10: don't use MALLOC for outputs Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	1b4354dab9	radeonsi/gfx10: clean up ESGS ring size computation Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	37db9d2865	radeonsi/gfx10: fix unnecessary LDS overallocation for NGG GS Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	985a59e0d1	radeonsi/gfx10: don't compile the GS copy shader if it's 100% not needed Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	7f0ada3f3e	radeonsi/gfx10: set GE_CTNL.PACKET_TO_ONE_PA for NGG Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	e08463ac22	radeonsi/gfx10: update a tunable max_es_verts_base for NGG We have to fix the computation so as not to break quads. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	79d56e6a4a	radeonsi/gfx10: implement ARB_post_depth_coverage Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	a57f0f8a6b	radeonsi: fix leaked compute shader NIR Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:37 -04:00
Marek Olšák	98377d3450	radeonsi: save the enable_nir option in the shader cache correctly Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:35 -04:00
Marek Olšák	d227b91d2e	radeonsi/gfx10: enable SDMA no changes since gfx9 for buffers Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	47dee97329	ac: use llvm.amdgcn.writelane Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Marek Olšák	39d0c68321	ac: fix shader clock on LLVM 9 Probably relevant commit: commit dd32dc3f72ec99b1794d62c74d2beb3b60468d50 Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> Date: Tue Jul 9 03:10:18 2019 +0000 [AMDGPU] Always use s_memtime for readcyclecounter Differential Revision: https://reviews.llvm.org/D64369 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365431 91177308-0d34-0410-b5e6-96231b3b80d8 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Boyuan Zhang	26099bc35d	radeon/vcn: adding engine type for new fw interface Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:33 -04:00
Marek Olšák	936e9fa951	radeonsi: use the correct buffer size in si_vid_clear_buffer Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 20:16:19 -04:00
Pierre-Eric Pelloux-Prayer	b1efc9d05f	mesa: add EXT_dsa glEnabledIndexedEXT The implementation uses _mesa_ActiveTexture to change the active texture unit and then reset it. It causes an unnecessary _NEW_TEXTURE_STATE but: - adding an index argument to _mesa_set_enable causes a lot of changes (~140 callers) - enable_texture (called by _mesa_set_enable) might cause a _NEW_TEXTURE_STATE anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-19 20:04:07 -04:00
Pierre-Eric Pelloux-Prayer	ff0cafc8f3	mesa: add EXT_dsa glGetTextureLevelParameter*vEXT functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-19 20:04:06 -04:00
Pierre-Eric Pelloux-Prayer	5fb9c9d628	mesa: add EXT_dsa gl(Copy)Texture(Sub)Image1D/2D/3DEXT functions Added functions: - glTextureImage1DEXT - glTextureImage2DEXT - glTextureImage3DEXT - glTextureSubImage1DEXT - glTextureSubImage3DEXT - glCopyTextureImage1DEXT - glCopyTextureImage2DEXT - glCopyTextureSubImage1DEXT - glCopyTextureSubImage2DEXT - glCopyTextureSubImage3DEXT - glGetTextureImageEXT All but the last one can be compiled in a display list. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-19 20:04:03 -04:00
Pierre-Eric Pelloux-Prayer	f8ad95c45f	mesa: move lookup_texture_ext_dsa up in teximage.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-19 20:04:01 -04:00
Pierre-Eric Pelloux-Prayer	9dd1f7cec0	mesa: pass gl_texture_object as arg to not depend on state This will allow to use the same functions for EXT_dsa implementation. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-19 20:03:57 -04:00
Pierre-Eric Pelloux-Prayer	0d8826f723	mesa: refactor get_texture_image to remove duplicate code Move shared code in a new function (_get_texture_image) and use it instead of duplicating the same lines. Will be also used by the EXT_dsa functions (GetTextureImageEXT and GetMultiTexImageEXT). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-19 20:03:40 -04:00
Jeremy Newton	666ea30017	pipe-loader: use radeonsi for MM if amdgpu dri is used The amdgpu dri is used for the closed source AMD driver. Since this driver does not implement multimedia, we fall back to radeonsi in mesa to do multimedia. This corrects the dri driver name for when it is set to amdgpu. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1) Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-07-19 19:59:02 -04:00
Eric Engestrom	1a25980c46	egl: drop incorrect pkg-config file for glvnd With `b01524fff0` ("meson: don't build libGLES.so with GLVND") we dropped the incorrect pkg-config files for GLES. Since then, the glvnd issue of its missing files has become painfully apparent, since it breaks the build for everyone using glvnd. NVIDIA has had a fix for a few years now, but has yet to accept it: https://github.com/NVIDIA/libglvnd/pull/86 Since the breakage is already there, let's clean up everything on our side while we wait for NVIDIA to accept the fix. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-20 00:07:06 +01:00
Eric Engestrom	e8febd6cba	docs: simplify `Fixes:` git command Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-19 22:24:28 +00:00
Eric Engestrom	0e34e1a0ce	mesa/tests: add missing dep_thread Fixes: `f8c27c2775` ("state_tracker: Move the format test out to be an actual unit test.") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-07-19 23:03:42 +01:00
Eric Engestrom	6f8b5872ab	util: drop strncat(), strcmp(), strncmp(), snprintf() & vsnprintf() MSVC fallbacks It would seem MSVC>=2015 is now C99-compliant wrt these functions: strncat: https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/strncat-strncat-l-wcsncat-wcsncat-l-mbsncat-mbsncat-l?view=vs-2017 strcmp: https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/strcmp-wcscmp-mbscmp?view=vs-2017 strncmp: https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/strncmp-wcsncmp-mbsncmp-mbsncmp-l?view=vs-2017 snprintf: https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/snprintf-snprintf-snprintf-l-snwprintf-snwprintf-l?view=vs-2017 vsnprintf: https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/vsnprintf-vsnprintf-vsnprintf-l-vsnwprintf-vsnwprintf-l?view=vs-2017 Suggested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 22:39:38 +01:00
Eric Engestrom	085c3abf27	util: use standard name for vsnprintf() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 22:39:38 +01:00
Eric Engestrom	dffeaa55dd	util: use standard name for snprintf() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 22:39:38 +01:00
Eric Engestrom	00e23cd969	util: use standard name for vasprintf() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 22:39:38 +01:00
Eric Engestrom	59c2dd1b8c	util: use standard name for sprintf() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 22:39:38 +01:00
Eric Engestrom	321d971b08	util: use standard name for strcmp() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 22:39:38 +01:00
Eric Engestrom	7abc739696	util: use standard name for strcasecmp() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 22:39:38 +01:00
Eric Engestrom	88ddb2e186	util: use standard name for strncmp() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 22:39:38 +01:00
Eric Engestrom	27b9eea557	util: use standard name for strncat() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 22:39:38 +01:00
Eric Engestrom	3ba199abd1	util: use standard name for strdup() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 22:39:38 +01:00
Eric Engestrom	09a8a39940	util: use standard name for strchrnul() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 22:39:38 +01:00
Eric Engestrom	367bb55c17	util: drop unused vsprintf() wrapper Suggested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 22:39:38 +01:00
Eric Engestrom	e7db1806af	util: drop unused strchr() wrapper Suggested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 22:39:38 +01:00
Eric Engestrom	84e85035cf	util: drop unused strstr() wrapper Suggested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 22:39:38 +01:00
Jason Ekstrand	6301f80b84	nir: Only rematerialize comparisons with all SSA sources Otherwise, you may end up moving a register read and that could result in an incorrect shader. This commit fixes a rendering issue in Elite: Dangerous. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111152 Fixes: `3ee2e84c60` "nir: Rematerialize compare instructions" Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-07-19 19:45:36 +00:00
Daniel Schürmann	e352b4d650	spirv: Fix order of barriers in SpvOpControlBarrier Semantically, the memory barrier has to come first to wait for the completion of pending memory requests. Afterwards, the workgroups can be synchronized. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-19 10:37:37 -07:00
Caio Marcelo de Oliveira Filho	4061a3f6c9	nir: use a switch when printing intrinsic indices Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-07-19 10:04:52 -07:00
Rhys Perry	e8644122ed	nir/algebraic: mark a few comparison simplifications as precise No vkpipeline-db changes found. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-07-19 16:33:01 +00:00
Rhys Perry	79801b9d7d	nir/algebraic: optimize contradictory iand operands Some of these were found in a few GTAV, Rise of the Tomb Raider and Shadow of the Tomb Raider shaders. Results from vkpipeline-db run with ACO: Totals from affected shaders: SGPRS: 376 -> 376 (0.00 %) VGPRS: 220 -> 220 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 13492 -> 11560 (-14.32 %) bytes LDS: 6 -> 6 (0.00 %) blocks Max Waves: 69 -> 69 (0.00 %) Wait states: 0 -> 0 (0.00 %) v2: use False instead of 0 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-07-19 16:33:01 +00:00
Erico Nunes	32ced14bad	lima/ppir: handle all node types in ppir_node_replace_child ppir_node_replace_child is used by the const lowering routine in ppir. All types need to be handled here, otherwise the src node is not updated properly when one of the lowered nodes is a const, which results in, for example, regalloc not assigning registers correctly. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-07-19 16:01:45 +00:00
Erico Nunes	2292f0c4b5	lima/ppir: branch regalloc fixes The branch instruction has sources which must be handled in src handling paths so that regalloc assigns registers to them properly. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-07-19 16:01:45 +00:00
Yevhenii Kolesnikov	32b72cbca5	main: Destroy static hash table format_array_format_table has a static lifetime - it will be destroyed by an atexit handler. Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 11:22:55 +03:00
Dave Airlie	248161123c	radv: reset the window scissor with no clear state. If we don't have clear state (which gfx10 doesn't currently) we need to reset the scissor explicitly. AMDVLK will leave it set to something else. Marek also has this fix for radeonsi pending. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 11:00:44 +10:00
Dave Airlie	2ac2b98780	radv: fix crash in shader tracing. Enabling tracing, and then having a vmfault, can lead to a segfault before we print out the traces, as if a meta shader is executing and we don't have the NIR for it. Just pass the stage and give back a default. Fixes: `9b9ccee4d6` ("radv: take LDS into account for compute shader occupancy stats") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-19 11:00:25 +10:00
Timothy Arceri	80c2c17e1e	iris: change last_vue_stage() to look at uncompiled shaders This allows us to find the last vue stage before we have compiled the shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-19 09:25:47 +10:00
Timothy Arceri	30038dd5ec	nir/lower_clip: add support for geometry shaders This will be used to enable compat profile support for geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-19 09:25:47 +10:00
Timothy Arceri	4b08bb4770	nir/lower_clip: add lower_clip_outputs() helper This will be reused in the following patch to add support for clip vertex lowering in geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-19 09:25:47 +10:00
Timothy Arceri	a59926b3ca	nir/lower_clip: add create_clipdist_vars() helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-19 09:25:47 +10:00
Timothy Arceri	e38b930876	nir/lower_clip: add a find_clipvertex_and_position_outputs() helper This will allow code sharing in a following patch that adds support for lowering in geometry shaders. It also allows us to exit early if there is no lowering to do which allows a small code tidy up. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-19 09:25:47 +10:00
Alyssa Rosenzweig	0395b58c92	panfrost: Set rt_count This doesn't quite work yet, but it illustrates how MRT is implemented in the MFBD: rt_count is set appropriately based on the number of render targets, while additional render target descriptors are appended on with an index variable in them (not quite decoded since there's some aspects we don't understand there, but conceptually this should be right). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig	871ad7789f	panfrost: Trace invisible BOs Helps make the decode a little more readable (names instead of addresses). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig	17752bae8e	panfrost/decode: Preserve empty tiler heap symmetry If tiler_heap_end == tiler_heap_start, ensure it's printed the same rather than one erroring out as hex. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig	e797caa0dd	panfrost: Zero polygon list body size for clears There's no polygons, so you can't have any size to the polygon list, although there is a minimal header. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig	f475b79980	panfrost/mfbd: Unify depth-only with masked FBO path Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig	629c7366a7	panfrost: Simplify set_framebuffer_state Most of the ad hoc logic is already in Gallium. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig	227c395c00	panfrost: Check for NULL surface in places Fixes a bunch of NULL dereferences, although it does cause GPU faults of course. This is caused by color buffers masked out in MRT, which we'll eventually have to solve the right way... one thing at a time. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig	79b13b4376	panfrost: Expose 4 render targets Hidden behind deqp flag as usual. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 15:25:40 -07:00
Alyssa Rosenzweig	d56f92502e	panfrost: Shrink tiler heap 128MB is excessive and 16MB is still plenty. Saves 112MB/context on kernels without growable/heap support. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 15:25:16 -07:00
Caio Marcelo de Oliveira Filho	b6d4753568	nir/large_constants: De-duplicate constants If a function has a constant and is called more than once, after inlining we may end up with different variables representing the same constant. This commit looks into the data and de-duplicates them. The first pass now will collect the constant data in a per variable buffer, then de-duplication happens (by sorting then linear walk), and the second pass will use the data in var->data.location. One side-effect of the current implementation is that constants will be reordered. If this turns out to be a problem, it is something that can be fixed. An alternative strategy considered was to perform this on a per-function basis and then merge the results; the problem is that we would have to fix up the offsets during the merge. Given the data we have, the current patch is good enough. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-18 12:24:24 -07:00
Caio Marcelo de Oliveira Filho	d9b67ad079	nir/large_constants: Use ralloc for var_infos This will be used later on to allocate constant data for each variable (and then deduplicate). Also drop initializing found_read, as it is already implicitly false in the literal. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-18 12:24:24 -07:00
Eric Anholt	0d8a4c67cf	freedreno: Convert nir_lower_tg4_to_tex to the NIR lowering helper. Cuts a bunch of boilerplate. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-18 11:28:56 -07:00
Eric Anholt	56f4ede73d	freedreno: Convert load_barycentric_at_sample to the NIR lowering helper. Cuts out a ton of boilerplate. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-18 11:28:56 -07:00
Eric Anholt	61098baf42	freedreno: Convert load_barycentric_at_offset to the NIR lowering helper. Cuts out a ton of boilerplate. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-18 11:28:56 -07:00
Eric Anholt	cdc359c58e	v3d: Use nir_shader_lower_instructions() for txf_ms lowering. Cuts out a bunch of boilerplate. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-07-18 11:28:56 -07:00
Eric Anholt	251c64a53d	nir: Allow internal changes to the instr in nir_shader_lower_instructions(). v3d's NIR txf_ms lowering wants to swizzle around the input coordinates in NIR, but doesn't generate a new txf_ms instruction as replacement. It's pretty easy to allow that in nir_shader_lower_instructions, and it may be common in lowering passes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-18 11:28:56 -07:00
Eric Anholt	c0640035fb	vc4: Convert vc4_nir_lower_txf_ms to nir_shader_lower_instructions(). Cuts out a bunch of boilerplate. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-07-18 11:28:56 -07:00
Eric Anholt	40e7609603	v3d: Fix assertion failures in debug builds. nir_lower_io leaves around deref_var instructions after lowering away deref intrinsics. This ends up breaking validation after v3d_nir_lower_io removes variables not actually being stored by the shader's store_output()s. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-07-18 11:28:56 -07:00
Alyssa Rosenzweig	1bced0fad2	panfrost: Handle Z24 textures Just use the Z32 code. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 10:42:43 -07:00
Alyssa Rosenzweig	f29c084960	panfrost/ci: Update expectations We just fixed some stencil tests. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 10:42:43 -07:00
Alyssa Rosenzweig	fad76470d5	panfrost: Make scissor test more robust See v3d implementation. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 10:42:43 -07:00
Alyssa Rosenzweig	5c554e235d	panfrost: Use correct NO_DITHER field on MFBD Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 10:42:43 -07:00
Alyssa Rosenzweig	676b9339dd	panfrost: Implement Z32F(_S8) support Z32F uses a dedicated float path. Z32F_S8 uses separate stencil planes in the hardware, lowered via u_transfer_helper. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 10:42:43 -07:00
Alyssa Rosenzweig	479185a1cd	panfrost/decode: Don't disassemble NULL shaders It is legal to load a shader from a NULL address, particularly when the TILER job is used strictly for effects on the Z/S buffer with 0x0 color mask. Don't crash the decoder in this case. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 10:42:43 -07:00
Alyssa Rosenzweig	65d89097b8	panfrost: Copy stencil front to back if back disabled When backside stenciling is disabled, backfacing primitives just do the same thing as frontfacing primitives. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-18 10:42:43 -07:00
Jan Zielinski	6f7306c029	swr/rast: Refactor memory API between rasterizer core and swr This commit cleans up API between the core of the rasterizer and swr. Some formatting changes are also done. Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-07-18 16:17:00 +02:00
Andreas Baierl	4627a0c4eb	lima/ppir: Add gl_PointCoord handling Treat gl_PointCoord as a system value and add the necessary bits for correct codegen. Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-18 13:20:39 +00:00
Andreas Baierl	3523233027	gallium: Add PIPE_CAP_TGSI_FS_POINT_IS_SYSVAL This adds an option to treat gl_PointCoord as a system value. Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-18 13:20:39 +00:00
Andreas Baierl	3349a60f6f	nir/tgsi: Extend tgsi_to_nir.c to support gl_PointCoord as a system value. Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-18 13:20:39 +00:00
Andreas Baierl	f5804f1768	nir: Add gl_PointCoord system value gl_PointCoord handling needs some special bits set in lima/ppir code generation. Treating gl_PointCoord as a system value makes it easier to distinguish from a regular varying. Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-18 13:20:39 +00:00
Andreas Baierl	24af57407c	glsl: Optionally declare gl_PointCoord as a system value Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-18 13:20:39 +00:00
Connor Abbott	b178fdf486	lima/gp: Fix problem with complex moves When writing the scheduler, we forgot that you can't read the complex unit in certain sources because it gets overwritten to 0 or 1. Fixing this turned out to be possible without giving up and reducing GPIR_VALUE_REG_NUM to 10, although it was difficult in a way I didn't expect. There can be at most 4 next-max nodes that can't have moves scheduled in the complex slot, so it actually isn't a problem for getting the number of next-max nodes at 5 or lower. However, it is a problem for stores. If a given node is a next-max node whose move cannot go in the complex slot and is used by a store that we decide to schedule, we have to reserve one of the non-complex slots for a move instead of all the slots, or we can wind up in a situation where only the complex slot is free and we fail the move. This means that we have to add another term to the reservation logic, for stores whose children cannot be in the complex slot. Acked-by: Qiang Yu <yuq825@gmail.com>	2019-07-18 14:33:23 +02:00
Connor Abbott	54434fe670	lima/gpir: Rework the scheduler Now, we do scheduling at the same time as value register allocation. The ready list now acts similarly to the array of registers in value_regalloc, keeping us from running out of slots. Before this, the value register allocator wasn't aware of the scheduling constraints of the actual machine, which meant that it sometimes chose the wrong false dependencies to insert. Now, we assign value registers at the same time as we actually schedule instructions, making its choices reflect reality much better. It was also conservative in some cases where the new scheme doesn't have to be. For example, in something like: 1 = ld_att 2 = ld_uni 3 = add 1, 2 It's possible that one of 1 and 2 can't be scheduled in the same instruction as 3, meaning that a move needs to be inserted, so the value register allocator needs to assume that this sequence requires two registers. But when actually scheduling, we could discover that 1, 2, and 3 can all be scheduled together, so that they only require one register. The new scheduler speculatively inserts the instruction under consideration, as well as all of its child load instructions, and then counts the number of live value registers after all is said and done. This lets us be more aggressive with scheduling when we're close to the limit. With the new scheduler, the kmscube vertex shader is now scheduled in 40 instructions, versus 66 before. Acked-by: Qiang Yu <yuq825@gmail.com>	2019-07-18 14:33:23 +02:00
Connor Abbott	12645e8714	lima/gp: Mark more add-only nodes as maybe-two-slot Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-07-18 14:33:23 +02:00
Connor Abbott	16de3dd7a6	lima/gpir: Fix some bugs in instruction handling Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-07-18 14:33:23 +02:00
Connor Abbott	cc78a42577	lima: Reintroduce the standalone compiler I used this to test things without needing to have a device handy. Acked-by: Qiang Yu <yuq825@gmail.com>	2019-07-18 14:33:23 +02:00
Connor Abbott	4423552ff0	nir/lower_viewport: Check variable mode first The location is unused for shader_temp and function_temp variables, and due to the way nir_lower_io_to_temporaries demotes shader_out variables to shader_temp variables, it happened to equal VARYING_SLOT_POS for the gl_Position temporary, which made this pass fail with the offline compiler due to this coming before vars_to_ssa. Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-07-18 14:21:41 +02:00
Samuel Pitoiset	6e5e4bf050	radv/gfx10: set BREAK_WAVE_AT_EOI if TES or GS enable the primitive ID Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-18 10:37:10 +02:00
Samuel Pitoiset	8c692ff512	radv/gfx10: move emitting VGT_PRIMITIVEID_EN into the NGG path And do not emit VGT_GS_MODE which is unnecessary on GFX10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-18 10:36:38 +02:00
Samuel Pitoiset	8315dbe419	radv/gfx10: do not always execute a barrier before the second shader With NGG, empty waves may still be required to export data. This fixes dEQP-VK.ycbcr.format._unorm.geometry_. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-18 10:06:34 +02:00
Samuel Pitoiset	63d670e350	radv: fix VGT_GS_MODE if VS uses the primitive ID Found by inspection. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-18 10:03:12 +02:00
Iago Toral Quiroga	c23fa1ca07	v3d: emit correct lowering for logic operations with MSAA render targets v2: - Drop the writemask from the per-sample color intrinsic (Eric) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-18 08:59:35 +02:00
Iago Toral Quiroga	93d05c1c1f	v3d: handle nir_intrinsic_store_tlb_sample_color_v3d v2: - Move handling of output intrinsics to ntq_emit_intrinsic() (Eric). Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-18 08:59:35 +02:00
Iago Toral Quiroga	50016d7718	nir: add a V3D-specific intrinsic for per-sample color writes For per-sample color writes we need the output intrinsic to pack the sample index, which is not provided with regular store_output intrinsics unless we figured out a way to encode it into the base or the offset. v2: - Drop the writemask (Eric) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-18 08:59:35 +02:00
Iago Toral Quiroga	ba520b00c4	v3d: implement per-sample tlb color writes Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-18 08:59:35 +02:00
Iago Toral Quiroga	b96c2219ca	v3d: refactor the tlb color write code We want to split the tlb specifier setup from the color writes, because when we implement per-sample color writes we want to do the latter for all the samples, but the former only once. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-18 08:59:35 +02:00
Iago Toral Quiroga	fd3ec6f55d	v3d: move tlb color write emission to a helper function We will soon be adding per-sample color writes which means additional complexity and more indentation (we will need another loop to emit the writes for each individual sample), so this will help keeping things simple and a bit more readable. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-18 08:59:35 +02:00
Iago Toral Quiroga	0c9919710e	v3d: implement per-sample tlb color reads Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-18 08:59:35 +02:00
Lionel Landwerlin	3adc32df92	anv: fix format mapping for depth/stencil formats anv_format is supposed to have a pointer back to the associated VkFormat; we were missing this for depth/stencil formats. This doesn't fix anything afaict, but will be needed for future changes. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `465de47bad` ("anv: associate vulkan formats with aspects") Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-18 09:40:01 +03:00
Dave Airlie	a68f593a0e	radv: put back VGT_FLUSH at ring init on gfx10 I can find no evidence that removing this is a good idea. Fixes: `9b116173b6` ("radv: do not emit VGT_FLUSH on GFX10") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-18 16:24:44 +10:00
Gert Wollny	45951452aa	softpipe: Clamp border colors when needed unorm and snorm require that the border color values are clamped, so when picking the sampler view copy/clamp the border color from the sampler and use these adjusted values. Fixes: dEQP-GLES31.functional.texture.border_clamp.range_clamp.linear_compressed_color dEQP-GLES31.functional.texture.border_clamp.range_clamp.linear_snorm_color dEQP-GLES31.functional.texture.border_clamp.range_clamp.linear_srgb_color dEQP-GLES31.functional.texture.border_clamp.range_clamp.linear_unorm_color dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_compressed_color dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_snorm_color dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_srgb_color dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_color dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth_uint_stencil_sample_depth Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-18 05:49:00 +02:00
Gert Wollny	230b99ce2f	softpipe: set a lower minimum clamp value for texture coordinate border clamp The value of -0.5f is not small enough to produce negative coordinates, so lower the minimum clamp value to -1.0f. This fixes a number of tests from dEQP-GLES31.functional.texture.border_clamp.* Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-18 05:47:23 +02:00
Gert Wollny	eae4c6df8d	softpipe: Correct repeat-mirror evaluation When mirroring the texture coordinates the indices must be mirrored as well and the half pixel shift must be applied in reverse. Fixes a number of tests from: dEQP-GLES31.functional.texture.gather.offset.* dEQP-GLES31.functional.texture.gather.offsets.* Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-18 05:47:23 +02:00
Gert Wollny	fff624fca4	softpipe: Also mark textures as dirty when updating the framebuffer state At this point all the draw caches are flushed to the old attached textures, so the read caches of these textures will need to be updated too. Fixes: dEQP-GLES3.functional.fbo.color.repeated_clear.sample.tex2d.* Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-18 05:33:59 +02:00
Jonathan Marek	08514a9721	etnaviv: set DITHER_MODE This fixes a rendering glitch observed in SDL testscale test, where alpha blending samples with value (1.0, 1.0, 1.0, 0.0) whitens the target instead of having no effect. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-17 23:07:50 -04:00
Jonathan Marek	aaf0c47c76	etnaviv: update headers from rnndb Update to etna_viv commit a16a418. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-17 23:07:50 -04:00
Jonathan Marek	76adf041f2	etnaviv: fix blend color on newer GPUs Newer GPUs use the half float ALPHA_COLOR_EXT register. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-17 23:07:50 -04:00
Jonathan Marek	5f73726013	etnaviv: fix alpha blending cases We need to check rgb_func/alpha_func when determining if blend or separate alpha is required. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-17 23:07:35 -04:00
Jonathan Marek	6c3c05dc38	etnaviv: fix polygon offset Dividing the fui result by 65535 is obviously wrong, and from testing, on GC7000L at least there is no division by 65535. Fixes dEQP-GLES2.functional.polygon_offset.fixed16_displacement_with_units Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-17 23:07:07 -04:00
Timothy Arceri	a20a9d0c5e	radv: dont store disasm string unless keep_shader_info flag set This fixes the memory use regression from bug 111107. Fixes: `726a31df70` ("radv: Add the concept of radv shader binaries.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111107	2019-07-18 00:25:55 +00:00
Dave Airlie	82a2f10529	radv/gfx10: set the pgm rsrc3/4 regs using index sh reg set This is ported from AMDVLK, it's probably not required unless we want to use "real time queues", but it might be nice to just have in place. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-18 10:24:26 +10:00
Dave Airlie	de524b2c37	radv: use correct register setter for ngg hw addr this shouldn't matter, but it's good to be correct. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-18 10:17:37 +10:00
Eric Anholt	9689407c54	freedreno/a6xx: Drop the WFI in the program update stateobj. Rob Clark thinks this was likely a workaround for our const buffer update bugs, and now that it's passing tests, we should be able to drop it. renderdoc-traces results: traces/android/clashofclans.rdc: +6.1% +/- 1.1% traces/android/candycrush.rdc: +5.2% +/- 1.6% Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-07-17 16:20:12 -07:00
Eric Anholt	2170822603	freedreno/a6xx: Drop the WFI in constant uploads. Now that the bin vs render constlen is fixed, we can skip these waits. Improves webgl aquarium performance at 10k fish from 27fps to 33. Some highlights from renderdoc-traces: traces/android/minecraft.rdc: +17.1% +/- 3.4% traces/glmark2/ideas-speed=duration.rdc: +11.6% +/- 2.4% traces/android/candycrush.rdc: +5.4% +/- 1.1% traces/android/clashofclans.rdc: +4.4% +/- 1.3% Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-07-17 16:20:12 -07:00
Eric Anholt	85bbdaff6c	freedreno: Assert that we don't exceed constlen. We actually could go up to vs->constlen in the binning shader on a6xx, but for sanity let's make sure that we're always under constlen. This would have caught the bug fixed in `572c76fd88` ("freedreno: Clamp UBO uploads to the constlen decided by the shader.") Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-07-17 16:20:12 -07:00
Eric Anholt	bc50ecfa7a	freedreno: Fix more constlen overflows. Fixes constlen overflow in dEQP-GLES31.functional.shaders.builtin_var.compute.num_work_groups and dEQP-GLES31.functional.image_load_store.buffer.image_size.readonly_32 and probably others. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-07-17 16:20:12 -07:00
Eric Anholt	b9f7f3e497	freedreno: Drop stale comment about skipping uploads. We already skip the upload if it's unused, due to the constlen > offset check. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-07-17 16:20:12 -07:00
Lepton Wu	6109df58e4	virgl: Set meta data for textures from handle. The set of meta data was removed by commit `8083464`. It broke lots of dEQP tests when running with pbuffer surface type. Fixes: `8083464013` ("virgl: remove dead code") Signed-off-by: Lepton Wu <lepton@chromium.org> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-07-17 16:17:48 -07:00
Bas Nieuwenhuizen	f1a8967344	radv: Only save the descriptor set if we have one. After reset, if valid does not contain the relevant bit the descriptor can be != NULL but still not be valid. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-18 00:49:43 +02:00
Lionel Landwerlin	ce4c5474af	anv: report timestampComputeAndGraphics true Spec says : "timestampComputeAndGraphics specifies support for timestamps on all graphics and compute queues. If this limit is set to VK_TRUE, all queues that advertise the VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT in the VkQueueFamilyProperties::queueFlags support VkQueueFamilyProperties::timestampValidBits of at least 36." On gen7+ this should be true (we only have 32bits of timestamp on gen6 and below). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `802f00219a` ("anv/device: Update features and limits") Reported-by: Timothy Strelchun <timothy.strelchun@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-17 22:46:58 +00:00
Rafael Antognolli	393f659ed8	iris: Enable fast clears on other miplevels and layers than 0. Until now we only supported fast clear colors on the first miplevel and layer. The main reason for it is that we can't have different fast clear values at different levels/layers, since the surface state only supports one clear value. We can, however, enable it if we make sure we only use the same value for all levels/layers, and if one of them changes, we resolve all the others. We already do that for depth fast clears so hopefully it will be fine for color fast clears too. v2: Add check for partial clear too (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-17 14:53:37 -07:00
Rafael Antognolli	8bbd4f32bf	iris: Allow resolving clear color of CCS_D surfaces. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-17 14:53:16 -07:00
Kenneth Graunke	df4c2ec5e1	iris: Make iris_has_color_unresolved non-static We want to use this in the transfer code and possibly for fast clears.	2019-07-17 13:43:04 -07:00
Andreas Bergmeier	f92290a8d9	broadcom: Move v3d_get_device_info to common In common we can use implementation for Vulkan.	2019-07-17 20:02:34 +00:00
Caio Marcelo de Oliveira Filho	891a232214	nir/large_constants: Use dominance information to find more constants Relax the restriction that all the writes need to be in the first block: now accept variables that have all the writes in the same block, and all the reads are dominated by that block. This lets the pass identify large constants that are local to a helper function. The writes will be at the place that the function is inlined, possibly not in the first block (but still all in the same block). Results for vkpipeline-db in SKL: total instructions in shared programs: 3624891 -> 3623145 (-0.05%) instructions in affected programs: 79416 -> 77670 (-2.20%) helped: 16 HURT: 0 total cycles in shared programs: 1458149667 -> 1458147273 (<.01%) cycles in affected programs: 30154164 -> 30151770 (<.01%) helped: 14 HURT: 2 total loops in shared programs: 2437 -> 2437 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 8813 -> 8745 (-0.77%) spills in affected programs: 2894 -> 2826 (-2.35%) helped: 8 HURT: 0 total fills in shared programs: 23470 -> 23392 (-0.33%) fills in affected programs: 12248 -> 12170 (-0.64%) helped: 6 HURT: 2 LOST: 0 GAINED: 0 Results for shader-db in SKL with Iris: total instructions in shared programs: 15379442 -> 15379392 (<.01%) instructions in affected programs: 837 -> 787 (-5.97%) helped: 2 HURT: 2 helped stats (abs) min: 27 max: 27 x̄: 27.00 x̃: 27 helped stats (rel) min: 10.47% max: 10.67% x̄: 10.57% x̃: 10.57% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 1.23% max: 1.23% x̄: 1.23% x̃: 1.23% 95% mean confidence interval for instructions value: -39.14 14.14 95% mean confidence interval for instructions %-change: -15.51% 6.17% Inconclusive result (value mean confidence interval includes 0).
total loops in shared programs: 4880 -> 4880 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 370677237 -> 370676567 (<.01%) cycles in affected programs: 17852 -> 17182 (-3.75%) helped: 2 HURT: 1 helped stats (abs) min: 338 max: 356 x̄: 347.00 x̃: 347 helped stats (rel) min: 13.98% max: 14.64% x̄: 14.31% x̃: 14.31% HURT stats (abs) min: 24 max: 24 x̄: 24.00 x̃: 24 HURT stats (rel) min: 0.18% max: 0.18% x̄: 0.18% x̃: 0.18% total spills in shared programs: 11772 -> 11772 (0.00%) spills in affected programs: 0 -> 0 helped: 0 HURT: 0 total fills in shared programs: 24948 -> 24948 (0.00%) fills in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 0 GAINED: 0 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-17 12:50:32 -07:00
Jason Ekstrand	7ceec21b76	intel/fs: Use a strided MOV instead of a conversion for load_* destinations In many cases, the compiler can just copy-prop the strided MOV whereas the conversion is a bit trickier. This cuts 5% of the instructions off of one particular Vulkan CTS test which does lots of load_ssbo. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-17 18:44:35 +00:00
Jason Ekstrand	812b341578	nir/algebraic: Optimize comparisons and up-casts These seem like obvious enough optimizations in the world of multiple integer bit sizes. The only known thing which hits these at the moment is some Vulkan CTS tests for 16-bit SSBO values which like to up-cast and check for equality. However, it's something that's bound to come up as we start seeing more integers in shaders. The optimizations of comparisons of casted values with constants are something which we would ideally do with range analysis. However, lacking that, we can do it in opt_algebraic as long as one side is a constant. In dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13, this commit, along with the previous commit, reduce the number of instructions emitted on Skylake from 55328 to 44546, a reduction of 20%. Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-07-17 18:44:35 +00:00
Jason Ekstrand	e8505e982a	nir/algebraic: Optimize comparing unpacked values We could, in theory, add the same optimization for 64-bit unpack operations but that's likely to fight with 64-bit integer lowering on platforms which require it so it will require more infrastructure before that will be a good idea. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-17 18:44:35 +00:00
Jason Ekstrand	9fed031e4e	nir/algebraic: Print out the list of transforms in the C file This helps greatly when debugging algebraic transform generators because you can now actually see the output and verify that your transforms are getting generated. Acked-by: Matt Turner <mattst88@gmail.com>	2019-07-17 18:44:35 +00:00
Jason Ekstrand	68a4c796d5	intel/fs: Properly stride NULL replacement regs in DCE This fixes some validation errors generated by certain D->W conversions but is likely not a full solution. Calculating an actual register stride is a far more complex problem in general and should probably be handled by the brw_fs_generator. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-17 18:44:35 +00:00
Eric Anholt	28a808a11b	nir: Fix nir_lower_alu_to_scalar's instr filtering. It was checking if the dest or src[0] SSA values were vectors, rather than whether the ALU op was using the source as a vector resulting in a nir_fdot4 making it through to vc4 and v3d: vec1 32 ssa_6 = fdot4 ssa_4.xxxx, ssa_5 Fixes: `c1cffa4249` ("nir/alu_to_scalar: Use the new NIR lowering framework") v2: Use Jason's recommendation to look at input_sizes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-17 10:30:43 -07:00
Alyssa Rosenzweig	a301250ece	panfrost: Merge varyings_mem into transient buffers Theoretically we would like these split since varyings can have specially optimized flags (no map, coherent local). For now, since neither of these flags is particularly meaningful right now, merge them together instead of special casing varyings_mem. Saves upwards of 64MB of RAM per context. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-17 09:16:37 -07:00
Lionel Landwerlin	6f880f128f	vulkan/wsi: update swapchain status on vkQueuePresent With the following chain of events : vkQueuePresent() <- Surface resize vkQueuePresent() We should be able to report SUBOPTIMAL or OUT_OF_DATE on the second vkQueuePresent() call. Currently we only look at X11 events in the vkAcquireNextImage() path so we're not able to report this. This change checks the queue of events and processes any available ones to update the swapchain status. v2: Be consistent about reporting the current error state of the swapchain (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111097 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-17 17:40:54 +03:00
Samuel Pitoiset	24b1b1f574	radv: add an option for disabling NGG on GFX10 Will be useful for testing the legacy path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-17 15:43:36 +02:00
Erik Faye-Lund	d59c961af9	softpipe: pass stream-out targets to draw-module early This is essentially a port of `ed53e61bec` from LLVMpipe to softpipe, as it makes things a bit simpler and more performant. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2019-07-17 10:43:06 +00:00
Alejandro Piñeiro	5a84960072	spirv_extensions: i965: initialize SPIR-V extensions v2: Rebase update after changes on previous patches. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-17 10:47:27 +02:00
Alejandro Piñeiro	6ed19dcf80	spirv_extensions: add spirv_supported_extensions on gl_constants We can use it to get real values for ARB_spirv_extensions methods. Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Arcady Goldmints-Orlov <agoldmints@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-17 10:45:58 +02:00
Alejandro Piñeiro	f6da2a5508	spirv_extensions: define spirv_extensions_supported Add a struct to maintain which SPIR-V extensions are supported, and a utility method to initialize it based on nir_spirv_supported_capabilities. v2: * Fixing code style (Ian Romanick) * Adding a prefix (spirv) to fill_supported_spirv_extensions (Ian Romanick) v3: rebase update (nir_spirv_supported_extensions renamed) v4: include AMD_gcn_shader support v5: move spirv_fill_supported_spirv_extensions to src/mesa/main/spirv_extensions.c Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Arcady Goldmints-Orlov <agoldmints@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-17 10:45:32 +02:00
Alejandro Piñeiro	06e5daf575	spirv_extensions: add list of extensions and to_string method Ideally this should be generated somehow. One option would be to gather all the extension dependencies listed in the core grammar, but there would be the possibility of not including some of the extensions. Note that spirv-tools is doing it just slightly better, as it has a hardcoded list of extensions manually taken from the registry, which they parse to get the enum and the to_string method (see generate_grammar_tables.py). v2: * Use a macro to improve readability. (Tapani Pälli) * Add unreachable on the switch, no default (Eric Engestrom) * No typedef enum (Ian Romanick) * Sort extensions names (Ian Romanick) * Don't add extensions unlikely to be supported by Mesa at any point (Ian Romanick) v3: rebase update v4: Include AMD_gcn_shader v5: move spirv_extensions_to_string to src/mesa/main/spirv_extensions.c Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Arcady Goldmints-Orlov <agoldmints@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-17 10:44:33 +02:00
Alejandro Piñeiro	a622aad869	spirv_extensions: add GL_ARB_spirv_extensions boilerplate v2: * Mention extension gap at gl_API.xml (Emil Velikov) * Bail with INVALID_ENUM if extension not available on getStringi (Emil Velikov) * Use EXTRA_EXT macro when defining the extension at get.c/get_hash_params.py (Emil Velikov) * Rename source files (spirvextensions.[ch] -> spirv_extensions.[ch]) (Ian) v3: * Fix GL_PROGRAM_BINARY_FORMATS glGet query, broken by error on a previous rebase v4: * Fix rebase conflicts on getstring.c after GL_SHADING_LANGUAGE_VERSION query was added v5: * Remove src/mapi/glapi/gen/Makefile.am as it no longer exists in master Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Arcady Goldmints-Orlov <agoldmints@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-17 10:41:44 +02:00
Samuel Pitoiset	07ff367442	radv/gfx10: implement VK_EXT_post_depth_coverage I implemented this extension a while ago but it didn't work on pre-GFX10 for some reason. Now all CTS tests pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-17 08:32:39 +02:00
Samuel Pitoiset	ed53d2c4be	radv/gfx10: disable the TC compat zrange workaround Unnecessary. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-17 08:32:36 +02:00
Samuel Pitoiset	edf1af696f	radv/gfx10: fallback to the legacy path if tess and extreme geometry This is unsupported and hangs. This fixes GPU hangs with dEQP-VK.tessellation.geometry_interaction.limits.output_required_*. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-17 08:32:33 +02:00
Samuel Pitoiset	ae4b1fc095	radv/gfx10: always build the GS copy shader but use it on-demand It should be possible to build it on-demand too but it requires more work. On GFX10, the GS copy shader is required when tess is enabled with extreme geometry. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-17 08:32:30 +02:00
Gert Wollny	9c611fb381	softpipe: Remove unused static function Thanks to Eric Engestrom for pointing out that there was something wrong with that function. Fixes: `724a73509e` softpipe: Prepare handling explicit gradients Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-17 04:52:27 +00:00
Caio Marcelo de Oliveira Filho	e2939dc5a1	spirv: Bail when we see CounterBuffer decoration This decoration can be ignored, so we can just skip the next steps. Otherwise we'd have to also handle it in apply_var_decoration. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-16 20:31:12 -07:00
Kenneth Graunke	1d5ee31553	iris: Drop copy and pasted iris_timebase_scale Lionel moved brw_timebase_scale to gen_device_info_timebase_scale a few months ago, so we should just use that, and not our own copy in iris.	2019-07-16 17:22:48 -07:00
Jason Ekstrand	6fb685fe4b	nir/regs_to_ssa: Handle regs in phi sources properly Sources of phi instructions act as if they occur at the very end of the predecessor block not the block in which the phi lives. In order to handle them correctly, we have to skip phi sources on the normal instruction walk and handle them as a separate walk over the successor phis. While registers in phi instructions is a bit of an oddity it can happen when we temporarily go out-of-SSA for control-flow manipulations. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111075 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-16 23:28:03 +00:00
Jason Ekstrand	6394680f6b	spirv: Add a warning for ArrayStride on arrays of blocks It's disallowed according to the SPIR-V spec or at least I think that's what the spec says. It's in a section explicitly about explicit layout of things in the StorageBuffer, Uniform, and PushConstant storage classes so it's not 100% clear that it applies with other storage classes. However, it seems like it should apply in general and violating it can trigger (fairly harmless) asserts in NIR. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-16 17:02:08 -05:00
Caio Marcelo de Oliveira Filho	f07f516c56	anv: Increase state allocation size limit to 2MB When running on ICL the dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 needs more than 1M for the shader, so bump it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-16 14:17:52 -07:00
Yevhenii Kolesnikov	3853871ef8	meta: leaking of BO with DrawPixels ctx->Unpack.BufferObj wasn't unreferenced. Fixes: `d492e7b017` (meta: Fix invalid PBO access from DrawPixels when trying to just alloc.) CC: Eric Anholt <eric@anholt.net> Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 20:06:56 +00:00
Eric Anholt	e8360a64e4	swrast: Move _mesa_format_pack_colormask() to the only caller. This avoids needing format_pack to have access to the GLenum return functions for mesa_format. It seems like an odd function and unlikely to be reused. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-16 12:51:13 -07:00
Eric Anholt	4d23157a8b	mesa: Give _mesa_format_get_color_encoding a clearer name. It only returned one of two values. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-16 12:51:13 -07:00
Eric Anholt	35e2d31ba4	mesa: Drop redundant checks for sRGB before sRGB to linear conversion. _mesa_get_srgb_format_linear() just returns the original format if it wasn't sRGB. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-16 12:51:13 -07:00
Eric Anholt	ece03848c2	mesa: Fold _mesa_unpack_depth_stencil_row() into its only caller. This was the last bit of gl.h usage in format packing. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-16 12:51:13 -07:00
Eric Anholt	5956b46e16	mesa: Convert format_pack/unpack off of GL types. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-16 12:51:13 -07:00
Eric Anholt	3e186af5e2	mesa: Port format_pack/unpack off of _mesa_problem(). unreachable() should be plenty of debug for these. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-16 12:51:13 -07:00
Eric Anholt	c6a0a976a4	mesa: Mostly switch Mesa format info off of GL types other than GLenum. I'm considering moving most of this code to src/util/, and I want that code to not expose GL types in its interfaces. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-16 12:51:13 -07:00
Eric Anholt	93a7651d8d	mesa: Rename gl_pack typedefs to mesa_pack. These are packing mesa formats, not a GL format/type. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-16 12:51:13 -07:00
Eric Anholt	20ce56ad5b	mesa: Rename gl_format_info to mesa_format_info. It's about MESA_FORMATs, after all. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-16 12:51:13 -07:00
Eric Anholt	f8c27c2775	state_tracker: Move the format test out to be an actual unit test. We want errors in the table to show up as unit test failures in MRs. Also keeps unit test code out of the built drivers. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-16 12:51:13 -07:00
Eric Anholt	9eccae671e	u_format: Remove pointless comments. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-16 12:51:13 -07:00
Eric Anholt	628f55717b	src/util: Switch _mesa_half_to_float() to u_half.h's version. The two implementations differ across the entire input range only in that u_half.h preserves mantissa bits for NaNs. The u_half.h version shaves 15% off of the text size of half_float.o. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-16 12:51:13 -07:00
Eric Anholt	bb5801ad98	u_half_test: Turn it into an actual unit test. You could break the test and meson test wouldn't complain, since we returned success either way. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-16 12:51:13 -07:00
Mauro Rossi	3630988b1d	android: radv/gfx10: generate gfx10_format_table.h This patch adds the missing build rules for Android, to avoid the following build errors: In file included from external/mesa/src/amd/vulkan/radv_debug.c:35: In file included from external/mesa/src/amd/vulkan/radv_debug.h:27: external/mesa/src/amd/vulkan/radv_private.h:95:10: fatal error: 'gfx10_format_table.h' file not found ^~~~~~~~~~~~~~~~~~~~~~ 1 error generated. In file included from external/mesa/src/amd/vulkan/radv_android.c:31: external/mesa/src/amd/vulkan/radv_private.h:95:10: fatal error: 'gfx10_format_table.h' file not found ^~~~~~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `3dc5ec5d16` ("radv/gfx10: generate gfx10_format_table.h") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-16 21:31:24 +02:00
Rob Clark	856e84083e	mesa/st: add sampler uniforms Add sampler uniforms for the UV plane(s), so driver can count the uniforms and get the correct sampler count. Fixes lowered YUV on a6xx which actually wants to know # of samplers. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 18:14:44 +00:00
Rob Clark	a9f34b5631	egl/android: handle multi-fd native windows We can hit multi-fd EGL_NATIVE_BUFFER_ANDROID case when the native android buffer is YUV. So we need to handle that. Currently this went unnoticed because, even though we have two or three fd's for YUV native android buffers, they all reference the same backing buffer. But we really shouldn't rely on that. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 18:14:44 +00:00
Jason Ekstrand	110669c85c	st,i965: Stop looping on 64-bit lowering Now that the 64-bit lowering passes do a complete lowering in one go, we don't need to loop anymore. We do, however, have to ensure that int64 lowering happens after double lowering because double lowering can produce int64 ops. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	548da20b22	nir/lower_doubles: Handle fdiv and fsub directly Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	d7d35a9522	nir/lower_doubles: Use the new NIR lowering framework One advantage of this is that we no longer need to run in a loop because the new framework handles lowering instructions added by lowering. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	197a08dc69	nir/lower_doubles: Use "alu" for the nir_alu_instr Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	d65902c179	nir/lower_int64: Use the core NIR lowering framework One advantage of this is that we no longer need to run in a loop because the new framework handles lowering instructions added by lowering. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	c1cffa4249	nir/alu_to_scalar: Use the new NIR lowering framework Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	eb768b0a09	nir/alu_to_scalar: Use "alu" as the name for the nir_alu_instr Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	998d84fca5	nir/lower_system_values: Support lowering more intrinsics Instead of only lowering system values from variables, lower most to intrinsics and let the lowering framework immediately lower the intrinsic. This will result in a bit more instruction churn but it means that NIR code builders can just use intrinsics instead of everything having to go through variables. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	ae8caaadee	nir/lower_system_values: Drop the context-aware builder functions Instead of having context-aware builder functions, just provide lowering for the system value intrinsics and let nir_shader_lower_instructions handle the recursion for us. This makes everything a bit simpler and means that the lowering can also be used if something comes in as a system value intrinsic rather than a load_deref. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	58ffd7fbf6	nir/lower_system_values: Use the new generic NIR lowering helpers Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	ce3af830cb	nir/lower_subgroups: Use the new generic NIR lowering helpers Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	758fdce9fe	nir: Add some generic helpers for writing lowering passes Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	c74b98486a	nir: Add a helper for fetching the SSA def from an instruction Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Tomeu Vizoso	75b53a159d	pandecode: Add more addresses to trace When debugging, we're given the fault_pointer unresolved, so it is helpful to have more context in the decode. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-07-16 08:40:59 -07:00
Tomeu Vizoso	5a7688fdec	panfrost: Use 64-bit descriptors globally Midgard supports two modes of operation, 32-bit mode and 64-bit mode. The GPU is natively 64-bit, but job descriptors can be submitted in 32-bit mode. Among other changes, 32-bit mode shortens pointer sizes to use 32-bit pointers rather than the full 64-bit range. The blob decides which mode to use based on the CPU bitness, so an armhf system uses 32-bit descriptors and an aarch64 system uses 64-bit descriptors. For a while, we mimicked this, but inevitably this caused the 32-bit support to lag behind as our reference platform is 64-bit. To combat the code staleness, we traced an older GPU paired with a 64-bit CPU (the Midgard T720 on-board the sunxi H64). From there, we could tell which fields were really about hardware and which fields were simply reflections of the descriptor bitness. From there, we decided to remove support for 32-bit descriptors entirely, using 64-bit descriptors unconditionally. There is minimal performance penalty for this in practice, and it allows us to unify these disparate code paths. This fixes: - T860 + armhf - T820 + armhf - T760 + aarch64 And will help bringup of 1st/2nd generation Midgard regardless of CPU. [Work done by Tomeu. Commit message written by Alyssa.] v2: Add comments preserving information about the old behaviour for future reference. Fix a compiler warning. (Alyssa) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-16 08:40:59 -07:00
Jason Ekstrand	6a441151c2	anv: Account for dynamic stencil write disables in the PMA fix In `6ce8592836` we started looking at the dynamic stencil state and disabling stencil writes when the stencil mask is zero. Unfortunately, we never updated the PMA fix code accordingly so 3DSTATE_WM_DEPTH_STENCIL and the PMA fix were getting out-of-sync causing hangs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109203 Fixes: `6ce8592836` "anv: Disable stencil writes when both write..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-16 15:12:45 +00:00
Alyssa Rosenzweig	5ad00fb3ed	panfrost: Implement opportunistic AFBC Rather than hardcoding a BO layout at creation-time, we implement the ability to hint layouts at various points in a BO's lifetime, potentially reallocating and switching layouts if it's heuristically deemed useful to do so. In this patch, we add a simple hinting implementation, opportunistically compressing FBOs. Support is hidden behind PAN_MESA_DEBUG=afbc as the implementation is incomplete (software access to AFBC is unimplemented at the moment) and therefore would regress significantly. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-16 07:21:08 -07:00
Alyssa Rosenzweig	d60994989e	panfrost/mfbd: Zero out framebuffer_stride We don't know what this is, so let's not pretend we do. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-16 07:19:29 -07:00
Alyssa Rosenzweig	e65e3cf596	panfrost: AFBC buffers must be cache-line aligned Fixes a DATA_INVALID_FAULT when AFBC is paired with mipmapping. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-16 07:19:28 -07:00
Alyssa Rosenzweig	f7621a8c5f	panfrost: Add Z/S and MRT BOs to the job Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-16 07:19:28 -07:00
Alyssa Rosenzweig	aaae6180bf	panfrost: Set usage2 during draw, not CSO It can change from a layout switch. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-16 07:19:28 -07:00
Sergii Romantsov	7417b43211	meta: memory leak of CopyPixels usage Meta of CopyPixel generates a buffer object but does not free it on cleanup. Fixes: `37d11b13ce` (meta: Don't pollute the buffer object namespace in _mesa_meta_setup_vertex_objects) Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-07-16 13:48:47 +03:00
Samuel Pitoiset	afa102d65b	radv: add radv_emit_streamout_{begin,end} helpers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-16 11:17:00 +02:00
Samuel Pitoiset	17464d205c	radv: pass output values to radv_emit_stream_output() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-16 11:16:58 +02:00
Samuel Pitoiset	4dcdc4cdc5	radv: allow to select DST_SEL with RELEASE_MEM Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-16 11:16:57 +02:00
Samuel Pitoiset	3c6d6bd71f	radv: allow to emit PS_DONE/CS_DONE with RELEASE_MEM Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-16 11:16:55 +02:00
Samuel Pitoiset	219dc1b25c	radv: restore an assertion in handle_vs_outputs() The NGG GS epilogue no longer calls that function so the assertion is just useless now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-16 11:16:53 +02:00
Samuel Pitoiset	68603b767f	radv/gfx10: emit ES outputs of TES when it's not NGG Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-16 11:16:51 +02:00
Samuel Pitoiset	b0f7a6e981	radv: update LATE_ALLOC_VS.LIMIT Mirror RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-16 10:10:22 +02:00
Samuel Pitoiset	27d91062a8	radv/gfx10: support pixel shaders without exports Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-16 10:10:21 +02:00
Samuel Pitoiset	1b2bfeaaaa	radv: fix gathering clip/cull distance masks for GS For NGG, the driver relies on the VS outinfo struct. This fixes dEQP-VK.clipping.user_defined.clip__vert_tess_geom_ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-16 10:09:37 +02:00
Samuel Pitoiset	361d549f87	Revert "radv/gfx10: don't set array pitch field on images" It introduces too many regressions. This reverts commit `6d50dcd80f`.	2019-07-16 09:37:56 +02:00
Iago Toral Quiroga	556c299430	v3d: flag dirty state when binding new sampler states We emit code to saturate texture coordinates when using clamp wrapping mode so if we don't flag the dirty state here we don't get to recompile the shaders when the wrapping mode changes. v2: - Do the same when setting sampler views (Eric) - Use a switch statement instead of an if ladder. - Swap the shader stage assertion with an unreachable. Fixes: spec/!opengl 1.1/texwrap 1d bordercolor/gl_rgba8, border color only spec/!opengl 1.1/texwrap 1d proj bordercolor/gl_rgba8, projected, border color only spec/!opengl 1.1/texwrap 2d bordercolor/gl_rgba8, border color only spec/!opengl 1.1/texwrap 2d proj bordercolor/gl_rgba8, projected, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_alpha12, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_alpha16, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_alpha4, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_alpha8, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_intensity8, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_luminance4_alpha4, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_luminance6_alpha2, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_luminance8, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_luminance8_alpha8, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_r3_g3_b2, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb10, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb10_a2, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb4, swizzled, border color only 
spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb5, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb5_a1, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgb8, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgba4, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor-swizzled/gl_rgba8, swizzled, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_alpha12, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_alpha16, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_alpha4, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_alpha8, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_intensity8, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_luminance4_alpha4, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_luminance6_alpha2, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_luminance8, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_luminance8_alpha8, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_r3_g3_b2, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb10, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb10_a2, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb4, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb5, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb5_a1, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_rgb8, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_rgba4, border color only spec/!opengl 1.1/texwrap formats bordercolor/gl_rgba8, border color only spec/!opengl 1.2/texwrap 3d bordercolor/gl_rgba8, border color only spec/!opengl 1.2/texwrap 3d proj bordercolor/gl_rgba8, projected, border color only 
spec/arb_es2_compatibility/texwrap formats bordercolor-swizzled/gl_rgb565, swizzled, border color only spec/arb_es2_compatibility/texwrap formats bordercolor/gl_rgb565, border color only spec/arb_texture_compression/texwrap formats bordercolor-swizzled/gl_compressed_alpha, swizzled, border color only spec/arb_texture_compression/texwrap formats bordercolor-swizzled/gl_compressed_luminance_alpha, swizzled, border color only spec/arb_texture_compression/texwrap formats bordercolor-swizzled/gl_compressed_rgb, swizzled, border color only spec/arb_texture_compression/texwrap formats bordercolor/gl_compressed_alpha, border color only spec/arb_texture_compression/texwrap formats bordercolor/gl_compressed_luminance_alpha, border color only spec/arb_texture_compression/texwrap formats bordercolor/gl_compressed_rgb, border color only spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_alpha16f_arb, swizzled, border color only spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_intensity16f_arb, swizzled, border color only spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_luminance16f_arb, swizzled, border color only spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_luminance_alpha16f_arb, swizzled, border color only spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_rgb16f, swizzled, border color only spec/arb_texture_float/texwrap formats bordercolor-swizzled/gl_rgba16f, swizzled, border color only spec/arb_texture_float/texwrap formats bordercolor/gl_alpha16f_arb, border color only spec/arb_texture_float/texwrap formats bordercolor/gl_intensity16f_arb, border color only spec/arb_texture_float/texwrap formats bordercolor/gl_luminance16f_arb, border color only spec/arb_texture_float/texwrap formats bordercolor/gl_luminance_alpha16f_arb, border color only spec/arb_texture_float/texwrap formats bordercolor/gl_rgb16f, border color only spec/arb_texture_float/texwrap formats bordercolor/gl_rgba16f, border color only 
spec/arb_texture_rectangle/texwrap rect bordercolor/gl_rgba8, border color only spec/arb_texture_rectangle/texwrap rect proj bordercolor/gl_rgba8, projected, border color only spec/arb_texture_rg/texwrap formats bordercolor-swizzled/gl_r8, swizzled, border color only spec/arb_texture_rg/texwrap formats bordercolor-swizzled/gl_rg8, swizzled, border color only spec/arb_texture_rg/texwrap formats bordercolor/gl_r8, border color only spec/arb_texture_rg/texwrap formats bordercolor/gl_rg8, border color only spec/arb_texture_rg/texwrap formats-float bordercolor-swizzled/gl_r16f, swizzled, border color only spec/arb_texture_rg/texwrap formats-float bordercolor-swizzled/gl_rg16f, swizzled, border color only spec/arb_texture_rg/texwrap formats-float bordercolor/gl_r16f, border color only spec/arb_texture_rg/texwrap formats-float bordercolor/gl_rg16f, border color only spec/ext_packed_float/texwrap formats bordercolor-swizzled/gl_r11f_g11f_b10f, swizzled, border color only spec/ext_packed_float/texwrap formats bordercolor/gl_r11f_g11f_b10f, border color only spec/ext_texture_shared_exponent/texwrap formats bordercolor-swizzled/gl_rgb9_e5, swizzled, border color only spec/ext_texture_shared_exponent/texwrap formats bordercolor/gl_rgb9_e5, border color only spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_alpha8_snorm, swizzled, border color only spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_intensity8_snorm, swizzled, border color only spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_luminance8_alpha8_snorm, swizzled, border color only spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_luminance8_snorm, swizzled, border color only spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_r8_snorm, swizzled, border color only spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_rg8_snorm, swizzled, border color only spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_rgb8_snorm, swizzled, border 
color only spec/ext_texture_snorm/texwrap formats bordercolor-swizzled/gl_rgba8_snorm, swizzled, border color only spec/ext_texture_snorm/texwrap formats bordercolor/gl_alpha8_snorm, border color only spec/ext_texture_snorm/texwrap formats bordercolor/gl_intensity8_snorm, border color only spec/ext_texture_snorm/texwrap formats bordercolor/gl_luminance8_alpha8_snorm, border color only spec/ext_texture_snorm/texwrap formats bordercolor/gl_luminance8_snorm, border color only spec/ext_texture_snorm/texwrap formats bordercolor/gl_r8_snorm, border color only spec/ext_texture_snorm/texwrap formats bordercolor/gl_rg8_snorm, border color only spec/ext_texture_snorm/texwrap formats bordercolor/gl_rgb8_snorm, border color only spec/ext_texture_snorm/texwrap formats bordercolor/gl_rgba8_snorm, border color only spec/ext_texture_srgb/texwrap formats bordercolor-swizzled/gl_sluminance8, swizzled, border color only spec/ext_texture_srgb/texwrap formats bordercolor-swizzled/gl_sluminance8_alpha8, swizzled, border color only spec/ext_texture_srgb/texwrap formats bordercolor-swizzled/gl_srgb8, swizzled, border color only spec/ext_texture_srgb/texwrap formats bordercolor-swizzled/gl_srgb8_alpha8, swizzled, border color only spec/ext_texture_srgb/texwrap formats bordercolor/gl_sluminance8, border color only spec/ext_texture_srgb/texwrap formats bordercolor/gl_sluminance8_alpha8, border color only spec/ext_texture_srgb/texwrap formats bordercolor/gl_srgb8, border color only spec/ext_texture_srgb/texwrap formats bordercolor/gl_srgb8_alpha8, border color only Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 08:13:28 +02:00
Samuel Pitoiset	994253b400	radv/gfx10: add missing conversions for 16-bit exports This fixes dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_* Found with RADV_DEBUG=checkir Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-16 08:12:34 +02:00
Samuel Pitoiset	d8844533af	radv: remove unused code in radv_export_param() It was a hack for geometry shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-16 08:12:20 +02:00
Dave Airlie	6d50dcd80f	radv/gfx10: don't set array pitch field on images Setting this seems to be broken, amdvlk only sets it for quilted textures which I'm not sure what those are. Fixes dEQP-VK.glsl.texture_functions.query.texturesize3d Fixes: `bf11f1c3a4` ("radv/gfx10: add gfx10_make_texture_descriptor") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-16 10:41:27 +10:00
Vinson Lee	d1a55d9559	lima/ppir: Fix assert condition in ppir_codegen_encode_branch. Fixes: `af0de6b91c` ("lima/ppir: implement discard and discard_if") Reported-by: Coverity Scan Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-07-15 23:48:34 +00:00
Eric Anholt	82dc168f51	docs: Tell people how to easily generate the Fixes lines. v2: Include '-s' to suppress the diff. v3: use the git config command (Ken), use < (Eric) Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-15 16:29:31 -07:00
Caio Marcelo de Oliveira Filho	1210e8caaf	spirv: Ignore ArrayStride for storage classes that should not use it The stride was already overridden when using lower_workgroup_access_to_offsets, so elaborate a bit on the commentary there. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-15 16:18:57 -07:00
Caio Marcelo de Oliveira Filho	026cfa1099	spirv: Fix stride calculation when lowering Workgroup to offsets Use alignment to calculate the stride associated with the pointer types. That stride is used when the pointers are cast to arrays. Note that size alone is not sufficient, e.g. struct { vec2 a; vec1 b; } will have an element size of 12 bytes, but the stride needs to be 16 bytes to respect the 8 byte alignment. Fixes: `050eb6389a` "spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-15 16:18:46 -07:00
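The 12-byte versus 16-byte example in the commit message above comes from rounding the element size up to the next multiple of the alignment. The following is only a minimal sketch of that arithmetic; `align_up` is a hypothetical helper name, not the function used in Mesa's SPIR-V code, and it assumes the alignment is a nonzero power of two.

```c
#include <stdint.h>

/* Hypothetical helper (not Mesa's code): round `size` up to the next
 * multiple of `alignment`. Assumes `alignment` is a nonzero power of two. */
static inline uint32_t align_up(uint32_t size, uint32_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}
```

With a size of 12 and an alignment of 8, this yields a stride of 16, matching the numbers quoted in the message.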
Alyssa Rosenzweig	329799257b	panfrost/ci: Blacklist flush finish tests We don't implement batch splitting quite yet which is necessary for the ludicrous number of draw calls these tests invoke. Blacklist them for now. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 16:16:19 -07:00
Alyssa Rosenzweig	c1125d0935	panfrost: Don't leak oversized transient allocations When we allocate them, we allocate with two references accidentally, causing them to leak uncontrollably. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 16:12:56 -07:00
Alyssa Rosenzweig	48f51e9dbb	panfrost: Implement panfrost_bo_cache_evict_all Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 16:12:56 -07:00
Alyssa Rosenzweig	f02278ae87	panfrost: Implement panfrost_bo_cache_get Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 16:12:56 -07:00
Alyssa Rosenzweig	525e5dc4ed	panfrost: Implement panfrost_bo_cache_put Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 16:12:56 -07:00
Alyssa Rosenzweig	9034b5586c	panfrost: Add pan_bucket helper Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 16:12:56 -07:00
Alyssa Rosenzweig	eb398683d7	panfrost: Implement pan_bucket_index helper We'll use this whenever we need to lookup a bucket. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 16:12:56 -07:00
Alyssa Rosenzweig	270733fe6a	panfrost: Add BO cache data structure Linked list of panfrost_bo* nested inside an array of buckets. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 16:12:56 -07:00
Alyssa Rosenzweig	f3464f7987	panfrost: Describe BO cache architecture Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 16:12:56 -07:00
Alyssa Rosenzweig	f3b7e1ddc7	panfrost: Stub out panfrost_bo_cache_evict This destructor will be used to legitimately free the BOs, now that a BO free with cacheable=0 is only a "fake" free. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 16:12:56 -07:00
Alyssa Rosenzweig	74ad5f89f8	panfrost: Stub out panfrost_bo_cache_put ..so we can intercept the BO free. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 16:12:56 -07:00
Alyssa Rosenzweig	b5a28f61ae	panfrost: Stub out panfrost_bo_cache_get We will use this function to fetch cached BOs instead of freshly allocating them. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 16:12:55 -07:00
Alyssa Rosenzweig	fea953e6c2	panfrost: Don't leak the blend CSO hash table Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 16:12:55 -07:00
Alyssa Rosenzweig	07a1f3d120	panfrost: Cleanup after scoreboarding Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 16:12:55 -07:00
Alyssa Rosenzweig	fae790ecfc	panfrost: Allocate UBOs on the stack, not the heap Saves a call to calloc (the maximum size is small and known at compile-time) and fixes a leak. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 16:12:55 -07:00
Jason Ekstrand	0ba508d7a3	nir,intel: Add support for lowering 64-bit nir_opt_extract_* We need this when doing full software 64-bit emulation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110309 Fixes: `cbad201c2b` "nir/algebraic: Add missing 64-bit extract_[iu]8..." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-07-15 16:08:37 -05:00
Jason Ekstrand	7a19e05e8c	nir/opt_if: Clean up single-src phis in opt_if_loop_terminator Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111071 Fixes: `2a74296f24` "nir: add opt_if_loop_terminator()" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-15 19:58:51 +00:00
Pierre-Eric Pelloux-Prayer	ed98f8a63a	radeonsi: verify buffer_offset value before using it This buffer_offset can come directly from the application (e.g: when using glVertexAttribPointer) and can contain an invalid value. st_atom_array already makes sure that it's not negative so all that's left is to verify that it's smaller than the buffer size. Bugs related to this issue: Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105251#c52 Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=109693 Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-07-15 15:22:28 -04:00
Pierre-Eric Pelloux-Prayer	a9655f36fe	st/mesa: verify that vertex buffer offset isn't negative For drivers supporting PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET the buffer_offset value will be interpreted as a signed int. An example of application code causing a negative offset: float b[] = { ... }; // 3 float for pos, 3 for color glBufferData(GL_ARRAY_BUFFER, ..., b, ...); glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 6 * sizeof(float), 0); glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, 6 * sizeof(float), &b[3]); ^ should be 3 * sizeof(float) The offset is a ptr so when interpreted as a signed int it can be negative. This commit adds a verification that (int) buffer_offset is not negative - this would indicate an application bug. Since it's too late to emit a GL_INVALID_VALUE error, we replace the negative offset by 0 and emit a debug message. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-07-15 15:22:25 -04:00
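The check described in the commit message above can be illustrated with a small sketch. This is not the Mesa implementation: `sanitize_buffer_offset` is a hypothetical name, the offset is assumed to be a 32-bit value, and the debug message the commit mentions is omitted. A value above `INT32_MAX` is exactly one that would read as negative when reinterpreted as a signed 32-bit int, so the comparison avoids the cast altogether.

```c
#include <stdint.h>

/* Hypothetical helper, not the Mesa implementation: returns the offset
 * to use for a vertex buffer. Values above INT32_MAX would be negative
 * if reinterpreted as a signed 32-bit int, so they are replaced by 0. */
static inline uint32_t sanitize_buffer_offset(uint32_t buffer_offset)
{
    if (buffer_offset > (uint32_t)INT32_MAX)
        return 0;
    return buffer_offset;
}
```

A pointer passed by mistake, as in the `&b[3]` example, often has its high bit set once truncated, which is the case this guard catches.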
Marek Olšák	ce04fbf67c	st/mesa: don't invalidate a buffer range that is mapped This is needed to fix an issue with OpenGL when a buffer is mapped and BufferSubData is called. In this case, we can't invalidate the buffer range.	2019-07-15 14:58:23 -04:00
Marek Olšák	fc4302d1df	gallium: use MAP_DIRECTLY to mean suppression of DISCARD in buffer_subdata This is needed to fix an issue with OpenGL when a buffer is mapped and BufferSubData is called. In this case, we can't invalidate the buffer range.	2019-07-15 14:58:23 -04:00
Kenneth Graunke	5e76c99923	iris: Better handle decoder base addresses It can be useful to call the decoder on a single batch. But, that batch may not contain STATE_BASE_ADDRESS, at which point the decoder will have no idea how to find any buffers. We can initialize the two static bases at the beginning of time, so it has them even if it never sees SBA. Surface base address changes dynamically, possibly in the middle of a batch. So we update it at the start of each batch, making it always start at the value we inherited from the previous one. SBA commands inside the batch can update it to a proper value. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-07-15 11:49:19 -07:00
Samuel Pitoiset	ed12be1b8f	radv/gfx10: enable OC_LDS_EN for NGG GS if the ES stage is TES Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-15 20:05:21 +02:00
Bas Nieuwenhuizen	d4f0f1a6e2	anv: Add android dependencies on android. Specifically needed for nativewindow for some VK_EXT_external_memory_android_hardware_buffers functions, where we call into some AHardwareBuffer functions. The legacy Android ext did not have us call into any Android function at all and hence it was not noticed. Fixes: `755c633b8d` "anv: Fix vulkan build in meson." Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2019-07-15 15:23:43 +00:00
Alyssa Rosenzweig	0b83005807	panfrost: Advertise more depth/stencil formats Fixes a regression in glmark's shadow/refract scenes. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 08:03:35 -07:00
Alyssa Rosenzweig	1aaf68d120	panfrost/mfbd: Add Z32 rendering support Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 08:03:35 -07:00
Alyssa Rosenzweig	f8e2219b08	panfrost: Fix blend_cso if nr_cbufs == 0 Fixes: `46396af1ec` ("panfrost: Refactor blend infrastructure") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 08:03:35 -07:00
Alyssa Rosenzweig	318d641cd9	panfrost: Cleanup shader upload code The old algorithm is still used (and the same issue -- namely, leaking all shaders -- applies) but we're way more concise about it since we're only using the routine for shaders nowadays; everything else is a BO-proper or transient. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 08:03:34 -07:00
Alyssa Rosenzweig	1ffca961ab	panfrost: Remove all old allocators With the new refactor, this all becomes dead code. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 08:03:34 -07:00
Alyssa Rosenzweig	9981b6ef0f	panfrost: Use transient memory for occlusion queries These only last a frame anyway. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 08:03:34 -07:00
Alyssa Rosenzweig	594b47d917	panfrost: Remove bizarre hack I don't think this is still necessary, and if it is, we'll have to figure out how to fix it the right way. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 08:03:34 -07:00
Alyssa Rosenzweig	d375d127a9	panfrost: Upload vertex descriptors to transient memory It's not legal to reuse the vertex shader descriptor across frames now that we patch it at draw-time, so upload to transient memory. Ideally, we could be smarter about this such that subsequent draws with the same vertex shader and same patched state would reuse the descriptor, but for now, let's simply achieve correctness. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 08:03:34 -07:00
Alyssa Rosenzweig	c6b59db5b4	panfrost: Delay resource mmaps We use the new PAN_ALLOCATE_DELAY_MMAP flag to only map resources on-demand, which should avoid mapping FBOs. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 08:03:34 -07:00
Alyssa Rosenzweig	bd4986bafa	panfrost: Cleanup PAN_ALLOCATE_* While we're at it, prompted by a semantics issue around INVISIBLE, also add a separate DELAY_MMAP flag. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 08:03:34 -07:00
Alyssa Rosenzweig	2f783ede02	panfrost/drm: Don't mmap INVISIBLE buffers On the new kernel, mmaping doesn't hurt per se, but it's still wasteful for buffers explicitly marked as not needing an mmap. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-15 08:03:34 -07:00
Lionel Landwerlin	c9c8c2f7d7	anv: fix crash in vkCmdClearAttachments with unused attachment anv_render_pass_compile() turns an unused attachment into a NULL depth_stencil_attachment pointer so check that pointer before accessing it. Found with updates to existing CTS tests. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `208be8eafa` ("anv: Make subpass::depth_stencil_attachment a pointer") Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-07-15 16:47:41 +03:00
Samuel Pitoiset	b650f3d197	radv/gfx10: export the PrimitiveID for ES stages (VS or TES) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-15 11:30:10 +02:00
Samuel Pitoiset	8175f6269b	radv/gfx10: declare an external symbol for the ESGS ring It will be used for stream output but for now only declares it if VS and if the PrimitiveID needs to be exported. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-15 11:30:08 +02:00
Samuel Pitoiset	f0a90eddb6	radv/gfx10: allocate ESGS ring space for exporting PrimitiveID Only VS needs that. We shouldn't hardcode these values, but avoiding that is complicated for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-15 11:30:05 +02:00
Samuel Pitoiset	4478f14327	radv/gfx10: fix crash when emitting NGG GS prologue ac_nir_context is initialized after the driver emits the NGG GS prologue so it's likely to crash. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-15 08:51:53 +02:00
Vasily Khoruzhick	eb862c2365	lima/ppir: Fix branch codegen The "unknown_2" field is actually the size of the instruction that the branch points to. If it's set to a smaller size than the actual instruction, branch behavior is not defined (and it usually wedges the GPU). Fix it by setting this field correctly. Fixes: `af0de6b91c` ("lima/ppir: implement discard and discard_if") Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-07-14 19:49:14 -07:00
Vasily Khoruzhick	8f0160ca24	lima/ppir: Fix assert condition in ppir_codegen_encode_discard Fixes: `af0de6b91c` ("lima/ppir: implement discard and discard_if") Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-07-14 19:48:55 -07:00
Jonathan Marek	4e102a6de7	etnaviv: fix incorrect varying interpolation This corresponds to what the GC3000 blob does. The USED / UNUSED enums are wrong, at least for GC2000/GC3000. Without this the 3rd texture component is not interpolated correctly (flat?) in the following test (and others): dEQP-GLES2.functional.texture.mipmap.cube.generate.rgba8888_nicest Strangely, when the texture is sampled from OpenGL it works correctly, the problem only shows up for sampling by gallium/blitter. This fixes other cube map tests which use util_blitter_blit. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-14 10:34:17 -04:00
Jonathan Marek	a9e78a44d1	etnaviv: reduce rs alignment requirement for two pixel pipes GPU The rs alignment doesn't have to be multiplied by # of pixel pipes. This works on GC2000 which doesn't have the SINGLE_BUFFER feature. This fixes some cubemaps (NPOT / small mipmap levels) because aligning by 8 breaks the expected alignment of 4 for tiled format. We don't want to mess with the alignment of tiled formats. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-14 10:34:17 -04:00
Jonathan Marek	2c393053bf	etnaviv: fix nearest_linear / linear_nearest filtering on GC3000 The MIN filter is never used when not using mipmaps. This fixes that. Interestingly, only GC3000 needs this (GC2000 works without this fix). Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-14 10:34:17 -04:00
Jonathan Marek	63efb6ec6c	etnaviv: fix nearest filtering ROUND_UV rounding breaks nearest filtering. Enable it only when nearest filtering isn't used. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-14 10:34:17 -04:00
Bas Nieuwenhuizen	1f58b6ffef	radv/gfx10: Fix DCC clears. Looks like if the reg clear bit is set, the hardware does not use the 0/1 clears for textures. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-14 07:30:04 +00:00
Vinson Lee	730ceeddb5	meson: Add dep_thread dependency. Fix this build error on Ubuntu 18.04. /usr/bin/ld: src/util/libmesa_util.a(u_cpu_detect.c.o): undefined reference to symbol 'pthread_once@@GLIBC_2.2.5' Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110663 Suggested-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-07-13 11:39:26 -07:00
Eric Anholt	11aa32a447	gitlab-ci: Build i386 and ARM drivers in surfaceless mode. I don't particularly care about getting x86/ARM cross-build coverage of all the window systems, but we do want to be building src/mesa/ (for x86 asm) and gallium drivers (for vc4 NEON asm). I'm also hoping to use these build products for testing freedreno on actual HW (which we do using surfaceless). This increases the docker image from 1.4G to 1.5G. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-07-13 13:46:24 +00:00
Andreas Baierl	ce81c9a2e1	lima: Fix compiler warnings for unused functions. Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-07-13 13:15:05 +00:00
Caio Marcelo de Oliveira Filho	09c4037dda	anv: Fix pool allocator when first alloc needs to grow When using softpin, the first allocation was not calculating the padding and offset correctly for the case where the first allocation needed to grow. We were missing the initialization of state.end right after expanding the pool for the first time. This is not a problem for non-softpin since there we don't use leftover padding so the ends would re-arrange incrementally. This fixes running dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 in SKL -- the test uses a shader larger than the initial size for the instruction pool. Fixes: `dfc9ab2ccd` "anv/allocator: Add padding information." Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-12 22:25:37 -07:00
Kenneth Graunke	aa13921079	mesa: Port errors.c to util/list.h instead of simple_list. There is widespread consensus that simple_list should go away. This patch converts one more use to the modern kernel-style list. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-12 21:58:40 -07:00
Jason Ekstrand	974fabe810	intel: Run the optimization loop before and after lowering int64 For bindless SSBO access, we have to do 64-bit address calculations. On ICL and above, we don't have 64-bit integer support so we have to lower the address calculations to 32-bit arithmetic. If we don't run the optimization loop before lowering, we won't fold any of the address chain calculations before lowering 64-bit arithmetic and they aren't really foldable afterwards. This cuts the size of the generated code in the compute shader in dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 by around 30%. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-13 02:59:28 +00:00
Alyssa Rosenzweig	7103baf01f	panfrost/decode: Drop _replay prefix We don't even support replay anymore; this is just wasting characters and adding clutter. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 16:23:53 -07:00
Alyssa Rosenzweig	0d5abfdec5	panfrost/decode: Drop _name suffixes Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 16:23:53 -07:00
Alyssa Rosenzweig	0c1874adad	panfrost/decode: Add MEMORY_PROP_DIR variant This allows dumping memory properties directly without dereferencing an address, allowing us to fix more -Waddress-of-packed-member warnings. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 16:23:52 -07:00
Alyssa Rosenzweig	9ffe061c5e	panfrost/decode: Copy embedded structs before using Fixes some, but not all, warnings from -Waddress-of-packed-member Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 16:23:52 -07:00
Alyssa Rosenzweig	23b230d72f	panfrost/decode: Remove pandecode_decode_fbd_type It is unused. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 16:23:52 -07:00
Alyssa Rosenzweig	9eea8423a0	panfrost/midgard: Use generic outmod type It could be midgard_outmod_float or midgard_outmod_int; don't assume it's one or the other. Fixes -Wenum-conversion warnings. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 16:23:52 -07:00
Alyssa Rosenzweig	e173d6b1b1	panfrost: Precompute scoreboard dependents Mali job dependency graphs, at least for GLES3.0, have the special property that a given node will only have at most a single dependent. This allows us to efficiently precompute the dependent array and replace an inner loop's O(N) search with an O(1) lookup, bringing the algorithmic complexity of scoreboarding from O(N^2) to O(N). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 16:22:15 -07:00
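The precomputation described in the commit message above can be sketched as follows. This is an illustration under stated assumptions rather than the panfrost code: the `struct job` layout with up to two dependency indices, the `NO_JOB` sentinel, and the name `precompute_dependents` are all hypothetical. It relies on the property stated in the message that each job has at most one dependent.

```c
#include <stddef.h>

#define NO_JOB (-1)

/* Hypothetical job record: up to two dependencies, NO_JOB when unused. */
struct job {
    int deps[2];
};

/* Fill dependents[j] with the index of the single job that depends on j,
 * or NO_JOB if no job does. One pass over all jobs, so O(N) in total. */
static void precompute_dependents(const struct job *jobs, size_t count,
                                  int *dependents)
{
    for (size_t i = 0; i < count; i++)
        dependents[i] = NO_JOB;

    for (size_t i = 0; i < count; i++) {
        for (int d = 0; d < 2; d++) {
            int dep = jobs[i].deps[d];
            if (dep != NO_JOB)
                dependents[dep] = (int)i;
        }
    }
}
```

After this pass, finding the dependent of a job is a single array read, which is what replaces the inner O(N) search.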
Alyssa Rosenzweig	b68778e6de	panfrost: Remove transient pool abstraction Now that it has been totally replaced by the borrow mechanism, it is now unused code. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 15:31:48 -07:00
Alyssa Rosenzweig	ee32700f37	panfrost: Subdivide fixed-size transient slabs The whole purpose of the transient memory model is to make subdivision stupidly easy, so let's handle that. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 15:31:48 -07:00
Alyssa Rosenzweig	37097b2f38	panfrost: Recycle fixed-size transient BOs The usual case. We use the bitset to mark freedom and seize it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 15:31:48 -07:00
Alyssa Rosenzweig	0f5ad9efcc	panfrost: Bookkeep transient indices The batch now temporarily possesses the transient buffer, so it'll need to remember that to free it later. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 15:31:48 -07:00
Alyssa Rosenzweig	00c9a1cb75	panfrost: Rewrite allocate_transient with new abstraction We use a fixed size slab if we can, otherwise we create a dedicated ("oversized") BO and add that to the job. In the latter case we'll get reference counting for free so we can forget about this corner case for the rest of the series. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 15:31:48 -07:00
Alyssa Rosenzweig	ba02cf0e75	panfrost: Add pan_bo_for_screen helper Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 15:31:48 -07:00
Alyssa Rosenzweig	330cd057ad	panfrost: Add panfrost_transient_bo array We would like transient allocations to occur on the screen (borrowed by the batch) rather than on the context. Add fields to track this. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 15:31:47 -07:00
Alyssa Rosenzweig	718ebfa225	panfrost: Don't upload vertex/tiler twice The latter upload is correct, but the former upload is unassociated with any particular FBO and therefore becomes orphaned. We do have to upload at draw-time at the latest, if we haven't by then. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 15:31:47 -07:00
Alyssa Rosenzweig	085004cc2c	panfrost/drm: Check allocation size is positive Zero-sized allocations will fail with an unhelpful errno from the kernel; check size explicitly in userspace before it gets that far. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 15:31:47 -07:00
Neil Roberts	8419621176	mesa/glspirv: Validate that compute shaders are not linked with other stages The test is based on link_shaders(). For example, it allows the following test (when run on SPIR-V mode) to pass: spec/arb_compute_shader/linker/mix_compute_and_non_compute.shader_test Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:42 +02:00
Neil Roberts	022e9ddd1a	mesa/glspirv: Validate that there is a VS when there is a TCS, TES or GS The shader combination tests are copied from link_shaders(). For example, it allows the following tests (when run on SPIR-V mode) to pass: spec/arb_tessellation_shader/linker/no-vs spec/arb_tessellation_shader/linker/tcs-no-vs spec/arb_tessellation_shader/linker/tes-no-vs spec/glsl-1.50/linker/gs-without-vs Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Alejandro Piñeiro	e4210b93e4	i965: don't use disk cache with SPIR-V shaders Right now we don't support disk cache for SPIR-V shaders (from ARB_gl_spirv), so let's avoid writing the program data to or reading it from the disk if any in-use shaders use SPIR-V. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Alejandro Piñeiro	bb3bbdfbbd	glsl/shader_cache: handle SPIR-V shaders Right now we don't have cache support for SPIR-V shaders (from ARB_gl_spirv). Right now they are properly skipped because they fall on the ff shader code path (no key, no name), but it would be better to update current comments, and add some guards. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Arcady Goldmints-Orlov	637b168470	nir/linker: Initialize UniformDataDefaults when using SPIR-V Allocate UniformDataDefaults and fill in the data defaults when linking a SPIR-V program. Among other things, this allows program serialization to work. It allows the following piglit test (when run on SPIR-V mode) to pass: spec/arb_get_program_binary/execution/uniform-after-restore.shader_test v2: use memcpy to initialize UniformDataDefaults Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Arcady Goldmints-Orlov	761b0fe95f	glsl/serialize: Update write_program_resource_data() to handle NULL input and output variable names Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Arcady Goldmints-Orlov	c3122d2431	glsl/serialize: Handle NULL uniform name in write_uniforms() Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Antia Puentes	0baa553fab	mesa/main: Fix UBO/SSBO ACTIVE_VARIABLES query (ARB_gl_spirv) When querying MAX_NUM_ACTIVE_VARIABLES, NUM_ACTIVE_VARIABLES and ACTIVE_VARIABLES over SSBO and UBO interfaces, we filter the variables which are active using the variable's name and looking for it in the program resource list. If it is in the program resource list, the variable will be considered active. However due to ARB_gl_spirv where name reflection information is not mandatory, we can use the UBO/SSBO binding and variable offset to filter which variables are active. v2: use RESOURCE_UBO/UNI macros instead of direct castings, update comment (Alejandro) v3: Change signature of _mesa_program_resource_find_active_variable to simplify calling it. Also, squash the fix for find_binding_offset for arrays of blocks (Arcady) Signed-off-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Arcady Goldmints-Orlov <agoldmints@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Antia Puentes	161de77e0f	mesa/shader_query: Fix LOCATION_INDEX query (ARB_gl_spirv) When querying GL_LOCATION_INDEX using glGetProgramResourceiv we already know the index of the resource, we do not need to find it using the name, which is convenient for shaders coming from SPIR-V binaries where names are optional. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Antia Puentes	8818553f18	mesa/shaderapi: Fix TRANSFORM_FEEDBACK_VARYING program query Fixes the program queries API (glGetProgramiv): TRANSFORM_FEEDBACK_VARYINGS and TRANSFORM_FEEDBACK_VARYING_MAX_LENGTH in two cases: 1. ARB_enhanced_layouts: The queries were not working for GLSL shaders which specify the varyings using enhanced layouts. We were returning the info as if the varyings could only be specified using the API. 2. ARB_gl_spirv: TRANSFORM_FEEDBACK_VARYING_MAX_LENGTH should return 1 if there is no name reflection information available. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Antia Puentes	8792abff9d	mesa/uniforms: Fix GetUniformLocation (ARB_gl_spirv) From the ARB_gl_spirv specification, glGetUniformLocation should return -1 when no name reflection is available. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Antia Puentes	96d6156678	mesa/shader_query: Fix NAME_LENGTH queries (ARB_gl_spirv) For shaders constructed from SPIR-V binaries, it is possible that no name reflection information is available. In that case, - glGetProgramInterfaceiv(.., pname=MAX_NAME_LENGTH, ..) - gletProgramResourceiv(.., props=NAME_LENGTH, ..) should return 1. Signed-off-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Alejandro Piñeiro	3ebd60b491	mesa: Fix ACTIVE_*_MAX_LENGTH program queries (ARB_gl_spirv) With ARB_gl_spirv, much of the name reflection information may be missing, so several queries need NULL name checks and must return a specific value in those cases. This commit adds them for ACTIVE_UNIFORM_BLOCK_MAX_NAME_LENGTH, ACTIVE_ATTRIBUTE_MAX_LENGTH and ACTIVE_UNIFORM_MAX_LENGTH. From ARB_gl_spirv spec: "If pname is ACTIVE_UNIFORM_BLOCK_MAX_NAME_LENGTH, the length of the longest active uniform block name, including the null terminator, is returned. If no active uniform blocks exist, zero is returned. If no name reflection information is available, one is returned. If pname is ACTIVE_ATTRIBUTE_MAX_LENGTH, the length of the longest active attribute name, including a null terminator, is returned. If no active attributes exist, zero is returned. If no name reflection information is available, one is returned. If pname is ACTIVE_UNIFORM_MAX_LENGTH, the length of the longest active uniform name, including a null terminator, is returned. If no active uniforms exist, zero is returned. If no name reflection information is available, one is returned." Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Antia Puentes	cafc1a40d4	nir/types: Add glsl_type_is_unsized_array helper Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Antia Puentes	bfc5e46746	nir/linker: Fill TOP_LEVEL_ARRAY_SIZE and STRIDE From the ARB_program_interface_query specification: "For the property TOP_LEVEL_ARRAY_SIZE, a single integer identifying the number of active array elements of the top-level shader storage block member containing to the active variable is written to <params>. If the top-level block member is not declared as an array, the value one is written to <params>. If the top-level block member is an array with no declared size, the value zero is written to <params>." "For the property TOP_LEVEL_ARRAY_STRIDE, a single integer identifying the stride between array elements of the top-level shader storage block member containing the active variable is written to <params>. For top-level block members declared as arrays, the value written is the difference, in basic machine units, between the offsets of the active variable for consecutive elements in the top-level array. For top-level block members not declared as an array, zero is written to <params>." v2: move top_level_array_size and stride into nir_link_uniforms_state Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Antia Puentes	ae2ea5ec1f	nir/linker: Compute the offset for non-trivial uniform types. ARB_gl_spirv points out that the offset must be explicit; however, this is only true for 'root' types. For complex types, like struct members or arrays of arrays, it needs to be computed. We are not using the offset stored in the gl_buffer_variables during the uniform blocks linking because currently we do not have a way to relate a gl_buffer_variable with its corresponding gl_uniform_storage. The GLSL path uses the name for that, but we cannot rely on that because names are optional in SPIR-V. Notice that uniforms not backed by a buffer object will have an offset equal to -1, like in the GLSL path. v2: add offset and var_is_in_block as per-variable state in nir_link_uniforms_state (Arcady) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Antia Puentes	e15c663d8e	nir/linker: Add atomic counters to the program resource list Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-12 23:42:41 +02:00
Antia Puentes	e1464a1cf8	nir/linker: Add XFB resources to the program resource list Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Antia Puentes	53087a89ac	nir/linker: Add BUFFER_VARIABLEs to the prog resource list v2: use link_util_should_add_buffer_variable() (Arcady) Signed-off-by: Arcady Goldmints-Orlov <agoldmints@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-12 23:42:41 +02:00
Antia Puentes	ffdb44d3a0	nir/linker: Add inputs/outputs to the program resource list v2: added TODO comment hinting possible future refactoring of nir_build_program_resource_list and build_program_resource_list, to avoid code duplication (Alejandro, to explicitly reflect a valid concern from Timothy during the review). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-12 23:42:41 +02:00
Alejandro Piñeiro	691cee751a	nir/linker: add ubo/ssbo to the program resource list v2: "nir/linker: Use the stageref when adding UBO/SSBO resources" squashed on this one (Timothy) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-12 23:42:41 +02:00
Antia Puentes	a638971929	nir/linker: Fill the uniform's BLOCK_INDEX Binding comparison is used to determine the block the uniform is part of. Note that to do the binding comparison we need the information in UniformBlocks[] and ShaderStorageBlocks[] to be available, so we have to call gl_nir_link_uniform_blocks() before linking the uniforms. v2: add missing break (Timothy) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-12 23:42:41 +02:00
Samuel Pitoiset	f239e22813	radv/gfx10: enable 1D textures Mirror RadeonSI. This also fixes crashes in addrlib. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 18:25:45 +02:00
Andres Gomez	f4d2be03b1	intel/compiler: remove abandoned comments `c8665005`: ("intel/compiler: Don't always require precise lowering of flrp") forgot to remove some comments that didn't apply any more after the change. Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrnd.net>	2019-07-12 16:15:20 +00:00
Andres Gomez	9aadd5d688	nir/compiler: keep same bit size when lowering with flrp This was probably not caught before because no supported test was exercising the flrp lowering with a bit size other than 32. With the arrival of VK_KHR_shader_float_controls we will have some of those and, unless we keep the bit size, we will end up with something like: ../src/compiler/nir/nir_builder.h:420: nir_builder_alu_instr_finish_and_insert: Assertion `src_bit_size == bit_size' failed. Fixes: `158370ed2a` ("nir/flrp: Add new lowering pass for flrp instructions") Fixes: `ae02622d8f` ("nir/flrp: Lower flrp(a, b, c) differently if another flrp(_, b, c) exists") Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrnd.net>	2019-07-12 16:15:20 +00:00
Jason Ekstrand	16842b2391	anv: Properly compute image usage in CreateImageView With separate stencil usage, we can't just grab the usage from the image directly and have to consider the per-aspect usage instead. Fixes: `1be38f9178` "anv:Use VK_EXT_separate_stencil_usage to avoid..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-12 16:13:48 +00:00
Samuel Pitoiset	b393b2ce95	radv/gfx10: emit DISABLE_CONSERVATIVE_ZPASS_COUNTS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:12 +02:00
Samuel Pitoiset	8cc4e4a81e	radv/gfx10: init more registers in the graphics preamble Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:12 +02:00
Samuel Pitoiset	e68b55f5e3	radv/gfx10: set HS/GS/CS.WGP_MODE Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:12 +02:00
Samuel Pitoiset	5d5e26230a	radv/gfx10: emit GE_PC_ALLOC Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:11 +02:00
Samuel Pitoiset	df062afa03	radv/gfx10: enable vertex shaders without export parameters GFX10 allows this. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:11 +02:00
Samuel Pitoiset	3f76c0f47c	radv/gfx10: launch 2 compute waves per CU before going onto the next CU Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:11 +02:00
Samuel Pitoiset	e631d65fc6	radv: use ac_get_compute_resource_limits() No behaviour change. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:11 +02:00
Samuel Pitoiset	e510c5ee3b	ac: import ac_get_compute_resource_limits() from RadeonSI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 17:47:11 +02:00
Alyssa Rosenzweig	5f4f8aec74	panfrost: Initialize shift/extra_flags Don't rely on them being preinitialized to zero; this can cause junk to appear on the wire. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 07:38:37 -07:00
Alyssa Rosenzweig	6d8490f900	panfrost: Fix build warnings A bunch of these are from asserts not being compiled in 32-bit mode (once Erik's ASSERTABLE stuff is merged, we'll want to switch). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-12 07:38:37 -07:00
Samuel Pitoiset	37aefb2be1	radv/gfx10: invalidate everything in L2 when shaders read data This includes metadata as well. On GFX10, we have to invalidate the L2 metadata cache when shaders read DCC. Note that we still have to implement GFX10 coherency by introducing INV_L2_METADATA but for now just flush L2. This fixes a corruption with DCC and Talos. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 14:08:12 +02:00
Samuel Pitoiset	4e38322dd8	radv/gfx10: fix wrong emission of GE_CNTL Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 12:15:08 +02:00
Samuel Pitoiset	219d6939df	radv: add more assertions to make sure packets are correctly emitted Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 12:15:06 +02:00
Alejandro Piñeiro	85b78f96a6	v3d: use inc/dec tmu operation with image atomic sub/add of 1 This allows removing a mov of 1/-1, as it is implicit with the operation. As with atomic inc/dec/add, the usual shader-db set doesn't include any GLES shader using it. So, using vk-gl-cts shaders as a workaround, we get this: total instructions in shared programs: 1217013 -> 1217006 (<.01%) instructions in affected programs: 53 -> 46 (-13.21%) helped: 2 HURT: 0 One of the helped shaders went from 40 to 34 instructions. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 11:51:22 +02:00
Alejandro Piñeiro	2e22879115	v3d: refactor some code from v3d40_vir_emit_image_load_store Moved to a new auxiliary method, v3d40_image_load_store_tmu_op, equivalent to the nir_to_nir v3d_general_tmu_op, to clean up a little. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 11:49:29 +02:00
Alejandro Piñeiro	934ce48db8	v3d: use inc/dec tmu operation with atomic sub/add of 1 Among other things, this avoids the need to load 1/-1 constants (so one less operation). The removed comment suggests the option of adding support in NIR for inc/dec. Intel just uses an auxiliary method to get which hw operation is needed, so no lowering is needed. At the same time, being so small, it seems unreasonable to try to add a general one in NIR itself. It is easier to just adapt the method here (that is what the patch does right now). It is worth noting that we are not getting any change in shader-db stats because all those methods are used on the usual shader-db set with shaders needing GLSL > 4.2. In general there aren't too many GLSL ES 3.1 tests. As an alternative, we captured the GLES3/GLSL31/GLS32 used on vk-gl-cts, even if that is not a real life usage of shaders. With those we get the following: total instructions in shared programs: 1217022 -> 1217013 (<.01%) instructions in affected programs: 117 -> 108 (-7.69%) helped: 6 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.50 x̃: 1 helped stats (rel) min: 3.57% max: 10.00% x̄: 8.09% x̃: 9.09% 95% mean confidence interval for instructions value: -2.07 -0.93 95% mean confidence interval for instructions %-change: -10.54% -5.64% Instructions are helped. Note that the number of shaders helped is really low because the vk-gl-cts tests using AtomicInc/Dec/Add are mostly compute shaders. Although there is currently a branch with CS support, the usual practice is to run the stats against master. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 11:48:40 +02:00
Alejandro Piñeiro	3912a32a79	v3d: remove redefinition of tmu operations on nir_to_vir They are already defined, although in a slightly different format, in the generated packet headers, so it was necessary to change how they are used in nir_to_vir. In addition to allowing some duplicated headers to be removed, this will allow defining just one get_op_for_atomic_add auxiliary method later to support using inc/dec instead of add of 1/-1. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 11:48:17 +02:00
Alejandro Piñeiro	c2ff38d2df	v3d: tweak initial comment on pack generator script The files it mentions as references have slightly different names. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 11:48:09 +02:00
Yevhenii Kolesnikov	8c5692b696	glsl/link_varyings: Fix hash table leak Hash tables were not destroyed at return. v2: Use ralloc_context (Eric Anholt) Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-12 11:07:08 +03:00
Kenneth Graunke	712ac83033	iris: Simplify devinfo access in calculate_result_on_gpu() We have devinfo, no need for screen->devinfo.	2019-07-12 00:33:19 -07:00
Iago Toral Quiroga	10d50f2904	v3d: remove unused definitions Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	8e50a9f6cf	v3d: move implementation of some intrinsics to separate helpers Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	d69184204e	v3d: emit correct lowering for logic ops with RGB10A2 render targets Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	7bf3676845	v3d: emit correct lowering for logic ops with integer render targets Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	e540775f0c	v3d: add lowering for OpenGL logic operations This implements support for OpenGL logic operations by emitting code to read from the TLB if needed and blending the fragment output accordingly. It is similar to VC4's blend lowering pass, but exclusive to logic operations, since blending is otherwise supported in hardware. The pass doesn't handle MSAA targets yet. Fixes the following piglit tests: spec/!opengl 1.0/gl-1.0-logicop/* spec/!opengl 1.1/gl-1.1-xor spec/!opengl 1.1/gl-1.1-xor-copypixels It also fixes text cursor rendering in Libreoffice with the GTK+2 theme, which is rendered via glamor using the XOR logic operation. v2: fix checks for allowed variable location and maximum render target (Eric) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	7c1d708911	v3d: acquire scoreboard lock before first tlb read Until now we have always been emitting our scoreboard locks on the last thread switch to improve parallelism. We did this by emitting our last thread switch right before our tlb writes at the very end of the program, where we know that we are outside control flow. Unfortunately, this strategy is not valid when we have tlb color reads too, as these will happen before this point in the program and can happen inside control flow. To fix this we always emit a thread switch before the first tlb load and if we see additional thread switches after that point, we change the strategy to lock on the first thread switch. v2: change the solution so it is expected to work in more scenarios (Eric). Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	47d7c80dc7	v3d: implement tile buffer color read intrinsic We will be emitting this intrinsic to signal TLB color loads when we implement OpenGL logic operations, where we need to blend the fragment shader color output with the existing color in the render target. Per-sample TLB reads are not supported yet. v2: fix the offset into the color_reads array (Eric). Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	b0eec9e27d	nir: add a new v3d-specific intrinsic for tile buffer color reads This is intended to be used, for example, with OpenGL logic operations. It takes a render target as source and a sample index in the base index for MSAA color reads. v2: drop the CAN_ELIMINATE and CAN_REORDER flags (Eric). Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	6af1bdefa9	v3d: fix size of color_reads and sample_colors arrays We need to scale the size of these arrays to consider up to V3D_MAX_DRAW_BUFFERS render targets and 4 components per color. v2: we want to store each color component separately, so scale by 4 too. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	0279ac6e51	v3d: add color formats and swizzles to the fragment shader key We are going to need these very soon to emit correct reads from the tlb to implement logic operations. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	d26b35ba44	v3d: add helpers to emit ldtlb and ldtlbu signals The ldtlbu version will read an implicit uniform with the TLB read specifier and should be used for the first read in a sequence of TLB reads (unless the default configuration is valid, in which case we can use ldtlb). The ldtlb version is used for any subsequent TLB read in the sequence. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	aff8885cf9	v3d: handle tlb read dependency tracking as if they were writes Tile buffer reads are emitted as ordered sequences and cannot be reordered. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	4793e2c888	v3d: instructions with the ldtlb and ldtlbu signals are tlb instructions Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	83a66e10de	v3d: tlb loads cannot be removed Loads from the tile buffer are emitted in ordered sequences so we cannot eliminate or reorder any of them. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	08f4dc3adc	v3d: the ldtlbu signal reads an implicit uniform Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Iago Toral Quiroga	271bc8acfb	v3d: handle ldtlb and ldtlbu signals during disassembly We already have code to print these signals but the early return in the code that checks if any signals are present was missing the checks for them, so it would skip printing them unless they were paired with other signals. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-12 09:16:38 +02:00
Samuel Pitoiset	958ee4c21a	radv: report shader stage name when dumping LLVM IR For debugging purposes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 08:19:53 +02:00
Samuel Pitoiset	2b6a089813	radv: tidy up radv_get_shader_name() and add NGG stages Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 08:19:53 +02:00
Samuel Pitoiset	ffd6a979bf	radv/gfx10: update OVERWRITE_COMBINER_{MRT_SHARING,WATERMARK} DCC related, mirror RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-12 08:19:53 +02:00
Samuel Pitoiset	c6fa4de15d	radv/gfx10: do not set alignment on the ngg_emit pointer This is invalid and this fixes a crash in LLVM. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 08:19:53 +02:00
Samuel Pitoiset	df0a23ad1e	radv/gfx10: fix exporting clip/cull distances for GS This fixes dEQP-VK.clipping.user_defined.clip_distance.geom. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 08:19:53 +02:00
Samuel Pitoiset	edcd2bc833	radv/gfx10: fix exporting the subpass view index for GS This fixes dEQP-VK.multiview.geometry. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-12 08:19:20 +02:00
Timothy Arceri	3043908ccb	mesa: save/restore SSO flag when using ARB_get_program_binary Without this the restored program will fail the pipeline validation checks when we attempt to use an SSO program. Fixes: `c20fd744fe` ("mesa: Add Mesa ARB_get_program_binary helper functions") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111010	2019-07-12 09:26:53 +10:00
Alyssa Rosenzweig	fe783c5b0c	pan/midgard: Correct component count clamping PSIZ Kind of a funky corner case that does not (as far as I know) apply to organic shaders from GLES but does pop up in generated shaders from the fixed-function desktop pipeline. Fixes: `bb483a9166` ("panfrost: Clamp point size") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-11 13:30:55 -07:00
Alyssa Rosenzweig	c4e6d759dd	panfrost: Remove unused display target field Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-11 12:48:25 -07:00
Alyssa Rosenzweig	6b9edd2451	panfrost/ci: Update expectations Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-11 12:48:25 -07:00
Samuel Pitoiset	a7b7e94085	radv: only enable the GS copy shader stage if GS is enabled Ooops. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-11 21:44:44 +02:00
Eric Anholt	e1fe98cc7d	freedreno: Add dependency on the xml build to the winsys. The screen header includes the common xml, and otherwise we might race to build before it's done. Fixes: `e03259974e` ("freedreno: Generate headers from xml files") Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-11 12:01:01 -07:00
Kenneth Graunke	5445c176e2	iris: Disable SIMD32 when using a 16x MSAA framebuffer. We weren't doing this documented workaround because it's sorta painful.	2019-07-11 11:34:21 -07:00
Ian Romanick	ef7b4fdf3f	nir/algebraic: Recognize open-coded flrp(a, b, a) No shader-db changes Ice Lake, Iron Lake, or GM45 as these platforms lack a LRP instruction. v2: Remove flrp@64 cases. Since Gen11 removes flrp@32, it seems unlikely that we'll ever have a flrp@64. Should that occur, the cases can be added back. All Gen6-Gen9 platforms had similar results. (Skylake shown) total instructions in shared programs: 15041996 -> 15041184 (<.01%) instructions in affected programs: 71776 -> 70964 (-1.13%) helped: 312 HURT: 0 helped stats (abs) min: 2 max: 3 x̄: 2.60 x̃: 3 helped stats (rel) min: 0.36% max: 4.55% x̄: 1.75% x̃: 1.28% 95% mean confidence interval for instructions value: -2.66 -2.55 95% mean confidence interval for instructions %-change: -1.89% -1.61% Instructions are helped. total cycles in shared programs: 354303333 -> 354301807 (<.01%) cycles in affected programs: 433742 -> 432216 (-0.35%) helped: 206 HURT: 78 helped stats (abs) min: 2 max: 244 x̄: 21.02 x̃: 8 helped stats (rel) min: 0.06% max: 19.59% x̄: 1.72% x̃: 0.82% HURT stats (abs) min: 1 max: 220 x̄: 35.95 x̃: 10 HURT stats (rel) min: 0.07% max: 30.48% x̄: 2.53% x̃: 0.56% 95% mean confidence interval for cycles value: -10.68 -0.06 95% mean confidence interval for cycles %-change: -0.99% -0.12% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-11 10:20:03 -07:00
Ian Romanick	0c2b3a7fc0	nir/algebraic: Rearrange 1-((1-a) * (1-b)) into flrp-friendly form No shader-db changes Ice Lake, Iron Lake, or GM45 as these platforms lack a LRP instruction. v2: Convert the pattern directly to flrp. There were negligible improvements on Gen4 and Gen5, and Gen11 was actually hurt. I believe the problem is this optimization conflicts with the (1-x)*y => ffma(-x, y, y) optimization on Gen11. Skylake total instructions in shared programs: 15046487 -> 15041996 (-0.03%) instructions in affected programs: 194681 -> 190190 (-2.31%) helped: 880 HURT: 20 helped stats (abs) min: 1 max: 19 x̄: 5.13 x̃: 4 helped stats (rel) min: 0.19% max: 36.36% x̄: 4.85% x̃: 3.33% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.11% max: 1.06% x̄: 0.28% x̃: 0.17% 95% mean confidence interval for instructions value: -5.25 -4.73 95% mean confidence interval for instructions %-change: -5.11% -4.36% Instructions are helped. total cycles in shared programs: 354340839 -> 354303333 (-0.01%) cycles in affected programs: 1753622 -> 1716116 (-2.14%) helped: 786 HURT: 182 helped stats (abs) min: 1 max: 1842 x̄: 56.52 x̃: 22 helped stats (rel) min: 0.03% max: 43.17% x̄: 3.90% x̃: 2.84% HURT stats (abs) min: 1 max: 440 x̄: 37.99 x̃: 9 HURT stats (rel) min: 0.03% max: 29.37% x̄: 1.96% x̃: 0.32% 95% mean confidence interval for cycles value: -45.90 -31.59 95% mean confidence interval for cycles %-change: -3.09% -2.50% Cycles are helped. All Gen6-Gen8 platforms had similar results. 
(Broadwell shown) total instructions in shared programs: 15055907 -> 15051466 (-0.03%) instructions in affected programs: 196370 -> 191929 (-2.26%) helped: 871 HURT: 26 helped stats (abs) min: 1 max: 19 x̄: 5.13 x̃: 4 helped stats (rel) min: 0.19% max: 36.36% x̄: 4.76% x̃: 3.27% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.11% max: 1.06% x̄: 0.24% x̃: 0.12% 95% mean confidence interval for instructions value: -5.21 -4.69 95% mean confidence interval for instructions %-change: -4.99% -4.24% Instructions are helped. total cycles in shared programs: 387729170 -> 387699745 (<.01%) cycles in affected programs: 1816409 -> 1786984 (-1.62%) helped: 788 HURT: 172 helped stats (abs) min: 1 max: 662 x̄: 47.29 x̃: 22 helped stats (rel) min: 0.03% max: 31.26% x̄: 3.55% x̃: 2.76% HURT stats (abs) min: 1 max: 404 x̄: 45.59 x̃: 14 HURT stats (rel) min: 0.03% max: 22.92% x̄: 1.53% x̃: 0.43% 95% mean confidence interval for cycles value: -35.69 -25.61 95% mean confidence interval for cycles %-change: -2.88% -2.40% Cycles are helped. total fills in shared programs: 34712 -> 34710 (<.01%) fills in affected programs: 7 -> 5 (-28.57%) helped: 1 HURT: 0 LOST: 0 GAINED: 2 Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-11 10:20:03 -07:00
Ian Romanick	09705747d7	nir/algebraic: Reassociate fadd into fmul in DPH-like pattern Moving the add to the other end of the sequence allows it to be fused into an FMA. Ice Lake total instructions in shared programs: 17173074 -> 16933147 (-1.40%) instructions in affected programs: 7938745 -> 7698818 (-3.02%) helped: 35583 HURT: 90 helped stats (abs) min: 1 max: 716 x̄: 6.75 x̃: 6 helped stats (rel) min: 0.10% max: 53.04% x̄: 5.29% x̃: 3.45% HURT stats (abs) min: 1 max: 41 x̄: 2.46 x̃: 1 HURT stats (rel) min: 0.32% max: 8.33% x̄: 1.41% x̃: 0.77% 95% mean confidence interval for instructions value: -6.80 -6.65 95% mean confidence interval for instructions %-change: -5.32% -5.22% Instructions are helped. total cycles in shared programs: 360881386 -> 359533568 (-0.37%) cycles in affected programs: 189489144 -> 188141326 (-0.71%) helped: 27250 HURT: 6707 helped stats (abs) min: 1 max: 21997 x̄: 62.15 x̃: 16 helped stats (rel) min: <.01% max: 70.69% x̄: 4.04% x̃: 2.35% HURT stats (abs) min: 1 max: 3507 x̄: 51.56 x̃: 14 HURT stats (rel) min: <.01% max: 77.26% x̄: 2.72% x̃: 1.27% 95% mean confidence interval for cycles value: -44.70 -34.68 95% mean confidence interval for cycles %-change: -2.75% -2.65% Cycles are helped. total spills in shared programs: 8943 -> 8829 (-1.27%) spills in affected programs: 625 -> 511 (-18.24%) helped: 6 HURT: 3 total fills in shared programs: 21815 -> 21719 (-0.44%) fills in affected programs: 1653 -> 1557 (-5.81%) helped: 7 HURT: 10 LOST: 11 GAINED: 3 Skylake and Broadwell had similar results. 
(Skylake shown) total instructions in shared programs: 15271996 -> 15040882 (-1.51%) instructions in affected programs: 7193699 -> 6962585 (-3.21%) helped: 33985 HURT: 30 helped stats (abs) min: 1 max: 260 x̄: 6.80 x̃: 6 helped stats (rel) min: 0.10% max: 30.00% x̄: 5.54% x̃: 3.85% HURT stats (abs) min: 1 max: 41 x̄: 4.00 x̃: 3 HURT stats (rel) min: 0.20% max: 2.16% x̄: 1.46% x̃: 1.72% 95% mean confidence interval for instructions value: -6.87 -6.72 95% mean confidence interval for instructions %-change: -5.59% -5.48% Instructions are helped. total cycles in shared programs: 355520785 -> 354253799 (-0.36%) cycles in affected programs: 185869148 -> 184602162 (-0.68%) helped: 25824 HURT: 6287 helped stats (abs) min: 1 max: 21997 x̄: 61.66 x̃: 16 helped stats (rel) min: <.01% max: 42.05% x̄: 4.18% x̃: 2.41% HURT stats (abs) min: 1 max: 3327 x̄: 51.76 x̃: 14 HURT stats (rel) min: <.01% max: 101.62% x̄: 2.80% x̃: 1.28% 95% mean confidence interval for cycles value: -44.70 -34.21 95% mean confidence interval for cycles %-change: -2.87% -2.76% Cycles are helped. total spills in shared programs: 8835 -> 8818 (-0.19%) spills in affected programs: 613 -> 596 (-2.77%) helped: 5 HURT: 2 total fills in shared programs: 21738 -> 21744 (0.03%) fills in affected programs: 1348 -> 1354 (0.45%) helped: 5 HURT: 11 LOST: 0 GAINED: 12 Haswell total instructions in shared programs: 13447102 -> 13381508 (-0.49%) instructions in affected programs: 3770735 -> 3705141 (-1.74%) helped: 11999 HURT: 29 helped stats (abs) min: 1 max: 409 x̄: 5.60 x̃: 3 helped stats (rel) min: 0.10% max: 20.00% x̄: 2.38% x̃: 1.87% HURT stats (abs) min: 3 max: 750 x̄: 54.90 x̃: 3 HURT stats (rel) min: 0.12% max: 125.30% x̄: 9.96% x̃: 1.82% 95% mean confidence interval for instructions value: -5.71 -5.19 95% mean confidence interval for instructions %-change: -2.39% -2.30% Instructions are helped. 
total cycles in shared programs: 376342236 -> 375690458 (-0.17%) cycles in affected programs: 155699021 -> 155047243 (-0.42%) helped: 8397 HURT: 2876 helped stats (abs) min: 1 max: 20248 x̄: 109.87 x̃: 18 helped stats (rel) min: <.01% max: 40.71% x̄: 2.23% x̃: 1.49% HURT stats (abs) min: 1 max: 15414 x̄: 94.15 x̃: 22 HURT stats (rel) min: <.01% max: 432.49% x̄: 3.15% x̃: 1.41% 95% mean confidence interval for cycles value: -67.64 -48.00 95% mean confidence interval for cycles %-change: -0.99% -0.74% Cycles are helped. total spills in shared programs: 23134 -> 23184 (0.22%) spills in affected programs: 1675 -> 1725 (2.99%) helped: 13 HURT: 11 total fills in shared programs: 34550 -> 34686 (0.39%) fills in affected programs: 1421 -> 1557 (9.57%) helped: 13 HURT: 11 LOST: 0 GAINED: 11 Ivy Bridge total instructions in shared programs: 12019642 -> 11987285 (-0.27%) instructions in affected programs: 1532236 -> 1499879 (-2.11%) helped: 5522 HURT: 110 helped stats (abs) min: 1 max: 312 x̄: 6.22 x̃: 3 helped stats (rel) min: 0.16% max: 20.00% x̄: 2.46% x̃: 1.88% HURT stats (abs) min: 1 max: 750 x̄: 18.07 x̃: 3 HURT stats (rel) min: 0.09% max: 125.30% x̄: 3.42% x̃: 1.15% 95% mean confidence interval for instructions value: -6.25 -5.24 95% mean confidence interval for instructions %-change: -2.43% -2.26% Instructions are helped. total cycles in shared programs: 180214667 -> 179761900 (-0.25%) cycles in affected programs: 31448723 -> 30995956 (-1.44%) helped: 7191 HURT: 2838 helped stats (abs) min: 1 max: 17680 x̄: 88.47 x̃: 17 helped stats (rel) min: <.01% max: 50.45% x̄: 2.16% x̃: 1.40% HURT stats (abs) min: 1 max: 15540 x̄: 64.63 x̃: 24 HURT stats (rel) min: 0.02% max: 435.17% x̄: 3.10% x̃: 1.51% 95% mean confidence interval for cycles value: -53.34 -36.95 95% mean confidence interval for cycles %-change: -0.81% -0.53% Cycles are helped. 
total spills in shared programs: 3599 -> 3642 (1.19%) spills in affected programs: 1180 -> 1223 (3.64%) helped: 12 HURT: 2 total fills in shared programs: 4031 -> 4162 (3.25%) fills in affected programs: 876 -> 1007 (14.95%) helped: 12 HURT: 2 LOST: 6 GAINED: 5 Sandy Bridge total instructions in shared programs: 10850686 -> 10822890 (-0.26%) instructions in affected programs: 1247986 -> 1220190 (-2.23%) helped: 4699 HURT: 102 helped stats (abs) min: 1 max: 104 x̄: 6.02 x̃: 3 helped stats (rel) min: 0.15% max: 17.65% x̄: 2.44% x̃: 1.88% HURT stats (abs) min: 1 max: 16 x̄: 4.70 x̃: 3 HURT stats (rel) min: 0.09% max: 3.85% x̄: 1.11% x̃: 1.10% 95% mean confidence interval for instructions value: -6.10 -5.47 95% mean confidence interval for instructions %-change: -2.42% -2.30% Instructions are helped. total cycles in shared programs: 154044149 -> 153920095 (-0.08%) cycles in affected programs: 26037392 -> 25913338 (-0.48%) helped: 5974 HURT: 2521 helped stats (abs) min: 1 max: 1802 x̄: 35.42 x̃: 16 helped stats (rel) min: <.01% max: 35.80% x̄: 1.43% x̃: 0.84% HURT stats (abs) min: 1 max: 862 x̄: 34.73 x̃: 20 HURT stats (rel) min: 0.01% max: 36.33% x̄: 1.67% x̃: 0.85% 95% mean confidence interval for cycles value: -16.31 -12.90 95% mean confidence interval for cycles %-change: -0.56% -0.45% Cycles are helped. total spills in shared programs: 2876 -> 2957 (2.82%) spills in affected programs: 592 -> 673 (13.68%) helped: 6 HURT: 35 total fills in shared programs: 3157 -> 3134 (-0.73%) fills in affected programs: 402 -> 379 (-5.72%) helped: 6 HURT: 0 LOST: 5 GAINED: 11 Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-11 10:20:03 -07:00
Ian Romanick	ff9f526de3	nir/algebraic: Recognize open-coded flrp(-1, 1, a) and flrp(1, -1, a) v2: Remove flrp@64 cases. Since Gen11 removes flrp@32, it seems unlikely that we'll ever have a flrp@64. Should that occur, the cases can be added back. v3: Add a couple more patterns that just move the negation around. No shader-db changes Ice Lake, Iron Lake, or GM45 as these platforms lack a LRP instruction. Skylake total instructions in shared programs: 15279687 -> 15256058 (-0.15%) instructions in affected programs: 4344440 -> 4320811 (-0.54%) helped: 23455 HURT: 18 helped stats (abs) min: 1 max: 21 x̄: 1.01 x̃: 1 helped stats (rel) min: 0.02% max: 13.33% x̄: 0.86% x̃: 0.65% HURT stats (abs) min: 1 max: 2 x̄: 1.06 x̃: 1 HURT stats (rel) min: 0.13% max: 1.16% x̄: 0.43% x̃: 0.34% 95% mean confidence interval for instructions value: -1.01 -1.00 95% mean confidence interval for instructions %-change: -0.87% -0.85% Instructions are helped. total cycles in shared programs: 355593755 -> 355339981 (-0.07%) cycles in affected programs: 162089552 -> 161835778 (-0.16%) helped: 20467 HURT: 7158 helped stats (abs) min: 1 max: 2074 x̄: 29.00 x̃: 6 helped stats (rel) min: <.01% max: 35.71% x̄: 1.71% x̃: 0.58% HURT stats (abs) min: 1 max: 4814 x̄: 47.46 x̃: 11 HURT stats (rel) min: <.01% max: 125.43% x̄: 2.88% x̃: 0.98% 95% mean confidence interval for cycles value: -10.39 -7.98 95% mean confidence interval for cycles %-change: -0.57% -0.47% Cycles are helped. 
total spills in shared programs: 8843 -> 8835 (-0.09%) spills in affected programs: 190 -> 182 (-4.21%) helped: 2 HURT: 0 total fills in shared programs: 21738 -> 21738 (0.00%) fills in affected programs: 372 -> 372 (0.00%) helped: 1 HURT: 1 LOST: 12 GAINED: 22 Broadwell total instructions in shared programs: 15290523 -> 15266818 (-0.16%) instructions in affected programs: 4314738 -> 4291033 (-0.55%) helped: 23391 HURT: 11 helped stats (abs) min: 1 max: 119 x̄: 1.02 x̃: 1 helped stats (rel) min: 0.02% max: 13.33% x̄: 0.86% x̃: 0.65% HURT stats (abs) min: 1 max: 189 x̄: 18.09 x̃: 1 HURT stats (rel) min: 0.11% max: 5.39% x̄: 0.98% x̃: 0.50% 95% mean confidence interval for instructions value: -1.04 -0.99 95% mean confidence interval for instructions %-change: -0.87% -0.85% Instructions are helped. total cycles in shared programs: 388911660 -> 388830827 (-0.02%) cycles in affected programs: 172903324 -> 172822491 (-0.05%) helped: 15601 HURT: 13269 helped stats (abs) min: 1 max: 1986 x̄: 29.18 x̃: 6 helped stats (rel) min: <.01% max: 36.60% x̄: 1.74% x̃: 0.55% HURT stats (abs) min: 1 max: 14904 x̄: 28.21 x̃: 6 HURT stats (rel) min: <.01% max: 102.58% x̄: 1.77% x̃: 0.60% 95% mean confidence interval for cycles value: -4.20 -1.40 95% mean confidence interval for cycles %-change: -0.17% -0.08% Cycles are helped. 
total spills in shared programs: 23110 -> 23069 (-0.18%) spills in affected programs: 656 -> 615 (-6.25%) helped: 3 HURT: 1 total fills in shared programs: 34399 -> 34398 (<.01%) fills in affected programs: 905 -> 904 (-0.11%) helped: 3 HURT: 1 LOST: 6 GAINED: 23 Haswell total instructions in shared programs: 13465303 -> 13441142 (-0.18%) instructions in affected programs: 3726999 -> 3702838 (-0.65%) helped: 22139 HURT: 347 helped stats (abs) min: 1 max: 43 x̄: 1.11 x̃: 1 helped stats (rel) min: 0.03% max: 10.00% x̄: 1.01% x̃: 0.75% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.35% max: 11.11% x̄: 1.48% x̃: 1.12% 95% mean confidence interval for instructions value: -1.08 -1.07 95% mean confidence interval for instructions %-change: -0.99% -0.96% Instructions are helped. total cycles in shared programs: 376271308 -> 376273090 (<.01%) cycles in affected programs: 167496811 -> 167498593 (<.01%) helped: 13206 HURT: 13281 helped stats (abs) min: 1 max: 3864 x̄: 35.39 x̃: 8 helped stats (rel) min: <.01% max: 53.10% x̄: 2.31% x̃: 0.80% HURT stats (abs) min: 1 max: 3828 x̄: 35.32 x̃: 8 HURT stats (rel) min: <.01% max: 117.85% x̄: 2.88% x̃: 0.61% 95% mean confidence interval for cycles value: -1.33 1.47 95% mean confidence interval for cycles %-change: 0.22% 0.36% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 23158 -> 23134 (-0.10%) spills in affected programs: 24 -> 0 helped: 3 HURT: 0 total fills in shared programs: 34580 -> 34550 (-0.09%) fills in affected programs: 30 -> 0 helped: 3 HURT: 0 LOST: 23 GAINED: 13 Ivy Bridge total instructions in shared programs: 12034154 -> 12014301 (-0.16%) instructions in affected programs: 3636209 -> 3616356 (-0.55%) helped: 18771 HURT: 459 helped stats (abs) min: 1 max: 43 x̄: 1.08 x̃: 1 helped stats (rel) min: 0.03% max: 10.00% x̄: 0.91% x̃: 0.68% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.34% max: 8.33% x̄: 1.43% x̃: 1.11% 95% mean confidence interval for instructions value: -1.04 -1.02 95% mean confidence interval for instructions %-change: -0.86% -0.84% Instructions are helped. total cycles in shared programs: 180186960 -> 180175147 (<.01%) cycles in affected programs: 44652745 -> 44640932 (-0.03%) helped: 12979 HURT: 11033 helped stats (abs) min: 1 max: 5836 x̄: 32.88 x̃: 6 helped stats (rel) min: <.01% max: 53.10% x̄: 2.19% x̃: 0.74% HURT stats (abs) min: 1 max: 4811 x̄: 37.61 x̃: 9 HURT stats (rel) min: <.01% max: 115.18% x̄: 2.99% x̃: 0.69% 95% mean confidence interval for cycles value: -2.29 1.31 95% mean confidence interval for cycles %-change: 0.11% 0.26% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 3623 -> 3599 (-0.66%) spills in affected programs: 24 -> 0 helped: 3 HURT: 0 total fills in shared programs: 4061 -> 4031 (-0.74%) fills in affected programs: 30 -> 0 helped: 3 HURT: 0 LOST: 17 GAINED: 18 Sandy Bridge total instructions in shared programs: 10853968 -> 10834932 (-0.18%) instructions in affected programs: 3769957 -> 3750921 (-0.50%) helped: 17944 HURT: 204 helped stats (abs) min: 1 max: 3 x̄: 1.07 x̃: 1 helped stats (rel) min: 0.02% max: 10.00% x̄: 0.83% x̃: 0.60% HURT stats (abs) min: 1 max: 2 x̄: 1.01 x̃: 1 HURT stats (rel) min: 0.31% max: 9.09% x̄: 1.83% x̃: 0.93% 95% mean confidence interval for instructions value: -1.05 -1.04 95% mean confidence interval for instructions %-change: -0.81% -0.78% Instructions are helped. total cycles in shared programs: 153894864 -> 153885988 (<.01%) cycles in affected programs: 50643925 -> 50635049 (-0.02%) helped: 9361 HURT: 10534 helped stats (abs) min: 1 max: 1966 x̄: 19.42 x̃: 4 helped stats (rel) min: <.01% max: 34.97% x̄: 0.90% x̃: 0.22% HURT stats (abs) min: 1 max: 1371 x̄: 16.42 x̃: 5 HURT stats (rel) min: <.01% max: 55.10% x̄: 0.81% x̃: 0.27% 95% mean confidence interval for cycles value: -1.27 0.38 95% mean confidence interval for cycles %-change: -0.03% 0.04% Inconclusive result (value mean confidence interval includes 0). LOST: 6 GAINED: 24 Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-11 10:20:03 -07:00
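The "open-coded flrp" recognized by the commit above can be illustrated with plain arithmetic. The following is a minimal sketch, assuming the usual definition flrp(x, y, a) = x * (1 - a) + y * a; the function names are hypothetical and the exact patterns added to nir_opt_algebraic are not shown in this log.

```python
def flrp(x, y, a):
    """Linear interpolation, assuming flrp(x, y, a) = x * (1 - a) + y * a."""
    return x * (1.0 - a) + y * a


def open_coded_neg1_to_1(a):
    """Hypothetical open-coded form of flrp(-1, 1, a): 2 * a - 1."""
    return 2.0 * a - 1.0


def open_coded_1_to_neg1(a):
    """Hypothetical open-coded form of flrp(1, -1, a): 1 - 2 * a."""
    return 1.0 - 2.0 * a


if __name__ == "__main__":
    # Compare both forms at a few interpolation factors.
    for a in (0.0, 0.25, 0.5, 1.0):
        print(a, flrp(-1.0, 1.0, a), open_coded_neg1_to_1(a))
        print(a, flrp(1.0, -1.0, a), open_coded_1_to_neg1(a))
```

On hardware with a LRP instruction, replacing the multiply-and-add sequence with a single flrp is presumably where the instruction-count reductions reported above come from.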
Ian Romanick	1259f6d802	nir: intel/vec4: Add flag to disable some algebraic optimizations A couple patches later in this series use the flag to avoid a few thousand shader-db regressions on all vec4 platforms. I'm not particularly enamored with the name of this flag. However, I suspect the Intel vec4 backend is the only backend that will benefit from it. Specifically, the cases where this helps are all cases where we want to prevent nir_opt_algebraic from rearranging instructions to create 3-source instructions, such as ffma and flrp, with additional immediate value or uniform sources. The earlier commit "intel/vec4: Try to emit a single load for multiple 3-src instruction operands" solves most of the problems caused by additional immediate values, but the restrictions on register strides that cause problems for uniforms and shader inputs persist. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-11 10:20:03 -07:00
Ian Romanick	3a1fdca5ad	intel/vec4: Try to emit immediate sources for MOV Per the comment in vec4_visitor::nir_emit_load_const, further improvement is possible in this area. That case would be more complicated as I think we'd want to check that all users of the nir_load_const_instr result intended to use the value as float. No shader-db changes on any Gen8+ platform as these platforms do not use the vec4 backend. v2: Massive rebase on `eeebeb211f` ("intel/vec4: Try emitting non-scalar immediates"). This commit is about twice as helpful since `b04beaf41d` ("intel/vec4: Try both sources as candidates for being immediates"). Haswell and Ivy Bridge had similar results. (Haswell shown) total instructions in shared programs: 13478598 -> 13474068 (-0.03%) instructions in affected programs: 589452 -> 584922 (-0.77%) helped: 2773 HURT: 0 helped stats (abs) min: 1 max: 7 x̄: 1.63 x̃: 1 helped stats (rel) min: 0.16% max: 5.66% x̄: 0.96% x̃: 0.83% 95% mean confidence interval for instructions value: -1.67 -1.60 95% mean confidence interval for instructions %-change: -0.98% -0.94% Instructions are helped. total cycles in shared programs: 376386916 -> 376369392 (<.01%) cycles in affected programs: 16871628 -> 16854104 (-0.10%) helped: 2293 HURT: 523 helped stats (abs) min: 2 max: 812 x̄: 13.80 x̃: 2 helped stats (rel) min: <.01% max: 10.18% x̄: 1.02% x̃: 0.36% HURT stats (abs) min: 2 max: 316 x̄: 26.99 x̃: 14 HURT stats (rel) min: <.01% max: 19.34% x̄: 2.15% x̃: 1.43% 95% mean confidence interval for cycles value: -7.87 -4.58 95% mean confidence interval for cycles %-change: -0.52% -0.34% Cycles are helped. 
Sandy Bridge total instructions in shared programs: 10860328 -> 10857675 (-0.02%) instructions in affected programs: 335907 -> 333254 (-0.79%) helped: 1639 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 1.62 x̃: 1 helped stats (rel) min: 0.10% max: 5.26% x̄: 0.86% x̃: 0.70% 95% mean confidence interval for instructions value: -1.67 -1.57 95% mean confidence interval for instructions %-change: -0.89% -0.84% Instructions are helped. total cycles in shared programs: 153942720 -> 153934120 (<.01%) cycles in affected programs: 5604818 -> 5596218 (-0.15%) helped: 1494 HURT: 97 helped stats (abs) min: 2 max: 256 x̄: 7.84 x̃: 2 helped stats (rel) min: 0.01% max: 6.62% x̄: 0.35% x̃: 0.18% HURT stats (abs) min: 2 max: 160 x̄: 32.02 x̃: 20 HURT stats (rel) min: 0.02% max: 3.37% x̄: 0.88% x̃: 0.56% 95% mean confidence interval for cycles value: -6.45 -4.36 95% mean confidence interval for cycles %-change: -0.32% -0.23% Cycles are helped. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8139378 -> 8137267 (-0.03%) instructions in affected programs: 265616 -> 263505 (-0.79%) helped: 1148 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 1.84 x̃: 1 helped stats (rel) min: 0.22% max: 4.76% x̄: 0.87% x̃: 0.62% 95% mean confidence interval for instructions value: -1.90 -1.78 95% mean confidence interval for instructions %-change: -0.90% -0.83% Instructions are helped. total cycles in shared programs: 188541756 -> 188537540 (<.01%) cycles in affected programs: 9807004 -> 9802788 (-0.04%) helped: 1143 HURT: 4 helped stats (abs) min: 2 max: 10 x̄: 3.70 x̃: 2 helped stats (rel) min: <.01% max: 3.01% x̄: 0.13% x̃: 0.06% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.18% max: 0.18% x̄: 0.18% x̃: 0.18% 95% mean confidence interval for cycles value: -3.80 -3.55 95% mean confidence interval for cycles %-change: -0.14% -0.12% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-11 10:20:03 -07:00
Ian Romanick	acd7796a07	intel/vec4: Try to emit a VF source in try_immediate_source This commit is also a pre-requisite for the next commit. No shader-db changes on any Gen8+ platform as these platforms do not use the vec4 backend. v2: Massive rebase on `eeebeb211f` ("intel/vec4: Try emitting non-scalar immediates"). This change is a lot less helpful since that commit landed (previously helped 1934 shaders on HSW) because, apparently, a lot of the cases helped by that commit were things like vector loads of { 1.0, 1.0, 1.0 } that were also helped by this commit. Haswell total instructions in shared programs: 13480095 -> 13478598 (-0.01%) instructions in affected programs: 229534 -> 228037 (-0.65%) helped: 1006 HURT: 0 helped stats (abs) min: 1 max: 7 x̄: 1.49 x̃: 1 helped stats (rel) min: 0.04% max: 3.45% x̄: 1.11% x̃: 1.09% 95% mean confidence interval for instructions value: -1.54 -1.43 95% mean confidence interval for instructions %-change: -1.15% -1.07% Instructions are helped. total cycles in shared programs: 376385734 -> 376386916 (<.01%) cycles in affected programs: 14101380 -> 14102562 (<.01%) helped: 941 HURT: 56 helped stats (abs) min: 2 max: 322 x̄: 5.62 x̃: 2 helped stats (rel) min: <.01% max: 7.74% x̄: 0.51% x̃: 0.42% HURT stats (abs) min: 2 max: 618 x̄: 115.50 x̃: 32 HURT stats (rel) min: 0.03% max: 4.62% x̄: 0.83% x̃: 0.44% 95% mean confidence interval for cycles value: -2.06 4.43 95% mean confidence interval for cycles %-change: -0.47% -0.39% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 12048004 -> 12046589 (-0.01%) instructions in affected programs: 217072 -> 215657 (-0.65%) helped: 934 HURT: 0 helped stats (abs) min: 1 max: 7 x̄: 1.51 x̃: 1 helped stats (rel) min: 0.04% max: 3.45% x̄: 1.14% x̃: 1.11% 95% mean confidence interval for instructions value: -1.57 -1.46 95% mean confidence interval for instructions %-change: -1.18% -1.10% Instructions are helped. 
total cycles in shared programs: 180285854 -> 180287608 (<.01%) cycles in affected programs: 14103824 -> 14105578 (0.01%) helped: 871 HURT: 53 helped stats (abs) min: 2 max: 322 x̄: 5.51 x̃: 2 helped stats (rel) min: <.01% max: 7.67% x̄: 0.50% x̃: 0.42% HURT stats (abs) min: 2 max: 618 x̄: 123.66 x̃: 32 HURT stats (rel) min: 0.03% max: 4.47% x̄: 0.92% x̃: 0.46% 95% mean confidence interval for cycles value: -1.60 5.39 95% mean confidence interval for cycles %-change: -0.46% -0.37% Inconclusive result (value mean confidence interval includes 0). Sandy Bridge total instructions in shared programs: 10861227 -> 10860328 (<.01%) instructions in affected programs: 92969 -> 92070 (-0.97%) helped: 624 HURT: 0 helped stats (abs) min: 1 max: 7 x̄: 1.44 x̃: 1 helped stats (rel) min: 0.11% max: 3.45% x̄: 1.05% x̃: 0.95% 95% mean confidence interval for instructions value: -1.52 -1.36 95% mean confidence interval for instructions %-change: -1.09% -1.01% Instructions are helped. total cycles in shared programs: 153944316 -> 153942720 (<.01%) cycles in affected programs: 1640956 -> 1639360 (-0.10%) helped: 601 HURT: 15 helped stats (abs) min: 2 max: 120 x̄: 3.56 x̃: 2 helped stats (rel) min: 0.02% max: 6.33% x̄: 0.18% x̃: 0.08% HURT stats (abs) min: 2 max: 72 x̄: 36.13 x̃: 36 HURT stats (rel) min: 0.05% max: 3.84% x̄: 1.95% x̃: 2.00% 95% mean confidence interval for cycles value: -3.44 -1.74 95% mean confidence interval for cycles %-change: -0.18% -0.09% Cycles are helped. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8139924 -> 8139378 (<.01%) instructions in affected programs: 69776 -> 69230 (-0.78%) helped: 322 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 1.70 x̃: 1 helped stats (rel) min: 0.27% max: 3.23% x̄: 0.79% x̃: 0.54% 95% mean confidence interval for instructions value: -1.88 -1.51 95% mean confidence interval for instructions %-change: -0.85% -0.72% Instructions are helped. 
total cycles in shared programs: 188542864 -> 188541756 (<.01%) cycles in affected programs: 3031532 -> 3030424 (-0.04%) helped: 320 HURT: 0 helped stats (abs) min: 2 max: 20 x̄: 3.46 x̃: 2 helped stats (rel) min: <.01% max: 0.69% x̄: 0.06% x̃: 0.06% 95% mean confidence interval for cycles value: -3.85 -3.07 95% mean confidence interval for cycles %-change: -0.06% -0.05% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-11 10:20:03 -07:00
Ian Romanick	365b45d571	intel/vec4: Try to emit a single load for multiple 3-src instruction operands If a 3-source instruction uses immediate values 1.0 and -1.0, just load 1.0 into a register. Use the negation source modifier to get -1.0. This has trivial impact now, but it prevents a few thousand regressions on vec4 platforms with "nir/algebraic: Recognize open-coded flrp(-1, 1, a) and flrp(1, -1, a)" All Gen6 and Gen7 platforms had similar results. (Haswell shown) total instructions in shared programs: 13487412 -> 13487406 (<.01%) instructions in affected programs: 541 -> 535 (-1.11%) helped: 6 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.36% max: 2.08% x̄: 1.65% x̃: 1.80% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -2.33% -0.97% Instructions are helped. total cycles in shared programs: 376402564 -> 376402500 (<.01%) cycles in affected programs: 10348 -> 10284 (-0.62%) helped: 10 HURT: 1 helped stats (abs) min: 2 max: 26 x̄: 7.00 x̃: 2 helped stats (rel) min: 0.13% max: 2.05% x̄: 0.89% x̃: 0.79% HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.29% max: 0.29% x̄: 0.29% x̃: 0.29% 95% mean confidence interval for cycles value: -11.72 0.08 95% mean confidence interval for cycles %-change: -1.20% -0.36% Inconclusive result (value mean confidence interval includes 0). No shader-db changes on any other Intel platform. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-11 10:20:03 -07:00
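The idea described in the commit above (load 1.0 once, reach -1.0 through the negation source modifier) can be sketched as follows. This is an illustration only, not the vec4 backend's code: `plan_immediate_loads` and its return shape are hypothetical.

```python
def plan_immediate_loads(immediates):
    """Plan register loads for the immediate operands of one instruction.

    Returns (loads, refs): `loads` lists the values actually loaded into
    registers, and `refs` holds one (load_index, negate) pair per operand.
    An operand that is the negation of an already-loaded value reuses that
    register with the negate modifier instead of costing another load.
    """
    loads = []
    refs = []
    for value in immediates:
        if value in loads:
            refs.append((loads.index(value), False))
        elif -value in loads:
            refs.append((loads.index(-value), True))
        else:
            loads.append(value)
            refs.append((len(loads) - 1, False))
    return loads, refs


if __name__ == "__main__":
    # Two operands, 1.0 and -1.0, share a single load.
    print(plan_immediate_loads([1.0, -1.0]))
```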
Ian Romanick	6f6bc842f6	intel/vec4: Refactor operand fixing for ffma and flrp Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-11 10:20:03 -07:00
Alyssa Rosenzweig	8305766e0e	panfrost: Wire up GLES2-class polygon offset Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-11 09:40:47 -07:00
Alyssa Rosenzweig	7a36c72f5d	pan/decode: Depth units/factor are identical to GL I'm not sure why I thought there was an off-by-one, other than maybe bad data collection. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-11 09:40:47 -07:00
Christian Gmeiner	a7153ebcd3	etnaviv: remove dead translate_ts_sampler_format(..) declaration Fixes: `66411521ea` ("etnaviv: combine translate_ts_sampler_format/translate_msaa_format") Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-07-11 17:51:15 +02:00
Caio Marcelo de Oliveira Filho	b390ff3517	intel/fs: Add support for SLM fence in Gen11 Gen11 SLM is not on L3 anymore, so now the hardware has two separate fences. Add a way to control which fence types to use. At this time, we don't have enough information in NIR to control the visibility of the memory being fenced, so for now be conservative and assume that fences will need a stall. With more information later we'll be able to reduce those. Fixes Vulkan CTS tests in ICL: dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_nonlocal.workgroup.guard_local.buffer.comp dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_local.buffer.guard_nonlocal.workgroup.comp dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_local.image.guard_nonlocal.workgroup.comp dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_local.buffer.guard_nonlocal.workgroup.comp dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_local.image.guard_nonlocal.workgroup.comp The whole set of supported tests in dEQP-VK.memory_model.* group should be passing in ICL now. v2: Pass BTI around instead of having an enum. (Jason) Emit two SHADER_OPCODE_MEMORY_FENCE instead of one that gets transformed into two. (Jason) List tests fixed. (Lionel) v3: For clarity, split the decision of which fences to emit from the emission code. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-11 08:29:32 -07:00
Tomeu Vizoso	838374b6dd	Revert "panfrost/midgard: Use _safe iterator" This reverts commit `812ce2ce9e`. We massively regress with the reverted patch. So in the meantime, take it out. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-07-11 16:53:42 +02:00
Alyssa Rosenzweig	507e297431	panfrost: Don't lie about Z/S formats Only Z24S8 is properly supported right now, so let's be careful. Fixes a number of issues relating to improper Z/S handling. The most obvious is depth buffers with incorrect strides, which manifests in truly bizarre ways and can happen commonly with FBOs. Fixes WebGL (Aquarium runs, etc). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-11 14:27:25 +00:00
Samuel Pitoiset	cd403a931f	radv/gfx10: enable geometry shaders Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-11 15:46:02 +02:00
Bas Nieuwenhuizen	0a8ef756d3	radv/gfx10: Fix NGG GS output mask handling for LDS indexing. In emit_vertex we optimize storage if the output mask does not have all bits set. Do the same in the epilogue so the indices actually match up. Fixes dEQP-VK.geometry.input.basic_primitive.points because it outputs PSIZE with an output mask of 1, which causes the generic attribute for the color to be loaded from the wrong indices. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-11 15:45:59 +02:00
Bas Nieuwenhuizen	f5982917ff	radv/gfx10: Simplify output mask handling for NGG GS. We only ever get in this function for a NGG GS proper. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-11 15:45:58 +02:00
Bas Nieuwenhuizen	7515f41c78	radv/gfx10: Do GS prologue outside of gs_threads if. Mirror radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-11 15:45:56 +02:00
Samuel Pitoiset	5bbcb3f5bc	radv/gfx10: implement support for GS as NGG Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-11 15:45:53 +02:00
Bas Nieuwenhuizen	7286865f6d	radv/gfx10: Use correct ES shader for es_vgpr_comp_cnt for GS. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-11 15:45:51 +02:00
Bas Nieuwenhuizen	45b73b3aa9	radv/gfx10: Do not allocate a gs_copy_shader on gfx10. Will use ngg for any gs anyway. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-11 15:45:47 +02:00
Samuel Pitoiset	ef5efb40f4	radv/gfx10: fix VGT_SHADER_STAGES_EN for GS as NGG The driver shouldn't set the copy shader bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-11 15:45:43 +02:00
Samuel Pitoiset	8bc3ab6f0c	radv/gfx10: fix number of GS invocations for NGG Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-11 15:45:40 +02:00
Tomeu Vizoso	812ce2ce9e	panfrost/midgard: Use _safe iterator Fixes this assertion: ../mesa/src/panfrost/midgard/midgard_schedule.c:507:schedule_block: Assertion `ins == __next && "use _safe iterator"' failed. Trace/breakpoint trap Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-11 15:06:51 +02:00
Tomeu Vizoso	82ee48e5ef	panfrost: Place the height value in the height field In the mali_single_framebuffer descriptor. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> v2: Remove unwanted chunks	2019-07-11 15:06:47 +02:00
Samuel Pitoiset	022b1f4190	radv/gfx10: fix maximum number of mip levels for 3D images The dimensions also have to be adjusted if the number of supported mip levels is changed. This fixes dEQP-VK.api.info.image_format_properties.3d.*. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-11 14:44:47 +02:00
Samuel Pitoiset	f3dfdd4091	radv/gfx10: disable TC-compat HTILE for multisampled D32_SFLOAT format For some reason D32_SFLOAT is also affected on GFX10; it works fine with previous generations. This fixes some dEQP-VK.renderpass2.depth_stencil_resolve.*. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-11 13:43:21 +02:00
Kenneth Graunke	a01770b9c8	iris: Fix key->input_vertices for 8_PATCH TCS mode. We were failing to flag the program dirty when it changed. Also, we were unnecessarily setting key->input_vertices for SINGLE_PATCH mode, which would reduce program cache hits. Only set it if needed.	2019-07-11 01:18:24 -07:00
Kenneth Graunke	c58f52f0ef	iris: Only set key->flat_shade if COL0/COL1 are written. This was just laziness on my part, we already added similar checks in the VS key handling. Just need to do it here too. Should improve cache hits.	2019-07-11 00:12:50 -07:00
Kenneth Graunke	cb82d534a0	iris: Drop comment about var->data.binding not being set. I refactored the sampler lowering passes a long time ago to ensure that gl_nir_lower_samplers_as_deref is run and var->data.binding is set.	2019-07-11 00:12:00 -07:00
Kenneth Graunke	38f9954208	iris: Drop comments about missing NOS These stages don't need NOS. If they do, we can add it - the infrastructure is there if we need it someday.	2019-07-11 00:12:00 -07:00
Kenneth Graunke	2bd1234a77	iris: Drop a TODO comment This is literally implemented two lines above.	2019-07-11 00:12:00 -07:00
Neil Roberts	eae06b34ea	glsl/builtin types: Set the precision on the depth range params The members of gl_DepthRangeParameters are declared to be highp in GLSL ES specs. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-11 08:04:54 +02:00
Neil Roberts	74d71dac20	glsl: Add a constructor for glsl_struct_field to specify the precision Adds a third constructor to glsl_struct_field which has an extra parameter to specify the precision. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-11 08:04:54 +02:00
Neil Roberts	014be60398	glsl: Add a macro for the default values for glsl_struct_field There are two constructors for glsl_struct_field with different parameters. Instead of repeating them for both constructors, this patch adds a convenience macro. This will make it easier to add a third constructor in a later patch. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-11 08:04:54 +02:00
Neil Roberts	ca6ee488e9	glsl/builtin_variables: Add a precision to the builtins All of the builtin variables mentioned in the GLSL ES spec and the extensions include a precision declaration which is different depending on what the variable is used for. This patch makes it set the corresponding precision when creating the variable. This will make a difference once we start using the precision information for optimisation. Previously all of the builtin variables ended up with a precision of NONE. v2: Made gl_PointSize and gl_FragCoord highp since GLSL ES 3.00. Fixed gl_MaxViewPorts to always be highp. (Eric Anholt) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-11 08:04:54 +02:00
Kenneth Graunke	ce93bf1876	compiler: Save a single copy of the softfp64 shader in the context. We were recompiling the softfp64 library of functions from GLSL to NIR every time we compiled a shader that used fp64. Worse, we were ralloc stealing it to the GL context. This meant that we'd accumulate lots of copies for the lifetime of the context, which was a big space leak. Instead, we can simply stash a single copy in the GL context, and use it for subsequent compiles. Having a single copy should be fine from a memory context point of view: nir_inline_function_impl already clones the necessary nir_function_impl's as it inlines. KHR-GL45.enhanced_layouts.ssb_member_align_non_power_of_2 was previously OOM'ing a system with 16GB of RAM when using softfp64. Now it finishes much more quickly and uses only ~200MB of RAM. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-07-10 22:14:36 -07:00
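The caching scheme described in the commit above (build the softfp64 library once, keep it in the context, clone it at use) can be sketched roughly as follows. All names here are hypothetical stand-ins; the real change lives in Mesa's C/C++ compiler code and is not reproduced in this log.

```python
import copy


class Context:
    """Hypothetical stand-in for a GL context that owns the cached library."""

    def __init__(self):
        self.softfp64 = None  # single cached copy, built on first use
        self.builds = 0       # counts how often the expensive build ran


def build_softfp64_library(ctx):
    """Stand-in for the expensive GLSL-to-NIR compile of the library."""
    ctx.builds += 1
    return {"name": "softfp64", "functions": ["fadd64", "fmul64"]}


def get_softfp64(ctx):
    """Return the cached library, building it only if it is missing."""
    if ctx.softfp64 is None:
        ctx.softfp64 = build_softfp64_library(ctx)
    return ctx.softfp64


def compile_shader(ctx, uses_fp64):
    """Give each fp64 shader its own clone, leaving the cached copy intact."""
    if not uses_fp64:
        return None
    return copy.deepcopy(get_softfp64(ctx))


if __name__ == "__main__":
    ctx = Context()
    for _ in range(3):
        compile_shader(ctx, uses_fp64=True)
    print(ctx.builds)
```

Cloning at the point of use mirrors the commit's remark that inlining already clones what it needs, so sharing one cached copy per context does not expose it to modification by individual compiles.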
Timothy Arceri	ae4ccb67be	radv: fix memory leak when restoring from cache Fixes: `726a31df70` ("radv: Add the concept of radv shader binaries.") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-11 10:44:29 +10:00
Kristian H. Kristensen	e03259974e	freedreno: Generate headers from xml files Reviewed-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Rob Clark <robdclark@gmail.com>	2019-07-10 22:05:02 +00:00
Samuel Pitoiset	51e2124a4b	radv: switch to the new VS exports path It will help for GS as NGG on GFX10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-10 23:37:02 +02:00
Samuel Pitoiset	f616d80a7a	radv: set the slot_index correctly for VARYING_SLOT_CLIP_DIST1 For selecting a different SQ_EXP_POS target. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-10 23:36:59 +02:00
Samuel Pitoiset	c4ab33378a	radv: add a new function for exporting VS outputs Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-10 23:36:57 +02:00
Samuel Pitoiset	ac0edc369c	radv: implement new path for exporting generic varyings Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-10 23:36:55 +02:00
Samuel Pitoiset	0b368fc8c3	radv: use the generic export path for clip/cull distances When they are exported to the next stage. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-10 23:36:52 +02:00
Samuel Pitoiset	f653e5c1d6	radv: remove an extra memcpy when exporting clip/cull distances Cleanup. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-10 23:36:50 +02:00
Jason Ekstrand	14781e2122	intel/compiler: Add a "base class" for program keys Right now, all keys have two things in common: a program string ID and a sampler_prog_key_data. I'd like to add another thing or two and need a place to put it. This commit adds a new brw_base_prog_key struct which contains those two common bits. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-10 19:35:55 +00:00
Jason Ekstrand	3a4667e502	i965/program_cache: Cast the key to char * before adding key_size We're about to change the type of key to be brw_base_prog_key and that will mean blindly adding the key size without a cast will lead to the wrong calculation. It's safer to cast to char * first anyway. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-10 19:35:55 +00:00
Jason Ekstrand	bb14abed18	anv: Make the workaround BO a whole page I'm not 100% sure how this ever worked because gem_create usually shoots you if the BO size isn't page-aligned. Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-10 19:35:23 +00:00
Jason Ekstrand	6a2ff217b8	anv: Set Stateless Data Port Access MOCS This is the MOCS setting used for the A64 stateless messages which we sometimes use for SSBO operations. Fixes: `48ed2a7bb0` "anv: Implement VK_EXT_buffer_device_address" Fixes: `79fb0d27f3` "anv: Implement SSBOs bindings with GPU addr..." Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-10 19:35:23 +00:00
Alyssa Rosenzweig	bb483a9166	panfrost: Clamp point size It's not clear the hardware really has a maximum which confuses dEQP; clamp to whatever we report as our maximum. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 11:30:00 -07:00
Alyssa Rosenzweig	7318b525a2	pan/decode: Auto style $ astyle .c .h --style=linux -s8 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 10:43:23 -07:00
Alyssa Rosenzweig	ec2a59cd7a	panfrost: Move non-Gallium files outside of Gallium In preparation for a Panfrost-based non-Gallium driver (maybe Vulkan...?), hoist everything except for the Gallium driver into a shared src/panfrost. Practically, that means the compilers, the headers, and pandecode. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 10:43:23 -07:00
Alyssa Rosenzweig	a2d0ea92ba	panfrost: Style main Gallium driver $ astyle .c .h --style=linux -s8 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 10:43:23 -07:00
Alyssa Rosenzweig	e4bd6fbe51	panfrost/midgard: Apply code styling $ astyle .c .h --style=linux -s8 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 10:43:23 -07:00
Alyssa Rosenzweig	b4733b2b61	panfrost/nir: Apply NIR style $ astyle .c .h --style=linux -s3 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 10:43:23 -07:00
Alyssa Rosenzweig	c2c8983cf4	panfrost: Move midgard/nir* to nir folder The reason for doing this is two-fold: 1. These passes are likely to be shared with the Bifrost compiler Therefore, we don't want to restrict them to Midgard 2. The coding style is different (NIR-style vs Panfrost-style) The NIR passes are candidates for moving upstream into compiler/nir, so don't block that off for stylistic reasons Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 10:43:23 -07:00
Alyssa Rosenzweig	ef2d577769	panfrost: Typofix Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 09:45:16 -07:00
Alyssa Rosenzweig	31fc52a4e7	panfrost: Identify shared tiler structure This is identical across SFBD/MFBD so pull it out to allow for better code sharing. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 09:45:16 -07:00
Alyssa Rosenzweig	6eb99c78e2	panfrost/midgard: Drop unnecessary assert Just use the #define instead. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Suggested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-07-10 09:37:08 -07:00
Alyssa Rosenzweig	c1b109caec	panfrost: Don't expose OES_standard_derivatives This has not been implemented quite yet. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 09:36:03 -07:00
Erik Faye-Lund	39e7fbf24a	gallium: get rid of PIPE_CAP_SM3 PIPE_CAP_SM3 has always been an odd one out of all our caps. While most other caps are fine-grained and single-purpose, this cap encodes several features in one. And since OpenGL cares more about single features, it'd be nice to get rid of this one. As it turns out, this is now relatively simple. We only really care about three features using this cap, and those already got their own caps. So we can remove it, and make sure all current drivers just give the same response to all of them. The only place we really care about SM3 is in nine, and there we can instead just re-construct the information based on the finer-grained caps. This keeps DX9 semantics from needlessly leaking into all of the drivers, most of which don't care a whole lot about DX9 specifically. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 15:50:51 +02:00
Erik Faye-Lund	21de1bf24b	gallium: give vertex-shader saturate its own cap Shader Model 3.0 is a big promise to make to the state-tracker, and for instance mobile hardware might support vertex-shader saturate but not some of the other features of SM3. So let's give this its own cap for simplicity. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-10 15:49:57 +02:00
Erik Faye-Lund	681fa03e8d	gallium: give fragment-shader derivatives its own cap Shader Model 3.0 is a big promise to make to the state-tracker, and for instance mobile hardware might support fragment-shader derivatives but not some of the other features of SM3. So let's give this its own cap for simplicity. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-10 15:49:57 +02:00
Erik Faye-Lund	66ee6661e9	gallium: give fragment-shader texture-lod its own cap Shader Model 3.0 is a big promise to make to the state-tracker, and for instance mobile hardware might support texture lod but not some of the other features of SM3. So let's give this its own cap for simplicity. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-10 15:49:57 +02:00
Erik Faye-Lund	ffbd004686	mesa/st: drop needless has_shader_model3 boolean This boolean is only consulted once during init, so there's nothing much saved by storing this in the context. So let's just check directly when we need it instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-10 15:49:57 +02:00
Alyssa Rosenzweig	af2949e928	panfrost: Fix copyright identifier in a few places Oops. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-07-10 06:47:15 -07:00
Alyssa Rosenzweig	629c516b76	panfrost: Bikeshed pan_screen.c comment The asterisks were inherited from... softpipe, maybe? Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-07-10 06:47:13 -07:00
Alyssa Rosenzweig	2f7145a6de	panfrost: Check GPU version before loading Panfrost is known to only work on a select few CPU/GPU combinations at the moment (tested system-on-chips: RK3288, RK3399, and S912). Whitelist the combinations known to work and refuse to load on others where nothing works yet to avoid user confusion. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-07-10 06:47:11 -07:00
Alyssa Rosenzweig	b5de423ac1	panfrost: Be more honest about PIPE_CAPs A lot of the pan_screen.c code was cargoculted from other drivers. The upshot is that we return true for a lot of PIPE_CAPs that we don't actually support, resulting in us exposing way too many extensions that we don't actually support. Be more careful. Some CAPs we do need to fake to access higher dEQP versions (i.e. in order to debug the features we're hiding behind the CAP). For these, we hide the CAP behind a special PAN_MESA_DEBUG=deqp option to avoid apps randomly using these in-development features. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-07-10 06:47:01 -07:00
Alyssa Rosenzweig	b69d5d6e19	panfrost/midgard: Hit missed scheduling opportunity Don't try to schedule to vmul when that can't possibly work (forcing a bundle break). glmark: total bundles in shared programs: 2700 -> 2683 (-0.63%) bundles in affected programs: 695 -> 678 (-2.45%) helped: 14 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 1.21 x̃: 1 helped stats (rel) min: 1.27% max: 7.69% x̄: 4.30% x̃: 4.77% 95% mean confidence interval for bundles value: -1.68 -0.75 95% mean confidence interval for bundles %-change: -5.63% -2.97% Bundles are helped. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:45:20 -07:00
Alyssa Rosenzweig	2d739f6b59	panfrost/midgard: Include shader size for shader-db It's easy to forget about, but shader size does matter for things like i-cache, so let's include it in the analysis. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:45:20 -07:00
Alyssa Rosenzweig	7ad6516f3b	panfrost/midgard: Include loop count for shader-db We have to emit it anyway for the report to be happy (with respect to unrolling), so return an actual count rather than dummy numbers. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:45:20 -07:00
Alyssa Rosenzweig	138e40d471	panfrost/midgard: Dump shader-db stats All the kool kids are doing it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:45:20 -07:00
Alyssa Rosenzweig	a2f1a06a5e	panfrost/midgard: Flush undefineds to zero Fixes a buggy dEQP test. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:45:20 -07:00
Alyssa Rosenzweig	318e9933b1	panfrost/midgard: Specify channel count for broadcasting ops bany/ball type ops read from all 4 channels even though they only write to 1; specify this in the opcode table like we do for dot products. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:45:20 -07:00
Alyssa Rosenzweig	a1a4dfa74b	panfrost/midgard: Don't try to "alias" texture registers It won't work. Just, stop it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:45:20 -07:00
Samuel Pitoiset	4cadf4309c	radv: compute correct number of input vertices for NGG Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-10 15:17:08 +02:00
Samuel Pitoiset	3303bc8b74	radv: remove extra code for exporting LayerID to the next stage Now that the output usage mask is set to 0x1 the LayerID is correctly exported in the loop above. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-10 15:17:08 +02:00
Samuel Pitoiset	bd86ded027	radv: set the LayerId output usage mask if FS needs it When the stage preceding FS doesn't export it the fragment shader might read it, even if it's 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-10 15:17:08 +02:00
Alyssa Rosenzweig	53d64753e1	panfrost: Update supported formats Much of the format selection code was inherited from softpipe (!) of all places, and a lot of it is accordingly cruft. Later if-elses were added in random places to work around missing formats at various points in history. Clean up some of this. Theoretically, any format we can texture from we can also render to. In practice, there are a few corner cases that we need to disable explicitly. For one, we do have to restrict SCANOUT formats to work around buggy apps (in particular, dEQP which with --deqp-surface-type=window under Weston will end up with RGB10_A2 and complain about low alpha precision). Just be clearer about how/why. Also, RGB5_A1 support is still broken; let's not worry about that quite yet. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:09 -07:00
Alyssa Rosenzweig	ced132d203	panfrost/mfbd: Cleanup format code selection Rather than have random variables flying around and a long if-else chain, use a switch. They're literally designed for this. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:09 -07:00
Alyssa Rosenzweig	da5382c0d8	panfrost/midgard: Cleanup blend switch Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:08 -07:00
Alyssa Rosenzweig	c0c709a13a	panfrost/mfbd: Handle PIPE_FORMAT_B10G10R10A2_UNORM Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:08 -07:00
Alyssa Rosenzweig	c58c5268da	panfrost/midgard: Handle PIPE_FORMAT_B10G10R10A2_UNORM Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:08 -07:00
Alyssa Rosenzweig	c2ee937cf2	panfrost: Implement ES3-format writeout We add support for writing out (via a blend shader): - RGBA4 - RGB10_A2_UNORM - RGB10_A2_UINT - RGB5_A1_UNORM - R11G11B10_FLOAT Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:08 -07:00
Alyssa Rosenzweig	46396af1ec	panfrost: Refactor blend infrastructure We would like to permit keying blend shaders against the framebuffer format, which requires some new blending abstractions. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:08 -07:00
Alyssa Rosenzweig	c9af7701d1	panfrost/midgard: Use unsigned blend patch offset We would like the offset field to be unsigned, letting 0 represent "no offset" and positive represent an offset. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:08 -07:00
Alyssa Rosenzweig	6def428f10	panfrost/midgard: Handle pure int formats I'm not sure I'm totally comfortable with this, but conceptually neither float nor pure-int formats require any format conversion, except size conversion. Going from a shaderable format (fp32 or i16, for instance) into a blendable format (fp16) is a separate question, one we can defer momentarily while we're not interested in actually blending. As an aside, I'd be fascinated by an integer-based blending implementation. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:08 -07:00
Alyssa Rosenzweig	280c777fd7	panfrost/mfbd: Handle pure int formats We see that the render target itself turns out to be typeless (surprise!) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:08 -07:00
Alyssa Rosenzweig	7647e56c1f	panfrost: Set rt_count_2 for bpp>4 formats Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:07 -07:00
Alyssa Rosenzweig	0c619210b2	panfrost/midgard: Implement preliminary float converters We'll need some careful handling, but for now, get some baseline code out for handling float formats in a blend shader. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:06 -07:00
Alyssa Rosenzweig	5849c85008	panfrost/midgard: Skip blend for REPLACE (shader) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:06 -07:00
Alyssa Rosenzweig	5e825f5cad	panfrost: Handle "blend disabled" blend shaders Normally, disabled blend can definitely be fixed-function'd away, but if a blend shader is used merely for format conversion rather than blending, this code path can be nevertheless hit. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:05 -07:00
Alyssa Rosenzweig	27e0c8c15d	panfrost: Route format through fixed-function blending Not all framebuffer formats are supported by the fixed-function blender. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:05 -07:00
Alyssa Rosenzweig	e7551c1bff	panfrost: Pipe framebuffer format around Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:05 -07:00
Alyssa Rosenzweig	74fd914a89	panfrost/midgard: Use Gallium framebuffer formats Ideally, we would keep Galliumisms far away from the compiler; unfortunately, Mesa hasn't standardized on a system of format codes to be shared across APIs and across drivers, so using Gallium formats is our best bet in the short run. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:05 -07:00
Alyssa Rosenzweig	2157fe967a	panfrost/midgard: Use fp16 exclusively while blending We now have some preliminary fp16 support available. We're not able to expose this for GLSL quite yet, but for internal blend shaders, we're able to control bitness ourselves just fine. So let's fp16 that stuff! Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:05 -07:00
Alyssa Rosenzweig	0cfa54801e	panfrost/midgard: Remove opt_copy_prop_tex Eventually this should be replaced by proper tex RA / not emitting so many silly moves to begin with / better general copy prop. For now remove it since it breaks things. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:05 -07:00
Alyssa Rosenzweig	b113be7683	panfrost/midgard: Fix scalarification Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:05 -07:00
Alyssa Rosenzweig	e92caad744	panfrost/midgard: Handle fp16 in embedded_to_inline_constants Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:05 -07:00
Alyssa Rosenzweig	3dbedb26f5	panfrost/midgard: Eliminate redundant type convert Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:05 -07:00
Alyssa Rosenzweig	64df54d894	panfrost/midgard: Fix fp16 embedded constants Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:05 -07:00
Alyssa Rosenzweig	f8b18a4277	panfrost/midgard: Hoist mask field Share a single mask field in midgard_instruction with a unified format, rather than using separate masks for each instruction tag with hardware-specific formats. Eliminates quite a bit of duplicated code and will enable vec8/vec16 masks as well (which don't map as cleanly to the hardware as we might like). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:04 -07:00
Alyssa Rosenzweig	e69cf1fed9	panfrost/midgard: Allow fp16 in scalar ALU The packing is a little different, so implement that. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:04 -07:00
Alyssa Rosenzweig	d8c084d2ca	panfrost/midgard: Implement f2u16 and friends Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:04 -07:00
Alyssa Rosenzweig	954c6afa3e	panfrost/midgard: Implement f2f16/f2f32 These conversions handle half-floats within the shader. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:04 -07:00
Alyssa Rosenzweig	0ed8cca008	panfrost/midgard: Verify src_bitsize == dst_bitsize We can handle differing, but we'd prefer not to because there are restrictions on sizing which aren't accounted for yet. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:04 -07:00
Alyssa Rosenzweig	1686ef8655	panfrost/midgard: Simplify blend read It's not clear where the extra indirection was from (older hardware or just older blobs?) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:04 -07:00
Alyssa Rosenzweig	952993d3bb	panfrost/midgard: NIRify blend load scale/convert The scale and type-convert can now be expressed in NIR, rather than MIR, which is significantly more maintainable and demonstrates correctness of the type conversion patches. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:04 -07:00
Alyssa Rosenzweig	ae42991b83	panfrost/midgard: Fix blend constant scheduling bug Blend constant conflicts run in two directions. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:04 -07:00
Alyssa Rosenzweig	7f807ef1fa	panfrost/midgard: Implement upscaling type converts Rather than using a dest_override, we upscale integers by using a half field with a sign-extend bit. A variant of this trick should also work for floats, but one step at a time! Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:04 -07:00
Alyssa Rosenzweig	541b329bd1	panfrost/midgard: Move blend load/store into NIR We have dedicated intrinsics to access the raw contents of the tile buffer so we can use a dedicated NIR pass to lower appropriately for blend shaders, rather than introducing a bizarre hardcoded blend epilogue that only works for RGBA8_UNORM. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:04 -07:00
Alyssa Rosenzweig	f42e5be910	panfrost/midgard: Use nir_dest_num_components Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:03 -07:00
Alyssa Rosenzweig	4df80cab40	panfrost/midgard: Implement integer downsize ops Oh, dear. No turning back now. We begin implementing non-32-bit types, using downsizing integer type conversions as the initial instructions. We implement them naively as type-converting moves; substantially more efficient operation is possible by copypropping the type conversion modifier, but this optimization is not implemented here. Size converting modifiers on Midgard allow an instruction to write to a destination 1/2 the size, or to read from a source 1/2 the size. If we need an extreme conversion (32-bit to 8-bit, for instance), multiple type converting ops are chained together, which here is handled via an algebraic pass. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:03 -07:00
Alyssa Rosenzweig	dc69d3bf8f	panfrost/midgard: Move scale from MIR to NIR This begins the process of removing blend shader specific MIR into a more general NIR lowering pass for formats. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:03 -07:00
Alyssa Rosenzweig	d151319a3d	panfrost/midgard: Passthrough nir_lower_framebuffer Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:03 -07:00
Alyssa Rosenzweig	8e4e46794e	panfrost: Extend clear colour packing Eventually, this will allow packing clear colours for all formats, including floating-point framebuffers, pure integer buffers, and special formats. Currently, a few of these formats are supported, and many more are handled through a generic Gallium colour packing path (which is not a perfect fit for the hardware, but works for many formats and is a sane default for the moment.) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:03 -07:00
Alyssa Rosenzweig	21c863a695	panfrost/mfbd: Include codes for float framebuffers We see the hardware doesn't actually support float framebuffers in the native sense -- rather, it just allows higher bpp framebuffers and lets a blend shader / additional clear_color fields sort out the formats. This will be.. interesting. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:03 -07:00
Alyssa Rosenzweig	36b3e7ea90	panfrost: Prepare some code for MRT Full MRT support is a while away, but in the meantime, we can remove code that explicitly assumes nr_cbufs <= 0, to minimize the obstacles we'll face later when we add the whole thing. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:03 -07:00
Alyssa Rosenzweig	7c82dfba8f	panfrost: Use standard ALIGN_POT/INFINITY macros We had vendored duplicates from pre-Mesa days; clean that up. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-10 06:12:03 -07:00
Eric Engestrom	c78d2d9840	egl: add glvnd symbols check According to the spec [1], `__egl_Main` is the only symbol that needs to be exported. We don't want applications directly linking against libEGL_mesa.so (apps should always go through libEGL.so, regardless of who is providing it), so we shouldn't export any other symbols either. [1] https://github.com/NVIDIA/libglvnd/blob/master/include/glvnd/libeglabi.h (this header is the closest there is to a spec) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 11:27:51 +00:00
Eric Engestrom	ba18b968e8	egl: rewrite entrypoints check Part of the effort to replace shell scripts with portable python scripts. I could've used a trivial `assert lines == sorted(lines)`, but this way the caller is shown which entrypoint is out of order. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 11:27:51 +00:00
Eric Engestrom	b619f89e23	mapi: add shared glapi symbols check Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 11:27:51 +00:00
Eric Engestrom	1abae9e54a	tu: add exported symbols check Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 11:27:51 +00:00
Eric Engestrom	0fd30c1011	vulkan: add symbols file According to the Vulkan ICD spec [1], these two symbols must be exposed: - vk_icdGetInstanceProcAddr - vk_icdNegotiateLoaderICDInterfaceVersion and this one is optional: - vk_icdGetPhysicalDeviceProcAddr [1] https://github.com/KhronosGroup/Vulkan-Loader/blob/master/loader/LoaderAndLayerInterface.md Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 11:27:51 +00:00
Eric Engestrom	915eab5e87	meson: remove unused env_test No longer used as of last commit :) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 11:27:51 +00:00
Eric Engestrom	6f305d0c61	gles: use new symbols check script Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 11:27:51 +00:00
Eric Engestrom	111c34d2ae	gbm: sort symbols Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-10 11:27:51 +00:00
Eric Engestrom	aa6973e611	gbm: use new symbols check script Note: the list in gbm-symbols.txt is the same as the one that was in gbm-symbols-check, I just took the opportunity to sort it. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 11:27:51 +00:00
Eric Engestrom	1172263c87	egl: use new symbols check script Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 11:27:51 +00:00
Eric Engestrom	176f350fcf	symbols-check: introduce new python script I've re-written this in bash a couple times over the years, and then I realised python is much more portable and already required by Mesa, so we might as well make use of it. I decided to still use the build system's NM instead of re-implementing symbols extraction, to offload the complexity of keeping it compatible with many systems (Linux, Unix, BSD, MacOS, etc.), especially when cross-building. This new script checks not only that nothing is exported when it shouldn't be, but also that everything that should be exported is. Sometimes, some symbols _can_ be exported but don't have to be, in which case they can be prefixed with `(optional)`. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 11:27:51 +00:00
Karol Herbst	62362a4abb	nv50/ir/nir: implement load/store_global required by OpenCL v2: fix setting globalAccess Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-07-10 13:23:00 +02:00
Karol Herbst	33a9b9fce5	nv50/ir/nir: handle kernel inputs required by OpenCL Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-07-10 13:22:40 +02:00
Karol Herbst	2617c78fe2	nv50/ir/nir: don't assert on !main required for OpenCL Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-07-10 13:22:21 +02:00
Karol Herbst	fa6bd3c639	nv50/ir/nir: parse system values first and stop for compute shaders required by OpenCL Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2019-07-10 13:20:13 +02:00
Connor Abbott	133273aa22	nir/lower_io: Don't use variable to get deref mode Drivers only use lower_io for modes where pointers don't have a meaningful value, and dereferences can always be traced back to a variable. But there can be other modes, like global mode with VK_EXT_buffer_device_address, where pointers cannot be traced back to a variable, and lower_io would segfault on loads/stores of these since nir_deref_instr_get_variable() would return NULL. Just use the mode on the deref itself to filter out these modes before we try to get the variable. Fixes: `118a66df99` ("radv: Use NIR barycentric coordinates") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-10 12:31:41 +02:00
Connor Abbott	f18b8a1174	radv: Don't optimize after lowering FS inputs Currently this is done rather late in radv, after lowering booleans, so it isn't safe to run additional optimizations that may add e.g. 1-bit booleans. We could move the lowering parts earlier, but since right now we only lower FS inputs and by this point all indirects have been lowered away, there's no reason we should need to optimize anything. One shader from Devil May Cry 5 was getting optimized, but only because the optimization loop was working on 32-bit booleans which revealed an opportunity that was hidden with 1-bit booleans, and we generated a 1-bit boolean which is invalid. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111092 Fixes: `118a66df99` Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-10 10:10:20 +02:00
Mauro Rossi	fe3898547a	android: amd/addrlib: add gfx10 support Fix the following building error: external/mesa/src/amd/addrlib/src/gfx10/gfx10addrlib.cpp:35:10: fatal error: 'gfx10_gb_reg.h' file not found ^~~~~~~~~~~~~~~~ 1 error generated. Fixes: `78cdf9a` ("amd/addrlib: add gfx10 support") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2019-07-10 09:03:55 +02:00
Mauro Rossi	b3d46cb539	android: amd/common/gfx10: add register JSON The necessary Android makefile building rules are added and the generation rules are simplified for readability Fixes the following building errors: external/mesa/src/amd/common/ac_llvm_build.c:1496:45: error: use of undeclared identifier 'V_008F0C_IMG_FORMAT_8_UINT' case V_008F0C_BUF_DATA_FORMAT_8: format = V_008F0C_IMG_FORMAT_8_UINT; break; ^ Fixes: `74a26af` ("amd/common/gfx10: add register JSON") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2019-07-10 09:03:51 +02:00
Mauro Rossi	2434fb3e8e	android: radeonsi/gfx10: generate gfx10_format_table.h (v2) Fix Android building rules for gfx10_format_table.h generated header (v2) Add LOCAL_C_INCLUDES += $(intermediates)/radeonsi to fix error: external/mesa/src/gallium/drivers/radeonsi/si_state.c:46:10: fatal error: 'gfx10_format_table.h' file not found ^~~~~~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `0ffa229` ("radeonsi/gfx10: generate gfx10_format_table.h") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2019-07-10 09:03:46 +02:00
Chih-Wei Huang	0d394f1734	android: virgl: remove unnecessary LOCAL_C_INCLUDES The path could be imported automatically. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Mauro Rossi <issor.oruam@gmail.com>	2019-07-10 08:56:47 +02:00
Chih-Wei Huang	4dc129e4f4	android: vulkan/util: fix generating vk_enum_to_str.* The gen_enum_to_str.py generates vk_enum_to_str.c and its header at once. However, the makefiles incorrectly list both files in parallel with the same recipe. That means the two files may be generated simultaneously by two processes. A file being generated may be truncated by the other process, as shown below: $ cd $OUT/obj/STATIC_LIBRARIES/libmesa_vulkan_util_intermediates/util $ ls -l -rw-rw-r-- 1 lh lh 193713 Jul 5 13:31 vk_enum_to_str.c -rw-rw-r-- 1 lh lh 4609 Jul 5 13:31 vk_enum_to_str.d -rw-rw-r-- 1 lh lh 0 Jul 5 16:21 vk_enum_to_str.h Let one file depend on the other with an empty recipe to avoid the issue. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-10 08:56:37 +02:00
Chih-Wei Huang	a74285def2	android: radv: import include paths from used libraries It's unnecessary to manually add these include paths since they could be imported automatically. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 08:56:28 +02:00
Chih-Wei Huang	f982c6789c	android: anv: import include path of libmesa_nir Add libmesa_nir to a common LOCAL_STATIC_LIBRARIES defined by ANV_STATIC_LIBRARIES so that its include path can be imported automatically. Then ANV_INCLUDES is unnecessary and could be eliminated. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 08:56:23 +02:00
Chih-Wei Huang	5cb61f27d0	android: anv: eliminate libmesa_anv_entrypoints The dummy library libmesa_anv_entrypoints is totally unnecessary. The four VULKAN_GENERATED_FILES could be generated and built in libmesa_vulkan_common directly. The libraries using the generated headers should get it via the exported include path. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 08:56:16 +02:00
Chih-Wei Huang	4338e08bd6	android: vulkan/util: fix export path Export the correct include path so that the libraries that use it can get it automatically. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 08:56:10 +02:00
Chih-Wei Huang	e2ef281da1	android: radv: fix improper use of LOCAL_WHOLE_STATIC_LIBRARIES The libmesa_git_sha1 is a dummy library. There is no reason to put it into LOCAL_WHOLE_STATIC_LIBRARIES. Move libmesa_vulkan_util to the vulkan.radv which really needs it. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 08:56:04 +02:00
Chih-Wei Huang	8ff01f0342	android: anv: fix improper use of LOCAL_WHOLE_STATIC_LIBRARIES The libmesa_anv_entrypoints and libmesa_genxml are dummy libraries. There is no reason to put them into LOCAL_WHOLE_STATIC_LIBRARIES. Move libmesa_vulkan_util to the vulkan HAL which really needs it. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 08:55:59 +02:00
Chih-Wei Huang	352d91ce5b	android: radv: remove unused LOCAL_EXPORT_C_INCLUDE_DIRS The vulkan module is the final HAL. No need to export its headers since none will import it. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 08:55:50 +02:00
Chih-Wei Huang	4fb11c01c5	android: anv: remove unused LOCAL_EXPORT_C_INCLUDE_DIRS The vulkan module is the final HAL. No need to export its headers since none will import it. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 08:55:42 +02:00
Jason Ekstrand	7e0fcea727	nir/loop_analyze: Pass nir_const_values directly to helpers Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Jason Ekstrand	ff972c7a3a	nir/loop_analyze: Properly handle swizzles in loop conditions This commit re-plumbs all of nir_loop_analyze to use nir_ssa_scalar for all intermediate values so that we can properly handle swizzles. Even though if conditions are required to be scalars, they may still consume swizzles so you could have ((a.yzw < b.zzx).xz && c.xx).y == 0 as your loop termination condition. The old code would just bail the moment it saw its first non-zero swizzle but we can now properly chase the scalar from the if condition all the way to a, b, and c. Shader-db results on Kaby Lake: total loops in shared programs: 4388 -> 4364 (-0.55%) loops in affected programs: 29 -> 5 (-82.76%) helped: 29 HURT: 5 Shader-db results on Haswell: total loops in shared programs: 4370 -> 4373 (0.07%) loops in affected programs: 2 -> 5 (150.00%) helped: 2 HURT: 5 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Jason Ekstrand	0333649e63	nir/loop_analyze: Refactor detection of limit vars This commit reworks both get_induction_and_limit_vars() and try_find_trip_count_vars_in_iand to return true on success and not modify their output parameters on failure. This makes their callers significantly simpler. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Jason Ekstrand	8f7405ed9d	nir: Add some helpers for chasing SSA values properly There are various cases in which we want to chase SSA values through ALU ops ranging from hand-written optimizations to back-end translation code. In all these cases, it can be very tricky to do properly because of swizzles. This set of helpers lets you easily work with a single component of an SSA def and chase through ALU ops safely. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Jason Ekstrand	9a3cb6f5fe	nir/loop_analyze: Bail if we encounter swizzles None of the current code knows what to do with swizzles. Take the safe option for now and bail if we see one. This does have a small shader-db impact but it is at least safe. Shader-db results on Kaby Lake: total loops in shared programs: 4364 -> 4388 (0.55%) loops in affected programs: 5 -> 29 (480.00%) helped: 5 HURT: 29 Shader-db results on Haswell: total loops in shared programs: 4373 -> 4370 (-0.07%) loops in affected programs: 5 -> 2 (-60.00%) helped: 5 HURT: 2 Fixes: `6772a17acc` "nir: Add a loop analysis pass" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Jason Ekstrand	6455fa9710	nir/loop_analyze: Use new eval_const_* helpers in test_iterations Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Jason Ekstrand	268ad47c11	nir/loop_analyze: Handle bit sizes correctly in calculate_iterations The current code assumes everything is 32-bit which is very likely true but not guaranteed by any means. Instead, use nir_eval_const_opcode to do the calculations in a bit-size-agnostic way. We also use the new constant constructors to build the correct size constants. Fixes: `6772a17acc` "nir: Add a loop analysis pass" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Jason Ekstrand	9f7ffe41dd	nir/loop_analyze: Fix phi-of-identical-alu detection One issue was that the original version didn't check that swizzles matched when comparing ALU instructions so it could end up matching very different instructions. Using the nir_instrs_equal function from nir_instr_set.c which we use for CSE should be much more reliable. Another was that the loop assumes it will only run two iterations which may not be true. If there's something which guarantees that this case only happens for phis after ifs, it wasn't documented. Fixes: `9e6b39e1d5` "nir: detect more induction variables" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Jason Ekstrand	6e984bcb92	nir/instr_set: Expose nir_instrs_equal() Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Jason Ekstrand	64328f947e	nir/builder: Use nir_const_value_for_* for constructing immediates Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Jason Ekstrand	3acddc733f	nir: Refactor nir_src_as_* constant functions Now that we have the nir_const_value_as_* helpers, every one of these functions is effectively the same except for the suffix they use so we can easily define them with a repeated macro. This also means that they're inline and the fact that the nir_src is being passed by-value should no longer really hurt anything. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Jason Ekstrand	ce5581e23e	nir: Add more helpers for working with const values Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-10 00:20:59 +00:00
Chia-I Wu	b44bb8bded	virgl: remove virgl_transfer_queue_lists COMPLETED_LIST is always empty. We only need one list. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-07-09 14:26:55 -07:00
Chia-I Wu	48aefcbd6b	virgl: simplify virgl_transfer_queue_extend We can reuse virgl_transfer_queue_find_pending. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-07-09 14:26:55 -07:00
Chia-I Wu	eae4527551	virgl: remove transfer after transfer_write Now that virgl_transfer_queue_is_queued does not search COMPLETED_LIST, we don't need to move transfers to that list. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-07-09 14:26:55 -07:00
Chia-I Wu	bec2a85c48	virgl: improve virgl_transfer_queue_is_queued Search only the pending list and return immediately on the first hit. When the transfer queue was introduced, the function was used to deal with write transfer -> draw -> write transfer sequence. It was used to tell if the second transfer intersects with the first transfer. If yes, the transfer queue avoided reordering the second transfer to before the draw (by flushing) in case the draw uses the transferred data. With the recent changes to the transfer code, the function is used to deal with write transfer -> readback transfer. We want to avoid reordering the readback transfer to before the first transfer (also by flushing). In the old code, we needed to track the completed transfers as well to avoid reordering. But in the new code, a readback transfer is guaranteed to see the data from the completed transfers (in other words, it cannot be reordered to before the already completed transfers). We don't need to search the COMPLETED_LIST. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-07-09 14:26:55 -07:00
Chia-I Wu	5f6aab2ee2	virgl: fix transfers_intersect for mipmaps We never use transfers_intersect with textures, but fix it anyway to avoid confusion. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-07-09 14:26:55 -07:00
Chia-I Wu	6ca1bbabbe	virgl: fix some false positives in transfers_overlap Rewrite the function and check z/depth more carefully. We intentionally avoid u_box_test_intersection_2d because it returns true when two boxes touch but do not intersect and can be confusing. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-07-09 14:26:55 -07:00
Marek Olšák	2b2093961e	radeonsi/gfx10: enable primitive binning by default Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	9f68367d19	radeonsi/gfx10: implement primitive binning Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	4e56a2aaa8	radeonsi: simplify primitive binning enablement Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	3521297251	radeonsi: set primitive binning tunables for dGPUs Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	d7e80ba1e7	radeonsi: set FLUSH_ON_BINNING_TRANSITION when needed Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	9dbe63ceea	radeonsi/gfx10: use the new scan converter when binning is disabled Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	80b3f4b4bd	radeonsi/gfx9: fix an oversight in primitive binning code Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	1f53a3e766	radeonsi: use BREAK_BATCH instead of FLUSH_DFSM when CB_TARGET_MASK changes Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	605900d7dd	radeonsi/gfx10: don't expose unimplemented PIPE_CAP_QUERY_SO_OVERFLOW Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	270a8ab648	radeonsi/gfx10: launch 2 compute waves per CU before going onto the next CU Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	ab1f36a1d3	radeonsi/gfx10: set more registers and fields Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	9b65f6618c	radeonsi/gfx10: enable LATE_ALLOC_GS Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	4985c3ee22	radeonsi/gfx10: set HS/GS/CS.WGP_MODE Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	329406ec9c	radeonsi/gfx10: set GE_PC_ALLOC Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	9d1483de3b	radeonsi/gfx10: enable 1D textures Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	1d3bffaf9c	radeonsi/gfx10: enable image stores with DCC Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	5b50fb9b7f	radeonsi/gfx10: no need to invalidate L2 for framebuffer -> texture coherency Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	fbf781e401	radeonsi/gfx10: support pixel shaders without exports It only works if there are no color and no Z exports. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	2adc8e2736	radeonsi/gfx10: enable vertex shaders without param space allocation Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	07fe51156d	radeonsi: update DCC settings from PAL Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	4002913f8d	radeonsi: reorder shader IO indices for better IO space usage for tess and GS The highest used index determines the stride for shader outputs in shaders that use LDS or memory for outputs. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	1c99a13f89	radeonsi: decrease maximum supported GENERIC varying index from 42 to 31 This can decrease LDS and/or memory usage for shader outputs when geometry shaders or tessellation is used. Only PS inputs support higher indices and those aren't eliminated by kill_outputs. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	6335cc6a58	radeonsi: cosmetic cleanup in si_shader_io_get_unique_index Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	3be4ed2fe1	radeonsi: fix and clean up shader_type passing - don't pass it via a parameter if it can be derived from other parameters - set shader_type for ac_rtld_open - use enum pipe_shader_type instead of unsigned Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	37b26671a7	radeonsi: enable RB+ for pixel shaders with no/non-contiguous color outputs Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Marek Olšák	5058d62b05	radeonsi: don't set READ_ONLY for const_uploader to fix bindless texture hangs Bindless textures can update descriptors with WRITE_DATA. Cc: 19.1 <mesa-stable@lists.freedesktop.org> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2019-07-09 17:24:16 -04:00
Alyssa Rosenzweig	6074eae753	gallium: Add util_format_is_unorm8 check Useful for formats that would work with the same driver code path as RGBA8 UNORM but that don't meet the util_format_is_rgba8_variant criteria due to a smaller channel count. v2: Use simpler logic (suggested by Iago). v3: Fix spelling error. boolean->bool (thank you airlied). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 21:17:47 +00:00
Alyssa Rosenzweig	15000c79da	nir: Add Panfrost-specific blending intrinsic This gives more flexibility than the normal store_deref/store_output versions (particularly, it allows us to abuse the type system in awful ways, which is necessary for efficient format conversion in blend shaders.) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Karol Herbst <kherbst@redhat.com>	2019-07-09 14:07:23 -07:00
Pratik Vishwakarma	177a3df7b0	radeonsi: Expose support for 10-bit VP9 decode Fix si_vid_is_format_supported to expose support for 10-bit VP9 decode using P016 format. Without this change, 10-bit decode will be exposed only for HEVC even though newer hardware support 10-bit decode for VP9. Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2019-07-09 15:26:54 -04:00
Alyssa Rosenzweig	4a4b48fb05	nir: Add nir_imm_vec4_16 We already have nir_imm_float16 and nir_imm_vec4; let's add the ability to easily make immediate fp16 vectors as well, now that fp16 support is maturing in NIR/GLSL. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-09 18:43:07 +00:00
Karol Herbst	a110a8090d	nvc0: remove nvc0_program.tp.input_patch_size right now that's dead code Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-07-09 12:41:54 +02:00
Bas Nieuwenhuizen	14291342ec	radv: Add a common member in the union to make things more clear. This clarifies that the struct can be used when the shader can be one of VS/TES. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-09 09:59:07 +00:00
Bas Nieuwenhuizen	f9070743a9	Revert "radv: keep track of whether NGG is used for GS on GFX10" This reverts commit `63e0675d98`. The GS is merged with the preceding shader and since the preceding shader will have as_ngg set the final binary will have is_ngg set. So we do not need the gs key here. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-09 09:59:07 +00:00
Juan A. Suarez Romero	d33e93d332	docs: update calendar, add news item and link release notes for 19.1.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-07-09 11:22:13 +02:00
Juan A. Suarez Romero	3c90baf047	docs: add sha256 checksums for 19.1.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `e42399f4de`)	2019-07-09 09:19:25 +00:00
Juan A. Suarez Romero	0f51d69087	docs: add release notes for 19.1.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `fe1f7b538b`)	2019-07-09 09:19:24 +00:00
Connor Abbott	86968327df	nir/lower_io_to_temporaries: Fix hash table leak Fixes: `c45f5db527` ("nir/lower_io_to_temporaries: Handle interpolation intrinsics") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-07-09 10:39:37 +02:00
Bas Nieuwenhuizen	64cd972ffb	radv/gfx10: Use correct gs_out for tess point_mode. Fixes: `204e4da9b4` "radv: Use correct gs_out with tessellation." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-09 09:52:50 +02:00
Samuel Pitoiset	3f50007ad8	radv: set correct number of VGPRs for GS on GFX10 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 09:54:27 +02:00
Samuel Pitoiset	611ddf794e	radv: fix VGT_ESGS_RING_ITEMSIZE for GS as NGG on GFX10 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 09:54:24 +02:00
Samuel Pitoiset	eca8a478a5	radv: emit VGT_GS_MAX_VERT_OUT for legacy and NGG paths for GS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 09:54:22 +02:00
Samuel Pitoiset	f240147cf7	radv: emit the geometry shader as NGG if enabled on GFX10 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 09:54:21 +02:00
Samuel Pitoiset	63e0675d98	radv: keep track of whether NGG is used for GS on GFX10 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 09:54:19 +02:00
Samuel Pitoiset	c81b719812	radv: add radv_pipeline_generate_hw_gs() helper For legacy GS path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 09:54:17 +02:00
Samuel Pitoiset	54e2470047	radv: fix setting VGT_REUSE_OFF for TES on GFX10 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 09:54:16 +02:00
Samuel Pitoiset	d2a8b63a2c	radv: fix computing the number of ES VGPRS for TES on GFX10 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 09:54:14 +02:00
Samuel Pitoiset	2974df819e	radv: set max workgroup size to 128 for TES as NGG on GFX10 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 09:54:12 +02:00
Samuel Pitoiset	53c75f17ec	radv: fix allocating USER SGPRs on GFX10 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 09:54:11 +02:00
Alejandro Piñeiro	71446bf8e3	v3d: Early return with handle 0 when getting a bo on the simulator Until now we were just asking for entries in the bo hash table, and didn't worry if the handle was NULL, as we were just expecting to get a NULL in return. It seems that the hash table now asserts on some reserved pointers, including NULL. This commit just returns early with handle 0. This change fixes several crashes on vk-gl-cts GLES tests when using the v3d simulator, like: KHR-GLES3.core.internalformat.copy_tex_image.* Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-09 08:40:35 +02:00
Lionel Landwerlin	b031dd9010	vulkan/overlay: use a single macro to lookup objects Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-09 09:13:21 +03:00
Lionel Landwerlin	b3a96e69ac	vulkan/overlay: add queue present timing measurement Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-09 09:13:19 +03:00
Bas Nieuwenhuizen	f7f08b2d81	radv/gfx10: Enable tess. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 12:04:29 +10:00
Bas Nieuwenhuizen	795adbbadd	radv/gfx10: Add pipeline state support for tess. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 12:04:26 +10:00
Bas Nieuwenhuizen	23c6698ea2	radv/gfx10: Only set HW edge flags with gs & tess disabled. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 12:04:23 +10:00
Bas Nieuwenhuizen	9a8e4a07ad	radv/gfx10: Add tess eval ngg shader support. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 12:04:20 +10:00
Bas Nieuwenhuizen	204e4da9b4	radv: Use correct gs_out with tessellation. We should use the primitives output by the TES in that case. There is always a separate TES if there is no GS. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 12:04:16 +10:00
Bas Nieuwenhuizen	343a435c46	radv/gfx10: Use correct count of max_offchip_buffers. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 12:04:12 +10:00
Bas Nieuwenhuizen	5d0dbc2564	radv/gfx10: Load global pointers in correct userdata registers for hs/gs. Fixes: `cfaad5e3ca` "radv/gfx10: implement radv_emit_global_shader_pointers()" Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-09 12:03:51 +10:00
Timothy Arceri	6b60cfd079	radeonsi: update function name in comment This was missed in `2361558eb7` Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-09 10:00:23 +10:00
Timothy Arceri	7c612c49b4	r600: remove query/apply_opaque_metadata callbacks These seem to have been radeonsi specific callbacks that are no longer needed now that these drivers no longer share this code path. These callbacks were removed from radeonsi in `c0d44fe0e9`. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-09 10:00:23 +10:00
Lionel Landwerlin	a72351cc76	vulkan/overlay: fix crash on freeing NULL command buffer It is legal to call vkFreeCommandBuffers() on NULL command buffers. This fix requires `eb41ce1b01` ("util/hash_table: Properly handle the NULL key in hash_table_u64"). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `4438188f49` ("vulkan/overlay: record stats in command buffers and accumulate on exec/submit") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 21:49:26 +00:00
Lionel Landwerlin	6271d16320	vulkan: bump headers & registry to 1.1.114 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-09 00:09:36 +03:00
Dave Airlie	6422fa75b4	radv: only use specialised 3D meta paths on GFX9. GFX10 appears to act like GFX8 here. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-09 06:32:28 +10:00
Ian Romanick	0349bc3ce2	mesa: Set minimum possible GLSL version Set the absolute minimum possible GLSL version. API_OPENGL_CORE can mean an OpenGL 3.0 forward-compatible context, so that implies a minimum possible version of 1.30. Otherwise, the minimum possible version is 1.20. Since Mesa unconditionally advertises GL_ARB_shading_language_100 and GL_ARB_shader_objects, every driver has GLSL 1.20... even if they don't advertise any extensions to enable any shader stages (e.g., GL_ARB_vertex_shader). Converts about 2,500 piglit tests from crash to skip on NV18. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109524 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110955 Cc: mesa-stable@lists.freedesktop.org	2019-07-08 12:34:09 -07:00
Caio Marcelo de Oliveira Filho	d577db293d	anv: Set maxComputeSharedMemorySize to 64k This value is supported since gen7. See also `8514c75a26` "i965: Set compute shader shared memory max to 64k". Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-08 11:35:42 -07:00
Ian Romanick	dd2dc7e707	intel/vec4: Delete vec4_visitor::emit_lrp Effectively unused since `dd7135d55d` ("intel/compiler: Use the flrp lowering pass for all stages on Gen4 and Gen5"). I had intended to remove this code as part of that series, but I forgot. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-08 11:30:11 -07:00
Ian Romanick	5450fd7a36	nir: Allow nir_ssa_alu_instr_src_components to operate on non-SSA destinations Existing users only operate on instructions with SSA destinations. Some later patches add new direct calls and indirect calls (via existing NIR functions) on instructions after going out of SSA. At the very least, these calls are added by: intel/vec4: Try to emit a VF source in try_immediate_source intel/vec4: Try to emit a single load for multiple 3-src instruction operands The first commit adds direct calls, and the second adds calls via nir_alu_srcs_equal and nir_alu_srcs_negative_equal. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-08 11:30:11 -07:00
Ian Romanick	12217de08c	nir: Handle swizzle in nir_alu_srcs_negative_equal When I added this function, I was not sure if swizzles of immediate values were a thing that occurred in NIR. The only existing user of these functions is the partial redundancy elimination for compares. Since comparison instructions are inherently scalar, this does not occur. However, a couple later patches, "nir/algebraic: Recognize open-coded flrp(-1, 1, a) and flrp(1, -1, a)" combined with "intel/vec4: Try to emit a single load for multiple 3-src instruction operands", collaborate to create a few thousand instances. No shader-db changes on any Intel platform. v2: Handle the swizzle in nir_alu_srcs_negative_equal and leave nir_const_value_negative_equal unchanged. Suggested by Jason. v3: Correctly handle write masks. Add note (and assertion) that the caller is responsible for various compatibility checks. The single existing caller only calls this for combinations of scalar fadd and float comparison instructions, so all of the requirements are met. A later patch (intel/vec4: Try to emit a single load for multiple 3-src instruction operands) will call this for sources of the same instruction, so all of the requirements are met. v4: Add unit test for nir_opt_comparison_pre that is fixed by this commit. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-08 11:30:11 -07:00
Ian Romanick	ad50e812a3	nir: nir_const_value_negative_equal compares one value at a time Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-08 11:30:10 -07:00
Ian Romanick	bcd22b740c	nir: Port some const_value_negative_equal tests to alu_src_negative_equal The next commit will make the existing tests irrelevant. Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-08 11:30:10 -07:00
Ian Romanick	ec96c289ea	nir: Pass fully qualified type to nir_const_value_negative_equal Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-08 11:30:10 -07:00
Ian Romanick	0ac5ff9ecb	nir: Use nir_src_bit_size instead of alu1->dest.dest.ssa.bit_size This is important because, for example nir_op_fne has dest.dest.ssa.bit_size == 1, but the source operands can be 16-, 32-, or 64-bits. Fixing this helps partial redundancy elimination for compares in a few more shaders. v2: Add unit tests for nir_opt_comparison_pre that are fixed by this commit. All Intel platforms had similar results. total instructions in shared programs: 17179408 -> 17179081 (<.01%) instructions in affected programs: 43958 -> 43631 (-0.74%) helped: 118 HURT: 2 helped stats (abs) min: 1 max: 5 x̄: 2.87 x̃: 2 helped stats (rel) min: 0.06% max: 4.12% x̄: 1.19% x̃: 0.81% HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6 HURT stats (rel) min: 5.83% max: 6.06% x̄: 5.94% x̃: 5.94% 95% mean confidence interval for instructions value: -3.08 -2.37 95% mean confidence interval for instructions %-change: -1.30% -0.85% Instructions are helped. total cycles in shared programs: 360959066 -> 360942386 (<.01%) cycles in affected programs: 774274 -> 757594 (-2.15%) helped: 111 HURT: 4 helped stats (abs) min: 1 max: 1591 x̄: 169.49 x̃: 36 helped stats (rel) min: <.01% max: 24.43% x̄: 8.86% x̃: 2.24% HURT stats (abs) min: 1 max: 2068 x̄: 533.25 x̃: 32 HURT stats (rel) min: 0.02% max: 5.10% x̄: 3.06% x̃: 3.56% 95% mean confidence interval for cycles value: -200.61 -89.47 95% mean confidence interval for cycles %-change: -10.32% -6.58% Cycles are helped. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v1] Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `be1cc3552b` ("nir: Add nir_const_value_negative_equal")	2019-07-08 11:30:10 -07:00
Ian Romanick	47c2aa5b48	intel/vec4: Reswizzle VF immediates too Previously, an instruction like mul(8) vgrf29.xy:F, vgrf25.yxxx:F, [-1F, 1F, 0F, 0F] would get rewritten as mul(8) vgrf0.yz:F, vgrf25.yyxx:F, [-1F, 1F, 0F, 0F] The latter does not produce the correct result. The VF immediate in the second should be either [-1F, -1F, 1F, 1F] or [0F, -1F, 1F, 0F]. This commit produces the former. Fixes: `1ee1d8ab46` ("i965/vec4: Reswizzle sources when necessary.") Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-08 11:30:10 -07:00
Ian Romanick	b08d704051	nir: Add unit tests for nir_opt_comparison_pre Each test has a comment with the expected before and after NIR. The tests don't actually check this. The tests only check whether or not the optimization pass reported progress. I couldn't think of a robust, future-proof way to check the before and after code. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-08 11:30:10 -07:00
Dongwon Kim	f734e2a042	anv: disable repacking for compression for applicable gen set bit15 (Disable Repacking for Compression) of CACHE_MODE_0 register if the gen attribute, 'disable_ccs_repack' is set. Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-07-08 10:54:38 -07:00
Dongwon Kim	6866765cb3	iris: disable repacking for compression for applicable gen set bit15 (Disable Repacking for Compression) of CACHE_MODE_0 register if the gen attribute, 'disable_ccs_repack' is set. Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-07-08 10:54:38 -07:00
Dongwon Kim	00df55cdc9	i965: disable repacking for compression for applicable gen set bit15 (Disable Repacking for Compression) of CACHE_MODE_0 register if the gen attribute, 'disable_ccs_repack' is set. Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-07-08 10:54:38 -07:00
Dongwon Kim	eb6d067e68	intel: add disable_ccs_repack to gen_device_info add a new attribute, 'disable_ccs_repack' to gen_device info, which indicates whether repacking of components in certain pixel formats before compression needs to be disabled to keep the compatibility with decompression capability of display controller (gen11+) Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-07-08 10:54:38 -07:00
Dongwon Kim	e6ac6d3224	intel/genxml: correct bit fields in CACHE_MODE_0 reg for gen11 correct bit fields information of CACHE_MODE_0 reg in current gen11.xml Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-07-08 10:54:37 -07:00
Caio Marcelo de Oliveira Filho	2614319259	nir: print ptr_stride for deref_casts Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-08 10:05:56 -07:00
Caio Marcelo de Oliveira Filho	9c7adaeb5f	anv: Advertise VK_EXT_shader_demote_to_helper_invocation Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-08 08:57:25 -07:00
Caio Marcelo de Oliveira Filho	1a83c9a619	spirv: Implement SPV_EXT_demote_to_helper_invocation Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-08 08:57:25 -07:00
Caio Marcelo de Oliveira Filho	5a7c69399d	spirv: Update the headers from latest Khronos master This corresponds to 29c11140baaf9f7fdaa39a583672c556bf1795a1 in https://github.com/KhronosGroup/SPIRV-Headers. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-08 08:57:25 -07:00
Caio Marcelo de Oliveira Filho	45f5db5a84	intel/fs: Implement "demote to helper invocation" The "demote" intrinsic works like "discard" but doesn't change the control flow, allowing derivative operations to work. This is the semantics of D3D discard. The "is_helper_invocation" intrinsic will return true for helper invocations -- both the ones that started as helpers and the ones that were demoted. This is needed to avoid changing the behavior of gl_HelperInvocation which is an input (so not expected to change during shader execution). v2: Emit the discard jump and comment why it is safe. (Jason) Rework the is_helper_invocation() that was stomping f0.1. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-08 08:57:25 -07:00
Caio Marcelo de Oliveira Filho	a42e8f0ed1	nir: Add demote and is_helper_invocation intrinsics From SPV_EXT_demote_to_helper_invocation. Demote will be implemented as a variant of discard, so mark uses_discard if it is used. v2: Add CAN_ELIMINATE flag to the new intrinsic. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-08 08:57:25 -07:00
Samuel Pitoiset	9b116173b6	radv: do not emit VGT_FLUSH on GFX10 We don't need it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 14:45:23 +02:00
Connor Abbott	0c114ae3be	ac/nir: Remove now-unused interp_deref handling Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-08 14:18:52 +02:00
Connor Abbott	b3a226691d	radeonsi/nir: Use NIR barycentric intrinsics This is simpler than radv, since the driver_location is already assigned for us. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-08 14:18:46 +02:00
Connor Abbott	d1c65939e2	radeonsi/nir: Delete unreachable code We always get gl_FragCoord as a system value, not a varying, so this is never hit. We already set PIXEL_CENTER_INTEGER elsewhere. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-08 14:18:41 +02:00
Connor Abbott	e5536aa584	compiler: Add color system value This is nice to have with radeonsi, where color varyings are handled specially to avoid recompiles. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-08 14:18:34 +02:00
Connor Abbott	118a66df99	radv: Use NIR barycentric intrinsics We have to add a few lowerings to deal with things that used to be dealt with inline when creating inputs. We also move the code that fills out the radv_shader_variant_info struct for linking purposes to radv_shader.c, as it's no longer tied to the NIR->LLVM lowering. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 14:18:25 +02:00
Connor Abbott	0cad0424e9	ac/nir: Implement barycentric intrinsics Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-08 14:18:25 +02:00
Connor Abbott	6b28808b22	intel/nir: Extract add_const_offset_to_base Pretty much every driver using nir_lower_io_to_temporaries followed by nir_lower_io is going to want this. In particular, radv and radeonsi in the next commits. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 14:14:53 +02:00
Connor Abbott	c45f5db527	nir/lower_io_to_temporaries: Handle interpolation intrinsics These weren't properly supported. This does pretty much the same thing that the radv code did. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 14:14:53 +02:00
Connor Abbott	3a2ea2af9d	nir: Avoid coalescing vars created by lower_io_to_temporaries Right now nir_copy_prop_vars is effectively undoing nir_lower_io_to_temporaries for inputs by propagating the original variable through the copy created in lower_io_to_temporaries. A theoretical variable coalescing pass would have the same issue with output variables, although that doesn't exist yet. To fix this, add a new bit to nir_variable, and disable copy propagation when it's set. This doesn't seem to affect any drivers now, probably since no one uses lower_io_to_temporaries for inputs as well as copy_prop_vars, but it will fix radv once we flip on lower_io_to_temporaries for fs inputs. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 14:14:53 +02:00
Connor Abbott	f3e2c65041	nir: Return correct size in nir_assign_io_var_locations() It was double-counting cases where multiple variables were assigned to the same slot, and not handling the case where the last variable is a compact variable. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 14:14:53 +02:00
Connor Abbott	dd81d8808d	nir: Handle compact variables when assigning i/o locations These are used in Vulkan for clip/cull distances, instead of the GLSL lowering when the clip/cull arrays are shared. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 14:14:53 +02:00
Connor Abbott	fd5ed6b9d6	nir: Move st_nir_assign_var_locations() to common code It isn't really doing anything Gallium-specific, and it's needed for handling component packing, overlapping, etc. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-08 14:15:06 +02:00
Connor Abbott	27f0c3c15e	radv: Make FragCoord a sysval load_fragcoord is already handled in common code for radeonsi, so we don't need to do anything to handle it. However, there were some passes creating NIR with the varying, so we switch them over to the sysval. In the case of nir_lower_input_attachments which is used by both radv and anv, we add handling for both until intel switches to using a sysval. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 14:14:53 +02:00
Connor Abbott	64f3fc5ea6	spirv: Add an option for making FragCoord a sysval On AMD, FragCoord should be a sysval because it is handled separately from all the other inputs. We were already doing this in radeonsi, but we weren't doing it with radv. It'll be much more annoying to handle VARYING_SLOT_POS in fragment shaders when we let NIR lower FS inputs for us, so here we add an option so that radv can get it as a system value. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 14:14:53 +02:00
Daniel Schürmann	e41e932e57	radv: Lower input attachments in NIR. v2 (Connor) - Fix warning in release mode using MAYBE_UNUSED Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 14:14:53 +02:00
Daniel Schürmann	c65e880a65	radv: Implement nir_intrinsic_load_layer_id(). Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-08 14:14:53 +02:00
Daniel Schürmann	c31f470066	anv,nir: Move lower_input_attachments pass from ANV to NIR. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 14:02:50 +02:00
Dave Airlie	1d327689f9	radv/gfx10: don't emit PFP packets on ME. This was done for all previous GPUs. This fixes Talos Principle launch hangs. Fixes: `7e43022e8c` (radv/gfx10: add gfx10_cs_emit_cache_flush) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 17:19:42 +10:00
Samuel Pitoiset	49e5136887	ac: select the GFX ring when halting waves with UMR on GFX10 GFX10 has two rings, so UMR wants to know which one to halt. Select the first one by default. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-08 09:10:57 +02:00
Bas Nieuwenhuizen	4d118ad44a	radv/gfx10: Move NGG output handling outside of giant if-statement. In merged shaders we put a big if around each shader, so both stages can have a different number of threads. However, the NGG output code still needs to run if the first shader is not executed. This can happen when there are more gs threads than vs/es threads, or when there are 0 es/vs threads (why? no clue). Fixes: `ee21bd7440` "radv/gfx10: implement NGG support (VS only)" Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-08 01:49:54 +02:00
Bas Nieuwenhuizen	703efab7e4	radv: Actually use VK formats for the format table. No ETC2 or ASTC on navi so nothing to add. Fixes: `3dc5ec5d16` "radv/gfx10: generate gfx10_format_table.h" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-07 23:10:32 +02:00
Chia-I Wu	5824130389	anv: fix VkExternalBufferProperties for host allocation It was reported as unsupported previously. It should be importable and is compatible with itself. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Fixes: `69cc6272fb` ("anv: Implement VK_EXT_external_memory_host") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-07 13:31:58 -07:00
Chia-I Wu	f3c7a02a62	anv: fix VkExternalBufferProperties for unsupported handles compatibleHandleTypes must include the queried handle type. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-07 13:31:58 -07:00
Bas Nieuwenhuizen	e46b41b3ae	radv: Handle cmask being disallowed by addrlib. alignment=0 does weird things with align64. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-07-07 21:29:52 +02:00
Samuel Pitoiset	5eaed7ecfc	radv/gfx10: enable support for NAVI10, NAVI12 and NAVI14 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Bas Nieuwenhuizen	817bd0cc2e	radv/gfx10: Use GS rectlist when needed. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	ee21bd7440	radv/gfx10: implement NGG support (VS only) This needs to be cleaned up a bit, and it probably contains missing stuff and/or bugs. This doesn't fix the "half of the triangles" issue. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Bas Nieuwenhuizen	9e37609d0b	radv: Combine vs and tes output keys parts. That way the same deref is valid for both shader stages. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-07 17:51:32 +02:00
Bas Nieuwenhuizen	d0978427cb	radv/gfx10: Use new uconfig reg index packet for GFX10+. Otherwise the hardware/firmware seems to not set the registers. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-07 17:51:32 +02:00
Bas Nieuwenhuizen	aeb5b1a998	radv/gfx10: Set MEM_ORDERED flags on shaders. Scattered because depending on stage they are at offset 24/25/27/30. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	67b6888d8b	radv/gfx10: emit GE_CNTL instead of IA_MULTI_VGT_PARAM for legacy mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	74d69299d1	radv/gfx10: double the number of tessellation offchip buffers per SE Each gfx10 shader engine corresponds to two gfx9 shader engines, so scale the number of offchip buffers accordingly. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	bf1e1a29c3	radv/gfx10: require LLVM 9+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	0f769ed398	radv/gfx10: disable geometry and tessellation shaders Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	fe4419d3c7	radv/gfx10: disable binning Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	faf27ee9b3	radv/gfx10: disable CLEAR_STATE Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	698f9e6fd3	radv/gfx10: disable VK_EXT_transform_feedback It requires a bunch of work, so disable for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	2141e6fc73	radv/gfx10: set user data base registers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	7e43022e8c	radv/gfx10: add gfx10_cs_emit_cache_flush The cache flush logic on GFX10 is quite different and it's implemented with a new function. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	b0b6e27bca	radv/gfx10: set the DCC constant encoding flag Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	ce3b5d4c17	radv/gfx10: do not declare streamout SGPRS Streamout is completely different on GFX10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	352365c5e2	radv/gfx10: do not set stream output shader config Transform feedback is really different on GFX10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	3f68329806	radv/gfx10: emit VGT_VERTEX_REUSE_BLOCK_CNTL during gfx initialization The value doesn't need to be updated for tess. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	2a83154b4a	radv/gfx10: update shader-related fields in si_emit_graphics() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	5556f16609	radv/gfx10: implement si_emit_compute() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	c90f46700d	radv/gfx10: mask DCC tile swizzle by alignment DCC alignment can be less than the alignment of the main surface. In that case, the DCC tile swizzle needs to be masked accordingly. Should have no impact on pre-gfx10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	b1b60a92b1	radv/gfx10: initialize GE_{MAX,MIN}_VTX_INDX/INDX_OFFSET Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	12a42c2d9f	radv/gfx10: implement radv_flush_vertex_descriptors() change Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:32 +02:00
Samuel Pitoiset	0ca09a7fe3	radv/gfx10: implement fill_geom_tess_rings() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:31 +02:00
Samuel Pitoiset	ebeb319f0e	radv/gfx10: implement radv_CmdBindDescriptorSets() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:31 +02:00
Samuel Pitoiset	97891a0d10	radv/gfx10: implement write_buffer_descriptor() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:31 +02:00
Samuel Pitoiset	bdd8acde02	radv/gfx10: use the correct register for image descriptor dumping Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:31 +02:00
Samuel Pitoiset	e5a8f21b0e	radv/gfx10: implement radv_pipeline_generate_hw_hs() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:51:31 +02:00
Samuel Pitoiset	4c82094b7b	radv/gfx10: implement radv_fill_shader_variant() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:39 +02:00
Samuel Pitoiset	b144a70ca8	radv/gfx10: implement radv_pipeline_generate_geometry_shader() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:39 +02:00
Samuel Pitoiset	5551d6d6ea	radv/gfx10: implement radv_init_sampler() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:39 +02:00
Samuel Pitoiset	4c31f3dcc0	radv/gfx10: fix PS exports for SPI_SHADER_32_AR Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:39 +02:00
Samuel Pitoiset	8574a84291	radv/gfx10: implement radv_get_device_name() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	863727c4a3	radv/gfx10: set RADV_FORCE_FAMILY Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	34b185cc43	radv/gfx10: fix a possible hang with exp pos0 with done=0 and exec=0 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	b3a53de5fa	radv/gfx10: set PA_SC_TILE_STEERING_OVERRIDE Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	96cd24588b	radv/gfx10: set cache control registers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	9a01eded0c	radv/gfx10: set llvm_has_working_vgpr_indexing Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	6b9dbb28ef	radv/gfx10: update DB_DFSM_CONTROL register Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	2435b571de	radv/gfx10: update DB_Z_INFO register GFX10 uses the same register as GFX8. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	cfaad5e3ca	radv/gfx10: implement radv_emit_global_shader_pointers() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	3f5ca22e9c	radv/gfx10: implement radv_emit_tess_factor_ring() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	17048c1765	radv/gfx10: implement radv_emit_fb_ds_state() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	2481ac81d3	radv/gfx10: implement radv_initialise_ds_surface() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	c2a5d98148	radv/gfx10: implement radv_emit_fb_color_state() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	e80f189de0	radv/gfx10: implement radv_initialise_color_surface() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	ee8d6a2a6c	radv/gfx10: implement radv_init_dcc_control_reg() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	ccce8f5915	radv/gfx10: implement radv_make_buffer_descriptor() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	549d0aeee4	radv/gfx10: implement si_set_mutable_tex_desc_fields() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	bf11f1c3a4	radv/gfx10: add gfx10_make_texture_descriptor Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	3dc5ec5d16	radv/gfx10: generate gfx10_format_table.h Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	9c1266048f	radv/gfx10: increase maximum number of layers to 8192 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	0213fe09b8	radv/gfx10: increase maximum number of levels to 14 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	1f82007a9e	radv/gfx10: set MAX_ALLOC_COUNT Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	c3459968cd	ac/nir: unpacked GS invocation ID on GFX10+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Samuel Pitoiset	4d7c420a94	ac: add missing formats to ac_get_tbuffer_format() for GFX10 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 17:03:38 +02:00
Lionel Landwerlin	8f0f727fe4	vulkan/overlay: fix command buffer stats Begin/Reset of command buffer both reset the content of the command buffer. Don't forget to wipe them on Begin. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `4438188f49` ("vulkan/overlay: record stats in command buffers and accumulate on exec/submit") Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-07 15:47:54 +03:00
Lionel Landwerlin	5493ec3c19	anv: manually add KHR_display to the list of platforms Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `38305e6c94` ("anv: replace hard-coded platform list with vk.xml parse") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111078 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-07 15:34:09 +03:00
Dave Airlie	002c8cae44	docs/features: add shader buffer and atomic support for llvmpipe	2019-07-07 16:24:21 +10:00
Dave Airlie	2f8cbdfc88	llvmpipe: enable ARB_shader_storage_buffer_object Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-07 16:24:17 +10:00
Dave Airlie	df46b3d196	llvmpipe: add support for shader buffer binding. This adds support for setting shader buffers and passing them to draw or binding them to the fragment shader jit. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-07 16:24:12 +10:00
Dave Airlie	d8fb66a3e1	draw: add shader buffer interfaces. This adds the interface to add mapped shader buffers, and sets up the jit linkage for them. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-07 16:24:09 +10:00
Dave Airlie	b5ac381d8f	gallivm: add buffer operations to the tgsi->llvm conversion. This adds load, store and atomic operations. These operations have to respect the exec_mask, and can't operate in lanes where the execute is off. This is needed to avoid side effects seen outside the shaders. There is also bounds checking on the ssbo accesses vs the size ptr. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-07 16:24:05 +10:00
Dave Airlie	a845baff16	gallivm: move mask_vec function up higher so it can be reused. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-07 16:24:01 +10:00
Dave Airlie	ab807859ea	tgsi: denote which load/store/atomic channels are unsigned llvmpipe will need this info. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-07 16:23:54 +10:00
Dave Airlie	e21007f426	llvmpipe: add support for ssbo to the fragment shader jit. This just adds the ssbo ptrs to the jit fragment shader api. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-07 16:23:51 +10:00
Dave Airlie	69ff738eb0	draw: add support for ssbo ptrs to jit tables. This adds ssbo/num_ssbo ptrs to the vs/gs jit tables. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-07 16:23:46 +10:00
Dave Airlie	e84570ba70	gallivm: add some basic SSBO limits. (v2) v2: update ssbo size Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-07 16:23:44 +10:00
Dave Airlie	7c3807c1b3	util: add util_copy_shader_buffer. This just adds an inline to copy a pipe_shader_buffer. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-07 16:23:40 +10:00
Dave Airlie	5ff697aa65	gallivm: add ssbo pointers to the soa build api. Need to pass ssbo + ssbo size pointers just like constants. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-07 16:23:36 +10:00
Dave Airlie	2a55acbc1d	gallivm: add compare exchange wrapper This just pulls the wrapper from LLVM for older versions Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-07 16:23:32 +10:00
Dave Airlie	4f709c86a9	vertex shader: add exec masking (v2) As suggested by Roland this is just a compare of fetch_max vs the counter, much simpler than my original spaghetti code. We require the vertex shader to have an exec mask to get proper ssbo/image load/store/atomics semantics Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-07-07 16:23:27 +10:00
Alexandros Frantzis	4271430dd7	virgl: Hide internal virgl_resource functions Since the transition to virgl_resource_transfer_map(), several previously public virgl_resource functions are not required to be public anymore. We also move the functions earlier in the file so they can be used without functions declarations. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-07-06 19:30:38 -07:00
Alexandros Frantzis	e5b54d0018	virgl: Use virgl_resource_transfer_map for textures Replace custom texture map code (for maps which don't require resolve) with virgl_resource_transfer_map. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-07-06 19:30:37 -07:00
Alexandros Frantzis	f8975f8f2f	virgl: Use virgl_resource_transfer_map for buffers Replace custom buffer map code with virgl_resource_transfer_map. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-07-06 19:30:34 -07:00
Alexandros Frantzis	bb0a38d819	virgl: Introduce virgl_resource_transfer_map Normal mapping of buffers and textures uses almost identical logic. This commit extracts this logic in the form of the virgl_resource_transfer_map() helper function. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-07-06 19:30:22 -07:00
Jason Ekstrand	4633298fd6	iris: Use a uint16_t for key sizes sizeof(struct brw_vs_prog_key) == 324. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-04 19:52:34 -05:00
Marek Olšák	aa5dab27f9	ac: destroy passes in ac_destroy_llvm_compiler Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-04 15:39:04 -04:00
Marek Olšák	ea64d66fde	ac: use an LLVM fence instead of s.waitcnt when possible Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-04 15:39:03 -04:00
Marek Olšák	14450c8c41	ac: remove unused AC_WAIT_EXP Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-04 15:39:01 -04:00
Marek Olšák	fe5dbe75b2	ac: only set ac_dlc in ac_llvm_build.c Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-04 15:39:00 -04:00
Marek Olšák	8a71f60194	ac: replace glc,slc with cache_policy for loads cosmetic change Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-04 15:38:56 -04:00
Marek Olšák	a29e781961	ac: replace glc,slc with cache_policy for stores cosmetic change Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-04 15:38:54 -04:00
Jonathan Marek	5feb8adb0f	etnaviv: implement buffer compression Vivante GPUs have lossless buffer compression using the tile-status bits, which can reduce memory access and thus improve performance. This patch only enables compression for "V4" compression GPUs, but the implementation is tested on GC2000(V1) and GC3000(V2). V1/V2 compression looks absolutely useless, so it is not enabled. I couldn't test if this patch breaks MSAA, because it looks like MSAA is already broken. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-04 14:05:18 -04:00
Jonathan Marek	f6a0d17abe	etnaviv: detect v4 compression Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-04 14:05:18 -04:00
Jonathan Marek	e910acb3f2	etnaviv: rs: don't use etna_compatible_rs_format when possible This mirrors the change in blt. RS cares about this for msaa/compression. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-04 14:05:18 -04:00
Jonathan Marek	66411521ea	etnaviv: combine translate_ts_sampler_format/translate_msaa_format Both translate the same thing, so just add the missing cases into one. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-04 14:05:18 -04:00
Jonathan Marek	84c87f40fb	etnaviv: fix compression format not set correctly in TS_MEM_CONFIG VIVS_TS_MEM_CONFIG_COLOR_COMPRESSION_FORMAT() needs to be used. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-04 14:05:18 -04:00
Jonathan Marek	53475c85fd	etnaviv: set correct ts_clear_value for BLT engine BLT engine uses all ones to clear TS, set ts_clear_value to match that. Note: ts_clear_value is never used with BLT engine. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-04 14:05:18 -04:00
Jonathan Marek	7c7eaaed4a	etnaviv: remove initial CPU ts clear Since we have "ts_valid" to avoid using uncleared ts, this memset serves no purpose. Also it is broken because it doesn't use cpu_prep/cpu_fini. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-04 14:05:18 -04:00
Jonathan Marek	95d937852e	etnaviv: implement TS_MODE for GC7000L GC7000L has a TS mode with larger tiles, which improves performance. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-04 14:05:18 -04:00
Jonathan Marek	bc5ae6a330	etnaviv: fix ts size calculation The size of the TS is screen->specs.bits_per_tile bits per tile, with each tile being 64 bytes of the resource. This gives the same result for 32bpp formats, but reduces the size of TS for 16bpp formats by 2. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-04 14:05:09 -04:00
Jonathan Marek	2f540745ad	etnaviv: update headers from rnndb Update to etna_viv commit 8a8b13a and use new names in the code. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-07-04 14:04:47 -04:00
Eric Engestrom	c314ba2c26	scons: s/HAVE_NO_AUTOCONF/HAVE_SCONS/ Back when autotools and scons were the two build systems, it kinda made sense to call scons "not autoconf", but autoconf's been gone for a while now and other build systems have been added (android.mk and meson), so the name really doesn't make any sense anymore. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-04 16:41:23 +01:00
Bas Nieuwenhuizen	bbbcb49f9b	radeonsi: Fix some warnings. ../mesa/src/gallium/drivers/radeonsi/si_compute_blit.c: In function ‘si_clear_buffer’: ../mesa/src/gallium/drivers/radeonsi/si_compute_blit.c:195:11: warning: unused variable ‘clear_alignment’ [-Wunused-variable] unsigned clear_alignment = MIN2(clear_value_size, 4); ^~~~~~~~~~~~~~~ [23/60] Compiling C object 'src/gallium/drivers/radeonsi/3cdc30e@@radeonsi@sta/si_compute_prim_discard.c.o'. ../mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c: In function ‘si_prepare_prim_discard_or_split_draw’: ../mesa/src/gallium/drivers/radeonsi/si_compute_prim_discard.c:1106:7: warning: unused variable ‘compute_has_space’ [-Wunused-variable] bool compute_has_space = sctx->ws->cs_check_space(cs, need_compute_dw, false); Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-04 11:12:27 +00:00
Nicolai Hähnle	cb07f91489	amd/common: move ac_shader_{binary,reloc} into r600 and rename They are no longer used by radeonsi or radv. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-04 10:52:26 +00:00
Nicolai Hähnle	510e74ff48	amd/common: removed unused ac_shader_binary functions Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-04 10:52:26 +00:00
Nicolai Hähnle	b398230e6d	amd/common: remove unused ac_compile_module_to_binary Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-04 10:52:26 +00:00
Bas Nieuwenhuizen	6a220e67ce	radv: Switch to using rtld. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-04 10:52:26 +00:00
Bas Nieuwenhuizen	5ff651c0a7	radv: Move more stuff to variant create time. Due to them depending on the linker result. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-04 10:52:26 +00:00
Bas Nieuwenhuizen	726a31df70	radv: Add the concept of radv shader binaries. This simplifies a bunch of stuff by (1) Keeping all the things in a single allocation, making things easier for the cache. (2) creating a shader_variant creation helper. This is immediately put to use by creating rtld shader binaries. This is the main reason for the binaries, as we need to do the linking at upload time, i.e. post caching. We do not enable rtld yet. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-04 10:52:26 +00:00
Bas Nieuwenhuizen	43f2f01cc8	radv: Add export_prim_id to the shader variant info. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-04 10:52:26 +00:00
Bas Nieuwenhuizen	15046ef7c8	radv: use last nir shader to determine stage in postprocessing Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-04 10:52:26 +00:00
Bas Nieuwenhuizen	7469516244	radv: Merge rsrc1/rsrc2 fields with the config fields. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-04 10:52:26 +00:00
Andres Gomez	4000428ada	vulkan: Update headers to 1.1.113 Some headers were not dragged in the last update(s). Fixes: `465ec0b145` ("vulkan: Update the XML and headers to 1.1.113") Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-04 10:37:52 +00:00
Samuel Pitoiset	cce2645810	radv: do not crash when generating binning state for unknown chips These values are only useful if binning is disabled. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-04 12:22:46 +02:00
Samuel Pitoiset	8a425e057d	radv: fix potential crash in the compute resolve path If the destination attachment is UNUSED. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-04 12:22:43 +02:00
Tomeu Vizoso	0cc02c9ea6	panfrost: Take into account off-screen FBOs In that case, ctx->pipe_framebuffer.cbufs[0] can be NULL. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Fixes: `5375d009be` ("panfrost: Pass referenced BOs to the SUBMIT ioctls")	2019-07-04 10:48:09 +02:00
Christian Gmeiner	f39a7fd627	util/macros: rework DIV_ROUND_UP macro Simplify used math. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-07-04 10:21:32 +02:00
Christian Gmeiner	e519d3c239	gitlab-ci: bump required libdrm version Fixes following build problem: Message: libdrm 2.4.99 needed because amdgpu has the highest requirement Dependency libdrm_intel found: NO found '2.4.97' but need: '>=2.4.99' Dependency libdrm_intel found: NO meson.build:1178:4: ERROR: Invalid version of dependency, need 'libdrm_intel' ['>=2.4.99'] found '2.4.97'. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-07-04 09:55:10 +02:00
Kenneth Graunke	9ea67f0a79	iris: Fix MOCS for grid surface Hardcoding 4 is bad; we have a function for this now.	2019-07-03 22:24:50 -07:00
Kenneth Graunke	10560f8506	iris: Minor tidying	2019-07-03 22:24:44 -07:00
Marek Olšák	6ab23805c3	Revert "mesa/st: Passthrough scissor when clearing by quad" This reverts commit `0a88aa3025`. It breaks a lot of piglit tests.	2019-07-04 01:08:02 -04:00
Marek Olšák	8dfdf5aae4	gallium/u_blitter: add return to fix the build	2019-07-03 23:44:14 -04:00
Alyssa Rosenzweig	0a88aa3025	mesa/st: Passthrough scissor when clearing by quad The scissor state -is- setup, but the scissor test is not enabled. This can prevent certain optimizations from occurring on tilers where unaffected tiles are thrown out entirely. v2: Only enable scissor test if the scissor test is actually set by the app, to avoid regressing quad-based clears used for other reasons (like a color mask). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-07-03 14:33:46 -07:00
Nicolai Hähnle	8845a23698	amd: add NAVI10 PCI IDs Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	92e34568b7	radeonsi/gfx10: fix legacy GS LLVM doesn't insert s_waitcnt_vscnt before GS_DONE. There was also the crash in legacy GS copy shader. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	dfa8e758c2	radeonsi/gfx10: disable clear state Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	0dd57f0fc0	radeonsi/gfx10: disable DPBB Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	815fd77a47	radeonsi/gfx10: disable SDMA Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	f66ee5af2f	radeonsi: determine the rasterization primitive type accurately (v2) v2: reworked version to fix bugs and make it more efficient Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	a4b3eea325	radeonsi/gfx10: consolidate & improve input_prim determination for NGG Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	969e5176c2	ac: rework ac_build_waitcnt for gfx10 Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	214ddfb688	radeonsi/gfx10: implement si_shader_vs Only used with tessellation + GS instancing. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	6cf2fb1fc4	radeonsi/gfx10: unpack GS invocation ID Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	32694456f7	radeonsi/gfx10: jump over the shader query atomic if the queries are disabled Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	244a8e6798	radeonsi/gfx10: cosmetic changes Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	09a905d930	radeonsi/gfx10: set cache control registers Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	b680f723f8	radeonsi/gfx10: export correct PrimitiveID from NGG vertex shaders Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	3203a74dcb	radeonsi/gfx10: set PA_SC_TILE_STEERING_OVERRIDE Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	07aacdbfd5	radeonsi/gfx10: add a workaround for stencil HTILE with mipmapping Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	51db950419	radeonsi/gfx10: disable DCC with MSAA It was only enabled for 2x MSAA anyway. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	6920f09f4b	radeonsi/gfx10: fix GL_LINE polygon mode for decomposed primitives We need to tell PA to accept edge flags generated by the input assembler, because decomposed primitives shouldn't draw inner edges. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	e39d4594da	radeonsi/gfx10: fix NGG GS color clamping Just need to pass the input from ES to GS. Everything else is done. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	40e7c65590	radeonsi/gfx10: fix vertex color clamping for TES Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	cc7875150a	radeonsi/gfx10: unbind NGG shaders when destroyed This fixes glsl-max-varyings, which creates shaders, draws, and then destroys them. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	b90ddff477	radeonsi/gfx10: don't use the GS workaround for triangle strips w/ adjacency Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	c3ac22a620	radeonsi/gfx10: don't do the query buffer atomic for blit shaders Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	adbec817d3	radeonsi/gfx10: update spi_map if API VS (as NGG) changes and PS doesn't Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	1e39c21c23	radeonsi/gfx10: fix a possible hang with exp pos0 with done=0 and exec=0 Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	683cf11b81	radeonsi/gfx10: prefetch HW GS when NGG is used Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	76898a8062	amd/common/gfx10: set DLC for llvm.amdgcn.s.buffer.load Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	7f71579064	radeonsi/gfx10: fix PS exports for SPI_SHADER_32_AR Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	4bdf44724f	radeonsi/gfx10: set DLC for loads when GLC is set This fixes L1 shader array cache coherency. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	f81aa6b0c8	radeonsi/gfx10: fix shader images Don't promote 2D image instructions to 3D, and don't set z=BASE_ARRAY. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	7c805a7c67	radeonsi/gfx10: set the DCC constant encoding flag Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	6eb219e963	radeonsi/gfx10: fix intensity formats move the ALPHA_IS_ON_MSB fixup into vi_alpha_is_on_msb Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	6944f99176	radeonsi/gfx10: allocate GDS BOs for streamout Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Marek Olšák	395185912d	radeonsi/gfx10: make sure GDS is idle between IBs Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	5ff3aff0d6	radeonsi/gfx10: implement streamout Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	792a638b03	radeonsi/gfx10: implement streamout-related queries The NGG hardware pipeline doesn't track these statistics automatically, and in fact cannot track them automatically when API geometry shaders are involved, so we accumulate statistics in the shader using atomic adds. This implementation accumulates statistics via the memory system and the RW buffer descriptor setup. We could use GDS, but since these atomics aren't latency-sensitive, that basically just trades off L2$ bandwidth vs. export bus bandwidth. One single memory transaction per shader workgroup doesn't seem too bad. The result ring buffer in memory is needed either way to avoid pipeline stalls. The shader code contains the atomic unconditionally, though the GFX10_GS_QUERY_BUF is a null buffer when no queries are active. The atomic is simply discarded by the shader hardware in that case. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	bcd2d2e194	radeonsi/gfx10: enable the workaround for unaligned vertex fetch Yes, really. Note that non-format buffer loads are unaffected and work just fine with unaligned pointers (as long as SH_MEM_CONFIG is setup correctly, which amdgpu ensures). Fixes e.g. KHR-GL45.vertex_attrib_64bit.vao Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	22b85bfc02	radeonsi/gfx10: re-order the initialization order in si_compile_tgsi_main It's useful to be able to access gs_ngg_scratch before creating the main wrapping branch. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	3aa622aab1	radeonsi/gfx10: apply DCC MSAA blend workaround Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	bc25ccfe22	radeonsi/gfx10: implement si_emit_global_shader_pointers Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	6bcc273de8	radeonsi/gfx10: implement si_init_tess_factor_ring Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	2492cfde66	radeonsi/gfx10: initialize EXEC for TES-as-NGG (without geometry shader) Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	591537c7fa	radeonsi/gfx10: use correct VGPR for instance ID in LS shader Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	f3b9a37278	radeonsi/gfx10: implement si_shader_hs Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	e4d6b4daae	radeonsi/gfx10: implement si_create_sampler_state Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	0bf3e6fae7	radeonsi/gfx10: double the number of tessellation offchip buffers per SE Each gfx10 shader engine corresponds to two gfx9 shader engines, so scale the number of offchip buffers accordingly. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	2afd3c421d	radeonsi/gfx10: implement get_tess_ring_descriptor Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	d028440f57	radeonsi/gfx10: mask DCC tile swizzle by alignment DCC alignment can be less than the alignment of the main surface. In that case, the DCC tile swizzle needs to be masked accordingly. Should have no impact on pre-gfx10. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	1666ee183e	radeonsi/gfx10: implement hardware MSAA resolve MSAA is only supported for 64KB_{R,Z}_X modes, so the micro tile optimization that we use on gfx9 and earlier does not work. Be very explicit about how the swizzle mode of the temporary surface is selected. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:13 -04:00
Nicolai Hähnle	69c41fb8ff	radeonsi/gfx10: fix binding on si_update_scratch_relocs Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	fd8758366b	radeonsi/gfx10: set llvm_has_working_vgpr_indexing Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	48810ad02d	radeonsi/gfx10: implement load_const_buffer_desc_fast_path Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	1b11fb148c	radeonsi/gfx10: take PRIMID from the correct output when exported by GS Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	8060339278	radeonsi/gfx10: change location of instance ID shader input Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	ccdf792910	radeonsi/gfx10: set USER_DATA_ADDR offset for geometry shaders Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	00707922d4	radeonsi/gfx10: implement si_emit_derived_tess_state Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	e0c2a4d58c	radeonsi/gfx10: implement si_shader_gs This is only used in the legacy, non-NGG path. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	2864d53deb	radeonsi/gfx10: implement preload_ring_buffers Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	56cab3e996	radeonsi/gfx10: implement si_set_ring_buffer Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	3c1aeb834f	radeonsi/gfx10: allow rectangle outputs from NGG primitive shader Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	77e715541c	radeonsi/gfx10: emit VGT_GS_OUT_PRIM_TYPE from draw and add it to VS_STATE With NGG, the VGT_GS_OUT_PRIM_TYPE can change without a shader change. The VS_STATE is required for both streamout and culling from a vertex shader without pre-compiling outprim-specific variants. We could consider compiling specialized variants in the future. We could also consider compiling the NGG logic as an epilog. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	4ecc39e1aa	radeonsi/gfx10: NGG geometry shader PM4 and upload Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	a04aa4be2b	radeonsi/gfx10: generate geometry shaders for NGG Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	efe1cd4859	radeonsi/gfx10: use the correct register for image descriptor dumping Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	1ce52c1e37	radeonsi/gfx10: emit GE_CNTL instead of IA_MULTI_VGT_PARAM for legacy mode Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	77c0f9e7ba	radeonsi/gfx10: initialize GE_{MAX,MIN}_VTX_INDX/INDX_OFFSET Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	47c9505a92	radeonsi/gfx10: setup registers for OpenGL compute Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	b8d3fd46d6	radeonsi/gfx10: set user data base registers Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	016a465d7d	radeonsi/gfx10: implement gfx10_shader_ngg For pipelines without API GS. We will later expand this to cover NGG geometry shaders as well. Note that the vtx offset passed into the GS part is just the vertex index multiplied by VGT_ESGS_RING_ITEMSIZE. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	d0c204a1e0	radeonsi/gfx10: add NGG registers to si_init_config Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	ae00cae0b7	radeonsi/gfx10: update shader-related fields in si_init_config Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	1dee01ee13	radeonsi/gfx10: implement si_shader_ps Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	612489bd5d	radeonsi/gfx10: generate VS and TES as NGG merged ESGS shaders This does not support geometry shading yet. Also missing are streamout and NGG-specific optimizations. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	e86256c512	radeonsi/gfx10: distinguish between merged shaders and multi-part shaders Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	4063ea95e9	radeonsi/gfx10: update si_get_shader_name Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	8ec60d3031	radeonsi/gfx10: add as_ngg shader key bit Also add the shader main part NGG variant, so that in principle we can switch between legacy and NGG modes. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	40b12c0f5a	radeonsi/gfx10: implement si_update_shaders Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	5726ec0d24	radeonsi/gfx10: implement si_build_vgt_shader_config Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	b45c3debe8	radeonsi/gfx10: keep track of whether NGG is used We always use NGG by default, except when tessellation is enabled with extreme geometry shader amplification. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	226f650d92	radeonsi/gfx10: document NGG shader stages Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	7bb9bb0540	radeonsi/gfx10: implement gfx10_emit_cache_flush Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	0c6c6810bd	radeonsi/gfx10: add si_context::emit_cache_flush The introduction of GCR_CNTL makes cache flush handling on gfx10 sufficiently different that it makes sense to just use a separate function. Since emit_cache_flush is called quite early during context init, we initialize the pointer explicitly in si_create_context. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	08e2a62b07	radeonsi/gfx10: implement DB registers Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	372652bccc	radeonsi/gfx10: set CB registers Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	44adae42ae	radeonsi/gfx10: always set up sample locations Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	79b1eaf2fd	radeonsi/gfx10: use Z32_FLOAT_CLAMP for upgraded depth textures Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	c049a6f895	radeonsi/gfx10: implement vertex format changes Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	62f73d8214	radeonsi/gfx10: implement si_set_{constant,shader}_buffer Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	21ac1da0d1	radeonsi/gfx10: implement si_make_buffer_descriptor Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	7bc818aef1	radeonsi/gfx10: implement si_set_mutable_tex_desc_fields Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	8598a999ea	radeonsi/gfx10: gfx10 can render up to 8192 layers Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	3f2b2b52d0	radeonsi/gfx10: add gfx10_make_texture_descriptor Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	595a7f7c47	radeonsi/gfx10: add pipe_screen::make_texture_descriptor Texture descriptors in gfx10 are very different. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	4afce5efdd	radeonsi/gfx10: determine view->is_integer based on the pipe_format It was convenient, but NUM_FORMAT no longer exists in gfx10. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	3163db3ba4	radeonsi/gfx10: implement si_is_format_supported Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	0ffa2292b3	radeonsi/gfx10: generate gfx10_format_table.h Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	af29ad7cc6	radeonsi/gfx10: set MAX_ALLOC_COUNT The number for Vega was copied from PAL and has no effect because of MIN2. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	594010e366	radeonsi/gfx10: require LLVM 9 Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Boyuan Zhang	de99e0a563	radeon/vcn: update for new vcn enc interface Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Boyuan Zhang	9ab1e427bb	radeonsi: enable jpeg decode for navi10 Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Boyuan Zhang	6480c7b577	radeon/vcn: implement vcn 2.0 jpeg decode Use direct register to implement vcn 2.0 jpeg decode Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Boyuan Zhang	0cd7953ece	radeon/vcn: add direct register bool VCN 2.0 uses direct register space where VCN 1.0 uses some indirect registers Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Boyuan Zhang	7a5c22d32a	radeon/vcn: add defines for vcn 2.0 jpeg Add necessary register defines for vcn 2.0 jpeg decode Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Boyuan Zhang	0c27971157	radeon/vcn: use variable to assign ib cmd Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Boyuan Zhang	587b9c5dae	radeon/vcn: implement vcn 2.0 encode Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Boyuan Zhang	40e1bed389	radeon/vcn: add vcn2.0 encode skeleton Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> (v2: build fix -- Nicolai) Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Boyuan Zhang	8f6272d494	radeon/vcn: move vcn1.0 specific defines to c Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Boyuan Zhang	b5287a9fa6	radeon/vcn: assign function pointer with ib functions Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Boyuan Zhang	9940a6e066	radeon/vcn: add function pointer for ib functions Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Boyuan Zhang	c6b5188505	radeon/vcn: move header related algorithm to vcn_enc Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Boyuan Zhang	dd46740bc2	radeon/vcn: move add buf func to common file Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Boyuan Zhang	e6ca4d1bd8	radeon/vcn: move cs defines to enc header file Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Leo Liu	874881b26b	radeon/vcn: add VP9 support for Navi10 It requires bigger DPB and context buffers Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Leo Liu	9bbb546c4f	radeonsi: enable encode support for newer HW Previously it was Raven only allowed to do so Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Leo Liu	d6acd29c9a	radeon/vcn: add VCN2 set of internal registers for IB From VCN2.0, the RBC have different views on the registers Signed-off-by: Leo Liu <leo.liu@amd.com> (v2: rebase -- Nicolai) Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Leo Liu	a38268ea5b	radeonsi/uvd: allow newer HW to create HW decoder Previously it was Raven only allowed to do so Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	84e7ee421f	ac/surface/gfx10: allow "rotated" micro mode Standard mode does not support DCC. The R is retconned to "render target" on gfx10. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	a66be784c3	ac/surface/gfx10: DCC is only supported with SW_64KB_{Z,R}_X modes Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	97ddcfff7c	amd/addrlib/gfx10: forbid DCC for swizzle modes which the hardware does not support Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	9eb4a79345	amd/addrlib/gfx10: fix assertion in Addr2IsValidDisplaySwizzleMode Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	6d416ac7e1	amd/common/gfx10: print gfx10 registers in debug dumps Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	70fd27d1e3	amd/common/gfx10: CMASK is only used for FMASK All regular color compression is done via DCC. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	b52bf8f12a	amd/common/gfx10: support new tbuffer encoding Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	c067aaa580	amd/common/gfx10: pad shader buffers for instruction prefetch Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	227c29a80d	amd/common/gfx10: implement scan & reduce operations Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	7ba80c1d19	amd/common/gfx10: add GS_ALLOC_REQ message define Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	4c364c89e2	amd/common/gfx10: print out GCR_CNTL as part of {ACQUIRE,RELEASE}_MEM Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	74a26af913	amd/common/gfx10: add register JSON A small number of fields now need new disambiguation. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	536782b0b7	amd/common: add GFX10 chips Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Marek Olšák	677bb80c98	meson: require libdrm_amdgpu 2.4.99 for Navi Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	db7e7a6cb5	radv: gfx10 is not supported Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Marek Olšák	78cdf9a99f	amd/addrlib: add gfx10 support Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	112bf7f900	radeonsi: make emit_streamout_output externally accessible Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	e241b405ca	radeonsi: pass the context to query destroy functions We'll need this in the future. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	064f195ef0	radeonsi: make si_restore_qbo_state externally available Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	04e27ec136	radeonsi: make get_primitive_id externally visible Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	5059a4df8a	radeonsi: make si_llvm_export_vs externally available Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Nicolai Hähnle	4a774ba893	radeonsi: various si_translate_*format functions only apply to pre-gfx10 Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 15:51:12 -04:00
Marek Olšák	c53e6ea05d	radeonsi: use a fragment shader blit instead of DB->CB copy for ZS CPU mappings This mainly removes and simplifies code that is no longer needed. There were some issues with the DB->CB stencil copy on gfx10, so let's just use a fragment shader blit for all ZS mappings. It's more reliable. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-07-03 15:51:12 -04:00
Marek Olšák	6686d8a130	gallium/u_blitter: implement copying from ZS to color and vice versa This is for drivers that can't map depth and stencil and need to blit them to a color texture for CPU access. This is also useful for drivers using separate depth and stencil. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-07-03 15:51:12 -04:00
Marek Olšák	13a5e9d685	gallium/util: rewrite depth-stencil blit shaders - merge all 3 functions (Z, S, ZS) - don't write the color output - read the value from texel.x, then write it to position.z or stencil.y (don't use the value from texel.y or texel.z) Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-07-03 15:51:12 -04:00
Marek Olšák	131d40cfc9	st/mesa: accelerate glCopyPixels(STENCIL) Tested-by: Dieter Nützel	2019-07-03 15:50:04 -04:00
Yevhenii Kolesnikov	65dc4db08e	glsl/standalone: meson test for --dump-builder Added meson test for standalone compiler with --dump-builder option on builtin texture* functions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107767 Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-07-03 12:13:37 -07:00
Sergii Romantsov	9f85b4940c	glsl/standalone: exit on unsupported texture functions glsl/standalone with --dump-builder will exit when unsupported texture functions are encountered. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107767 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-07-03 12:13:37 -07:00
Pierre-Eric Pelloux-Prayer	ea5b7de138	radeonsi: make gl_SampleMaskIn = 0x1 when MSAA is disabled gl_SampleMaskIn is 1 when R_028BE0_PA_SC_AA_CONFIG is 0, so this commit reworks the conditions controlling this register. Before it was set if the sctx->framebuffer had a sample count > 1. Now we still require this condition, but we also need either: - GL_MULTISAMPLE to be enabled - to be executing an operation that doesn't depend on GL state using u_blitter. This fixes the arb_sample_shading/sample_mask piglit tests on radeonsi. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-07-03 14:59:21 -04:00
Brian Paul	7bb3d6acec	gallium/u_blitter: enable MSAA when blitting to MSAA surfaces If we're doing a Z -> Z MSAA blit (for example) we need to enable msaa rasterization when drawing the quads so that we can properly write the per-sample values. This fixes a number of Piglit ext_framebuffer_multisample blit tests such as ext_framebuffer_multisample/no-color 2 depth combined with the VMware driver. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-07-03 14:59:15 -04:00
Alexandros Frantzis	e5be4351c2	virgl: Clear the valid buffer range when possible If we are discarding the whole resource, we don't care about previous contents, and the resource storage is now unused, either because we have created new resource storage, or because we have waited for the existing resource storage to become unused, or because the transfer is unsynchronized. In the last two cases this commit marks the storage as uninitialized, but only if the resource is not host writable (in which case we can't clear the valid range, since that would result in missed readbacks in future transfers). In the first case, when the whole resource discard involves a reallocation, the reallocation and subsequent rebinding already update the valid buffer range appropriately. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-07-03 09:59:55 -07:00
Jan Zielinski	243db4980c	swr/swr: Enable ARB_viewport_array The rasterizer core supported ARB_viewport_array, but the swr layer connecting core to Gallium state tracker only allowed one viewport. We add support for multiple viewports to swr layer. Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-07-03 14:43:28 +02:00
Bas Nieuwenhuizen	c6cb9b197d	radv: Support VK_EXT_queue_family_foreign. Basically same as external for now. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Only case we might need to handle differently in the near future is Raven's case of displayable DCC which is not renderable. But we don't support that yet.	2019-07-03 10:56:21 +00:00
Bas Nieuwenhuizen	8a053254b8	radv: Fix interactions between variable descriptor count and inline uniform blocks. Fixes: `d7e6541cc7` "radv: Only allocate supplied number of descriptors when variable." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-03 10:43:35 +00:00
Michel Dänzer	11a3679e3a	winsys/amdgpu: Make KMS handles valid for original DRM file descriptor Getting a DMA-buf fd and converting that to a handle using our duplicate of that file descriptor (getting at which requires passing a radeon_winsys pointer to the buffer_get_handle hook) makes sure of this, since duplicated file descriptors reference the same file description and therefore the same GEM handle namespace. This is necessary because libdrm_amdgpu may use a different DRM file descriptor with a separate handle namespace internally, e.g. because it always reuses any existing amdgpu_device_handle for the same device. amdgpu_bo_export returns a handle which is valid for that internal file descriptor. Bugzilla: https://bugs.freedesktop.org/110903 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-03 09:19:07 +00:00
Michel Dänzer	cb446dc0fa	winsys/amdgpu: Add amdgpu_screen_winsys It extends pipe_screen / radeon_winsys and references amdgpu_winsys. Multiple amdgpu_screen_winsys instances may reference the same amdgpu_winsys instance, which corresponds to an amdgpu_device_handle. The purpose of amdgpu_screen_winsys is to keep a duplicate of the DRM file descriptor passed to amdgpu_winsys_create, which will be needed in the next change. v2: * Add comment in amdgpu_winsys_unref explaining why it always returns true (Marek Olšák) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-03 09:19:07 +00:00
Michel Dänzer	6fce296400	winsys/amdgpu: Use amdgpu_winsys helper instead of open-coded casts Cleanup to prevent breakage with the next change, no functional change intended in this one. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-03 09:19:07 +00:00
Juan A. Suarez Romero	e06bc0b166	intel: fix wrong format usage Do not use the view format when filling the surface state. Fixes dEQP-VK.image.texel_view_compatible.compute.extended.texture.* Fixes: `fb1350c76f` ("intel: Add and use helpers for level0 extent") Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-03 10:14:54 +02:00
Samuel Pitoiset	a7b6a869a7	radv: only allocate a 32-bit value for the TC-compat range metadata Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 08:52:01 +02:00
Samuel Pitoiset	6baa453dd5	radv: remove unused code in radv_update_tc_compat_zrange_metadata() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 08:51:58 +02:00
Samuel Pitoiset	a21f23c811	radv: add radv_get_depth_pipeline() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-03 08:51:42 +02:00
Mike Blumenkrantz	e005470466	iris: assert isl_surf_init success in resource_from_handle this can fail unexpectedly due to bugs, so it's good to provide feedback when this occurs Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-07-02 15:39:44 -07:00
Jason Ekstrand	e708261cb7	anv: Advertise a more accurate minTexelBufferOffsetAlignment Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-02 22:28:44 +00:00
Jason Ekstrand	0bc657f2db	anv: Implement VK_EXT_texel_buffer_alignment Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-02 22:28:44 +00:00
Jason Ekstrand	465ec0b145	vulkan: Update the XML and headers to 1.1.113 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-02 22:28:44 +00:00
Caio Marcelo de Oliveira Filho	050eb6389a	spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup From OpPtrAccessChain description in the SPIR-V spec (1.4 rev 1): For objects in the Uniform, StorageBuffer, or PushConstant storage classes, the element’s address or location is calculated using a stride, which will be the Base-type’s Array Stride when the Base type is decorated with ArrayStride. For all other objects, the implementation will calculate the element’s address or location. For non-CL shaders the driver should layout the Workgroup storage class, so override any explicitly set ArrayStride in the shader. This currently fixes only the lower_workgroup_access_to_offsets case, which is used by anv. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-07-02 12:15:01 -07:00
Karol Herbst	95a7fd0f10	nouveau: handle new CAPS Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-07-02 20:09:44 +02:00
Jason Ekstrand	fa869f45c8	intel/fs: Use nir_lower_interpolation on gen11+ On gen11, they removed the PLN instruction so we have to emit a pile of MAD to emulate it. We may as well do that in NIR so we can optimize and later schedule it. Shader-db results on Ice Lake: total instructions in shared programs: 17145644 -> 16556440 (-3.44%) instructions in affected programs: 11507454 -> 10918250 (-5.12%) helped: 35763 HURT: 42085 helped stats (abs) min: 1 max: 140 x̄: 19.09 x̃: 18 helped stats (rel) min: 0.04% max: 37.93% x̄: 15.40% x̃: 14.49% HURT stats (abs) min: 1 max: 248 x̄: 2.22 x̃: 2 HURT stats (rel) min: 0.05% max: 50.00% x̄: 5.00% x̃: 2.47% 95% mean confidence interval for instructions value: -7.67 -7.47 95% mean confidence interval for instructions %-change: -4.46% -4.29% Instructions are helped. total loops in shared programs: 4370 -> 4370 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 360624645 -> 368220857 (2.11%) cycles in affected programs: 269631244 -> 277227456 (2.82%) helped: 15583 HURT: 65874 helped stats (abs) min: 1 max: 28561 x̄: 78.45 x̃: 32 helped stats (rel) min: <.01% max: 67.81% x̄: 5.38% x̃: 2.44% HURT stats (abs) min: 1 max: 238638 x̄: 133.87 x̃: 20 HURT stats (rel) min: <.01% max: 306.25% x̄: 5.81% x̃: 3.97% 95% mean confidence interval for cycles value: 67.42 119.09 95% mean confidence interval for cycles %-change: 3.61% 3.73% Cycles are HURT. total spills in shared programs: 8943 -> 8981 (0.42%) spills in affected programs: 1925 -> 1963 (1.97%) helped: 44 HURT: 14 total fills in shared programs: 21815 -> 21925 (0.50%) fills in affected programs: 3511 -> 3621 (3.13%) helped: 41 HURT: 18 LOST: 70 GAINED: 14 Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-02 16:15:25 +00:00
Jason Ekstrand	2b79a9e5a5	intel/fs: Implement nir_intrinsic_load_fs_input_interp_deltas Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-02 16:15:25 +00:00
Jason Ekstrand	8e7d066682	intel/fs: Actually implement the load_barycentric intrinsics If they never get used, dead code should clean them up. Also, we rework the at_offset and at_sample intrinsics so they return a proper vec2 instead of returning things in PLN layout. Fortunately, copy-prop is pretty good at cleaning this up and it doesn't result in any actual extra MOVs. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-02 16:15:25 +00:00
Rob Clark	5787a2dfe3	nir: add pass to lower load_interpolated_input Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-02 16:15:25 +00:00
Boris Brezillon	5375d009be	panfrost: Pass referenced BOs to the SUBMIT ioctls Instead of manually adding the BOs from the various SLAB pools plus the one backing the color FB, we insert them in the BO set attached to the job and let panfrost_drm_submit_job() pass all BOs from this set to the SUBMIT ioctl. This means we are now passing all referenced BOs and let the scheduler wait on referenced BO fences if needed. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-07-02 15:00:21 +02:00
Boris Brezillon	3557746e0d	panfrost: Make SLAB pool creation rely on BO helpers There's no point duplicating the code, and it will help us simplify the bo_handles[] filling logic in panfrost_drm_submit_job(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-07-02 14:59:28 +02:00
Boris Brezillon	c684a79669	panfrost: Add the panfrost_drm_{create,release}_bo() helpers To avoid the panfrost_memory <-> panfrost_bo dance done in panfrost_resource_create_bo() and panfrost_bo_unreference(). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-07-02 14:58:51 +02:00
Boris Brezillon	948fddfc42	panfrost: Move the mmap BO logic out of panfrost_drm_import_bo() So we can re-use it for the panfrost_drm_create_bo() function we are about to introduce. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-07-02 14:58:51 +02:00
Boris Brezillon	8d4afcdacc	panfrost: Avoid passing winsys handles to import/export BO funcs Let's keep a clear split between ioctl wrappers and the rest of the driver. All the import BO function needs is a dmabuf FD and the screen object, and the export one should only take care of generating a dmabuf FD out of a BO object. Winsys handle manipulation should stay in the resource.c file. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-07-02 14:58:51 +02:00
Boris Brezillon	aa5bc35f31	panfrost: Move BO meta-data out of panfrost_bo That's what most (all?) implementations seem to do, and my understanding is that a BO is just a bunch of memory that can be used for anything GPU related, not only texture/FB resources. Let's move those meta data in panfrost_resource so we can use panfrost_bo for all kinds of memory allocation and make BO allocation more consistent. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-07-02 14:58:51 +02:00
Boris Brezillon	c4f4193ad4	panfrost: Stop exposing internal panfrost_drm_*() functions panfrost_drm_submit_job() and panfrost_fence_create() are not used outside of pan_drm.c. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-07-02 14:58:51 +02:00
Boris Brezillon	6ba61324f0	panfrost: Get rid of the "free imported BO" logic bo->imported was never set to true which means this path was never taken. Moreover, panfrost_drm_free_imported_bo() is missing the munmap() call which seems wrong because the import BO function calls mmap(). Let's just kill this function along with the ->imported field. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-07-02 14:57:35 +02:00
Boris Brezillon	079aaa9c6d	panfrost: Get rid of the panfrost_driver abstraction leftovers Commit `5f81669d88` ("panfrost: Remove the panfrost_driver abstraction") left a few things behind, remove them now. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-07-02 14:57:35 +02:00
Boris Brezillon	6608642d21	panfrost: Move scanout res creation out of panfrost_resource_create() Which improves readability and helps us avoid a memory leak. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2019-07-02 14:57:35 +02:00
Boris Brezillon	873b7b93e8	panfrost: Add the sampled texture BO to the job Otherwise we get random use-after-{free,unmap} errors. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> --- Changes in v2: - Move the panfrost_job_add_bo() call out of the loop	2019-07-02 14:57:35 +02:00
Samuel Pitoiset	6cc213b3c1	radv: enable DCC for layers on GFX8 It's currently only enabled if dcc_slice_size is equal to dcc_slice_fast_clear_size because the driver assumes that portions of multiple layers are contiguous but it's not always true. Still not supported on GFX9. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-02 09:38:02 +02:00
Samuel Pitoiset	233224c7f7	radv: do not enable DCC for mipmapped arrays because performance is worse Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-02 09:38:00 +02:00
Samuel Pitoiset	e41e575e24	radv: implement clearing DCC layers on GFX8 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-02 09:37:56 +02:00
Samuel Pitoiset	e47c68b7b0	radv: merge radv_dcc_clear_level() into radv_clear_dcc() This will help for clearing DCC arrays because we need to know the subresource range. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-02 09:37:51 +02:00
Samuel Pitoiset	f772fe6a11	radv: add support for decompressing DCC layers with compute Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-02 09:37:49 +02:00
Samuel Pitoiset	83297baf2d	ac: compute the DCC fast clear size per slice on GFX8 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-02 09:37:44 +02:00
Samuel Pitoiset	6517d226ac	ac: compute the size of one DCC slice on GFX8 Addrlib doesn't provide this info. Because DCC is linear, at least on GFX8, it's easy to compute the size of one slice. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-02 09:37:41 +02:00
Kenneth Graunke	457a55716e	iris: Defer closing and freeing VMA until buffers are idle. There will unfortunately be circumstances where we cannot re-use a virtual memory address until it's no longer active on the GPU. To facilitate this, we instead move BOs to a "dead" list, and defer closing them and returning their VMA until they are idle. We periodically sweep these away in cleanup_bo_cache, which triggers every time a new object's refcount hits zero. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Jordan Justen <jordan.l.justen@intel.com>	2019-07-02 07:23:55 +00:00
Kenneth Graunke	07f3455664	iris: Add an explicit alignment parameter to iris_bo_alloc_tiled(). In the future, some images will need to be aligned to a larger value than 4096. Most buffers, however, don't have any such requirement, so for now we only add the parameter to iris_bo_alloc_tiled() and leave the others with the simpler interface. v2: Fix missing alignment in vma_alloc, caught by Caio! Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Jordan Justen <jordan.l.justen@intel.com>	2019-07-02 07:23:55 +00:00
Iago Toral Quiroga	042aeffd5b	v3d: do not flush jobs that are synced with 'Wait for transform feedback' Generally, we achieve this by skipping the flush on calls to v3d_flush_jobs_writing_resource() when we detect that the resource is written in the current job from a transform feedback write. The exception to this is the case where the caller is about to map the resource, in which case we need to flush immediately since we can only emit 'Wait for transform feedback' commands on rendering jobs. We add a parameter to the function so the caller can identify that scenario. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-02 08:57:20 +02:00
Iago Toral Quiroga	88cbc4f7f6	v3d: emit 'Wait for transform feedback' commands when needed The hardware can flush transform feedback writes before reads in the same job by inserting this command. This patch detects when the rendering state for the current draw call reads resources that had been previously written by transform feedback in the same job and inserts the 'Wait for transform feedback' command before emitting the new draw. v2 (Eric): - this was intended to look at job->tf_write_prscs for TF jobs. - clear job->tf_write_prscs after we emit the TF flush. - can skip flushes for fragment shader reads from TF. v3 (Eric): - all resources in job->tf_write_prscs are resources written by TF so we don't need to check if they are bound to PIPE_BIND_STREAM_OUTPUT. - documented optimization opportunity for geometry stages. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-02 08:57:20 +02:00
Iago Toral Quiroga	c7dff0e614	v3d: keep track of resources written by transform feedback The hardware provides a feature to sync reads from previous transform feedback writes in the same job so if we use this mechanism we no longer have to flush the job. In order to identify this scenario we need a mechanism to identify resources that are written by transform feedback. v2: use _mesa_pointer_set_create (Eric) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-02 08:57:20 +02:00
Mike Blumenkrantz	c8dcc308cc	st/dri: fix typo in format table for GR1616 format the dri image format here should match the fourcc format Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-01 15:17:10 -07:00
Mike Blumenkrantz	08fc14a979	st/dri: pass dri2_format_mapping directly to dri2_create_image_from_winsys this makes the entire struct available for use here Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-01 15:16:56 -07:00
Mike Blumenkrantz	2cc85670a7	mesa/st: simplify format usage in st_bind_egl_image the formats handled in the switch statement will always return an unknown mesa format, so process them directly and leave the default case for other/unknown formats no functional changes Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-01 15:16:43 -07:00
Kenneth Graunke	9b1b971491	iris: Use MI_COPY_MEM_MEM for tiny resource_copy_region calls. If our resource_copy_region size is a small number of DWords, then instead of firing up BLORP, we can simply use MI_COPY_MEM_MEM (after a CS stall). We also try and select the optimal batch. Improves performance in Shadow of Mordor on Low settings at 1920x1080 on Skylake GT4e by 0.689096% +/- 0.473968% (n=4). It tries to copy 4 bytes of data to a buffer which was most recently used as a writable compute shader SSBO. Previously we were switching from compute to the render pipeline, then firing up all of blorp_buffer_copy...for 4 bytes. I arbitrarily decided to support 4/8/12/16 bytes. Jason thinks this is about the right threshold where it's cheaper to use MI_COPY_MEM_MEM.	2019-07-01 13:59:49 -07:00
Bas Nieuwenhuizen	d7e6541cc7	radv: Only allocate supplied number of descriptors when variable. Fixes: `b5e04e9217` "radv: Support allocating variable size descriptor sets." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111019 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-07-01 20:53:33 +02:00
Eric Engestrom	177c35bf13	egl: simplify loop Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-07-01 19:35:22 +01:00
Eric Anholt	67ffb853f0	sparc: Reuse m_vector_asm.h. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-01 11:14:29 -07:00
Eric Anholt	20294dceeb	mesa: Enable asm unconditionally, now that gen_matypes is gone. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 11:14:10 -07:00
Eric Anholt	52a39a332f	mesa: Replace gen_matypes with a simple header for V4F/mat layout. We can greatly simplify our builds by just hardcoding GLvector4f and GLmatrix's layouts. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-01 11:12:15 -07:00
Eric Anholt	1738b38ce8	matypes: Drop some unused defines. Most of these haven't been used since the conversion from checked-in matypes to generation. By cutting down the generated contents, this should clarify why the file is generated: we need architecture-specific offsets to the V4F fields in the asm that uses it. v2: Keep matrix offsets to prevent x86 build breakage. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-07-01 11:09:26 -07:00
Eric Engestrom	1835f30097	meson: drop duplicate source & inc_dir These two are already pulled from `idep_vulkan_util_headers`. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-07-01 18:53:57 +01:00
Eric Engestrom	04e0ac59b1	swrast: simplify function pointer calls Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-07-01 18:51:49 +01:00
Eric Engestrom	fbf7c38da3	egl/wayland: use bitset.h for `formats` bit set Currently only 7 formats are supported, but we don't want the 16 limit (it's an `unsigned`) to hit us by surprise :] Let's use bitset.h's BITSET magic to allow us to have any number of formats, with a static assert to make sure we don't forget to update it. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-01 18:35:54 +01:00
Sagar Ghuge	d5f63990b4	intel/tools: Add assembler unit tests for ROL/ROR instructions Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 10:14:22 -07:00
Sagar Ghuge	e9c35dd7cc	intel/tools: Add ROL/ROR support in assembler Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 10:14:22 -07:00
Sagar Ghuge	456557a837	nir: Add lower_rotate flag and set to true in all drivers Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 10:14:22 -07:00
Sagar Ghuge	1e92e83856	intel/compiler: Emit ROR and ROL instruction v2: Reorder patch (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 10:14:22 -07:00
Sagar Ghuge	80117117bd	nir: Add optimization to use ROR/ROL instructions v2: 1) Add more optimization rules for ROL/ROR (Matt Turner) 2) Add lowering rules for ROL/ROR (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 10:14:22 -07:00
Sagar Ghuge	81d342e2a1	nir: Add urol and uror opcodes Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 10:14:22 -07:00
Sagar Ghuge	83fdec0f0d	intel/compiler: Enable the emission of ROR/ROL instructions v2: 1) Drop changes for vec4 backend as on Gen11+ we don't support align16 mode (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 10:14:22 -07:00
Alyssa Rosenzweig	8d74749f81	panfrost: Implement instanced rendering We implement GLES3.0 instanced rendering with full support for instanced arrays (via instance divisors). To do so, we use the new invocation helpers to invoke a triplet of (1, vertex_count, instance_count), rather than simply (1, vertex_count, 1). We rewrite the attribute handling code into a new pan_instancing.c file which handles both the simple LINEAR case for non-instanced as well as each of the new instancing cases: MODULO (for per-vertex attributes), POT and NPOT divisors. As a side effect, we rework how vertex buffers are handled, duplicating them to be 1:1 with vertex descriptors to simplify instancing code paths dramatically. This might be a performance regression, but this remains to be seen; if so, we can always deduplicate later with some added logic in pan_instancing.c Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-01 07:50:57 -07:00
Alyssa Rosenzweig	e9e22546ff	panfrost/decode: Compute padded_num_vertices for MODULO Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-01 07:49:18 -07:00
Alyssa Rosenzweig	9b97ed1250	panfrost/midgard: Emit type appropriate ld_vary Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-01 07:42:56 -07:00
Alyssa Rosenzweig	aa333ac6ad	panfrost/midgard: Add unsigned ld/st ops Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-01 07:42:55 -07:00
Alyssa Rosenzweig	bbc050b82e	panfrost/midgard: Use the appropriate ld_attr type Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-01 07:42:55 -07:00
Alyssa Rosenzweig	c9b164f9b5	panfrost: Implement dispatch helpers Rather than open-coding workgroups_shift_* type fields, we include a general routine for packing the vertex/tiler/compute descriptor based on the provided dispatch parameters. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-01 07:42:55 -07:00
Alyssa Rosenzweig	8fd748de3d	panfrost: Remove ancient comment Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-01 07:42:55 -07:00
Alyssa Rosenzweig	9fe4fd8a9c	panfrost: Extend software tiling to larger bpp Should not affect lima. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-07-01 07:40:19 -07:00
Alyssa Rosenzweig	f2801f7775	panfrost: Rewrite u-interleaving code Rather than using a magic lookup table with no explanations, let's add liberal comments to the code to explain what this tiling scheme is and how to encode/decode it efficiently. It's not so mysterious after all -- just reordering bits with some XORs thrown in. v2: Correct copyright identifier. Fix spelling error. Switch space_4 to a LUT. Fix comment typo. Use LUT instead of space_x tricks. Fallback on generic rather than split up unaligned writes. v3: Correct stride order (fixes crash loading). Correct coordinate system mishap. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>	2019-07-01 07:39:51 -07:00
Rob Clark	02893fe73a	freedreno: update generated registers Corrects the a3xx texconst state for TILE_MODE. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-07-01 06:15:52 -07:00
Samuel Pitoiset	d8b079e4c7	radv: rework how the number of VGPRs is computed Just a cleanup, it shouldn't change anything. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-01 14:59:27 +02:00
Samuel Pitoiset	e3baa54195	radv: gather if a vertex shader needs the instance ID Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-01 14:59:24 +02:00
Samuel Pitoiset	17cb7ea6fc	radv: fix decompressing DCC levels with compute Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-01 14:59:22 +02:00
Samuel Pitoiset	f4d2c47cf6	radv: the number of VGPR_COMP_CNT for GS is expected to be 0 on GFX8 Just move around the switch case. GFX9+ is handled below. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-01 14:59:19 +02:00
Samuel Pitoiset	b4477fa4d4	radv: reduce number of VGPRs for TESS_EVAL if primitive ID is not used We only need 2. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-01 14:59:17 +02:00
Samuel Pitoiset	cc50c85e13	radv: make sure to mark the image as compressed when clearing DCC levels Found while working on DCC for arrays. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-07-01 14:58:56 +02:00
Michel Dänzer	3fd21a6b77	targets/opencl: Add clangASTMatchers library as dependency Fixes link failure since clang r364424 "[clang/DIVar] Emit the flag for params that have unmodified value", clangCodeGen depends on clangASTMatchers now. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-07-01 12:54:40 +02:00
Caio Marcelo de Oliveira Filho	5ad283550b	glsl/nir: Lower buffers using Binding instead of Names When using ARB_gl_spirv, the block names are optional and the uniform blocks are referred using Bindings instead. Teach gl_nir_lower_buffers to handle those. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-30 16:58:27 -05:00
Alejandro Piñeiro	2af2235a32	glspirv: Enable the new deref-base UBO/SSBO path on gl_spirv Among other things, it supports arrays of arrays of UBO/SSBO (default codepath doesn't). Acked-by: Timothy Arceri <tarceri@itsqueeze.com> v2: nir_address_format_vk_index_offset got renamed to nir_address_format_32bit_index_offset (after rebase against master) v3: the ptr_type fields in spirv_to_nir_options got changed to be of type nir_address_format. v4: remove phys_ssbo_addr_format and push_const_addr_format as they are not used by glspirv Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-06-30 16:58:27 -05:00
Alejandro Piñeiro	cae501b394	i965: call to gl_nir_link_uniform_blocks When using a SPIR-V shader. Note that needs to be done before linking uniforms, so when creating the uniform storage entries, block_index could be filled properly (among other things). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-30 16:58:27 -05:00
Alejandro Piñeiro	678140e195	i965: use GLboolean for all brw_link_shader returns The function had a mix of true/GL_TRUE and false/GL_FALSE returns. Using GL_TRUE/GL_FALSE as the function returns a GLboolean. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-30 16:58:27 -05:00
Alejandro Piñeiro	a69a48d65a	nir/linker: update already processed uniforms search for UBOs/SSBOs Until now, we were using the uniform explicit location to check if the current nir variable was already processed while adding entries on the uniform storage. But for UBOs/SSBOs, entries are added too but we lack an explicit location. For those we need to rely on the UBO/SSBO binding and the uniform storage block_index. In that case several uniforms would need to be updated at once. v2: (from Timothy review) * Improve wording and fix typos of some long comments. * Rename update_uniform_storage for mark_stage_as_active v3: (from cmarcelo review) * Fixed some comment typos Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-30 16:58:27 -05:00
Alejandro Piñeiro	de05a6ccf5	nir/linker: fill up uniform_storage with explicit data Specifically, offset, stride (coming from arrays or matrices) and row_major. On GLSL, most of that info is computed using the layout qualifier, but on ARB_gl_spirv they are explicit, and for Mesa, included on the glsl_type. From ARB_gl_spirv spec: "Mapping of layouts std140/std430 -> explicit Offset, ArrayStride, and MatrixStride Decoration on struct members"" "7.6.2.spv SPIR-V Uniform Offsets and Strides The SPIR-V decorations GLSLShared or GLSLPacked must not be used. A variable in the Uniform Storage Class decorated as a Block must be explicitly laid out using the Offset, ArrayStride, and MatrixStride decorations" For offset we needed to include the parent and index_in_parent while processing the type, as the offset is maintained on glsl_struct_field of the parent type, not on the type itself. v2: Fix the default values for MATRIX_STRIDE, ARRAY_STRIDE and ROW_MAJOR when the variable is not backed by a buffer object (Antia Puentes). v3: Update after Jason series "SPIR-V: Use NIR deref instructions for UBO/SSBO access" that included just one explicit stride, instead of a previous patch we wrote that had matrix_stride and array_stride (Alejandro) Signed-off-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-30 16:58:27 -05:00
Alejandro Piñeiro	eb50d1d2a6	nir/linker: use only the array element type for array of ssbo/ubo For these interfaces, the inner members are added only once as uniforms or resources, as opposed to other cases, like a uniform array of structs. For those wondering why an issue (16) from ARB_program_interface_query was used, instead of a quote of the core spec: The core spec is not really clear about how members of arrays of blocks should be enumerated. On GLSL this was also problematic, especially when we were trying to pass the 4.5 CTS tests. See commit "glsl: Fix program interface queries relating to interface blocks" (`4c4d9e4f03`), as a reference. That one also needed to rely on issue (16) to justify the change, pointing out that the core spec needs to be clarified. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-30 16:58:26 -05:00
Alejandro Piñeiro	eec1d5f801	nir/linker: fill is_shader_storage for uniforms Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-30 16:58:26 -05:00
Alejandro Piñeiro	5723919282	nir/linker: add gl_nir_link_uniform_blocks.c Adding the ability to link uniform blocks and shader storage blocks using NIR, intended for ARB_gl_spirv support. Among other things, this linking needs to take into account that everything should work without names, as they may not be present, while the GLSL IR uniform block linking was written with names at its core. The other major difference compared with the GLSL IR linker is that we don't deal with layouts. There are no references to std140, std430, etc. Layouts are expressed through explicit offset, array stride and matrix stride. That simplifies how the buffer sizes are computed, but it also means that we couldn't use the existing methods in glsl_types, so we needed to implement new methods. It is worth noting that this linking iterates over the glsl_types, similarly to what the uniform linking does. A possible future improvement would be to refactor both cases to share more code than they share right now. On GLSL IR there is a class visitor, specialized for each case, for that sharing. As adding a class visitor in C would be more complicated, for now we are just iterating in both. Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Antia Puentes <apuentes@igalia.com> v2: (from Timothy review) * Fix variable name convention * Stop using the _function_name convention * Don't use // for comments * "nir/linker: Keep track of the stages referencing an UBO/SSBO" squashed with this patch v3: (from Caio review) * Don't delete the linked shader on failure * Use rzalloc_array to avoid some explicit initializations Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-06-30 16:58:26 -05:00
Alejandro Piñeiro	39f4ef57d6	nir_types: add glsl_type_is_leaf helper Helper used to know when a glsl_type is a leaf when iterating through a complex type. Note that GLSL IR linking also uses the concept of leaf while doing the same iteration, although in that case it uses a visitor. See link_uniform_blocks, process_array_leaf and others as reference. Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Antia Puentes <apuentes@igalia.com> v2: * Moved from gl_nir_linker to nir_types, so it could be used on nir xfb gathering (Timothy Arceri) * Minor update after Timothy's series about record to struct renaming landed master. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-30 16:58:26 -05:00
Alejandro Piñeiro	0019d61527	glsl/nir: add glsl_types::explicit_size plus nir C wrapper While using SPIR-V shaders (ARB_gl_spirv), layout data is not implicit to a specific value (std140, std430, etc) but explicitly included on the type (explicit values for offset, stride and row_major). So this method is equivalent to the existing std140_size and std430_size, but using such explicit values. Note that the value returned by this method is only valid if such data is set, so when dealing with SPIR-V shaders. v2: (all changes suggested by Jason Ekstrand) * Iterate through all struct members, instead of assume that fields are ordered by offset * Use else if * Take into account the case that explicit_stride > elem_size, to fine graine the final size on arrays and matrices * Handle different bit-sizes in general, not just 32 and 64. v3: (change suggested by Caio Marcelo de Oliveira Filho) * fix up explicit_size() to consider interface types Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-06-30 16:58:26 -05:00
Alejandro Piñeiro	c23522add2	glsl_types: add type::bit_size and glsl_base_type_bit_size helpers Note that the nir_types glsl_get_bit_size is not a wrapper of this one, because for bools at the nir level, we want to return size 1, but at the glsl_types we want to return 32. v2: reuse the new method in order to simplify is_16bit and is_32bit helpers (Timothy) v3: add a comment clarifying the difference between glsl_base_type_bit_size and glsl_get_bit_size. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-30 16:58:26 -05:00
Alejandro Piñeiro	12355c7e91	nir: add is_in_ubo/ssbo/block helpers Equivalent to the already existing ir_variable is_in_buffer_block and is_in_shader_storage_block, adding the uniform buffer object one. I'm using the short forms (ssbo, ubo) to avoid having method names too long. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-30 16:58:26 -05:00
Alejandro Piñeiro	15f134412f	spirv/nir: fill up nir variable info for ubos and ssbo The data for some nir variables is only filled up for some specific modes. We need now too for UBO/SSBO, as such info would be used when linking for OpenGL (ARB_gl_spirv). There is an existing comment just before that code (starts with XXX) that points that binding still needs to be filled up for uniform variables at that point, and that should be fixed, although it doesn't specify why that's a problem or what would be the alternative. For now doing the same for UBO/SSBO, and will hope that the future fixing is done for all of them. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-30 16:58:26 -05:00
Alejandro Piñeiro	7d7ab34d5f	spirv/nir: create nir variable for UBO/SSBO Providing nir variables for UBO/SSBO is not required for Vulkan, but it is needed for OpenGL (ARB_gl_spirv), for example to gather info from the UBO/SSBO while linking. Unlike most cases where a nir variable is created, here the type assigned is the full type (not just the bare type). This is needed because while linking using the nir shader we need the explicit layout info (explicit stride, explicit offset, row_major, etc). Also, we need to assign an interface type, also used by the OpenGL linker, if it is a UBO/SSBO. See ir_variable::is_in_buffer_block as example. v2: assign interface_type to be the variable type, not need to be arrayness (Timothy) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-30 16:58:26 -05:00
Gert Wollny	75d8b4e795	vl: Use CS composite shader only if TEX_LZ and DIV are supported Enable the compute shader compositor only when TEX_LZ is supported by the driver. v2: Also check whether DIV is supported. https://bugs.freedesktop.org/show_bug.cgi?id=110783 Fixes: `9364d66cb7` gallium/auxiliary/vl: Add video compositor compute shader render Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-30 18:41:38 +02:00
Gert Wollny	843723e2f7	gallium: Add CAP for opcode DIV Not all drivers support TGSI_OPCODE_DIV, so we should have a cap to be able to check this. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-30 18:41:35 +02:00
Gert Wollny	187c308b96	vl: replace DIV-ADD with MAD using inverse size Optimize the shader a bit by emitting MAD with the inverse size values instead of DIV+ADD. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-30 18:41:26 +02:00
Jonathan Marek	89381191a9	etnaviv: blt: blit with the original format when possible This fixes BGR565 blit: currently BGRA444 is used for the blit, but with swizzles from the original BGR565 format, so the 4 alpha bits are set to 1. We can't just use the swizzle from the 'compatible' format, since there are cases where BGR<->RGB swap needs to happen. We can avoid all this trouble by using the original formats and only falling back to the 'compatible' format when we need to. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-29 21:49:50 -04:00
Jonathan Marek	a99a265b14	etnaviv: clear all bits for 24bpp depth without stencil For fast clear to happen, all bits must be cleared. This allows using fast clear for 24bpp depth without stencil. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-29 21:49:50 -04:00
Eric Engestrom	74f064ae90	mesa: use binary search for MESA_EXTENSION_OVERRIDE Not a hot path obviously, but the table still has 425 extensions, which you can go through in just 9 steps with a binary search. The table is already sorted, as required by other parts of the code and enforced by mesa's `main-test`. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-30 01:45:36 +01:00
Eric Engestrom	b738d4494c	gitlab-ci: test meson installation Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-29 21:46:37 +00:00
Eric Engestrom	5f9764bc0b	anv: fix indentation Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-29 22:41:06 +01:00
Eric Engestrom	42eb85a9d8	anv: fix typo Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-29 22:41:06 +01:00
Eric Engestrom	38305e6c94	anv: replace hard-coded platform list with vk.xml parse Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-29 22:38:54 +01:00
Chih-Wei Huang	bb75c73e96	android: fix typo LOCAL_EXPORT_C_INCLUDES Should be LOCAL_EXPORT_C_INCLUDE_DIRS. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Tested-by: Mauro Rossi <issor.oruam@gmail.com>	2019-06-29 17:17:49 +02:00
Mauro Rossi	c237654dca	android: virgl: fix generated virgl_driinfo.h building rules Changelog in Android makefile: - Add LOCAL_MODULE_CLASS, intermediates and LOCAL_GENERATED_SOURCES - Use LOCAL_EXPORT_C_INCLUDE_DIRS to export $(intermediates) path - Move generated header rules before 'include $(BUILD_STATIC_LIBRARY)' Fixes the following building error: In file included from external/mesa/src/gallium/targets/dri/target.c:1: external/mesa/src/gallium/auxiliary/target-helpers/drm_helper.h:257:16: fatal error: 'virgl/virgl_driinfo.h' file not found #include "virgl/virgl_driinfo.h" ^~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `cf800998a` ("virgl: Add driinfo file and tie it into the build") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Review-by: Chih-Wei Huang <cwhuang@linux.org.tw>	2019-06-29 16:25:01 +02:00
Lionel Landwerlin	5847de6e9a	intel/compiler: don't use byte operands for src1 on ICL The simulator complains about using byte operands, we also have documentation telling us. Note that add operations on bytes seems to work fine on HW (like ADD). Using dwords operands with CMP & SEL fixes the following tests : dEQP-VK.spirv_assembly.type.vec.i8. v2: Drop the GLK changes (Matt) Add validator tests (Matt) v3: Drop GLK ref (Matt) Don't mix float/integer in MAD (Matt) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1) Reviewed-by: Matt Turner <mattst88@gmail.com> BSpec: 3017 Cc: <mesa-stable@lists.freedesktop.org>	2019-06-29 12:56:09 +00:00
renchenglei	500b45a98a	egl: Enable eglGetPlatformDisplay on Android Platform This helps to add eglGetPlatformDisplay support on Android Platform. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-29 12:20:17 +01:00
Ian Romanick	02c6cd8481	nir/search: Increase maximum commutative expressions from 4 to 8 No shader-db change on any Intel platform. No shader-db run-time difference on a certain 36-core / 72-thread system at 95% confidence (n=20). Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-28 18:56:19 -07:00
Ian Romanick	1a43cf9a40	nir/algebraic: Don't mark expression with duplicate sources as commutative There is no reason to mark the fmul in the expression ('fmul', ('fadd', a, b), ('fadd', a, b)) as commutative. If a source of an instruction doesn't match one of the ('fadd', a, b) patterns, it won't match the other either. This change is enough to make this pattern work: ('~fadd@32', ('fmul', ('fadd', 1.0, ('fneg', a)), ('fadd', 1.0, ('fneg', a))), ('fmul', ('flrp', a, 1.0, a), b)) This pattern has 5 commutative expressions (versus a limit of 4), but the first fmul does not need to be commutative. No shader-db change on any Intel platform. No shader-db run-time difference on a certain 36-core / 72-thread system at 95% confidence (n=20). There are more subpatterns that could be marked as non-commutative, but detecting these is more challenging. For example, this fadd: ('fadd', ('fmul', a, b), ('fmul', a, c)) The first fadd: ('fmul', ('fadd', a, b), ('fadd', a, b)) And this fadd: ('flt', ('fadd', a, b), 0.0) This last case may be easier to detect. If all sources are variables and they are the only instances of those variables, then the pattern can be marked as non-commutative. It's probably not worth the effort now, but if we end up with some patterns that bump up on the limit again, it may be worth revisiting. v2: Update the comment about the explicit "len(self.sources)" check to be more clear about why it is necessary. Requested by Connor. Many Python style / idiom fixes suggested by Dylan. Add missing (!!!) opcode check in Expression::__eq__ method. This bug is the reason the expected number of commutative expressions in the bitfield_reverse pattern changed from 61 to 45 in the first version of this patch. v3: Use all() in Expression::__eq__ method. Suggested by Connor. Revert away from using __eq__ overloads. 
The "equality" implementation of Constant and Variable needed for commutativity pruning is weaker than the one needed for propagating and validating bit sizes. Using actual equality caused the pruning to fail for my ('fmul', ('fadd', 1, a), ('fadd', 1, a)) case. I changed the name to "equivalent" rather than the previous "same_as" to further differentiate it from __eq__. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-06-28 18:56:19 -07:00
Ian Romanick	cae1af4339	nir/search: Log Boolean constants instead of asserting Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-28 18:56:19 -07:00
Ian Romanick	8d6b35fffd	nir/algebraic: Fail build when too many commutative expressions are used Search patterns that are expected to have too many (e.g., the giant bitfield_reverse pattern) can be added to a white list. This would have saved me a few hours debugging. :( v2: Implement the expected-failure annotation as a property of the search-replace pattern instead of as a property of the whole list of patterns. Suggested by Connor. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-06-28 18:56:19 -07:00
Ian Romanick	57704b8d22	nir/algebraic: Fix whitespace error Trivial Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-06-28 18:56:19 -07:00
Alyssa Rosenzweig	f8fca4fe61	panfrost: Allow R11G11B10 rendering Doesn't fully work yet, but better than crashing. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-28 18:48:13 -07:00
Alyssa Rosenzweig	7692ad19fb	panfrost: Default to util_pack_color for clears This might help as we bring up more render-target formats. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-28 18:48:13 -07:00
Ian Romanick	b04beaf41d	intel/vec4: Try both sources as candidates for being immediates For some reason, when I first wrote try_immediate_source, I thought the sources had already been ordered so that the immediate value was the second source. That's rubbish. The generator assumes neither source is immediate, and it relies on later copy/constant propagation passes to do the reordering. For this reason, the changes to try_immediate_source have to go to some efforts to reorder the operands and tell the caller when it reordered them. The generator for comparison instructions uses this to determine when the comparison needs to change (e.g., from GT to LT). No changes on any Gen8 or later platform because those platforms do not use the vec4 backend. Haswell total instructions in shared programs: 13484431 -> 13480500 (-0.03%) instructions in affected programs: 441138 -> 437207 (-0.89%) helped: 1883 HURT: 0 helped stats (abs) min: 1 max: 49 x̄: 2.09 x̃: 1 helped stats (rel) min: 0.07% max: 8.91% x̄: 1.10% x̃: 0.90% 95% mean confidence interval for instructions value: -2.19 -1.98 95% mean confidence interval for instructions %-change: -1.14% -1.06% Instructions are helped. total cycles in shared programs: 376420286 -> 376406400 (<.01%) cycles in affected programs: 15995668 -> 15981782 (-0.09%) helped: 1692 HURT: 219 helped stats (abs) min: 2 max: 764 x̄: 13.78 x̃: 4 helped stats (rel) min: <.01% max: 9.69% x̄: 0.69% x̃: 0.35% HURT stats (abs) min: 2 max: 516 x̄: 43.09 x̃: 22 HURT stats (rel) min: 0.02% max: 12.09% x̄: 2.30% x̃: 1.13% 95% mean confidence interval for cycles value: -9.70 -4.83 95% mean confidence interval for cycles %-change: -0.42% -0.28% Cycles are helped. 
total spills in shared programs: 23166 -> 23158 (-0.03%) spills in affected programs: 66 -> 58 (-12.12%) helped: 2 HURT: 0 total fills in shared programs: 34592 -> 34580 (-0.03%) fills in affected programs: 75 -> 63 (-16.00%) helped: 2 HURT: 0 Ivy Bridge total instructions in shared programs: 12051590 -> 12048513 (-0.03%) instructions in affected programs: 355911 -> 352834 (-0.86%) helped: 1481 HURT: 0 helped stats (abs) min: 1 max: 12 x̄: 2.08 x̃: 1 helped stats (rel) min: 0.07% max: 4.92% x̄: 1.08% x̃: 0.90% 95% mean confidence interval for instructions value: -2.17 -1.98 95% mean confidence interval for instructions %-change: -1.12% -1.04% Instructions are helped. total cycles in shared programs: 180319624 -> 180307642 (<.01%) cycles in affected programs: 15591028 -> 15579046 (-0.08%) helped: 1340 HURT: 174 helped stats (abs) min: 2 max: 764 x̄: 14.19 x̃: 2 helped stats (rel) min: <.01% max: 8.68% x̄: 0.64% x̃: 0.32% HURT stats (abs) min: 2 max: 518 x̄: 40.41 x̃: 14 HURT stats (rel) min: 0.02% max: 8.37% x̄: 1.59% x̃: 0.67% 95% mean confidence interval for cycles value: -10.85 -4.97 95% mean confidence interval for cycles %-change: -0.45% -0.31% Cycles are helped. All Gen6 and earlier platforms had similar results. (Sandy Bridge shown) total instructions in shared programs: 10863159 -> 10861462 (-0.02%) instructions in affected programs: 157839 -> 156142 (-1.08%) helped: 715 HURT: 0 helped stats (abs) min: 1 max: 12 x̄: 2.37 x̃: 2 helped stats (rel) min: 0.23% max: 4.33% x̄: 1.07% x̃: 0.85% 95% mean confidence interval for instructions value: -2.53 -2.21 95% mean confidence interval for instructions %-change: -1.13% -1.02% Instructions are helped. 
total cycles in shared programs: 153957782 -> 153948778 (<.01%) cycles in affected programs: 3171648 -> 3162644 (-0.28%) helped: 696 HURT: 62 helped stats (abs) min: 2 max: 390 x̄: 15.72 x̃: 4 helped stats (rel) min: 0.02% max: 10.57% x̄: 0.57% x̃: 0.12% HURT stats (abs) min: 2 max: 300 x̄: 31.29 x̃: 2 HURT stats (rel) min: 0.11% max: 7.23% x̄: 0.83% x̃: 0.34% 95% mean confidence interval for cycles value: -15.65 -8.11 95% mean confidence interval for cycles %-change: -0.56% -0.36% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-28 18:13:18 -07:00
Ian Romanick	379cf3bb87	intel/vec4: Try immediate sources for dot products too No changes on any Gen8 or later platform because those platforms do not use the vec4 backend. All Haswell and earlier platforms had similar results. (Haswell shown) total instructions in shared programs: 13484467 -> 13484431 (<.01%) instructions in affected programs: 8540 -> 8504 (-0.42%) helped: 33 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.09 x̃: 1 helped stats (rel) min: 0.31% max: 1.53% x̄: 0.49% x̃: 0.35% 95% mean confidence interval for instructions value: -1.19 -0.99 95% mean confidence interval for instructions %-change: -0.60% -0.38% Instructions are helped. total cycles in shared programs: 376420572 -> 376420286 (<.01%) cycles in affected programs: 56260 -> 55974 (-0.51%) helped: 26 HURT: 5 helped stats (abs) min: 2 max: 204 x̄: 11.85 x̃: 2 helped stats (rel) min: 0.11% max: 3.08% x̄: 0.39% x̃: 0.13% HURT stats (abs) min: 2 max: 6 x̄: 4.40 x̃: 6 HURT stats (rel) min: 0.03% max: 0.35% x̄: 0.24% x̃: 0.35% 95% mean confidence interval for cycles value: -22.91 4.45 95% mean confidence interval for cycles %-change: -0.56% -0.02% Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-28 17:16:16 -07:00
Ian Romanick	eeebeb211f	intel/vec4: Try emitting non-scalar immediates Sometimes an instruction has a vector as a source, but all of the components have the same value. For example, vec3 32 ssa_16 = load_const (1.0, 1.0, 1.0) ... vec3 32 ssa_82 = fadd ssa_16, -ssa_81.xyz No changes on any Gen8 or later platform because those platforms do not use the vec4 backend. Haswell total instructions in shared programs: 13487811 -> 13484467 (-0.02%) instructions in affected programs: 421981 -> 418637 (-0.79%) helped: 1859 HURT: 0 helped stats (abs) min: 1 max: 15 x̄: 1.80 x̃: 1 helped stats (rel) min: 0.04% max: 9.80% x̄: 1.04% x̃: 0.84% 95% mean confidence interval for instructions value: -1.85 -1.74 95% mean confidence interval for instructions %-change: -1.07% -1.00% Instructions are helped. total cycles in shared programs: 376423252 -> 376420572 (<.01%) cycles in affected programs: 14800970 -> 14798290 (-0.02%) helped: 1519 HURT: 329 helped stats (abs) min: 2 max: 462 x̄: 10.59 x̃: 4 helped stats (rel) min: 0.03% max: 16.73% x̄: 0.79% x̃: 0.36% HURT stats (abs) min: 2 max: 598 x̄: 40.74 x̃: 16 HURT stats (rel) min: <.01% max: 10.32% x̄: 2.56% x̃: 0.98% 95% mean confidence interval for cycles value: -3.53 0.63 95% mean confidence interval for cycles %-change: -0.30% -0.09% Inconclusive result (value mean confidence interval includes 0). total fills in shared programs: 34601 -> 34592 (-0.03%) fills in affected programs: 91 -> 82 (-9.89%) helped: 9 HURT: 0 Ivy Bridge total instructions in shared programs: 12053565 -> 12051626 (-0.02%) instructions in affected programs: 298103 -> 296164 (-0.65%) helped: 1228 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 1.58 x̃: 1 helped stats (rel) min: 0.04% max: 3.57% x̄: 0.91% x̃: 0.81% 95% mean confidence interval for instructions value: -1.63 -1.53 95% mean confidence interval for instructions %-change: -0.95% -0.88% Instructions are helped. 
total cycles in shared programs: 180322270 -> 180319922 (<.01%) cycles in affected programs: 14123840 -> 14121492 (-0.02%) helped: 1036 HURT: 195 helped stats (abs) min: 2 max: 462 x̄: 11.93 x̃: 2 helped stats (rel) min: 0.03% max: 14.05% x̄: 0.82% x̃: 0.35% HURT stats (abs) min: 2 max: 598 x̄: 51.33 x̃: 16 HURT stats (rel) min: <.01% max: 9.68% x̄: 3.02% x̃: 0.72% 95% mean confidence interval for cycles value: -4.92 1.10 95% mean confidence interval for cycles %-change: -0.35% -0.07% Inconclusive result (value mean confidence interval includes 0). Sandy Bridge total instructions in shared programs: 10864286 -> 10863189 (-0.01%) instructions in affected programs: 159722 -> 158625 (-0.69%) helped: 724 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 1.52 x̃: 1 helped stats (rel) min: 0.10% max: 2.91% x̄: 0.79% x̃: 0.62% 95% mean confidence interval for instructions value: -1.58 -1.46 95% mean confidence interval for instructions %-change: -0.82% -0.75% Instructions are helped. total cycles in shared programs: 153967938 -> 153957926 (<.01%) cycles in affected programs: 1923186 -> 1913174 (-0.52%) helped: 654 HURT: 56 helped stats (abs) min: 2 max: 170 x̄: 20.00 x̃: 4 helped stats (rel) min: 0.03% max: 11.82% x̄: 0.89% x̃: 0.18% HURT stats (abs) min: 2 max: 390 x̄: 54.75 x̃: 32 HURT stats (rel) min: 0.05% max: 6.92% x̄: 3.09% x̃: 2.92% 95% mean confidence interval for cycles value: -17.42 -10.78 95% mean confidence interval for cycles %-change: -0.76% -0.40% Cycles are helped. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8142677 -> 8141721 (-0.01%) instructions in affected programs: 139511 -> 138555 (-0.69%) helped: 588 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 1.63 x̃: 1 helped stats (rel) min: 0.21% max: 4.39% x̄: 0.84% x̃: 0.46% 95% mean confidence interval for instructions value: -1.70 -1.55 95% mean confidence interval for instructions %-change: -0.89% -0.78% Instructions are helped. 
total cycles in shared programs: 188549394 -> 188547676 (<.01%) cycles in affected programs: 3171960 -> 3170242 (-0.05%) helped: 527 HURT: 0 helped stats (abs) min: 2 max: 18 x̄: 3.26 x̃: 2 helped stats (rel) min: <.01% max: 0.80% x̄: 0.08% x̃: 0.06% 95% mean confidence interval for cycles value: -3.49 -3.03 95% mean confidence interval for cycles %-change: -0.09% -0.07% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-28 17:16:06 -07:00
Eric Anholt	8fd8964302	nir: Fix lowering of bitfield_insert to shifts. The bfi/bfm behavior change replaced the bfi/bfm usage in lower_bitfield_insert_to_shifts with actual shifts like the name says, but it failed to handle the offset=0, bits==32 case in the new lowering. v2: Use 31 < bits instead of bits == 32, to get the 31 < (iand bits, 31) -> false optimization. Fixes regressions in dEQP-GLES31.bitfield_insert on freedreno. Fixes: `165b7f3a44` ("nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec.") Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-06-28 16:38:23 -07:00
Dylan Baker	97c2c4546c	Revert "meson: Add support for using cmake for finding LLVM" This reverts commit `5157a42765`. There is a meson bug that causes llvm to always be statically linked, which is obviously not what we want. I haven't had time to look into it yet, but for now let's just revert it.	2019-06-28 16:36:38 -07:00
Dylan Baker	69f9fbab8a	Revert "meson: try to use cmake as a finder for clang" This reverts commit `0ba0c0c15c`.	2019-06-28 16:36:27 -07:00
Eric Engestrom	78aa4a3c0a	mesa: stop trying new filenames if the filename existing is not the issue Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 23:37:49 +01:00
Eric Engestrom	d02d2b626b	mesa: use os_file_create_unique() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 23:37:49 +01:00
Eric Engestrom	1b259f1ae7	util: add os_file_create_unique() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 23:37:49 +01:00
Alyssa Rosenzweig	9de4325b27	panfrost: Disable DXT-style texture compression Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-28 15:10:05 -07:00
Alyssa Rosenzweig	e8ae998c1b	panfrost: Dump unknown formats before aborting Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-28 15:10:05 -07:00
Alyssa Rosenzweig	68a5b58fb9	panfrost/midgard: Fix 3D texture regression Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-28 15:10:05 -07:00
Alyssa Rosenzweig	601d4d3157	panfrost: Add some special formats Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-28 15:10:05 -07:00
Alyssa Rosenzweig	e32af4b5c3	panfrost/midgard: Implement integer sampler Turns out one of the magic bits in the texture instruction meant 'float'. Different magic bits mean int and uint then :) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-28 15:09:07 -07:00
Alyssa Rosenzweig	7d30000628	panfrost: Remove dubious assert We already can support texture formats with bpp > 4, so.. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-28 15:09:07 -07:00
Alyssa Rosenzweig	7f5481258c	panfrost: Implement primitive restart For GLES3, just pass the flag through. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-28 15:09:07 -07:00
Anuj Phogat	804d1bd111	i965/icl: Apply WA_1606682166 to compute workloads We missed the workaround for compute workloads in earlier patches. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-28 14:02:13 -07:00
Anuj Phogat	d96cba7754	Revert "iris/icl: Add WA_2204188704 to disable pixel shader panic dispatch" SLICE_COMMON_CHICKEN3 is a privileged register not accessible from userspace. This patch silences a simulator warning about it. We don't need to add this workaround in the Linux kernel as the WA description says it's fixed on the latest stepping. This reverts commit `9c421d6b47`. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-28 14:02:13 -07:00
Anuj Phogat	387e43b52f	Revert "anv/icl: Add WA_2204188704 to disable pixel shader panic dispatch" SLICE_COMMON_CHICKEN3 is a privileged register not accessible from userspace. This patch silences a simulator warning about it. We don't need to add this workaround in the Linux kernel as the WA description says it's fixed on the latest stepping. This reverts commit `2be60e0c73`. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-28 14:02:13 -07:00
Anuj Phogat	7746d4edef	Revert "i965/icl: Add WA_2204188704 to disable pixel shader panic dispatch" SLICE_COMMON_CHICKEN3 is a privileged register not accessible from userspace. This patch silences a simulator warning about it. We don't need to add this workaround in the Linux kernel as the WA description says it's fixed on the latest stepping. This reverts commit `85ecd14ef6`. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-28 14:02:13 -07:00
Anuj Phogat	db093d028c	i965/icl: Fix WA_1606682166 An earlier change was setting the SamplerCount = 0 for Gen 11 under #if GEN_GEN < 7. This commit fixes the problem. This WA has also been added to the linux kernel. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-28 14:02:13 -07:00
Rob Clark	9753d7381c	freedreno/ir3: small cleanup `target` cannot be NULL here. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-28 13:02:59 -07:00
Rob Clark	016a9ab2f9	freedreno/ir3: fix missing (ss) in dummy bary.f case In case we need to insert a dummy bary.f for the (ei) flag, it also needs (ss) so we don't release varying storage to the next VS wave before the ldlv completed. Fixes random failures in: dEQP-GLES3.functional.transform_feedback.random.interleaved.lines.* Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-28 13:02:59 -07:00
Rob Clark	21beddd3bc	freedreno/a6xx: wire up dither state Fixes: dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_stencil_index8 dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.rebind_rbo_rgba4_depth_component16 dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.no_rebind_rbo_rgba4_depth_component16 dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.rebind_rbo_rgba4_stencil_index8 dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.no_rebind_rbo_rgba4_stencil_index8 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-28 13:02:59 -07:00
Arfrever Frehtes Taifersar Arahesis	b120a02b21	meson: Improve detection of Python when using Meson >=0.50. Previously, on systems where multiple versions of Python 3 (e.g. 3.6 and 3.7) are installed, the wrong version of Python 3 could have been used. The proper fix requires availability of the path() method in Meson's python module, which has been added in Meson 0.50: https://github.com/mesonbuild/meson/pull/4616 Distro Bug: https://bugs.gentoo.org/671308 Signed-off-by: Arfrever Frehtes Taifersar Arahesis <Arfrever@Apache.Org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> v2: - Add missing `endif` keyword (Dylan)	2019-06-28 12:51:21 -07:00
Pierre-Eric Pelloux-Prayer	c81c784a4a	radeon/uvd: fix calc_ctx_size_h265_main10 Left shift was applied twice. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110702 Reviewed-by: Leo Liu <leo.liu@amd.com> Tested-by: <irherder@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Cc: <mesa-stable@lists.freedesktop.org>	2019-06-28 15:44:48 -04:00
Pierre-Eric Pelloux-Prayer	1f7d8f9786	mesa: add display list support for gl(Compressed)TextureSubImage2DEXT Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:41:35 -04:00
Pierre-Eric Pelloux-Prayer	360ef82765	mesa: add glTextureParameteri/iv/f/fvEXT Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:41:34 -04:00
Pierre-Eric Pelloux-Prayer	29194648a6	mesa: extend _mesa_lookup_or_create_texture to support EXT_dsa Adds a boolean to implement EXT_dsa specifics. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:41:32 -04:00
Pierre-Eric Pelloux-Prayer	274104ec38	mesa: refactor bind_texture Splits texture lookup and binding actions. The new _mesa_lookup_or_create_texture will be useful to implement the EXT_direct_state_access extension. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:41:30 -04:00
Pierre-Eric Pelloux-Prayer	6535964fdf	mesa: extract helper function for glTexParameter* Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:41:29 -04:00
Pierre-Eric Pelloux-Prayer	32eefb7451	mesa: add buffer != 0 checks to glNamedBufferEXT functions The EXT_direct_state_access spec says: INVALID_OPERATION is generated by GetNamedBufferParameterivEXT, GetNamedBufferPointervEXT, GetNamedBufferSubDataEXT, MapNamedBufferEXT, NamedBufferDataEXT, NamedBufferSubDataEXT, and UnmapNamedBufferEXT if the buffer parameter is zero. This commit adds buffer != 0 validation to the implemented functions. glNamedBufferStorageEXT isn't included in this list and EXT_buffer_storage doesn't say that buffer = 0 is an error either, so I didn't add the same validation for this function. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:41:26 -04:00
Marek Olšák	0de2754aa7	mesa: fix a typo in map_named_buffer_range Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:41:25 -04:00
Timothy Arceri	9c53a2ecb7	mesa: add support for glMapNamedBufferEXT() Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:41:24 -04:00
Timothy Arceri	76e25edf6a	mesa: add support for glUnmapNamedBufferEXT() Since the ARB DSA function glUnmapNamedBuffer() is only exposed for 3.1 or above we make glUnmapNamedBuffer() an alias of glUnmapNamedBufferEXT() rather than the other way around. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:41:21 -04:00
Timothy Arceri	b5f930ea05	mesa: add support for glCompressedTextureSubImage2DEXT() Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:41:20 -04:00
Timothy Arceri	b82b3d28d3	mesa: add support for glTextureSubImage2DEXT() Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:41:19 -04:00
Timothy Arceri	cb0f25a926	mesa: add support for glMapNamedBufferRangeEXT() Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:41:16 -04:00
Timothy Arceri	eec5c01b5e	mesa: add support for glNamedBufferStorageEXT This is available in ARB_buffer_storage when EXT_direct_state_access is present. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:41:14 -04:00
Timothy Arceri	83ed9485b7	mesa: add support for glNamedBuffer*DataEXT() Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:41:12 -04:00
Timothy Arceri	0972b0b059	mesa: add support for glBindMultiTextureEXT Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:40:54 -04:00
Pierre-Eric Pelloux-Prayer	c37f03d464	mesa: delete framebuffer texture attachment sampler views When a context is destroyed the destroy_tex_sampler_cb makes sure that all the sampler views created by that context are destroyed. This is done by walking the ctx->Shared->TexObjects hash table. In a multiple context environment the texture can be deleted by a different context, so it will be removed from the TexObjects table, which prevents the above mechanism from working. This can result in an assertion in st_save_zombie_sampler_view because the sampler_view owns a reference to a destroyed context. This issue occurs in blender 2.80. This commit fixes this by explicitly releasing the sampler_view created by the destroyed context for all texture attachments. Fixes: `593e36f956` (st/mesa: implement "zombie" sampler views (v2)) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110944 Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-28 15:29:08 -04:00
James Clarke	7389bf9761	meson: GNU/kFreeBSD has DRM/KMS and requires -D_GNU_SOURCE This is a regression from the old autotools build system. Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2019-06-28 19:06:46 +00:00
Kenneth Graunke	65e0c4b64f	gallium/u_transfer_helper: Don't leak a reference to the resource. We pipe_resource_reference when handling transfers in map, we need to do a corresponding unreference in unmap. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2019-06-28 11:25:56 -07:00
Eric Engestrom	6227e6faee	meson: only add empty lines between active summary sections Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-06-28 19:15:18 +01:00
Eric Engestrom	5819bc0e5c	meson: bump required libdrm version to 2.4.81 `dbb4457d98` started using drmDevicesEqual(), which was introduced in libdrm 2.4.81. We could either copy the function locally, or bump the required version. Since the function is non-trivial and 2.4.81 is old enough already, I suggest the latter. Fixes: `dbb4457d98` ("egl: add EGL_EXT_device_drm support") Cc: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-28 19:03:04 +01:00
Emil Velikov	4ec32413f3	ac: change ac_query_gpu_info() signature Currently libdrm_amdgpu provides a typedef of the various handles. While the goal was to make those opaque, it effectively became part of the API. To the best of my knowledge there are two ways to have opaque handles: - "typedef void foo;" - rather messy IMHO - "struct foo;" and use "struct foo " through the API In our case amdgpu_device_handle is used only internally, plus respective code is not used or applicable for r300 and r600. Hence we copied the typedef. Seemingly this will be a problem since libdrm_amdgpu wants to change the API, while not updating the code(?). Either way, we can safely s/amdgpu_device_handle/void */ and carry on. Cc: Michel Dänzer <michel@daenzer.net> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak at amd.com>	2019-06-28 17:49:32 +01:00
Tomeu Vizoso	7c745f6148	panfrost: Only tag AFBC addresses when sampling Rendering to AFBC was broken, as the HW will complain loudly if we pass a tagged pointer in bifrost_render_target. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Fixes: `3609b50a64` ("panfrost: Merge AFBC slab with BO backing") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-28 15:40:52 +02:00
Jose Fonseca	3573412981	gallivm: Improve lp_build_rcp_refine. Use the alternative more accurate expression from https://en.wikipedia.org/wiki/Division_algorithm#Newton%E2%80%93Raphson_division v2: Use lp_build_fmuladd as suggested by Roland Tested by enabling this code path, and running lp_test_arit. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-06-28 11:48:12 +01:00
Tomeu Vizoso	0ec8a292fb	panfrost/ci: Don't error out on RK3288 At the moment we don't have enough people to ensure that RK3288 is regression-free, so don't fail the CI in that case. For now we'll focus on not regressing on RK3399 and we can expand to other SoCs as more people join the effort. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Suggested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-28 11:13:04 +02:00
Tomeu Vizoso	6a26d6f4d9	panfrost/ci: Don't print every kernel file As there's lots of them and Gitlab struggles rendering logs with so many lines. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-28 11:13:04 +02:00
Tomeu Vizoso	61b793dde4	panfrost/ci: Fix the image name These changes will make sure we get the right image from the container registry. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-28 11:13:04 +02:00
Tomeu Vizoso	0315350d9e	panfrost/ci: Remove batching Panfrost has grown and doesn't leak as much as before. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-28 11:13:04 +02:00
Kenneth Graunke	847ef8ee4f	iris: Don't leak resources in iris_create_surface for incomplete FBOs We were failing to pipe_resource_unreference on the failure path due to a non-renderable format. Instead of fixing this, just move the checks earlier, before we even bother with refcounting or calloc.	2019-06-28 01:13:11 -07:00
Samuel Pitoiset	ef1787dbc9	radv: only enable VK_AMD_gpu_shader_{half_float,int16} on GFX9+ These two extensions are supported on GFX8 but the throughput of 16-bit floats/integers is the same as for 32-bit. Also, shaderInt16 is only enabled on GFX9+ for the same reason, so be more consistent. This fixes a crash with Wolfenstein II because it expects shaderInt16 to be enabled when VK_AMD_gpu_shader_half_float is exposed. Note that AMDVLK only enables these extensions on GFX9+. Cc: 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-28 08:40:44 +02:00
Samuel Pitoiset	5d6d29ed5d	radv: add si_emit_ia_multi_vgt_param() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-28 08:40:42 +02:00
Alexandros Frantzis	7da90a7cc9	virgl: Don't allow creating staging pipe_resources Staging buffers are now created directly by the virgl_staging_mgr. We don't need to support creating staging pipe_resources. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-28 04:30:02 +00:00
Alexandros Frantzis	5388be039b	virgl: Use virgl_staging_mgr Use an instance of virgl_staging_mgr instead of u_upload_mgr to handle the staging buffer. This removes the need to track the availability of the staging manager, since virgl_staging_mgr can handle concurrent active allocations. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-28 04:30:02 +00:00
Alexandros Frantzis	790d1a0b17	virgl: Add tests for virgl_staging_mgr Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-28 04:30:02 +00:00
Alexandros Frantzis	55a58dfcfb	virgl: Introduce virgl_staging_mgr Add a manager for the staging buffer used in virgl. The staging manager is heavily inspired by u_upload_mgr, but is simpler and is a better fit for virgl's purposes. In particular, the staging manager: * Allows concurrent staging allocations. * Calls the virgl winsys directly to create and map resources, avoiding unnecessarily going through gallium resources and transfers. olv: make virgl_staging_alloc_buffer return a bool Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-28 04:30:02 +00:00
Alexandros Frantzis	6a03f25522	virgl: Store the virgl_hw_res for copy transfers Store the virgl_hw_res instead of the pipe_resource for copy transfer sources. This prepares the codebase for a change to provide only the virgl_hw_res for the staging buffers in upcoming commits. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-28 04:30:02 +00:00
Kenneth Graunke	bed305fb7a	iris: Fix major resource leak in iris_set_shader_images We were failing to unreference the old image resource. Instead of open coding this and doing it badly, just use the copier function which does the right thing.	2019-06-27 19:08:46 -07:00
Kenneth Graunke	255c71ec07	gallium: Make util_copy_image_view handle shader_access A while back, we added a new field, but failed to update the copier. I believe iris is the only current user of the new field, and it hasn't used the copier, so no one noticed. Fixes: `8b626a22b2` st/mesa: Record shader access qualifiers for images Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-27 19:06:19 -07:00
Kenneth Graunke	0d6fc6f07e	gallium: Teach GALLIUM_REFCNT_LOG about array textures Otherwise they are classified as pipe_martian_resource, and don't contain any helpful information about the texture. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-27 16:56:15 -07:00
Nanley Chery	02f6995d76	isl: Don't align phys_level0_sa by block dimension Aligning phys_level0_sa by the compression block dimension prior to mipmap layout causes the layout of compressed surfaces to differ from the sampler's expectations in certain cases. The hardware docs agree: From the BDW PRM, Vol. 5, Compressed Mipmap Layout, The compressed mipmaps are stored in a similar fashion to uncompressed mipmaps [...] The following exceptions apply to the layout of compressed (vs. uncompressed) mipmaps: * [...] * The dimensions of the mip maps are first determined by applying the sizing algorithm presented in Non-Power-of-Two Mipmaps above. Then, if necessary, they are padded out to compression block boundaries. The last bullet indicates that alignment should not be done for calculating a miplevel's dimensions, but rather for determining miplevel placement/padding. Comply with this text by removing the extra alignment. Fixes some fbo-generatemipmap-formats piglit failures on all tested platforms (SNB-KBL). v2: - Note fixed platforms. - Update some consumers via a helper function. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-27 23:38:38 +00:00
Nanley Chery	fb1350c76f	intel: Add and use helpers for level0 extent Prepare for a bug fix by adding and using helpers which convert isl_surf::logical_level0_px and isl_surf::phys_level0_sa to units of surface elements. v2: - Update iris (Ken). - Update anv. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-27 23:38:37 +00:00
Dylan Baker	0ba0c0c15c	meson: try to use cmake as a finder for clang Clang (like LLVM), very annoyingly refuses to provide pkg-config, and only provides cmake (unlike LLVM which at least provides llvm-config, even if llvm-config is terrible). Meson has gained the ability to use cmake to find dependencies, and can successfully find Clang. This change attempts to use cmake to find clang instead of a bunch of library searches, when paired with -Dcmake_prefix_path we can much more reliably use cmake to control which clang we're getting. This is only enabled for meson >= 0.51, which adds the required options. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-27 22:12:02 +00:00
Dylan Baker	5157a42765	meson: Add support for using cmake for finding LLVM Meson has support for using cmake as a finder for some dependencies, including LLVM. Using cmake has a lot of advantages: it needs less meson maintenance to keep working (even for llvm updates); it works more sanely for cross compiles (as llvm-config is a compiled binary not a shell script). Meson 0.51.0 also has a new generic variable getter that can be used to get information from either cmake, pkg-config, or config-tools dependencies, which is needed for cmake. We continue to support using llvm-config if you don't have cmake installed, or if cmake cannot find a suitable version. Fixes: `0d59459432` ("meson: Force the use of config-tool for llvm") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-27 22:12:02 +00:00
Kenneth Graunke	3d3685d354	iris: Fix memory leak of SO targets We need to pitch these on context destroy.	2019-06-27 14:59:39 -07:00
Kenneth Graunke	d65819f054	iris: Fix memory leak for draw parameter resources Need to pitch these on context destroy.	2019-06-27 14:59:39 -07:00
Kenneth Graunke	50eb1c1396	iris: Drop u_upload_unmap We use persistent maps so this does nothing.	2019-06-27 14:59:39 -07:00
Lionel Landwerlin	836225840c	intel/compiler: fix derivative on y axis implementation This rewrites the ddy in EXECUTE_4 mode with a loop to make it more obvious what is going on and also sets the group each of the 4 threads in the groups are supposed to execute. Fixes the following CTS tests : dEQP-VK.glsl.derivate.dfdyfine.dynamic_* Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Co-Authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `2134ea3800` ("intel/compiler/fs: Implement ddy without using align16 for Gen11+")	2019-06-27 18:14:58 +00:00
Eric Engestrom	53f17c4efd	meson: set up a proper internal dependency for xmlconfig Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-06-27 17:42:25 +00:00
Eric Engestrom	ad0ee5bfa5	xmlconfig: add missing #include Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-06-27 17:42:25 +00:00
Eric Engestrom	069e6d587e	xmlpool: fix typo in comment s/otions/options/, and while here let's give the full path to xmlpool.h since `../` won't be true in the generated file. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-06-27 17:42:25 +00:00
Kenneth Graunke	d6683e118f	iris: Also properly restore INTERFACE_DESCRIPTOR_DATA buffer object We were at least cleaning up this reference, but we were failing to pin it in iris_restore_compute_saved_bos.	2019-06-27 08:12:22 -07:00
Kenneth Graunke	340df53d6a	iris: Fix resource tracking for CS thread ID buffer Today, we stream the compute shader thread IDs simply because they're (annoyingly) relative to dynamic state base address. We could upload them once at compile time, but we'd need a separate non-streaming uploader for IRIS_MEMZONE_DYNAMIC, and I'm not sure it's worth it. stream_state pins the buffer for use in the current batch, but also returns a reference to the pipe_resource. We dropped this reference on the floor, leaking a reference basically every time we dispatched a compute shader after switching to a new one. The reason it returns a reference is so that we can hold on to it and re-pin it in iris_restore_compute_saved_bos, which we were also failing to do. So if we actually filled up a batch with repeated dispatches to the same compute shader, and flushed, then continued dispatching, we would fail to pin it and likely GPU hang.	2019-06-27 08:12:22 -07:00
Kenneth Graunke	16d334951e	iris: Only bother with thread ID upload if doing MEDIA_CURBE_LOAD We were unconditionally uploading the new data, but then conditionally using it with MEDIA_CURBE_LOAD. If we're not going to emit the command, there's no point in uploading the data.	2019-06-27 08:12:22 -07:00
Kenneth Graunke	8f51f1ba6e	iris: Do MEDIA_CURBE_LOAD when IRIS_DIRTY_CS is set, not constants We only push the compute shader thread IDs, not any actual constant buffer data. So we should track the compute shader variant changing, not constbuf changes.	2019-06-27 08:12:22 -07:00
Kenneth Graunke	85c72da1b1	iris: Drop UBO range stuff from iris_restore_compute_saved_bos Compute doesn't use UBO ranges (annoyingly), so this is dead code.	2019-06-27 08:12:22 -07:00
Kenneth Graunke	f94ebf0c9d	iris: Properly align interface descriptor data addresses MEDIA_INTERFACE_DESCRIPTOR's Interface Descriptor Data Start Address field's docs say: "This bit specifies the 64-byte aligned address..." And we were doing 32. Superfluous thread ID uploading was apparently saving us from GPU hangs in most cases.	2019-06-27 08:12:22 -07:00
Andrii Simiklit	62c6059584	mesa: use a correct function return type v2: standard 'bool' can be used ( Eric Engestrom <eric.engestrom@intel.com> ) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2019-06-27 07:53:41 +00:00
Tomeu Vizoso	9bef1f1ff1	panfrost/decode: Mention the address of a few descriptors When the fault_pointer field in the header is set, we can get some idea of which descriptor the HW isn't happy with if we know their addresses. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-27 09:13:48 +02:00
Tomeu Vizoso	de02fb19ed	panfrost/decode: Wait for a job to finish before dumping Then we can get some information back about any exception that might have happened. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-27 09:13:42 +02:00
Tomeu Vizoso	fa36c194fd	panfrost/decode: Decode exception status Arm's kernel driver mentions how to decode this field, which makes it a bit clearer what happened. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-27 09:13:35 +02:00
Tomeu Vizoso	b26c2b4840	panfrost/decode: Print AFBC struct when appropriate Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-27 09:12:56 +02:00
Samuel Pitoiset	d5004f60be	radv: only export clip/cull distances if PS reads them The only exception is the GS copy shader which emits them unconditionally. Totals from affected shaders: SGPRS: 71320 -> 71008 (-0.44 %) VGPRS: 54372 -> 54240 (-0.24 %) Code Size: 2952628 -> 2941368 (-0.38 %) bytes Max Waves: 9689 -> 9723 (0.35 %) This helps Dota2, Doom, GTAV and Hitman 2. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-27 08:56:37 +02:00
Samuel Pitoiset	1e9ccc5429	radv: fix FMASK expand if layerCount is VK_REMAINING_ARRAY_LAYERS This doesn't fix anything known, but it's likely going to break if layerCount is ~0U. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-27 08:56:34 +02:00
Kenneth Graunke	8551dc17a7	iris: Disable loop unrolling in GLSL IR. Leave it to NIR instead, like i965 does. Thanks to Tim Arceri for noticing that I'd left this enabled by accident. shader-db results on Skylake: total instructions in shared programs: 15522628 -> 15521642 (<.01%) instructions in affected programs: 94008 -> 93022 (-1.05%) helped: 34 HURT: 33 helped stats (abs) min: 12 max: 48 x̄: 33.82 x̃: 42 helped stats (rel) min: 0.06% max: 22.14% x̄: 9.86% x̃: 10.89% HURT stats (abs) min: 1 max: 16 x̄: 4.97 x̃: 3 HURT stats (rel) min: 0.82% max: 3.77% x̄: 1.73% x̃: 1.53% 95% mean confidence interval for instructions value: -20.08 -9.35 95% mean confidence interval for instructions %-change: -5.95% -2.36% Instructions are helped. total cycles in shared programs: 367105221 -> 367074230 (<.01%) cycles in affected programs: 10017660 -> 9986669 (-0.31%) helped: 266 HURT: 184 helped stats (abs) min: 1 max: 9556 x̄: 151.35 x̃: 12 helped stats (rel) min: 0.08% max: 59.91% x̄: 4.66% x̃: 1.67% HURT stats (abs) min: 1 max: 1716 x̄: 50.37 x̃: 6 HURT stats (rel) min: <.01% max: 24.40% x̄: 2.42% x̃: 0.85% 95% mean confidence interval for cycles value: -133.90 -3.84 95% mean confidence interval for cycles %-change: -2.44% -1.10% Cycles are helped. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-26 22:55:03 -07:00
Kenneth Graunke	acadeaff6a	st/mesa: Set EmitNoIndirectSampler if GLSLVersion < 400. This patch changes the code which sets EmitNoIndirectSampler to check the core profile GLSL version, rather than the ARB_gpu_shader5 extension enable. st/mesa exposes ARB_gpu_shader5 if GLSLVersion (in core profiles) or GLSLVersionCompat (in compat profiles) >= 400. The Intel drivers do not currently expose ARB_gpu_shader5 in compat profiles. But the backend can absolutely handle indirect samplers. Looking at the core profile version number should be a good indication of what the driver supports. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-26 22:54:52 -07:00
Kenneth Graunke	116144d65e	iris: Delete dead ice->state.streamout_strides field. Nothing uses this, it must be a remnant from an earlier approach.	2019-06-26 20:17:22 -07:00
Caio Marcelo de Oliveira Filho	085c0f1f13	nir/algebraic: Add helpers and a rule involving wrapping The helpers are needed so we can use the syntax `instr(cond)` in the algebraic rules. Add simple rule for dropping a pair of mul-div of the same value when wrapping is guaranteed to not happen. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-26 14:13:02 -07:00
Caio Marcelo de Oliveira Filho	5a143965b8	spirv: Implement NoSignedWrap and NoUnsignedWrap decorations When handling the specified ALU operations, check for the decorations and set nir_alu_instr no_signed_wrap and no_unsigned_wrap flags accordingly. v2: Add a glsl_base_type_is_unsigned_integer() helper. (Karol) v3: Rename helper to glsl_base_type_is_uint(). v4: Use two flags, so we don't need the helper anymore. (Connor) v5: Pass alu directly to handle function. (Jason) Reviewed-by: Karol Herbst <kherbst@redhat.com> [v3] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-26 14:13:02 -07:00
Caio Marcelo de Oliveira Filho	ae37237713	nir: Add no wrapping bits to nir_alu_instr They indicate the operation does not cause overflow or underflow. This is motivated by SPIR-V decorations NoSignedWrap and NoUnsignedWrap. Change the storage of `exact` to be a single bit, so they pack together. v2: Handle no_wrap in nir_instr_set. (Karol) v3: Use two separate flags, since the NIR SSA values and certain instructions are typeless, so just no_wrap would be insufficient to know which one was referred to. (Connor) v4: Don't use nir_instr_set to propagate the flags, unlike `exact`, consider the instructions different if the flags have different values. Fix hashing/comparing. (Jason) Reviewed-by: Karol Herbst <kherbst@redhat.com> [v1] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-26 14:13:02 -07:00
Dylan Baker	f97dcb7a55	docs: add news item and link release notes for 19.0.8 This is an emergency release due to a critical bug.	2019-06-26 13:48:06 -07:00
Dylan Baker	290495a431	docs: Add mesa 19.0.8 sha256 sums	2019-06-26 13:46:30 -07:00
Dylan Baker	10a24925a0	docs: Add docs for 19.0.8	2019-06-26 13:46:29 -07:00
Jonathan Marek	a70ff70158	nir: remove fnot/fxor/fand/for opcodes There doesn't seem to be any reason to keep these opcodes around: * fnot/fxor are not used at all. * fand/for are only used in lower_alu_to_scalar, but easily replaced Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-06-26 15:26:10 -04:00
Jonathan Marek	0b5a483baa	nir: opt_vectorize: combine different constant sources We can vectorize instructions with different constant sources by creating a new load_const and using that. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-26 14:56:28 -04:00
Alyssa Rosenzweig	10688257bd	panfrost/midgard: Merge embedded constants In Midgard, a bundle consists of a few ALU instructions. Within the bundle, there is room for an optional 128-bit constant; this constant is shared across all instructions in the bundle. Unfortunately, many instructions want a 128-bit constant all to themselves (how selfish!). If we run out of space for constants in a bundle, the bundle has to be broken up, incurring a performance and space penalty. As an optimization, the scheduler now analyzes the constants coming in per-instruction and attempts to merge shared components, adjusting the swizzle accessing the bundle's constants appropriately. Concretely, given the GLSL: (a * vec4(1.5, 0.5, 0.5, 1.0)) + vec4(1.0, 2.3, 2.3, 0.5) instead of compiling to the naive two bundles: vmul.fmul [temp], [a], r26 fconstants 1.5, 0.5, 0.5, 1.0 vadd.fadd [out], [temp], r26 fconstants 1.0, 2.3, 2.3, 0.5 The scheduler can now fuse into a single (pipelined!) bundle: vmul.fmul [temp], [a], r26.xyyz vadd.fadd [out], [temp], r26.zwwy fconstants 1.5, 0.5, 1.0, 2.3 Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-26 10:01:36 -07:00
Alyssa Rosenzweig	a0a34946d8	panfrost/midgard: Share swizzle compose Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-26 10:01:36 -07:00
Alyssa Rosenzweig	f6fde45d5c	panfrost/midgard: Share swizzle/mask code Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-26 10:01:36 -07:00
Alyssa Rosenzweig	0979ea9de8	panfrost: Fix checksumming typo Fixes: `3e6c6bb0` ("panfrost: Merge checksum buffer with main BO") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-26 09:58:30 -07:00
Kenneth Graunke	ab009b7d6e	iris: Fix overzealous query object batch flushing. In the past, each query object had its own BO. Checking if the batch referenced that BO was an easy way to check if commands were still queued to compute the query value. If so, we needed to flush. More recently (`c24a574e6c`), we started using a u_upload_mgr for query objects, placing multiple queries in the same BO. One side-effect is that iris_batch_references is no longer a reasonable way to check if commands are still queued for our query. Ours might be done, but a later query that happens to be in the same BO might be queued. We don't want to flush in that case. Instead, check if the current batch's signalling syncpt is the one we referenced when ending the query. We know the syncpt can't have been reused because our query is holding a reference, so a simple pointer comparison should suffice. Removes all batch flushing caused by query objects in Shadow of Mordor. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2019-06-26 09:49:01 -07:00
Kenneth Graunke	db878a728c	iris: Make an iris_batch_get_signal_syncpt() helper. This returns a pointer to the signalling syncpt, without incrementing the reference count. This can be useful for comparisons. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2019-06-26 09:49:01 -07:00
Boris Brezillon	443e530194	panfrost: Remove unneeded check in panfrost_scissor_culls_everything() The ss local var is guaranteed to be != NULL. Get rid of this useless check. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-26 09:35:25 -07:00
Alyssa Rosenzweig	d4575c3071	panfrost: Update copyright identifiers "Collabora, Ltd." should be listed in lieu of simply "Collabora" Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Suggested-by: Daniel Stone <daniels@collabora.com>	2019-06-26 09:10:51 -07:00
Alyssa Rosenzweig	b0e8941df1	panfrost/midgard: Reorder to permit constant bias Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-26 09:08:37 -07:00
Alyssa Rosenzweig	213b62810d	panfrost/midgard: Add helper to encode constant bias Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-26 09:08:37 -07:00
Alyssa Rosenzweig	b51727ea28	panfrost/midgard: Handle negative immediate bias Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-26 09:08:37 -07:00
Rob Clark	1833827eac	freedreno: correct batch_depends_on() logic Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-26 08:43:02 -07:00
Rob Clark	2b10bb6e5e	freedreno: drop unused arg from fd_batch_flush() The `force` arg has been unused for a while.. but apparently I forgot to garbage collect it. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-26 08:43:02 -07:00
Timothy Arceri	5f809e2707	st/glsl: fix silly regression finding gs/tes variants Fixes: `d19fe5e67a` ("st/glsl: support clamping color outputs in compat for gs/tes") Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-06-26 23:13:02 +10:00
Timothy Arceri	d19fe5e67a	st/glsl: support clamping color outputs in compat for gs/tes This support requires the driver to be a NIR driver as we use the NIR lowering pass to do the clamping. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-26 00:36:48 +00:00
Timothy Arceri	f5f31612d3	nir: add tess support to nir_lower_clamp_color_outputs() This will be used to add compat profile support for higher GL versions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-26 00:36:48 +00:00
Sagar Ghuge	06807e1948	glsl: Fix round64 conversion function Fix round64 function to handle round to nearest even cases specially with positive and negative numbers with fraction part 0.5. v2: 1) Simplify unused bits (Elie Tournier) Fixes: KHR-GL45.gpu_shader_fp64.builtin.round_dvec2 KHR-GL45.gpu_shader_fp64.builtin.round_dvec3 KHR-GL45.gpu_shader_fp64.builtin.round_dvec4 KHR-GL45.gpu_shader_fp64.builtin.roundeven_double KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec2 KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec3 KHR-GL45.gpu_shader_fp64.builtin.roundeven_dvec4 Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-06-25 15:19:10 -07:00
Alyssa Rosenzweig	e8f4c9f56c	panfrost/ci: Add RK3288 flipflops I don't want to deal with right now Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:42:58 -07:00
Alyssa Rosenzweig	70a87a915d	panfrost/ci: Update failures list A ton of tests were fixed by this series. A few were incorrectly passing before (QualityError, for instance) and now are explicitly failing. A few legitimate regressions but overwhelmingly positive. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:18 -07:00
Alyssa Rosenzweig	ddf5f04edf	panfrost/ci: Set MESA_GLES_VERSION_OVERRIDE=3.0 Fixes cube map tests due to disagreements between Mesa, dEQP, and the spec... Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-06-25 13:39:18 -07:00
Alyssa Rosenzweig	33f3cac1c2	panfrost/ci: Run full set of mipmap tests Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:18 -07:00
Alyssa Rosenzweig	f34635c699	panfrost: Advertise support for other 8-bit UNORM formats Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:18 -07:00
Alyssa Rosenzweig	310ca6ba40	panfrost: Use pipe_surface->format directly in blitter Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:18 -07:00
Alyssa Rosenzweig	5cfb4248c6	panfrost: Invert swizzle for rendering Fixes rendering to e.g. alpha textures. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:18 -07:00
Alyssa Rosenzweig	b96f119d85	panfrost: Honour first_layer...last_layer when sampling Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:18 -07:00
Alyssa Rosenzweig	0ad17f56ae	panfrost: Use the sampler_view target (not the textures) u_blitter gets "special treatment" and uses this mechanism to cast cube maps to 2D textures in order to texelFetch them. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:18 -07:00
Alyssa Rosenzweig	faf8ad4875	panfrost/midgard: Assert guard texelFetch against cubemaps Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:18 -07:00
Alyssa Rosenzweig	124f6b541b	panfrost: Zero pixels in any axis is zero pixels total Multiplication, not addition, so switch the logic operator. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	06211f45a7	panfrost: Respect mip level when wallpapering Fixes DATA_INVALID_FAULT raised when wallpapering while rendering to a mipmap. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	6729912a4b	panfrost/midgard: Fixup NIR texture op In a vertex shader, a tex op should map to txl, as there must be a LOD given to the hardware (implicitly or explicitly). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	17adcfc008	panfrost: Support (non-)seamless cube maps Identify the seamless cubemap bit and passthrough the Gallium state rather than setting unconditionally. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	3e6c6bb0af	panfrost: Merge checksum buffer with main BO This is similar to the AFBC merge; now all (non-imported) buffers use a common backing buffer. Reenables checksumming, eliminating a performance regression. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	a9fc1c8399	panfrost/decode: Limit MRT blend count I thought I already fixed this. Maybe that was a dream...? Then again, I might be dreaming now. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	65e9d9b625	panfrost: Clamp tile coordinates before job submission Fixes TILE_RANGE_FAULT raised on some tests in dEQP-GLES3.functional.fbo.blit.* Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	7005c0d83b	panfrost: Use dedicated u_blitter context for wallpapers The main ctx->blitter instance should be reserved for blits originated from Gallium (like mipmap generation). Since wallpapering is conceptually different -- wallpaper blits can be triggered by Gallium blits -- the blitter pipes must be separate to avoid potential u_blitter recursion. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	64b7bd3f90	panfrost: Sanity check layer It doesn't make sense to try to render to multiple array elements at once. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	eb3c09716b	panfrost: Divide array_size by 6 for cubemaps Addresses the disparity between Mali and Gallium definitions of array_size. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	65bc56b568	panfrost: Use get_texture_address for framebuffer computations Allows for sharing some code as well as theoretically allowing cubemap rendering. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	3609b50a64	panfrost: Merge AFBC slab with BO backing Rather than tracking AFBC memory "specially", just use the same codepath as linear and tiled. Less things to mess up, I figure. This allows us to use the standard setup_slices() call with AFBC resources, allowing mipmapped AFBC resources. Unfortunately, we do have to disable AFBC (and checksumming) in the meantime to avoid functional regressions, as we don't know _a priori_ if we'll need to access a resource from software (which is not yet hooked up with AFBC) and we don't yet have routines to switch the layout of a BO at runtime. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	aea3f0ac1d	panfrost: Z/S can't be tiled As far as we know, Utgard-style tiling only works for color render targets, not depth/stencil, so ensure we don't try to tile it (rather than compress or plain old linear) and drive ourselves into a corner. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	ad56dd4e97	panfrost: Enable mipmapping Now the autogeneration of mipmaps is working (via u_blitter), we can finally enable mipmaps! Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	5aeffa9517	panfrost: Enable blitting Now that all the prerequisites breaking u_blitter are fixed, we can finally hook up panfrost_blit. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	06d192c742	panfrost: Allow texelFetch for wallpaper blits We just implemented the routine; we may as well use it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	f4bb7f096c	panfrost/midgard: Implement texelFetch (2D only) txf instructions can result from blits, so handle them rather than crash. Only works for 2D textures (not even 2D array texture) due to a register allocation constraint that may not be sorted for a while. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	4ac42f2b38	panfrost: Skip flushes only for wallpapers, not any blit We need the flush from u_blitter for a normal blit (e.g. for mipmaps); it's only wallpaper-related blits that are special-cased. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	ffcc4d1c4e	panfrost: Handle generate_mipmap ourselves To avoid interference with the wallpaper code, we need to do some state tracking when generating mipmaps. In particular, we need to mark the generated layers as invalid before generating the mipmap, so we don't try to backblit them if they already had content. Likewise, we need to flush both before and after generating a mipmap since our usual set_framebuffer_state flushing isn't quite there yet. Ideally better optimizations would save the flush but I digress. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Alyssa Rosenzweig	f57dfe4cdd	panfrost: Disable mipmapping if necessary If a mipfilter is not set, it's legal to have an incomplete mipmap; we should handle this accordingly. An "easy way out" is to rig the LOD clamps. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-25 13:39:17 -07:00
Kenneth Graunke	748e5dac72	intel/blorp: Disable sampler state prefetching on Gen11 Sampler state prefetching is broken on Gen11, and WA_160668216 says to disable it. Apparently sampler state prefetching also has basically zero impact on performance, so we don't need to worry there. i965, anv, and iris already handle this correctly, but we missed BLORP. Ideally the kernel should globally disable this by writing SARCHKMD, at which point we wouldn't have to worry about it. But let's be defensive and handle it ourselves too. v2: separate out from BTP workaround in case we change that eventually Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> [v1]	2019-06-25 13:29:31 -07:00
Jason Ekstrand	0a364a4a74	anv/descriptor_set: Only write texture swizzles if we have an image view When immutable samplers are set we call write_image_view with a NULL image view. This causes issues on IVB where we have to fake texture swizzling. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110999 Fixes: `d2aa65eb18` "anv: Emulate texture swizzle in the shader when..."	2019-06-25 19:43:25 +00:00
Chia-I Wu	74786b3aa3	virgl: add VIRGL_DEBUG_XFER When set, do as requested and skip any transfer optimization. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-By: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-25 12:01:45 -07:00
Chia-I Wu	e93d918b65	virgl: add VIRGL_DEBUG_SYNC When set, wait after every flush. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-By: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-25 12:01:43 -07:00
Chia-I Wu	119b5701e1	virgl: fix the value of VIRGL_DEBUG_BGRA_DEST_SWIZZLE VIRGL_DEBUG_BGRA_DEST_SWIZZLE should use bit 3. Make some cosmetic changes as well. Fixes: `a478e56fbd` virgl: Add debug flag to bypass driconf to enable the BGRA tweaks Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-By: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-25 12:01:14 -07:00
Samuel Pitoiset	8ea7ee1536	radv: rename and re-document cache flush flags SMEM and VMEM caches are L0 on gfx10. Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-25 18:38:37 +02:00
Samuel Pitoiset	5411f47056	radv: set DISABLE_CONSTANT_ENCODE_REG to 1 for Raven2 Ported from RadeonSI, will be emitted for GFX10 too. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-25 16:45:15 +02:00
Samuel Pitoiset	34bef8a0d7	radv: clear CMASK layers instead of the whole buffer on GFX8 This reduces the size of fill operations needed to clear CMASK for layered color textures. GFX9 unsupported for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-25 16:36:28 +02:00
Samuel Pitoiset	476b907a3b	radv: clear FMASK layers instead of the whole buffer on GFX8 This reduces the size of fill operations needed to clear FMASK for layered color textures. GFX9 unsupported for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-25 16:36:25 +02:00
Samuel Pitoiset	a5ba386b3f	radv: always initialize levels without DCC as fully expanded This fixes a rendering issue with RoTR/DXVK. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-25 16:36:23 +02:00
Sergii Romantsov	1931c97a1d	i965: leaking of upload-BO with push constants If any of the VS members uses_firstvertex, uses_baseinstance, uses_drawid or uses_is_indexed_draw is enabled, a leak may happen. The call to gen6_upload_push_constants allocates stage_stat->push_const_bo. It then takes a pointer from push_const_bo to draw_params_bo (in the call to brw_prepare_shader_draw_parameters by brw_upload_data) and takes a reference that is never unreferenced. Fixes leak: 136 bytes in 1 blocks are definitely lost in loss record 6 of 13 at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) by 0xC2B64B7: bo_alloc_internal (brw_bufmgr.c:596) by 0xC2B6748: brw_bo_alloc (brw_bufmgr.c:672) by 0xC314BB3: brw_upload_space (intel_upload.c:88) by 0xC2EBBC5: gen6_upload_push_constants (gen6_constant_state.c:155) by 0xC9E4FA6: gen9_upload_vs_push_constants (genX_state_upload.c:3300) by 0xC2E0EDA: check_and_emit_atom (brw_state_upload.c:540) by 0xC2E0EDA: brw_upload_pipeline_state (brw_state_upload.c:659) by 0xC2E0FF1: brw_upload_render_state (brw_state_upload.c:681) by 0xC2C5D2D: brw_draw_single_prim (brw_draw.c:1052) by 0xC2C62CB: brw_draw_prims (brw_draw.c:1175) by 0xC488AD1: vbo_exec_vtx_flush (vbo_exec_draw.c:386) by 0xC485270: vbo_exec_FlushVertices_internal (vbo_exec_api.c:652) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>	2019-06-25 12:26:25 +00:00
Juan A. Suarez Romero	81d28c69ea	docs: update calendar, add news item and link release notes for X.Y.Z Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-06-25 13:02:37 +02:00
Juan A. Suarez Romero	2c06071521	docs: fix some typos in 19.0.7 release notes Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-06-25 13:01:56 +02:00
Juan A. Suarez Romero	4a2b502a6b	docs: add sha256 checksums for 19.1.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `d54dc24d6d`)	2019-06-25 12:56:49 +02:00
Juan A. Suarez Romero	5f7c66676f	docs: add release notes for 19.1.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `22eddd8b9d`)	2019-06-25 12:56:46 +02:00
Tapani Pälli	7a6e5a4bc3	intel/compiler: silence a warning of using different enum type Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-25 10:09:22 +03:00
Eric Engestrom	e9286eb60b	egl: replace dead vfunc with an error st/egl used to support eglCreatePbufferFromClientBuffer, but now that it's gone, any call to it would segfault. Let's return a nice error instead. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-25 07:47:19 +01:00
Eric Engestrom	eeacd66324	egl: delete unused vfuncs Nobody ever uses these, so let's just hard code them instead. If an EGL driver ever comes around that needs them they're trivial to re-add. Suggested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-25 07:47:19 +01:00
Eric Engestrom	83f01f5261	egl: drop empty eglfallbacks.c Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-25 06:36:54 +00:00
Eric Engestrom	757d2fb48d	egl: move eglGetSyncAttrib() fallback to eglapi.c Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-25 06:36:54 +00:00
Eric Engestrom	26d5ca44ba	egl: move eglSwapInterval() fallback to eglapi.c Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-25 06:36:54 +00:00
Eric Engestrom	9dc00c8433	egl: move eglSurfaceAttrib() fallback to eglapi.c Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-25 06:36:54 +00:00
Eric Engestrom	58be9d50a7	egl: move eglQuerySurface() fallback to eglapi.c Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-25 06:36:54 +00:00
Eric Engestrom	b792b3ebd7	egl: move eglQueryContext() fallback to eglapi.c Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-25 06:36:54 +00:00
Eric Engestrom	7f848f9713	egl: move eglGetConfigAttrib() fallback to eglapi.c Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-25 06:36:54 +00:00
Eric Engestrom	1b76cca40f	egl: move eglChooseConfig() fallback to eglapi.c Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-25 06:36:53 +00:00
Eric Engestrom	b883d7f567	egl: move eglGetConfigs() fallback to eglapi.c Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-25 06:36:53 +00:00
Rob Clark	927fb50727	freedreno/a5xx: fix batch leak in fd5 blitter path Fixes: `3d198926a4` freedreno: use fd_bc_alloc_batch instead of fd_batch_create. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-24 18:43:20 -07:00
Marek Olšák	4a1421aa26	radeonsi: don't set spi_ps_input_* for monolithic shaders The driver doesn't use these values and ac_rtld has assertions expecting the value of 0. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-24 21:04:10 -04:00
Marek Olšák	1d6e358c36	radeonsi: rename and re-document cache flush flags SMEM and VMEM caches are L0 on gfx10. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-24 21:04:10 -04:00
Marek Olšák	aa8d6e0507	radeonsi: fix AMD_DEBUG=nofmask Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-24 21:04:10 -04:00
Marek Olšák	f46efacd01	radeonsi: flatten the switch for DPBB tunables Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-24 21:04:10 -04:00
Marek Olšák	ac4b1e2f0a	radeonsi: set the calling convention for inlined function calls otherwise the behavior is undefined Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-24 21:04:10 -04:00
Nicolai Hähnle	610e1a81f7	radeonsi: refactor si_update_vgt_shader_config We'll have to extend this at some point, and using a bitfield union in this way makes it easier to get the right index without excessive branching. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-24 21:04:10 -04:00
Nicolai Hähnle	bd3a3fd25a	amd/rtld: update the ELF representation of LDS symbols The initial prototype used a processor-specific symbol type, but feedback suggests that an approach using a processor-specific section name that encodes the alignment, analogous to SHN_COMMON symbols, is preferred. This patch keeps both variants around for now to reduce problems with LLVM compatibility as we switch branches around. This also cleans up the error reporting in this function. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-24 21:04:10 -04:00
Marek Olšák	0032f6b8a0	ac/surface: remove addrlib_family_rev_id Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-24 21:04:10 -04:00
Dylan Baker	032fe7d602	docs: update calendar, add news item and link release notes for 19.0.7	2019-06-24 16:24:05 -07:00
Dylan Baker	7badae431a	docs: Add SHA256 sums for 19.0.7	2019-06-24 16:22:21 -07:00
Dylan Baker	8c0e5c4cfc	Docs add 19.0.7 release notes	2019-06-24 16:22:20 -07:00
Ian Romanick	ee1c69fadd	glsl: Don't increase the iteration count when there are no terminators Incrementing the iteration count was intended to fix an off-by-one error when the first terminator was superseded by a later terminator. If there is no first terminator or later terminator, there is no off-by-one error. Incrementing the loop count creates one. This can be seen in loops like: do { if (something) { // No breaks or continues here. } } while (false); Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Abel Briggs <abelbriggs1@hotmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110953 Fixes: `646621c66d` ("glsl: make loop unrolling more like the nir unrolling path")	2019-06-24 14:32:33 -07:00
Eric Anholt	5c4289dd4b	freedreno: Only upload the used part of UBO0 to the constant buffer. We were pessimistically uploading all of it in case of indirection, but we can just bump that when we encounter indirection. total constlen in shared programs: 2529623 -> 2485933 (-1.73%) Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-24 14:23:07 -07:00
Eric Anholt	852704976a	freedreno: Stop treating UBO 0 specially in UBO uploading. ir3_nir_analyze_ubo_ranges() has already told us how much of cb0 we need to upload (all of it, since it will lower indirect UBO 0 accesses from load_ubo back to indirection on the constant buffer). Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-24 14:23:07 -07:00
Rob Clark	572c76fd88	freedreno: Clamp UBO uploads to the constlen decided by the shader. If the NIR-level analysis decided to move UBO loads to the constant file, but the backend decided not to load those constants, we could upload past the end of constlen. This is particularly relevant for pre-a6xx, where we emit a different constlen between bin and render variants. (Fix by Rob, commit message by anholt) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-24 14:23:07 -07:00
Alyssa Rosenzweig	c1ca138475	panfrost: Allow up to 16 UBOs This is the hardware max, as far as I can tell. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-24 12:57:40 -07:00
Alyssa Rosenzweig	b670becb1e	panfrost: DRY between shader stage setup Just a little spring cleanup, extending UBOs to vertex shaders in the process. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-24 12:57:40 -07:00
Alyssa Rosenzweig	5e2c3d40bd	panfrost/midgard: Implement UBO reads UBOs and uniforms now use a common code path with an explicit `index` argument passed, enabling UBO reads. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-24 12:57:40 -07:00
Alyssa Rosenzweig	f28e9e868b	panfrost: Handle disabled/empty UBOs Prevents an assert(0) later in this (not so edge) case. We still have to have a dummy there. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-24 12:57:40 -07:00
Alyssa Rosenzweig	bd2fc60a8a	panfrost: Identify "uniform buffer count" bits We've known about this for a while, but it was never formally in the machine header files / decoder, so let's add them in. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-24 12:57:40 -07:00
Alyssa Rosenzweig	856e03902b	panfrost: Upload UBOs Now that all the counting is sorted, it's a matter of passing along a GPU address and going. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-24 12:57:40 -07:00
Alyssa Rosenzweig	4c6d751274	panfrost: Allow for dynamic UBO count We already uploaded UBOs, but only a fixed number (1) for uniforms; let's upload as many as we compute we need. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-24 12:57:40 -07:00
Alyssa Rosenzweig	5d60be4e24	panfrost: Report UBO count We look at the highest set bit in the UBO enable mask to work out the maximum indexable UBO, i.e. the UBO count as we need to report to the hardware. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-24 12:57:40 -07:00
Alyssa Rosenzweig	ca2caf01df	panfrost: Constant buffer refactor We refactor panfrost_constant_buffer to mirror v3d's constant buffer handling, to enable UBOs as well as a single set of uniforms. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-24 12:57:40 -07:00
Alyssa Rosenzweig	f35f373850	panfrost: Replace varyings for point sprites This doesn't handle Y-flipping, but it's good enough to render the stars in Neverball. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-24 12:56:22 -07:00
Alyssa Rosenzweig	be03060066	panfrost: Track point sprites in fragment shader key In preparation for lowering point sprites, track them like we track alpha testing state. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-24 12:56:16 -07:00
Caio Marcelo de Oliveira Filho	7fc907118e	i965: Move resources lowering after NIR linking Those either depend on information filled by the NIR linking steps OR are restricted by those: - gl_nir_lower_samplers: depends on UniformStorage being set by the linker. - brw_nir_lower_image_load_store: After `6981069fc8` "i965: Ignore uniform storage for samplers or images, use binding info" we want this pass to happen after gl_nir_lower_samplers. - gl_nir_lower_buffers: depends on UniformBlocks and SharedStorageBlocks being set by the linker. For the regular GLSL code path, those datastructures are filled earlier. For NIR linking code path we need to generate the nir_shader first then process it -- and currently the processing works with all shaders together. So move the passes out of brw_create_nir into its own function, called by the brwProgramStringNotify and brw_link_shader(). This patch prepares ground for ARB_gl_spirv, that will make use of NIR linker. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-24 11:44:03 -07:00
Caio Marcelo de Oliveira Filho	6e2ff10886	glsl/nir: Fix copying 64-bit values in uniform storage The iterator `i` already walks the right amount now that it is incremented by `dmul`, so no need to `* 2`. Fixes invalid memory access in upcoming ARB_gl_spirv tests. Failure bisected by Arcady Goldmints-Orlov. Fixes: `b019fe8a5b` "glsl/nir: Fix handling of 64-bit values in uniform storage" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-24 11:32:14 -07:00
Caio Marcelo de Oliveira Filho	390ff8ac54	glsl/nir: Fix copying vector constant values For n_columns == 1, we have a vector which is handled by the else case. Fixes invalid memory access in upcoming ARB_gl_spirv tests. Failure bisected by Arcady Goldmints-Orlov. Fixes: `81e51b412e` "nir: Make nir_constant a vector rather than a matrix" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-24 11:32:14 -07:00
Daniel Schürmann	0daeb1d127	amd/common: lower bitfield_extract to ubfe/ibfe. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Daniel Schürmann	48a75e7af0	amd/common: lower bitfield_insert to bfm & bitfield_select Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Daniel Schürmann	a8b0b6e52b	nir: introduce lowering of bitfield_insert to bfm and a new opcode bitfield_select. bitfield_select is defined as: bitfield_select(mask, base, insert) = (mask & base) \| (~mask & insert) matching the behavior of AMD's BFI instruction. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Daniel Schürmann	1403c3a7bf	nir/algebraic: Use unsigned comparison when lowering bitfield insert/extract This lets us use the optimization pattern (('ult', 31, ('iand', b, 31)), False) to remove the bcsel instruction for code originating in D3D shaders. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Daniel Schürmann	4eeb49ea71	nir/algebraic: Remove unnecessary iand of [iu]bfe and bfm sources The [iu]bfe and bfm instructions are defined to only use the five least significant bits. This optimizes a common pattern from D3D -> SPIR-V translation. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Daniel Schürmann	165b7f3a44	nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec. That is: the five least significant bits provide the values of 'bits' and 'offset' which is the case for all hardware currently supported by NIR and using the bfm/bfe instructions. This patch also changes the lowering of bitfield_insert/extract using shifts to not use bfm and removes the flag 'lower_bfm'. Tested-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
Daniel Schürmann	a74f256c58	nir/algebraic: add optimization pattern for ('ult', a, ('and', b, a)) and friends. These optimizations are based on the fact that 'and(a,b) <= umin(a,b)'. For AMD, this series moves the optimization from LLVM to NIR, so currently no vkpipeline-db changes here. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-06-24 18:42:20 +02:00
Andreas Baierl	fa6ea16a8d	lima/ppir: Add fsat op Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-06-24 16:41:33 +02:00
Andreas Baierl	f1d89bbc2f	lima/ppir: Add fneg op Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-06-24 16:41:33 +02:00
Andreas Baierl	512397058d	lima/ppir: Add fabs op Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-06-24 16:41:33 +02:00
Eric Engestrom	2d2e824fae	util: support "y" and "n" in env_var_as_boolean() Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-24 12:49:13 +00:00
Andreas Baierl	0cb9ce12fd	lima/ppir: lower ffma in ppir Since we cannot handle ffma in ppir, lower it on nir level already. Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-06-24 11:57:57 +00:00
Samuel Pitoiset	946193ae00	radv: add support for VK_AMD_buffer_marker This simple extension might be useful for debugging purposes. GAPID has support for it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-24 10:50:54 +02:00
Tapani Pälli	ff77b0415b	meson: error out if platforms contains empty string Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110939 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-06-24 08:40:18 +03:00
Nataraj Deshpande	d94fca5420	anv: Add HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED in vk_format When HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED is used, then the platform gralloc module will select a format based on the usage flags provided by the camera device and the other endpoint of the stream. The patch fixes a crash in Vulkan when the test is run with camera stream set to HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED. Test: android.graphics.cts.CameraVulkanGpuTest#testCameraImportAndRendering on chromebook with camera HAL3. v2: use AHARDWAREBUFFER_FORMAT_IMPLEMENTATION_DEFINED and take AHARDWAREBUFFER_USAGE_CAMERA_MASK into account (Gurchetan) Fixes: `f1654fa7e3` "anv/android: support creating images from external format" Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com> Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-24 08:28:18 +03:00
Timur Kristóf	3b6d787e40	iris: move sysvals to their own constant buffer This commit moves the sysvals to a separate, new constant buffer at the end (before the shader constants). It also allows us to remove the special handling we had for cbuf0, and enables all constant buffers to support user-specified resources and user buffers. v2: (by Kenneth Graunke) - Rebase on the previous patch to fix system value uploading. - Fix disk cache num_cbufs calculation - Fix passthrough TCS to report num_cbufs = 1 so upload actually occurs - Change upload_sysvals to assert that num_cbufs > 0 when num_system_values > 0. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-23 18:33:23 +02:00
Kenneth Graunke	ebc8c20b3e	iris: Mark cbuf0 as not needing uploading every single time I neglected to mark cbuf0_needs_upload = false after uploading it. The obvious fix regressed user clip plane tests, because of a second bug: we also forgot to mark that they may need re-uploading when changing shader programs (which may have more or less system values). Thanks to Timur Kristóf for catching the original issue. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>	2019-06-23 18:32:11 +02:00
Eric Engestrom	188dbb1679	Revert "egl: drop empty eglfallbacks.c" and "egl: move fallback calls to eglapi.c" This reverts commits `cc4b68a801` and `b27fb3eaca`. These caused a bunch of EGLSync tests to crash when they were previously failing. I have a hunch the tests are doing something wrong, like using extensions without checking for their support, but until the issue is investigated I'm just reverting these commits. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-22 21:59:06 +01:00
Eric Engestrom	cc4b68a801	egl: drop empty eglfallbacks.c Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-22 15:17:42 +00:00
Eric Engestrom	b27fb3eaca	egl: move fallback calls to eglapi.c Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-22 15:17:42 +00:00
Eric Engestrom	262b767023	egl: drop `_eglReturnFalse()` fallbacks v2: drop them altogether, they should never get called in the first place (Emil) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-22 15:17:42 +00:00
Eric Engestrom	82487ede62	egl: remove unnecessary eglGetProcAddress() fallback No need to add a function that returns `false` only to be cast into a pointer, we can just use the existing `return NULL` :) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-22 15:17:42 +00:00
Eric Engestrom	30ecd86947	egl: remove NULL assignments after calloc() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-22 15:17:42 +00:00
Eric Engestrom	64c7c05b71	egl: move bad_param check further up This way other functions added in these entrypoints don't need to check anything. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-22 15:17:42 +00:00
Kenneth Graunke	262787b9bc	iris: Drop bo != NULL check from blorp 48b invalidate function. There is always a BO.	2019-06-21 20:50:42 -05:00
Kenneth Graunke	5da37a826b	Revert "iris: Don't check VF address high bits when there is no buffer." This reverts commit `db8f57a5cb`. This is bonkers. There will always be a BO.	2019-06-21 20:50:42 -05:00
Eric Anholt	4449572c47	freedreno: Only upload UBO pointers for UBOs that haven't been lowered. total constlen in shared programs: 2485933 -> 2462236 (-0.95%) Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-21 17:14:43 -07:00
Eric Anholt	01d0bad9ef	freedreno: Remove silly return from ir3_optimize_nir(). We only ever return the shader we were passed in (but internally modified). Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-21 17:14:43 -07:00
Eric Anholt	56842d33d5	freedreno: Fix up end range of unaligned UBO loads. We need the constants uploaded to cover the NIR offset plus the size, not the aligned-down start of our upload range plus the size. Fixes mistaken UBO analysis with mat3 loads. Fixes: `893425a607` ("freedreno/ir3: Push UBOs to constant file") Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-21 17:14:43 -07:00
Eric Anholt	5e7c96b95d	freedreno: Fix UBO load range detection on booleans. NIR 1-bit bool dests will have a bit size of 1, and thus a calculated "bytes" of 0. load_ubo is always loading from dwords in the source. Fixes: `893425a607` ("freedreno/ir3: Push UBOs to constant file") Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-21 17:14:43 -07:00
Eric Anholt	23a7feda63	freedreno: Stop reporting max_const in shader-db. We end up uploading constlen regardless, so max_const would only get you slightly improved granularity in const usage in comparison. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-21 17:14:43 -07:00
Eric Anholt	ee2e1e85d4	freedreno: Include binning shaders in shader-db. We want to see if we've improved our binning VS output, as well as the render VS. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-21 17:14:43 -07:00
Marek Olšák	8ab9f3a857	include: update GL headers from the registry Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-06-21 19:00:52 -04:00
Alyssa Rosenzweig	a6bef350ed	panfrost: Fix unused variable warning Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-21 13:06:49 -07:00
Boris Brezillon	5f81669d88	panfrost: Remove the panfrost_driver abstraction The non-DRM backend is gone. Let's get rid of the panfrost_driver abstraction and call the panfrost_drm_xxx() functions directly. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-21 13:01:49 -07:00
Boris Brezillon	e8257f3de8	panfrost: Remove the perf counters interface The DRM backend has a dummy implementation and the non-DRM backend is gone, so let's remove this perf counter interface. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-21 13:01:12 -07:00
Tomeu Vizoso	0bcbccf887	panfrost: ci: Fix parsing of crashed tests Without this fix, LAVA isn't parsing crashes as failed tests, because the shell logging is interspersed within the fake deqp output. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-21 09:35:35 -07:00
Alyssa Rosenzweig	d38ac21297	panfrost: Conditionally submit fragment job If there are no tiling jobs and no clears, there is no need to submit a fragment job (relevant for transform feedback). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-21 09:35:35 -07:00
Alyssa Rosenzweig	cd5d618b5c	panfrost: Implement rasterizer discard D'aww, look how cute that is now that scoreboarding is set up. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-21 09:35:31 -07:00
Alyssa Rosenzweig	26c5a145a7	panfrost: Track buffer initialization We want to know if a given slice of a buffer is initialized at a particular point in the execution of the program. This is accomplished easily enough -- start out uninitialized and upon an operation writing to the buffer, mark it initialized. The motivation is to optimize away expensive operations (like wallpaper blits) when reading from an uninitialized buffer; since it's uninitialized, the results of these operations are undefined, and it's legal to take the fast path ^_^ Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-21 09:35:09 -07:00
Alyssa Rosenzweig	f0854745fd	panfrost: Implement command stream scoreboarding This is a rather complex change, adding a lot of code but ideally cleaning up quite a bit as we go. Within a batch (single frame), there are multiple distinct Mali job types: SET_VALUE, VERTEX, TILER, FRAGMENT for the few that we emit right now (eventually more for compute and geometry shaders). Each hardware job has a mali_job_descriptor_header, which contains three fields of interest: job index, a dependencies list, and a next job pointer. The next job pointer in each job is used to form a linked list of submitted jobs. Easy enough. The job index and dependencies list, however, are used to form a dependency graph (a DAG, where each hardware job is a node and each dependency is a directed edge). Internally, this sets up a scoreboarding data structure for the hardware to dispatch jobs in parallel, enabling (for example) vertex shaders from different draws to execute in parallel while there are strict dependencies between tiling the geometry of a draw and running that vertex shader. For a while, we got by with an incredible series of total hacks, manually coding indices, lists, and dependencies. That worked for a moment, but combinatorial kaboom kicked in and it became an unmaintainable mess of spaghetti code. We can do better. This commit explicitly handles the scoreboarding by providing high-level manipulation for jobs. Rather than a command like "set dependency #2 to index 17", we can express quite naturally "add a dependency from job T on job V". Instead of some open-coded logic to copy a draw pointer into a delicate context array, we now have an elegant exposed API to simply "queue a job of type XYZ". The design is influenced by both our current requirements (standard ES2 draws and u_blitter) as well as the need for more complex scheduling in the future. 
For instance, blits can be optimized to use only a tiler job, without a vertex job first (since the screen-space vertices are known ahead-of-time) -- causing tiler-only jobs. Likewise, when using transform feedback with rasterizer discard enabled, vertex jobs are created (to run vertex shaders) with no corresponding tiler job. Both of these cases break the original model and could not be expressed with the open-coded logic. More generally, this will make it easier to add support for compute shaders, geometry shaders, and fused jobs (an optimization available on Bifrost). Incidentally, this moves quite a bit of state from the driver context to the batch, which helps with Rohan's refactor to eventually permit pipelining across framebuffers (one important outstanding optimization for FBO-heavy workloads). v2: Add comment explaining the meaning of "primary batch" as suggested by Tomeu (trivial - not reviewed). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Rohan Garg <rohan.garg@collabora.com>	2019-06-21 09:35:02 -07:00
Anuj Phogat	e334a595e4	intel/icl: Add new ICL PCI-IDs Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-21 08:38:08 -07:00
Jason Ekstrand	1a9e5b9094	anv: Implement "pop-free" clipping This is the preferred clipping mode since it doesn't mean your points disappear the moment part of the point crosses over the edge of the viewport and that lines have weird endpoints at viewport edges. We've just never bothered to hook it up until now. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-21 14:18:59 +00:00
Jason Ekstrand	4a757d6c31	anv: Enable the guardband clip test In workloads where there is a lot of geometry drawn that crosses over the edge of the viewport, this should substantially improve clipper performance. Not really sure why it's taken 3 years to turn it on but we never got around to it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-21 14:18:59 +00:00
Jason Ekstrand	13f0c278c5	i965,iris: Move guardband calculations to a common location Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-21 14:18:59 +00:00
Mauro Rossi	60c581b57d	android: virgl: fix libmesa_winsys_virgl_common build and dependencies Fixes the following build errors and resolves Bug 110922 Fixes gallium_dri target missing symbols at linking. external/mesa/src/gallium/winsys/virgl/drm/Android.mk: error: libmesa_winsys_virgl (STATIC_LIBRARIES android-x86_64) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64) ... external/mesa/src/gallium/winsys/virgl/vtest/Android.mk: error: libmesa_winsys_virgl_vtest (STATIC_LIBRARIES android-x86_64) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64) ... build/core/main.mk:728: error: exiting from previous errors. In file included from external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_socket.c:34: external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.h:35:10: fatal error: 'virgl_resource_cache.h' file not found ^~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. In file included from external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.c:32: external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.h:35:10: fatal error: 'virgl_resource_cache.h' file not found #include "virgl_resource_cache.h" ^~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `b18f09a` ("virgl: Introduce virgl_resource_cache") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2019-06-21 15:53:29 +02:00
Mauro Rossi	cf389ba895	android: winsys/amdgpu,radv: fix generated amdgfxregs.h header dependencies Fix Android build errors in winsys/amdgpu and radv due to 'amdgfxregs.h' not found. Changelog: amd/common - generated $(intermediated)/common path is added to exports winsys/amdgpu - libmesa_amd_common static dependency is added radv - correct generated $(intermediated)/common path is added to includes Fixes: `f480b8a` ("amd/common: use generated register header") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-06-21 15:53:23 +02:00
Samuel Pitoiset	9bf47fefe0	radv: add support for VK_KHR_depth_stencil_resolve Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-21 14:50:38 +02:00
Samuel Pitoiset	e67fc11c26	radv: pass sample locations for transitions before depth/stencil resolves HTILE decompressions need the user sample locations if specified in the current subpass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-21 14:50:35 +02:00
Samuel Pitoiset	396da5c029	radv: clear the depth/stencil resolve attachment if necessary The driver might need to clear one aspect of the depth/stencil resolve attachment before performing the resolve itself. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-21 14:50:33 +02:00
Samuel Pitoiset	c7872237bf	radv: decompress HTILE if the resolve src image is compressed It's required to decompress HTILE before resolving with the compute path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-21 14:50:27 +02:00
Samuel Pitoiset	29c4d44cee	radv: select the depth/stencil resolve method based on some conditions Only fall back to the compute path for layers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-21 14:50:24 +02:00
Samuel Pitoiset	5cf350f565	radv: implement all depth/stencil resolve modes using compute This path supports layers but it requires decompressing HTILE before resolving. The driver also needs to fix up HTILE after the resolve. This path is probably slower than the graphics one. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-21 14:50:19 +02:00
Samuel Pitoiset	cdc6efddf9	radv: implement all depth/stencil resolve modes using graphics When using graphics, the driver doesn't need to decompress HTILE before resolving. This path currently doesn't support layers so we have to fall back to the compute path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-21 14:50:15 +02:00
Samuel Pitoiset	e52ad9f845	radv: record if a render pass has depth/stencil resolve attachments Only supported with vkCreateRenderPass2(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-21 14:50:12 +02:00
Samuel Pitoiset	ac6369a2d0	radv: rename has_resolve to has_color_resolve Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-21 14:50:10 +02:00
Samuel Pitoiset	203f60ebf2	radv: emit framebuffer state from primary if secondary doesn't inherit it Otherwise fast color/depth clears can't work because they depend on the framebuffer. This fixes the following CTS (when the small hint is disabled): - dEQP-VK.geometry.layered.1d_array.secondary_cmd_buffer - dEQP-VK.geometry.layered.2d_array.secondary_cmd_buffer - dEQP-VK.geometry.layered.cube.secondary_cmd_buffer - dEQP-VK.geometry.layered.cube_array.secondary_cmd_buffer Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110810 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107986 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-21 13:49:35 +02:00
Eric Engestrom	6a9dd62882	drisw: move build logic to build systems Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-21 11:35:39 +00:00
Tomeu Vizoso	1cbe2ad394	panfrost: ci: Exclude two more flip-flop from results These three tests pass on RK3399, but fail on RK3288: dEQP-GLES2.functional.shaders.matrix.div.const_lowp_mat2_mat2_vertex dEQP-GLES2.functional.shaders.operator.unary_operator.pre_increment_effect.highp_ivec4_vertex dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2dprojlod_vec3 They reliably pass when run individually, but reliably fail when run in a full CI run. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-06-21 10:45:12 +02:00
Gert Wollny	ef4429d9c5	gallium/st: Add Gallium hud to swrast drivers Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-06-21 08:54:57 +02:00
Iago Toral Quiroga	4d8f82946b	v3d: flush jobs writing to vertex buffers used in the current draw call This can happen when any of our vertex buffers was written by a previous transform feedback draw. Fixes the following piglit tests: spec/ext_transform_feedback/position-render-bufferbase spec/ext_transform_feedback/position-render-bufferbase-discard spec/ext_transform_feedback/position-render-bufferoffset spec/ext_transform_feedback/position-render-bufferoffset-discard spec/ext_transform_feedback/position-render-bufferrange spec/ext_transform_feedback/position-render-bufferrange-discard Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-21 08:06:13 +02:00
Iago Toral Quiroga	eb44dcc219	v3d: flush jobs reading from transform feedback output buffers If we are about to write to a transform feedback buffer, we should make sure that we flush any prior work that intended to read from any of these buffers. Fixes piglit test: spec/ext_transform_feedback/immediate-reuse Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-21 08:06:13 +02:00
Iago Toral Quiroga	42572f2f7d	v3d: add a helper to check if transform feedback is enabled v2: We should be safe assuming that bind_vs != NULL (Eric) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-21 08:06:13 +02:00
Dave Airlie	00a56acc23	llvmpipe: make remove_shader_variant static. this isn't used outside this file. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-06-21 10:27:57 +10:00
Eric Engestrom	955c63d364	util/os_file: resize buffer to what was actually needed Fixes: `316964709e` "util: add os_read_file() helper" Reported-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-20 21:49:30 +00:00
Tomeu Vizoso	2743e34f20	panfrost: ci: Update expectations These tests have been fixed recently. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-06-20 20:57:41 +02:00
Alyssa Rosenzweig	195e297a92	panfrost/midgard: Broadcast swizzle Fixes regression in shaders using ball/etc by explicitly passing through the number of channels in the NIR op and broadcasting the last components of the channel appropriately, as the Midgard ops are all vec4 implicitly but NIR can be vec2/3. v2: Don't also regress every other swizzle in Equestria. v3: Don't regress the swizzles at Canterlot High either. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-06-20 20:52:04 +02:00
Kenneth Graunke	31de802e7e	iris: Use stream uploader for shader draw parameters. Most vertex data lives in user VBOs in IRIS_MEMZONE_OTHER, which typically have high bits set to 0xffff. The shader draw parameters were being uploaded in IRIS_MEMZONE_DYNAMIC, which have high bits set to 0x2. This was causing a lot of ping-ponging of high bits, leading to unnecessary VF cache flushing. Cuts 7.2% of the flushes in the Civilization VI demo on Kabylake GT2.	2019-06-20 13:32:16 -05:00
Kenneth Graunke	db8f57a5cb	iris: Don't check VF address high bits when there is no buffer. If there is no buffer, then it doesn't matter. Leave the old stale high bits in place (for next time) and don't bother invalidating. Cuts 5.6% of the flushes in the Civilization VI demo on Kabylake GT2.	2019-06-20 13:32:16 -05:00
Kenneth Graunke	ecc500398f	iris: Drop RT flushes from depth stencil clearing flushes. These write depth and stencil, not color writes, so there's no need to flush the render target.	2019-06-20 13:32:16 -05:00
Kenneth Graunke	1d63af0f2c	iris: Don't bother with PIPE_CONTROLs for CPU writes and no history If a buffer has no usage history, we don't have any read only cache invalidates to do. If we've written it with the CPU, we don't need to flush the render cache. The only bit remaining is the CS stall from iris_flush_bits_for_history. We can just skip the PIPE_CONTROL in this case. This is pretty common - an app creates a buffer, fills it with data, and then binds it for some purpose. Cuts 36% of the flushes in Manhattan 3.0 on Kabylake GT2.	2019-06-20 13:32:16 -05:00
Kenneth Graunke	dfff6e10b4	iris: Only do an RT flush for transfer maps if using copy_region. If we wrote the data via the CPU, there's no point in doing a render target flush. If using BLORP, we do want a render target flush so the data lands.	2019-06-20 13:32:15 -05:00
Kenneth Graunke	c4c17ab3ec	iris: Use iris_flush_bits_for_history in iris_transfer_flush_region Instead of using the combined iris_flush_and_dirty_for_history, use iris_flush_bits_for_history directly - we were already using the split out iris_dirty_for_history. There's no need to dirty twice, and we can avoid the looping altogether for non-buffers.	2019-06-20 13:32:15 -05:00
Kenneth Graunke	6890340c31	iris: Avoid double flushing in iris_transfer_flush_region when copying. My intention was to have iris_copy_region not do flushing, and leave that up to the callers. iris_resource_copy_region needs to do this, but iris_transfer_flush_region was already doing it. The net result was that we were doing it twice for transfers. So, move the flushing from iris_copy_region to iris_resource_copy_region so that it only happens in the callers as I intended.	2019-06-20 13:32:15 -05:00
Kenneth Graunke	64fb20ed32	iris: Fix iris_flush_and_dirty_history to actually dirty history. When I split iris_flush_and_dirty_history into two helper functions, I accidentally made it stop dirtying. Which was...sort of the point. Fixes: `21688a306b` iris: Split iris_flush_and_dirty_for_history into two helpers.	2019-06-20 13:32:15 -05:00
Kenneth Graunke	5e501ffeb2	iris: Add maybe_flush calls to texture_barrier and memory_barrier Otherwise, tests which loop on glMemoryBarrier may run us out of batch space with piles of flushing. (Ideally, we'd elide those bonus PIPE_CONTROLs, but presumably this isn't that common of a case...) Piglit's arb_pipeline_statistics_query-comp would hit this case after some of the next patches remove other PIPE_CONTROLs with maybe_flushes.	2019-06-20 13:32:15 -05:00
Kenneth Graunke	d4a4384b31	iris: Implement INTEL_DEBUG=pc for pipe control logging. This prints a log of every PIPE_CONTROL flush we emit, noting which bits were set, and also the reason for the flush. That way we can see which are caused by hardware workarounds, render-to-texture, buffer updates, and so on. It should make it easier to determine whether we're doing too many flushes and why.	2019-06-20 13:32:15 -05:00
Alyssa Rosenzweig	c378829a0d	panfrost: Skip shading unaffected tiles Looking at the scissor, we can discard some tiles. We specifically don't care about the scissor on the wallpaper, since that's a no-op if the entire tile is culled. v2: Clarify clear comment (not reviewed but trivial). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-06-20 09:30:38 -07:00
Eric Engestrom	65b016b146	glx: fix glvnd pointer types Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110709 Fixes: `22a9e00aab` ("glx: Implement the libglvnd interface.") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-20 17:21:37 +01:00
Eric Engestrom	e0ee790ba7	glx: drop misleading comment about the file being "generated" This `gen_scrn_dispatch.pl` has never existed, in the sense that NVIDIA never published it. There have been a number (6) of commits to fix various things in there over the years, and never anything from NVIDIA. For all intents and purposes this file is hand-written and hand-maintained, and we're on our own. Let's make this clear by removing this misleading comment. Suggested-by: Eric Anholt <eric@anholt.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-20 16:19:58 +00:00
Boris Brezillon	56434450f6	nir/lower_tex: Add an assert() in nir_lower_txs_lod() We don't expect the output of a TXS instruction to be wider than a vec3. Add an assert() to make sure this never happens. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 09:15:53 -07:00
Tomeu Vizoso	babc3ad291	panfrost: Set job requirements during draw Right now we are doing it at a moment when we don't have all the information we need. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Suggested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Rohan Garg <rohan.garg@collabora.com> Cc: Rohan Garg <rohan.garg@collabora.com> Fixes: `bfca21b622` ("panfrost: Figure out job requirements in pan_job.c")	2019-06-20 18:07:04 +02:00
Alyssa Rosenzweig	dc668203db	panfrost/meson: Link with libpanfrost_shared Fixes: `035a07c0` ("panfrost: Switch to lima tiling") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 08:56:38 -07:00
Hyunjun Ko	f7f8fb1b55	freedreno/ir3: fix typo Fixes: `a9b556d3a0` ("freedreno/ir3: check the type of regs of absneg opcode in is_same_type_mov") Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-20 08:34:09 -07:00
Alyssa Rosenzweig	546236e27f	panfrost: Load from tiled images Now that we have lima tiling code available, use it to load from a tiled source. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 08:22:38 -07:00
Alyssa Rosenzweig	035a07c0ae	panfrost: Switch to lima tiling Lima and Panfrost both have implementations of software tiling (the Lima one was forked off the Panfrost one which was forked off the original Lima one...). Switch to the most recent Lima code, since it's more complete than ours at this point. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 08:22:38 -07:00
Alyssa Rosenzweig	7b46f09f26	panfrost: Fix tiled NPOT textures with bpp<4 Panfrost's tiling routines (incorrectly) ignored the source stride, masking this bug; lima's routines respect this stride, causing issues when tiling NPOT textures whose stride is not a multiple of 64 (for instance, NPOT textures with bpp=1). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 08:22:38 -07:00
Alyssa Rosenzweig	413242277a	lima,panfrost: Move lima_tiling.c/h to /src/panfrost This will allow both drivers to share this code. Both drivers build-tested with meson. Android build not tested. v2: Change naming from tiling->shared, in case Lima and Panfrost can share more in the future. Fix Android build system. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-and-tested-by: Qiang Yu <yuq825@gmail.com>	2019-06-20 08:06:35 -07:00
Kenneth Graunke	c57b4c86c0	iris: Use render_batch/compute_batch locals in memory_barrier We have them, may as well use them.	2019-06-20 10:04:38 -05:00
Lionel Landwerlin	4a61be24fe	anv: only resort to sync fds internally with no syncobj support We can rely on only one kind of synchronization object (drm-syncobj) when it is available. This reduces the number of file descriptors we use in our implementation. This will be required later for timeline semaphores implementation, at this point we won't ever want to use anything else but syncobjs. v2: Only use has_syncobj for semaphores (Jason) v3: Only has_syncobj in assert on semaphores in QueueSubmit (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-20 14:59:51 +00:00
Alyssa Rosenzweig	1d7e53a854	panfrost: Remove other commented pointers Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 07:48:05 -07:00
Alyssa Rosenzweig	2608da14b9	panfrost/decode: Elide more zero fields Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 07:48:05 -07:00
Alyssa Rosenzweig	cfc2218a8c	panfrost/decode: Remove memory comments These do more harm than good at this point. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 07:48:04 -07:00
Alyssa Rosenzweig	8643b89c48	panfrost: Add missing 0x in invocation_count Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 07:48:04 -07:00
Alyssa Rosenzweig	b6d46d09c2	panfrost/decode: Skip decode of fragment backend in non-fragment This is all zero for anything but fragment shaders. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 07:48:04 -07:00
Alyssa Rosenzweig	ae2bfab7b7	panfrost/decode: Clip mali_compute_fbd at 64-bytes Looking at internal evidence (later fields including a literal other compute job inception-style, seeming memory corruption, no clear function, and the field after this being a pointer to itself), it looks like this is really a much smaller descriptor. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 07:48:04 -07:00
Alyssa Rosenzweig	3faf33488a	panfrost/decode: Print COMPUTE uniforms as pointers In OpenGL, uniforms generally represent fp32 vec4s (at least in highp mode). In OpenCL, they represent vec2s of 64-bit pointers. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 07:48:04 -07:00
Alyssa Rosenzweig	0021fae7f8	panfrost/decode: Show int uniforms Float is ambiguous. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 07:48:04 -07:00
Alyssa Rosenzweig	1f7dfee1b4	panfrost/decode: Expand pointers in compute descriptor Just as an aid. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 07:48:04 -07:00
Alyssa Rosenzweig	0aa5d89acb	panfrost/decode: Identify "compute FBD" There is fundamentally not a framebuffer associated with a compute job. Allocate a new structure for it so we don't mess up graphics when decoding. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 07:48:04 -07:00
Tomeu Vizoso	4f881237c3	panfrost: Allocate panfrost_job in panfrost_context Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 15:48:35 +02:00
Tomeu Vizoso	b5db7cce60	panfrost: Release transient pools Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 15:48:35 +02:00
Tomeu Vizoso	6cec937e22	panfrost: ci: Exclude flip-flops from results These tests are failing at times, blacklist for now: dEQP-GLES2.functional.fbo.render.shared_colorbuffer_clear.tex2d_rgba dEQP-GLES2.functional.fbo.render.shared_colorbuffer_clear.tex2d_rgb dEQP-GLES2.functional.shaders.matrix.mul.dynamic_highp_mat4_vec4_vertex Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-20 15:48:15 +02:00
Alejandro Piñeiro	6a159bca9d	util: add empty line before virgl options Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-06-20 15:21:39 +02:00
Alejandro Piñeiro	790c3dbac8	util: add missing DRI_CONF_OPT_END When DRI_CONF_GLES_EMULATE_BGRA was added for the virgl driver, it missed a DRI_CONF_OPT_END. This makes some drivers, like vc4/v3d, crash with the following error: Fatal error in __driConfigOptions line 99, column 2: mismatched tag. Not sure why it doesn't fail with virgl. Fixes: `b793663449` Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-20 14:11:30 +02:00
Eric Engestrom	a9e09d56a9	isl: tag unreachable path as such GCC should be able to figure out that all the possible enum values are exhausted in the switch() and all the branches return from the function, but apparently it doesn't, so let's tell the compiler explicitly. This gets rid of the following warnings in GCC 9: [1/24] Compiling C object 'src/intel/isl/60d23f8@@isl@sta/isl.c.o'. ../src/intel/isl/isl.c: In function ‘isl_surf_init_s’: ../src/intel/isl/isl.c:1569:10: warning: ‘array_pitch_el_rows’ may be used uninitialized in this function [-Wmaybe-uninitialized] 1569 \| surf = (struct isl_surf) { \| ~~~~~~^~~~~~~~~~~~~~~~~~~~~ 1570 \| .dim = info->dim, \| ~~~~~~~~~~~~~~~~~ 1571 \| .dim_layout = dim_layout, \| ~~~~~~~~~~~~~~~~~~~~~~~~~ 1572 \| .msaa_layout = msaa_layout, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1573 \| .tiling = tiling, \| ~~~~~~~~~~~~~~~~~ 1574 \| .format = info->format, \| ~~~~~~~~~~~~~~~~~~~~~~~ 1575 \| \| 1576 \| .levels = info->levels, \| ~~~~~~~~~~~~~~~~~~~~~~~ 1577 \| .samples = info->samples, \| ~~~~~~~~~~~~~~~~~~~~~~~~~ 1578 \| \| 1579 \| .image_alignment_el = image_align_el, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1580 \| .logical_level0_px = logical_level0_px, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1581 \| .phys_level0_sa = phys_level0_sa, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1582 \| \| 1583 \| .size_B = size_B, \| ~~~~~~~~~~~~~~~~~ 1584 \| .alignment_B = base_alignment_B, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1585 \| .row_pitch_B = row_pitch_B, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1586 \| .array_pitch_el_rows = array_pitch_el_rows, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1587 \| .array_pitch_span = array_pitch_span, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1588 \| \| 1589 \| .usage = info->usage, \| ~~~~~~~~~~~~~~~~~~~~~ 1590 \| }; \| ~ ../src/intel/isl/isl.c:1488:24: warning: ‘*((void *)&phys_total_el+4)’ may be used uninitialized in this function [-Wmaybe-uninitialized] 1488 \| struct isl_extent2d phys_total_el; \| ^~~~~~~~~~~~~ 
../src/intel/isl/isl.c:1335:38: warning: ‘phys_total_el’ may be used uninitialized in this function [-Wmaybe-uninitialized] 1335 \| isl_align_div(phys_total_el->w * tile_el_scale, \| ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~ ../src/intel/isl/isl.c:1488:24: note: ‘phys_total_el’ was declared here 1488 \| struct isl_extent2d phys_total_el; \| ^~~~~~~~~~~~~ Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-20 12:05:14 +00:00
Samuel Pitoiset	f179febde0	radv: enable DCC for mipmapped color textures on GFX8 It's tricky on GFX9, so only GFX8 for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-20 11:04:02 +02:00
Samuel Pitoiset	17f94e1984	radv: do not fast clears if one level can't be fast cleared And fallback to slow color clears. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-20 11:03:58 +02:00
Samuel Pitoiset	450bce522a	radv: add fast clears support for mipmapped color images with DCC Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-20 11:03:57 +02:00
Samuel Pitoiset	fa903ba799	radv: add radv_dcc_clear_level() helper For clearing only one level. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-20 11:03:53 +02:00
Samuel Pitoiset	b92d87f7f0	radv: re-initialize DCC metadata after decompressing using compute Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-20 11:03:52 +02:00
Samuel Pitoiset	dc6e3053a7	radv: initialize levels without DCC during layout transitions Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-20 11:03:49 +02:00
Thomas Hellstrom	71b43490dd	svga: Support ARB_buffer_storage This basically boils down to supporting persistent and coherent buffer storage. We chose to use coherent buffer storage for all persistent buffers even if it's not explicitly specified, since using glMemoryBarrier to obtain coherency would be particularly expensive in our driver stack, and require a lot of additional bookkeeping. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-06-20 09:30:22 +02:00
Thomas Hellstrom	8c01e5ed5f	gallium/util: Make it possible to disable persistent maps in the upload manager For svga, the use of persistent / coherent maps is typically slightly slower than without them. It's probably a bit case-dependent and possible to tune, but for now, make sure we can disable those. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-06-20 09:30:22 +02:00
Thomas Hellstrom	3b828c4e68	svga: Map vertex- index- and constant buffers asynchronously when reading With SWTNL and index translation we're mapping buffers for reading. These buffers are commonly upload_mgr buffers that might already be referenced by another submitted or unsubmitted GPU command. A synchronous map will then trigger a flush and sync, at least on Linux that doesn't distinguish between read- and write referencing. So map these buffers async. If they for some obscure reason happen to be dirty (stream-output, buffer-copy), the resource_buffer code will read-back and sync anyway. For persistent / coherent buffers a corresponding read-back and sync will happen in the kernel fault handler. Testing: Piglit quick. No regressions. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-06-20 09:30:22 +02:00
Thomas Hellstrom	f51915ba62	svga: Fix index buffer uploads In the case of SWTNL and index translation we were uploading index buffers and then reading out from them using the CPU. Furthermore, when translating indices we often cached the results with an upload_mgr buffer, causing the cached indexes to be immediately discarded on the next write to that upload_mgr buffer. Fix this by only uploading when we know the index buffer is going to be used by hardware. If translating, only cache translated indices if the original buffer was not a user buffer. In the latter case when we're not caching, use an upload_mgr buffer for the hardware indices. This means we can also remove the SWTNL hand-crafted index buffer upload mechanism in favour of the upload_mgr. Finally avoid using util_upload_index_buffer(). It wastes index buffer space by trying to make sure that the offset of the indices in the upload_mgr buffer is larger or equal to the position of the indices in the source buffer. From what I can tell, the SVGA device does not require that. Testing done: Piglit quick. No regressions. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-06-20 09:30:22 +02:00
Thomas Hellstrom	4f59d51d82	winsys/svga: Make it possible to specify coherent resources Add a flag in the surface cache key and a winsys usage flag to specify coherent memory. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-06-20 09:30:22 +02:00
Thomas Hellstrom	4412be40dd	gallium/util: Make u_debug_flush support persistent maps Previously unsynchronized maps have been assumed to also be persistent. Now distinguish between persistent and unsynchronized maps and also support PIPE_TRANSFER_PERSISTENT from ARB_buffer_storage. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-06-20 09:30:22 +02:00
Gert Wollny	a478e56fbd	virgl: Add debug flag to bypass driconf to enable the BGRA tweaks This is useful for testing, also because with vtest the dri configuration is not read. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-06-20 08:50:38 +02:00
Gert Wollny	5dbecf7863	virgl: Add a tweak to set the value for emulated queries of GL_SAMPLES_PASSED On GLES hosts GL_SAMPLES_PASSED is emulated by GL_ANY_SAMPLES_PASSED which returns a boolean. With this tweak the value that is returned if any sample passed can be set. This may be of interest when an application decides whether some geometry is rendered based on an amount of visibility and not just a binary decision. virglrenderer sets a default of 1024 on the host. v2: Remove reference from virgl and correct description (Emil) v3: Send the tweak binary encoded instead of using strings (Gurchetan) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-06-20 08:50:38 +02:00
Gert Wollny	59757dbad6	virgl: Add tweak to apply a swizzle when drawing/blitting to an emulated BGRA texture With Qemu this final swizzle is not needed, but with vtest it is, i.e. it depends on how a program using virglrenderer uses the surface that is rendered to, hence a tweak is added. v2: Update description and fix spelling (Emil) v3: Send tweak as binary value instead of using strings (Gurchetan) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-06-20 08:50:38 +02:00
Gert Wollny	b793663449	virgl: Add driconf tweak for emulating BGRA surfaces on GLES These tweaks are used to fix rendering issues with Valve games and at least also "The Raven Remastered" when run on a GLES host. v2: Fix type in define and remove virgl from driconf option (Emil) v3: Encode tweak binary instead of using strings (Gurchetan) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-06-20 08:50:38 +02:00
Gert Wollny	13d4a34c44	virgl: Add override for BGRA format to use swizzled SRGB format Tie in the check whether the host supports tweaks and whether this tweak is enabled. v2: Add comment about the emulated formats not being used directly in the guest (Gurchetan) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-06-20 08:50:38 +02:00
Gert Wollny	22edafb239	virgl: Add code to accept BGRx_SRGB as RGBx_SRGB This will be enabled in later patches by the emulation tweak. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-06-20 08:50:38 +02:00
Gert Wollny	d8967b7951	virgl: Add skeleton to evaluate cap and send tweaks Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-06-20 08:50:38 +02:00
Gert Wollny	28dc096e15	virgl: factor out format host bits check This will make it a single location when we want to replace a format. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-06-20 08:50:38 +02:00
Gert Wollny	30eb1fdc51	gallium/virgl: Add code path for virgl to read driconf This works only for the drm variant of virgl and not for the vtest variant. v2: Rebase, replace the configuration query function by a pointer to the configuration data. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-06-20 08:50:38 +02:00
Gert Wollny	cf800998af	virgl: Add driinfo file and tie it into the build Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-06-20 08:50:37 +02:00
Caio Marcelo de Oliveira Filho	9b0720c436	glspirv: Call pass to lower frexp instructions These were previously handled by the spirv_to_nir, but that changed to be an explicit pass in `23d30f4099` "spirv,nir: lower frexp_exp/frexp_sig inside a new NIR pass" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-19 22:07:57 -07:00
Caio Marcelo de Oliveira Filho	12131096fa	spirv: Restrict use of descriptor intrinsics to Vulkan In ARB_gl_spirv we'll be able to use variables for uniform buffers, so don't use the descriptor intrinsics to lower the block access. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-19 22:07:51 -07:00
Nicolai Hähnle	21dd881416	ac/rtld: report better error messages for LDS overallocation Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-19 20:30:32 -04:00
Marek Olšák	b64bd5887e	ac/rtld: check correct LDS max size Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-19 20:30:32 -04:00
Nicolai Hähnle	1ee0f0d315	radeonsi: add s_sethalt to shaders for debugging Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-19 20:30:32 -04:00
Nicolai Hähnle	87182200c7	ac/rtld: fix sorting of LDS symbols by alignment Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-06-19 20:30:32 -04:00
Bas Nieuwenhuizen	d1c04835ab	meson: Allow building radeonsi with just the android platform. Just as was allowed by autotools. Fixes: `108d257a16` "meson: build libEGL" Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-19 23:42:49 +00:00
Bas Nieuwenhuizen	755c633b8d	anv: Fix vulkan build in meson. Apparently the android part was never ported to meson. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-19 23:27:46 +00:00
Bas Nieuwenhuizen	4c300bd328	radv: Fix vulkan build in meson. Apparently the android part was never ported to meson. CC: <mesa-stable@lists.freedesktop.org> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-19 23:27:46 +00:00
Jason Ekstrand	ef323d02d8	anv/image: Set different usage flags for shadow surfaces For the block BLOCK_TEXEL_VIEW_COMPATIBLE case, this didn't matter because the flags were already more-or-less what we wanted. However, for gen7 stencil shadow images, it still had ISL_SURF_USAGE_STENCIL_BIT so we were getting W-tiled which isn't what we want for the shadow. By passing just ISL_SURF_USAGE_TEXTURE_BIT (and CUBE if we care), we now get something that's actually texturable. Fixes: `f3ea0cf828` "anv: Add stencil texturing support for gen7"	2019-06-19 22:21:46 +00:00
Jason Ekstrand	215f9f83f5	anv: Flush caches in anv_image_copy_to_shadow Copies to a shadow image happen during a VkCmdPipelineBarrier or at subpass transitions. We could potentially be a bit more conservative but these transitions shouldn't happen often and it's better to have our bases covered. Fixes: `f3ea0cf828` "anv: Add stencil texturing support for gen7"	2019-06-19 22:21:46 +00:00
Jason Ekstrand	81e51b412e	nir: Make nir_constant a vector rather than a matrix Most places in NIR, we treat matrices like arrays. The one annoying exception to this has been nir_constant where a matrix is a first-class thing. This commit changes that so a matrix nir_constant is the same as an array nir_constant. This makes matrix nir_constants a tiny bit more expensive but shrinks all others by 96B. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 21:05:54 +00:00
Jason Ekstrand	b019fe8a5b	glsl/nir: Fix handling of 64-bit values in uniform storage Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 21:05:54 +00:00
Jason Ekstrand	a54e397152	spirv: Only copy needed components for OpSpecConstantOp Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 21:05:54 +00:00
Jason Ekstrand	96bb9c9277	spirv: Use a single path for OpSpecConstantOp of OpVectorShuffle Now that nir_const_value is a scalar, there's no reason why we need multiple paths here and it's just extra paths to keep working. While we're here, we also add a vtn_fail_if check that component indices are in-bounds. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 21:05:54 +00:00
Jason Ekstrand	280e5442e5	spirv: Use vtn_constant_uint() for array lengths and gather components Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 21:05:54 +00:00
Jason Ekstrand	aa11c2e75e	spirv: Add a vtn_constant_int helper Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 21:05:54 +00:00
Jason Ekstrand	93f4aa9889	glsl/types: Add a real is_integer helper Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 20:28:52 +00:00
Jason Ekstrand	f0920e266c	glsl/types: Rename is_integer to is_integer_32 It only accepts 32-bit integers so it should have a more descriptive name. This patch should not be a functional change. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 20:28:52 +00:00
Jason Ekstrand	21a7e6d569	glsl/types: Ignore bit sizes in contains_integer() All of the callers for this function are looking at interpolation qualifiers and want to make sure they're declared flat. Any 64-bit integer inputs need to be flat. It also makes the function make more sense since "integer" is fairly generic. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 20:28:52 +00:00
Jason Ekstrand	0d1fb380b1	glsl/types: Handle all bit sizes in glsl_type_is_integer All of the callers of this function really just want to know if the type is an integer and don't care about bit size. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-19 20:28:52 +00:00
Caio Marcelo de Oliveira Filho	feb0cdcb52	glsl/nir_opt_access: Update uniforms correctly when only vars change Even if only variables access flags are changed, the existing NIR infrastructure expects metadata to be explicitly preserved, so do that. Don't care about avoiding preserve to be called twice since the cost is negligible. This scenario can be triggered by dead variables, and also by other intrinsics that read the variables -- but not cause progress to be made when processing the intrinsics. Fixes: `f2d0e48ddc` "glsl/nir: Add optimization pass for access flags" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-19 12:50:41 -07:00
Caio Marcelo de Oliveira Filho	d7ea433a5f	glsl/nir: Fix getting the sampler dim when arrays are involved Unwrap any array in the variable type so we can get the sampler dim. This fixes piglit test spec/arb_arrays_of_arrays/execution/image_store/basic-imageStore-const-uniform-index.shader_test. Fixes: `f2d0e48ddc` "glsl/nir: Add optimization pass for access flags" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-19 12:50:39 -07:00
Jory Pratt	10e8d46601	meson: Search for execinfo.h Rather than checking __GLIBC__/__UCLIBC__ macros as a proxy for execinfo.h presence, just check directly. This allows the build to work on musl. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-19 12:16:18 -07:00
Jory Pratt	fd7b7f14d8	util: Heap-allocate 256K zlib buffer The disk cache code tries to allocate a 256 Kbyte buffer on the stack. Since musl only gives 80 Kbyte of stack space per thread, this causes a trap. See https://wiki.musl-libc.org/functional-differences-from-glibc.html#Thread-stack-size (In musl-1.1.21 the default stack size has increased to 128K) [mattst88]: Original author unknown, but I think this is small enough that it is not copyrightable. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-19 12:16:18 -07:00
Kenneth Graunke	9c19d07b1c	anv: Fix wrong printf formatter %lu is for unsigned long, %zu is for size_t. Just cast the data.	2019-06-19 11:57:01 -05:00
Kenneth Graunke	bbbf7a538c	iris: Bail on queries for INTEL_NO_HW=1. We don't execute any of the commands to record snapshots, so we can't actually produce a real result. We do however need to avoid waiting on a syncpt which will never be signalled. So, just return 0.	2019-06-19 11:55:43 -05:00
David Riley	11e74daae5	virgl: Support VIRGL_BIND_SHARED Support a new virgl bind type for shared buffers. Signed-off-by: David Riley <davidriley@chormium.org> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2019-06-19 07:28:47 -07:00
Lionel Landwerlin	bc62673dce	anv: write spirv-nir logs back to the application Using the existing VK_EXT_debug_report extension. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-19 15:45:52 +03:00
Connor Abbott	53a7649e5d	ac/nir: Set speculatable for buffer loads where allowed This brings the nir path in line with the TGSI path. Totals from affected shaders: SGPRS: 2984 -> 2984 (0.00 %) VGPRS: 2792 -> 2652 (-5.01 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 247380 -> 248072 (0.28 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 121 -> 132 (9.09 %) Wait states: 0 -> 0 (0.00 %) Most of the change came from DiRT: Showdown, and came from sinking SSBO loads. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:28 +02:00
Connor Abbott	77be5b2f88	nir: Use reorderable access flag No changes with radeonsi shader-db. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:28 +02:00
Connor Abbott	a1c737927c	nir: Add a helper to determine if an intrinsic can be reordered This is simple now, but we're going to be adding a few more conditions to this later. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:28 +02:00
Connor Abbott	6fc83c253f	st/nir: Use gl_nir_opt_access Nothing uses its results yet, that will come with the following commits. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:28 +02:00
Connor Abbott	f2d0e48ddc	glsl/nir: Add optimization pass for access flags Right now, this just deduces when we can arbitrarily reorder SSBO and image loads, matching the existing logic in radeonsi's TGSI->LLVM pass. This approach can't handle some things that nir_opt_copy_prop_vars can, but it can handle images, and with GCM it lets us hoist reads outside of loops. We can also pass this information to LLVM which lets it do its own optimizations on it. This is GLSL only as I haven't tested it on Vulkan yet, and it would probably need a few changes to work there. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:28 +02:00
Connor Abbott	c813c5776d	nir: Add reorderable memory access enum Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:28 +02:00
Connor Abbott	75063fbac5	nir/copy_prop_vars: Ignore volatile accesses The spec explicitly says that volatile writes can't be removed and volatile reads do not guarantee that the same value will still be around after the read, as if there were a barrier after each read/write. Just ignore them. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:28 +02:00
Connor Abbott	364996d70d	glsl/nir: Propagate access qualifiers We were completely ignoring these before, except for putting them on variables. While we're here, don't set access qualifiers when converting to bindless since glsl_to_nir will already have set a more accurate qualifier that includes any qualifiers on struct members that are dereferenced. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:27 +02:00
Connor Abbott	6f20643b47	nir: Allow qualifiers on copy_deref and image instructions In the next commit, we'll properly handle access qualifiers on struct members by propagating them to load/store instructions, but these instructions had no way to specify the qualifier. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-19 14:08:27 +02:00
Connor Abbott	3bf8981c51	ac,radeonsi: Always mark buffer stores as inaccessiblememonly inaccessiblememonly means that it doesn't modify memory accessible via normal LLVM pointers. This lets LLVM's dead store elimination, memcpy forwarding, etc. ignore functions with this attribute. We don't represent descriptors as pointers, so this property is always true of buffer and image stores. There are plans to represent descriptors via pointers, but this just means that now nothing is inaccessiblememonly, as LLVM will then understand loads/stores via its usual alias analysis. Radeonsi was mistakenly only setting it if the driver could prove that there were no reads, and then it was cargo-culted into ac_llvm_build and ac_llvm_to_nir. Rip it out of everything. statistics with nir enabled: Totals from affected shaders: SGPRS: 152 -> 152 (0.00 %) VGPRS: 128 -> 132 (3.12 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 9324 -> 9244 (-0.86 %) bytes LDS: 2 -> 2 (0.00 %) blocks Max Waves: 17 -> 17 (0.00 %) Wait states: 0 -> 0 (0.00 %) The only difference was a manhattan31 shader. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-19 14:08:27 +02:00
Eric Engestrom	4db2c1e2fe	egl: add missing #include close() is in <unistd.h> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-19 12:05:58 +00:00
Samuel Pitoiset	0a313cc285	radv: disable viewport clamping even if FS doesn't write Z This fixes new CTS dEQP-VK.pipeline.depth_range_unrestricted.*. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-19 11:18:50 +02:00
Samuel Pitoiset	e91c1ea06c	radv: implement compressed FMASK texture reads with RADV_PERFTEST=tccompatcmask This allows us to disable the FMASK decompress pass when transitioning from CB writes to shader reads. This will likely be improved and enabled by default in the future. No CTS regressions on GFX8 but a small number of multisample CTS failures on GFX9 (they look related to the small hint). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-19 10:06:39 +02:00
Samuel Pitoiset	a7f75377ab	radv: fix FMASK expand with SRGB formats Found while working on DCC for MSAA. Fixes: `6b976024a8` ("radv: add support for FMASK expand") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-19 07:53:53 +02:00
Tomeu Vizoso	0fcf73bc2d	panfrost: Move to use ralloc for some allocations We have some serious leaks, so plug some and also move to ralloc to limit the lifetime of some objects to that of their parent. Lots more such work to do. For some reason, this fixes: dEQP-GLES2.functional.lifetime.attach.deleted_output.texture_framebuffer Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-19 07:34:15 +02:00
Mathias Fröhlich	5743a36b2b	egl: Don't add hardware device if there is no render node v2. Do not offer a hardware drm backed egl device if no render node is available. The current implementation will fail on this egl device. On top of that, it issues a warning that is actually misleading. There are more error paths that can fail on the way to a hardware backed egl device. Fixing all of them would require opening the drm device to see if there is a usable driver associated with the device. The approach taken here avoids a full probe and fixes at least this kind of problem on kvm virtualization hosts I observe here. Fixes: `dbb4457d98` ("egl: add EGL_EXT_device_drm support") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-06-19 07:17:23 +02:00
Christian Gmeiner	8dd26fa2f0	etnaviv: support GL_ARB_seamless_cubemap_per_texture Passes spec@amd_seamless_cubemap_per_texture@amd_seamless_cubemap_per_texture Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-By: Guido Günther <agx@sigxcpu.org>	2019-06-19 00:39:50 +02:00
Christian Gmeiner	a13efb3cdb	etnaviv: update headers from rnndb Update to etna_viv commit `a3bf0da`. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-19 00:39:50 +02:00
Dave Airlie	378ea92bf6	radeonsi: fix undefined shift in macro definition Pointed out by coverity Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-19 08:32:36 +10:00
Dave Airlie	93ba356544	nouveau: fix frees in unsupported IR error paths. This is pointless in that we won't ever hit those paths in real life, but coverity complains. Fixes: `f014ae3c7c` ("nouveau: add support for nir") Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-06-19 08:32:19 +10:00
Rohan Garg	ad284f794c	panfrost: Move clearing logic into pan_job Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 12:32:43 -07:00
Chia-I Wu	98eda99ab8	virgl: fix sync issue regarding discard/unsync transfers GL_MAP_INVALIDATE_BUFFER_BIT cannot be treated as GL_MAP_INVALIDATE_RANGE_BIT naively. When we run into ptr = glMapBufferRange(buf, 0, size, GL_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT); memcpy(ptr, data1, size); glUnmapBuffer(buf); ptr = glMapBufferRange(buf, size, size, GL_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT); memcpy(ptr, data2, size); glUnmapBuffer(buf); we never want data1 to be copy_transfer'ed. Because that would mean that data2 might overwrite valid data. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Fixes: `a22c5df079` ("virgl: Use buffer copy transfers to avoid waiting when mapping") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-18 10:38:21 -07:00
Alyssa Rosenzweig	2a717f300b	panfrost: Enable sRGB Now that sRGB formats are supported for both rendering and sampling, advertise support. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 09:59:29 -07:00
Alyssa Rosenzweig	5aa51ba97f	panfrost: Disable AFBC on sRGB buffers The performance impact is slightly mitigated by tiling the render target, but it's undeniably still slow compared to AFBC. Unfortunately, it doesn't look like AFBC and sRGB play nice... Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 09:59:29 -07:00
Alyssa Rosenzweig	6585bb9f52	panfrost: Enable sRGB fixed-function blending For fixed-function, we have hardware to handle sRGB so we just set a flag. For blend shaders, it's rather more involved; this is currently unimplemented. Assert it out for now; we don't need it quite yet. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 09:59:29 -07:00
Alyssa Rosenzweig	4b137da409	panfrost: Specify sRGB in the render target Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 09:59:29 -07:00
Alyssa Rosenzweig	58c34e4a6c	panfrost: Implement sRGB texturing Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 09:59:29 -07:00
Alyssa Rosenzweig	31a4ef847c	panfrost: Add sRGB render target flag Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 09:59:29 -07:00
Alyssa Rosenzweig	01e1eecb95	panfrost: Implement tiled rendering We already can sample from Mali's linear/tiled encoding (the one from Utgard -- AFBC is mostly unrelated); let's be able to render to it as well. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 09:59:29 -07:00
Alyssa Rosenzweig	d50795109b	panfrost: Decode rendering block type A mode for rendering tiled/uncompressed was noticed, so we reshuffle the MFBD render target definitions to explicitly include block type. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 09:59:28 -07:00
Alyssa Rosenzweig	83c02a5ea9	panfrost: Refactor texture targets This combines the two cmdstream bits "is_3d" and "is_not_cubemap" into a single 2-bit texture target selection, noticing it's the same as the 2-bit selection in Midgard and Bifrost texturing ops. Accordingly, we share this definition and add the missing entry for 1D/buffer textures. This requires a nontrivial (but functionally similar) refactor of all parts of the driver to use the new definitions appropriately. Theoretically, this should add support for buffer textures, but that's obviously not tested and probably wouldn't work. While doing so, we notice the sRGB enable bit, which we document and decode as well here so we don't forget about it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 09:59:28 -07:00
Rohan Garg	bfca21b622	panfrost: Figure out job requirements in pan_job.c Requirements for a job should be figured out in pan_job.c v2: [Alyssa] Fix early return Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 09:52:20 -07:00
Rohan Garg	debb85d1ec	panfrost: Reset job counters once the job is submitted Move the reset out of frame invalidation into job submission Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 09:52:20 -07:00
Rohan Garg	0f43a2ae8a	panfrost: Initial implementation of panfrost_job_submit Start fleshing out panfrost_job v2: [Alyssa: Remove unused variable, warning introduced] Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 09:52:01 -07:00
Gurchetan Singh	2daf3d8215	virgl_hw: add YUV support Add corresponding entries from p_format.h Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-18 09:18:58 -07:00
Gurchetan Singh	2480ce802a	virgl: sync to virglrenderer virgl_hw.h It's nice to keep these two files in sync, as they define guest userspace <---> host userspace communication. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-18 09:18:48 -07:00
Jason Ekstrand	58cb865313	anv: Make border colors the right size and alignment on HSW Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-18 16:07:08 +00:00
Lionel Landwerlin	51076eb87c	imgui: bump imgui memory editor copy Getting rid of a compiler warning : In file included from ../src/intel/tools/aubinator_viewer.cpp:225: ../src/imgui/imgui_memory_editor.h: In member function ‘void MemoryEditor::DisplayPreviewData(size_t, const u8, size_t, MemoryEditor::DataType, MemoryEditor::DataFormat, char, size_t) const’: ../src/imgui/imgui_memory_editor.h:637:16: warning: enumeration value ‘DataType_COUNT’ not handled in switch [-Wswitch] switch (data_type) ^ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-18 15:34:13 +00:00
Alyssa Rosenzweig	9402970751	panfrost/midgard: Enable autovectorization Enable nir_opt_vectorize. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 06:44:13 -07:00
Connor Abbott	47e7c6961a	nir: add a vectorization pass This effectively does the opposite of nir_lower_alus_to_scalar, trying to combine per-component ALU operations with the same sources but different swizzles into one larger ALU operation. It uses a similar model as CSE, where we do a depth-first approach and keep around a hash set of instructions to be combined, but there are a few major differences: 1. For now, we only support entirely per-component ALU operations. 2. Since it's not always guaranteed that we'll be able to combine equivalent instructions, we keep a stack of equivalent instructions around, trying to combine new instructions with instructions on the stack. The pass isn't comprehensive by far; it can't handle operations where some of the sources are per-component and others aren't, and it can't handle phi nodes. But it should handle the more common cases, and it should be reasonably efficient. [Alyssa: Rebase on latest master, updating with respect to typeless moves] Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-18 06:43:30 -07:00
Boris Brezillon	c3558868da	panfrost: Add support for TXS instructions This patch adds support for nir_texop_txs instructions which are needed to support the OpenGL textureSize() function. This is also needed to support RECT texture sampling which is currently lowered to 2D sampling + a TXS() instruction by the nir_lower_tex() helper. Changes in v2: * Split options for the 1st and 2nd tex lowering passes Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 06:36:07 -07:00
Boris Brezillon	5c17f84ae2	panfrost: Prepare things to support non-native texture ops We are about to add support for the TXS (texture size) op which is not implemented using a midgard texture instruction. Let's rename emit_tex() into emit_texop_native() and repurpose emit_tex() as a dispatcher. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 06:36:07 -07:00
Boris Brezillon	c57f7d0f15	panfrost: Move sysval upload logic out of panfrost_emit_for_draw() We're about to add more sysval types, and panfrost_emit_for_draw() is big enough, so let's move the sysval upload logic into a separate function. We also add one sub-function per sysval type to keep panfrost_upload_sysvals() small/readable. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 06:36:07 -07:00
Boris Brezillon	bd49c8f0eb	panfrost: Make the sysval logic more generic We are about to add support for nir_texop_txs which requires adding a sysval/uniform containing the texture size. Let's change the emit_sysval_read() prototype to take a nir_instr object instead of a nir_intrinsic_instr one so we can re-use this function when emitting a sysval for a txs instruction. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 06:36:07 -07:00
Boris Brezillon	296c5fd25d	nir/lower_tex: Add a way to lower TXS(non-0-LOD) instructions The V3D driver has an open-coded solution for this, and we need the same thing for Panfrost, so let's add a generic way to lower TXS(LOD) into max(TXS(0) >> LOD, 1). Changes in v2: * Use == 0 instead of ! * Rework the minification logic as suggested by Jason * Assign cursor pos at the beginning of the function * Patch the LOD just after retrieving the old value Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 06:36:07 -07:00
Boris Brezillon	0e489fd360	nir/lower_tex: Update ->sampler_dim value before calling get_texture_size() get_texture_size() will create a txs instruction with ->sampler_dim set to the original tex->sampler_dim. The condition to call lower_rect() only checks the value of ->sampler_dim and whether lower_rect is requested or not. This leads to an infinite loop when calling nir_lower_tex() with the same options until it returns false. In order to avoid that, let's move the tex->sampler_dim patching before get_texture_size() is called. This way the txs instruction will have ->sampler_dim set to GLSL_SAMPLER_DIM_2D and nir_lower_tex() won't try to lower it on the subsequent passes. Changes in v2: * Add Jason R-b * Add a comment explaining why we patch ->sampler_dim at the beginning of the lower_rect() func Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 06:36:07 -07:00
Boris Brezillon	352b1d9c31	nir/lower_tex: Actually report when projector lowering happened The code considers that projector lowering was done even if it's not really the case. Change the project_src() prototype to return a bool encoding whether projector lowering happened or not and update the progress var accordingly in nir_lower_tex_block(). --- Changes in v2: * Add Jason R-b * Drop the part suggesting that nir_lower_rect() could be called in a do-while(progress) loop. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 06:36:07 -07:00
Tomeu Vizoso	6f60fec48f	panfrost: Adapt to constant name change in UABI We hadn't updated the kernel header after the driver got into mainline. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 15:26:08 +02:00
Tomeu Vizoso	5ad5777f89	panfrost: ci: Update results Alyssa fixed some failing tests last night. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-18 15:25:01 +02:00
Samuel Pitoiset	c16bf48bfc	radv: adjust the DCC base VA for mipmapped color attachments Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-18 12:24:26 +02:00
Samuel Pitoiset	6ee40efd02	radv: fix color decompressions for FMASK/CMASK Only skip levels without DCC when it's a DCC decompression. Whoops. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-18 12:09:04 +02:00
Samuel Pitoiset	42a41a9e4a	radv: do not decompress levels without DCC with the graphics path Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-18 11:24:50 +02:00
Samuel Pitoiset	e8917dcadb	radv: do not decompress levels without DCC with the compute path Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-18 11:24:41 +02:00
Samuel Pitoiset	864ddda8a3	radv: check if DCC is enabled per mip not for the whole image In other words, make use of radv_dcc_enabled() instead of radv_image_has_dcc() all over the place. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-18 11:24:36 +02:00
Iago Toral Quiroga	79a30543ee	v3d: implement simultaneous peripheral access exceptions for V3D 4.1+ Shader-db results: total instructions in shared programs: 9117550 -> 9102719 (-0.16%) instructions in affected programs: 1752873 -> 1738042 (-0.85%) helped: 7076 HURT: 478 helped stats (abs) min: 1 max: 22 x̄: 2.19 x̃: 2 helped stats (rel) min: 0.07% max: 13.89% x̄: 1.70% x̃: 1.07% HURT stats (abs) min: 1 max: 7 x̄: 1.41 x̃: 1 HURT stats (rel) min: 0.09% max: 10.17% x̄: 0.86% x̃: 0.54% 95% mean confidence interval for instructions value: -2.00 -1.92 95% mean confidence interval for instructions %-change: -1.58% -1.50% Instructions are helped. total max-temps in shared programs: 1327774 -> 1327728 (<.01%) max-temps in affected programs: 1025 -> 979 (-4.49%) helped: 47 HURT: 2 helped stats (abs) min: 1 max: 2 x̄: 1.02 x̃: 1 helped stats (rel) min: 2.63% max: 20.00% x̄: 7.67% x̃: 5.26% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 4.17% max: 4.17% x̄: 4.17% x̃: 4.17% 95% mean confidence interval for max-temps value: -1.06 -0.82 95% mean confidence interval for max-temps %-change: -8.89% -5.49% Max-temps are helped. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-18 08:09:03 +02:00
Iago Toral Quiroga	6d97c8fac1	v3d: only flush jobs accessing the query BO when reading query results Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-18 08:09:03 +02:00
Iago Toral Quiroga	5491883a9a	v3d: add a helper function to flush jobs using a BO v2: use _mesa_set_search() (Eric) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-18 08:09:03 +02:00
Kenneth Graunke	e8cd7a30d5	iris: Support more RGBX pipe formats. Without them, the state tracker falls back to an RGBA format, but it doesn't always manage to override the swizzle for us. So we lose the information that the API expects an X channel, where alpha is garbage and reads back as 1. We have no equivalent ISL RGBX format for these, so we just use RGBA directly and override the swizzle in all cases.	2019-06-17 21:52:38 -05:00
Kenneth Graunke	3c10a2726b	glsl: Fix out of bounds read in shader_cache_read_program_metadata The VaryingNames array has NumVaryings entries. But BufferStride is a small array of MAX_FEEDBACK_BUFFERS (4) entries. Programs with more than 4 varyings would read out of bounds. Also, BufferStride is set based on the shader itself, which means that it's inherently already included in the hash, and doesn't need to be included again. At the point when shader_cache_read_program_metadata is called, the linker hasn't even set those fields yet. So, just drop it entirely. Fixes valgrind errors in KHR-GL45.transform_feedback.linking_errors_test. Fixes: `6d830940f7` glsl/shader_cache: Allow shader cache usage with transform feedback Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-06-17 21:22:19 -05:00
Jason Ekstrand	9672b7044c	anv: Set STATE_BASE_ADDRESS upper bounds on gen7 This should fix floating-point border color on all gen7 HW. Integer is still thoroughly busted on gen7 because it doesn't exist on IVB and it's crazy on HSW. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-17 18:53:07 -05:00
Bas Nieuwenhuizen	925c04b4c7	radv: Disable linear tiled compressed textures. Support got removed in the new addrlib update. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-18 01:00:49 +02:00
Jason Ekstrand	1be38f9178	anv: Use VK_EXT_separate_stencil_usage to avoid stencil shadows on gen7 Whenever stencil texturing is not required (most of the time), we can use VK_EXT_separate_stencil_usage to only create the shadow image when VK_IMAGE_USAGE_SAMPLED_BIT is required for stencil. Of course, this depends on applications to use the extension but hopefully DXVK and similar translators are doing so and that covers most of the apps. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-17 22:32:26 +00:00
Jason Ekstrand	f3ea0cf828	anv: Add stencil texturing support for gen7 Intel hardware didn't get support for sampling from W-tiled (required for stencil) images until Broadwell so we can't directly sample from stencil. Instead, if we want to support stencil texturing on gen7 hardware, we have to keep a texture-capable shadow copy around and use BLORP to update when stencil changes. The one thing this commit does not implement is self-dependencies with stencil input attachments. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99493 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-17 22:32:26 +00:00
Jason Ekstrand	4faa3145b1	anv/blorp: Update shadow images when clearing or uploading Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-17 22:32:26 +00:00
Jason Ekstrand	2b736d9e6c	anv/cmd_buffer: Add a stencil transition helper Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-17 22:32:26 +00:00
Jason Ekstrand	86fc268142	anv/blorp: Take an aspect in anv_image_copy_to_shadow Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-17 22:32:26 +00:00
Jason Ekstrand	fcbefe013a	anv/formats: Re-arrange the way we set some flag bits Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-17 22:32:26 +00:00
Kenneth Graunke	659d4f613e	iris: Make resource_copy_region handle packed depth-stencil resources. Also copy along the separate stencil buffer if needed. Fixes Piglit's arb_copy_image-formats.	2019-06-17 17:29:09 -05:00
Kenneth Graunke	a36f1542ae	iris: Order CS stall and TC invalidate for format reinterpretation hacks This should ensure the TC invalidate happens after the stall. Fixes KHR-GL43.copy_image.functional which does a CopyImage (blorp_copy) from a buffer (using R8G8B8A8_UINT), then GetTexImage to read back the original image (using R10G10B10A2_UNORM).	2019-06-17 16:38:08 -05:00
Kenneth Graunke	94b9f50e63	iris: Be more aggressive at post-format-reinterpret TC invalidate hack When copying/blitting with format reinterpretation, we invalidate the texture cache before/after. Before is so the source of the copy works, and after is to get rid of our new data in the "wrong" format to protect future attempts to sample. When I ported these hacks to iris, I tried to be cautious by only bothering with the hacks if the batch referenced the BO. This makes some sense for the before case. If it isn't referenced, the texture cache can't really have any data for the BO (since it's also invalidated between batches). But we still need to do the after case regardless, as we've just polluted the cache with hazardous entries.	2019-06-17 16:38:08 -05:00
Gert Wollny	2b87753a84	virgl: Assume sRGB write control for older guest kernels or virglrenderer hosts When the host virglrenderer is an older version that doesn't check the sRGB write control feature, or when the guest kernel doesn't support CAPS v2, then the guest will only report support for GL 2.1 on a GL 3.3 host, even though it was supporting 3.3 with earlier guest mesa versions. By also checking the host feature check version this regression can be avoided. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110921 Fixes: `2845939d6a` virgl: Set sRGB write control CAP based on host capabilities Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-17 21:16:11 +00:00
Rob Clark	21c795ab07	freedreno/a6xx: disallow UBWC for x24s8 Fixes: dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_2d dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_2d dEQP-GLES31.functional.stencil_texturing.misc.compare_mode_effect Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-17 20:29:13 +00:00
Rob Clark	4e72abcd97	freedreno/a6xx: un-swap X24S8_UINT The stencil is actually in the .w component, but we used to use SWAP to remap the channels. This doesn't work when tiled/ubwc. Fixes: dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_2d_array dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_cube dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_2d_array dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_cube dEQP-GLES31.functional.stencil_texturing.misc.base_level dEQP-GLES31.functional.texture.border_clamp.formats.stencil_index8.nearest_size_pot dEQP-GLES31.functional.texture.border_clamp.formats.stencil_index8.nearest_size_npot dEQP-GLES31.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_pot dEQP-GLES31.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_npot dEQP-GLES31.functional.texture.border_clamp.sampler.uint_stencil Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-17 20:29:13 +00:00
Samuel Pitoiset	6e3aee4630	radv: add mipmaps support for DCC decompression on compute Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 22:20:53 +02:00
Samuel Pitoiset	ebb1db96d5	radv: add mipmaps support for color decompressions (DCC/FMASK/CMASK) And some cleanups. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 22:20:53 +02:00
Samuel Pitoiset	00f0e5c6fd	radv: set the DCC/FCE predicates from the base level Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 22:20:53 +02:00
Samuel Pitoiset	7832e75ea8	radv: load the fast color clear values from the base level Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 22:20:53 +02:00
Samuel Pitoiset	7971697efe	radv: store the DCC predicate for each mip Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 22:20:53 +02:00
Samuel Pitoiset	38aa386e96	radv: store the FCE predicate for each mip Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 22:20:53 +02:00
Samuel Pitoiset	7295512037	radv: store the fast color clear values for each mip Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 22:20:53 +02:00
Samuel Pitoiset	58506fec63	radv: allocate DCC metadata for each mip Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 22:20:53 +02:00
Caio Marcelo de Oliveira Filho	4b0bc664a5	gallium: Remove unused util_ringbuffer Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-06-17 13:02:44 -07:00
Caio Marcelo de Oliveira Filho	397d1a18ef	llvmpipe: Don't use u_ringbuffer for lp_scene_queue Inline the ring buffer and signal logic into lp_scene_queue instead of using a u_ringbuffer. The code ends up simpler since there's no need to handle serializing data from / to packets. This fixes a crash when compiling Mesa with LTO, that happened because util_ringbuffer_dequeue() was writing data after the "header packet", as shown below struct scene_packet { struct util_packet header; struct lp_scene *scene; }; /* Snippet of old lp_scene_deque(). */ packet.scene = NULL; ret = util_ringbuffer_dequeue(queue->ring, &packet.header, sizeof packet / 4, return packet.scene; but due to the way aliasing analysis works the compiler didn't consider the "&packet->header" to alias with "packet->scene". With the aggressive inlining done by LTO, this would end up always returning NULL instead of the content read by util_ringbuffer_dequeue(). Issue found by Marco Simental and Thiago Macieira. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110884 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-06-17 13:02:44 -07:00
Alyssa Rosenzweig	390126e70a	panfrost/midgard: Simplify 2D array logic It shouldn't matter if we stick a z in for non-arrays, anyway. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 12:52:51 -07:00
Alyssa Rosenzweig	a3ae3cb8e9	panfrost/midgard: Handle non-zero component in store Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 12:52:51 -07:00
Alyssa Rosenzweig	2c9e124f81	panfrost/midgard: Apply writemask to LUTs Fixes LUT instructions with NIR registers. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 12:52:50 -07:00
Marek Olšák	eba932ea43	amd: update addrlib Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 15:14:55 -04:00
Nicolai Hähnle	d15cc1f55a	radeonsi: reduce MAX_GEOMETRY_OUTPUT_VERTICES This fixes piglit spec@glsl-1.50@gs-max-output on gfx9. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 15:14:51 -04:00
Alyssa Rosenzweig	aef01dd2e5	panfrost: Cleanup default blend mode Just encode the Mali magic number for `replace` rather than awkwardly forcing Gallium structures through. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 10:45:52 -07:00
Alyssa Rosenzweig	fbbb29aa5b	panfrost: Don't accidentally include blend shader Some residual dirty state can leak through across frames; zero this out. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 10:45:52 -07:00
Alyssa Rosenzweig	565c446dab	panfrost/midgard: Use typeless moves internally We switch all fmov to (i)mov, following the NIR switch. This simplifies some code surrounding blend shaders and should have no functional changes elsewhere. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 10:45:52 -07:00
Chia-I Wu	1fece5fa5f	virgl: better support for PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE When the resource to be mapped is busy and the backing storage can be discarded, reallocate the backing storage to avoid waiting. In this new path, we allocate a new buffer, emit a state change, write, and add the transfer to the queue. In the PIPE_TRANSFER_DISCARD_RANGE path, we suballocate a staging buffer, write, and emit a copy_transfer (which may allocate, memcpy, and blit internally). The win might not always be clear. But another win comes from the fact that the new path clears res->valid_buffer_range and does not clear res->clean_mask. This makes it much more preferable in scenarios such as access = enough_space ? GL_MAP_UNSYNCHRONIZED_BIT : GL_MAP_INVALIDATE_BUFFER_BIT; glMapBufferRange(..., GL_MAP_WRITE_BIT | access); memcpy(...); // append new data glUnmapBuffer(...); Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-17 09:36:31 -07:00
Chia-I Wu	9975a0a84c	virgl: add virgl_rebind_resource We are going to support reallocating the HW resource for a virgl_resource. When that happens, the virgl_resource needs to be rebound to the context. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-17 09:36:31 -07:00
Chia-I Wu	7e0508d9aa	virgl: save virgl_hw_res in virgl_transfer When PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE is properly supported, virgl_transfer might refer to a different virgl_hw_res than virgl_resource does. We need to save the virgl_hw_res and use the saved one. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-17 09:36:31 -07:00
Chia-I Wu	ad1ef35dc1	virgl: add resource_reference to virgl_winsys It works similar to pipe_resource_reference but is for virgl_hw_res. It can also replace resource_unref. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-17 09:36:31 -07:00
Alyssa Rosenzweig	73bf669e3f	panfrost/midgard: Add rounding mode specific opcodes This adds a set of opcodes for performing moves and type conversions with respect to particular rounding modes, required for OpenCL. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 09:32:31 -07:00
Alyssa Rosenzweig	9865b79a88	panfrost: Drop draws with complete scissor The hardware support for scissoring requires minimally 1 pixel to be drawn. If the scissor culls everything, we need to drop the draw entirely early on. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 09:29:09 -07:00
Alyssa Rosenzweig	3a9b7692f1	panfrost: Disable pipelining temporarily Pipelined rendering is important for performance but is not working right these days. Disable it for correctness until the panfrost_job refactor is enabled and we can do it right. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 09:25:52 -07:00
Alyssa Rosenzweig	d4aed00214	panfrost/mfbd: Handle rendering to linear mipmap In anticipation of more general mipmapping support, we implemented support for rendering to linear mipmaps (a very simple case). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 08:42:54 -07:00
Alyssa Rosenzweig	531715431f	panfrost: Implement sampling from non-zero initial levels In preparation for more complex mipmap operations. glGenerateMipmap() in particular, as implemented by u_blitter, requires reading from non-zero initial mip levels. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 08:42:54 -07:00
Alyssa Rosenzweig	a5f5b0640c	panfrost: Resource management for linear 2D texture arrays Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 08:36:15 -07:00
Alyssa Rosenzweig	dabfc71d36	panfrost/midgard: Adjust swizzles for 2D arrays Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 08:36:14 -07:00
Alyssa Rosenzweig	67a34acd00	panfrost: Set array_size to permit array textures Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 08:36:14 -07:00
Alyssa Rosenzweig	bdf169abb3	panfrost: Decode array textures Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 08:36:14 -07:00
Alyssa Rosenzweig	0ae6bbe8a9	panfrost: Implement 3D texture resource management Passes dEQP-GLES3.functional.texture.format.unsized.3d Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 08:36:14 -07:00
Alyssa Rosenzweig	36a7b2b018	panfrost: Specify 3D in texture descriptor Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 08:28:13 -07:00
Alyssa Rosenzweig	8429beef5e	panfrost/midgard: Fix 3D texture masks/swizzles Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 08:28:13 -07:00
Alyssa Rosenzweig	56f9b47efd	panfrost/midgard: Add swizzle_of/mask_of helpers These make manipulating vectors in the Midgard compiler easier. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 08:28:13 -07:00
Alyssa Rosenzweig	8d1adc091b	panfrost: Enable helper invocations when texturing It turns out we have explicit control over helper invocations; if a particular bit in the fragment shader descriptor is set, helper invocations are launched; if it is clear, they are not. Helper invocations are required whenever computing derivatives, whether explicitly (dFdx/dFdy) or implicitly (any texturing). Accordingly, we set this bit when texturing to fix edge case behaviour (literally, haha). Thank you to Jason Ekstrand and Ilia Mirkin for pointing out the representative dEQP test failed along triangle edges and for suggesting helper invocations / derivatives as a list of suspect pieces (which led to discovering the helper invocations enable bit in the first place). Ideally we would use the new NIR analysis pass for this, but that hasn't landed quite yet. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 08:22:37 -07:00
Alyssa Rosenzweig	0219b99500	panfrost: Handle missing texture case In some cases, Gallium can give us bad info about the texture count, counting some NULL textures. We pass Gallium's info to the hardware blindly, which can confuse the hardware in edge cases. This patch adjusts accordingly.	2019-06-17 07:59:14 -07:00
Alyssa Rosenzweig	443f9ae0ad	panfrost: Remove forced flush on clears This worked around a bug in oooold versions of Panfrost. Nowadays, its presence is, at best, creating bugs. Let's wack it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:59:14 -07:00
Alyssa Rosenzweig	6460442049	panfrost: Flush scanout too In a poorly coded app, the framebuffer can be partially drawn, an FBO switched, switch back to the framebuffer and keep drawing, etc. Reordering would fix this, but for now we need to just be careful about flushing scanout too. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:59:14 -07:00
Alyssa Rosenzweig	fc3f57bd7f	panfrost: Improve viewport (clipping) robustness On more complex apps (possibly using desktop GL specific extensions?), our viewport code was getting wacky results for unclear reasons. Let's be a little less wacky. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:59:14 -07:00
Alyssa Rosenzweig	f9ecca2ff0	panfrost: Disable the tiler for clear-only jobs To do so, we route some basic information through to the FBD creation routines (currently just a binary toggle of "has draws?"). Eventually, more refactoring will enable dynamic hierarchy mask selection, but right now we do the most basic. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:59:14 -07:00
Alyssa Rosenzweig	ac68946d9d	panfrost: Identify and decode mfbd_flags Previously known as the unk3 field. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:59:14 -07:00
Alyssa Rosenzweig	12d4289bf9	panfrost: Stub out hierarchy mask selection Quite a bit of refactoring in the main driver will be necessary to make use of this effectively, so the implementation is incomplete. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:59:14 -07:00
Alyssa Rosenzweig	6434f5c494	panfrost: Rename misc_0 -> tiler_polygon_list Just for readability. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:59:14 -07:00
Alyssa Rosenzweig	e2c2ccd5b8	panfrost: Sanity check tiler polygon list size Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:59:14 -07:00
Alyssa Rosenzweig	953cc4b540	panfrost: Compute and use polygon list body size This is a bit of a hack, but it gets the point across. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:59:14 -07:00
Alyssa Rosenzweig	b660953733	panfrost: Use polygon list header size computation Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:59:14 -07:00
Alyssa Rosenzweig	edfba9bee2	panfrost: Calculate polygon list header size As per the notes at the beginning of pan_tiler.c, we implement a routine to calculate the size of the polygon list header given the framebuffer dimensions and the provided hierarchy mask. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:59:14 -07:00
Alyssa Rosenzweig	e88ff9ad85	panfrost: Add pan_tiler.h header Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:47:49 -07:00
Alyssa Rosenzweig	21eb411d2f	panfrost: Document tile size heuristic I'm not sure how the blob does it, but this seems to be a dead simple test and roughly corresponds to what I've noticed from the blob, so maybe it's good enough. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:47:49 -07:00
Alyssa Rosenzweig	7f26bb3553	panfrost: Rename tiler fields per tiler research Following the research into Midgard's hierarchical tiling infrastructure, we now understand (in broad strokes) the purpose of each tiler field in the MFBD. Additionally, we understand more of the tiling fields in the SFBD and in Bifrost's structures, although this knowledge is still incomplete. Update the names, decoder, and comments to reflect this new understanding. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:47:49 -07:00
Alyssa Rosenzweig	8d6fb66e3a	panfrost: Add notes about the tiler allocations This explains how the polygon list is allocated, updating the headers appropriately to sync the terminology. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:47:49 -07:00
Alyssa Rosenzweig	85e745f2b4	panfrost: Integrate kernel names for tiler FBD These names are from the replay workaround in kbase; they begin to shine some light on the meaning of these fields. In particular, we now understand why the "tiler_meta" field has the effect it does on performance in certain scenes (controlling tile granularity). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-17 07:47:49 -07:00
Bas Nieuwenhuizen	1a7caac9e9	radv: Add asserts that buffer descriptors are created with valid buffer formats. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-17 10:56:50 +00:00
Bas Nieuwenhuizen	4107590911	radv: Decompress DCC when the image format is not allowed for buffers. Otherwise the buffer loads/stores in the bufimage meta operations fail. If we decompress DCC then we can use the "canonical" format compatible with the not-supported format. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-17 10:56:50 +00:00
Samuel Pitoiset	e9875fc0b6	radv: make sure to init the DCC decompress compute path state This fixes a segfault when forcing DCC decompressions on compute because internal meta objects are not created since the on-demand stuff. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 11:30:49 +02:00
Samuel Pitoiset	4c7ef1b02e	ac: make ac_compute_cmask() a static function Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 11:30:47 +02:00
Samuel Pitoiset	cf77d3abf1	radv: rely on ac_compute_cmask() for CMASK info Instead of re-computing in the driver. The 3d and cube flags are correctly set, so the same values should be returned by ac_compute_surface(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 11:30:44 +02:00
Samuel Pitoiset	6880b42cfc	radv: silence a compiler warning in radv_CmdPushDescriptorSetKHR() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-17 09:53:26 +02:00
Tomeu Vizoso	e655d63644	panfrost: ci: Speed things up a bit by skipping a git clone Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-06-17 09:17:53 +02:00
Tomeu Vizoso	f1efb0f254	panfrost: ci: Exclude all blend tests from results As they randomly fail on T760. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-06-17 09:17:53 +02:00
Samuel Pitoiset	b5012a0518	ac: update llvm.amdgcn.icmp intrinsic name for LLVM 9+ LLVM r363339 changed llvm.amdgcn.icmp.i* to llvm.amdgcn.icmp.i64.i*. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-17 08:58:33 +02:00
Erico Nunes	d72bbb2c89	lima: lower fmod in ppir and gpir Since commit `4f3c82c72c` fmod is no longer being lowered in nir, and ends up crashing lima programs with "unsupported nir_op: fmod" in both ppir and gpir. There seems to be no mod operation in hardware in utgard and there is an optimization in nir to lower fmod to instructions that lima already implements, so let's use that. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-06-16 10:11:59 +00:00
Rob Clark	a417c323ad	freedreno/a6xx: re-enable UBWC for depth/stencil Now that we can blit depth/stencil in a way that plays nicely with UBWC, re-enable it. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>	2019-06-15 07:33:04 -07:00
Rob Clark	363a9ed614	freedreno/a6xx: handle z24s8/z24x8 blits with u_blitter Now that it can turn these blits into rendering to RB6_Z24_UNORM_S8_UINT it can properly handle cases where only one of depth+stencil is being blit. And this avoids lying about the format, which completely doesn't work when UBWC is used. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>	2019-06-15 07:33:04 -07:00
Rob Clark	a96ae18de6	freedreno/a6xx: handle fallback for rewritten blits ourselves For re-written z/s blits, we want to use the re-written `pipe_blit_info` even if we have to fall back to the 3d pipe (`u_blitter`). So handle that fallback ourselves. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>	2019-06-15 07:33:04 -07:00
Rob Clark	94c36a8554	freedreno/a6xx: rename variable The name 'separate' doesn't make a whole lot of sense, as in only one of the cases is the blit actually split. But split out from previous patch in an attempt to reduce the noise. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>	2019-06-15 07:33:04 -07:00
Rob Clark	5fe7b627eb	freedreno/a6xx: consolidate z/s blit handling This will get even simpler with the next patch Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>	2019-06-15 07:33:04 -07:00
Rob Clark	4c75d62ce8	gallium: add z24s8_as_r8g8b8a8 format This maps to a special format that recent generations of adreno have, for blitting z24s8. Conceptually it is similar to doing Z and/or S blits by pretending it is r8g8b8a8 (with appropriate writemask). But it differs when bandwidth compression is used, as z24 is a different type from r8g8b8. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>	2019-06-15 07:33:04 -07:00
Kenneth Graunke	1d75f52589	st/mesa: Respect GL_TEXTURE_SRGB_DECODE_EXT in GenerateMipmaps() Apparently, we're supposed to look at the texture object's built-in sampler object's sRGB decode setting in order to decide whether to decode/downsample/re-encode, or simply downsample as-is. Previously, we had just respected the pipe_resource's format. Fixes SKQP's Skia_Unit_Tests.SRGBMipMaps test. (This ports commit `337a808062` from i965 to st/mesa for Gallium drivers.) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 20:13:46 +00:00
Erico Nunes	3ddea5e8c5	lima: fix dynarray usage in lima_submit_add_bo Commit `de8a919702` refactored dynarray usage and changed the size of the allocation in lima_submit_add_bo. That causes a segfault in programs running with lima. This commit restores the allocation size back to the previous size. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-06-14 20:47:35 +02:00
Alyssa Rosenzweig	9ab8d31f32	panfrost: Fix variant selection Fixes 1acffb ("panfrost: Unify...") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-14 10:35:07 -07:00
Marek Olšák	abe9a51d27	ac: add radeon_info::is_amdgpu instead of checking drm_major == 3 and clean up Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-14 13:31:18 -04:00
Mauro Rossi	bbbbea243a	android: amd/common: fix missing include path Fixes the following building error in Android: In file included from external/mesa/src/amd/common/ac_llvm_helper.cpp:34: In file included from external/mesa/src/amd/common/ac_llvm_build.h:30: In file included from external/mesa/src/compiler/nir/nir.h:40: In file included from external/mesa/src/compiler/nir_types.h:36: external/mesa/src/compiler/glsl_types.h:37:10: fatal error: 'main/config.h' file not found ^~~~~~~~~~~~~~~ 1 error generated. Fixes: `bd4c661` ("ac,ac/nir: use a better sync scope for shared atomics") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-14 18:36:10 +02:00
Mauro Rossi	51e24af8fd	android: radv: fix necessary dependencies Fixes building errors due to libmesa_util and libexpat dependencies: In file included from external/mesa/src/amd/vulkan/radv_device.c:52: external/mesa/src/util/xmlpool.h:115:10: fatal error: 'xmlpool/options.h' file not found ^~~~~~~~~~~~~~~~~~~ 1 error generated. FAILED: out/target/product/x86_64/obj_x86/SHARED_LIBRARIES/vulkan.radv_intermediates/LINKED/vulkan.radv.so ... external/mesa/src/util/xmlconfig.c:670: error: undefined reference to 'XML_ParserCreate' ... clang.real: error: linker command failed with exit code 1 (use -v to see invocation) Fixes: `3c2e826` ("radv: Add support for driconf.") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-14 18:35:10 +02:00
Alejandro Piñeiro	d317944c24	docs: document three NIR_ envvars Initially I was only interested in documenting NIR_PRINT, as today I needed to check the code to find this envvar, which I only vaguely remembered existed. As we are here, though, let's just document all of them (assuming that makes sense). Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 16:18:43 +02:00
Alexandros Frantzis	83829abe03	virgl: Return immediately when finding a compatible resource in the cache When searching for resources in the cache, we previously released all expired resources even after having found a compatible resource. This commit changes this behavior to return immediately when finding a compatible resource, so that the operation finishes more quickly. This moves more of the burden of releasing expired resources to cache addition, which, since it happens at resource destruction time, is less time critical. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-14 12:59:51 +03:00
Alexandros Frantzis	801753d4b3	virgl: Use virgl_resource_cache in the vtest winsys Replace the cache implementation in the vtest winsys with virgl_resource_cache. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-14 12:59:49 +03:00
Alexandros Frantzis	13f70d3668	virgl: Use virgl_resource_cache in the drm winsys Replace the cache implementation in the drm winsys with virgl_resource_cache. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-14 12:59:43 +03:00
Alexandros Frantzis	b18f09a509	virgl: Introduce virgl_resource_cache Introduce a resource cache implementation that can be used by any virgl winsys backend. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-14 12:58:51 +03:00
Haihao Xiang	8ead5bebdb	i965: support UYVY for external import only It is similar to YUYV. Fixes: `165e704719` ("i965/i915: Add UYVY as the supported format") Signed-off-by: Haihao Xiang <haihao.xiang@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-14 15:45:56 +08:00
Neil Roberts	34d4b3e367	glsl: Set default precision on record members Record types have their own slot to store the precision for each member in glsl_struct_field. Previously if the member didn’t have an explicit precision qualifier this was being left as GLSL_PRECISION_NONE. This patch makes it take into account the type’s default precision qualifier like it does for regular variables in apply_type_qualifier_to_variable. This has the additional benefit of correctly reporting an error when a float type is used in a struct without declaring the default type. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 09:29:53 +02:00
Neil Roberts	235425771c	glsl/linker: Make precision matching optional in intrastage_match This function is confusingly also used to match interstage interfaces as well as intrastage. In the interstage case it needs to avoid comparing the precisions. This patch adds a parameter to specify whether to take the precision into account or not so that it can be used for both cases. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 09:29:53 +02:00
Neil Roberts	19b27a8569	glsl/linker: Don’t check precision for shader interface On GLES, the interface between vertex and fragment shaders doesn’t need to have matching precision. Section 4.3.10 of the GLSL ES 3.00 spec: “The type of vertex outputs and fragment inputs with the same name must match, otherwise the link command will fail. The precision does not need to match.” Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 09:29:53 +02:00
Neil Roberts	230d1e8d86	compiler/types: Make comparing record precision optional On GLES, the interface between vertex and fragment shaders doesn’t need to have matching precision. This adds an extra argument to glsl_types::record_compare to disable the precision comparison. This will later be used for the shader interface check. In order to make this work this patch also adds a helper function to recursively compare types while ignoring the precision. v2: Call record_compare from within compare_no_precision to avoid duplicating code (Eric Anholt). Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 09:29:53 +02:00
Lucas Stach	ab74699190	etnaviv: fix some pm query issues The offsets to read the query results were off-by-one, which causes the counters to report bogus increasing values. Also the counter result is u32, so we need to initialize the query type to reflect that. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-14 09:06:28 +02:00
Iago Toral Quiroga	360b832c58	v3d: do not setup execute flags for else block in uniform control flow Either all channels executed the 'then' block, in which case all channels will directly jump to the 'endif' block at the end of the 'then' block, or all channels execute the 'else' block (so no execution masking is necessary). Shader-db results: total instructions in shared programs: 9119238 -> 9117550 (-0.02%) instructions in affected programs: 401252 -> 399564 (-0.42%) helped: 855 HURT: 77 total uniforms in shared programs: 3022622 -> 3022605 (<.01%) uniforms in affected programs: 3566 -> 3549 (-0.48%) helped: 17 HURT: 0 total max-temps in shared programs: 1327762 -> 1327774 (<.01%) max-temps in affected programs: 619 -> 631 (1.94%) helped: 2 HURT: 15 Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 08:00:52 +02:00
Iago Toral Quiroga	2a2501247b	nir: detect more dynamically uniform expressions Shader-db results for v3d: total instructions in shared programs: 9132728 -> 9119238 (-0.15%) instructions in affected programs: 596886 -> 583396 (-2.26%) helped: 1118 HURT: 224 total threads in shared programs: 234298 -> 234308 (<.01%) threads in affected programs: 10 -> 20 (100.00%) helped: 5 HURT: 0 total uniforms in shared programs: 3022949 -> 3022622 (-0.01%) uniforms in affected programs: 29163 -> 28836 (-1.12%) helped: 108 HURT: 37 total max-temps in shared programs: 1328030 -> 1327762 (-0.02%) max-temps in affected programs: 10097 -> 9829 (-2.65%) helped: 263 HURT: 15 total spills in shared programs: 3793 -> 3777 (-0.42%) spills in affected programs: 432 -> 416 (-3.70%) helped: 16 HURT: 0 total fills in shared programs: 4380 -> 4266 (-2.60%) fills in affected programs: 828 -> 714 (-13.77%) helped: 16 HURT: 0 Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 08:00:52 +02:00
Tapani Pälli	287b58f827	ir3: initialize progress false before ir3_nir_lower_imul Removes a compiler warning about uninitialized variable. Fixes: `c02ffd2700` "ir3: Use the new NIR lowering pass for integer multiplication" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Rob Clark <robclark@gmail.com> Reviewed-by: Eduardo Lima <elima@igalia.com>	2019-06-14 08:21:42 +03:00
Boris Brezillon	749c544b84	panfrost: Fix general purpose varying handling When both the fragment and vertex shaders point to the same varying location they expect to share the same varying slot. Make sure vertex and fragment varyings pointing to the same loc have ->src_offset set to the same value. [Alyssa: In addition to a patch implementing txs, this fixes GALLIUM_HUD on Panfrost] Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-13 10:54:18 -07:00
Marek Olšák	7566a9a58a	ac/registers: use better names for disambiguated definitions Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-13 13:52:06 -04:00
Marek Olšák	08ab9b70ce	ac/registers: remove deprecated/inapplicable definitions Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-13 13:52:06 -04:00
Caio Marcelo de Oliveira Filho	5bd48ff252	iris: Enable INTEL_shader_atomic_float_minmax Supported only for gen >= 9. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-06-13 09:03:58 -07:00
Caio Marcelo de Oliveira Filho	81835f87a4	gallium: Add PIPE_CAP_ATOMIC_FLOAT_MINMAX Used to enable INTEL_shader_atomic_float_minmax. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-06-13 09:03:58 -07:00
Rob Clark	9f10e40cde	freedreno/a6xx: fix MAX_INDICES Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-13 08:56:27 -07:00
Rob Clark	ce12ac8c2b	freedreno/blitter: remove dead code The src/dst format is overridden from the pipe_blit_info, so this logic just serves to confuse the reader. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-13 08:56:27 -07:00
Rob Clark	a8be53211d	freedreno: turn staging cube into 2d-array Since we could only need a subset of the layers, and otherwise we trigger an assert in util_max_layer() Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-13 08:56:27 -07:00
Tomeu Vizoso	3adf9b0757	panfrost: ci: Exclude some tests from results These are tests that regressed in RK3288 but still pass on RK3399. So we still have a CI we can rely on, add them to the flip-flop list for now. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-06-13 17:45:27 +02:00
Tomeu Vizoso	50901a27f6	panfrost: ci: Update test expectations Some tests got fixed since the last update, but also some regressions crept in. To keep the CI green, add the regressions to the expected failures. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-06-13 17:45:22 +02:00
Connor Abbott	37b92b0ae6	nir: Don't manually index intrinsic index enum This fixes a rebase fail in `ea51275e07`, and prevents it from happening again. There's no reason to do this manually. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-13 17:10:41 +02:00
Erik Faye-Lund	901795238b	docs: work around broken altsoftware.com link altsoftware.com seems to no longer be around, and is currently being held by a domain squatter. Let's link to waybackmachine instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-13 14:14:05 +00:00
Erik Faye-Lund	795b5d923f	docs: work around broken dsbox.com link dsbox.com now forwards to haystax.com, which is technically unrelated to this link. Let's link to waybackmachine instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-13 14:14:05 +00:00
Erik Faye-Lund	b16e77e051	docs: work around broken sgi.com links sgi.com now forwards to hpe.com, which is technically unrelated to these links. Let's link to waybackmachine instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-13 14:14:05 +00:00
Erik Faye-Lund	a9956ed87a	docs: update link to OpenGL FAQ Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-13 14:14:05 +00:00
Erik Faye-Lund	12f4cd6a09	docs: update link to the Linux OpenGL ABI Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-13 14:14:05 +00:00
Erik Faye-Lund	e19448c102	docs: update link to glw GLW is currently living in gitlab, the cgit-page is just a mirror. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-13 14:14:05 +00:00
Erik Faye-Lund	04fc0bc3f3	docs: fixup link-target Just a couple of lines above, we have this exact same link, but this time with a leading "www.". Let's match that. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-13 14:14:05 +00:00
Erik Faye-Lund	372f9f6947	docs: eliminate another stale autoconf-reference Meson is what should tell you about these issues, not the configure script. We no longer have that. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-13 14:14:05 +00:00
Erik Faye-Lund	c9d396710b	docs: replace autoconf with meson We no longer have an autoconf build-system to maintain, but we do have a meson build-system. So let's mention that instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-13 14:14:05 +00:00
Erik Faye-Lund	f4f78a59b0	docs: update required packages Automake and libtool are no longer required to build, instead we need meson and ninja-build. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-13 14:14:05 +00:00
Erik Faye-Lund	26287b91ac	docs: remove pointless haiku-comment The only build system that doesn't support Haiku is `Android.mk`, which also doesn't support most other platforms either, so there is no need to single it out. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-13 14:14:05 +00:00
Erik Faye-Lund	c339c0175a	docs: fixup typo Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-13 14:14:05 +00:00
Daniel Schürmann	c58dff753c	radv: enable AMD_shader_ballot with RADV_PERFTEST_SHADER_BALLOT ('shader_ballot') Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Daniel Schürmann	deedc0b31d	amd/common: add support for AMD_shader_ballot functions Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Daniel Schürmann	7a858f274c	spirv/nir: add support for AMD_shader_ballot and Groups capability This commit also renames existing AMD capabilities: - gcn_shader -> amd_gcn_shader - trinary_minmax -> amd_trinary_minmax Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Daniel Schürmann	ea51275e07	nir: add intrinsics for AMD_shader_ballot Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Daniel Schürmann	f2277c327a	radv: enable shader_subgroup_vote & shader_subgroup_ballot extensions Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Daniel Schürmann	1b89ebeede	nir/spirv: add support for the SubgroupBallotKHR SPIR-V capability This capability is required for the VK_EXT_shader_subgroup_ballot extension. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Daniel Schürmann	de56ebadce	nir/spirv: add support for the SubgroupVoteKHR SPIR-V capability This capability is required for the VK_EXT_shader_subgroup_vote extension. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Alejandro Piñeiro	17c2c9cd67	v3d: fix checking twice auf flag Seems a C&P error, and should check for auf/muf. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110902 Fixes: `8f065596d2` "v3d: Add an optimization pass for redundant flags updates." Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-13 11:45:18 +02:00
Samuel Pitoiset	ca6bf9a6cd	radv: flush and invalidate CB before resetting query pools on GFX9 We have to emit a CACHE_FLUSH_AND_INV_TS_EVENT to be sure all prior GPU work is done. While we are at it, also flush and invalidate DB. This fixes the following CTS (when the small hint is disabled): dEQP-VK.query_pool.statistics_query.reset_before_copy.* Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-13 11:23:48 +02:00
Bas Nieuwenhuizen	cb728f28ac	vl: Always enable drm winsys. The dri2 winsys also uses libdrm (and you can only enable dri3 if you enable dri2), and the drm winsys only requires libdrm. So if any winsys is enabled you can also enable the drm winsys, and since we always want at least one winsys we can always enable it. I removed the check for the drm platform for VA and OMX since they do not care anymore. Since we still check for one of r600g, nouveau or radeonsi, we are guaranteed to still only enable it by default in a configuration that requires libdrm anyway. So for people using va=auto, we don't suddenly start requiring libdrm where we did not before. This supersedes "vl: Enable DRM by default.", which I pushed, but rolled back because it used dep_libdrm before its definition. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-13 08:25:48 +00:00
Bas Nieuwenhuizen	b4c7ce360b	radv: Always disable DCC on shareable images. Do not want it for perf reasons. Always have to disable DCC when transferring to external queue. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-13 08:15:45 +00:00
Bas Nieuwenhuizen	0667c1f14b	radv: Skip transitions coming from external queue. Transitions to external queue should do the transition & make sure it works on all queues. Fixes: `8ebc7dcb59` "radv: Allow fast clears with concurrent queue mask for some layouts." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-13 08:15:45 +00:00
Mateusz Krzak	60009aefdb	lima/ppir: change offset type to int Offset doesn't need to be 64-bit. This fixes compilation error with 64-bit off_t. Fixes: `af0de6b9` lima/ppir: implement discard and discard_if Suggested-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Mateusz Krzak <kszaquitto@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>	2019-06-13 07:43:24 +02:00
Chia-I Wu	900a80f9e4	virgl: virgl_transfer should own its virgl_resource We should avoid having potentially dangling pointers to pipe_resources in general. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-12 18:20:30 -07:00
Chia-I Wu	74051efbea	virgl: pass virgl_context to transfer create/destroy A pipe_transfer is a context object. It is fine for the constructor/destructor to have access to the context. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-12 18:20:30 -07:00
Chia-I Wu	514e12b1b8	virgl: init transfer queue from virgl_context A pipe_transfer is a context object. It is fine for virgl_transfer_queue to have access to the context. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-12 18:20:30 -07:00
Chia-I Wu	308ba2c0f9	virgl: clean up virgl_transfer_queue.h Add header guard and forward declare structs. Move virgl_resource.h inclusion to the C file. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-12 18:20:30 -07:00
Nicolai Hähnle	2d114e6267	radeonsi: add radeonsi_debug_disassembly option This dumps disassembly to the pipe_debug_callback together with shader stats. Can be used together with shader-db to get full disassembly of all shaders in the database. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	3bde69e789	radeonsi: fix line splitting in si_shader_dump_assembly Compute the count since the start of the current line instead of the count since the start of the disassembly. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	aa737e8580	radeonsi: raise the alignment of LDS memory for compute shaders This implies that the memory will always be at address 0, which allows LLVM to generate slightly better code. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	33be5ad8a3	radeonsi: use an explicit symbol for the LSHS LDS memory Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	174fad7075	radeonsi: rename lds_{load,store} to lshs_lds_{load,store} These functions are now only used in LS/HS shaders (both separate and merged). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	b519ddc35c	radeonsi/gfx9: declare LDS ESGS ring as an explicit symbol on LLVM >= 9 This will make it easier to use LDS for other purposes in geometry shaders in the future. The lifetime of the esgs_ring variable is as follows: - declared as [0 x i32] while compiling shader parts or monolithic shaders - just before uploading, gfx9_get_gs_info computes (among other things) the final ESGS ring size (this depends on both the ES and the GS shader) - during upload, the "esgs_ring" symbol is given to ac_rtld as a shared LDS symbol, which will lead to correctly laying out the LDS including other LDS objects that may be defined in the future - si_shader_gs uses shader->config.lds_size as the LDS size This change depends on the LLVM changes for emitting LDS symbols into the ELF file. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	f8315ae04b	amd/rtld: layout and relocate LDS symbols Upcoming changes to LLVM will emit LDS objects as symbols in the ELF symbol table, with relocations that will be resolved with this change. Callers will also be able to define LDS symbols that are shared between shader parts. This will be used by radeonsi for the ESGS ring in gfx9+ merged shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	dc99a8cd9b	radeonsi: cleanup some #includes Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	1ff2440eee	amd/common: use ARRAY_SIZE for the LLVM command line options This is more convenient for changing it around during debug. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	ca21ba2a08	radeonsi: inline si_shader_binary_read_config into its only caller Since it can only be used for reading the config of an individual, non-combined shader, it is not very reusable anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	bf8a1ca902	radeonsi: use the new run-time linker for shaders v2: - fix a memory leak Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	16bee0e5f6	radeonsi: don't declare pointers to static strings The compiler should be able to optimize them away, but still. There's no point in declaring those as pointers, and if the compiler doesn't optimize them away, they add unnecessary load-time relocations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	3c958d924a	amd/common: add ac_compile_module_to_elf A new variant of ac_compile_module_to_binary that allows us to keep the entire ELF around. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	66da60f4da	radeonsi: dump shader binary buffer contents Help identify bugs related to corruption of shaders in memory, or errors in shader upload / rtld. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	bf11c594dd	radeonsi: return bool from si_shader_binary_upload We didn't really use error codes anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	8b1343ca79	radeonsi: let si_shader_create return a boolean We didn't really use error codes anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	77b05cc42d	radeonsi: use ac_shader_config Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Nicolai Hähnle	b3be346c68	amd/common: add a more powerful runtime linker Using an explicit linker instead of just concatenating .text sections will allow us to start using .rodata sections and explicit descriptions of data on LDS that is shared between stages. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 20:28:23 -04:00
Caio Marcelo de Oliveira Filho	608257cf82	i965: Fix INTEL_DEBUG=bat Use hash_table_u64 instead of hash_table directly, since the former will also handle the special keys (deleted and freed) and allow use of the whole u64 space. Fixes crash in INTEL_DEBUG=bat when using a key with value 0 -- the current value for a freed key. Fixes: `b38dab101c` "util/hash_table: Assert that keys are not reserved pointers" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-12 15:57:16 -07:00
Caio Marcelo de Oliveira Filho	eb41ce1b01	util/hash_table: Properly handle the NULL key in hash_table_u64 The hash_table_u64 should support any uint64_t as input. It does special handling for the "deleted" key, storing the data in the table itself; do the same for the "freed" key. Fixes: `b38dab101c` "util/hash_table: Assert that keys are not reserved pointers" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-12 15:57:16 -07:00
Nicolai Hähnle	c129cb3861	amd/common: clarify ac_shader_binary::lds_size Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 18:33:21 -04:00
Nicolai Hähnle	2e96c01073	amd/common: extract ac_parse_shader_binary_config Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 18:33:08 -04:00
Nicolai Hähnle	de8a919702	u_dynarray: turn util_dynarray_{grow, resize} into element-oriented macros The main motivation for this change is API ergonomics: most operations on dynarrays are really on elements, not on bytes, so it's weird to have grow and resize as the odd operations out. The secondary motivation is memory safety. Users of the old byte-oriented functions would often multiply a number of elements with the element size, which could overflow, and checking for overflow is tedious. With this change, we only need to implement the overflow checks once. The checks are cheap: since eltsize is a compile-time constant and the functions should be inlined, they only add a single comparison and an unlikely branch. v2: - ensure operations are no-op when allocation fails - in util_dynarray_clone, call resize_bytes with a compile-time constant element size v3: - fix iris, lima, panfrost Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 18:30:25 -04:00
Nicolai Hähnle	71b45bae14	u_dynarray: return 0 on realloc failure and ensure no-op We're not very good at handling out-of-memory conditions in general, but this change at least gives the caller the option of handling it gracefully and without memory leaks. This happens to fix an error in out-of-memory handling in i965, which has the following code in brw_bufmgr.c: node = util_dynarray_grow(vma_list, sizeof(struct vma_bucket_node)); if (unlikely(!node)) return 0ull; Previously, allocation failure for util_dynarray_grow wouldn't actually return NULL when the dynarray was previously non-empty. v2: - make util_dynarray_ensure_cap a no-op on failure, add MUST_CHECK attribute - simplify the new capacity calculation: aside from avoiding a useless loop when newcap is very large, this also avoids an infinite loop when newcap is larger than 1 << 31 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 18:30:25 -04:00
Nicolai Hähnle	dc75362511	freedreno: use util_dynarray_clear instead of util_dynarray_resize(_, 0) This is more expressive and simplifies a subsequent change. v2: - fix one more call-site after rebase Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-12 18:30:25 -04:00
Alyssa Rosenzweig	1ee2366693	panfrost/midgard: Differentiate vertex/fragment texture tags Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-12 14:32:12 -07:00
Alyssa Rosenzweig	5062b612be	panfrost/midgard: Assert on unknown texture source Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-12 14:32:07 -07:00
Alyssa Rosenzweig	4ea512844c	panfrost/midgard: Set minimal swizzle on texture input Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-12 14:32:01 -07:00
Alyssa Rosenzweig	6ae4f9c523	panfrost/midgard: Lower texture projectors We do have native support for perspective division on the load/store unit, but this is for the future, something ideally we would select generally, not just for textures. Meanwhile, flipping on projector lowering works now. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-12 14:31:53 -07:00
Alyssa Rosenzweig	4012e06788	panfrost/midgard: Implement txl This follows the txb implementation, but requires an adjustment to how the cont/last flags are set. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-12 14:31:45 -07:00
Alyssa Rosenzweig	a19ca344ab	panfrost/midgard: Implement txb op We refactor the main tex handling to fit a bias argument in as well. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-12 14:31:37 -07:00
Alyssa Rosenzweig	1acffb5671	panfrost: Unify bind_vs/fs_state This replaces bind_vs/fs_state calls to a unified bind_shader_state call, removing a great deal of duplicated logic related to variant selection. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-12 14:29:05 -07:00
Alyssa Rosenzweig	f8a4090f80	panfrost: Add panfrost_job_type_for_pipe helper This logic is repeated in a bunch of places and will only grow worse as we support more job types; collect it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-12 14:27:47 -07:00
Alyssa Rosenzweig	15fae1e38c	panfrost/midgard: Extract emit_varying_read Paralleling emit_uniform_read, this allows varying reads to be emitted independent of an honest-to-goodness load vary instruction in the NIR. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-12 14:25:17 -07:00
Alyssa Rosenzweig	8c88bd0253	panfrost: Remove "vertex/tiler render target" silliness I don't think these are actual structures, just figments over cargoculting dumped memory without making any sense of it. Nothing seems to break if the region is zeroed out, anyway. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-12 14:21:56 -07:00
Alyssa Rosenzweig	b96df80069	panfrost/decode: Print line number of bad memory access Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-12 14:14:53 -07:00
Alyssa Rosenzweig	fc7bcee865	panfrost: Replace pantrace with direct decoding History lesson! In the early days of Panfrost, we had a library independent of the driver called `panwrap` which would be LD_PRELOAD'ed into a driver to decode its cmdstream in real-time. When upstreaming Panfrost, we realized that we would much rather have this decode functionality maintained in-tree to avoid divergence, but that we could not upstream panwrap because of its use with the legacy API. So we instead dumped GPU memory to the filesystem with an out-of-tree panwrap, and decoded that with the in-tree pandecode module. When we migrated to the new kernel, we just added support for doing this memory dump directly from the driver (via a module "pantrace"). This works, but dumping memory every frame is sloooooooooooooow and error-prone. I figured if we have pandecode in-tree, we might as well link to it directly in the driver, allowing us to decode Panfrost's command streams without dumping memory to the filesystem first. This cleans up the code substantially and improves dumping performance by a HUGE margin. I'm talking "several seconds per frame" to "dumping in real-time" kind of jump. Note to users: this removes the environmental option "PANTRACE_BASE". Instead, for equivalent functionality set "PAN_MESA_DEBUG=trace" and redirect stdout to the file of your choosing. This should make debugging Panfrost much more pleasant. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-12 14:07:09 -07:00
Kevin Strasser	845ec8576a	st/mesa: Add rgbx handling for fp formats Add missing cases for fp32 and fp16 formats. Fixes: `c68334ffc0` "st/mesa: add floating point formats in st_new_renderbuffer_fb()" Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-12 19:03:47 +00:00
Kevin Strasser	ec0a68e50d	gallium/winsys/kms: Fix dumb buffer bpp The bpp in the dumb buffer creation request is hardcoded to 32, which is an incorrect assumption as the caller is free to pick any pipe format. Use the bpp supplied to us through util_format_get_blocksizebits(). Fixes: `3b176c441b` "gallium: Add a dumb drm/kms winsys backed swrast provider" Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-12 11:44:10 -07:00
Eric Engestrom	9996ddbb27	util/futex: fix dangling pointer use Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110901 Fixes: `7dc2f47882` "util: emulate futex on FreeBSD using umtx" Cc: Greg V <greg@unrelenting.technology> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-12 17:27:44 +01:00
Samuel Pitoiset	d378151246	radv: fix VK_EXT_memory_budget if one heap isn't available When the visible VRAM size is equal to the VRAM size only two heaps are exposed. This fixes dEQP-VK.api.info.device.memory_budget. Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-12 15:52:48 +02:00
Samuel Pitoiset	2ef9d2738c	radv: fix occlusion queries on VegaM The number of render backends is 16 but the enabled mask is 0xaaaa. As noticed by Bas, allowing disabled render backends might break the OCCLUSION_QUERY packet. We don't use it yet but keep this in mind. This fixes dEQP-VK.query_pool.* and dEQP-VK.multiview.*. Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-12 15:51:12 +02:00
Lionel Landwerlin	93b93e5a9d	anv: do not parse genxml data without INTEL_DEBUG=bat This significantly slows down the CTS runs. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `32ffd90002` ("anv: add support for INTEL_DEBUG=bat") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-06-12 12:53:35 +03:00
Lionel Landwerlin	f80679c8e8	intel/dump: fix segfault when the app hasn't accessed the device Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-12 09:49:55 +03:00
Caio Marcelo de Oliveira Filho	48d7e7a9b8	iris: Only upload surface state for grid info when needed Special care is needed to ensure that when we have two consecutive calls with the same grid size, we only bail in the second one if it either doesn't need the surface state or the surface state was already uploaded. v2: Instead of having a new bool in ice->state to know whether we had a surface, check whether we have state->ref. (Ken) Clean up the logic a little bit by adding 'grid_updated' local. (Ken) Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> [v1] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-11 17:57:37 -07:00
Caio Marcelo de Oliveira Filho	f346b277d1	iris: Create binding table slot for num_work_groups only when needed Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-11 17:57:37 -07:00
Rui Salvaterra	7b43362f29	r300g: implement GLSL disk shader caching This implements GLSL disk shader caching for the R300-R500 series of AMD GPUs. Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-11 20:49:34 -04:00
Richard Thier	ffd2f948fe	r300g: restore performance after RADEON_FLAG_NO_INTERPROCESS_SHARING was added v1: Fix skipped slab allocators and the buffer cache. v2: Use only 1 domain for texture allocation v3: Added flag for the create_fence call too Based on Marek v1 and v2 proposed fixes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=1107812.patch Cc: 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-11 20:45:27 -04:00
Marek Olšák	ec0956a194	radeonsi: don't test SDMA perf if SDMA is disabled/unsupported	2019-06-11 20:05:21 -04:00
Marek Olšák	993bf52977	radeonsi: always interpolate PrimID as flat	2019-06-11 20:05:21 -04:00
Marek Olšák	7f7ffa0883	radeonsi: move color clamping to si_llvm_export_vs to unify the code	2019-06-11 20:05:21 -04:00
Marek Olšák	4773f5a293	radeonsi: use the ac helper for index buffer stores in the culling shader	2019-06-11 20:05:21 -04:00
Marek Olšák	579003e7bd	radeonsi: use the ac helper for image stores	2019-06-11 20:05:21 -04:00
Marek Olšák	deef3833f8	radeonsi: use the ac helper for SSBO stores	2019-06-11 20:05:21 -04:00
Marek Olšák	e5fe38484a	radeonsi: fixes for vec3 buffer stores in LLVM 9	2019-06-11 20:05:21 -04:00
Caio Marcelo de Oliveira Filho	9c81db8adb	iris: Enable PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED This avoids lowering of CS system values by GLSL (configured by state tracker). In i965 we don't use that lowering, and we also shouldn't need that in Iris. Using it causes some unnecessary round trips between values, e.g.: shader uses gl_LocalInvocationIndex, GLSL rewrites it in terms of gl_LocalInvocationID, then driver rewrites those in terms of gl_LocalInvocationIndex again. Copy propagation can make some of those go away, but not all as seen below. Intel SKL shader-db results: total instructions in shared programs: 15595189 -> 15594556 (<.01%) instructions in affected programs: 74880 -> 74247 (-0.85%) helped: 81 HURT: 4 helped stats (abs) min: 2 max: 172 x̄: 7.88 x̃: 4 helped stats (rel) min: 0.19% max: 5.66% x̄: 1.71% x̃: 1.23% HURT stats (abs) min: 1 max: 2 x̄: 1.25 x̃: 1 HURT stats (rel) min: 0.45% max: 1.65% x̄: 0.76% x̃: 0.46% 95% mean confidence interval for instructions value: -11.56 -3.34 95% mean confidence interval for instructions %-change: -1.91% -1.28% Instructions are helped. total loops in shared programs: 4831 -> 4831 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 372136618 -> 372145628 (<.01%) cycles in affected programs: 9218230 -> 9227240 (0.10%) helped: 131 HURT: 86 helped stats (abs) min: 1 max: 798 x̄: 39.79 x̃: 12 helped stats (rel) min: <.01% max: 6.75% x̄: 0.42% x̃: 0.13% HURT stats (abs) min: 2 max: 2442 x̄: 165.38 x̃: 6 HURT stats (rel) min: <.01% max: 20.83% x̄: 0.74% x̃: 0.12% 95% mean confidence interval for cycles value: -2.07 85.11 95% mean confidence interval for cycles %-change: -0.22% 0.30% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 11956 -> 11950 (-0.05%) spills in affected programs: 77 -> 71 (-7.79%) helped: 3 HURT: 0 total fills in shared programs: 25619 -> 25549 (-0.27%) fills in affected programs: 593 -> 523 (-11.80%) helped: 4 HURT: 0 LOST: 0 GAINED: 0 Total CPU time (seconds): 1695.69 -> 1706.03 (0.61%) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-11 15:12:17 -07:00
Caio Marcelo de Oliveira Filho	46de3beab1	gallium: Add PIPE_CAP_CS_DERIVED_SYSTEM_VALUES_SUPPORTED Tells whether or not the driver can handle gl_LocalInvocationIndex and gl_GlobalInvocationID. If not supported (the default), state tracker will lower those on behalf of the driver. v2: Add case to u_screen.c. (Anholt) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-11 15:12:17 -07:00
Caio Marcelo de Oliveira Filho	f03b21ae69	st/glsl: Perform some var optimizations Perform those before some derefs are gone when we lower the buffers after the st_nir_opts() call. Intel SKL shader-db results: total instructions in shared programs: 15593685 -> 15590708 (-0.02%) instructions in affected programs: 378078 -> 375101 (-0.79%) helped: 777 HURT: 44 helped stats (abs) min: 1 max: 68 x̄: 4.07 x̃: 4 helped stats (rel) min: 0.04% max: 31.58% x̄: 2.88% x̃: 1.37% HURT stats (abs) min: 1 max: 24 x̄: 4.20 x̃: 2 HURT stats (rel) min: 0.17% max: 8.00% x̄: 1.60% x̃: 1.27% 95% mean confidence interval for instructions value: -4.02 -3.23 95% mean confidence interval for instructions %-change: -2.93% -2.35% Instructions are helped. total loops in shared programs: 4815 -> 4815 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 371965528 -> 371788566 (-0.05%) cycles in affected programs: 184190307 -> 184013345 (-0.10%) helped: 3650 HURT: 2855 helped stats (abs) min: 1 max: 59400 x̄: 99.45 x̃: 15 helped stats (rel) min: <.01% max: 43.18% x̄: 2.60% x̃: 1.02% HURT stats (abs) min: 1 max: 16362 x̄: 65.16 x̃: 10 HURT stats (rel) min: <.01% max: 66.22% x̄: 2.78% x̃: 0.81% 95% mean confidence interval for cycles value: -53.73 -0.68 95% mean confidence interval for cycles %-change: -0.39% -0.08% Cycles are helped. total spills in shared programs: 11936 -> 11956 (0.17%) spills in affected programs: 443 -> 463 (4.51%) helped: 0 HURT: 8 total fills in shared programs: 25644 -> 25619 (-0.10%) fills in affected programs: 2306 -> 2281 (-1.08%) helped: 24 HURT: 2 LOST: 7 GAINED: 16 Total CPU time (seconds): 1679.04 -> 1695.69 (0.99%) shader-db results radeonsi (VEGA64): Totals from affected shaders: SGPRS: 180160 -> 179552 (-0.34 %) VGPRS: 115368 -> 114544 (-0.71 %) Spilled SGPRs: 5627 -> 5603 (-0.43 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 7808364 -> 7803268 (-0.07 %) bytes LDS: 192 -> 192 (0.00 %) blocks Max Waves: 19202 -> 19340 (0.72 %) Wait states: 0 -> 0 (0.00 %) Radeonsi results provided by Timothy. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-11 14:53:54 -07:00
Ville Syrjälä	6230bfeb65	anv/cmd_buffer: Reuse gen8 Cmd{Set, Reset}Event on gen7 Modern DXVK requires event support [1], but looks like it only uses vkCmdSetEvent() + vkGetEventStatus(). So we can just borrow the relevant code from gen8, leaving CmdWaitEvents still unimplemented. [1] `8c3900c533` v2: Also move CmdWaitEvents into genX_cmd_buffer.c (Jason) Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-11 16:25:07 -05:00
Ian Romanick	39f4dc23a5	intel/fs: Mark source 0 of bcsel as needing Boolean resolve The other sources of the bcsel behave like the sources of an and or other logical operation. However, source zero behaves differently. It is evaluated as a Boolean, so it needs to be resolved. No shader-db changes, but the tests mentioned in the bug get a couple instructions added back. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110857 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-06-11 12:12:07 -07:00
Rob Clark	f9f89df8bc	freedreno/a5xx: enable a540 Tested-by: Jeffrey Hugo <jeffrey.l.hugo@gmail.com> Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-11 12:03:10 -07:00
Rob Clark	832010f6ac	freedreno/a6xx: enable UBWC by default Flip the FD_MESA_DEBUG flag to a disable rather than enable, drop the obsolete comment (and bonus, drop unused softpin debug flag) Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	81cc555e9a	freedreno/a6xx: disallow UBWC for z24s8 This is slightly annoying because it mostly works.. but we have some issues to sort out about how to blit z24s8/x24s8/z24x8 with UBWC before we can enable UBWC by default. For now it is a step forward to at least enable it for non-z/s while we figure out how to blit z24s8+UBWC. (The basic issue is that pretending z24s8 is an equivalently sized rgba format for the purpose of blitting falls apart when UBWC is in the picture.) Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	4f1319a17d	freedreno/a6xx: use correct UBWC reg builders No functional change, the registers have the same layout as MRT flags pitch reg. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	d42ce659ed	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	490baa6974	freedreno/a6xx: disable UBWC for some formats An older blob claims to support UBWC w/ r32ui and r32i, but not r32f. Results from deqp indicate that it doesn't work with r32ui and r32i. This could also just mean that use as "IBO" (image) is more limited than as texture, although blob also doesn't seem to bother to try to use UBWC with images at all, so hard to know for sure. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	8ddffa75c0	freedreno/a6xx: handle non-UBWC-compatible image views Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	dac3bc9862	freedreno/a6xx: handle non-UBWC-compatible texture views Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	fe5c7b2b75	freedreno: add helper to uncompress UBWC resource We'll need this for a few edge cases, like image/sampler view that uses a format that UBWC does not support with a resource originally created in a format that UBWC does support. NOTE we could in some cases do an in-place uncompress. But that has a couple potential sharp edges: 1) the uncompressed buffer could have different layout, ie. a5xx with meta and pixel data of layers/levels interleaved. 2) if it comes mid-batch, it would force flush, or somehow fixing up cmdstream for draws already emitted. But with the resource shadowing approach we can rely on batch re-ordering to avoid splitting things.. older draws see the older compressed version, newer draws see the new uncompressed version of the rsc. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	846b8a76bd	freedreno: handle images in rebind_resource() Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	c6ae354299	freedreno: allow null discard box in shadow path When uncompressing a UBWC buffer, we don't want to discard anything. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	12201d7a8b	freedreno: swap UBWC state in shadow path It doesn't come up yet, as so far we only hit this path with linear buffers. But it will when we start re-using the shadow path for uncompressing UBWC buffers. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	3c9a31eb50	freedreno: add modifier param to fd_try_shadow_resource() To uncompress UBWC, I want to re-use the shadow path, but we'll need a way to request that the new buffer is not compressed. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Rob Clark	3b05a120a3	freedreno: correct modifier for UBWC buffers Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-11 10:55:27 -07:00
Chia-I Wu	15323c14fd	virgl: consider newly created resources idle A newly created resource can be regarded as idle. We don't care if the RESOURCE_CREATE command has been retired, unless it is used for fencing. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-11 10:03:54 -07:00
Chia-I Wu	9e4452cfd9	virgl: make resource_wait/resource_is_busy cheaper The round trip to the kernel is expensive. Add a local cache to avoid it when possible. There is a race condition when two contexts access the same resource at the same time (e.g., ctx1 submits a cmdbuf that accesses a resource while ctx2 maps the resource). But that is probably an app bug in the first place. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-11 10:03:54 -07:00
Chia-I Wu	ddc90be907	virgl: add virgl_drm_{alloc,free,clear}_res_list Helpers to work with resource list. virgl_drm_release_all_res is removed. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-11 10:03:54 -07:00
Chia-I Wu	71465fe569	virgl: do not cache external resources We should not reuse a resource for other purposes when it can still be accessed by another process or device. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-11 10:03:54 -07:00
Alyssa Rosenzweig	7d43999e63	panfrost: Enable AFBC on depth/stencil This seems to be a performance win, but more rigorous testing is necessary to figure out the exact circumstances when this is good/bad. Incidentally, this fixes non-aligned ZS. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:46:43 -07:00
Alyssa Rosenzweig	15f62b8e7c	panfrost: Linear depth/stencil should be aligned We might render to it. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:46:43 -07:00
Alyssa Rosenzweig	d7ad29ce25	panfrost/midgard: Decode LOD/bias registers For constant LODs/biases, we can use an immediate embedded in the texture (already decoded); for non-constant, we have to use a register squeezed into the usual immediate field, which is decoded here. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	b4a3296e77	panfrost/midgard: Decode texture offset register swizzle Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	4e9e42cc56	panfrost/midgard/disasm: include textureGather() Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	6c18ae33bc	panfrost/midgard: Support negative immediate offsets It's not at all clear why this works for texelFetch but not texture. Maybe the top bits are dual-purpose on other texturing ops...? Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	4d8157f12d	panfrost/midgard: Fix redundant mask redundancy Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	3dee556c4e	panfrost/midgard/disasm: Print LOD for texelFetch Its encoding differs slightly from the LOD used in normal texture calls. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	cda9f32909	panfrost/midgard: Identify the in_reg_full field This is clear for texelFetch, hence the confusion with Bifrost's filter field, but it's much more general in reality. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	445a7b523f	panfrost/midgard/disasm: Correctly dump bias/LOD Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	873a3ed342	panfrost/midgard/disasm: Cleanup texture op code Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	289405392d	panfrost/midgard/disasm: Add missing space Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	f4ee8d055c	panfrost/midgard/disasm: LOD immediate/register select Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig	59fa7c95c8	panfrost/midgard/disasm: Use texture op name bare This allows us to show a call to textureLod in a reasonable way. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:18 -07:00
Alyssa Rosenzweig	109460f03a	panfrost/midgard/disasm: Varying perspective divides With an extra flag, we're able to do a perspective division "for free" while loading a varying. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:18 -07:00
Alyssa Rosenzweig	fc472007e7	panfrost/midgard: Add perspective division opcodes ...on the load/store unit, not the ALUs. Looks goofy but hey. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:18 -07:00
Alyssa Rosenzweig	b0396d6dda	panfrost/midgard: Print texture offsets This patch identifies the two modes of offsets in a texture instruction (immediate and register, disambiguated by the bit-once-known-as "has_offset") and implements disassembly for both. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:18 -07:00
Alyssa Rosenzweig	ed1c48e91d	panfrost/midgard: Expand texture to 4-channel swizzle This eliminates some unknowns, clarifies 3D textures, and will maybe help with array/shadow textures? Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-11 08:44:18 -07:00
Juan A. Suarez Romero	b586ed51f3	docs: update calendar, add news item and link release notes for 19.1.0 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-06-11 17:38:22 +02:00
Juan A. Suarez Romero	cc7fc7e319	docs: Add SHA256 sums for 19.1.0 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `2a5b4e2b9f`)	2019-06-11 15:26:42 +00:00
Juan A. Suarez Romero	7e8e49475c	docs: Add release notes for 19.1.0 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `1517811f4f`)	2019-06-11 15:26:38 +00:00
Samuel Iglesias Gonsálvez	32e1d85cb6	radv: assert on inline uniform blocks in radv_CmdPushDescriptorSetKHR() According to the Vulkan spec, inline uniform blocks are not allowed to be updated through vkCmdPushDescriptorSetKHR(). These are the spec quotes from "13.2.1. Descriptor Set Layout" that are relevant for this case: "VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR specifies that descriptor sets must not be allocated using this layout, and descriptors are instead pushed by vkCmdPushDescriptorSetKHR." "If flags contains VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR, then all elements of pBindings must not have a descriptorType of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT". There is no explicit mention in vkCmdPushDescriptorSetKHR() to forbid this case but it is implied in the creation of the descriptor set layout as aforementioned. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-11 16:32:27 +02:00
Samuel Iglesias Gonsálvez	d0c52ff610	anv: ignore inline uniform blocks in anv_CmdPushDescriptorSetKHR() According to the Vulkan spec, inline uniform blocks are not allowed to be updated through vkCmdPushDescriptorSetKHR(). These are the spec quotes from "13.2.1. Descriptor Set Layout" that are relevant for this case: "VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR specifies that descriptor sets must not be allocated using this layout, and descriptors are instead pushed by vkCmdPushDescriptorSetKHR." "If flags contains VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR, then all elements of pBindings must not have a descriptorType of VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT". There is no explicit mention in vkCmdPushDescriptorSetKHR() to forbid this case but it is implied in the creation of the descriptor set layout as aforementioned. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-11 16:25:53 +02:00
Eric Engestrom	773ff93bc4	egl: compare the whole list of attributes `memcmp()` compares a given number of bytes, but `EGLAttrib` is larger than a byte. Fixes: `8e991ce539` "egl: handle the full attrib list in display::options" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-11 12:18:09 +00:00
Eduardo Lima Mitev	3fb7b1fd35	freedreno/a5xx: Fix indirect draw max_indices calculation The number of elements to draw should not be affected by the offset. A similar fix was submitted for a6xx at `79180a05`. Fixes these dEQP tests on a5xx: dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_8 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_2500 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_separate_grid_500x500_drawcount_2500 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_combined_grid_500x500_drawcount_2500 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_8 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_2500 Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-11 08:28:45 +02:00
Samuel Pitoiset	40699f74b8	radv: remove extra assignment in radv_decompress_resolve_subpass_src() baseArrayLayer is defined twice, trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-11 08:17:22 +02:00
Samuel Pitoiset	c39a1611ab	radv: add radv_get_resolve_pipeline() helper in the graphics path Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-11 08:06:42 +02:00
Samuel Pitoiset	b06d1f029d	radv: do not decompress all image layers before resolving inside a subpass When decompressing resolve source images, we should rely on the framebuffer layer count instead of resolving all image layers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-11 08:06:39 +02:00
Samuel Pitoiset	4efbd963ec	radv: initialize the aspect mask when decompressing resolve source images Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-11 08:06:35 +02:00
Samuel Pitoiset	c31a07fa85	radv: perform proper layout transitions before resolving Use an explicit pipeline barrier for doing layout transitions instead of duplicating some code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-11 08:06:32 +02:00
Samuel Pitoiset	92fa6264cb	radv: do not resolve all image layers with compute inside a subpass When resolving inside a subpass, we should rely on the framebuffer layer count instead of resolving all image layers. This should improve performance of layered resolves a bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-11 08:06:28 +02:00
Kenneth Graunke	a8588f512b	iris: Bypass half-float pack/unpack lowering. This skips GLSL IR lowering of pack/unpackHalf operations, allowing the NIR optimizer to see them. Improves performance in Synmark2's OglCSDof by about 2x, by cutting about 90% of the cycles from one of the compute shaders. shader-db statistics on Skylake: 4 compute shaders went from SIMD8 to SIMD16. total instructions in shared programs: 15598871 -> 15542568 (-0.36%) instructions in affected programs: 143016 -> 86713 (-39.37%) helped: 144 HURT: 0 helped stats (abs) min: 17 max: 4669 x̄: 390.99 x̃: 164 helped stats (rel) min: 7.48% max: 85.28% x̄: 30.17% x̃: 24.22% 95% mean confidence interval for instructions value: -510.50 -271.49 95% mean confidence interval for instructions %-change: -32.70% -27.65% Instructions are helped. total cycles in shared programs: 371973958 -> 368902103 (-0.83%) cycles in affected programs: 5557722 -> 2485867 (-55.27%) helped: 144 HURT: 0 helped stats (abs) min: 106 max: 1026600 x̄: 21332.33 x̃: 1697 helped stats (rel) min: 0.53% max: 88.98% x̄: 36.12% x̃: 34.67% 95% mean confidence interval for cycles value: -41570.02 -1094.64 95% mean confidence interval for cycles %-change: -38.44% -33.80% Cycles are helped. total spills in shared programs: 11936 -> 11903 (-0.28%) spills in affected programs: 110 -> 77 (-30.00%) helped: 3 HURT: 2 total fills in shared programs: 25644 -> 25178 (-1.82%) fills in affected programs: 677 -> 211 (-68.83%) helped: 5 HURT: 0 total loops in shared programs: 4830 -> 4829 (-0.02%) loops in affected programs: 1 -> 0 helped: 1 HURT: 0	2019-06-10 16:01:36 -07:00
Bas Nieuwenhuizen	e0d12f79c5	radv: Handle UNDEFINED format in image format list. Was watching a presentation on YT where this was used and it turns out it is not invalid. The only case it is actually valid as format in the creation of an image or image view is with Android Hardware Buffers which have their format specified externally. So we can just ignore all entries with VK_FORMAT_UNDEFINED. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-10 22:21:16 +00:00
Bas Nieuwenhuizen	39c71e0025	radv: Prevent out of bound shift on 32-bit builds. uintptr_t is 32-bits then and shifting it by 32 bits results in undefined behavior IIRC. Fixes: `b3c8de1c55` "radv: save all descriptor pointers into the trace BO" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-10 22:18:51 +00:00
Caio Marcelo de Oliveira Filho	2cb5907508	glsl: Check order and uniqueness of interlock functions With this commit all remaining compilation tests in Piglit for ARB_fragment_shader_interlock will pass. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2019-06-10 14:29:32 -07:00
Caio Marcelo de Oliveira Filho	b7c9fc72fd	glsl: Make interlock builtins follow same compiler rules as barriers Generalize the barrier code to provide correct error messages for other builtins. Fixes most of piglit compilation tests for ARB_fragment_shader_interlock. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2019-06-10 14:29:26 -07:00
Eduardo Lima Mitev	fb2169040a	nir/opt_algebraic: Fix rules for imadsh_mix16 The rules added in patch `3addd7c` are inverted: It should be: (al * bh) << 16 + c instead of: (ah * bl) << 16 + c Fixes a number of regressions under dEQP-GLES31.functional.draw_indirect.compute_interop.large.* on Freedreno. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-10 22:27:46 +02:00
Alyssa Rosenzweig	e9703fb416	panfrost: Ignore discards in dead branch analysis Fixes regressions in dEQP-GLES2.functional.shaders.discard.dynamic_loop_* Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 08:23:08 -07:00
Samuel Pitoiset	e9316fdfd4	radv: fix setting CB_SHADER_MASK for dual source blending CB_SHADER_MASK was computed without the second color buffer format which looks totally wrong to me. While we are at it, copy a comment from RadeonSI. Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-10 17:21:56 +02:00
Alyssa Rosenzweig	50ffaaff3b	panfrost/midgard: Disambiguate register mode We postfix instructions by their size if a destination override is in place (a la AT&T assembly), disambiguating instruction sizes. Previously, "16-bit instruction, 16-bit dest, 16-bit sources" disassembled identically to "32-bit instruction, 16-bit dest, 16-bit sources", which is semantically distinct due to the lessened opportunity for parallelism but (potentially) greater precision. Adding a postfix removes the ambiguity and relieves mental gymnastics reading weird disassemblies even in some cases that are not ambiguous. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:50:12 -07:00
Alyssa Rosenzweig	8027cc9975	panfrost/midgard: Expose vec8/vec16 modes Midgard ALUs can operate in one of four modes: vec2 64-bit, vec4 32-bit, vec8 16-bit, or vec16 8-bit. Our compiler (and indeed, any OpenGL ES shader) only uses 32-bit (and eventually vec4 16-bit) modes in normal circumstances. Nevertheless, the other modes do exist and are easily accessible through OpenCL; they also come up in cases like blend shaders. While we have had minimal support for decoding 8-bit/64-bit modes, we did so pretending they were vec4 in each case; 16-bit registers had a synthetically duplicated register file to separate lo/hi halves, etc. This works for GL, but it doesn't map to what the hardware is -actually- doing, which can cause some headscratchingly bizarre disassemblies from OpenCL. So, we dive in the deep end and support these other modes natively in the disassembler, using absurdly long masks/swizzles, since the hardware is considerably more flexible than what was exposed before. Outside of some fixed routines for blending, none of the above is supported in the compiler yet. But it's better to have it in the ISA definitions and disassembler than not, for future use if nothing else. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:50:11 -07:00
Alyssa Rosenzweig	2d0bda0885	panfrost/midgard: Add shifting int modifiers As a source modifier, shift allows shifting a value left by the bit size, useful in conjunction with a greater register mode, for instance to implement `upsample`. As a concrete example, the following OpenCL: ushort hr0 = /* ... */, uint r1 = /* ... */; uint r2 = (convert_uint(hr0) << 16) ^ b; compiles to the following Midgard assembly: ixor r, (hr0) << 16, b In reverse, the ".hi" output modifier shifts the value right by the bit size, leaving just the carry/overflow at the bottom. To implement _hi functions in OpenCL (for <64-bit), we do arithmetic in the 2x higher mode with the .hi modifier. (For 64-bit, things are hairier, since there is not a 128-bit int mode). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:50:11 -07:00
Alyssa Rosenzweig	6780481a3f	panfrost/midgard: Add integer outmods For floats, output modifiers determine clamping behaviour. For integers, they determine wrapping/saturation behaviour (or shifting -- see next commit). These are very different; they are conceptually two unrelated enums union'ed together; the distinction is responsible for many-a-bug. While clamping behaviour for floats was clear from GL, the int behaviour is only known from OpenCL contortion with convert_*_sat() functions. With the underlying functions known, clean up the codebase, likely fixing outmod type related bugs in the process. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:50:11 -07:00
Alyssa Rosenzweig	215b8844ee	panfrost/midgard: Note floating compares type convert OP_TYPE_CONVERTS denotes an opcode that returns a different type than its source (going from int-domain to float-domain or vice versa), named after the f2i/i2f family of opcodes it covers. We care because source mods are determined by the source type (i/f) but output modifiers are determined by the output type (equals the source type, unless the op type converts, in which case it's the opposite). The upshot is that floating-point compares (feq/fne/etc) actually do type-convert. That is, they take in floating-points and output in integer space (a boolean), so we mark them off this way to ensure the correct output modifiers are used. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:50:11 -07:00
Alyssa Rosenzweig	d48d991ce2	panfrost: Align linear renderable resources It's just -easier- to render to aligned framebuffers. For winsys targets, we already align, but even for an internal linear FBO we ought to align everything nicely. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:48:07 -07:00
Alyssa Rosenzweig	d89e0716a1	panfrost: Fix stride check when mipmapping Now that we support custom strides on mipmapped textures (theoretically, at least), extend the stride check to support mipmaps. Fixes incorrect strides of linear windows in Weston. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-06-10 06:47:18 -07:00
Alyssa Rosenzweig	416fc3b5ef	panfrost: Refactor texture/sampler upload We move some code packing the texture/sampler descriptors into dedicated functions (out of the terrifyingly long emit_for_draw monolith), cleaning them up as we go. The discovery triggering the cleanup is the format for including manual strides in the presence of mipmaps/cubemaps. Rather than placed at the end like previously assumed, they are interleaved after each address. This difference is relevant when handling NPOT linear mipmaps. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:45:33 -07:00
Alyssa Rosenzweig	a35069a7b5	panfrost: Refactor blitting code We refactor the wallpaper rendering code to separate the wallpaper-specific bits from the general blitting capabilities. In the (hopefully near) future, we'll turn this on to implement real Gallium blits, e.g. for automatic mipmap generation. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:45:25 -07:00
Alyssa Rosenzweig	d878753efa	panfrost: Refactor AFBC code This patch does a substantial cleanup of the code for handling AFBC, moving various disparate misplaced functions into a new central pan_afbc.c file. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:45:14 -07:00
Alyssa Rosenzweig	b4763984ac	panfrost: Move pan_screen() to pan_screen.h Trivial. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:45:05 -07:00
Alyssa Rosenzweig	a38583e352	panfrost: Always align strides to cache line (64) (Performance tweak.) Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 06:44:56 -07:00
Emil Velikov	0534fcf57d	docs: fixup 19.0.5 <> 19.0.6 confusion The title of the release notes says 19.0.5 while the rest of the file (correctly) says 19.0.6 Fixes: `fe79d75ccf` ("docs: Add relnotes for 19.0.6") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan at pnwbakers.com>	2019-06-10 14:04:39 +01:00
Emil Velikov	a379b1c0ee	mapi: correctly handle the full offset table Earlier commit converted ES1 and ES2 to a new, much simpler, dispatch generator. At the same time, GL/glapi and the driver side are still using the old code. There is a hidden ABI between GL*.so and glapi.so, former referencing entry-points by offset in the _glapi_table. Hence earlier commit added the full table of entry-points, alongside a marker for other cases like indirect GL(X) and driver-size remapping. Yet the patches did not handle things fully, thus it was possible to get different interpretations of the dispatch table after the marker. This commit fixes that, adding an indicative error message to catch future bugs. While here correct the marker (MAX_OFFSETS) comment. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110302 Fixes: `cf317bf093` ("mapi: add all _glapi_table entrypoints to static_data.py") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-06-10 14:04:30 +01:00
Emil Velikov	497de977bd	mapi: add static_data offset to EXT_dsa As elaborated in the next patch, there is some hidden ABI that effectively requires most entrypoints to be listed in the file. Cc: Marek Olšák <marek.olsak@amd.com> Fixes: `d2906293c4` ("mesa: EXT_dsa add selectorless matrix stack functions") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-06-10 14:04:25 +01:00
Emil Velikov	61960547df	mapi: add static_data offset to MaxShaderCompilerThreadsKHR As elaborated in the next patch, there is some hidden ABI that effectively requires most entrypoints to be listed in the file. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110302 Cc: Marek Olšák <maraeo@gmail.com> Fixes: `c5c38e831e` ("mesa: implement ARB/KHR_parallel_shader_compile") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-06-10 14:04:18 +01:00
Mathias Fröhlich	a7ecf78b90	egl: Let the caller of dri2_create_drawable decide about loaderPrivate. In the call arguments to dri2_create_drawable decouple loaderPrivate from dri2_surf. For all callers of dri2_create_drawable the two pointers are the same with the exception of the gbm backed platform. Let the calling code of dri2_create_drawable decide what loaderPrivate shall be. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-06-10 11:06:48 +02:00
Samuel Pitoiset	91aa25f462	radv: fix alpha-to-coverage when there is unused color attachments When alphaToCoverage is enabled, we should always write the alpha channel of MRT0 if it's unused. This now matches RadeonSI. This fixes the new CTS: dEQP-VK.pipeline.multisample.alpha_to_coverage_unused_attachment.samples_*.alpha_invisible Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-10 09:23:41 +02:00
Tomeu Vizoso	2fe7f9f2ae	panfrost: ci: Switch from direct Docker use to buildah Use the infrastructure in wayland/ci-templates to build the container images. This prevents from getting into some situations in which the images wouldn't be rebuilt, and allows us to share some infrastructure with other projects in freedesktop.org. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Suggested-by: Michel Dänzer <michel@daenzer.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-10 08:09:23 +02:00
Kenneth Graunke	81582e9366	gallium/u_transfer_helper: Free the staging buffer on unmap. u_transfer_helper sometimes mallocs a staging buffer, and leaked it. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-09 15:16:10 -07:00
Lionel Landwerlin	17898a9b7e	intel/gpu_dump: fix argument passing We were dropping "/' around arguments grouped together. This was triggering failures with : $ ./framemetrics -g "Memory Writes Distribution Gen9" -o /tmp/output.csv -f ./my.trace 10 11 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-06-09 19:45:13 +00:00
Eric Engestrom	93349d7118	util/os_file: suppress sign comparison warning Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-09 13:14:13 +00:00
Eric Engestrom	fd5c18de88	util/os_file: fix error being sign-cast back and forth Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-09 13:14:13 +00:00
Eric Engestrom	341ba406fd	util/os_file: avoid shadowing read() with a local variable Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-09 13:14:13 +00:00
Eric Engestrom	7e35f20d44	util/os_file: actually return the error read() gave us Fixes: `316964709e` "util: add os_read_file() helper" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-09 13:14:13 +00:00
Alexandros Frantzis	f8f222ea36	virgl: Work around possible memory exhaustion Since we don't normally flush before performing copy transfers, it's possible in some scenarios to use too much memory for staging resources and start failing. This can happen either because we exhaust the total available memory (including system memory virtio-gpu swaps out to), or, more commonly, because the total size of resources in a command buffer doesn't fit in virtio-gpu video memory. To reduce the chances of this happening, force a flush before a copy transfer if the total size of queued staging resources exceeds a certain limit. Since after a flush any queued staging resources will be eventually released, this ensures both that each command buffer doesn't require too much video memory, and that we don't end up consuming too much memory for staging resources in total. Fixes kernel errors reported when running texture_upload tests in glbench. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:45 -07:00
Alexandros Frantzis	e34f79c918	virgl: Remove incorrect resource wait condition Now that we have copy transfers in place, we can remove the incorrect resource wait condition. Copy transfers and other optimizations minimize the performance impact of this removal, while providing the correct behavior. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:43 -07:00
Alexandros Frantzis	236c55f650	virgl: Use copy transfers for textures Extend copy transfers to also be used for busy textures. Performance results: Unigine Valley, qemu before: 22.7 FPS after: 23.1 FPS Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:42 -07:00
Alexandros Frantzis	a22c5df079	virgl: Use buffer copy transfers to avoid waiting when mapping We typically need to wait for a buffer to become ready before mapping, so that we don't write new contents while the host is still using the old contents. However, if we are allowed to discard the contents of the mapped buffer range, then we can avoid waiting by using a staging buffer range which we guarantee to never be busy, copying from the staging buffer range to the target buffer in the host. This commit implements this optimization by utilizing a dedicated u_upload_mgr for the staging buffer. Performance results: Twilight Struggle (Steam/Proton), qemu before: 7 FPS after: 25 FPS glmark2 ubo, qemu before: 38 FPS after: 331 FPS Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Suggested-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:39 -07:00
Alexandros Frantzis	6e7726e50c	virgl: Support copy transfers Support transfers that use a different resource as the source of data to transfer. This will be used in upcoming commits to send data to host buffers through a transfer upload buffer, in order to avoid waiting when the buffer resource is busy. Note that we don't support queueing copy transfers in the transfer queue. Copy transfers should be emitted directly in the command queue, allowing us to avoid flushes before them and leads to better performance. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:36 -07:00
Alexandros Frantzis	199d95f29e	virgl: Add copy_transfer3d definitions Introduce definitions for the copy_transfer3d protocol command and virgl capability. This command transfers data to the host by copying through another resource, and will be used in upcoming commits to avoid waiting when transferring data for busy resources. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:34 -07:00
Alexandros Frantzis	ccec1555c1	virgl: Make VIRGL_BIND_STAGING resources cacheable This could help performance when trying to recreate such resources for copy transfers. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:33 -07:00
Alexandros Frantzis	636345f496	virgl: Support VIRGL_BIND_STAGING Support a new virgl bind type for staging buffers which don't require dedicated host-side storage. These will be used to implement copy transfers. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:31 -07:00
Alexandros Frantzis	f38cdaebac	virgl: Avoid unfinished transfer_get with PIPE_TRANSFER_DONTBLOCK If we are not allowed to block, and we know that we will have to wait, either because the resource is busy, or because it will become busy due to a readback, return early to avoid performing an incomplete transfer_get. Such an incomplete transfer_get may finish at any time, during which another unsynchronized map could write to the resource contents, leaving the contents in an undefined state. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Suggested-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:22 -07:00
Alexandros Frantzis	8eb8222c10	virgl: Deduplicate checks for resource caching Also fixes a missed check for VIRGL_BIND_CUSTOM in one of the duplicate code snippets. Note that legacy fences also use VIRGL_BIND_CUSTOM, but we ensured they don't go through the cache in the previous commit. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:20 -07:00
Alexandros Frantzis	e0ffcdf16a	virgl: Don't try to use cached resources for legacy fences Resources for fences should not be from the cache, since we are basing the fence status on the resource creation busy status. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:45:16 -07:00
Alexandros Frantzis	8089d3658a	virgl: More info about chosen alignment value Add more info about why the value of VIRGL_MAP_BUFFER_ALIGNMENT was chosen. Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2019-06-07 21:44:53 -07:00
Chia-I Wu	371743157e	virgl: store all info about atomic buffers We will need the full info. This also speeds up virgl_attach_res_atomic_buffers and fixes resource leaks when the context is destroyed. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-07 22:47:07 +00:00
Chia-I Wu	98fd742d7e	virgl: add shader images to virgl_shader_binding_state It replaces virgl_context::images. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-07 22:47:07 +00:00
Chia-I Wu	f965efb3c8	virgl: add SSBOs to virgl_shader_binding_state It replaces virgl_context::ssbos. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-07 22:47:07 +00:00
Chia-I Wu	920c4143f0	virgl: add UBOs to virgl_shader_binding_state It replaces virgl_context::ubos. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-07 22:47:07 +00:00
Chia-I Wu	2e21d66d7a	virgl: add virgl_shader_binding_state virgl_shader_binding_state will be used to manage all per-stage shader bindings. For now, it manages only sampler views. This replaces virgl_textures_info and fixes some issues - start_slot is now honored - views outside of [start_slot, start_slot+count) are unmodified - views are released when the context is destroyed Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-07 22:47:07 +00:00
Kenneth Graunke	30314270d4	iris: Zero shs->cbuf0 when binding a passthrough TCS Fixes valgrind errors when running two CTS tests back to back: - KHR-GL45.shader_image_load_store.basic-allTargets-loadStoreT* (The first test has an actual TCS, the second uses passthrough.)	2019-06-07 15:13:42 -07:00
Jason Ekstrand	1e6b32d08c	intel/blorp: Only double the fast-clear rect alignment on HSW This restriction was accidentally added to the BSpec/PRM as an unrestricted restriction starting with the HSW docs and it was never removed. However, it only ever applied to HSW and actually potentially causes problems on BDW and above where we have mipmapped fast-clears. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-06-07 22:00:55 +00:00
Rob Clark	3c456cf583	freedreno/a6xx: re-arrange program stageobj/group Split out a separate program config state group to run early before the other groups. This seems to help w/ intermittent "missed tiles" (although I had assumed that was a mem2gmem issue), or at least I can't reproduce that issue with this patch, but can without. It has the benefit of HLSQ_VS_CNTL.CONSTLEN matching for VS and BS. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-07 12:07:29 -07:00
Rob Clark	958f6ffb60	freedreno/a6xx: fix hangs with newer sqe fw With the newer (v1.76) fw, we were getting hangs (compared to older v1.66 fw). Re-work the GMEM code to structure things a bit closer to the blob. This moves some PKT7 packets from IB2 to IB1, which I think is what was confusing SQE and causing it to get stuck in an infinite loop. But in general structuring things at least closer to the same way blob does makes it easier to compare cmdstream. Note: this is a bit on the large side for what I'd normally consider for stable.. but right now it is looking like it is the newer fw that is headed for linux-firmware. This should defn have some soak time on master, but probably a good idea for this patch to end up in distro mesa builds by the time a630_sqe.fw hits linux-firmware. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-07 12:07:29 -07:00
Rob Clark	1d002cfade	freedreno/a6xx: WFI before RB_CCU_CNTL writes This seems to be in a block of non buffered/context regs. Blob always WFIs before write, so probably a good idea. Annoyingly, compared to earlier gens, it is a bit harder to tell from the register offset whether it is a buffered reg, it isn't as simple as everything below 0x2000, it seems. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-07 12:07:29 -07:00
Rob Clark	8a02ca807d	freedreno/a6xx: don't pre-dispatch texture fetch on accident Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-07 12:07:29 -07:00
Rob Clark	b820c09fa8	freedreno/a6xx: fix issues with gallium HUD In some cases the draw for the text wasn't working. This seems to be fixed by resyncing some of the "golden registers" from blob (initial values were based on somewhat older blob version). Perhaps good to have a bit of soak time on master, but would be good to eventually land in 19.x stable branches. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-07 12:07:29 -07:00
Nanley Chery	b4198e792c	anv/cmd_buffer: Initalize the clear color struct for CNL+ On CNL+, the clear color struct is composed of RGBA channel values and fields which are either reserved by the HW or used to control fast-clears. Currently anv initializes the channel values to zero and allows the other fields to be undefined. Satisfy the MBZ field requirements by removing an optimization that doesn't hold true for CNL+ and pulling in the number of dwords to initialize from ISL. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-07 18:43:06 +00:00
Jon Turney	87173ded6e	glx/windows: Fix compilation with -Werror-format Fix compilation where the DWORD type is used with a format, after -Werror-format added by `c9c1e261`. Some Win32 API types are different fundamental types in the 32-bit and 64-bit versions. This problem is then further compounded by the fact that whilst both 32-bit Cygwin and 32-bit MinGW use the ILP32 data model, 64-bit MinGW uses the LLP64 data model, but 64-bit Cygwin uses the LP64 data model. This makes it near impossible to write printf format specifiers which are correct for all those targets. In the Win32 API, DWORD is an unsigned, 32-bit type. So, it is defined in terms of an unsigned long, except in the LP64 data model used by 64-bit Cygwin, where it is an unsigned int. It should always be safe to cast it to unsigned int and use %u or %x. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 11:28:48 -07:00
Kenneth Graunke	cd796120c9	iris: Rename bind_state to bind_shader_state. bind_state is possibly the worst name ever. For create, we used create_shader_state, which is more descriptive. Put shader in the name.	2019-06-07 11:26:20 -07:00
Kenneth Graunke	d5d2fb5c4c	isl: Mark enum isl_channel_select packed so it becomes 1 byte. I recently discovered that the following code led to valgrind errors: struct isl_swizzle swizzle = ISL_SWIZZLE_IDENTITY; VALGRIND_CHECK_MEM_IS_DEFINED(&swizzle, sizeof(swizzle)); which is surprising, because struct isl_swizzle is simply: struct isl_swizzle { enum isl_channel_select r:4; enum isl_channel_select g:4; enum isl_channel_select b:4; enum isl_channel_select a:4; }; and the above code initializes all of them with a C99 initializer. Iván Briano reminded me that C99 initializers don't necessarily zero padding. A quick inspection revealed that sizeof(struct isl_swizzle) was 4 (rather than the expected 2). Ian Romanick suggested changing it to uint16_t, since this is essentially dicing up an unsigned, and that worked. This patch marks enum isl_channel_select packed, changing its size from 4 bytes to 1 byte. This then makes struct isl_swizzle 2 bytes, with no bogus padding fields. This eliminates valgrind undefined memory warnings. These isl_swizzle values become part of our BLORP blit program keys, which are then hashed. This undefined padding was being included in the hashing, possibly leading to issues. I originally saw this error when running KHR-GL45.texture_size_promotion.functional in iris under valgrind. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-07 11:09:44 -07:00
Alyssa Rosenzweig	e1c14b2820	panfrost/ci: Texture wrap tests are legitimately fixed These depended on the wallpaper reload. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:29 -07:00
Alyssa Rosenzweig	8442dde169	panfrost/midgard: Lower inot to inor with 0 We were previously lowering to inand, but the second arg was not duplicated so inot would always return ~0. Oops. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:29 -07:00
Alyssa Rosenzweig	d415748955	panfrost/midgard: Cleanup tag fetch in disassembler Trivial. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:29 -07:00
Alyssa Rosenzweig	d3ad8d6b48	panfrost/midgard: Use fancy iterator Trivial cleanup. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:29 -07:00
Alyssa Rosenzweig	ae20bee75e	panfrost/midgard: Cull dead branches This fixes bugs with complex control flow. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig	c62f2ff852	panfrost/midgard: Add mir_print_bundle helper This helps with debugging scheduling/emission. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig	fd6d6c1b15	panfrost/midgard/disasm: Pretty-print branch tags Just makes it a little more obvious what's going on. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig	2ebf22c399	panfrost/ci: Note some since-fixed tests Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig	de8d49acdc	panfrost/midgard: Vectorize I/O This uses the new mesa/st functionality for NIR I/O vectorization, which eliminates a number of corner cases (resulting in assorted dEQP failures and regressions) and should improve performance substantially due to lessened pressure on the load/store pipe. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig	4aced18031	panfrost/midgard: Remove varyings delay pass This pass interfered with the more delicate path required for non-vectorized I/O. It's also ugly and duplicating the job of an actual honest-to-goodness scheduler. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:28 -07:00
Alyssa Rosenzweig	43568f2675	panfrost/midgard: Apply component to load_input Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-07 09:05:28 -07:00
Eric Engestrom	440fe0eb43	nir: fix s/&&/\|\|/ typo Fixes: `cd73b6174b` "nir/lower_to_source_mods: Stop turning add, sat, and neg into mov" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-07 16:06:25 +01:00
Kristian H. Kristensen	b9bbac6234	freedreno/a6xx: Drop struct stage array This now boils down to just picking between binning or vertex shader and dummy_fs or real fs, which we can do in a couple of lines of code instead. The constlen logic isn't doing what it thinks it's doing, both constlens at this point MAX2(s[VS].constlen, align(state->bs->constlen, 4)); are binning shader constlens. We'll have to revisit the constlen logic, but this commit doesn't change how it works. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 07:33:12 -07:00
Kristian H. Kristensen	9382a3c11d	freedreno/a6xx: Drop support for SS6_DIRECT shader upload a6xx only supports indirect shaders. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 07:33:10 -07:00
Kristian H. Kristensen	0ef00ceb2e	freedreno/a6xx: Share shader_t_to_opcode We have a similar function in fd6_program.c. Move to fd6_emit.h and share. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 07:33:03 -07:00
Kristian H. Kristensen	4552162e2d	freedreno/a6xx: Consolidate more of dword 0 building in fd6_draw_vbo There's already a bit of duplicated logic here and tessellation will add more. Build up dword 0 in fd6_draw_vbo() and drop the a4xx in the process. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 07:32:59 -07:00
Kristian H. Kristensen	cae6b4d741	freedreno: Move fd4_size2indextype() helper to freedreno_util.h In preparation for refactoring fd6_draw.c a bit. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 07:32:34 -07:00
Samuel Pitoiset	0905189a25	radv: enable VK_EXT_sample_locations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:11:17 +02:00
Samuel Pitoiset	05f5fa661f	radv: enable HTILE for images that might need variable sample locations This is now supported. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:11:14 +02:00
Samuel Pitoiset	e7677a697b	radv: handle sample locations during automatic layout transitions From the Vulkan spec 1.1.109: "Some implementations may need to evaluate depth image values while performing image layout transitions. To accommodate this, instances of the VkSampleLocationsInfoEXT structure can be specified for each situation where an explicit or automatic layout transition has to take place. [...] and VkRenderPassSampleLocationsBeginInfoEXT can be chained from VkRenderPassBeginInfo to provide sample locations for layout transitions performed implicitly by a render pass instance." Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:11:11 +02:00
Samuel Pitoiset	d0d41e58c3	radv: determine the first subpass id for every attachments Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:11:08 +02:00
Samuel Pitoiset	f58e9f6d69	radv: handle sample locations during explicit depth/stencil transitions From the Vulkan spec 1.1.109, "Some implementations may need to evaluate depth image values while performing image layout transitions. To accommodate this, instances of the VkSampleLocationsInfoEXT structure can be specified for each situation where an explicit or automatic layout transition has to take place. VkSampleLocationsInfoEXT can be chained from VkImageMemoryBarrier structures to provide sample locations for layout transitions performed by vkCmdWaitEvents and vkCmdPipelineBarrier calls." This handles explicit depth/stencil layout transitions performed with CmdWaitEvents() or CmdPipelineBarrier(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:11:01 +02:00
Samuel Pitoiset	a20925f2a9	radv: allow the depth decompress pass to emit dynamic sample locations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:11:00 +02:00
Samuel Pitoiset	2dd8dfd913	radv: allow to set dynamic sample locations to the depth decompress pass If VK_EXT_sample_locations is used, the driver might need to emit the sample locations specified during layout transitions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:10:55 +02:00
Samuel Pitoiset	d78990c174	radv: allow to save/restore sample locations during meta operations This will be used for the depth decompress pass that might need to emit variable sample locations during layout transitions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-07 13:10:50 +02:00
Kenneth Graunke	22025595f3	iris: Sweep the NIR in iris_create_uncompiled_shader(). We run a ton of backend specific passes here (mostly brw_preprocess_nir) and ought to sweep up any unused memory at this point, since we're going to hang on to this NIR for as long as the linked program lives.	2019-06-07 01:29:38 -07:00
Eduardo Lima Mitev	c02ffd2700	ir3: Use the new NIR lowering pass for integer multiplication Shader-db stats courtesy of Eric Anholt: total instructions in shared programs: 6480215 -> 6475457 (-0.07%) instructions in affected programs: 662105 -> 657347 (-0.72%) helped: 1209 HURT: 13 total constlen in shared programs: 1432704 -> 1427769 (-0.34%) constlen in affected programs: 100063 -> 95128 (-4.93%) helped: 512 HURT: 0 total max_sun in shared programs: 875561 -> 873387 (-0.25%) max_sun in affected programs: 46179 -> 44005 (-4.71%) helped: 1087 HURT: 0 Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:45:05 +02:00
Eduardo Lima Mitev	340277ad71	ir3/nir: Add new NIR AlgebraicPass for lowering imul Currently, ir3 backend compiler is lowering integer multiplication from: dst = a * b to: dst = (al * bl) + (ah * bl << 16) + (al * bh << 16) by emitting this code: mull.u tmp0, a, b ; mul low, i.e. al * bl madsh.m16 tmp1, a, b, tmp0 ; mul-add shift high mix, i.e. ah * bl << 16 madsh.m16 dst, b, a, tmp1 ; i.e. al * bh << 16 which at that point has very low chances of being optimized. This patch adds a new nir_algebraic.AlgebraicPass to perform this lowering during NIR algebraic optimization passes, giving it a better chance for optimizing the resulting code. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:45:05 +02:00
Eduardo Lima Mitev	3addd7c8d9	nir_algebraic: Add basic optimizations for umul_low and imadsh_mix16 For umul_low (al * bl), zero is returned if the low 16-bits word of either source is zero. For imadsh_mix16 (ah * bl << 16 + c), c is returned if either 'ah' or 'bl' is zero. A couple of nir_search_helpers are added: is_upper_half_zero() returns true if the highest word of all components of an integer NIR alu src are zero. is_lower_half_zero() returns true if the lowest word of all components of an integer nir alu src are zero. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:45:05 +02:00
Eduardo Lima Mitev	e45de3a6c3	ir3/compiler: Handle new alu opcodes 'umul_low' and 'imadsh_mix16' They directly emit ir3_MULL_U and ir3_MADSH_M16 respectively. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:45:05 +02:00
Eduardo Lima Mitev	c27b3758fa	nir/opcodes: Add new 'umul_low' and 'imadsh_mix16' opcodes 'umul_low' is the low 32-bits of unsigned integer multiply. It maps directly to ir3's MULL_U. 'imadsh_mix16' is multiply add with shift and mix, an ir3 specific instruction that maps directly to ir3's IMADSH_M16. Both are necessary for the lowering of integer multiplication on Freedreno, which will be introduced later in this series. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:45:05 +02:00
Iago Toral Quiroga	9b96ae69bc	v3d: don't emit point coordinates varyings if the FS doesn't read them We still need to emit them in V3D 3.x since there is no mechanism to disable them. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:29:42 +02:00
Iago Toral Quiroga	5e26e55e72	v3d: add a helper to track variables that need point coordinates Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:26:52 +02:00
Kenneth Graunke	4e3297f7d4	egl/x11: calloc dri2_surf so it's properly zeroed Commit `2282ec0a` refactored drawable creation across various platforms into a new dri2_create_drawable helper function. The GBM code in platform_drm.c code passed in dri2_surf->gbm_surf as the loaderPrivate, while most other backends passed in dri2_surf directly. To try and handle this, the patch checked if dri2_surf->gbm_surf was non-NULL, and if so, presumed that the caller is the DRM platform and we should use the dri2_surf->gbm_surf pointer. This worked for most platforms, which calloc their dri2_surf structure, zeroing the data. Unfortunately, platform_x11.c used malloc, leaving most of the dri2_surf as garbage. In particular, dri2_surf->gbm_surf was often non-NULL, causing dri2_create_drawable to try and use it, passing a garbage pointer to the createNewDrawable hook, usually leading to a SIGBUS or SIGSEGV when trying to dereference that bad pointer. Since most callers calloc the data, make platform_x11.c follow suit. Fixes crashes with i915_dri.so when running dEQP-GLES2. Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-06 22:45:27 -07:00
Mark Janes	04dac69752	tests/graw: use C99 print conversion specifier for 32 bit builds Fixes formatting errors for 32 bit compilations, eg: error: format specifies type 'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned long long') [-Werror,-Wformat] printf("result1 = %lu result2 = %lu\n", res1.u64, res2.u64); Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-06 14:39:41 -07:00
Alyssa Rosenzweig	30adeb7a53	panfrost/midgard: Fix crash with unused SSA values Crash introduced in "b38dab101ca7e0896255dccbd85fd510c47d84d1" but not adding a Fixes tag since it's our bug anyway. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-06 13:44:27 -07:00
Boris Brezillon	3d661a4ef9	panfrost: Report sRGB colorspace as not supported The driver does not support sRGB yet, so let's report it as unsupported. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-06 13:41:54 -07:00
Erik Faye-Lund	c0dfe8c6df	docs: do not use div for line-breaking HTML has the <p>-tag for this purpose. It adds some margins, but that just makes this read better, IMO. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-06 17:51:45 +00:00
Erik Faye-Lund	f3235cfa70	docs: fixup code-tag positioning This reads better if we include the asterisk in the code-block, as it's part of the function-reference, even though it's not technically speaking code. But as the <code>-tag isn't purely for code, this should be fine. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-06 17:51:45 +00:00
Erik Faye-Lund	205f960e08	docs: add missing code-tags Looks like I missed a few cases when I recently added more code-tags here. So let's add these cases as well. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-06 17:51:45 +00:00
Erik Faye-Lund	54b7a1f175	docs: add accidentally dropped "at" When rewriting `20c56e18c2` after review, I accidentally dropped the "at" here. Sorry for that, and let's fix it up! Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `20c56e18c2` ("docs: use proper links instead of code-tags") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-06 17:51:45 +00:00
Gurchetan Singh	110f139f98	anv: allow NV12 <--> AHARDWAREBUFFER_FORMAT_Y8Cb8Cr8_420 inter-op AHARDWAREBUFFER_FORMAT_Y8Cb8Cr8_420 is an implementation defined flexible YUV format. Most of the time, it's NV12 or YV12. On Intel, NV12 is preferred since it can be used by the display engine. This API adds a dependency between gralloc and buffer consumers, unfortunately. Right now, the code seems to work for i915 gralloc, but not cros_gralloc. Add a preprocessor flag to fix this. TEST=android.graphics.cts.MediaVulkanGpuTest#testMediaImportAndRendering Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-06 09:20:03 -07:00
Connor Abbott	9d93d2a404	ac/nir: Remove stale TODO While we're here, copy the comment explaining this from radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-06 17:14:28 +02:00
Connor Abbott	1d55b0da59	radeonsi: Don't force dcc disable for loads When `e9d935ed0e` added force_dcc_off(), we forced it off for any preloaded image descriptor which had stores associated with them, since the same preloaded descriptors were used for loads and stores. However, when the preloading was removed in `16be87c904`, the existing logic was kept despite it not being necessary anymore. The comment above force_dcc_off() only mentions stores, so only force DCC off for stores. Cc: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-06 17:14:28 +02:00
Gert Wollny	10895c39c3	mesa/main: Expose EXT_clip_control and related enums and the function Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-06 12:25:17 +02:00
Gert Wollny	f1f6228a38	mapi/glapi/registry: Update gl.xml to latest upstream version The old copy didn't include EXT_clip_control, so update it. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-06 12:25:12 +02:00
Gert Wollny	8657257a6e	virgl: Enable CAP_CLIP_HALFZ if host supports it On hosts that support it, this makes the following piglits pass: arb_clip_control-* v2: sync flag with host Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-06 12:24:53 +02:00
Charmaine Lee	f29b8fde91	svga: Remove unnecessary check for the pre flush bit for setting vertex buffers This fixes the missing rebind when the can_pre_flush bit is not set and the vertex buffers are the same as what have been sent. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Neha Bhende <bhenden@vmware.com> Signed-off-by: Charmaine Lee <charmainel@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>	2019-06-06 10:27:10 +02:00
Deepak Rawat	72fc886826	winsys/svga/drm: Fix 32-bit RPCI send message Depending on whether compiled with frame-pointer or not, the temporary memory location used for the bp parameter in these macros is referenced relative to the stack pointer or the frame pointer. Hence we can never reference that parameter when we've modified either the stack pointer or the frame pointer, because then the compiler would generate an incorrect stack reference. Fix this by pushing the temporary memory parameter on a known location on the stack before modifying the stack- and frame pointers. Also, in case of failure the RPCI channel is not closed, which leads to vmx running out of channels. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Deepak Rawat <drawat@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>	2019-06-06 10:27:10 +02:00
Samuel Pitoiset	b9d3a6b656	radv: set the subpass before any initial subpass transitions This might fix initial subpass transitions when multiview is used. Noticed while implementing sample locations during layout transitions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-06 10:00:29 +02:00
Nataraj Deshpande	d6724471a5	anv: Fix check for isl_fmt in assert Checking isl_fmt returned value in assert seems appropriate instead of format variable. Fixes: `f1654fa7e3` "anv/android: support creating images from external format" Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-06-06 09:24:08 +03:00
Iago Toral Quiroga	09d230c6cf	v3d: fix scheduling dependency tracking for ALU with small immediates We were not accounting for small immediates in the B mux so the scheduler was interpreting these as regular register file accesses, which could lead to additional (incorrect) write-read dependencies. Shader-db changes: total instructions in shared programs: 9163664 -> 9137263 (-0.29%) instructions in affected programs: 3931035 -> 3904634 (-0.67%) helped: 12457 HURT: 2563 total max-temps in shared programs: 1325787 -> 1325597 (-0.01%) max-temps in affected programs: 5746 -> 5556 (-3.31%) helped: 186 HURT: 16 helped stats (abs) min: 1 max: 4 x̄: 1.12 x̃: 1 helped stats (rel) min: 1.45% max: 22.22% x̄: 4.42% x̃: 3.28% HURT stats (abs) min: 1 max: 3 x̄: 1.12 x̃: 1 HURT stats (rel) min: 2.86% max: 10.00% x̄: 5.76% x̃: 5.88% 95% mean confidence interval for max-temps value: -1.04 -0.84 95% mean confidence interval for max-temps %-change: -4.16% -3.07% Max-temps are helped. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-06 08:16:43 +02:00
Vasily Khoruzhick	b412e05751	lima/ppir: add missing handling of min/max ops for vec4 add slot Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-06-06 04:30:36 +00:00
Vasily Khoruzhick	5980565a37	lima/ppir: fix crash when program uses no registers at all Program may need no regalloc at all, e.g. in case when program consists of single discard op. Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-06-06 04:30:36 +00:00
Jason Ekstrand	b38dab101c	util/hash_table: Assert that keys are not reserved pointers If we insert a NULL key, it will appear to succeed but will mess up entry counting. Similar errors can occur if someone accidentally inserts the deleted key. The latter is highly unlikely but technically possible so we should guard against it too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-06 00:27:53 +00:00
Jason Ekstrand	8306dabc03	util/set: Assert that keys are not reserved pointers If we insert a NULL key, it will appear to succeed but will mess up entry counting. Similar errors can occur if someone accidentally inserts the deleted key. The latter is highly unlikely but technically possible so we should guard against it too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-06 00:27:53 +00:00
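The reserved-pointer check described in the two commits above can be illustrated with a toy open-addressing set. This is a minimal sketch, not Mesa's code: `toy_set`, `toy_set_add`, `key_is_reserved`, `DELETED_KEY`, and the hash function are all hypothetical names chosen for illustration. It assumes NULL marks an empty slot and a private sentinel address marks a removed slot, so neither may be stored as a real key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel marking a removed slot (not Mesa's definition). */
static const int deleted_key_storage = 0;
#define DELETED_KEY ((const void *)&deleted_key_storage)

#define TABLE_SIZE 8

struct toy_set {
   const void *slots[TABLE_SIZE]; /* NULL means "never used" */
   unsigned entries;              /* number of live keys */
};

/* A key is reserved if the table itself uses it to encode slot state. */
static int key_is_reserved(const void *key)
{
   return key == NULL || key == DELETED_KEY;
}

static void toy_set_add(struct toy_set *set, const void *key)
{
   /* Without this check, a NULL key would be written into a slot that
    * still looks empty, while the entry count would be bumped anyway. */
   assert(!key_is_reserved(key));

   size_t start = (size_t)(((uintptr_t)key >> 3) % TABLE_SIZE);
   int free_idx = -1;

   for (unsigned i = 0; i < TABLE_SIZE; i++) {
      size_t idx = (start + i) % TABLE_SIZE;
      const void *slot = set->slots[idx];

      if (slot == key)
         return; /* already present */
      if (key_is_reserved(slot) && free_idx < 0)
         free_idx = (int)idx; /* remember the first reusable slot */
      if (slot == NULL)
         break; /* end of the probe chain */
   }

   if (free_idx >= 0) {
      set->slots[free_idx] = key;
      set->entries++;
   }
}
```

Adding the same valid pointer twice leaves `entries` unchanged, while passing NULL or `DELETED_KEY` trips the assertion in a debug build instead of silently corrupting the count.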
Jason Ekstrand	7a18ce0b91	glsl/loop_analysis: Don't search for NULL variables in the hash table Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-06 00:27:53 +00:00
Jason Ekstrand	d96878a66a	nir/propagate_invariant: Don't add NULL vars to the hash table Fixes: `8410cf66d` "nir/propagate_invariant: Skip unknown vars" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-06 00:27:53 +00:00
Ian Romanick	1c30d26d89	intel/compiler: Treat b32csel as potentially producing a Boolean result for resolve analysis If the 2nd and 3rd source are both Boolean values, we can potentially avoid a resolve by only resolving the result of the b32csel. No changes on any Gen6+ Intel platform. v2: Use ?: instead of cast from bool to unsigned. Suggested by Caio. Iron Lake total instructions in shared programs: 8142729 -> 8142677 (<.01%) instructions in affected programs: 12890 -> 12838 (-0.40%) helped: 26 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.25% max: 0.74% x̄: 0.45% x̃: 0.38% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -0.52% -0.39% Instructions are helped. total cycles in shared programs: 188549632 -> 188549394 (<.01%) cycles in affected programs: 60754 -> 60516 (-0.39%) helped: 25 HURT: 1 helped stats (abs) min: 2 max: 26 x̄: 9.92 x̃: 8 helped stats (rel) min: 0.07% max: 2.23% x̄: 0.59% x̃: 0.27% HURT stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10 HURT stats (rel) min: 0.70% max: 0.70% x̄: 0.70% x̃: 0.70% 95% mean confidence interval for cycles value: -12.91 -5.40 95% mean confidence interval for cycles %-change: -0.84% -0.23% Cycles are helped. GM45 total instructions in shared programs: 5013119 -> 5013093 (<.01%) instructions in affected programs: 6764 -> 6738 (-0.38%) helped: 13 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.24% max: 0.68% x̄: 0.43% x̃: 0.36% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -0.52% -0.34% Instructions are helped. 
total cycles in shared programs: 128977804 -> 128977700 (<.01%) cycles in affected programs: 37738 -> 37634 (-0.28%) helped: 13 HURT: 0 helped stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8 helped stats (rel) min: 0.18% max: 0.46% x̄: 0.30% x̃: 0.26% 95% mean confidence interval for cycles value: -8.00 -8.00 95% mean confidence interval for cycles %-change: -0.36% -0.24% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-05 17:04:17 -07:00
Ian Romanick	0ba9497e66	intel/fs: Improve discard_if code generation Previously we would blindly emit a sequence like: mov(1) f0.1<1>UW g1.14<0,1,0>UW ... cmp.l.f0(16) g7<1>F g5<8,8,1>F 0x41700000F /* 15F */ (+f0.1) cmp.z.f0.1(16) null<1>D g7<8,8,1>D 0D The first move sets the flags based on the initial execution mask. Later discard sequences contain a predicated compare that can only remove more SIMD channels. Oftentimes the only user of the result from the first compare is the second compare. Instead, generate a sequence like mov(1) f0.1<1>UW g1.14<0,1,0>UW ... cmp.l.f0(16) g7<1>F g5<8,8,1>F 0x41700000F /* 15F */ (+f0.1) cmp.ge.f0.1(8) null<1>F g5<8,8,1>F 0x41700000F /* 15F */ If the results stored in g7 and f0.0 are not used, the comparison will be eliminated. This removes an instruction and potentially reduces register pressure. v2: Major re-write of the commit message (including fixing the assembly code). Suggested by Matt. All Gen8+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17224434 -> 17198659 (-0.15%) instructions in affected programs: 2908125 -> 2882350 (-0.89%) helped: 18891 HURT: 5 helped stats (abs) min: 1 max: 12 x̄: 1.38 x̃: 1 helped stats (rel) min: 0.03% max: 25.00% x̄: 1.76% x̃: 1.02% HURT stats (abs) min: 9 max: 105 x̄: 51.40 x̃: 35 HURT stats (rel) min: 0.43% max: 4.92% x̄: 2.34% x̃: 1.56% 95% mean confidence interval for instructions value: -1.39 -1.34 95% mean confidence interval for instructions %-change: -1.79% -1.73% Instructions are helped. 
total cycles in shared programs: 361468458 -> 361170679 (-0.08%) cycles in affected programs: 38470116 -> 38172337 (-0.77%) helped: 16202 HURT: 1456 helped stats (abs) min: 1 max: 4473 x̄: 26.24 x̃: 18 helped stats (rel) min: <.01% max: 28.44% x̄: 2.90% x̃: 2.18% HURT stats (abs) min: 1 max: 5982 x̄: 87.51 x̃: 28 HURT stats (rel) min: <.01% max: 51.29% x̄: 5.48% x̃: 1.64% 95% mean confidence interval for cycles value: -18.24 -15.49 95% mean confidence interval for cycles %-change: -2.26% -2.14% Cycles are helped. total spills in shared programs: 12147 -> 12176 (0.24%) spills in affected programs: 175 -> 204 (16.57%) helped: 8 HURT: 5 total fills in shared programs: 25262 -> 25292 (0.12%) fills in affected programs: 269 -> 299 (11.15%) helped: 8 HURT: 5 Haswell total instructions in shared programs: 13530316 -> 13502647 (-0.20%) instructions in affected programs: 2507824 -> 2480155 (-1.10%) helped: 18859 HURT: 10 helped stats (abs) min: 1 max: 12 x̄: 1.48 x̃: 1 helped stats (rel) min: 0.03% max: 27.78% x̄: 2.38% x̃: 1.41% HURT stats (abs) min: 5 max: 39 x̄: 25.70 x̃: 31 HURT stats (rel) min: 0.22% max: 1.66% x̄: 1.09% x̃: 1.31% 95% mean confidence interval for instructions value: -1.49 -1.44 95% mean confidence interval for instructions %-change: -2.42% -2.34% Instructions are helped. total cycles in shared programs: 377865412 -> 377639034 (-0.06%) cycles in affected programs: 40169572 -> 39943194 (-0.56%) helped: 15550 HURT: 1938 helped stats (abs) min: 1 max: 2482 x̄: 25.67 x̃: 18 helped stats (rel) min: <.01% max: 37.77% x̄: 3.00% x̃: 2.25% HURT stats (abs) min: 1 max: 4862 x̄: 89.17 x̃: 35 HURT stats (rel) min: <.01% max: 67.67% x̄: 6.16% x̃: 2.75% 95% mean confidence interval for cycles value: -14.42 -11.47 95% mean confidence interval for cycles %-change: -2.05% -1.91% Cycles are helped. 
total spills in shared programs: 26769 -> 26814 (0.17%) spills in affected programs: 826 -> 871 (5.45%) helped: 9 HURT: 10 total fills in shared programs: 38383 -> 38425 (0.11%) fills in affected programs: 834 -> 876 (5.04%) helped: 9 HURT: 10 LOST: 5 GAINED: 10 Ivy Bridge total instructions in shared programs: 12079250 -> 12044139 (-0.29%) instructions in affected programs: 2409680 -> 2374569 (-1.46%) helped: 16135 HURT: 0 helped stats (abs) min: 1 max: 23 x̄: 2.18 x̃: 2 helped stats (rel) min: 0.07% max: 37.50% x̄: 2.72% x̃: 1.68% 95% mean confidence interval for instructions value: -2.21 -2.14 95% mean confidence interval for instructions %-change: -2.76% -2.67% Instructions are helped. total cycles in shared programs: 180116747 -> 179900405 (-0.12%) cycles in affected programs: 25439823 -> 25223481 (-0.85%) helped: 13817 HURT: 1499 helped stats (abs) min: 1 max: 1886 x̄: 26.40 x̃: 18 helped stats (rel) min: <.01% max: 38.84% x̄: 2.57% x̃: 1.97% HURT stats (abs) min: 1 max: 3684 x̄: 98.99 x̃: 52 HURT stats (rel) min: <.01% max: 97.01% x̄: 6.37% x̃: 3.42% 95% mean confidence interval for cycles value: -15.68 -12.57 95% mean confidence interval for cycles %-change: -1.77% -1.63% Cycles are helped. LOST: 8 GAINED: 10 Sandy Bridge total instructions in shared programs: 10878990 -> 10863659 (-0.14%) instructions in affected programs: 1806702 -> 1791371 (-0.85%) helped: 13023 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 1.18 x̃: 1 helped stats (rel) min: 0.07% max: 13.79% x̄: 1.65% x̃: 1.10% 95% mean confidence interval for instructions value: -1.18 -1.17 95% mean confidence interval for instructions %-change: -1.68% -1.62% Instructions are helped. 
total cycles in shared programs: 154082878 -> 153862810 (-0.14%) cycles in affected programs: 20199374 -> 19979306 (-1.09%) helped: 12048 HURT: 510 helped stats (abs) min: 1 max: 323 x̄: 20.57 x̃: 18 helped stats (rel) min: 0.03% max: 17.78% x̄: 2.05% x̃: 1.52% HURT stats (abs) min: 1 max: 448 x̄: 54.39 x̃: 16 HURT stats (rel) min: 0.02% max: 37.98% x̄: 4.13% x̃: 1.17% 95% mean confidence interval for cycles value: -17.97 -17.08 95% mean confidence interval for cycles %-change: -1.84% -1.75% Cycles are helped. LOST: 1 GAINED: 0 Iron Lake total instructions in shared programs: 8155075 -> 8142729 (-0.15%) instructions in affected programs: 949495 -> 937149 (-1.30%) helped: 5810 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.12 x̃: 2 helped stats (rel) min: 0.10% max: 16.67% x̄: 2.53% x̃: 1.85% 95% mean confidence interval for instructions value: -2.14 -2.11 95% mean confidence interval for instructions %-change: -2.59% -2.48% Instructions are helped. total cycles in shared programs: 188584610 -> 188549632 (-0.02%) cycles in affected programs: 17274446 -> 17239468 (-0.20%) helped: 3881 HURT: 90 helped stats (abs) min: 2 max: 168 x̄: 9.08 x̃: 6 helped stats (rel) min: <.01% max: 23.53% x̄: 0.83% x̃: 0.30% HURT stats (abs) min: 2 max: 10 x̄: 2.80 x̃: 2 HURT stats (rel) min: <.01% max: 0.60% x̄: 0.10% x̃: 0.07% 95% mean confidence interval for cycles value: -9.35 -8.27 95% mean confidence interval for cycles %-change: -0.85% -0.77% Cycles are helped. GM45 total instructions in shared programs: 5019308 -> 5013119 (-0.12%) instructions in affected programs: 489028 -> 482839 (-1.27%) helped: 2912 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.13 x̃: 2 helped stats (rel) min: 0.10% max: 16.67% x̄: 2.46% x̃: 1.81% 95% mean confidence interval for instructions value: -2.14 -2.11 95% mean confidence interval for instructions %-change: -2.54% -2.39% Instructions are helped. 
total cycles in shared programs: 129002592 -> 128977804 (-0.02%) cycles in affected programs: 12669152 -> 12644364 (-0.20%) helped: 2759 HURT: 37 helped stats (abs) min: 2 max: 168 x̄: 9.03 x̃: 4 helped stats (rel) min: <.01% max: 21.43% x̄: 0.75% x̃: 0.31% HURT stats (abs) min: 2 max: 10 x̄: 3.62 x̃: 4 HURT stats (rel) min: <.01% max: 0.41% x̄: 0.10% x̃: 0.04% 95% mean confidence interval for cycles value: -9.53 -8.20 95% mean confidence interval for cycles %-change: -0.79% -0.70% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-05 17:04:13 -07:00
Ian Romanick	a288708506	intel/fs: Add need_dest parameter to fs_visitor::nir_emit_alu This is the same as the need_dest parameter to prepare_alu_destination_and_sources. This allows us to not change the register that is expected to hold a result if an instruction is re-emitted. This is particularly a problem if the re-emitted instruction is a partial write. A later patch will use this feature. No shader-db changes on any Intel platform. v2: Don't do the Boolean resolve when there is no destination. If the ALU instruction didn't write a register, there's nothing to resolve. This replaces an earlier patch "intel/fs: Allocate dummy destination register when need_dest is false". Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-05 17:04:08 -07:00
Ian Romanick	e13a5c7d67	intel/fs: Allow cmod propagation across reads and writes of different flags This also helps a later patch (intel/fs: Improve discard_if code generation) on about 200 shaders. v2: Document that other instruction sequences are also valid in subtract_merge_with_compare_intervening_mismatch_flag_write. Suggested by Caio. All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17224438 -> 17224434 (<.01%) instructions in affected programs: 296 -> 292 (-1.35%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.99% max: 1.92% x̄: 1.43% x̃: 1.40% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -2.04% -0.81% Instructions are helped. total cycles in shared programs: 361468455 -> 361468458 (<.01%) cycles in affected programs: 2862 -> 2865 (0.10%) helped: 2 HURT: 2 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.24% max: 0.39% x̄: 0.31% x̃: 0.31% HURT stats (abs) min: 3 max: 4 x̄: 3.50 x̃: 3 HURT stats (rel) min: 0.32% max: 0.70% x̄: 0.51% x̃: 0.51% 95% mean confidence interval for cycles value: -4.34 5.84 95% mean confidence interval for cycles %-change: -0.70% 0.90% Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-05 17:03:45 -07:00
Ian Romanick	8030cb75c1	intel/fs: Fix flag_subreg handling in cmod propagation There were two errors. First, the pass could propagate conditional modifiers from an instruction that writes one flag register to an instruction that writes a different flag register. For example, cmp.nz.f0.0(16) null:F, vgrf6:F, vgrf5:F cmp.nz.f0.1(16) null:F, vgrf6:F, vgrf5:F could become cmp.nz.f0.0(16) null:F, vgrf6:F, vgrf5:F Second, if an instruction that writes f0.1 has its condition propagated, the modified instruction will incorrectly write flag f0.0. For example, linterp(16) vgrf6:F, g2:F, attr0:F cmp.z.f0.1(16) null:F, vgrf6:F, vgrf5:F (-f0.1) discard_jump(16) (null):UD could become linterp.z.f0.0(16) vgrf6:F, g2:F, attr0:F (-f0.1) discard_jump(16) (null):UD None of these cases will occur currently. The only time we use f0.1 is for generating discard intrinsics. In all those cases, we generate a sequence like: cmp.nz.f0.0(16) vgrf7:F, vgrf6:F, vgrf5:F (+f0.1) cmp.z(16) null:D, vgrf7:D, 0d (-f0.1) discard_jump(16) (null):UD Due to the mixed types and incompatible conditions, this sequence would never see any cmod propagation. The next patch will change this. No shader-db changes on any Intel platform. v2: Fix typo in comment in test case subtract_delete_compare_other_flag. Noticed by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-05 17:03:40 -07:00
Ian Romanick	2dd6013933	intel/fs: Add missing tests for cmod_propagate_not Tests like this should have been added in `4467040cb6` ("i965/fs: Propagate conditional modifiers from not instructions"). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-05 17:03:31 -07:00
Kenneth Graunke	6a7d387394	i965: Allow signed/unsigned integer conversions in miptree up/download BLORP now handles this so there's no reason to fall back. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-05 16:58:07 -07:00
Kenneth Graunke	f06c86358c	intel/blorp: Handle SINT/UINT clamping on blits. This patch makes blorp_blit handle SINT<->UINT blit value clamping. After reading the source's integer data (which is expanded to 32-bit), we either IMAX with 0 (for SINT -> UINT, to clamp negative numbers) or UMIN with (1 << 31) - 1 (for UINT -> SINT, to clamp positive numbers outside of the representable range). Such blits are not allowed by the OpenGL or Vulkan APIs directly: The Vulkan 1.1 spec for vkCmdBlitImage says: "Integer formats can only be converted to other integer formats with the same signedness." The GL 4.5 spec for glBlitFramebuffer says: "An INVALID_OPERATION error is generated if format conversions are not supported, which occurs under any of the following conditions: [...] * The read buffer contains unsigned integer values and any draw buffer does not contain unsigned integer values. * The read buffer contains signed integer values and any draw buffer does not contain signed integer values." However, they are useful for other operations, such as texture upload and download, which typically are implemented via blorp_blit(). i965 has code to fall back in this case (which the next commit will delete), and Gallium expects blit() to handle this case for texture upload. Fixes the following tests on iris: - GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels - GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo - GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels_pixelstore Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-05 16:58:07 -07:00
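The two clamps named in the blorp commit above (IMAX with 0 for SINT -> UINT, UMIN with (1 << 31) - 1 for UINT -> SINT) can be sketched in plain C. This is only an illustration of the arithmetic on values already expanded to 32 bits, not the shader code blorp emits; the function names `clamp_sint_to_uint` and `clamp_uint_to_sint` are hypothetical.

```c
#include <stdint.h>

/* SINT -> UINT: take the signed maximum of the value and 0, so negative
 * inputs clamp to zero instead of wrapping to large unsigned values. */
static uint32_t clamp_sint_to_uint(int32_t v)
{
   return (uint32_t)(v > 0 ? v : 0);
}

/* UINT -> SINT: take the unsigned minimum of the value and (1 << 31) - 1,
 * so inputs above the largest signed value clamp instead of going negative. */
static int32_t clamp_uint_to_sint(uint32_t v)
{
   const uint32_t max_sint = (1u << 31) - 1;
   return (int32_t)(v < max_sint ? v : max_sint);
}
```

For instance, `clamp_sint_to_uint(-5)` yields 0 and `clamp_uint_to_sint(0xFFFFFFFFu)` yields `INT32_MAX`, while in-range values pass through unchanged.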
Caio Marcelo de Oliveira Filho	1aea4cd0d9	anv/pipeline: Move lowering of nir_var_mem_global later This lets deref optimizations apply to globals before lowering them. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-05 16:57:09 -07:00
Kenneth Graunke	4f3c82c72c	st/nir: Don't use GLSL IR's MOD_TO_FLOOR lowering when using NIR. Both GLSL IR and NIR perform the same mod -> floor lowering for 32-bit types. But nir_lower_double_ops is slightly more defensive against lowered drcp precision loss, and handles mod(x, x) = 0 directly. This works well...assuming nir_lower_double_ops actually gets an fmod op to lower in the first place. The previous patches enabled NIR-based lowering for the remaining drivers, so we can stop using the GLSL IR lowering when using NIR. Fixes KHR-GL45.gpu_shader_fp64.builtin.mod_dvec[234] on iris. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 16:45:12 -07:00
Kenneth Graunke	f4d4c42608	radeonsi: Enable NIR's lower_fmod option. Currently, st/mesa is always calling the GLSL IR lower_instructions() pass with MOD_TO_FLOOR set, so mod operations will be lowered before ever reaching NIR. This enables the same lowering at the NIR level, which will let me shut off the GLSL IR path for NIR-based drivers. The AMD NIR backend also has code to handle fmod, so we could potentially skip this and still be fine. I don't have an opinion on that. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 16:45:12 -07:00
Kenneth Graunke	e0641e0728	vc4: Enable NIR's lower_fmod option. Currently, st/mesa is always calling the GLSL IR lower_instructions() pass with MOD_TO_FLOOR set, so mod operations will be lowered before ever reaching NIR. This enables the same lowering at the NIR level, which will let me shut off the GLSL IR path for NIR-based drivers. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-06-05 16:45:12 -07:00
Kenneth Graunke	b0e3bd79dc	v3d: Enable NIR's lower_fmod option. Currently, st/mesa is always calling the GLSL IR lower_instructions() pass with MOD_TO_FLOOR set, so mod operations will be lowered before ever reaching NIR. This enables the same lowering at the NIR level, which will let me shut off the GLSL IR path for NIR-based drivers. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-06-05 16:45:12 -07:00
Kenneth Graunke	c7d1b52a2c	nir: Combine lower_fmod16/32 back into a single lower_fmod. We originally had a single lower_fmod option. In commit `2ab2d2e5`, Sam split 32 and 64-bit lowering into separate flags, with the rationale that some drivers might want different options there. This left 16-bit unhandled, so Iago added a lower_fmod16 option in commit `ca31df6f`. Now that lower_fmod64 is gone (in favor of nir_lower_doubles and nir_lower_dmod), we re-combine lower_fmod16 and lower_fmod32 into a single lower_fmod flag again. I'm not aware of any hardware which needs lowering for one bitsize and not the other. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 16:45:12 -07:00
Kenneth Graunke	edd45af9ba	nir: Drop lower_fmod64 option. nir_lower_doubles offers a wide variety of fp64 lowering, including lowering fmod@64. The version there also better handles imprecisions due to lowered frcp@64. Let's consolidate on one version. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 16:45:12 -07:00
Kenneth Graunke	dfb18f0a28	panfrost: Switch to nir_lower_doubles instead of lower_fmod64. I don't think panfrost actually does doubles yet, but it at least claims to support PIPE_CAP_DOUBLES, so at least pretend to switch to the new lowering. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-05 16:45:12 -07:00
Kenneth Graunke	d13059f4d5	nouveau: Use nir_lower_doubles instead of lower_fmod64 on nvc0. We currently have two duplicate mechanisms for lowering fmod@64. One is a nir_opt_algebraic rule keyed off of options->lower_fmod64, and the other is nir_lower_doubles, which offers a full gamut of fp64 lowering. The latter works slightly better in some corner cases, so I'm trying to eliminate lower_fmod64 and drop the redundancy. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 16:45:12 -07:00
Kenneth Graunke	fa56a3795f	gallium: Drop lower_fmod64 from drivers that don't support doubles. Neither freedreno nor nv50 expose PIPE_CAP_DOUBLES, so there's no fmod64 to be lowered. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 16:45:12 -07:00
Dylan Baker	e8a60e5d50	docs: update calendar, and news item and link release notes for 19.0.6	2019-06-05 16:42:36 -07:00
Dylan Baker	fccd44940d	docs: Add SHA256 sums for 19.0.6	2019-06-05 16:40:57 -07:00
Dylan Baker	fe79d75ccf	docs: Add relnotes for 19.0.6	2019-06-05 16:40:55 -07:00
Erik Faye-Lund	ff41ac7292	docs: add day of month to all news-entries This makes it easier to batch-convert them to other structured markup-formats. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:46 +02:00
Erik Faye-Lund	382776d241	docs: add MD5 checksums for 9.2.2 files These checksums were obtained by downloading the releases from ftp://ftp.freedesktop.org/pub/mesa/older-versions/9.x/9.2.2/ and running md5sum on them. Hopefully the server wasn't compromised since release. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:46 +02:00
Erik Faye-Lund	1941e642bc	docs: use pre-block for showing commit-note Having a single-item list for this seems odd. Let's just use a pre-block instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:46 +02:00
Erik Faye-Lund	b16e593f79	docs: switch to definition list and code-tags A definition list is a better semantic match for what this list is supposed to convey, so let's use that instead. And while we're at it, let's add some code-tags around filenames, as they stand out a bit more that way. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:46 +02:00
Erik Faye-Lund	3b0d48e219	docs: combine headings This is more in line with how we mark-up other definition lists, and avoids portability issues with other markup-formats. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:46 +02:00
Erik Faye-Lund	942c4daac9	docs: more code-tags in llvmpipe.html This makes the article a bit easier to read. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:46 +02:00
Erik Faye-Lund	52667f990e	docs: use more code-tags in envvars.html This wraps code, identifiers, values and paths in code-tags, which makes them appear in a monospace-font for readability. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:46 +02:00
Erik Faye-Lund	d311d8f424	docs: use code-tags for envvars and options This makes it a bit easier to tell what's what. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:45 +02:00
Erik Faye-Lund	5639f0d5ee	docs: use dl instead of ul A HTML definition-list is more semantically strong than just some unordered list, and renders a bit cleaner by default. So let's use that instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:45 +02:00
Erik Faye-Lund	0bca0f1aa2	docs: remove pointlessly repeated list The examples listed above are exactly the same as the ones we're about to list, so let's just keep the list that defines what they do. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:45 +02:00
Erik Faye-Lund	aed4ac6da8	docs: remove stray whitespace There's some stray whitespace in these files that doesn't do anything useful. Let's get rid of it. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:45 +02:00
Erik Faye-Lund	20c56e18c2	docs: use proper links instead of code-tags These links are a bit odd in that the URLs are simply placed in code-tags. This makes them harder to work with. Let's use proper links instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:45 +02:00
Erik Faye-Lund	c59c793ae5	docs: update doxygen-links One of these URLs is dead these days, and the other one forwards to the current one, doxygen.nl. Let's get these links up to date. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:45 +02:00
Erik Faye-Lund	7c4a4fb09a	docs: remove some noisy spacing in pre-blocks These newlines caused the blocks to have trailing newlines in them, which renders a bit noisily. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:45 +02:00
Erik Faye-Lund	412046f74e	docs: improve quoting slightly Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:45 +02:00
Erik Faye-Lund	8620f53212	docs: do not use br-tag for non-significant breaks According to the W3C, we shouldn't use the br-tag unless the line-break is part of the content: https://www.w3.org/TR/2011/WD-html5-author-20110809/the-br-element.html All of these instances are for non-content usage, and are as such technically out-of-spec. So let's either remove them, or split paragraphs, based on how related the content is. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:45 +02:00
Erik Faye-Lund	d5e273aad2	docs: remove pointless line-break Line-breaks at the end of a paragraph don't do anything useful, so let's just get rid of it. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:45 +02:00
Erik Faye-Lund	db8211a883	docs: remove pointless trailing hard-breaks Line-break at the end of an article is quite pointless, and doesn't do much to increase the readability. Let's get rid of them. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:45 +02:00
Erik Faye-Lund	74a6a68196	docs: rewrite paragraph to be free-form These half-way structured sections are needlessly problematic to translate cleanly to other markup-languages, so let's just make this into a free-form paragraph instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:45 +02:00
Erik Faye-Lund	9e5bc2c868	docs: use h4 instead of free-standing paragraphs and br-tags This makes this document a bit more structured, which is generally considered a good thing for HTML. It will also translate a bit better into other markup-formats. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:45 +02:00
Erik Faye-Lund	38652a29ae	docs: slightly reword paragraph and tweak markup This makes this paragraph a bit easier to digest. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:44 +02:00
Erik Faye-Lund	b2ac7582d9	docs: remove stray space in code-block Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:44 +02:00
Erik Faye-Lund	0114d15ed6	docs: remove some pointless spacing The different headers and header-sizes already convey the hierarchical structure of this document, the unusual spacing arguably just looks a bit inconsistent with the rest of the site. Let's remove it; it looks fine without it, and will translate better to other markup languages. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:44 +02:00
Erik Faye-Lund	392c083377	docs: add more code-tags It's easier to read function-names, file-names and other "machine"-related strings if they are formatted in a monospace font. So let's mark these up with code-tags. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:44 +02:00
Erik Faye-Lund	0ee366960c	docs: use code instead of tt-tag The tt-tag has been removed from HTML5, so let's normalize this to code-tags instead. This just makes things a bit more consistent, as we've mixed these left and right so far anyway. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:44 +02:00
Erik Faye-Lund	d60dc5d16f	docs: use paragraph instead of double newlines This is a bit more semantically clean in HTML, and makes us keep content and presentation a bit more separated. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:44 +02:00
Erik Faye-Lund	9a65de343e	docs: use verbatim .plan quote This quote is now verbatim, as archived here: https://github.com/ESWAT/john-carmack-plan-archive/blob/master/by_year/johnc_plan_1999.txt This makes it look a bit more consistent with the following news-entry, and makes things IMO a bit more clear. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-06-05 23:48:44 +02:00
Alyssa Rosenzweig	905d914cb6	panfrost/midgard: Verify SSA claims when pipelining The pipeline register creation algorithm is only valid for SSA indices; NIR registers and such cannot be pipelined without more complex analysis. However, there is the occasional class of "liars" -- indices that claim to be SSA but are not. This occurs in the blend shader prologue, for example. Detect this and just bail quietly for now. Eventually we need to rewrite the blend shader prologue to occur in NIR anyway (which would mitigate the issue), but that's more involved and depends on a better understanding of pixel formats in blend shaders (for non-RGBA8888/UNORM cases). Fixes some blend shader regressions. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-05 14:40:08 -07:00
Alyssa Rosenzweig	dcd12aad46	panfrost/midgard: Don't assign var locations ourselves This piece of code was cargo-culted from the ir3 standalone compiler and made sense when we were a standalone compiler ourselves. Unfortunately, for the online compiler, mesa/st already handles this for us and if we duplicate it here, we're duplicating it incorrectly. So just delete these lines and fix a heck of a lot of tests. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-05 14:40:08 -07:00
Tomeu Vizoso	de5c882973	panfrost: Reload framebuffer contents if there's no clear If by flush time the client hasn't submitted a clear, add jobs for reloading the framebuffer contents as the first draw in the frame. This is required by programs such as Weston which don't do clears and rely on the previous contents of the framebuffer being there. Reloading the whole framebuffer on every frame without regards to what is needed or what is going to be covered is very inefficient, but future work will introduce support for damage regions and partial updates so we know what needs to be actually reloaded. Fixes quite a few tests in dEQP-EGL.functional.buffer_age.*. [Alyssa: The context is that tilers do an implicit glClear() on every frame, whether you asked them to or not. If you want a clear, this is very efficient. But if you don't, you have to explicitly blit the backbuffer back into tile memory, accomplished by a dummy texturing draw. This patch generates that draw via u_blitter, although we could do a bit better ourselves by eliding the vertex job. This fixes "black rectangles in Weston/sway" as well as "video not displaying when UI visible in mpv"] Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-05 14:35:48 -07:00
Alyssa Rosenzweig	2adf35e4f5	panfrost: Don't flip scanout The mesa/st flips the viewport, so we respect that rather than trying to flip the framebuffer itself and ignoring the viewport and using a messy heuristic. However, this brings an underlying disagreement about the interpretation of winding order to light. The blob uses a different strategy than Mesa for handling viewport Y flipping, so the meanings of the winding order bit are flipped for it. To keep things clean on our end, we rename to explicitly use Gallium (rather than flipped OpenGL) conventions. Fixes upside-down Xwayland/egl windows. v2: Adjust lowering configuration to correctly flip gl_PointCoord.y and gl_FragCoord.y. v1 was R-b'd by Tomeu, but then retracted due to these regressions which are now fixed. Suggested-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Sort-of-reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-06-05 14:35:48 -07:00
Timur Kristóf	c94b70a178	st/nine: Use tgsi_to_nir when preferred IR is NIR. This patch allows nine to read the preferred IR from pipe caps and use NIR when that is preferred by the driver, by calling tgsi_to_nir. Also adds some debug options that allow overriding it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Axel Davy <davyaxel0@gmail.com>	2019-06-05 23:32:13 +02:00
Lionel Landwerlin	c162127440	intel/perf: improve dynamic loading config detection We're currently trying to detect dynamic loading config support by trying to remove the test config (hard coded in the i915 driver) and checking we get ENOENT. This can fail if the test config was updated in Mesa but not yet in i915. A better way to do this is to pick an invalid ID and check for ENOENT. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-05 20:16:23 +00:00
Jason Ekstrand	811c05dfe6	intel/nir: Take nir_shaders in brw_nir_link_shaders Since NIR_PASS no longer swaps out the NIR pointer when NIR_TEST_ is enabled, we can just take a single pointer and not a pointer to pointer. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-05 20:07:28 +00:00
Jason Ekstrand	bb67a99a2d	intel/nir: Stop returning the shader from helpers Now that NIR_TEST_* doesn't swap the shader out from under us, it's sufficient to just modify the shader rather than having to return in case we're testing serialization or cloning. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-05 20:07:28 +00:00
Jason Ekstrand	fe2fc30cb5	nir: Don't replace the nir_shader when NIR_TEST_SERIALIZE=1 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108957 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-06-05 20:07:28 +00:00
Jason Ekstrand	9eba6d9a88	nir: Don't replace the nir_shader when NIR_TEST_CLONE=1 Instead, we add a new helper which stomps one nir_shader and replaces it with another. The new helper effectively just changes which pointer gets used for the base nir_shader. It should be 99% as good at testing cloning but without requiring that everything handle having the shader swapped out from under it constantly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108957 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-06-05 20:07:28 +00:00
Caio Marcelo de Oliveira Filho	747926ddfb	iris: Only recompile CS when needed Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-06-05 12:57:54 -07:00
Lionel Landwerlin	0430c6d18a	intel/perf: fix EuThreadsCount value in performance equations EuThreadsCount is supposed to be the number of threads per EU, not the total number of threads in the whole device. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `1fc7b95127` ("i965: Add Gen8+ INTEL_performance_query support") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-05 22:41:01 +03:00
Mark Janes	36d8a922de	intel/tools: use C99 print conversion specifier for 32 bit builds Fixes formatting errors for 32 bit compilations, eg: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Werror=format=] Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-06-05 19:25:15 +00:00
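The 32-bit formatting fix above can be sketched as follows. This is a minimal illustration, not code from intel/tools: `format_offset` is a hypothetical helper. It prints a `uint64_t` with the C99 `PRIx64` macro from `<inttypes.h>` instead of `%lx`, which only matches when `long` is 64 bits wide.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not taken from intel/tools: formats a 64-bit
 * value as hex. PRIx64 expands to the correct length modifier for
 * uint64_t on both 32-bit and 64-bit builds. */
int
format_offset(char *buf, size_t len, uint64_t offset)
{
   return snprintf(buf, len, "0x%" PRIx64, offset);
}
```

Because the macro is concatenated into the format string at compile time, the same source builds without format warnings on either word size.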
Samuel Pitoiset	8a31eaa4e2	radv: use only one descriptor in the fmask expand pass This removes one useless SMEM load operation which pointed to the same descriptor anyway. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-05 20:50:58 +02:00
Samuel Pitoiset	7664eb8f2b	radv: set ACCESS_NON_READABLE on the fmask expand pass output image The driver will emit GLC=1. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-05 20:50:56 +02:00
Samuel Pitoiset	8206390546	radv: remove one useless image type in the fmask expand shader Both input and output images use the same type. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-05 20:50:53 +02:00
Kristian H. Kristensen	1e6c873f1f	freedreno/ir3: Extend debug helpers to support TCS/TES/GS Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-05 11:15:04 -07:00
Kristian H. Kristensen	3da9a24f35	freedreno/a6xx: Use VALIDREG in next_regid() helper Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-05 11:15:04 -07:00
Kristian H. Kristensen	6fffc091e2	freedreno/a6xx: Remove dead code from a5xx Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-05 11:15:04 -07:00
Kristian H. Kristensen	cea39af2fb	freedreno/ir3: Generalize ir3_shader_disasm() Use a helper function to get the sysval/attribute/varying/output name and make the disasm debug output independent of shader stage. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-05 11:15:04 -07:00
Alyssa Rosenzweig	1ea987576d	panfrost/midgard: Always break up fragment writeout In a fragment shader, r0 is written out with a special branch sequence. r0 is not a real register here, but essentially a pipeline register -- as such, it needs to be written out in full and on time, with hanging dependencies in the bundle. Otherwise, we break up the bundle, which costs an extra ALU cycle and adds a move. When the scheduler ran last thing, we could do this analysis within the scheduler. Now that RA can run after scheduling, that's no longer valid, so we remove the analysis and always break it up (at a performance penalty). Future work can add a post-RA/post-schedule pass to merge writeout blocks if possible. It's a bit of a low-priority next to fixing conformance regressions, of course. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-05 18:06:49 +00:00
Alyssa Rosenzweig	3d11b075f0	panfrost/midgard: Fix cubemap regression Fixes: `2d9802233` ("panfrost/midgard: Extend RA to non-vec4 sources") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-05 18:06:48 +00:00
Deepak Rawat	828e1b0b4c	winsys/drm: Fix out of scope variable usage In this particular instance, a struct member was used outside of the block where it was defined. Fix this by moving the definition outside of the block. Signed-off-by: Deepak Rawat <drawat@vmware.com> Fixes: `569f838987` ("winsys/svga: Add support for new surface ioctl, multisample pattern") Reviewed-by: Brian Paul <brianp@vmware.com>	2019-06-02 22:31:07 -07:00
Alyssa Rosenzweig	c51312bc94	panfrost/midgard: Lower integer division We use the shared nir_lower_idiv pass to lower integer division, fixing 144 dEQP tests. This pass was not applied in the past due to breakage from iabs fixed earlier in the series. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-By: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-05 17:59:27 +00:00
Alyssa Rosenzweig	88c59798fe	panfrost/midgard: Fix 1-arg ALU memory corruption Certain ops that only take one argument have an imaginary "zero" constant for their second argument. For instance, conversions: i2f [dest], [source], #0 Memory corruption meant that #0 was instead random noise. For some ops, that doesn't matter (manifested as abnormally large code size and poor scheduling due to extra constants in random places). But for others, where a 1-op is emulated by a 2-op with an implicit 0 second argument, that broke things. Fixes iabs (emulated by iabsdiff). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-By: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-05 17:59:24 +00:00
Alyssa Rosenzweig	9f14e20fa1	panfrost/midgard: Add a bunch of new ALU ops These ops are used to accelerate various functions exposed in OpenCL. This commit only includes the routine additions to the table. They are not wired through the compiler; rather, they are just here to keep a reference for the disassembler. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-By: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-05 17:58:14 +00:00
Emil Velikov	d6edccee8d	egl: add EGL_platform_device support This new 'platform' is added by default with no guards. It is effectively a copy of the surfaceless one, with updated function names and brand new probe function. Due to the reuse, some of the ifdef HAVE_SURFACELESS_PLATFORM guards have been dropped. A worthy mention are the changes in _eglFindDisplay, since the original and dup'd fd are required, we make use of the plat_opt argument. Note that no hacks for eglGetDisplay are added - the API works only with the eglGetPlatformDisplay* API. v2: - s/_eglCompareDeviceDisplay/_eglSameDeviceDisplay/ (Eric) - let ^^ return bool (Eric) - fixup meson build, move files() further up (Eric) - copy from plat. surfaceless w/o the visual cleanups - close and free when destroying the dpy - sprinkle a few _eglDeviceSupports - split fd handling into separate function - use directly the render node if no FD is given (Mathias) v3: - s/dpy/disp/g - drop swap_buffers* callbacks - drop loader_set_logger() - drop local define - re-introduce _eglGetDRMDeviceRenderNode() - EGL_WARN on ForceSoftware with HW device - continue using the HW device - bail out for "EGL_MESA_device_software" until it's fixed - wire-up the Android build v4: - use new style _eglFindDisplay() - split hw vs sw code paths - don't close the internal fd (already handled in FiniDisplay()) - make swrast work (bit hacky but will do for now) - Android for real, drop autotools - Correct HW + LIBGL_ALWAYS_SOFTWARE check - use the dri2_create_drawable() helper v5: - enhance comment around fd checks (Mathias) - rebase for dri2_init_surface() changes Cc: Mathias Fröhlich <Mathias.Froehlich@gmx.net> Acked-by: Marek Olšák <marek.olsak@amd.com> (v4) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 13:35:21 -04:00
Emil Velikov	2f11957532	egl: keep the software device at the end of the list By default, the user is likely to pick the first device so it should not be the least performant (aka software) one. v2: Drop odd comment (Marek) Suggested-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 13:35:21 -04:00
Emil Velikov	2282ec0ad6	egl/dri: flesh out and use dri2_create_drawable() Wrap the loader->createNewDrawable() dance into a helper and use it throughout the codebase. This addresses cases like surfaceless (SL) on swrast (SL on kms_swrast is fine) where we'd attempt using the wrong driver and crash out. v2: fixup quirky GBM (Mathias) v3: fixup GBM for real (Marek) Cc: mesa-stable@lists.freedesktop.org Cc: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (v2) Signed-off-by: Marek Olšák <marek.olsak@amd.com> (v2) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-05 13:35:21 -04:00
Emil Velikov	5e0f527d60	egl: fold X11 attrib handling like other platforms Since we no longer need special handling for X11, refactor the code to follow the style used by all other platforms. Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 13:35:21 -04:00
Adam Jackson	2b29cf2468	egl: remove Options::Platform handling The full set of attributes is already handled with previous patches. Thus all this is now dead code. v2 (Emil) - split from a larger patch. Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 13:35:21 -04:00
Adam Jackson	4aebd86f9a	egl/x11: pick the user requested screen At the moment the user will pass the screen number via attribs, yet we would throw that away. Reason being that the int *screen passed to xcb_connect() is output only. v2 (Emil): - split from a larger patch - use xcb_connect() returned screen, as fallback - use helper function only as needed Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 13:35:21 -04:00
Adam Jackson	8e991ce539	egl: handle the full attrib list in display::options Earlier spec is vague, although EGL 1.5 makes it clear: Multiple calls made to eglGetPlatformDisplay with the same parameters will return the same EGLDisplay handle. With this commit we store and compare the full attrib list. v2 (Emil): - Split into separate patches - Use EGLBoolean over int masked as such - Don't return free'd pointer on calloc failure Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 13:35:21 -04:00
Emil Velikov	72b9aa973b	egl: flesh out a _eglNumAttribs() helper Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 13:35:21 -04:00
Krzysztof Raszkowski	4ff02b3edd	swr: fix support for GL_ARB_copy_image extension This commit fixes support and adjusts the capabilities returned by the SWR driver and the documentation to correctly report the GL_ARB_copy_image extension. Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-06-05 15:26:47 +00:00
Guido Günther	755fdd6f9d	etnaviv: etnaviv_bo_cache_test: Use /dev/dri/renderD128 by default Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	90cc0de102	build: Build etnaviv drm tests Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	01a8ba79fe	etnaviv: drm tests: Use mesa header locations Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	2b377547c3	etnaviv: Add libdrm tests as of 922d92994267743266024ecceb734ce0ebbca808 Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	b921df352d	build: Build etnaviv drm Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	3696235f82	etnaviv: gallium: Use internal etnaviv_drmif.h Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	95d8b4ac0b	etnaviv: drm: s/bo_del/_etna_bo_del/ This avoids a conflict with freedreno's bo_del(). Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	17d7282cca	etnaviv: drm: s/table_lock/etna_table_lock/ This avoids a conflict with freedreno's table_lock Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	7a5b19346a	etnaviv: drm: Move uapi header Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	3925d38870	etnaviv: drm: Drop excessive debugging in perfmon Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	22fa1c95ff	etnaviv: drm: Don't use drmMsg() Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	c93007a618	etnaviv: drm: Use _mesa_hash_table instead of drmHash Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	92fc14321f	etnaviv: drm: Use mesa's ARRAY_SIZE Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	e87d128b52	etnaviv: drm: Use mesa's os_m{un,}map Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	2ebd444c10	etnaviv: drm: Use mesa's atomic definitions Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	6ab83b8474	etnaviv: drm: Drop drm_{public,private} Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	66eb554d46	etnaviv: drm: Drop inexistent headers Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	58eec3808e	etnaviv: Add libdrm code as of 922d92994267743266024ecceb734ce0ebbca808 Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Guido Günther	3835e21369	etnaviv: untabify Two driver files had tabs mixed with spaces. Remove the tabs. Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-05 08:58:05 +00:00
Tomeu Vizoso	c7a6e07454	panfrost: bifrost: Fix format string in disassembler The compiler configuration was hardened to fail on format warnings and things stopped building. Fixes: `c9c1e26106` ("mesa: prevent common string formatting security issues") Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-By: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-05 10:40:19 +02:00
Kenneth Graunke	8d4f68ee20	iris: Free the buffer when reading from the disk cache.	2019-06-04 23:53:57 -07:00
Alyssa Rosenzweig	bfa9f56a2a	panfrost/midgard: Don't promote non-SSA to pipeline registers Fixes: `33800f4612` ("panfrost/midgard: Implement "pipeline register" prepass") Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-06-05 00:12:36 +00:00
Eric Anholt	36cb209787	freedreno: Drop invalid scissor optimization. We do support TF now, so it's no longer valid. Besides, if we want this optimization, we should probably have mesa/st doing it right for everyone. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-04 16:44:37 -07:00
Eric Anholt	8843b90cac	freedreno: Reuse glsl_get_sampler_coordinate_components(). We have the GLSL type, so we can just ask it how many coordinates there are. The GLSL function already has Vulkan cases that we'd probably want eventually. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-04 16:44:24 -07:00
Eric Anholt	fb872748ec	freedreno: Improve the pi approximations in trig lowering. When comparing our sin/cos behavior to the closed source driver, I noticed that we were off by a bit (or, in the case of 1/2pi, 3 bits). Fixes: dEQP-GLES3.functional.shaders.random.trigonometric.vertex.52 dEQP-GLES3.functional.shaders.random.all_features.vertex.0 Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-04 23:35:38 +00:00
Marek Olšák	ff63b99531	ac: rename LLVM <= 7 helpers for readability Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-04 18:53:46 -04:00
Marek Olšák	c9b64b58de	ac: fix a typo in ac_build_wg_scan_bottom Cc: 19.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-04 18:53:46 -04:00
Caio Marcelo de Oliveira Filho	04e8ff8595	glx: Fix error message when no driverName is available Just provide a "(null)" literal in case driverName is NULL. In file included from ../src/glx/dri3_glx.c:76: ../src/glx/dri3_glx.c: In function ‘dri3_create_screen’: ../src/glx/dri_common.h:70:36: error: ‘%s’ directive argument is null [-Werror=format-overflow=] 70 \| #define CriticalErrorMessageF(...) dri_message(_LOADER_FATAL, __VA_ARGS__) \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../src/glx/dri3_glx.c:1002:4: note: in expansion of macro ‘CriticalErrorMessageF’ 1002 \| CriticalErrorMessageF("failed to load driver: %s\n", driverName); \| ^~~~~~~~~~~~~~~~~~~~~ ../src/glx/dri3_glx.c:1002:50: note: format string is defined here 1002 \| CriticalErrorMessageF("failed to load driver: %s\n", driverName); \| ^~ cc1: some warnings being treated as errors Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-04 15:28:12 -07:00
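The "(null)" fallback described above can be sketched as follows. This is a minimal illustration, not the actual change in `dri3_glx.c`: `printable_name` is a hypothetical helper. It substitutes a literal before a possibly-NULL string reaches a `%s` conversion.

```c
#include <stddef.h>

/* Hypothetical helper: returns a printable string for a possibly-NULL
 * name so it can be passed safely to a "%s" conversion. */
const char *
printable_name(const char *name)
{
   return name ? name : "(null)";
}
```

A caller would then pass `printable_name(driverName)` as the `%s` argument, so the compiler no longer sees a potentially null argument for that conversion.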
Chia-I Wu	65439291a0	virgl: resolve to correct level during texture read When PIPE_TRANSFER_READ requires a resolve, we blit from the host storage to a temporary storage, and do a format conversion from the temporary storage to the guest storage. This change makes sure we convert to the correct level of the guest storage. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-04 21:37:03 +00:00
Chia-I Wu	067018d4e7	virgl: fix texture resolving with compressed formats util_format_translate_3d expects the source box to be aligned to the block size. When resolving, make sure the size of the staging buffer is aligned to the block size. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-06-04 21:37:03 +00:00
Bas Nieuwenhuizen	a6a5a6f67f	freedreno: Add printf pattern string. Some new flag setting disallows it due to being a security risk. Fixes: `c9c1e26106` "mesa: prevent common string formatting security issues" Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-04 23:20:50 +02:00
Bas Nieuwenhuizen	6256925b11	Revert "vl: Enable DRM by default." Reason: meson.build:586:7: ERROR: Unknown variable "dep_libdrm". if building without x11 platform. This reverts commit `392c60928a`.	2019-06-04 23:14:56 +02:00
Alyssa Rosenzweig	4a03d37827	panfrost/midgard: .pos propagation A previous optimization converts fmax(x, 0.0) instructions to fmov.pos. This pass then propagates the .pos from the move up to the source instruction (when possible). From there, copy propagation will eliminate the move. In the future, we might prefer to do this in common NIR code like we do for saturate, as Bifrost can also benefit. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig	5da0a33fab	panfrost/midgard: Cleanup copy propagation Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig	33800f4612	panfrost/midgard: Implement "pipeline register" prepass This prepass, run after scheduling but before RA, specializes to pipeline registers where possible. It walks the IR, checking whether sources are ever used outside of the immediate bundle in which they are written. If they are not, they are rewritten to a pipeline register (r24 or r25), valid only within the bundle itself. This has theoretical benefits for power consumption and register pressure (and performance by extension). While this is tested to work, it's not clear how much of a win it really is, especially without an out-of-order scheduler (yet!). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig	2a79afc5f0	panfrost/midgard: Helpers for pipeline Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig	3c7abbfbe8	panfrost/midgard: Refactor schedule/emit pipeline First, this moves the scheduler and emitter out of midgard_compile.c into their own dedicated files. More interestingly, this slims down midgard_bundle to be essentially an array of _pointers_ to midgard_instructions (plus some bundling metadata), rather than the instructions and packing themselves. The difference is critical, as it means that (within reason, i.e. as long as it doesn't affect the schedule) midgard_instructions can now be modified _after_ scheduling while having changes updated in the final binary. On a more philosophical level, this removes an IR. Previously, the IR before scheduling (MIR) was separate from the IR after scheduling (post-schedule MIR), requiring a separate set of utilities to traverse, using different idioms. There was no good reason for this, and it restricts our flexibility with the RA. So unify all the things! Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig	0524ab9c37	panfrost/midgard: Cleanup RA (stylistic changes) Trivial. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig	debc29b9ad	panfrost/midgard: Share MIR utilities These are more generally useful than the files they were constrained to. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig	1bfa0d6ccc	panfrost/midgard: Misc. cleanup for readability Mostly, this fixes a number of instances of lines >> 80 chars, refactoring them into something legible. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig	2d98022330	panfrost/midgard: Extend RA to non-vec4 sources This represents a major break with the former RA design. We now use conflicting register classes to represent the subdivision of Midgard's 128-bit registers into varying sizes and arrangement. We determine class based on the number of components in the instructions' masks. To support this, we include a number of helpers in the RA to allow composing swizzles and masks, such that MIR written implicitly assuming .xyzw sources can be transformed to use actual (non-aligned) sources. The net result is a marked decrease in register pressure on non-vec4-exclusive shaders. We could still be doing much better. Not implemented yet are: - Register spilling - Per-component liveness Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig	c1715b558a	panfrost/midgard: Set masks on ld_vary These masks distinguish scalar/vec2/vec3 loads from the default vec4, which helps with assembly readability (since it's immediately obvious how many components are _actually_ affected, rather than doing mysterious things to an unknown number of unused components). Later in the series, this will enable smarter register allocation, as the unused components will not be interpreted abnormally. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig	550be763fa	panfrost/midgard: Fix liveness analysis bugs This fixes liveness analysis with respect to inline constants and branching. In practice, the symptom is abnormally high register pressure. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig	c54f3f42eb	panfrost/midgard: Set int outmod for "pasted" code These snippets of integer assembly are injected for various purposes. Eventually, we'll want to implement these in NIR directly. Regardless, the "default" output modifier is different between floats and ints, so let's set the right one. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig	51196c3591	panfrost/midgard: Hoist some utility functions These were static to midgard_compile.c but are more generally useful across the compiler. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig	005d9b1ada	panfrost/midgard: Remove pinning This mechanism is only used by blend shaders, so just use a move here. Ideally, it'll be copy-propped and DCE'd away; this removes a source of considerable indirection and will simplify RA logic. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-06-04 20:14:50 +00:00
Alyssa Rosenzweig	d2d3cc66cf	nir/algebraic: Simplify max(abs(a), 0.0) -> abs(a) This pattern was noticed in glmark's jellyfish scene. v2: Add inexact qualifier due to NaN behaviour. Minimal shader-db changes (slightly helped). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2019-06-04 19:57:19 +00:00
Mark Janes	c9c1e26106	mesa: prevent common string formatting security issues Adds a compile-time error for obvious security issues like: printf(string_var); The proposed flag is more tolerant than -Wformat-nonliteral. Specifically, it tolerates common mesa formatting like: static const char *shader_template = "really long string %d"; printf(shader_template, uniform_number); Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110833 Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-06-04 12:49:38 -07:00
Jason Ekstrand	f4ef34f207	intel/fs: Add an UNDEF instruction to avoid excess live ranges With 8 and 16-bit types and anything where we have to use non-trivial register strides to deal with restrictions, we end up with things that look like partial writes even though we don't care about any values in the register except those written by that instruction. This is particularly important when dealing with loops because liveness sees is_partial_write and the fact that an old version from a previous loop iteration may be valid at that point and extends all purely partially written values to the entire loop. This commit adds a new UNDEF instruction which does nothing (the generator doesn't emit anything) but which does a fake write to the register. This informs liveness that we don't care about any values before that point so it won't consider those registers to be falsely live. We can safely emit UNDEF instructions for all SSA values that come in from NIR and nearly all temporaries generated by various stages of the compiler. In particular, we need to insert UNDEF instructions when we handle region restrictions because the newly allocated registers are almost guaranteed to be partially written. No shader-db changes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110432 Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-04 14:27:30 -05:00
Caio Marcelo de Oliveira Filho	d482a8f680	spirv: Update the OpenCL.std.h header This corresponds to commit 8b911bd2ba37677037b38c9bd286c7c05701bcda on GitHub. We previously tweaked OpenCL.std.h from upstream to be included in C code. Now upstream header can be included, however the symbol names are slightly different (include an OpenCLstd_ prefix), so this patch also fixes vtn_opencl.c to use those. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-04 12:12:51 -07:00
Bas Nieuwenhuizen	9701cb1034	radv: Use bo metadata for imported image tiling on Android. This way we handle linear images etc. correctly. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-06-04 18:32:45 +00:00
Bas Nieuwenhuizen	392c60928a	vl: Enable DRM by default. If libdrm is found the pipe loader enables drm anyway, and that is pretty much the only extra dependency this code has. This enables creating libva display using a drm fd without having to enable the DRM (GBM really) backend of EGL, which is completely unrelated. Leaving the X11 platforms alone as they would still result in the additional inclusion of extra deps. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-06-04 20:01:34 +02:00
Jason Ekstrand	c2a0335bb0	anv: Advertise support for VK_EXT_fragment_shader_interlock Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-06-04 17:30:51 +00:00
Jason Ekstrand	5176805471	spirv: Implement SPV_EXT_fragment_shader_interlock Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-06-04 17:30:51 +00:00
Jason Ekstrand	b5aa76b1df	spirv: Update the headers from latest Khronos master This corresponds to 8b911bd2ba37677037b38c9bd286c7c05701bcda in https://github.com/KhronosGroup/SPIRV-Headers. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-06-04 17:30:51 +00:00
Jason Ekstrand	8339e3f010	vulkan: Update the XML and headers to 1.1.110 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-06-04 17:30:51 +00:00
Rhys Perry	73dda85512	ac/nir: mark some texture intrinsics as convergent Otherwise LLVM can sink them and their texture coordinate calculations into divergent branches. v2: simplify the conditions on which the intrinsic is marked as convergent v3: only mark as convergent in FS and CS with derivative groups Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-04 17:30:53 +01:00
Rhys Perry	d4a2f8b33b	radv: fix some compiler warnings Fixes -Woverflow warnings with GCC 9.1.1 v2: use a cast instead of a bitwise and Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-04 17:30:53 +01:00
Jason Ekstrand	a84de3fb7c	intel/fs: Skip registers faster when setting spill costs This might be slightly faster since we're doing one read rather than two before we decide to skip. The more important reason, however, is because no_spill prevents us from re-spilling spill registers. In the new world in which we don't re-calculate liveness every spill, we may not have valid liveness for spill registers so we shouldn't even look their live ranges up. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110825 Fixes: `e99081e76d` "intel/fs/ra: Spill without destroying the..." Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Tapani Pälli <tapani.palli@intel.com>	2019-06-04 14:37:56 +00:00
Connor Abbott	d68218dbca	radeonsi/nir: Fix type in bindless address computation Bindless handles in GL are 64-bit. This fixes an assert failure in LLVM. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-04 15:15:46 +02:00
Christian Gmeiner	a6e879984c	etnaviv: implement set_active_query_state(..) for hw queries Clear w/ quad uses a normal draw which adds up to OQ. st/meta uses set_active_query_state(..) to tell the driver to pause queries in such cases. Fixes spec@arb_occlusion_query@occlusion_query_meta_save piglit. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-04 14:58:02 +02:00
Samuel Pitoiset	8a35eb0602	radv: do not use gfx fast depth clears for layered depth/stencil images The driver should only do fast depth clears with the graphics path when the view covers all image layers, otherwise this might corrupt layers when HTILE is enabled. Cc: 19.0 19.1 mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-04 08:55:32 +02:00
Samuel Pitoiset	33f4e04d5a	ac,radv: do not emit vec3 for raw load/store on SI It's unsupported, only load/store format with vec3 are supported. Fixes: `6970a9a6ca` ("ac,radv: remove the vec3 restriction with LLVM 9+") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-04 08:47:26 +02:00
Sagar Ghuge	3016756398	intel/compiler: Fix assertions in brw_alu3 v2: Fix assertion for src1 (Ian Romanick) Fixes: `3b967e17` (intel/compiler: Avoid false positive assertions) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-03 23:14:34 -07:00
Kenneth Graunke	34d3103dee	iris: Fix SO stride units for DrawTransformFeedback Mesa measures in DWords. The hardware also claims to measure in DWords. Except the SO_WRITE_OFFSET field is actually bits 31:2, with 1:0 MBZ. Which means that it really measures in bytes. So, convert to bytes. Without this, our offset / stride denominator was 1/4th the size it should be, leading to 4x the vertex count that we should have had. Fixes GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_two_buffers	2019-06-03 22:51:18 -07:00
Timothy Arceri	fea36a8f43	st/glsl: make sure to propagate initialisers to driver storage This essentially reverts `20234cfe3a`. Fixes piglit test: tests/spec/arb_get_program_binary/execution/uniform-after-restore.shader_test Fixes: `20234cfe3a` "st/mesa: don't propagate uniforms when restoring from cache" Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110784	2019-06-04 11:36:45 +10:00
Caio Marcelo de Oliveira Filho	61de825e11	spirv: Like Uniform, do nothing for UniformId Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 17:20:54 -07:00
Caio Marcelo de Oliveira Filho	b4eff83180	spirv: Implement SpvOpCopyLogical This is the same as SpvOpCopyObject but without the type checking, which is how vtn_composite_copy works, so we just need to hook the operation. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 17:20:54 -07:00
Caio Marcelo de Oliveira Filho	81586e9f53	spirv: Generalize OpSelect SPIR-V 1.4 supports OpSelect over any composite type, and also allows scalar boolean condition for vector types -- a case which we already handled to support old GLSLang. Added a helper function to recursively perform nir_bcsel, which makes it easier to support structs. v2: Replace asserts() with vtn_fail_if(). (Jason) v3: Simplify Condition and Result types verifications. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 17:20:54 -07:00
Caio Marcelo de Oliveira Filho	17630291e5	spirv: Move OpSelect handling to a function This will make a later change easier to review. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 17:20:54 -07:00
Caio Marcelo de Oliveira Filho	ea0e89859c	nir/vars_to_ssa: Handle UNDEF_NODE in more places Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110832 Fixes: `911ea2c66f` "nir/vars_to_ssa: Use a non-null UNDEF_NODE pointer" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 17:09:22 -07:00
Marek Olšák	b2bbd1a27b	ac/registers: don't use the si, cik, vi names, use gfxN trivial	2019-06-03 20:06:41 -04:00
Nicolai Hähnle	f480b8aaa4	amd/common: use generated register header	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	853ef5ccba	amd/common: use SH{0,1}_CU_EN definitions only of COMPUTE_STATIC_THREAD_MGMT_SE0 The automatic header generation unifies identical registers in a series and only emits definitions for the first one. This is mostly to avoid emitting excessive definitions for CB registers, but special-casing an exception for this family of registers doesn't seem worth it.	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	cf51009ad2	amd/common: unify PITCH_GFX6 and PITCH_GFX9 The definition of the fields differs, but PITCH_GFX9 is a mere extension of PITCH_GFX6 that does not conflict with any other fields. This aligns the definitions with what will be generated from the register JSON. The information about how large the fields really are is preserved in the register database.	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	e04215815e	amd/common: rename R_3F2_CONTROL to IB_CONTROL for disambiguation This "register" name collides with R_370_CONTROL. This aligns the definitions with what will be generated from the register JSON.	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	cd247cf456	amd/common: cleanup DATA_FORMAT/NUM_FORMAT field names The field layout wasn't actually changed in gfx9, so having the suffix isn't very useful. The field contents were changed, but this is reflected in the V_xxx_xxx definitions and is taken into account by the ac_debug logic based on the register JSON. This aligns the definitions with what will be generated from the register JSON.	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	ef6ef098af	amd/common: derive ac_debug tables from register JSON	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	d02286c753	amd/registers: add JSON description of packet3 fields	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	67702e3319	amd/registers: add JSON descriptions of registers The descriptions are mostly derived from parsing the existing register headers.	2019-06-03 20:05:20 -04:00
Nicolai Hähnle	e6184b0892	amd/registers: scripts for processing register descriptions in JSON We will derive both the debugging tables and (the majority of) the register headers from descriptions in JSON, instead of deriving the debugging tables from an awkward parsing of the register headers. Some of the scripts are useful for maintaining the register database itself. The scripts are designed to output reasonably readable JSON by default.	2019-06-03 20:05:20 -04:00
Vinson Lee	d4e70be739	freedreno: Fix GCC build error. ../src/freedreno/vulkan/tu_device.c:900:4: error: initializer element is not constant .minImageTransferGranularity = (VkExtent3D) { 1, 1, 1 }, ^ Suggested-by: Kristian Høgsberg <krh@bitplanet.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110698 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-03 16:46:54 -07:00
Mark Janes	774a088f64	mesa: Use string literals for format strings Android build settings require format strings to be string literals. Fixes: `d2906293c4` "mesa: EXT_dsa add selectorless matrix stack functions" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110833 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-03 16:17:23 -07:00
Caio Marcelo de Oliveira Filho	045aeccf0e	iris: Always reserve binding table space for NIR constants Don't have a separate mechanism for NIR constants to be removed from the table. If unused, we will compact it away. The use_null_surface is needed when INTEL_DISABLE_COMPACT_BINDING_TABLE is set. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho	5611444809	iris: Print binding tables when INTEL_DEBUG=bt Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho	97cd865be2	iris: Compact binding tables Change the iris_binding_table to keep track of what surfaces are actually going to be used, then assign binding table indices just for those. Reducing unused bytes on those is valuable because we use a reduced space for those tables in Iris. The rest of the driver can go from "group indices" (i.e. UBO #2) to BTI and vice-versa using helper functions. The value IRIS_SURFACE_NOT_USED is returned to indicate a certain group index is not used or a certain BTI is not valid. The environment variable INTEL_DISABLE_COMPACT_BINDING_TABLE can be set to skip compacting the binding table. v2: (all from Ken) Use BITFIELD64_MASK helper. Improve comments. Assert all groups are marked as used when we have indirects. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho	79f1529ae0	iris: Create an enum for the surface groups This will make it convenient to handle compacting and printing the binding table. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho	1c8ea8b300	iris: Handle binding table in the driver Stop using brw_compiler to lower the final binding table indices for surface access. This is done by simply not setting the 'prog_data->binding_table.*_start' fields. Then make the driver perform this lowering. This is a better place to perform the binding table assignments, since the driver has more information and will also later consume those assignments to upload resources. This also prepares us for two changes: use ibc without having to implement binding table logic there; and remove unused entries from the binding table. Since the `block` field in brw_ubo_range now refers to the final binding table index, we need to adjust it before using it to index shs->constbuf. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho	518f83236b	iris: Pull brw_nir_analyze_ubo_ranges() call out of setup_uniforms We'll change iris to perform lowering of the binding table indices earlier (before the backend kicks in), but the backend compiler uses the result of the analysis to identify load_ubo intrinsics, so we do the analysis after the lowering to have the right indices. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-03 14:14:45 -07:00
Caio Marcelo de Oliveira Filho	1f8546ba2f	spirv: Implement OpPtrEqual, OpPtrNotEqual and OpPtrDiff Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 13:45:09 -07:00
Caio Marcelo de Oliveira Filho	ca164ab495	nir: Add functions to subtract and compare addresses v2: Fix comparing addresses from formats that have more than one component by using nir_ball_iequal(). (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 13:45:09 -07:00
Caio Marcelo de Oliveira Filho	09cc3389b9	nir: Add nir_ball_iequal() helper Similar to nir_bany_inequal(). Suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 13:45:09 -07:00
Sergii Romantsov	88340372ee	mesa: ARB program parser should clean parameters The program parser allocates a parameter list. In case of a parsing error, some variables will not be freed. This patch adds freeing of them. Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 16:41:26 -04:00
Hyunjun Ko	382e3553af	freedreno/ir3: fix counting and printing for half registers. v2: defining 0x100 and use this for setting the FS_OUTPUT_REG.HALF_PRECISION Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 13:31:51 -07:00
Neil Roberts	fb53b326c2	freedreno/ir3: Fix up the half reg source even when src instr==NULL Previously the loop for assigning registers was bailing out early if the register had a null source. I think the intention is that in this case it isn’t necessary to assign a register. However it was also missing out the part to fix up the types. This can happen if the instruction is copy propagated to be a move from a constant half-float input register. In that case it still needs to fix up the types. Fixes assert in dEQP-GLES3.functional.shaders.invariance.highp.subexpression_precision_mediump when lowering the precision of the variables. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 13:31:51 -07:00
Neil Roberts	3222216a58	freedreno/ir3: Add a 16-bit implementation of nir_op_imul Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 13:31:51 -07:00
Hyunjun Ko	daee6bc1a1	freedreno/ir3: set dst type of alu instructions correctly. Though it should be fixed in RA pass, it needs to be set correctly from the beginning according to the bitsize of NIR dest. v2: Would be better for mad,fddx,fddy to fixup later in RA pass. [small cleanup of fallout from imov/fmov removal] Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 13:31:26 -07:00
Hyunjun Ko	43d80a3e20	freedreno/ir3: adjust the bitsize of regs when loading an array. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 12:44:03 -07:00
Hyunjun Ko	cbd1f47433	freedreno/ir3: convert back to 32-bit values for half constant registers. It seems to handle only 32-bit values for half constant registers within floating point opcodes according to the blob driver. So we need to convert back to 32-bit values from 16-bit values, when a lower precision pass is in effect. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 12:44:03 -07:00
Hyunjun Ko	a9b556d3a0	freedreno/ir3: check the type of regs of absneg opcode in is_same_type_mov. If the type of dest reg and src reg of absneg opcode are different, it shouldn't be considered as same type mov. This patch becomes meaningful when we start to use mediump information for doing precision lowering to 16bit. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 12:44:03 -07:00
Hyunjun Ko	6fb8ef3da6	freedreno/ir3: set proper dst type for uniform according to the type of nir dest. eg. uniform mediump vec4 f; This patch means nothing since there's no mediump lowering pass for now, but will be meaningful when the pass lands in the near future. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 12:44:03 -07:00
Neil Roberts	689c3c7d40	freedreno/ir3: Use output type size to set OUTPUT_REG_HALF_PRECISION Previously the A5XX_SP_FS_OUTPUT_REG_HALF_PRECISION was set depending on whether half_precision was set in the shader key. With support for mediump precision, it is possible to have different outputs use different precisions. That means we can’t have a global shader state to specify it. Instead it now tries to copy the half-float-ness from the nir_variable for the output into the ir3_shader_variant. This is then used to decide whether to set half-precision for each output. The a6xx version is copied from the a5xx code but it has not been tested. v2. [Hyunjun Ko (zzoon@igalia.com)] There's the half flag recently added, which represents precision based on IR3_REG_HALF. Now use this flag to avoid duplication. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 12:44:03 -07:00
Neil Roberts	8cd1b76b7d	freedreno/ir3: Fix loading half-float immediate vectors Previously the code to load from a constant instruction was always using the u32 pointer. If the constant is actually a 16-bit source this would end up with the wrong values because the pointer would be offset by the wrong size. This fixes it to use the u16 pointer. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-03 12:44:03 -07:00
Rob Clark	7bbf21e898	freedreno/ir3: immediately schedule meta instructions They aren't real instructions, and don't change # of live values, so no point in them competing with real instructions. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-03 12:44:03 -07:00
Rob Clark	771d04c82d	freedreno/ir3: scheduler improvements For instructions that increase the # of live values, apply a threshold to avoid scheduling them too early. And factor the net change of # of live values that would result from scheduling an instruction, to prioritize instructions that reduce number of live values as the number of live values increases. For manhattan: total instructions in shared programs: 27869 -> 28413 (1.95%) instructions in affected programs: 26756 -> 27300 (2.03%) helped: 102 HURT: 87 total full in shared programs: 1903 -> 1719 (-9.67%) full in affected programs: 1390 -> 1206 (-13.24%) helped: 124 HURT: 9 The reduction in register usage nets ~20% gain in manhattan. (So getting mediump support should be a huge win for gles gfxbench.) Also significantly helps some of the more complex shadertoy shaders, like IQ's Piano (32 to 18 regs, doubles fps). The effect is less pronounced on smaller shaders. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-03 12:44:03 -07:00
Rob Clark	bb3aa44ade	freedreno/ir3: sched should mark outputs used Account for shader outputs and values live in any direct/indirect successor block. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-03 12:44:03 -07:00
Pierre-Eric Pelloux-Prayer	d2906293c4	mesa: EXT_dsa add selectorless matrix stack functions Allows the legacy matrix stacks to be manipulated without disturbing the matrix mode selector. Adapted from a patch from Chris Forbes. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 15:28:51 -04:00
Pierre-Eric Pelloux-Prayer	28ce704bb0	mesa: factor out enum -> matrix stack lookup Split this out from glMatrixMode since we're about to need it independently for EXT_DSA. Adapted from Chris Forbes commit. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 15:28:49 -04:00
Timothy Arceri	b69584ad69	mesa: add new EXT_direct_state_access tokens Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 15:28:47 -04:00
Chris Forbes	028682f7f4	glapi: add EXT_direct_state_access Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 15:28:45 -04:00
Timothy Arceri	9c5d86af38	mesa: add a list of EXT_direct_state_access to dispatch sanity This extension is huge and this gives us a TODO list of functions to implement. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 15:28:33 -04:00
Pierre-Eric Pelloux-Prayer	4583f09caa	radeonsi: init sctx->dma_copy before using it Commit `a1378639ab` reordered context functions initializations but broke sctx->b.resource_copy_region init when using AMD_DEBUG=forcedma. In this case sctx->dma_copy was assigned a value after being used in: sctx->b.resource_copy_region = sctx->dma_copy; This commit moves the FORCE_DMA special case after sctx->dma_copy initialization. See https://bugs.freedesktop.org/show_bug.cgi?id=110422 Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 15:05:30 -04:00
Axel Davy	5820ac6756	d3dadapter9: Revert to old throttling limit value Recently PIPE_CAP_MAX_FRAMES_IN_FLIGHT was changed from 2 to 1: `20909284f2` No driver seems to overwrite the default value. One user reports severe regressions for some games. For now, revert to the value 2 for nine. Cc: "19.1" mesa-stable@lists.freedesktop.org Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-06-03 20:37:13 +02:00
Marek Olšák	486bc1e17e	ac: use amdgpu-flat-work-group-size Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-03 14:32:47 -04:00
Marek Olšák	4b11ed443b	u_blitter: don't fail mipmap generation for depth formats containing stencil Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=109754 Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-06-03 14:32:47 -04:00
Christian Gmeiner	3135ca4172	etnaviv: drop a bunch of duplicated gallium PIPE_CAP default code Now that we have the util function for the default values, we can get rid of the boilerplate. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-06-03 16:29:59 +02:00
Samuel Pitoiset	445098916a	radv: flush pending query reset caches before copying results From the Vulkan spec 1.1.108: "vkCmdCopyQueryPoolResults is guaranteed to see the effect of previous uses of vkCmdResetQueryPool in the same queue, without any additional synchronization." Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-06-03 16:05:46 +02:00
Jonathan Marek	91672becc3	nir: copy intrinsic type when lowering load input/uniform and store output Fixes: `c1275052` "nir: add type information to load uniform/input and store output intrinsics" Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Tested-by: Erico Nunes <nunes.erico@gmail.com> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>	2019-06-03 12:46:14 +00:00
Samuel Pitoiset	6970a9a6ca	ac,radv: remove the vec3 restriction with LLVM 9+ This change requires LLVM r356755. 32706 shaders in 16744 tests Totals: SGPRS: 1448848 -> 1455984 (0.49 %) VGPRS: 1016684 -> 1016220 (-0.05 %) Spilled SGPRs: 25871 -> 25815 (-0.22 %) Spilled VGPRs: 122 -> 122 (0.00 %) Scratch size: 11964 -> 11956 (-0.07 %) dwords per thread Code Size: 55324500 -> 55301152 (-0.04 %) bytes Max Waves: 235660 -> 235586 (-0.03 %) Totals from affected shaders: SGPRS: 293704 -> 300840 (2.43 %) VGPRS: 246716 -> 246252 (-0.19 %) Spilled SGPRs: 159 -> 103 (-35.22 %) Scratch size: 188 -> 180 (-4.26 %) dwords per thread Code Size: 8653664 -> 8630316 (-0.27 %) bytes Max Waves: 60811 -> 60737 (-0.12 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-03 11:30:08 +02:00
Caio Marcelo de Oliveira Filho	75590604a9	nir: Return nir_type_invalid for non-numeric base types Now that the type gathering function looks at instructions that might have other types, return invalid type instead of crashing. That invalid type will be properly ignored later. Fixes: `c12750527b` "nir: add type information to load uniform/input and store output intrinsics" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 16:27:03 -07:00
Caio Marcelo de Oliveira Filho	27497c5c02	iris: Drop unused locals from iris_clear.c to avoid warning Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-05-31 15:55:05 -07:00
Jonathan Marek	f387c2b238	nir: remove bool lowering from lower_int_to_float Removes the bool_to_float logic from the int_to_float pass, so that both can be used separately. By having separate passes we have better validation and it makes it possible to use with the lower_ftrunc option (int lowering generates ftrunc, but lower_ftrunc generates bools, ftrunc lowering should probably be reworked). For now we always expect lower_bool to come after lower_int. Also fixes f2i32 to become ftrunc and adds u2f/f2u cases. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 21:35:26 +00:00
Jonathan Marek	f6579ee204	nir: fix lower_{int,bool}_to_float for new mov opcode It is treated like the vecN instructions which also have no type. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 21:35:26 +00:00
Jonathan Marek	f889180ee1	nir: add lower_bitshift option Add a "lower_bitshift" option, which disables optimizations introducing bitshifts and lowers ishl by constant to a multiply, so that we don't have to deal with bitshifts in int_to_float lowering. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 21:35:26 +00:00
Jonathan Marek	887c2a6092	nir: fix gather_ssa_types Consts and undefs can be used as different types (common with "0" constant) so don't copy types from consts/undefs, only to them. It doesn't entirely solve the problem that the type given to the const could be wrong, but now the only realistic case is with "0" which is the same when cast to float, so it doesn't matter for lower_int_to_float. The other change is to get type information for load input/uniform and store output, and use that to get correct results. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 21:35:26 +00:00
Jonathan Marek	c12750527b	nir: add type information to load uniform/input and store output intrinsics This type information will be used by gather_ssa_types to get usable results Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 21:35:26 +00:00
Jonathan Marek	6016df211f	nir: improvements to native_integers removal Improvements related to the patch that removed native_integers: * In glsl_to_nir, special cases for i2f,u2f,etc are no longer needed * In prog_to_nir, use sge/slt and let lower_scmp lower it if needed Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 21:35:26 +00:00
Rob Clark	32131a9568	freedreno/a6xx: add 'type' to shader state key We could have identical texture state for both VS and FS.. which would result in VS state getting created first, and FS state mapping to the identical cmdstream. Resulting in VS state getting emitted twice and no FS state emitted. Fixes: dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.basic_array.sampler2D_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.struct_in_array.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.array_in_struct.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_both dEQP-GLES2.functional.uniform_api.value.assigned.by_value.render.nested_structs_arrays.sampler2D_samplerCube_both dEQP-GLES31.functional.program_uniform.by_pointer.render.array_in_struct.sampler2D_samplerCube_both dEQP-GLES31.functional.program_uniform.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_both dEQP-GLES31.functional.program_uniform.by_value.render.nested_structs_arrays.sampler2D_samplerCube_both Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-31 12:58:47 -07:00
Rob Clark	8b7bf5e07a	freedreno/ir3: fix constlen versus indirect UBO If we access the address of the UBO indirectly, and there is no higher const emitted w/ direct access (like an immediate lowered to uniform) the assembler won't figure out the correct constlen. Fixes: dEQP-GLES31.functional.shaders.opaque_type_indexing.ubo.uniform_vertex dEQP-GLES31.functional.shaders.opaque_type_indexing.ubo.uniform_fragment dEQP-GLES31.functional.shaders.opaque_type_indexing.ubo.dynamically_uniform_vertex dEQP-GLES31.functional.shaders.opaque_type_indexing.ubo.dynamically_uniform_fragment Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-31 12:58:33 -07:00
Rob Clark	8eaa2d5021	freedreno/a6xx: fix GPU crash on small render targets Fixes dEQP-GLES2.functional.multisampled_render_to_texture.readpixels Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Eric Anholt <eric@anholt.net>	2019-05-31 12:58:33 -07:00
Rob Clark	f9fa456e1d	freedreno/ir3: set more barrier bits Blob is also setting the .l bit, and it seems to solve some intermittent failures with a couple of deqp's: dEQP-GLES31.functional.image_load_store.2d.qualifiers.coherent_r32i dEQP-GLES31.functional.image_load_store.2d.qualifiers.volatile_r32f Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Eric Anholt <eric@anholt.net>	2019-05-31 12:58:33 -07:00
Rob Clark	5d43b806ba	freedreno/ir3: set (ss) on last_input if ldlv It seems like (ei) handling doesn't sync on (ss), so we could end up in a situation where we release varying storage before an ldlv for flat shaded varyings completes. Keep track if we've done an (ss) since the last ldlv, and if not add (ss) flag to last_input which gets (ei). Noticed with dEQP-GLES3.functional.fragment_out.random.24 and dEQP-GLES3.functional.fragment_out.random.27, which previously passed by luck because ir3_sched ordered instructions in a way that resulted in a lucky (ss). Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Eric Anholt <eric@anholt.net>	2019-05-31 12:58:33 -07:00
Rob Clark	73fb02c5d6	freedreno/ir3: add assert The special handling for last_input assumes that all the varying loads are in the first block. Add an assert to catch if anyone breaks that assumption. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-31 12:58:33 -07:00
Connor Abbott	8c74772edc	util/hash_table: Use fast modulo computation While we're here, copy the size table from set.c to get rid of hard tabs in the hash_table.c version. Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:14:35 +02:00
Connor Abbott	83667f7a61	util/set: Use fast modulo computation Compilation times with my shader-db database: Difference at 95.0% confidence -1.22312 +/- 0.726033 -0.283979% +/- 0.168254% (Student's t, pooled s = 1.02177) Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:14:30 +02:00
Connor Abbott	b87817871b	util: Add a helper for faster remainders This should be at least as fast as using fast_idiv_by_const, and has the advantage that the precomputation is simple enough to be evaluated at Mesa-compile time for hash tables and sets which have a fixed table of possible divisors. Acked-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:14:27 +02:00
Connor Abbott	983b001c77	util/hash_table: Add specialized resizing add function To keep it in sync with the set implementation. Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:14:22 +02:00
Connor Abbott	6f9beb28bb	util/set: Add specialized resizing add function A significant portion of the time spent in nir_opt_cse for the Dolphin ubershaders was in resizing the set. When resizing a hash table, we know in advance that each new element to be inserted will be different from every other element, so we don't have to compare them, and there will be no tombstone elements, so we don't have to worry about caching the first-seen tombstone. We add a specialized add function which skips these steps entirely, speeding up resizing. Compile-time results from my shader-db database: Difference at 95.0% confidence -2.29143 +/- 0.845534 -0.529475% +/- 0.194767% (Student's t, pooled s = 1.08807) Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:14:16 +02:00
Connor Abbott	451211741c	util/hash_table: Pull out loop-invariant computations To keep the set and hash table in sync. Note that some of this had already been done for hash tables, in particular pulling out the hash % ht->size computation. Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:14:09 +02:00
Connor Abbott	f7ff685649	util/set: Pull out loop-invariant computations Unfortunately GCC can't do this for us, probably because we call the key comparison function which GCC can't prove won't modify arbitrary memory. This is a pretty hot function, so do the optimization manually to be sure the compiler will get it right. While we're here, make the computation of the new probe address use a single conditional subtract instead of a modulo, since we know that it won't ever get as big as 2 * ht->size before the modulo. Modulos tend to be pretty expensive operations. shader-db compile time results for my database: Difference at 95.0% confidence -2.24934 +/- 0.69897 -0.516296% +/- 0.159993% (Student's t, pooled s = 0.983684) Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:14:04 +02:00
Connor Abbott	3bd0733011	nir/instr_set: Use _mesa_set_search_or_add() Before this change, we were searching for each instruction twice, once when checking if it exists and once when figuring out where to insert it. By using the new function, we can do everything we need to do in one operation. Compilation time numbers for my shader-db database: Difference at 95.0% confidence -4.04706 +/- 0.669508 -0.922142% +/- 0.151948% (Student's t, pooled s = 0.95824) Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:13:59 +02:00
Connor Abbott	8a838e172f	util/set: Add a _mesa_set_search_or_add() function Unlike _mesa_set_search_and_add(), it doesn't replace an entry if it's found, returning it instead. This is useful for nir_instr_set, where we have to know both the original instruction and its equivalent. Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:13:45 +02:00
Jonathan Marek	1db86d8b62	freedreno/ir3: fix input ncomp for vertex shaders ncomp is never set for vertex shaders, but a3xx and a4xx still use it. Fixes: `831f1a05c0` freedreno/ir3: rework varying packing Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-05-31 12:21:23 -04:00
Ian Romanick	65df6122da	intel/compiler: Use compare rematerialization pass Almost all of the spill / fill benefit is in Deus Ex. Haswell and all Gen8+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17224438 -> 17196395 (-0.16%) instructions in affected programs: 1518658 -> 1490615 (-1.85%) helped: 1550 HURT: 3 helped stats (abs) min: 1 max: 170 x̄: 18.11 x̃: 2 helped stats (rel) min: 0.04% max: 8.35% x̄: 1.12% x̃: 0.45% HURT stats (abs) min: 5 max: 10 x̄: 6.67 x̃: 5 HURT stats (rel) min: 0.32% max: 0.41% x̄: 0.35% x̃: 0.32% 95% mean confidence interval for instructions value: -19.86 -16.26 95% mean confidence interval for instructions %-change: -1.19% -1.04% Instructions are helped. total cycles in shared programs: 361468455 -> 361288721 (-0.05%) cycles in affected programs: 197367688 -> 197187954 (-0.09%) helped: 990 HURT: 683 helped stats (abs) min: 1 max: 119045 x̄: 806.00 x̃: 16 helped stats (rel) min: <.01% max: 38.56% x̄: 1.06% x̃: 0.26% HURT stats (abs) min: 1 max: 12190 x̄: 905.14 x̃: 22 HURT stats (rel) min: <.01% max: 25.18% x̄: 1.16% x̃: 0.47% 95% mean confidence interval for cycles value: -315.45 100.58 95% mean confidence interval for cycles %-change: -0.31% <.01% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 12147 -> 8948 (-26.34%) spills in affected programs: 5433 -> 2234 (-58.88%) helped: 343 HURT: 0 total fills in shared programs: 25262 -> 21814 (-13.65%) fills in affected programs: 7771 -> 4323 (-44.37%) helped: 343 HURT: 3 LOST: 0 GAINED: 17 Ivy Bridge total instructions in shared programs: 12083517 -> 12081427 (-0.02%) instructions in affected programs: 540744 -> 538654 (-0.39%) helped: 786 HURT: 29 helped stats (abs) min: 1 max: 42 x̄: 2.70 x̃: 2 helped stats (rel) min: 0.06% max: 5.44% x̄: 0.55% x̃: 0.36% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.16% max: 0.95% x̄: 0.38% x̃: 0.31% 95% mean confidence interval for instructions value: -2.83 -2.30 95% mean confidence interval for instructions %-change: -0.57% -0.47% Instructions are helped. total cycles in shared programs: 180153463 -> 180124798 (-0.02%) cycles in affected programs: 72597920 -> 72569255 (-0.04%) helped: 572 HURT: 249 helped stats (abs) min: 1 max: 14830 x̄: 109.48 x̃: 13 helped stats (rel) min: <.01% max: 8.92% x̄: 0.71% x̃: 0.26% HURT stats (abs) min: 1 max: 11060 x̄: 136.37 x̃: 10 HURT stats (rel) min: <.01% max: 10.85% x̄: 0.54% x̃: 0.32% 95% mean confidence interval for cycles value: -96.22 26.39 95% mean confidence interval for cycles %-change: -0.43% -0.23% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 3625 -> 3623 (-0.06%) spills in affected programs: 46 -> 44 (-4.35%) helped: 1 HURT: 0 total fills in shared programs: 4065 -> 4061 (-0.10%) fills in affected programs: 104 -> 100 (-3.85%) helped: 1 HURT: 0 LOST: 0 GAINED: 8 Sandy Bridge total instructions in shared programs: 10879656 -> 10878699 (<.01%) instructions in affected programs: 275167 -> 274210 (-0.35%) helped: 544 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 1.76 x̃: 1 helped stats (rel) min: 0.06% max: 3.11% x̄: 0.39% x̃: 0.25% 95% mean confidence interval for instructions value: -1.97 -1.55 95% mean confidence interval for instructions %-change: -0.43% -0.36% Instructions are helped. total cycles in shared programs: 154089096 -> 154081132 (<.01%) cycles in affected programs: 4422722 -> 4414758 (-0.18%) helped: 459 HURT: 214 helped stats (abs) min: 1 max: 258 x̄: 26.67 x̃: 8 helped stats (rel) min: <.01% max: 5.45% x̄: 0.51% x̃: 0.14% HURT stats (abs) min: 1 max: 226 x̄: 19.99 x̃: 4 HURT stats (rel) min: <.01% max: 3.15% x̄: 0.34% x̃: 0.09% 95% mean confidence interval for cycles value: -15.51 -8.15 95% mean confidence interval for cycles %-change: -0.31% -0.17% Cycles are helped. total spills in shared programs: 2880 -> 2876 (-0.14%) spills in affected programs: 636 -> 632 (-0.63%) helped: 2 HURT: 0 total fills in shared programs: 3161 -> 3157 (-0.13%) fills in affected programs: 1519 -> 1515 (-0.26%) helped: 2 HURT: 0 LOST: 0 GAINED: 2 Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8157361 -> 8155067 (-0.03%) instructions in affected programs: 382491 -> 380197 (-0.60%) helped: 677 HURT: 0 helped stats (abs) min: 1 max: 43 x̄: 3.39 x̃: 2 helped stats (rel) min: 0.09% max: 5.19% x̄: 0.66% x̃: 0.42% 95% mean confidence interval for instructions value: -3.76 -3.01 95% mean confidence interval for instructions %-change: -0.72% -0.59% Instructions are helped. 
total cycles in shared programs: 188588292 -> 188583040 (<.01%) cycles in affected programs: 3155064 -> 3149812 (-0.17%) helped: 377 HURT: 13 helped stats (abs) min: 2 max: 180 x̄: 14.13 x̃: 6 helped stats (rel) min: <.01% max: 3.96% x̄: 0.39% x̃: 0.12% HURT stats (abs) min: 2 max: 8 x̄: 5.85 x̃: 6 HURT stats (rel) min: <.01% max: 0.22% x̄: 0.06% x̃: 0.04% 95% mean confidence interval for cycles value: -15.67 -11.27 95% mean confidence interval for cycles %-change: -0.45% -0.30% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-31 08:47:03 -07:00
Ian Romanick	3ee2e84c60	nir: Rematerialize compare instructions On some architectures, Boolean values used to control conditional branches or conditional selection must be propagated into a flag. This generally means that a stored Boolean value must be compared with zero. Rather than force the generation of extra compares with zero, re-emit the original comparison instruction. This can save register pressure by not needing to store the Boolean value. There are several possible areas for future improvement to this pass: 1. Be more conservative. If both sources to the comparison instruction are non-constants, it may be better for register pressure to emit the extra compare. The current shader-db results on Intel GPUs (next commit) lead me to believe that this is not currently a problem. 2. Be less conservative. Currently the pass requires that all users of the comparison match the pattern. The idea is that after the pass is complete, no instruction will use the resulting Boolean value. The only uses will be of the flag value. It may be beneficial to relax this requirement in some cases. 3. Be less conservative. Also try to rematerialize comparisons used for discard_if intrinsics. After changing the way the Intel compiler generates code for discard_if (see MR!935), I tried implementing this already. The changes were pretty small. Instructions were helped in 19 shaders, but, overall, cycles were hurt. A commit "nir: Rematerialize comparisons for nir_intrinsic_discard_if too" is on my fd.o cgit. 4. Copy the preceding ALU instruction. If the comparison is a comparison with zero, and it is the only user of a particular ALU instruction (e.g., (a+b) != 0.0), it may be a further improvement to also copy the preceding ALU instruction. On Intel GPUs, this may enable cmod propagation to make additional progress. v2: Use much simpler method to get the prev_block for an if-statement. Suggested by Tim. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-31 08:47:03 -07:00
Ian Romanick	336eab0630	nir: Add a shallow clone function for nir_alu_instr Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Matt Turner <mattst88@gmail.com>	2019-05-31 08:47:03 -07:00
Tomeu Vizoso	0e1c5cc78f	panfrost: Remove link stage for jobs And instead, link them as they are added. Makes things a bit clearer and prepares future work such as FB reload jobs. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-31 14:37:10 +02:00
Tomeu Vizoso	da9f7ab6d4	panfrost: ci: Switch to kernel 5.2-rc2 Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-31 13:51:51 +02:00
Tomeu Vizoso	77f5663cf3	panfrost: ci: Update expectations A bunch of tests have been fixed, but some regressions have appeared on T760. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2019-05-31 13:51:43 +02:00
Connor Abbott	78f33620e8	radeonsi/nir: Remove hack for builtins We now bounds check properly in the uniform loading fast path, so there's no need to disable it by pretending there are other UBO bindings in use. The way this looks at the variable name was causing problems when two piglit shaders, one with a name that triggered the hack and one that didn't, got hashed to the same thing after stripping out the names. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-05-31 11:03:05 +02:00
Connor Abbott	fca1a35163	radeonsi/nir: Use correct location for uniform access bound location is the API-level location, but driver_location is the actual location the uniform gets passed to the driver. This apparently only caused failures with builtins, where the location is 0 because it's represented via the state tokens instead. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-05-31 11:02:57 +02:00
Connor Abbott	6571032af1	radeonsi/nir: Correctly handle double TCS/TES varyings ac expands the store to 32-bit components for us, but we still have to deal with storing up to 8 components, and when a varying is split across two vec4 slots we have to calculate the address again for the second slot, since they aren't adjacent in memory. I didn't do this on the ac level because we should generate better indexing arithmetic for the lds store, where slots are contiguous. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-05-31 11:02:11 +02:00
Christian Gmeiner	ca19f7639a	etnaviv: blt: s/TRUE/true && s/FALSE/false Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-05-31 10:04:49 +02:00
Christian Gmeiner	9e6463e62a	etnaviv: rs: s/TRUE/true && s/FALSE/false Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-05-31 10:04:49 +02:00
Bas Nieuwenhuizen	e24a7840f6	nir: Actually propagate progress in nir_opt_move_load_ubo. Found with Jason's new metadata rework (https://gitlab.freedesktop.org/mesa/mesa/merge_requests/950). Fixes: `af355aaa07` "nir: add nir_opt_move_load_ubo() optimization pass" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-05-31 07:45:43 +00:00
Samuel Pitoiset	9178076a46	radv: use RADV_CMD_DIRTY_DYNAMIC_* when restoring viewport/scissor Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-31 08:50:16 +02:00
Samuel Pitoiset	0e7b029d00	radv: use CmdPushConstants when restoring constants after meta operations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-31 08:50:13 +02:00
Jason Ekstrand	f1cb3348f1	nir/split_vars: Properly bail in the presence of complex derefs Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-31 01:08:03 +00:00
Jason Ekstrand	cc59503b16	nir/vars_to_ssa: Properly ignore variables with complex derefs Because the core principle of the vars_to_ssa pass is that it globally (within a function) looks at all of the uses of a never-indirected path and does a full into-SSA on that path, it can't handle a path which has any chance of having aliasing. If a function_temp variable has a cast or anything else which may cause aliasing, we have to assume that all paths to that variable may alias and ignore the entire variable. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-31 01:08:03 +00:00
Jason Ekstrand	911ea2c66f	nir/vars_to_ssa: Use a non-null UNDEF_NODE pointer We're about to change the meaning of get_deref_node returning NULL so we need a non-NULL value to mean properly undefined. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-31 01:08:03 +00:00
Jason Ekstrand	e84194686d	nir/deref: Add a has_complex_use helper This lets passes easily detect derefs which have uses that fall outside the standard load/store/copy pattern so they can bail appropriately. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-31 01:08:03 +00:00
Jason Ekstrand	8948048c6f	nir/dead_cf: Call instructions aren't dead When we inlined cf_node_has_side_effects into node_is_dead, all the conditions flipped and we forgot to flip one. Fortunately, it doesn't matter right now because no one uses this pass on shaders with more than one function. Fixes: `b50465d197` "nir/dead_cf: Inline cf_node_has_side_effects" Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-31 01:08:03 +00:00
Dave Airlie	5441d56243	vtn: create cast with type stride. When creating function parameters, we create pointers from ssa values. This creates nir casts with stride 0; however, we have nowhere else to get this value from. Later passes to lower explicit io need this stride value to do the right thing. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-05-31 09:57:45 +10:00
Rob Clark	372e83b95f	list: add some iterator debug Debugging use of unsafe iterators when you should have used the _safe version sucks. Add some DEBUG build support to catch and assert if someone does that. I didn't update the UPPERCASE versions of the iterators. They should probably be deprecated/removed. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-05-30 22:11:26 +00:00
Caio Marcelo de Oliveira Filho	03ce12c5ed	nir: Accept nir_var_mem_global in derefs used by phis This mode is used by PhysicalStorageBufferEXT storage class. Fixes: `8bdf5a008b` "nir: Allow derefs to be used as phi sources" Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-30 14:07:29 -07:00
Jason Ekstrand	5e43a75950	intel/blorp: Use the hardware op for CCS ambiguate on gen10+ Cannonlake hardware adds a new resolve type in 3DSTATE_PS called FAST_CLEAR_0 which does an ambiguate. Now that the hardware can do it directly, we should use that instead of binding the CCS as a render target and doing it manually. This was tested with a full Vulkan CTS run on Cannonlake. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2019-05-30 13:49:48 -07:00
Jan Zielinski	b31a31bba5	swr/rast: Enable ARB_GL_texture_buffer_range No significant changes in the code needed to enable the extension. Just updating SWR capabilities and the documentation Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-05-30 15:42:15 +00:00
Jan Zielinski	cf673747ce	swr/rast: fix 32-bit compilation on Linux Removing unused but problematic code from simdlib header to fix compilation problem on 32-bit Linux. Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-05-30 15:31:15 +00:00
Jason Ekstrand	9e403dc56e	intel/fs: Do a stalling MFENCE in endInvocationInterlock() Fixes: `939312702e` "i965: Add ARB_fragment_shader_interlock support" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-30 14:00:26 +00:00
Jason Ekstrand	859de4a748	intel/fs,vec4: Use g0 as the header for MFENCE We set header_present but then pass it some random garbage. Give it g0 instead. I'm not actually sure this does anything but g0 is the usual header data and this is what the windows driver does so it seems like a good idea. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-30 14:00:26 +00:00
Samuel Pitoiset	43cc3dc9c0	radv: enable transformFeedbackStreamsLinesTriangles The driver should already support this without any changes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-30 15:42:36 +02:00
Samuel Pitoiset	da26013eb7	radv: implement VK_EXT_sample_locations and disable it Basically, this extension allows applications to use custom sample locations. It doesn't support variable sample locations during subpass. Note that we don't have to upload the user sample locations because the spec doesn't allow this. The extension is currently disabled because the driver needs to support variable sample locations during layout transitions. The depth decompress needs to know them and that's a bit invasive. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-30 09:52:16 +02:00
Kenneth Graunke	e917bb7ad4	iris: Avoid holding the lock while allocating pages. We only need the lock for: 1. Rummaging through the cache 2. Allocating VMA We don't need it for alloc_fresh_bo(), which does GEM_CREATE, and also SET_DOMAIN to allocate the underlying pages. The idea behind calling SET_DOMAIN was to avoid a lock in the kernel while allocating pages, now we avoid our own global lock as well. We do have to re-lock around VMA. Hopefully this shouldn't happen too much in practice because we'll find a cached BO in the right memzone and not have to reallocate it. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2019-05-30 00:46:37 -07:00
Kenneth Graunke	0cb380a6b3	iris: Move SET_DOMAIN to alloc_fresh_bo() Chris pointed out that the order between SET_DOMAIN and SET_TILING doesn't matter, so we can just do the page allocation when creating a new BO. Simplifies the flow a bit. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2019-05-30 00:15:26 -07:00
Kenneth Graunke	53878f7a89	iris: Be lazy about cleaning up purged BOs in the cache. Mathias Fröhlich reported that commit `6244da8e23` crashes. list_for_each_entry_safe is safe against removing the current entry, but iris_bo_cache_purge_bucket was potentially removing next entries too, which broke our saved next pointer. To fix this, don't bother with the iris_bo_cache_purge_bucket step. We just detected a single entry where the kernel has purged the BO's memory, and so it isn't a usable entry for our cache. We're about to continue the search with the next BO. If that one's purged, we'll clean it up too. And so on. We may miss cleaning up purged BOs that are further down the list after non-purged BOs...but that's probably fine. We still have the time-based cleaner (cleanup_bo_cache) which will take care of them eventually, and the kernel's already freed their memory, so it's not that harmful to have a few kicking around a little longer. Fixes: `6244da8e23` iris: Dig through the cache to find a BO in the right memzone Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2019-05-29 23:38:01 -07:00
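The iteration rule described in `53878f7a89` can be sketched in C roughly as follows. This is a minimal illustration and not the iris code: `struct node`, `list_init`, `list_add_tail`, `list_del`, and `find_usable` are hypothetical stand-ins for Mesa's list helpers and the BO cache walk. The point is that the loop saves `next` before touching `cur`, so unlinking `cur` is safe, whereas unlinking any entry after `cur` would leave the saved `next` dangling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal circular doubly linked list (not Mesa's util/list.h). */
struct node {
   struct node *prev, *next;
   int purged;   /* stands in for "the kernel purged this BO's memory" */
};

static void list_init(struct node *head)
{
   head->prev = head->next = head;
}

static void list_add_tail(struct node *head, struct node *n)
{
   n->prev = head->prev;
   n->next = head;
   head->prev->next = n;
   head->prev = n;
}

static void list_del(struct node *n)
{
   n->prev->next = n->next;
   n->next->prev = n->prev;
   n->prev = n->next = NULL;
}

/* Return the first non-purged entry, lazily unlinking purged entries that
 * are met on the way. Only `cur` is ever unlinked inside the loop, so the
 * saved `next` pointer stays valid for the following iteration. */
static struct node *find_usable(struct node *head)
{
   assert(head != NULL);
   for (struct node *cur = head->next, *next = cur->next;
        cur != head;
        cur = next, next = cur->next) {
      if (cur->purged) {
         list_del(cur);   /* safe: `next` was saved before the unlink */
         continue;
      }
      return cur;
   }
   return NULL;
}
```

Purged entries that sit behind the first usable one are left in place here, which mirrors the commit's remark that a later time-based cleanup can deal with them.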
Kenneth Graunke	6244da8e23	iris: Dig through the cache to find a BO in the right memzone This saves some util_vma thrash when the first entry in the cache happens to be in a different memory zone, but one just a tiny bit ahead is already there and instantly reusable. Hopefully the cost of a little extra searching won't break the bank - if it does, we can consider having separate list heads or keeping a separate VMA cache. Improves OglDrvRes performance by 22%, restoring a regression from deleting the bucket allocators in `694d1a08d3`. Thanks to Clayton Craft for alerting me to the regression. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-29 20:03:45 -07:00
Kenneth Graunke	4c2d9729df	iris: Tidy BO sizing code and comments Buckets haven't been power of two sized in over a decade. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-29 19:42:15 -07:00
Kenneth Graunke	7acc88a47c	iris: Move some field setting after we drop the lock. It's not much, but we may as well hold the lock for a bit less time. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-29 19:42:04 -07:00
Kenneth Graunke	76c5a19668	iris: Move cached BO allocation into a helper function. There's enough going on here to warrant a helper. This also simplifies the control flow and eliminates the last non-error-case goto. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-29 19:41:52 -07:00
Kenneth Graunke	cea6671395	iris: Fall back to fresh allocations if mapping for zero-memset fails. It is unlikely that we would fail to map a cached BO in order to zero its contents. When we did, we would free the first BO in the cache and try again with the second. It's possible that this next BO already had a map set up, in which case we'd succeed. But if it didn't, we'd likely fail again in the same manner. There's not much point in optimizing this case (and frankly, if we're out of CPU-side VMA we should probably dump the cache entirely)...so instead, just fall back to allocating a fresh BO from the kernel which will already be zeroed so we don't have to try and map it. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-29 19:41:50 -07:00
Kenneth Graunke	042f8514e6	iris: Move fresh BO allocation into a helper function. There's enough going on here to warrant a helper. More cleaning coming. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-29 19:41:22 -07:00
Kenneth Graunke	06421e5be7	iris: Do SET_TILING at a single point rather than in two places. Both the from-cache and fresh-from-GEM cases were calling SET_TILING. In the cached case, we would retry the allocation on failure, pitching one BO from the cache each time. This is silly, because the only time it should fail is if the tiling or stride parameters are unacceptable, which has nothing to do with the particular BO in question. So there's no point in retrying - we should simply fail the allocation. This patch moves both calls to bo_set_tiling_internal() below the cache/fresh split, so we have it at a single point in time instead of two. To preserve the ordering between SET_TILING and SET_DOMAIN, we move that below as well. (I am unsure if the order matters.) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-29 19:41:08 -07:00
Kenneth Graunke	43d835cb0f	iris: Use the BO cache even for coherent buffers on non-LLC. We mark snooped BOs as non-reusable, so we never return them to the cache. This means that we'd need to call I915_GEM_SET_CACHING to make any BO we find in the cache snooped. But then again, any BO we freshly allocate from the kernel will also be non-snooped, so it has the same issue. There's really no reason to skip the cache - we may as well use it to avoid the I915_GEM_CREATE overhead. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-29 19:40:18 -07:00
Kenneth Graunke	78003014d0	iris: Fix locking around vma_alloc in iris_bo_create_userptr util_vma needs to be protected by a lock. All other callers of vma_alloc and vma_free appear to be holding a lock already. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-29 19:40:16 -07:00
Kenneth Graunke	5fc11fd988	iris: Fix lock/unlock mismatch for non-LLC coherent BO allocation. The goto jumped over the mtx_lock, but proceeded to hit the mtx_unlock. We can simply set the bucket to NULL and it will skip the cache without goto, and without messing up locking. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-29 19:40:15 -07:00
Marek Olšák	2285b93032	radeonsi: fix timestamp queries for compute-only contexts Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2019-05-29 21:13:35 -04:00
Marek Olšák	b5697c311b	Change a few frequented uses of DEBUG to !NDEBUG debugoptimized builds don't define NDEBUG, but they also don't define DEBUG. We want to enable cheap debug code for these builds. I only chose those occurrences that I care about. Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-29 21:13:35 -04:00
Kenneth Graunke	0f1b68ebee	iris: Re-emit Surface State Base Address when context is lost. When we hit a GPU hang, we failed to reset Surface State Base Address right away, and would keep hanging until we filled up the binder. Then we'd finally get it right after a lot of repeated stumbles. Update it right away so we hopefully hang fewer times before succeeding.	2019-05-29 16:35:02 -07:00
Jason Ekstrand	e459d6d6df	iris: Enable nir_opt_large_constants Shader-db results on Kaby Lake: total instructions in shared programs: 15306230 -> 15304726 (<.01%) instructions in affected programs: 4570 -> 3066 (-32.91%) helped: 16 HURT: 0 total cycles in shared programs: 361703436 -> 361680041 (<.01%) cycles in affected programs: 129388 -> 105993 (-18.08%) helped: 16 HURT: 0 LOST: 0 GAINED: 2 The helped programs were in XCom 2, Deus Ex: Mankind Divided, and Kerbal Space Program Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-29 21:09:16 +00:00
Jason Ekstrand	9dc57eebd5	iris: Don't assume UBO indices are constant It will be true for the constant/system value buffer because they use a constant zero but it's not true in general. If we ever got here when the source wasn't constant, nir_src_as_uint would assert. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2019-05-29 21:09:16 +00:00
Jason Ekstrand	744f93f5c1	iris: Move upload_ubo_ssbo_surf_state to iris_program.c Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-29 21:09:16 +00:00
Brian Paul	e584fd894e	nir: silence three compiler warnings seen with MinGW Silence two unused var warnings. And init elem_size, elem_align to zero to silence "maybe uninitialized" warnings. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-29 13:59:24 -06:00
Brian Paul	c71ca65405	svga: clamp max_const_buffers to SVGA_MAX_CONST_BUFS In case the device reports 15 (or more) buffers. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2019-05-29 13:59:23 -06:00
Kenneth Graunke	6892d2b94a	iris: Clone before calling nir_strip and serializing This is non-destructive and leaves the debugging information in place. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-29 18:16:32 +00:00
Kenneth Graunke	e1409aead5	iris: Only store the SHA1 of the NIR in iris_uncompiled_shader Jason pointed out that we don't need to keep an entire copy of the serialized NIR around, we just need the SHA1. This does change our disk cache key to be taking a SHA1 of a SHA1, which is a bit odd, but should work out and be faster and use less memory. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-29 18:16:32 +00:00
Caio Marcelo de Oliveira Filho	e45bf01940	spirv: Change spirv_to_nir() to return a nir_shader spirv_to_nir() returned the nir_function corresponding to the entrypoint, as a way to identify it. There's now a bool is_entrypoint in nir_function and also a helper function to get the entry_point from a nir_shader. The return type reflects better what the function name suggests. It also helps drivers avoid the mistake of reusing internal shader references after running NIR_PASS on it. When using NIR_TEST_CLONE or NIR_TEST_SERIALIZE, those would be invalidated right in the first pass executed. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-29 10:34:35 -07:00
Caio Marcelo de Oliveira Filho	a3bfdacb6c	radv: Don't re-use entry_point pointer from spirv_to_nir Replace its uses with checking for is_entrypoint and calling nir_shader_get_entrypoint(). This is a preparation to change spirv_to_nir() return type. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-29 10:34:35 -07:00
Caio Marcelo de Oliveira Filho	ee59bac9f4	glspirv: Don't re-use entry_point pointer from spirv_to_nir Replace its use with checking for is_entrypoint. This is a preparation to change spirv_to_nir() return type. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-29 10:34:30 -07:00
Caio Marcelo de Oliveira Filho	c92d002982	turnip: Don't re-use entry_point pointer from spirv_to_nir Replace its uses with nir_shader_get_entrypoint(), and change the helper function to return nir_shader *. This is a preparation to change spirv_to_nir() return type. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-29 10:26:22 -07:00
Chia-I Wu	0a0be7aee0	virgl: fix readback with pending transfers When readback is true, and there are pending writes in the transfer queue, we should flush to avoid reading back outdated data. This fixes piglit arb_copy_buffer/dlist and a subtest of arb_copy_buffer/data-sync. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-05-29 16:47:04 +00:00
Caio Marcelo de Oliveira Filho	8bdf5a008b	nir: Allow derefs to be used as phi sources It is possible and valid for a pointer to be selected based on a conditional before used, and depending on the mode, those cases will result in a phi with derefs as sources. To achieve this, we don't rematerialize derefs that are used by phis. As a consequence, when converting from SSA to regs, we may have phis that come from different blocks and are used by phis. We now convert those to regs too. Validation was added to ensure only derefs of certain modes can be used as phi sources. No extra validation is needed for the presence of cast, any instruction that uses derefs will validate the deref-chain is complete (ending in a cast or a var). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-29 08:19:15 -07:00
Connor Abbott	ee2a92bcde	radeonsi: Fix editorconfig At least on vim, indenting doesn't work without this. Copied from src/amd/vulkan. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-29 15:55:40 +02:00
Erik Faye-Lund	551b61528f	mesa/main: clean up extension-check for GL_SAMPLE_MASK Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-29 10:54:09 +02:00
Erik Faye-Lund	426e896515	mesa/main: clean up extension-check for GL_SAMPLE_SHADING Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-29 10:54:09 +02:00
Erik Faye-Lund	b9e9d701dc	mesa/main: correct extension-checks for GL_PRIMITIVE_RESTART_FIXED_INDEX This shouldn't be allowed in GLES 1/2. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-29 10:54:09 +02:00
Erik Faye-Lund	34ade0dc7c	mesa/main: correct extension-checks for GL_BLEND_ADVANCED_COHERENT_KHR KHR_blend_equation_advanced_coherent isn't exposed on OpenGL ES 1.x, so we shouldn't allow its enums there either. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-29 10:54:09 +02:00
Erik Faye-Lund	c0dabc6192	mesa/main: correct extension-checks for GL_FRAMEBUFFER_SRGB This enum shouldn't be allowed on OpenGL ES 1.x, so let's instead use the extension-helpers, and check for desktop and gles extensions separately. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-29 10:54:09 +02:00
Erik Faye-Lund	a33ff7876f	mesa/main: correct extension-checks for MESA_tile_raster_order This extension isn't enabled for GLES 1.x, so we shouldn't allow the state there. Let's use the extension-helpers instead of CHECK_EXTENSION for this. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-29 10:54:09 +02:00
Erik Faye-Lund	bf91d6ae4a	mesa/main: make the CONSERVATIVE_RASTERIZATION_NV checks consistent This just makes the logic of the checks for this enum the same for gl{Enable,Disable} and for glIsEnabled. They are already functionally the same, so this is just a minor code-cleanup. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-29 10:54:09 +02:00
Erik Faye-Lund	00c683bc8e	mesa/main: make the PRIMITIVE_RESTART_NV checks consistent {En,Dis}ableClientState(PRIMITIVE_RESTART_NV) should only work on compatibility contexts. While we're at it, modernize the code a bit, by using the extension helpers instead of open-coding. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-29 10:54:09 +02:00
Samuel Pitoiset	d3771ccaa3	radv: use view format when selecting the resolve path for subpasses Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-29 08:53:48 +02:00
Samuel Pitoiset	017170a785	radv: always use view format when performing subpass resolves It makes sense to use the image view formats when resolving inside subpasses, while we have to use the image formats for normal resolves. Original patch by Philip Rebohle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110348 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-29 08:53:46 +02:00
Samuel Pitoiset	eaeaad25f7	radv: sync before resetting a pool if there are active pending queries Make sure to sync all previous work if the given command buffer has pending active queries. Otherwise the GPU might write query data after the reset operation. This fixes a bunch of new dEQP-VK.query_pool.* CTS failures. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-29 08:47:54 +02:00
Kenneth Graunke	bc273dece2	intel/decoder: Use get_state_size() over guessed counts in more cases This makes the following packets use actual driver provided sizes rather than guessing an arbitrary number: - CC_VIEWPORT - SF_CLIP_VIEWPORT - BLEND_STATE - COLOR_CALC_STATE - SCISSOR_RECT Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-05-28 13:44:16 -07:00
Mike Lothian	29ea92e6a1	meson: Link Gallium drivers with ld_args_build_id Link all Gallium drivers with ld_args_build_id to prevent failures in Iris that uses GNU_BUILD_ID Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=110757 Fixes: `4756864cdc` "iris: Start wiring up on-disk shader cache" Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-28 13:37:36 -07:00
Lionel Landwerlin	366811bedb	nir/lower_non_uniform: safely iterate over blocks This fixes a problem where the same instruction gets replaced twice. This was happening when the replaced instruction would be at the end of a block. Replacement of : if ssa_8 { .... intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */ } Would be : if ssa_8 { loop { vec1 32 ssa_47 = intrinsic read_first_invocation (ssa_44) () vec1 1 ssa_48 = ieq ssa_47, ssa_44 if ssa_48 { loop { vec1 32 ssa_49 = intrinsic read_first_invocation (ssa_44) () vec1 1 ssa_50 = ieq ssa_49, ssa_44 if ssa_50 { intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */ break } else { .... } Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3bd5457641` ("nir: Add a lowering pass for non-uniform resource access") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-28 20:23:16 +01:00
Samuel Pitoiset	47a10edefb	radv: allocate more space in the CS when emitting events If the driver waits for CP DMA to be idle and emit an EOP event we need more space. This fixes a crash with Quake Champions. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-28 16:56:17 +02:00
Kenneth Graunke	6a9e39d44b	iris: Ask st to vectorize our IO. (Technically this is common code, but it doesn't affect i965 or anv.) Improves performance of GFXBench5/gl_tess_off on Skylake GT4e at 1080p by 9.3933% +/- 0.0305157% by eliminating all spilling in the GS. Improves performance of GFXBench5/gl_4_off (Car Chase) on Skylake GT4e at 1080p by 0.325208% +/- 0.0842233% (n=18). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-28 01:06:48 -07:00
Kenneth Graunke	c31b4420e7	st/nir: Re-vectorize shader IO We scalarize IO to enable further optimizations, such as propagating constant components across shaders, eliminating dead components, and so on. This patch attempts to re-vectorize those operations after the varying optimizations are done. Intel GPUs are a scalar architecture, but IO operations work on whole vec4's at a time, so we'd prefer to have a single IO load per vector rather than 4 scalar IO loads. This re-vectorization can help a lot. Broadcom GPUs, however, really do want scalar IO. radeonsi may want this, or may want to leave it to LLVM. So, we make a new flag in the NIR compiler options struct, and key it off of that, allowing drivers to pick. (It's a bit awkward because we have per-stage settings, but this is about IO between two stages...but I expect drivers to globally prefer one way or the other. We can adjust later if needed.) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-28 01:06:48 -07:00
Mathias Fröhlich	1d0a8cf40d	mesa: Prevent classic swrast crash on a surfaceless context v2. This fixes the egl_mesa_platform_surfaceless piglit test as well as the new egl_ext_device_base piglit test on classic swrast. v2: Fix swrast surfaceless contexts on the driver side. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-28 08:27:16 +02:00
Samuel Pitoiset	15cb19ed6f	radv: add radv_get_resolve_pipeline() in the compute path Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-28 08:17:26 +02:00
Samuel Pitoiset	469258c3b1	radv: cleanup the compute resolve path for subpass This makes use of radv_meta_resolve_compute_image() by filling a VkImageResolve region instead of duplicating code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-28 08:17:23 +02:00
Timothy Arceri	d2b0246741	radeonsi: add drirc workaround for American Truck Simulator Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110711	2019-05-28 08:47:44 +10:00
Timothy Arceri	11e16ca7ce	Revert "st/mesa: expose 0 shader binary formats for compat profiles for Qt" This reverts commit `55376cb31e`. It's been over a year and both QT 5.9.5 and 5.11.0 contained a fix for the original issue. It seems i965 only ever applied this workaround to the 18.0 branch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-28 08:46:50 +10:00
Lionel Landwerlin	2042f22e28	anv: fix apply_pipeline_layout pass for arrays of YCbCr descriptors When using the binding tables to access arrays of YCbCr descriptors we did not consider the offset of the accessed element. We can't do a simple multiple because the binding table entries are tightly packed. For example element 0 of the array could use 2 entries/planes and element 1 could use 2 entries/planes. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3bb8768b9d` ("anv: toggle on support for VK_EXT_ycbcr_image_arrays") Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-05-27 22:47:53 +01:00
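The offset computation that `2042f22e28` describes can be illustrated with a small C sketch. This is an assumption-laden CPU-side model, not the anv NIR pass: `ycbcr_array_element_offset` and the `plane_count` array are hypothetical names. Because binding table entries are tightly packed, the first entry of element `index` is the sum of the plane counts of all earlier elements, not `index` times a fixed stride.

```c
#include <assert.h>
#include <stdint.h>

/* plane_count[i] is the number of binding table entries (planes) used by
 * array element i. Hypothetical input, chosen only for illustration. */
static uint32_t
ycbcr_array_element_offset(const uint8_t *plane_count, uint32_t count,
                           uint32_t index)
{
   assert(index < count);

   /* Tightly packed entries: accumulate the planes of every earlier element. */
   uint32_t offset = 0;
   for (uint32_t i = 0; i < index; i++)
      offset += plane_count[i];
   return offset;
}
```

With hypothetical plane counts of 2, 3, and 1, element 2 starts at entry 5, whereas multiplying the index by the first element's plane count would give 4.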
Marek Olšák	fccced57cf	radeonsi: clean up winsys creation - unify the code - choose radeon or amdgpu based on the DRM version, not based on which one succeeds first	2019-05-27 15:26:06 -04:00
Marek Olšák	bb5d82bd06	radeonsi: allow query functions for compute-only contexts	2019-05-27 15:26:06 -04:00
Marek Olšák	b257956021	ac: treat Mullins as Kabini, remove the enum it's the same design	2019-05-27 15:10:51 -04:00
Christian Gmeiner	37af75f88c	etnaviv: rs: choose clear format based on block size Fixes following piglit and does not introduce any regressions. spec@ext_packed_depth_stencil@fbo-depth-gl_depth24_stencil8-blit Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-05-27 20:55:11 +02:00
Vasily Khoruzhick	af0de6b91c	lima/ppir: implement discard and discard_if This commit also adds codegen for branch since we need it for discard_if. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-05-27 07:39:03 -07:00
Samuel Pitoiset	7a7be61398	radv: ignore the loadOp if the first use of an attachment is a resolve Based on ANV. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-27 13:52:39 +02:00
Samuel Pitoiset	ff27eb509a	radv: always dirty the framebuffer when restoring a subpass The old code was not wrong because the transitions performed after the resolves should re-emit the framebuffer if needed. This change is mostly a no-op but it improves consistency regarding other meta operations that need to save/restore subpasses. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-27 13:52:36 +02:00
Samuel Pitoiset	9af15986b0	radv: add radv_clear_htile() helper This helper will be useful for clearing HTILE after some depth/stencil resolves. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-27 13:52:34 +02:00
Chenglei Ren	13b38ca1e4	anv/android: fix missing dependencies issue during parallel build The libmesa_anv_gen* modules require anv_extensions.h, patch makes sure it gets generated as a dependency before building them. Signed-off-by: Chenglei Ren <chenglei.ren@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: <mesa-stable@lists.freedesktop.org>	2019-05-27 10:13:17 +03:00
Samuel Pitoiset	2d2e7954c3	radv: tidy up GetQueryPoolResults for occlusion queries Just move the block that checks the availability bit into the switch like other query types. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-27 08:50:55 +02:00
Kenneth Graunke	b5fa3abfc2	iris: Don't flag IRIS_DIRTY_URB after BLORP operations unless it changed We already flag IRIS_DIRTY_URB when we change it, but we were additionally flagging it on every BLORP operation, even if we didn't.	2019-05-26 17:45:18 -07:00
Dave Airlie	7fe5a8e874	Revert "mesa: unreference current winsys buffers when unbinding winsys buffers" This reverts commit `12bf7cfecf`. This commit caused lots of problems: https://bugs.freedesktop.org/show_bug.cgi?id=110721 https://bugs.freedesktop.org/show_bug.cgi?id=110761 Fixes: `12bf7cfecf` ("mesa: unreference current winsys buffers when unbinding winsys buffers") Pushing without review as we need to get it into next stable.	2019-05-27 09:36:28 +10:00
Alyssa Rosenzweig	659aa3dd65	panfrost/midgard: Implement fneg/fabs/fsat Fix a regression I inadvertently caused by acking typeless movs before implementing/pushing this. *whistles* Nothing to see here, move along folks. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-26 03:16:37 +00:00
Qiang Yu	1dc593e9b9	lima: fix lima_blit with non-zero level source resource lima_blit will do blit between resources with different levels. When blitting from a level!=0 source, it will sample from that level of the resource as a texture. The current texture setup doesn't respect the level when no mipmap filter is used. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-05-25 12:41:44 +08:00
Qiang Yu	54490b0b36	lima: fix render to non-zero level texture Current implementation won't respect level of surface to render. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-05-25 12:41:31 +08:00
Dylan Baker	9838185056	editorconfig: Fix meson style The syntax was wrong, resulting in it not working at all. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-24 18:44:18 +00:00
Chia-I Wu	ea1e0acfd0	virgl: remove an incorrect check in virgl_res_needs_flush Imagine this resource_copy_region(ctx, dst, ..., src, ...); transfer_map(ctx, src, 0, PIPE_TRANSFER_WRITE, ...); at the beginning of a cmdbuf. We need to flush in transfer_map so that the transfer is not reordered before the resource copy. The check for "vctx->num_draws == 0 && vctx->num_compute == 0" is not enough. Removing the optimization entirely. Because of the more precise resource tracking in the previous commit, I hope the performance impact is minimized. We will have to go with perfect resource tracking, or attempt a more limited optimization, if there are specific cases we really need to optimize for. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-24 17:37:40 +00:00
Chia-I Wu	56f9b60e50	virgl: reemit resources on first draw/clear/compute This gives us more precise resource tracking. It can be beneficial because glFlush is often followed by state changes. We don't want to reemit resources that are going to be unbound. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-24 17:37:40 +00:00
Chia-I Wu	424ec2356b	virgl: add missing emit_res for SO targets Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-24 17:37:40 +00:00
Roland Scheidegger	d4e8a44bf6	gallivm: fix default cbuf info. The default null_output really needs to be static, otherwise the values we'll eventually get later are doubly random (they are not initialized, and even if they were it's a pointer to a local stack variable). VMware bug 2349556. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-05-24 19:22:50 +02:00
Roland Scheidegger	84f3f1cf00	scons: fix build with llvm 9. The x86asmprinter component is gone, and things seem to work by just removing it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110707 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-05-24 18:28:28 +02:00
Tomeu Vizoso	9fe1a925e2	panfrost: Dereference sampled texture We are currently leaking resources if they were sampled from. Once we are done with a sampler, we should dereference that resource. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-24 16:50:09 +02:00
Tomeu Vizoso	3c81010213	panfrost: ci: Avoid pulling Docker image on every run Jump over the container stage if we haven't changed any of the files that are involved in building the container images. This saves 1-2 minutes in each run and helps conserve resources. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-24 16:50:09 +02:00
Jason Ekstrand	f2dc0f2872	nir: Drop imov/fmov in favor of one mov instruction The difference between imov and fmov has been a constant source of confusion in NIR for years. No one really knows why we have two or when to use one vs. the other. The real reason is that they do different things in the presence of source and destination modifiers. However, without modifiers (which many back-ends don't have), they are identical. Now that we've reworked nir_lower_to_source_mods to leave one abs/neg instruction in place rather than replacing them with imov or fmov instructions, we don't need two different instructions at all anymore. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Acked-by: Rob Clark <robdclark@chromium.org>	2019-05-24 08:38:11 -05:00
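The claim in `f2dc0f2872` that imov and fmov only differ once modifiers are involved can be sketched in C. This is a simplified model and not NIR code: `mov`, `fmov_neg`, and `imov_neg` are hypothetical helpers operating on raw 32-bit values, and the negate modifier is assumed to mean a floating-point negate for fmov and an integer negate for imov. IEEE 754 single-precision floats are also assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Without modifiers a move is a plain bit copy, whatever the type. */
static uint32_t mov(uint32_t src)
{
   return src;
}

/* Float move with a negate modifier: reinterpret the bits as a float and
 * negate that value. */
static uint32_t fmov_neg(uint32_t src)
{
   float f;
   assert(sizeof(f) == sizeof(src));
   memcpy(&f, &src, sizeof(f));
   f = -f;
   memcpy(&src, &f, sizeof(src));
   return src;
}

/* Integer move with a negate modifier: two's complement negate. */
static uint32_t imov_neg(uint32_t src)
{
   return 0u - src;
}
```

For the bit pattern of 1.0f (`0x3f800000`), the float negate and the integer negate produce different bits, while the unmodified move returns the input unchanged. That is why a single mov suffices once modifiers are no longer folded into moves.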
Jason Ekstrand	22421ca7be	nir/builder: Merge nir_[if]mov_alu into one nir_mov_alu helper Unless source modifiers are present, fmov and imov are the same. There's no good reason for having two helpers. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-24 08:38:11 -05:00
Jason Ekstrand	cd73b6174b	nir/lower_to_source_mods: Stop turning add, sat, and neg into mov Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-24 08:38:11 -05:00
Jason Ekstrand	2a39788d03	nir/source_mods: Add helpers for setting source modifiers It's potentially a tiny bit less efficient but the helpers make it much easier to sort out the rules for updating source modifiers. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-24 08:38:11 -05:00
Jason Ekstrand	8ffbb54405	intel: Implement abs, neg, and sat in the back-end Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-24 08:38:11 -05:00
Jason Ekstrand	4fde459563	intel/nir: Call alu_to_scalar one last time before out-of-ssa A few of our very late passes can end up generating vectors accidentally so we need to get rid of them. The only known case of this is the ffma peephole which generates fneg and fabs as vectors. Currently, they're not a problem because they get turned into fmov which the back-end compiler knows how to handle as a vector. That's about to change. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-24 08:38:11 -05:00
Jason Ekstrand	ddd08e1888	nir/builder: Remove the use_fmov parameter from nir_swizzle This flag has caused more confusion than good in most cases. You can validly use imov for floats or fmov for integers because, without source modifiers, neither modify their input in any way. Using imov for floats is more reliable so we go that direction. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-24 08:38:11 -05:00
Jason Ekstrand	6c2ca2a5d3	ptn,ttn: Use nir_channel for selecting channels Both of these passes predate the nir_channel helper. We should just use it instead of hand-rolling it in both passes. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-24 08:38:11 -05:00
Michel Zou	88eb2a1f7e	scons: For MinGW use -posix flag. Signed-off-by: Jose Fonseca <jfonseca@vmware.com>	2019-05-24 12:18:40 +01:00
Christian Gmeiner	78fb5594be	etnaviv: use the correct uniform dirty bits Found during code inspection. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-05-24 12:41:43 +02:00
Danylo Piliaiev	c82dcf89ae	anv: Do not emulate texture swizzle for INPUT_ATTACHMENT, STORAGE_IMAGE If descriptorType is VK_DESCRIPTOR_TYPE_STORAGE_IMAGE or VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT, the imageView member of each element of pImageInfo must have been created with the identity swizzle. Fixes: `d2aa65eb` Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-24 09:20:38 +00:00
Tapani Pälli	397fe0cc50	st/dri: enable EGL_ANDROID_blob_cache on gallium drivers Verified to work properly with Iris driver on Android Celadon. Cache files get generated as 'com.android.opengl.shaders_cache' for each application. v2: check that cache was returned (Eric Engestrom) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-24 09:17:04 +03:00
Alyssa Rosenzweig	ea6b581444	panfrost: Remove the standalone compiler Now that the online compiler and pandecode are reliable and upstreamed, nobody is using this. If somebody does need it, it should be easy enough to bring back, I suppose. At the moment, it's just a maintenance hazard, since meson is silly and does double builds for compiler updates (triple for disassembler changes). If people need the standalone _disassembler_, that can be added trivially into pandecode (pandecode already includes the disassembler). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-05-24 03:10:43 +00:00
Eric Engestrom	8d386e6eef	vk/util: suppress warning about out-of-enum android value src/vulkan/util/vk_enum_to_str.c: In function ‘vk_structure_type_size’: src/vulkan/util/vk_enum_to_str.c:3335:9: warning: case value ‘1000010000’ not in enumerated type ‘VkStructureType’ {aka ‘const enum VkStructureType’} [-Wswitch] case VK_STRUCTURE_TYPE_NATIVE_BUFFER_ANDROID: return sizeof(VkNativeBufferANDROID); ^~~~ Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-23 15:28:43 +00:00
Kenneth Graunke	25afbb04c2	iris: Advertise coherent framebuffer fetches This lets us advertise GL_EXT_shader_framebuffer_fetch and GL_KHR_blend_equation_advanced_coherent support.	2019-05-23 08:13:10 -07:00
Kenneth Graunke	cca8af0c7d	gallium: Add PIPE_CAP_FBFETCH_COHERENT and expose extensions st/mesa now exposes KHR_blend_equation_advanced_coherent and EXT_shader_framebuffer_fetch if the new capability is supported. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-23 08:13:09 -07:00
Kenneth Graunke	87f4286137	st/mesa: Advertise GL_EXT_shader_framebuffer_fetch_non_coherent This extension requires the ability to read from all render targets, so we only enable it if PIPE_CAP_FBFETCH >= PIPE_CAP_MAX_RENDER_TARGETS. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-23 08:13:09 -07:00
Kenneth Graunke	a2d7834457	gallium: Change PIPE_CAP_TGSI_FS_FBFETCH bool to PIPE_CAP_FBFETCH count TGSI's FBFETCH instruction currently only supports reading from a single render target, but NIR intrinsics can support multiple render targets. radeonsi can only support fetching from RT 0, but other drivers may be able to support fetching from any render target. To express this, this patch renames PIPE_CAP_TGSI_FS_FBFETCH to simply PIPE_CAP_FBFETCH, and converts it from a boolean "is FBFETCH supported?" to an integer number of render targets which can be fetched. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-23 08:13:07 -07:00
Kenneth Graunke	7d2b54e393	iris: Record state sizes for INTEL_DEBUG=bat decoding. Felix noticed a crash when using INTEL_DEBUG=bat decoding. It turned out that we were sometimes placing variable length data near the end of a buffer, and with the decoder guessing random lengths rather than having an actual count, it was walking off the end and crashing. So this does more than improve the decoder output. Unfortunately, this is a bit more complicated than i965's handling, because we don't have a single state buffer. Various places upload data via u_upload_mgr, and so there isn't a central place to record the size. We don't need to catch every single place, however, since it's only important to record variable length packets (like viewports and binding tables). State data also lives arbitrarily long, rather than being discarded on every batch like i965, so we don't know when to clear out old entries either. (We also don't have a callback when an upload buffer is released.) So, this tracking may space leak over time. That's probably okay though, as this is only a debugging feature and it's a slow leak. We may also get lucky and overwrite existing entries as we reuse BOs, though I find this unlikely to happen. The fact that the decoder works in terms of offsets from a state base address is also not ideal, as dynamic state base address and surface state base address differ for iris. However, because dynamic state addresses start from the top of a 4GB region, and binding tables start from addresses [0, 64K), it's highly unlikely that we'll get overlap. We can always improve this, but for now it's better than what we had.	2019-05-23 08:07:08 -07:00
Eric Engestrom	00cfeacf31	vk/util: drop no-op compiler warning workaround `-Wswitch` applies to `switch()`, not `case:`, and is bypassed by the presence of a `default:` anyway, so let's drop the `default:` and move the warning suppression to where it can make a difference, and then it turns out that we don't need to keep a list of special cases anymore :) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-23 15:06:11 +00:00
Erik Faye-Lund	90e7ce5bde	mesa/main: make the CONSERVATIVE_RASTERIZATION_INTEL checks consistent INTEL_conservative_rasterization isn't exposed on compatibility contexts, nor for GLES 3.0 and below. We already do this correctly for gl{Enable,Disable}, but we should do the same for glIsEnabled as well. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-23 11:43:18 +02:00
Erik Faye-Lund	0dff3eecda	mesa/main: make the FRAGMENT_PROGRAM checks consistent IsEnabled(FRAGMENT_PROGRAM) isn't supposed to be allowed, but our check allowed this anyway. Let's make these checks consistent, and while we're at it, modernize them a bit. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-23 11:35:55 +02:00
Erik Faye-Lund	147751a856	mesa/main: make the TEXTURE_CUBE_MAP checks consistent IsEnabled(TEXTURE_CUBE_MAP) isn't supposed to be allowed, but our check allowed this anyway. Let's make these checks consistent, and while we're at it, modernize them a bit. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-23 11:35:55 +02:00
Erik Faye-Lund	182d75d2a5	mesa/main: remove duplicate macros These are already defined as the exactly same, so let's get rid of the duplicate definitions. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-23 11:35:55 +02:00
Erik Faye-Lund	e002763c99	mesa/main: remove unused argument The 'CAP' argument has been unused in both of these macros since 2010, so let's get rid of it from both. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-23 11:35:55 +02:00
Erik Faye-Lund	619b2c9a7d	mesa/main: remove unused macro The first version of this macro is unused, so let's get rid of it. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-05-23 11:35:55 +02:00
Timothy Arceri	a482cf6ab2	glsl: simplify resource list building code This greatly simplifies the code to calculate if we should add a buffer to the resource list. This uses the spec rules and simple math to decide if we should add the buffer rather than complex string processing. This patch refines a patch present in the ARB_gl_spirv merge request for the NIR linker and applies it to the GLSL IR linker. This is why we also move the function to the shared linker code, because we will want to reuse the code for the NIR linker also. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-05-23 15:06:20 +10:00
Chia-I Wu	96c2851586	virgl: track valid buffer range for transfer sync virgl_transfer_queue_is_queued was used to avoid flushing. That fails when the resource is being accessed by previous cmdbufs but not the current one. The new approach of tracking the valid buffer range does not apply to textures however. But hopefully it is fine because the goal is to avoid waiting for this scenario glBufferSubData(..., offset, size, data1); glDrawArrays(...); // append new vertex data glBufferSubData(..., offset+size, size, data2); glDrawArrays(...); If glTex(Sub)Image* turns out to be an issue, we will need to track valid level/layer ranges as well. v2: update virgl_buffer_transfer_extend as well v3: do not remove virgl_transfer_queue_is_queued Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> (v1) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> (v2)	2019-05-22 09:28:19 -07:00
Chia-I Wu	440982cdd6	virgl: remove support for buffer surfaces st/mesa does not need it and virglrenderer does not really support it. Remove the support so that we are sure pipe_surface never refers to a buffer resource. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-22 09:28:19 -07:00
Chia-I Wu	fa9afb9de0	virgl: handle NULL shader resource explicitly When shader images/buffers are set, do not rely on virgl_encoder_write_res and virgl_resource_dirty to do the implicit NULL check. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-22 09:28:19 -07:00
Lionel Landwerlin	cb7c9b2a93	vulkan: fix build dependency issue with generated files On machines with many cores, you can run into that issue : ../mesa-9999/src/vulkan/overlay-layer/overlay.cpp:42:10: fatal error: vk_enum_to_str.h: No such file or directory v2: Move declare_dependency around (Eric) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Jan Ziak Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-22 14:07:14 +00:00
Greg V	506ebf55c0	gallium: enable dmabuf on BSD as well The DRM_CONF_SHARE_FD code did not check for Linux, so the commit that introduced PIPE_CAP_DMABUF broke Wayland-EGL clients on FreeBSD. Fixes: `8ae50e60` (gallium: replace DRM_CONF_SHARE_FD with PIPE_CAP_DMABUF) Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-22 13:20:31 +00:00
Tapani Pälli	ed563b79df	iris: fix android build Fixes: `4756864cdc` "iris: Start wiring up on-disk shader cache" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-22 14:01:41 +03:00
Philipp Zabel	1ccb8a071b	etnaviv: fill missing offset in etna_resource_get_handle Without this gbm_bo_get_offset() can return 0 where it shouldn't. Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: <mesa-stable@lists.freedesktop.org>	2019-05-22 12:57:40 +02:00
Samuel Pitoiset	32a0bc915a	radv: do not reset query pool during creation From the Vulkan spec 1.1.108: "After query pool creation, each query must be reset before it is used." So, the driver doesn't need to do this at creation time. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-22 08:36:41 +02:00
Samuel Pitoiset	e9bfd88183	radv: fix the sample max distance value for 8x It should be 7, not 8. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-22 08:36:39 +02:00
Samuel Pitoiset	bc4548ca3d	radv: emit correct centroid priority based on the number of samples Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-22 08:36:37 +02:00
Samuel Pitoiset	a7763ddcf2	radv: clean up the sample locations codebase Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-22 08:36:35 +02:00
Samuel Pitoiset	135dff8dcf	radv: remove remaining code related to 16 samples The driver only supports up to 8 samples. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-22 08:36:33 +02:00
Kenneth Graunke	6dc1c2d8bd	iris: Fix ALT mode regressions from shader cache We were checking this based on nir->info.name, but with the shader cache enabled, nir_strip throws out the name, causing us to use IEEE mode for ARB programs. gl-1.0-spot-light regressed because it wants ALT mode for 0^0 behavior. Fixes: `dc5dc727d5` iris: Serialize the NIR to a blob we can use for shader cache purposes.	2019-05-21 16:58:54 -07:00
Marek Olšák	d6053bf2a1	radeonsi: fix a regression in si_rebind_buffer Don't update non-buffer images. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110701 Fixes: `78e35df52a` "radeonsi: update buffer descriptors in all contexts after buffer invalidation" Cc: 19.1 <mesa-stable@lists.freedesktop.org> Tested-By: Gert Wollny <gert.wollny@collabora.com>	2019-05-21 18:58:03 -04:00
Kenneth Graunke	fb1d08dcfd	iris: Expose the disk cache to the state tracker as well. This lets st/nir cache the NIR for shaders, based on the shader source string hash, allowing us to skip initial compiles altogether, and also letting us start from there should we need to recompile for NOS. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-21 15:05:38 -07:00
Dylan Baker	601c9bc135	iris: Cache assembly shaders in the on-disk shader cache This implements storing and retrieving iris_compiled_shader objects from the on-disk shader cache. (by Dylan Baker and Kenneth Graunke)	2019-05-21 15:05:38 -07:00
Kenneth Graunke	dc5dc727d5	iris: Serialize the NIR to a blob we can use for shader cache purposes. We will use a hash of the serialized NIR together with brw_prog_*_key (for NOS) as the disk cache key, where the disk cache contains actual assembly shaders. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-21 15:05:38 -07:00
Dylan Baker	4756864cdc	iris: Start wiring up on-disk shader cache This creates the on-disk shader cache data structure, and handles the build-id keying aspects. The next commits will fill it out so it's actually used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-21 15:05:38 -07:00
Kenneth Graunke	6ae2caf201	iris: Move iris_uncompiled_shader definition to iris_context.h It had been internal to iris_program.c, but with the upcoming disk cache code, the "program module" is going to be spread across a couple source files. Into a header it goes! Now it lives alongside iris_compiled_shader, which makes sense. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-21 15:05:38 -07:00
Kenneth Graunke	419d9b21e1	intel: Move brw_prog_key_set_id from i965 to the compiler. I want to use it in iris. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-21 15:05:38 -07:00
Dylan Baker	b589c2547d	docs: update calendar, and news item and link release notes for 19.0.5	2019-05-21 14:25:36 -07:00
Dylan Baker	e2987f83ad	docs: Add Sha256 sums for 19.0.5	2019-05-21 14:23:16 -07:00
Dylan Baker	74e8dfecc8	docs: Add release notes for 19.0.5	2019-05-21 14:23:14 -07:00
Caio Marcelo de Oliveira Filho	9b9f7030c6	spirv: Drop GOOGLE suffix from names incorporated to SPIR-V SPV_GOOGLE_decorate_string and SPV_GOOGLE_hlsl_functionality1 were incorporated to SPIR-V. Let's pick the names used by SPIR-V core. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-05-21 11:52:41 -07:00
Caio Marcelo de Oliveira Filho	02d140ce9a	spirv: Pick the right bitsize when doing SpvUConvert Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-05-21 11:52:29 -07:00
Caio Marcelo de Oliveira Filho	fd94a45823	spirv: Trivially handle new 1.4 loop controls Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-05-21 11:52:12 -07:00
Caio Marcelo de Oliveira Filho	e21dee6c21	spirv: Update JSON and Headers to 1.4 This refers to commit c4f8f65792d4bf2657ca751904c511bbcf2ac77b from GitHub. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-05-21 11:50:58 -07:00
Caio Marcelo de Oliveira Filho	4b474e2e8a	spirv: Handle instruction aliases in spirv_info_c.py Choose the first we see in the grammar file as the main one. This is needed to parse SPIR-V 1.4 because it introduced opcode aliases. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-05-21 11:50:47 -07:00
Erik Faye-Lund	810b95e02c	Revert "glsl: do not use deprecated bison-keyword" This reverts commit `eb85124a9f`.	2019-05-21 17:53:54 +02:00
Eric Engestrom	93d900ece3	imgui: delete demo file Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-21 14:40:22 +01:00
Lionel Landwerlin	fd80f1e8d1	vulkan/overlay: update remaining manual error checks Through a series of rebases, I forgot to switch a bunch of error checks to use a macro that will show where the problem is, rather than printing out a dumb "ERROR!". Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-05-21 14:08:35 +01:00
Lionel Landwerlin	213d6527d4	vulkan/overlay: fix timestamp query emission with no pipeline stats The if (!pipe && timestamp) logic was broken. It should have been: if (!pipe && !timestamp) Let's just drop this condition as the following code does the right thing for all cases. An error was appearing with the following variables: VK_INSTANCE_LAYERS=VK_LAYER_MESA_overlay VK_LAYER_MESA_OVERLAY_CONFIG=gpu_timing Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `ea7a6fa980` ("vulkan/overlay: add pipeline statistic & timestamps support") Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-05-21 14:08:35 +01:00
Erik Faye-Lund	eb85124a9f	glsl: do not use deprecated bison-keyword %error-verbose has been deprecated since Bison 3.0, which was released in 2013. In Bison 3.3.1 which was recently released, this has started causing warnings. Let's update the code to do this in the modern way instead, to avoid cluttering the output needlessly. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-05-21 11:31:43 +00:00
Karol Herbst	67f9496893	glsl: handle 8 and 16 bit ints in glsl_base_type_is_integer Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-21 08:47:16 +00:00
Dave Airlie	4785e50e75	nir/test: add split vars tests (v2) This just adds some split var splitting tests, it verifies by counting derefs and local vars. a basic load from inputs, store to array, same as before but with a shader temp struct { float } [4] don't split test a basic load from inputs, with some out of band loads. a load/store of only half the array two level array, load from inputs store to all levels a basic load from inputs with an indirect store to array. two level array, indirect store to lvl 0 two level array, indirect store to lvl 1 load from inputs, store to array twice load from input, store to array, load from array, store to another array. load from input and copy deref to array create wildcard derefs, and do a copy v2: use array_imm helpers, move derefs out of loops, rename toplevel/secondlevel, use ints, fix lvl1 don't split test, rename globals to shader_temp, add comment, check the derefs type Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-21 13:43:28 +10:00
Caio Marcelo de Oliveira Filho	cf05ffbfd6	anv: Don't re-use entry_point pointer from spirv_to_nir When running with NIR_TEST_CLONE=1, the pointer will not be valid, as the whole shader is going to be recreated every pass. Prefer using is_entrypoint (to query when looping) and nir_shader_get_entrypoint() instead. Fixes the Vulkan Piglit tests - vulkan/glsl450/frexp-double - vulkan/glsl450/isinf-double - vulkan/shaders/fs-multiple-large-local-array Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108957 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-20 16:47:39 -07:00
Caio Marcelo de Oliveira Filho	005cc9ae37	nir: Fix clone of nir_variable state slots When num_state_slots is 0, don't create the array. This was triggering the following assert when running vkcube with NIR_TEST_CLONE=1 vkcube: ../src/compiler/nir/nir_split_per_member_structs.c:66: split_variable: Assertion `var->state_slots == NULL' failed. Fixes: `9fbd390dd4` "nir: Add support for cloning shaders" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-20 16:47:28 -07:00
Charmaine Lee	12bf7cfecf	mesa: unreference current winsys buffers when unbinding winsys buffers This fixes surface leak when no winsys buffers are bound. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2019-05-20 13:09:32 -07:00
Charmaine Lee	b480adfa5e	st/mesa: purge framebuffers with current context after unbinding winsys buffers With commit `c89e8470e5`, framebuffers are purged after unbinding context, but this change also introduces a heap corruption when running Rhino application on VMware svga device. Instead of purging the framebuffers after the context is unbound, this patch first unbinds the winsys buffers, then purges the framebuffers with the current context, and then finally unbinds the context. This fixes heap corruption. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2019-05-20 13:09:32 -07:00
Caio Marcelo de Oliveira Filho	7e5723d6d7	spirv: Generate proper NULL pointer values Use the storage class address format information to pick the right constant values for a NULL pointer. v2: Don't add a deref_cast to the values. (Jason) v3: Update to use vtn_storage_class_to_mode() and vtn_mode_to_address_format() explicitly. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Caio Marcelo de Oliveira Filho	83550b7dc4	spirv: Reuse helpers in vtn_handle_type() And change vtn_storage_class_to_mode() to accept NULL as interface_type. In this case, if we have a SpvStorageClassUniform, we assume it uses an ubo_addr_format, like the code being replaced by the helper. That assumption is a problem, but no different than the previous code. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Caio Marcelo de Oliveira Filho	48ea3bbff6	spirv: Add vtn_variable_mode_image Corresponding to SpvStorageClassImage. We see pointers for that storage class in tests, but don't use the storage class any further. Adding this so that we can call vtn_mode_to_address_format() for all supported pointers. v2: Fail when trying to create a SpvStorageClassImage variable. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Caio Marcelo de Oliveira Filho	672a3f42d9	spirv: Add vtn_mode_to_address_format() Handles all the modes and we can use it in combination with nir_address_format_to_glsl_type() to replace the vtn_ptr_type_for_mode() helper. Since the new helper is more generic, moved the assertions from the old one to the call sites. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Caio Marcelo de Oliveira Filho	192daf68a4	spirv: Add vtn_mode_uses_ssa_offset() Just the mode is needed to decide whether SSA offsets are needed, so make a function that takes that and reuse it for vtn_pointer_uses_ssa_offset(). This will be used for constant null pointers, that won't have a vtn_pointer handy. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Caio Marcelo de Oliveira Filho	f9336751bc	spirv: Add and use vtn_type_without_array() helper v2: Renamed from vtn_interface_type. (Jason) Accept any type not only pointers. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Caio Marcelo de Oliveira Filho	8af9de0a38	spirv: Change vtn_null_constant() to use vtn_type This is a preparation to handle OpConstantNull for pointers, we'll use the vtn_type to get to the address format and then the appropriate representation of NULL pointer. v2: Move rest of body to use vtn_type. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Caio Marcelo de Oliveira Filho	bdf2361b87	spirv: Export vtn_storage_class_to_mode() So we can reuse in spirv_to_nir.c. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Caio Marcelo de Oliveira Filho	f051fa6ad7	nir: Add nir_address_format_null_value() Returns the nir_const_value * with the representation of the NULL pointer for each address format. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Caio Marcelo de Oliveira Filho	31a7476335	spirv, radv, anv: Replace ptr_type with addr_format Instead of setting the glsl types of the pointers for each resource, set the nir_address_format, from which we can derive the glsl_type, and in the future the bit pattern representing a NULL pointer. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Caio Marcelo de Oliveira Filho	6bc9cdb1b7	nir: Add nir_address_format_32bit_offset This is a simple 32-bit address which is not a global address. Gives us a format that doesn't use 0 as its null pointer value. We will need this in anv to represent nir_var_mem_shared addresses. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Caio Marcelo de Oliveira Filho	bdaf41107a	nir: Add nir_address_format_logical An address format representing a purely logical addressing model. In this model, all deref chains must be complete from the dereference operation to the variable. Cast derefs are not allowed. These addresses will be 32-bit scalars but the format is immaterial because you can always chase the chain. E.g. push constants in anv. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 10:53:38 -07:00
Rob Clark	9f61aa3f75	freedreno/a6xx: WFI in program stateobj too This "fixes" hangs seen w/ various android games. I think a similar issue to with constant state, we need to avoid CP_LOAD_STATE until previous draw completes. It isn't entirely clear why blob doesn't need to do this, but it might have a different way to accomplish the same thing. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-20 09:10:12 -07:00
Rob Clark	abfb31acdb	freedreno/a6xx: make sure binning pass constlen is large enough Since we use same constant state for both binning pass program state and draw pass state, and it is possible for binning pass shader to use fewer consts, we need to make sure we program a large enough constlen. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-20 09:10:12 -07:00
Rob Clark	d200d58e65	freedreno/a6xx: limit IBO state to draw pass Currently we are only supporting images in FS (and CS) so limit this stateobj to draw pass. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-20 09:10:12 -07:00
Rob Clark	54d94f5780	freedreno/a6xx: don't evaluate FS tex state in binning pass It is unneeded since FS doesn't run in binning pass. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-20 09:10:12 -07:00
Samuel Pitoiset	daa85a882e	radv: decompress FMASK before performing a MSAA decompress using FMASK This fixes some CTS failures related to VK_EXT_sample_locations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-20 12:41:47 +02:00
Dave Airlie	6b2b150a66	nir/validate: fix crash if entry is null. we validate assert entry just before this, but since that doesn't stop execution, we need to check entry before the next validation assert. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-20 16:26:48 +10:00
Qiang Yu	a1d419603f	lima/gpir: switch to use nir_lower_viewport_transform Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-05-20 10:57:11 +08:00
Qiang Yu	a7688b2713	lima/gpir: support vector ssa load Some vector sysvals can't be lowered to scalar, so we need to break them into scalars in the nir to gpir conversion. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-05-20 10:57:11 +08:00
Qiang Yu	4a74e28130	lima/gpir: add helper function for emit load node Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-05-20 10:57:11 +08:00
Timothy Arceri	ac779ff2b7	util: add missing include to build_id.h Required to use uint8_t Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-05-20 10:24:23 +10:00
Alyssa Rosenzweig	1155446c19	panfrost/midgard: Split up midgard_compile.c (RA) This commit moves the register allocator out of midgard_compile.c and into its own midgard_ra.c file. In doing so, a number of dependencies are identified and moved into their own files in turn. midgard_compile.c is still fairly monolithic, but this should help. Code churn, but no functional changes should be introduced by this commit. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-19 23:37:45 +00:00
Alyssa Rosenzweig	9cd8cd26de	panfrost: Improve fixed-function blending This fixes a few miscellaneous issues with the fixed-function blending programming, though it is far from complete. For cases known to be buggy, we force a fallback to blend shaders. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-19 17:56:35 +00:00
Alyssa Rosenzweig	d1a9b760ea	panfrost: Wire up nir_lower_blend This implements blend shaders via nir_lower_blend, by creating dummy fragment shaders simply passing through the source color and using the new lowering pass to inject blendability. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-19 17:56:34 +00:00
Alyssa Rosenzweig	39104221e1	panfrost/midgard: Route new blending intrinsics To prepare for the new nir_lower_blend pass, we wire up the intrinsics for tilebuffer reads and constant colour loading. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-19 17:56:14 +00:00
Alyssa Rosenzweig	a1885b2a35	panfrost/nir: Add nir_lower_blend pass This new lowering pass implements the OpenGL ES blend pipeline in shaders, applicable to hardware lacking full-featured blending hardware (including Midgard/Bifrost and vc4). This pass is run on a fragment shader, rewriting the store to a blended version, loading in the framebuffer destination color and constant color via intrinsics as necessary. This pass is sufficient for OpenGL ES 2.0 and is verified to pass dEQP's blend tests. MIN/MAX modes are included and tested as well. That said, at present it has the following limitations: - MRT is not supported (ES3). - sRGB support is missing (ES3). - Extended blending is not yet ported from GLSL IR lowering (ES3.2) - Dual-source blending is not supported. (N/A) - Logic ops are not supported. (N/A) v2: Fix code conventions (per Ian Romanick's feedback). Implement color masks. This pass should be in common nir/ space, but due to non-technical reasons, for now it's in Panfrost space. In the future, depending if other drivers need some of the functionality, we can move this back to src/compiler/nir space. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-19 17:54:56 +00:00
Alyssa Rosenzweig	6b2457e75c	panfrost: Fix Bifrost-specific padding Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-05-19 17:41:28 +00:00
Alyssa Rosenzweig	7b5217ad70	panfrost: Cleanup panfrost_job comments Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-05-19 17:41:26 +00:00
Alyssa Rosenzweig	ae705387a9	panfrost/decode: Decode blend constant This adds a forgotten decode line on Midgard and adds the field of a blend constant on Bifrost. The Bifrost encoding is fairly weird; whereas Midgard is just a regular 32-bit float, Bifrost uses a fancy fixed-point-esque encoding. The decode logic here is experimentally correct. The encode logic is a sort of "guesstimate", assuming that the high byte is just int(f / 255.0) and then solving algebraically for the low byte. This might be slightly off in some cases. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-05-19 17:41:23 +00:00
Alyssa Rosenzweig	3645c781ab	panfrost: Hoist blend constant into Midgard-specific struct This eliminates one major source of #ifdef parity between Midgard and Bifrost, better representing how the struct acts on Midgard and allowing proper decodes on Bifrost. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-05-19 17:41:21 +00:00
Alyssa Rosenzweig	50382df728	panfrost/decode: Disassemble Bifrost shaders We already have the Bifrost disassembler in-tree, so now that panwrap is able to dump Bifrost command streams, hook up the disassembler to pandecode. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>	2019-05-19 17:41:08 +00:00
Bas Nieuwenhuizen	4689e98fe8	vulkan/wsi: Set X11 minImageCount to 3. For IMMEDIATE and FIFO, most games work in a pipelined manner where they can produce frames at a rate of 1/MAX(CPU duration, GPU duration), but the render latency is CPU duration + GPU duration. This means that with scanout from pageflipping we need 3 frames to run full speed: 1) CPU rendering work 2) GPU rendering work 3) scanout Once we have a nonblocking acquire that returns a semaphore we can merge 1 and 3. Hence the ideal implementation needs only 2 images, but games cannot tell we currently do not have an ideal implementation and that hence they need to allocate 3 images. So let us do it for them. This is a tradeoff as it uses more memory than needed for non-fullscreen and non-performance intensive applications. Since this is pretty much a TODO that can use the context I added this as a comment. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-19 00:38:03 +00:00
Eric Engestrom	ccb8ea7acf	meson: expose glapi through osmesa Suggested-by: Pierre Guillou <pierre.guillou@lip6.fr> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109659 Fixes: `f121a669c7` "meson: build gallium based osmesa" Fixes: `cbbd5bb889` "meson: build classic osmesa" Cc: Brian Paul <brianp@vmware.com> Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Chuck Atkins <chuck.atkins@kitware.com>	2019-05-18 11:15:04 +01:00
Kenneth Graunke	28c2ce7105	egl: Allow EGL_CONTEXT_OPENGL_RESET_NOTIFICATION_STRATEGY in ES and GL EGL annoyingly defines a few variants of this token: EGL_CONTEXT_OPENGL_RESET_NOTIFICATION_STRATEGY_EXT - 0x3138 EGL_CONTEXT_OPENGL_RESET_NOTIFICATION_STRATEGY_KHR - 0x31BD EGL_CONTEXT_OPENGL_RESET_NOTIFICATION_STRATEGY - 0x31BD The EGL_EXT_create_context_robustness extension specifies that the EXT token is only valid for ES contexts, not GL. The EGL_KHR_create_context extension defines the KHR version, and says it is only allowed for GL contexts, and specifically calls out that it's an error for ES contexts. But EGL 1.5 includes the new suffixless token, which has the same value as the KHR version, and specifically calls out that it's now valid to use with both GL and ES contexts. So we should allow this. Fixes KHR-NoContext.es32.robustness.no_reset_notification and KHR-NoContext.es32.robustness.lose_context_on_reset on iris, which apparently is exposing EGL 1.5. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-05-17 15:13:15 -07:00
Jason Ekstrand	1c92358bd8	anv: Only consider minSampleShading when sampleShadingEnable is set From the Vulkan 1.1.107 spec: Sample shading is enabled for a graphics pipeline: - If the interface of the fragment shader entry point of the graphics pipeline includes an input variable decorated with SampleId or SamplePosition. In this case minSampleShadingFactor takes the value 1.0. - Else if the sampleShadingEnable member of the VkPipelineMultisampleStateCreateInfo structure specified when creating the graphics pipeline is set to VK_TRUE. In this case minSampleShadingFactor takes the value of VkPipelineMultisampleStateCreateInfo::minSampleShading. Otherwise, sample shading is considered disabled. In other words, if sampleShadingEnable is set to VK_FALSE, we should ignore minSampleShading. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-17 20:33:57 +00:00
Jason Ekstrand	8413fd136c	anv: Stop forcing bindless for images This was an unintended artifact of my testing of bindless images. We should be choosing bindless or not dynamically. Fixes: `c0d9926df7` "anv: Use bindless handles for images" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-17 19:58:51 +00:00
Neha Bhende	926a6a35cf	draw: fix memory leak introduced `7720ce32a` We need to free memory allocation PrimitiveOffsets in draw_gs_destroy(). This fixes memory leak found while running piglit on windows. Fixes: `7720ce32a` ("draw: add support to tgsi paths for geometry streams. (v2)") Tested with piglit Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-05-17 12:26:48 -06:00
Jason Ekstrand	d2aa65eb18	anv: Emulate texture swizzle in the shader when needed Now that we have the descriptor buffer mechanism, emulated texture swizzle can be implemented in a very non-invasive way. Previous attempts all tried to extend the push constant based image param mechanism which was gross. This could, in theory, be done much faster with a magic back-end instruction which does indirect MOVs but Vulkan on IVB is already so slow this isn't going to matter much. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104355 Cc: "19.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-17 12:25:58 -05:00
Alyssa Rosenzweig	ea479fdc1d	panfrost/midgard: Typofix Reported-by: Ryan Houdek <Sonicadvance1@gmail.com> Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-17 14:59:52 +00:00
Eric Engestrom	6a1f609a4c	gitlab-ci: build-test the tools as well Suggested-by: Rob Clark <robclark@freedesktop.org> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-17 11:21:48 +01:00
Samuel Pitoiset	d7501834cd	radv: add a workaround for Monster Hunter World and LLVM 7&8 The load/store optimizer pass doesn't handle WaW hazards correctly and this is the root cause of the reflection issue with Monster Hunter World. AFAIK, it's the only game that is affected by this issue. This is fixed with LLVM r361008, but we need a workaround for older LLVM versions unfortunately. Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-17 11:41:19 +02:00
Thomas Hellstrom	47afc5eed7	svga: Add an environment variable to force coherent surface memory The vmwgfx driver supports emulated coherent surface memory as of version 2.16. Add an environment variable to enable this functionality for texture- and buffer maps: SVGA_FORCE_COHERENT. This environment variable should be used for testing only. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-05-17 08:44:31 +02:00
Thomas Hellstrom	1a66ead1c7	pipebuffer, winsys/svga: Add functionality to update pb_validate_entry flags In order to be able to add access modes to a pb_validate_entry, update the pb_validate_add_buffer function to take a pointer hash table and also to return whether the buffer was already on the validate list. Update the svga winsys accordingly. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2019-05-17 08:44:31 +02:00
Thomas Hellstrom	a119da3bc9	svga: Set the rendered-to flag for dma transfers to surfaces The rendered-to flag indicates that the HW surface content is more recent than the content of the mob. That's the case after a SurfaceDMA transfer to the surface. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2019-05-17 08:44:31 +02:00
Thomas Hellstrom	fb6d09764d	winsys/svga: Fix RELOC_INTERNAL mob GPU access SVGA_RELOC_INTERNAL indicates a transfer between surface and backing mob. This means that if the GPU for example reads from the surface it writes to the backing mob. But since the buffer mapping code allows for simultaneous gpu- and cpu read access, a read from the surface to the mob will not synchronize a subsequent map to the readback. Fix this by inverting the mob access mode in a surface relocation with SVGA_RELOC_INTERNAL set. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2019-05-17 08:44:31 +02:00
Thomas Hellstrom	eed24156ec	svga: Remove the surface_invalidate winsys function Instead unconditionally call SVGA3D_InvalidateGBSurface() since it's needed also for Linux for dirty buffers and operation without SurfaceDMA. For non-guest-backed operation, remove the surface cache surface invalidation altogether. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2019-05-17 08:44:31 +02:00
Gert Wollny	0f598ed7b3	Revert "softpipe/buffer: load only as many components as the the buffer resource type provides" This reverts commit `865b9ddae4`. The buffer always reports format PIPE_FORMAT_R8_UNORM so with this patch only one component would be supported. The original issue is still relevant, but the fix should be different. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-05-17 08:27:55 +02:00
Dave Airlie	b6e2a9eca7	glsl/nir: init non-static class member. glsl_to_nir.cpp:276: uninit_member: Non-static class member "sig" is not initialized in this constructor nor in any functions that it calls. Reported by coverity Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-05-17 12:33:09 +10:00
Dave Airlie	ebdddb36a0	imgui: fix undefined behaviour bitshift. imgui_draw.cpp:1781: error[shiftTooManyBitsSigned]: Shifting signed 32-bit value by 31 bits is undefined behaviour Reported by coverity Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-05-17 12:33:09 +10:00
Dave Airlie	2bfe5b8556	glsl: init non-static class member in link uniforms. (v2) link_uniforms.cpp:477: uninit_member: Non-static class member "shader_storage_blocks_write_access" is not initialized in this constructor nor in any functions that it calls. Reported by coverity. v2: fix 9->0 typo (Ilia) Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-05-17 12:33:09 +10:00
Dave Airlie	b2d4d08a5c	glsl: init packed in more constructors. src/compiler/glsl_types.cpp:577: uninit_member: Non-static class member "packed" is not initialized in this constructor nor in any functions that it calls. from Coverity. Fixes: `659f333b3a` (glsl: add packed for struct types) Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-05-17 12:33:09 +10:00
Alyssa Rosenzweig	81d3262fa5	panfrost: Cleanup leak todos Many of these are now patched; one of them we patch here. Regardless, this is one less thing to worry about in the code, I suppose. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-17 00:14:49 +00:00
Alyssa Rosenzweig	c65271c929	panfrost: assert(0) -> unreachable for some switch Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-16 23:42:33 +00:00
Nanley Chery	629806b55b	anv: Fix some depth buffer sampling cases on ICL+ Don't attempt sampling with HiZ if the sampler lacks support for it. On ICL, the HW docs state that sampling with HiZ is not supported and that instances of AUX_HIZ in the RENDER_SURFACE_STATE object will be interpreted as AUX_NONE. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-05-16 20:54:53 +00:00
Caio Marcelo de Oliveira Filho	ded2c202d5	nir: Only convert SSA values to regs when needed If the SSA def produced by this instruction is only in the block in which it is defined and is not used by ifs or phis, then we don't have a reason to convert it to a register in nir_lower_ssa_defs_to_regs_block(). The special case for derefs is covered by the general case, so can be removed: at this point all derefs in the block are materialized (i.e. the whole deref chain is in the block) and derefs are not used in phis. v2: Fix wrong check for if_uses. If there's such an use, the def is not "local_to_block". (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-16 12:23:47 -07:00
Kenneth Graunke	4b5e8eb3c8	st/mesa: Record samplers for extra planes in info->textures_used. Normally gl_nir_lower_samplers_as_deref records info->textures_used for us, but this pass runs after that, attempting to assign samplers in the same order as st_atom_texture's external_samplers_used loop so the stars align and we get the same locations. Since we're adding textures late, we need to amend info->textures_used. iris uses info->textures_used to set up texture bindings; this fixes Piglit's ext_image_dma_buf_import-sample-{nv12,yuv420,yvu420} there. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-05-16 11:54:07 -07:00
Caio Marcelo de Oliveira Filho	8a995f2b5e	nir: Fix nir_opt_idiv_const when negatives are involved First, allow the case for negative powers of two. Then ensure that we use the absolute value of the non-constant value to calculate the quotient -- this was hinted in the code by the name 'uq'. This fixes an issue when 'd' is positive and 'n' is negative. The ishr will propagate the negative sign and we'll use nir_ineg() again, incorrectly. v2: First version used only ishr, but that isn't sufficient, since it never can produce a zero as a result. (Jason) Allow negative powers of two. (Caio) Fixes: `74492ebad9` "nir: Add a pass for lowering integer division by constants" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-16 10:55:03 -07:00
Eric Anholt	ef88e23d03	freedreno: Log the number of loops in the shader for shader-db. shader-db's report.py will use this to see when we've changed loop unrolling behavior on a shader and skip including other stats like instruction count from being considered for that shader, since they won't be useful as a proxy for real world performance in that case. Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Eduardo Lima Mitev <elima@igalia.com>	2019-05-16 10:25:22 -07:00
Eric Anholt	c2e68bebb4	freedreno: Output the same shader-db format as v3d and intel. This lets us reuse their report.py, at the expense of fd-report.py no longer working. Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Eduardo Lima Mitev <elima@igalia.com>	2019-05-16 10:25:20 -07:00
Eric Anholt	6d9b45171d	freedreno: Remove the ir3_tgsi_to_nir() helper function. It was more of a hindrance, as it pretended that we could compile in the driver with a missing screen. Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Eduardo Lima Mitev <elima@igalia.com>	2019-05-16 10:25:18 -07:00
Eric Anholt	a0d4d7febf	freedreno: Fix assertion failures in context setup in shader-db mode. The TTN path needs access to the screen to make the right decisions about lowering, but we didn't have pctx->screen set up at fdN_prog_init time. Reviewed-by: Rob Clark <robdclark@gmail.com> Tested-by: Eduardo Lima Mitev <elima@igalia.com>	2019-05-16 10:25:06 -07:00
Marek Olšák	9d1485554c	ac: match radeonsi code in ac_shader_binary_read_config	2019-05-16 13:15:36 -04:00
Marek Olšák	894e017c9c	r600+radeonsi: use ctx_query_reset_status on radeon This allows a nice cleanup, because the winsys always handles it.	2019-05-16 13:15:36 -04:00
Marek Olšák	4549c36788	winsys/radeon: implement ctx_query_reset_status by copying radeonsi To make it behave like amdgpu. I'm just trying to move this out of radeonsi. The radeonsi code will be removed in the next commit.	2019-05-16 13:15:36 -04:00
Marek Olšák	6b3343e5d8	winsys/amdgpu: report a CS rejection as a reset only if there's no GPU reset	2019-05-16 13:15:36 -04:00
Marek Olšák	78e35df52a	radeonsi: update buffer descriptors in all contexts after buffer invalidation Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108824 Cc: 19.1 <mesa-stable@lists.freedesktop.org>	2019-05-16 13:15:36 -04:00
Marek Olšák	0f1b070bad	radeonsi: remove old_va parameter from si_rebind_buffer by remembering offsets This is a prerequisite for the next commit. Cc: 19.1 <mesa-stable@lists.freedesktop.org>	2019-05-16 13:14:55 -04:00
Marek Olšák	f3ae455eb0	radeonsi: compute culling - flush CS to remove write references to buffers Only read-only buffers can use compute culling. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:13:36 -04:00
Marek Olšák	04122532e3	radeonsi: invalidate caches at the beginning of the prim discard compute IB Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:13:36 -04:00
Marek Olšák	9f505ce21d	radeonsi: disable primitive restart for triangles for DiRT Rally It may decrease performance and it prevents compute-based primitive culling. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:13:36 -04:00
Marek Olšák	0252fb92b8	radeonsi: add primitive culling stats to the HUD Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:13:36 -04:00
Marek Olšák	c9b7a37b8f	radeonsi: cull primitives with async compute for large draw calls Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:13:34 -04:00
Marek Olšák	187f1c999f	winsys/amdgpu: add REWIND emulation via INDIRECT_BUFFER into cs_check_space Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:10:07 -04:00
Marek Olšák	4eb377d1c3	radeonsi: add si_vs_prolog_bits::unpack_instance_id_from_vertex_id:1 The prim discard compute shader bakes InstanceID into the output index buffer. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:10:07 -04:00
Marek Olšák	b206f007de	radeonsi: make some functions non-static Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:10:07 -04:00
Marek Olšák	301344008f	radeonsi: allow si_shader_select_with_key to return an optimized shader or fail If a prim discard compute shader hasn't finished compilation, we don't want to any shader. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:10:07 -04:00
Marek Olšák	ca9edd7cd0	radeonsi: use pipe_draw_info::instance_count indirectly It will be modified by compute shader culling. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:10:07 -04:00
Marek Olšák	d380fabdbb	radeonsi: use pipe_draw_info::prim and primitive_restart indirectly so that the fields can be changed by the driver. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:10:07 -04:00
Marek Olšák	43aa2f4f7c	radeonsi: make functions for creating LLVM functions non-static Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:10:07 -04:00
Marek Olšák	b19884e08e	winsys/amdgpu: add a parallel compute IB coupled with a gfx IB Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:07:00 -04:00
Marek Olšák	eda281e977	ac: add LLVM code for triangle culling Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:06:58 -04:00
Marek Olšák	07c83d25fd	radeonsi: add a cs parameter into si_cp_copy_data Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:06:57 -04:00
Marek Olšák	ce264d19a0	radeonsi: add a cs parameter into si_cp_release_mem Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:06:56 -04:00
Marek Olšák	9624855f13	radeonsi: add threadgroups_per_cu param into si_get_compute_resource_limits Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:06:54 -04:00
Marek Olšák	6e38af0631	radeonsi: move si_*_descriptors_idx functions into si_state.h Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:06:53 -04:00
Marek Olšák	49a016ec5d	radeonsi: make si_initialize_compute reusable Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:06:51 -04:00
Marek Olšák	c44c6951d4	radeonsi: extract COMPUTE_RESOURCE_LIMITS code into a helper Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:06:49 -04:00
Marek Olšák	c7ceeea093	radeonsi: return the last part's return value from @wrapper The primitive discard compute shader will get the position output this way. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:06:40 -04:00
Marek Olšák	d569b7cb31	winsys/amdgpu: always set NO_CPU_ACCESS and NO_SUBALLOC on GDS resources Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:06:18 -04:00
Jan Zielinski	d65b160e6a	swr: clean up supported OGL4.0/4.1 extensions list This commit adjusts the capabilities returned by the SWR driver and the documentation to correctly report the following extensions: GL_ARB_texture_query_lod, GL_ARB_texture_cube_map_array, GL_ARB_gpu_shader_fp64, GL_ARB_texture_gather, GL_ARB_vertex_attrib_64bit. Reviewed-by: Alok Hota <alok.hota@intel.com>	2019-05-16 17:41:14 +02:00
Leo Liu	aa040d3b3c	vl/dri3: set back buffer from output to NULL with front buffer case Since the using output optimization is only for back buffer case Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-16 10:28:38 -04:00
Alejandro Piñeiro	16a1ef7860	docs: advice to resolve discussion on gitlab MR doc For newcomers to gitlab, it is not evident that it is better to press the "Resolve Discussion" button when you update your branch handling feedback. v2: * Fix several grammar nits, reorder, use new corrected text (Connor Abbott) * Use "reviewers", instead of reviewer (Eric Engestrom) Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-16 16:16:32 +02:00
Roland Scheidegger	4171a26193	auxiliary/draw: fix crash with zero-stride draw auto transform feedback draws get the number of vertices from the transform feedback object. In draw, we'll figure this out with the number of bytes written divided by the stride. However, it is apparently possible we end up with a stride of 0 there (not entirely sure it could happen with GL). Probably when nothing was actually ever written (so we don't actually have a stride set). Just avoid the division by zero by setting the count to 0. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-05-16 14:01:33 +02:00
Eric Engestrom	22c1657d05	util/os_file: always use the 'grow' mechanism Use fstat() only to pre-allocate a big enough buffer. This fixes a race where if the file grows between fstat() and read() we would be missing the end of the file, and if the file slims down read() would just fail. Fixes: `316964709e` "util: add os_read_file() helper" Reported-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-16 12:56:25 +01:00
Lionel Landwerlin	e04cf0b612	nir: lower_non_uniform_access: iterate over instructions safely This pass moves instructions around and adds control-flow in the middle of blocks. We need to use nir_foreach_instr_safe to ensure that we iterate over instructions correctly anyway. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3bd5457641` ("nir: Add a lowering pass for non-uniform resource access") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-16 10:22:01 +01:00
Kenneth Graunke	752367b766	iris: Dodge more GLSL IR lowering This avoids some lower_instructions bits in st.	2019-05-15 19:44:21 -07:00
Jason Ekstrand	fce0214e94	intel/fs/live_variables: Do compute_start_end in BITSET_WORD chunks For a block with a contiguous chunk of 32 vars that don't need updating, this lets us skip 32 vars at a time. Also, by using bitscan, we only iterate for each set bit rather than testing them all one at a time. Looking at perf (with -O0 which is unfortunately necessary to get reasonable back-traces), this seems to cut about 50-60% of the time spent in compute_start_end() which is itself about 4-6% of the run-time. In the real world, with a release driver build, this cuts 1.34% off a full shader-db run. (I ran shader-db 5 times in each configuration). Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-16 02:14:40 +00:00
Jason Ekstrand	b2d274c677	intel/fs/ra: Choose a spill reg before throwing away the graph Otherwise, we get an effectively random spill reg because we no longer have the information from RA to guide us. Also, a completely clean graph has undefined data in in_stack which is used for choosing the spill reg so it really is non-deterministic. Fixes: `e99081e76d` "intel/fs/ra: Spill without destroying the..." Tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-16 02:13:09 +00:00
Jason Ekstrand	c19acf321c	intel/fs/ra: Add spill costs to the graph on-demand Tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-16 02:13:09 +00:00
Jason Ekstrand	2c14e2b5bf	intel/fs/ra: Add a helper for discarding the interference graph Tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-16 02:13:09 +00:00
Alyssa Rosenzweig	46494c3dc1	nir/algebraic: Remove problematic "optimization" This line is no longer relevant now that booleans are 1-bit, and in fact causes issues (infinite progress loop between algebraic optimizations and copy prop) with constant vector masks. No shader-db changes on Intel platforms (Jason). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2019-05-16 02:08:37 +00:00
Alyssa Rosenzweig	74ab80b92d	panfrost/midgard: Add load/store opcodes This commit adds a bunch of new load/store opcodes, largely related to OpenCL, as well as adjusting the name of existing opcodes to be more uniform. The immediate effect is compute shaders are substantially easier to interpret now. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-16 01:25:25 +00:00
Alyssa Rosenzweig	f73c0b73ec	panfrost/midgard: Enable integer constant inlining Midgard ALU features two types of constants: embedded constants (128-bit chunk, zero/one per schedule bundle) and inline constants (16-bit splattered into the op, second source if present). Inline constants are much more efficient from a space and scheduling freedom standpoint, so it's desirable to inline when possible. Now that integer ops are well understood and in use, we enable inlining of integer constants in addition to floats (which have been inlined since forever). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-16 01:20:41 +00:00
Alyssa Rosenzweig	8214aaa3c8	panfrost/midgard: Remove imov workaround The previous commit fixes the issue this patched around. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-16 01:20:41 +00:00
Alyssa Rosenzweig	0a13babdd8	panfrost/midgard: Set int outmod for ops writing integers By default, the "normal" output modifier is set on ALU ops. This is the correct default for float outputs -- for floats, it preserves the semantic value. Unfortunately, when used with integers, it does not preserve the bitstream encoding, causing misbehaviour. (It's an open question what happens when `normal` is used with integers -- does it apply some other transformation? or does it do floating point normalization/etc on the ints as if they were floats?). Instead, we default to the "clamp to integer" output modifier for ops writing integers. Semantically, this makes sense (clamping an integer to the nearest integer is the identity function). In the hardware with an integer opcode, this is the actual "normal". This fixes numerous sporadic and sometimes bizarre bugs relating to integers, especially integer moves. With this in place, we no longer care about the types involved; it's just bits on the wire again. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-16 01:20:30 +00:00
Alyssa Rosenzweig	81b1053d9b	panfrost: Set custom stride for textures when necessary From Gallium (and our) perspective, the stride of a BO is arbitrary. For internal buffers, we can make it something nice, but for imported linear buffers (e.g. EGL clients), we don't always have that luxury. To cope, we calculate the expected stride of a texture, compare it to the BO's actual reported stride, and if they differ, set the latter as a custom stride. Fixes rendering of windows not on tile boundaries (noticeable in Weston with es2gears_wayland, for instance). Also, this should fix stride issues with buffer reloading. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-16 01:16:36 +00:00
Alyssa Rosenzweig	cea9352059	panfrost/decode: Stride decoding With a special flag, texture descriptors can include custom stride(s). We haven't seen a case of this used for mipmaps/cubemaps, so it's not clear how that will be encoded, but this dumps correctly for single one-level 2D textures. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-16 01:15:37 +00:00
Alyssa Rosenzweig	d699ffbf0e	panfrost/decode: Futureproof texture dumping One field was not dumped for some reason. It's observed to be 0, but it's still good to have it available. Also, extra fields might be snuck in the bitmaps array (it's variable-lengthed at the end), and we want to guard against that possibility, so we dump a little more. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-16 01:15:37 +00:00
Marek Olšák	ccfcb9d818	ac: rename SI-CIK-VI to GFX6-GFX7-GFX8 Acked-by: Dave Airlie <airlied@redhat.com> We already use GFX9 and I don't want us to have confusing naming in the driver. GFXn naming is better from the driver perspective, because it's the real version of the gfx portion of the hw. Also, CIK means Bonaire-Kaveri-Kabini, it doesn't mean CI. It shouldn't confuse our SDMA, UVD, VCE etc. code much. Those have nothing to do with GFXn and they have their own version numbers.	2019-05-15 20:54:10 -04:00
Marek Olšák	e5cc363f43	ac: add comments to chip enums Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (except GFX2 changes) Reviewed-by: Dave Airlie <airlied@redhat.com> (except <= GFX5 changes)	2019-05-15 20:54:10 -04:00
Anuj Phogat	a42163cbbc	compiler: Add lowering support for 64-bit saturate operations to software Fixes 7 Khronos GL CTS tests: KHR-GL45.gpu_shader_fp64.builtin.smoothstep_dvec{double, 2, 3, 4} KHR-GL45.gpu_shader_fp64.builtin.smoothstep_against_scalar_dvec{2, 3, 4} Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-15 23:30:30 +00:00
Kenneth Graunke	d305409db5	st/dri: Minor style fixes Trivial.	2019-05-15 14:49:14 -07:00
Chia-I Wu	659c5800e5	virgl: handle DONT_BLOCK and MAP_DIRECTLY Handle PIPE_TRANSFER_DONT_BLOCK and PIPE_TRANSFER_MAP_DIRECTLY. Make virgl_resource_transfer_prepare return an enum instead of a bool for extensibility (e.g., instruct the callers to map differently). Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-05-15 20:51:28 +00:00
Chia-I Wu	e87186fc67	virgl: add virgl_resource_transfer_prepare virgl_resource_transfer_prepare should be called before mapping to prepare the resource. It does flush, readback, and wait as needed. virgl_res_needs_flush and virgl_res_needs_readback become internal helpers to the new function. There should be no externally visible change. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-05-15 20:51:28 +00:00
Chia-I Wu	cdcf38b98a	virgl: honor DISCARD_WHOLE_RESOURCE in virgl_res_needs_readback Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-05-15 20:51:28 +00:00
Chia-I Wu	a62ab178ce	virgl: clean up virgl_res_needs_readback Add comments and follow the coding style of virgl_res_needs_flush. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>	2019-05-15 20:51:28 +00:00
Lionel Landwerlin	391a836e8f	nir: fix lower_non_uniform_access pass Obviously missing the instruction insertion into the SSA list. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3bd5457641` ("nir: Add a lowering pass for non-uniform resource access") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-15 18:15:20 +00:00
Alex Villacís Lasso	b2200514af	gbm: gbm_bo_get_handle_for_plane fallback to nonplanar handle Commit `f9567ab435` (gbm: Export a getter for per plane handles) contains an API version check that fails on i915 (API version 7 vs. check for minimum API version 13). Any client that migrates to the planar API will start failing on i915 (see https://gitlab.gnome.org/GNOME/mutter/issues/127 for mutter, and https://bugs.freedesktop.org/show_bug.cgi?id=108487 for weston). This commit adds a fallback for plane 0 when the API check fails and returns the non-planar handle in this scenario, making the call equivalent to gbm_bo_get_handle(). This is enough for weston 6.0.0 to start working again on an i915 system. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=108487 Signed-off-by: Alex Villacís Lasso <a_villacis@palosanto.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-05-15 18:27:30 +01:00
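The fallback logic described in b2200514af might look roughly like the sketch below. It is written in Python for illustration only (the real code is C inside gbm); `handle_for_plane`, `planar_query`, and `nonplanar_handle` are hypothetical names, and returning `None` for planes other than 0 is an assumption about the failure path, which the commit message does not spell out.

```python
from typing import Callable, Optional

# Minimum API version the planar path checks for, per the commit message.
MIN_PLANAR_API_VERSION = 13


def handle_for_plane(api_version: int, plane: int,
                     planar_query: Callable[[int], int],
                     nonplanar_handle: int) -> Optional[int]:
    """Return a handle for `plane`, falling back to the non-planar handle
    for plane 0 when the API version check fails."""
    if api_version >= MIN_PLANAR_API_VERSION:
        return planar_query(plane)
    if plane == 0:
        # Equivalent to asking for the plain (non-planar) handle.
        return nonplanar_handle
    return None  # assumed failure result for other planes
```

On a driver reporting API version 7, a request for plane 0 takes the fallback branch instead of failing outright.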
Alyssa Rosenzweig	a9cef4f0e5	gallium: Add default check for PIPE_CAP_FRAGMENT_SHADER_INTERLOCK Fixes: `c704c0226` ("gallium: Add a PIPE_CAP_FRAGMENT_SHADER_INTERLOCK") Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 21:34:49 -07:00
Andrii Kryvytskyi	eca53f00aa	iris: Check if resource has stencil before returning it Signed-off-by: Andrii Kryvytskyi <andrii.o.kryvytskyi@globallogic.com> Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 21:16:11 -07:00
Jordan Justen	49958c4b5d	i965/blorp: Set MOCS for gen11 in blorp_alloc_vertex_buffer v2: * Add build error for gen > 6 if MOCS is not set. (Lionel) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-05-14 19:57:01 -07:00
Kenneth Graunke	bb5db02bab	iris: Enable fragment shader interlock on Gen9+. There's some debate about whether we should support this on older hardware as well. Currently i965 turns it off on Gen8- though, so we follow suit. If this changes, we can update this as well. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-14 19:34:33 -07:00
Kenneth Graunke	c704c0226c	gallium: Add a PIPE_CAP_FRAGMENT_SHADER_INTERLOCK. Corresponding to GL_ARB_fragment_shader_interlock and GL_NV_fragment_shader_interlock. Currently, only the NIR paths support this functionality, but someone could conceivably add it to TGSI too. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-14 19:34:29 -07:00
Dave Airlie	4efd04ab18	intel/compiler: use bitset instead of opencoding a 32-bit bitset. (v2) In the future I want to expand this to 128-bits, for vec16 support, so let's just put the code in place to use bitset ranges now. v2: just declare the bitset to be the max of what we should ever see and change assert to reflect it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-15 07:10:34 +10:00
Dave Airlie	3b2c433167	intel/compiler: remove repeated bit_size / 8 in brw mem lowering pass. Just use a variable already. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-15 07:10:30 +10:00
Kenneth Graunke	646924cfa1	intel/compiler: Implement TCS 8_PATCH mode and INTEL_DEBUG=tcs8 Our tessellation control shaders can be dispatched in several modes. - SINGLE_PATCH (Gen7+) processes a single patch per thread, with each channel corresponding to a different patch vertex. PATCHLIST_N will launch (N / 8) threads. If N is less than 8, some channels will be disabled, leaving some untapped hardware capabilities. Conditionals based on gl_InvocationID are non-uniform, which means that they'll often have to execute both paths. However, if there are fewer than 8 vertices, all invocations will happen within a single thread, so barriers can become no-ops, which is nice. We also burn a maximum of 4 registers for ICP handles, so we can compile without regard for the value of N. It also works in all cases. - DUAL_PATCH mode processes up to two patches at a time, where the first four channels come from patch 1, and the second group of four come from patch 2. This tries to provide better EU utilization for small patches (N <= 4). It cannot be used in all cases. - 8_PATCH mode processes 8 patches at a time, with a thread launched per vertex in the patch. Each channel corresponds to the same vertex, but in each of the 8 patches. This utilizes all channels even for small patches. It also makes conditions on gl_InvocationID uniform, leading to proper jumps. Barriers, unfortunately, become real. Worse, for PATCHLIST_N, the thread payload burns N registers for ICP handles. This can burn up to 32 registers, or 1/4 of our register file, for URB handles. For Vulkan (and DX), we know the number of vertices at compile time, so we can limit the amount of waste. In GL, the patch dimension is dynamic state, so we either would have to waste all 32 (not reasonable) or guess (badly) and recompile. This is unfortunate. Because we can only spawn 16 thread instances, we can only use this mode for PATCHLIST_16 and smaller. The rest must use SINGLE_PATCH. 
This patch implements the new 8_PATCH TCS mode, but leaves us using SINGLE_PATCH by default. A new INTEL_DEBUG=tcs8 flag will switch to using 8_PATCH mode for testing and benchmarking purposes. We may want to consider using 8_PATCH mode in Vulkan in some cases. The data I've seen shows that 8_PATCH mode can be more efficient in some cases, but SINGLE_PATCH mode (the one we use today) is faster in other cases. Ultimately, the TES matters much more than the TCS for performance, so the decision may not matter much. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-14 13:16:30 -07:00
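The thread and register arithmetic in the 646924cfa1 message can be restated as a small Python sketch. The function names are hypothetical, and rounding N / 8 up for SINGLE_PATCH is an assumption; the message only says "(N / 8) threads".

```python
import math

# "Because we can only spawn 16 thread instances", per the commit message.
MAX_8_PATCH_VERTICES = 16


def single_patch_threads(n_vertices: int) -> int:
    """SINGLE_PATCH: one channel per patch vertex, 8 channels per thread.
    Rounding up is an assumption for N that is not a multiple of 8."""
    return math.ceil(n_vertices / 8)


def eight_patch_threads(n_vertices: int) -> int:
    """8_PATCH: one thread is launched per vertex in the patch."""
    return n_vertices


def eight_patch_icp_registers(n_vertices: int) -> int:
    """8_PATCH: the thread payload burns N registers for ICP handles."""
    return n_vertices


def can_use_8_patch(n_vertices: int) -> bool:
    """8_PATCH is only usable for PATCHLIST_16 and smaller."""
    return n_vertices <= MAX_8_PATCH_VERTICES
```

For a PATCHLIST_3 draw this gives one SINGLE_PATCH thread with five idle channels, versus three fully populated 8_PATCH threads that each cover eight patches.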
Kenneth Graunke	076159b40b	intel/compiler: Move ICP handle fetching into a helper function. This will be significantly different in 8_PATCH mode. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-14 13:16:28 -07:00
Kenneth Graunke	3d84fd29e8	intel/compiler: Don't repeat dispatch max fixing condition Having a single flag will keep both places in sync if the condition gets more complicated. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-14 13:16:27 -07:00
Kenneth Graunke	f0d52cf2b0	intel/compiler: Rename invocation_id_mask to instance_id_mask The payload field is actually "instance" (thread number), which is used to calculate the invocation ID. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-14 13:16:25 -07:00
Kenneth Graunke	d86260719e	intel/compiler: Refactor TCS invocation ID setup into a helper When we add 8_PATCH mode, this will get a bit more complex, so we may as well start by putting it in a helper function. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-14 13:16:24 -07:00
Kenneth Graunke	381c2aded2	i965: Pass compiler to default key populators This lets us get devinfo and other misc. compiler settings. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-14 13:16:21 -07:00
Marek Olšák	6b0b8f132a	ac: use 1D GEPs for descriptors and constants just a cleanup Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-14 15:15:11 -04:00
Marek Olšák	67b4785958	mesa: fix _mesa_max_texture_levels for GL_TEXTURE_EXTERNAL_OES This helps fix: piglit/bin/ext_image_dma_buf_import-sample_yuv -fmt=NV12 -auto Fixes: `d88f3392ff` Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 15:15:11 -04:00
Eric Anholt	e5db87b00b	freedreno: Restore msm_drm.h to a pristine "make headers_install" copy. This diverged back in `f1374805a8` ("drm-uapi: use local files, not system libdrm") to point at drm-uapi's copy, which we don't need now that we're actually in drm-uapi. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-05-14 11:51:57 -07:00
Eric Anholt	18d11cb4dc	freedreno: Move msm_drm.h to the same spot as other DRM uapi. The new location matches other drivers, and has a README about the rules for updating it. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-05-14 11:51:55 -07:00
Ian Romanick	32d259713b	nir/algebraic: Commute 1-fsat(a) to fsat(1-a) for all non-fmul instructions The goal is to avoid having an extra MOV instruction to perform the saturate. Doing the subtraction first allows the saturate to be applied to the ADD instruction making the MOV unnecessary. Values generated in different blocks and values from non-ALU instructions (e.g., texture instructions) almost always need the extra MOV. Multiply instructions are restricted because doing this rearrangement can interfere with the generation of flrp and ffma instructions. v2: Now that the final method has been selected, squash three commits into one. All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17223214 -> 17219386 (-0.02%) instructions in affected programs: 1524376 -> 1520548 (-0.25%) helped: 2686 HURT: 26 helped stats (abs) min: 1 max: 32 x̄: 1.44 x̃: 1 helped stats (rel) min: 0.03% max: 16.67% x̄: 0.54% x̃: 0.37% HURT stats (abs) min: 1 max: 2 x̄: 1.69 x̃: 2 HURT stats (rel) min: 0.33% max: 1.67% x̄: 0.54% x̃: 0.35% 95% mean confidence interval for instructions value: -1.46 -1.36 95% mean confidence interval for instructions %-change: -0.56% -0.50% Instructions are helped. total cycles in shared programs: 360811571 -> 360791896 (<.01%) cycles in affected programs: 103650214 -> 103630539 (-0.02%) helped: 1557 HURT: 675 helped stats (abs) min: 1 max: 1773 x̄: 41.44 x̃: 16 helped stats (rel) min: <.01% max: 26.77% x̄: 1.37% x̃: 0.64% HURT stats (abs) min: 1 max: 1513 x̄: 66.44 x̃: 14 HURT stats (rel) min: <.01% max: 46.16% x̄: 2.00% x̃: 0.49% 95% mean confidence interval for cycles value: -14.82 -2.81 95% mean confidence interval for cycles %-change: -0.50% -0.20% Cycles are helped. LOST: 2 GAINED: 0 Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2019-05-14 11:38:23 -07:00
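The rewrite in 32d259713b relies on 1 - fsat(a) and fsat(1 - a) producing the same value. A minimal Python check of that identity is sketched below; `fsat` here is a plain clamp to [0, 1] written for illustration, not NIR's implementation, and NaN inputs are deliberately left out.

```python
def fsat(x: float) -> float:
    """Clamp x to the range [0, 1] (illustrative stand-in for NIR's fsat)."""
    return max(0.0, min(1.0, x))


def identity_holds(a: float) -> bool:
    """True when saturating before or after the subtraction agrees."""
    return 1.0 - fsat(a) == fsat(1.0 - a)


# Values below, inside, and above the saturation range.
SAMPLES = [-1e30, -2.5, -0.0, 0.0, 0.25, 0.5, 1.0, 1.5, 1e30]
assert all(identity_holds(a) for a in SAMPLES)
```

Inside [0, 1] both sides compute the same subtraction; outside it, both sides collapse to the same endpoint, which is why the saturate can move onto the ADD.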
Ian Romanick	a7f0c57673	nir/algebraic: Eliminate useless fsat() on operand of comparison w/value in (0, 1) v2: Fix copy-and-paste bug in a cmp b vs b cmp a cases. All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17224337 -> 17224269 (<.01%) instructions in affected programs: 13578 -> 13510 (-0.50%) helped: 68 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.31% max: 3.12% x̄: 0.84% x̃: 0.42% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -1.05% -0.63% Instructions are helped. total cycles in shared programs: 360826090 -> 360825137 (<.01%) cycles in affected programs: 94867 -> 93914 (-1.00%) helped: 58 HURT: 1 helped stats (abs) min: 2 max: 28 x̄: 17.74 x̃: 18 helped stats (rel) min: 0.08% max: 3.17% x̄: 1.39% x̃: 1.22% HURT stats (abs) min: 76 max: 76 x̄: 76.00 x̃: 76 HURT stats (rel) min: 2.86% max: 2.86% x̄: 2.86% x̃: 2.86% 95% mean confidence interval for cycles value: -19.53 -12.78 95% mean confidence interval for cycles %-change: -1.56% -1.08% Cycles are helped. No changes on any other Intel platform. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2019-05-14 11:38:23 -07:00
Ian Romanick	281f20e26d	nir/algebraic: Strip double negatives from comparison sources All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17224623 -> 17224337 (<.01%) instructions in affected programs: 32648 -> 32362 (-0.88%) helped: 148 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.93 x̃: 2 helped stats (rel) min: 0.16% max: 2.74% x̄: 1.07% x̃: 1.08% 95% mean confidence interval for instructions value: -1.97 -1.89 95% mean confidence interval for instructions %-change: -1.15% -1.00% Instructions are helped. total cycles in shared programs: 360828714 -> 360826090 (<.01%) cycles in affected programs: 347416 -> 344792 (-0.76%) helped: 148 HURT: 26 helped stats (abs) min: 1 max: 426 x̄: 26.33 x̃: 18 helped stats (rel) min: 0.03% max: 15.10% x̄: 1.78% x̃: 1.41% HURT stats (abs) min: 2 max: 337 x̄: 48.96 x̃: 6 HURT stats (rel) min: 0.04% max: 18.82% x̄: 2.15% x̃: 0.27% 95% mean confidence interval for cycles value: -23.78 -6.38 95% mean confidence interval for cycles %-change: -1.59% -0.79% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2019-05-14 11:38:22 -07:00
Ian Romanick	45c7ff95fc	intel/compiler: Repeat nir_opt_algebraic_late A tiny bit of help seems to come from nir_copy_prop. Future patches will benefit from this change. Doing more copy propagation on the vec4 backend led to a disaster in hurt cycles. v2: Fix typo in comment. Noticed by Matt. All Gen8+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17224634 -> 17224623 (<.01%) instructions in affected programs: 4586 -> 4575 (-0.24%) helped: 11 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.19% max: 0.53% x̄: 0.27% x̃: 0.23% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.36% -0.19% Instructions are helped. total cycles in shared programs: 360828542 -> 360828714 (<.01%) cycles in affected programs: 151159 -> 151331 (0.11%) helped: 49 HURT: 28 helped stats (abs) min: 1 max: 254 x̄: 26.41 x̃: 6 helped stats (rel) min: 0.06% max: 12.02% x̄: 1.34% x̃: 0.42% HURT stats (abs) min: 1 max: 196 x̄: 52.36 x̃: 15 HURT stats (rel) min: 0.05% max: 10.74% x̄: 2.55% x̃: 0.88% 95% mean confidence interval for cycles value: -13.48 17.95 95% mean confidence interval for cycles %-change: -0.69% 0.84% Inconclusive result (value mean confidence interval includes 0). Haswell, Ivy Bridge, and Sandy Bridge had similar results. 
(Haswell shown) total instructions in shared programs: 13529544 -> 13529542 (<.01%) instructions in affected programs: 358 -> 356 (-0.56%) helped: 2 HURT: 0 total cycles in shared programs: 357290311 -> 357289678 (<.01%) cycles in affected programs: 178324 -> 177691 (-0.35%) helped: 48 HURT: 40 helped stats (abs) min: 1 max: 201 x̄: 31.52 x̃: 13 helped stats (rel) min: 0.06% max: 10.92% x̄: 1.71% x̃: 0.66% HURT stats (abs) min: 1 max: 224 x̄: 22.00 x̃: 6 HURT stats (rel) min: 0.05% max: 15.84% x̄: 1.29% x̃: 0.31% 95% mean confidence interval for cycles value: -18.28 3.89 95% mean confidence interval for cycles %-change: -1.01% 0.32% Inconclusive result (value mean confidence interval includes 0). Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8159110 -> 8158980 (<.01%) instructions in affected programs: 22719 -> 22589 (-0.57%) helped: 65 HURT: 0 helped stats (abs) min: 1 max: 3 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.07% max: 1.05% x̄: 0.73% x̃: 0.74% 95% mean confidence interval for instructions value: -2.06 -1.94 95% mean confidence interval for instructions %-change: -0.78% -0.68% Instructions are helped. total cycles in shared programs: 188609448 -> 188609214 (<.01%) cycles in affected programs: 1875852 -> 1875618 (-0.01%) helped: 109 HURT: 104 helped stats (abs) min: 2 max: 46 x̄: 5.30 x̃: 4 helped stats (rel) min: 0.02% max: 0.90% x̄: 0.09% x̃: 0.07% HURT stats (abs) min: 2 max: 20 x̄: 3.31 x̃: 2 HURT stats (rel) min: 0.01% max: 0.26% x̄: 0.04% x̃: 0.02% 95% mean confidence interval for cycles value: -1.95 -0.25 95% mean confidence interval for cycles %-change: -0.04% -0.01% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-14 11:38:22 -07:00
Ian Romanick	d2a9ba03e3	Revert "nir: add late opt to turn inot/b2f combos back to bcsel" This reverts commit `7acc865226`. With these optimizations in place, the extra constant folding added in the next commit extends some live ranges of 0.0 and ±1.0 constants, and that causes several hundred shaders to have more spills and fills. I believe this optimization was made basically irrelevant by `7725d60938` "intel/fs: Emit better code for b2f(inot(a)) and b2i(inot(a))". All Gen7.5+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17225303 -> 17224634 (<.01%) instructions in affected programs: 879402 -> 878733 (-0.08%) helped: 679 HURT: 1 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.03% max: 0.93% x̄: 0.24% x̃: 0.05% HURT stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10 HURT stats (rel) min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45% 95% mean confidence interval for instructions value: -1.02 -0.95 95% mean confidence interval for instructions %-change: -0.26% -0.22% Instructions are helped. total cycles in shared programs: 360842595 -> 360828542 (<.01%) cycles in affected programs: 110443594 -> 110429541 (-0.01%) helped: 389 HURT: 265 helped stats (abs) min: 1 max: 7525 x̄: 162.81 x̃: 28 helped stats (rel) min: <.01% max: 18.66% x̄: 1.11% x̃: 0.11% HURT stats (abs) min: 1 max: 7614 x̄: 185.96 x̃: 48 HURT stats (rel) min: <.01% max: 25.08% x̄: 0.95% x̃: 0.10% 95% mean confidence interval for cycles value: -75.65 32.67 95% mean confidence interval for cycles %-change: -0.49% -0.06% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 12159 -> 12161 (0.02%) spills in affected programs: 13 -> 15 (15.38%) helped: 0 HURT: 1 total fills in shared programs: 25207 -> 25208 (<.01%) fills in affected programs: 25 -> 26 (4.00%) helped: 0 HURT: 1 Ivy Bridge total instructions in shared programs: 12082019 -> 12082013 (<.01%) instructions in affected programs: 1033 -> 1027 (-0.58%) helped: 6 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.41% max: 0.83% x̄: 0.61% x̃: 0.59% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.78% -0.45% Instructions are helped. total cycles in shared programs: 179849270 -> 179849157 (<.01%) cycles in affected programs: 4735 -> 4622 (-2.39%) helped: 4 HURT: 0 helped stats (abs) min: 2 max: 74 x̄: 28.25 x̃: 18 helped stats (rel) min: 0.13% max: 6.53% x̄: 2.85% x̃: 2.36% 95% mean confidence interval for cycles value: -82.73 26.23 95% mean confidence interval for cycles %-change: -7.98% 2.28% Inconclusive result (value mean confidence interval includes 0). Sandy Bridge total instructions in shared programs: 10882750 -> 10882748 (<.01%) instructions in affected programs: 266 -> 264 (-0.75%) helped: 2 HURT: 0 Iron Lake total cycles in shared programs: 188609440 -> 188609448 (<.01%) cycles in affected programs: 4320 -> 4328 (0.19%) helped: 0 HURT: 2 GM45 total cycles in shared programs: 129016868 -> 129016872 (<.01%) cycles in affected programs: 2302 -> 2306 (0.17%) helped: 0 HURT: 1 Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-14 11:38:22 -07:00
Ian Romanick	3cb091f8b4	nir/algebraic: Eliminate a tautological compare The value-range tracking pass that is coming is not clever enough to know that the result of the ffma must be non-negative. Making it that smart will require quite a bit of work. It might be possible to add a special case that detects that a whole tree of fadd(fmul(fsat(a), fneg(fsat(a))), 1.0) cannot be negative. For cases when the comparison is used in the domain guard for a square-root (see nir/algebraic: Simplify fsqrt domain guard), the compare may be converted to a fmax. This patch also handles that case. All of the affected cases are in DiRT: Showdown. All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17225365 -> 17225303 (<.01%) instructions in affected programs: 40051 -> 39989 (-0.15%) helped: 62 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.07% max: 0.66% x̄: 0.27% x̃: 0.26% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.31% -0.22% Instructions are helped. total cycles in shared programs: 360842788 -> 360842595 (<.01%) cycles in affected programs: 1818081 -> 1817888 (-0.01%) helped: 29 HURT: 22 helped stats (abs) min: 1 max: 206 x̄: 20.66 x̃: 14 helped stats (rel) min: <.01% max: 9.55% x̄: 0.87% x̃: 0.42% HURT stats (abs) min: 1 max: 108 x̄: 18.45 x̃: 7 HURT stats (rel) min: <.01% max: 4.48% x̄: 0.56% x̃: 0.19% 95% mean confidence interval for cycles value: -14.48 6.91 95% mean confidence interval for cycles %-change: -0.71% 0.21% Inconclusive result (value mean confidence interval includes 0). No changes on any other Intel platform. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2019-05-14 11:38:22 -07:00
Ian Romanick	9725e45b3d	nir/algebraic: Simplify fsqrt domain guard All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17228376 -> 17225365 (-0.02%) instructions in affected programs: 280732 -> 277721 (-1.07%) helped: 1072 HURT: 0 helped stats (abs) min: 1 max: 12 x̄: 2.81 x̃: 2 helped stats (rel) min: 0.16% max: 5.10% x̄: 1.43% x̃: 1.07% 95% mean confidence interval for instructions value: -2.92 -2.70 95% mean confidence interval for instructions %-change: -1.48% -1.37% Instructions are helped. total cycles in shared programs: 360935690 -> 360842788 (-0.03%) cycles in affected programs: 7838017 -> 7745115 (-1.19%) helped: 1569 HURT: 69 helped stats (abs) min: 1 max: 1198 x̄: 63.53 x̃: 20 helped stats (rel) min: 0.06% max: 26.17% x̄: 3.44% x̃: 2.12% HURT stats (abs) min: 1 max: 2820 x̄: 98.22 x̃: 47 HURT stats (rel) min: 0.05% max: 16.67% x̄: 3.50% x̃: 2.31% 95% mean confidence interval for cycles value: -63.55 -49.89 95% mean confidence interval for cycles %-change: -3.33% -2.96% Cycles are helped. No changes on any other platform. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2019-05-14 11:38:22 -07:00
Ian Romanick	e2ad047779	nir/search: Don't compare 8-bit or 1-bit constants with floats Without this, adding an algebraic rule like (('bcsel', ('flt', a, 0.0), 0.0, ...), ...), will cause assertion failures inside nir_src_comp_as_float in GTF-GL46.gtf21.GL.lessThan.lessThan_vec3_frag (and related tests) from the OpenGL CTS and shaders/closed/steam/witcher-2/511.shader_test from shader-db. All of these cases have some code that ends up like ('bcsel', ('flt', a, 0.0), 'b@1', ...) When the 'b@1' is tested, nir_src_comp_as_float fails because there's no such thing as a 1-bit float. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2019-05-14 11:38:22 -07:00
Ian Romanick	5116646a76	nir/algebraic: Recognize open-coded fsat with modifiers This change also enables a later change (nir/algebraic: Replace 1-fsat(a) with fsat(1-a)) to affect more shaders. Almost all of the affected shaders are in Bioshock Infinite, and all of those shaders require GLSL 4.10. All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17228584 -> 17228376 (<.01%) instructions in affected programs: 31438 -> 31230 (-0.66%) helped: 105 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 1.98 x̃: 1 helped stats (rel) min: 0.08% max: 1.53% x̄: 0.73% x̃: 0.70% 95% mean confidence interval for instructions value: -2.20 -1.76 95% mean confidence interval for instructions %-change: -0.80% -0.67% Instructions are helped. total cycles in shared programs: 360936431 -> 360935690 (<.01%) cycles in affected programs: 420100 -> 419359 (-0.18%) helped: 71 HURT: 21 helped stats (abs) min: 1 max: 160 x̄: 19.28 x̃: 10 helped stats (rel) min: <.01% max: 9.78% x̄: 0.95% x̃: 0.48% HURT stats (abs) min: 1 max: 198 x̄: 29.90 x̃: 10 HURT stats (rel) min: 0.05% max: 8.36% x̄: 1.24% x̃: 0.90% 95% mean confidence interval for cycles value: -16.77 0.66 95% mean confidence interval for cycles %-change: -0.85% -0.06% Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2019-05-14 11:38:22 -07:00
Ian Romanick	c769641c8e	nir/algebraic: Push unary operations into source operands of fsat source Pushing a unary operation, like fneg, into the operation that generates its operand allows the fsat to be applied to the inner instruction instead of on a separate instruction that performs the unary operation. This changes fmul ssa_100, ssa_99, ssa_98 fmov.sat ssa_101, -ssa_100 into fmul.sat ssa_100, -ssa_99, ssa_98 Ice Lake, Skylake, and Broadwell had similar results. (Ice Lake shown) total instructions in shared programs: 17228658 -> 17228584 (<.01%) instructions in affected programs: 3163 -> 3089 (-2.34%) helped: 49 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.51 x̃: 2 helped stats (rel) min: 0.58% max: 9.09% x̄: 3.69% x̃: 3.51% 95% mean confidence interval for instructions value: -1.66 -1.37 95% mean confidence interval for instructions %-change: -4.37% -3.00% Instructions are helped. total cycles in shared programs: 360937144 -> 360936431 (<.01%) cycles in affected programs: 24029 -> 23316 (-2.97%) helped: 47 HURT: 2 helped stats (abs) min: 4 max: 18 x̄: 15.34 x̃: 16 helped stats (rel) min: 0.69% max: 6.18% x̄: 3.78% x̃: 4.27% HURT stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.34% max: 0.67% x̄: 0.50% x̃: 0.50% 95% mean confidence interval for cycles value: -16.05 -13.05 95% mean confidence interval for cycles %-change: -4.07% -3.15% Cycles are helped. All Gen7 and earlier platforms had similar results. (Haswell shown) total instructions in shared programs: 13536059 -> 13535884 (<.01%) instructions in affected programs: 8797 -> 8622 (-1.99%) helped: 150 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1 helped stats (rel) min: 0.40% max: 11.11% x̄: 3.51% x̃: 1.96% 95% mean confidence interval for instructions value: -1.23 -1.11 95% mean confidence interval for instructions %-change: -3.97% -3.05% Instructions are helped. 
total cycles in shared programs: 357696119 -> 357694193 (<.01%) cycles in affected programs: 50216 -> 48290 (-3.84%) helped: 109 HURT: 14 helped stats (abs) min: 2 max: 92 x̄: 18.97 x̃: 16 helped stats (rel) min: 0.26% max: 19.09% x̄: 7.37% x̃: 5.37% HURT stats (abs) min: 2 max: 26 x̄: 10.14 x̃: 5 HURT stats (rel) min: 0.18% max: 4.73% x̄: 1.84% x̃: 0.92% 95% mean confidence interval for cycles value: -19.27 -12.05 95% mean confidence interval for cycles %-change: -7.34% -5.31% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-14 11:38:22 -07:00
Ian Romanick	3b74790941	nir/algebraic: Recognize open-coded flrp(a, b, fsat(c)) All Gen6+ GPUs had similar results. (Skylake shown) total instructions in shared programs: 15336712 -> 15336622 (<.01%) instructions in affected programs: 3952 -> 3862 (-2.28%) helped: 24 HURT: 0 helped stats (abs) min: 3 max: 5 x̄: 3.75 x̃: 4 helped stats (rel) min: 1.75% max: 2.70% x̄: 2.34% x̃: 2.46% 95% mean confidence interval for instructions value: -4.06 -3.44 95% mean confidence interval for instructions %-change: -2.47% -2.22% Instructions are helped. total cycles in shared programs: 355722052 -> 355721235 (<.01%) cycles in affected programs: 27326 -> 26509 (-2.99%) helped: 20 HURT: 4 helped stats (abs) min: 1 max: 227 x̄: 44.75 x̃: 14 helped stats (rel) min: 0.12% max: 22.95% x̄: 3.83% x̃: 1.23% HURT stats (abs) min: 2 max: 64 x̄: 19.50 x̃: 6 HURT stats (rel) min: 0.21% max: 3.63% x̄: 1.24% x̃: 0.55% 95% mean confidence interval for cycles value: -61.61 -6.47 95% mean confidence interval for cycles %-change: -5.59% -0.39% Cycles are helped. No changes on Ice Lake, Iron Lake, or GM45. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-14 11:38:21 -07:00
Ian Romanick	a79570099b	intel/fs: Allow cmod propagation to instructions with saturate modifier v2: Add unit tests. Suggested by Matt. All Intel GPUs had similar results. (Ice Lake shown) total instructions in shared programs: 17229441 -> 17228658 (<.01%) instructions in affected programs: 159574 -> 158791 (-0.49%) helped: 489 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 1.60 x̃: 1 helped stats (rel) min: 0.07% max: 2.70% x̄: 0.61% x̃: 0.59% 95% mean confidence interval for instructions value: -1.72 -1.48 95% mean confidence interval for instructions %-change: -0.64% -0.58% Instructions are helped. total cycles in shared programs: 360944149 -> 360937144 (<.01%) cycles in affected programs: 1072195 -> 1065190 (-0.65%) helped: 254 HURT: 27 helped stats (abs) min: 2 max: 234 x̄: 30.51 x̃: 9 helped stats (rel) min: 0.04% max: 8.99% x̄: 0.75% x̃: 0.24% HURT stats (abs) min: 2 max: 83 x̄: 27.56 x̃: 24 HURT stats (rel) min: 0.09% max: 3.79% x̄: 1.28% x̃: 1.16% 95% mean confidence interval for cycles value: -30.11 -19.75 95% mean confidence interval for cycles %-change: -0.70% -0.41% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]	2019-05-14 11:38:21 -07:00
Ian Romanick	a7724b1cbb	nir/algebraic: Add missing ffma(-1, a, b) pattern All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17229439 -> 17229377 (<.01%) instructions in affected programs: 9859 -> 9797 (-0.63%) helped: 41 HURT: 0 helped stats (abs) min: 1 max: 6 x̄: 1.51 x̃: 1 helped stats (rel) min: 0.08% max: 11.54% x̄: 1.65% x̃: 0.67% 95% mean confidence interval for instructions value: -1.88 -1.14 95% mean confidence interval for instructions %-change: -2.48% -0.81% Instructions are helped. total cycles in shared programs: 360944145 -> 360942989 (<.01%) cycles in affected programs: 178167 -> 177011 (-0.65%) helped: 36 HURT: 19 helped stats (abs) min: 1 max: 222 x̄: 38.03 x̃: 5 helped stats (rel) min: 0.01% max: 31.01% x̄: 4.01% x̃: 0.45% HURT stats (abs) min: 1 max: 34 x̄: 11.21 x̃: 6 HURT stats (rel) min: 0.03% max: 2.74% x̄: 0.72% x̃: 0.50% 95% mean confidence interval for cycles value: -36.01 -6.02 95% mean confidence interval for cycles %-change: -4.18% -0.57% Cycles are helped. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-14 11:25:03 -07:00
Ian Romanick	7b4ff6a1af	nir: Mark ffma as 2src_commutative This doesn't make any real difference now, but future work (not in this series) will add a LOT of ffma patterns. Having to duplicate all of them for ffma(a, b, c) and ffma(b, a, c) is just terrible. No shader-db changes on any Intel platform. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-14 11:25:02 -07:00
Ian Romanick	e049a9c92b	nir: Add support for 2src_commutative ops that have 3 sources v2: Instead of handling 3 sources as a special case, generalize with loops to N sources. Suggested by Jason. v3: Further generalize by only checking that number of sources is >= 2. Suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-14 11:25:02 -07:00
Ian Romanick	ede45bf9cf	nir: Rename commutative to 2src_commutative The meaning of the new name is that the first two sources are commutative. Since this is only currently applied to two-source operations, there is no change. A future change will mark ffma as 2src_commutative. It is also possible that future work will add 3src_commutative for opcodes like fmin3. v2: s/commutative_2src/2src_commutative/g. I had originally considered this, but I discarded it because I didn't want to deal with identifiers that (should) start with 2. Jason suggested it in review, so we decided that _2src_commutative would be used in nir_opcodes.py. Also add some comments documenting what 2src_commutative means. Also suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-14 11:25:02 -07:00
Jason Ekstrand	e99081e76d	intel/fs/ra: Spill without destroying the interference graph Instead of re-building the interference graph every time we spill, we modify it in place so we can avoid recalculating liveness and the whole O(n^2) interference graph building process. We make a simplifying assumption in order to do so which is that all spill/fill temporary registers live for the entire duration of the instruction around which we're spilling. This isn't quite true because a spill into the source of an instruction doesn't need to interfere with its destination, for instance. Not re-calculating liveness also means that we aren't adjusting spill costs based on the new liveness. The combination of these things results in a bit of churn in spilling. It takes a large cut out of the run-time of shader-db on my laptop. Shader-db results on Kaby Lake: total instructions in shared programs: 15311224 -> 15311360 (<.01%) instructions in affected programs: 77027 -> 77163 (0.18%) helped: 11 HURT: 18 total cycles in shared programs: 355544739 -> 355830749 (0.08%) cycles in affected programs: 203273745 -> 203559755 (0.14%) helped: 234 HURT: 190 total spills in shared programs: 12049 -> 12042 (-0.06%) spills in affected programs: 2465 -> 2458 (-0.28%) helped: 9 HURT: 16 total fills in shared programs: 25112 -> 25165 (0.21%) fills in affected programs: 6819 -> 6872 (0.78%) helped: 11 HURT: 16 Total CPU time (seconds): 2469.68 -> 2360.22 (-4.43%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	147665d0a2	intel/fs/ra: Put the VGRFs at the end of the nodes This is slightly less convenient in some places but it will make it much easier when we want to start adding nodes dynamically. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	e7b7d572b3	intel/fs/ra: Re-arrange interference setup The old code was arranged by the type of interference being added. It would set up payload registers and then add payload interference for all VGRFs. It would set up MRFs and add MRF interference for all VGRFs. This commit re-arranges things to be organized differently. It first creates and sets up all RA nodes and then groups interference into two new categories: live range and instruction interference. Once all the RA nodes have been set up, it walks the list of VGRFs and sets up their live range interference and then walks the list of instructions and sets up instruction interference. This new arrangement will be advantageous for a future patch but, at the moment, it cuts 2% off the run-time of shader-db on my laptop. Shader-db results on Kaby Lake: total instructions in shared programs: 15311224 -> 15311224 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 355544739 -> 355544739 (0.00%) cycles in affected programs: 0 -> 0 helped: 0 HURT: 0 Total CPU time (seconds): 2523.45 -> 2469.68 (-2.13%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	0fd60e95fb	intel/fs/ra: Do the spill loop inside RA Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	47b1dcdcab	intel/fs/ra: Only add MRF hack interference if we're spilling The only use of the MRF hack these days is for spilling and there we don't need the precise MRF usage information. If we're spilling then we know pretty well how many MRFs are going to be used. It is possible if the only things that are spilled have fewer SIMD channels than the dispatch width of the shader that this may be more MRFs than needed. That's a risk we're willing to take. Shader-db results on Kaby Lake: total instructions in shared programs: 15311100 -> 15311224 (<.01%) instructions in affected programs: 16664 -> 16788 (0.74%) helped: 1 HURT: 5 total cycles in shared programs: 355543197 -> 355544739 (<.01%) cycles in affected programs: 731864 -> 733406 (0.21%) helped: 3 HURT: 6 The hurt shaders are all SIMD32 compute shaders where we reserve enough space for a 32-wide spill/fill but don't need it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	69878a9bb0	intel/fs/ra: Pull the guts of RA into its own class This accomplishes two things. First, it makes interfaces which are really private to RA private to RA. Second, it gives us a place to store some common stuff as we go through the algorithm. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	9e00a251be	intel/fs/ra: Move assign_regs further down in the file It's the main function from which all the other functions are called. It belongs at the bottom. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	5d9ac57c8c	intel/fs/ra: Split building the interference graph into a helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	472ef2f98d	intel/fs/ra: Initialize grf_used with first_non_payload_grf There's no reason why we need to use the calculated payload_node_count value which is just first_non_payload_grf aligned up. The grf_used value will be aligned up to 16 anyway (which is a much bigger alignment) before being handed off to hardware. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	096ad8a809	intel/fs/ra: Stop adding RA interference to too many SENDS nodes We only have one node per VGRF so this was adding way too much interference. No idea how we didn't catch this before. Shader-db results on Kaby Lake: total instructions in shared programs: 15311100 -> 15311100 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 355468050 -> 355543197 (0.02%) cycles in affected programs: 2472492 -> 2547639 (3.04%) helped: 17 HURT: 20 Fixes: `014edff0d2` "intel/fs: Add interference between SENDS sources" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	5911abd76f	util/ra: Assert nodes are in-bounds in add_node_interference Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	88cac12230	intel/fs/ra: Only add dest interference to sources that exist Fixes: `83dedb6354` "i965: Add src/dst interference for certain" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	e291cd8a7e	util/ra: Don't destroy the graph in ra_allocate() We want to be able to call ra_allocate() and, when it fails, mutate the graph and try again rather than re-building the graph from scratch. This commit moves all the scratch bits except the final register allocation (which is really an out value not scratch) into sub-structs named "tmp" to make it clear which things are scratch. It also adds bits to the ra_select() initialization loop to initialize things (since we can't trust rzalloc anymore) and copy q_test and forced_reg over. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	9040215f5d	util/ra: Add a helper for resetting a node's interference Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	698bb9b984	util/ra: Add helpers for adding nodes to an interference graph Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	6c0f75c953	util/ralloc: Add helpers for growing zero-initialized memory Unfortunately, we can't quite follow the standard C conventions for these because ralloc doesn't know the sizes of pointers. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
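The reason a zero-initializing grow helper needs extra information can be sketched with the plain C allocator. This is a minimal sketch, not the ralloc API: `grow_zeroed` is a hypothetical name, and the assumption is that the caller must supply the old element count because the allocator cannot report how much of the block was already valid.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in (not a ralloc function): grow an int array and
 * zero only the newly added tail.  The caller passes old_count because
 * realloc() has no way to tell us how many elements were in use. */
static int *grow_zeroed(int *ptr, size_t old_count, size_t new_count)
{
    assert(new_count >= old_count); /* this sketch only handles growth */

    int *grown = realloc(ptr, new_count * sizeof(*grown));
    if (grown == NULL)
        return NULL; /* the original block is still valid on failure */

    /* Existing elements keep their values; the new ones start at zero. */
    memset(grown + old_count, 0, (new_count - old_count) * sizeof(*grown));
    return grown;
}
```

Passing the old count explicitly is what makes the signature differ from `realloc()`, which is presumably the departure from "standard C conventions" the message refers to.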
Jason Ekstrand	6212326941	intel/fs: Stop doing extra RA calls In the last phase of the schedule and RA loop, the RA call is redundant if we spill. Immediately afterwards, we're going to see that we couldn't allocate without spilling and call back into RA and tell it to go ahead and spill. We've known about it for a while but we've always brushed over it on the theory that, if you're going to spill, you'll be calling RA a bunch anyway and what does one extra RA hurt? As it turns out, it hurts more than you'd expect. Because the RA interference graph gets sparser with each spill and the RA algorithm is more efficient on sparser graphs, the RA call that we're duplicating is actually the most expensive call in the RA-and-spill loop. There's another extra RA call we do that's a bit harder to see which this also removes. If we try to compile a shader that isn't the minimum dispatch width and it fails to allocate without spilling we call fail() to set an error but then go ahead and do the first spilling RA pass and only after that's complete do we detect the fail and bail out. By making minimum dispatch widths part of the spill condition, we side-step this problem. Getting rid of these extra spills takes the compile time of a nasty Aztec Ruins shader from about 28 seconds to about 26 seconds on my laptop. It also makes shader-db 1.5% faster. Shader-db results on Kaby Lake: total instructions in shared programs: 15311100 -> 15311100 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 355468050 -> 355468050 (0.00%) cycles in affected programs: 0 -> 0 helped: 0 HURT: 0 Total CPU time (seconds): 2524.31 -> 2486.63 (-1.49%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	41b310e219	util/ra: Improve the performance of ra_simplify The most expensive part of register allocation is the ra_simplify step which is a fixed-point algorithm with a worst-case complexity of O(n^2) which adds the registers to a stack which we then use later to do the actual allocation. This commit uses bit sets and changes the core loop of ra_simplify to first walk 32-node chunks and then walk each chunk. This lets us skip whole 32-node chunks in one go based on bit operations and compute the minimum q value potentially 32x as fast. Of course, the algorithm still has the same fundamental O(n^2) run-time but the constant is now much lower. In the nasty Aztec Ruins compute shader, this shaves a full four seconds off the 30s compile time for a release build of mesa. In a debug build (needed for accurate stack traces), perf says that ra_select takes 20% of runtime before this patch and only 5-6% of runtime after this patch. It also makes shader-db runs faster. Shader-db results on Kaby Lake: total instructions in shared programs: 15311100 -> 15311100 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 355468050 -> 355468050 (0.00%) cycles in affected programs: 0 -> 0 helped: 0 HURT: 0 Total CPU time (seconds): 2602.37 -> 2524.31 (-3.00%) Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
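The chunked walk described in this message can be illustrated with a small sketch. It is not the code from the commit: `min_q_node`, `pending`, and `q` are hypothetical names, and the sketch shows only the general idea of keeping one bit per node so that a 32-node word with no candidates is rejected with a single comparison.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical sketch: among the nodes whose bit is set in `pending`,
 * return the index of the one with the smallest q value, or -1 if no
 * bit is set.  `pending` holds one bit per node, 32 nodes per word. */
static int min_q_node(const uint32_t *pending, const unsigned *q,
                      unsigned node_count)
{
    unsigned words = (node_count + 31) / 32;
    unsigned best_q = UINT_MAX;
    int best = -1;

    for (unsigned w = 0; w < words; w++) {
        uint32_t bits = pending[w];
        if (bits == 0)
            continue; /* skip 32 nodes with one test */

        /* Walk only as far as the highest set bit in this word. */
        for (unsigned b = 0; bits != 0; b++, bits >>= 1) {
            if (!(bits & 1))
                continue;
            unsigned n = w * 32 + b;
            if (n >= node_count)
                break;
            if (best < 0 || q[n] < best_q) {
                best_q = q[n];
                best = (int)n;
            }
        }
    }
    return best;
}
```

The outer loop is still linear in the number of nodes, so the asymptotic cost is unchanged; the saving comes from the constant factor when most words are empty.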
Jason Ekstrand	e1511f1d4c	util/ra: Only update q_total if the reg is not assigned We only use q_total if the reg is not assigned so there's no point in updating it if the reg is assigned. This has no known perf benefit but it will reduce churn in a future commit. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	9d6d1f47e7	util/ra: Only update best_optimistic_node if !progress This shaves about half a second off the 30 second compile time of one of the compute shaders in Aztec ruins. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	de56d3a2d1	util/ra: Make in_stack a bitset in the graph Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Jason Ekstrand	7720ad65ae	util/ra: Get rid of tabs Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-14 12:30:22 -05:00
Chia-I Wu	34810f4237	virgl: clean up virgl_res_needs_flush Add comments and some minor cleanups. v2: document the function Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> (v1) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2019-05-14 17:00:22 +00:00
Chia-I Wu	08241624ad	virgl: comment on a sync issue in transfers Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-14 17:00:22 +00:00
Chia-I Wu	76e45534d2	virgl: PIPE_TRANSFER_READ does not imply flush virgl_res_needs_flush should suffice. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-14 17:00:22 +00:00
Chia-I Wu	9f8521882a	virgl: do not skip readback because of explicit flush Both apps and we (see virgl_buffer_transfer_flush_region) might flush regions that are unmodified. We have to read back for those flushes. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-14 17:00:22 +00:00
Chia-I Wu	be8eeb3b59	virgl: remove unused virgl_transfer_inline_write It currently has no user and is probably incorrect (resource_wait is required in some more cases). Remove it so that we can focus on transfers first. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-14 17:00:22 +00:00
Nanley Chery	e81392868e	iris/resource: Drop redundant checks for aux support Drop some checks that are already done by ISL. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	75a3947af4	iris/resource: Fall back to no aux if creation fails No surface requires an auxiliary surface to operate correctly. Fall back to an uncompressed surface if mesa fails to create and allocate an auxiliary surface. This enables adding more restrictions to ISL without having to update iris. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	1423b78633	i965/miptree: Refactor intel_miptree_supports_ccs_e() Update and rename this function to format_supports_ccs_e() to better match its behavior. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	779bd8d332	i965/miptree: Drop intel_*_supports_hiz() intel_tiling_supports_hiz() and intel_miptree_supports_hiz() duplicate much of the work done by isl_surf_get_hiz_surf(). Replace them with simple expressions. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	29a13eb71d	isl: Add restrictions to isl_surf_get_hiz_surf() Import some restrictions from intel_tiling_supports_hiz() and intel_miptree_supports_hiz(). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	942755bec4	i965/miptree: Drop intel_*_supports_ccs() intel_tiling_supports_ccs() and intel_miptree_supports_ccs() duplicate much of the work done by isl_surf_get_ccs_surf(). Drop them both and index a boolean array to choose CCS_D in intel_miptree_choose_aux_usage(). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	d57242190e	isl: Add restriction and comments to isl_surf_get_ccs_surf() Import some restrictions and comments from intel_miptree_supports_ccs(). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	91a42537d1	i965/miptree: Drop intel_miptree_supports_mcs() This function duplicates much of the work done by isl_surf_get_mcs_surf(). Replace it with a simple expression. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	1de089797c	isl: Modify restrictions in isl_surf_get_mcs_surf() Import some restrictions from intel_miptree_supports_mcs() and don't assume that the caller knows which device generations are supported. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Nanley Chery	cf758c4182	i965/miptree: Fall back to no aux if creation fails No surface requires an auxiliary surface to operate correctly. Fall back to an uncompressed surface if mesa fails to create and allocate an auxiliary surface. This enables adding more restrictions to ISL without having to update i965. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-14 16:23:12 +00:00
Mathias Fröhlich	fc455797c1	mesa: Set _NEW_VARYING_VP_INPUTS iff varying_vp_inputs are set. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-14 18:09:49 +02:00
Mathias Fröhlich	b4b1df5a17	mesa: Avoid setting _NEW_VARYING_VP_INPUTS in non fixed function mode. Instead of checking the API variant on entry of set_varying_vp_inputs to check if we can ever be interested in fixed function processing or not, we can check if we are actually fixed function processing. To check this we can use the immediately updated gl_context::VertexProgram._VPMode value that tells us if we have a user provided shader program or if we are in fixed function processing either through an internal TNL shader or directly through hardware. When doing so, we also need to recheck the varying_vp_inputs variable at the time gl_context::VertexProgram._VPMode is set to VP_MODE_FF. Put asserts at the consumers of gl_context::varying_vp_inputs to make sure gl_context::VertexProgram._VPMode is set to VP_MODE_FF. By that gl_context::varying_vp_inputs should be up to date then. By not looking at the opengl api for this decision we should actually catch more cases where we can avoid setting a state change flag, including the ones where we cannot get into VP_MODE_FF by the choice of the api. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-14 18:09:49 +02:00
Mathias Fröhlich	663f93c869	mesa: Fix test for setting the _NEW_VARYING_VP_INPUTS flag. The precondition stated in the comment is not true. The values mentioned are only set from _mesa_update_state which in turn may not yet be called. For now set the _NEW_VARYING_VP_INPUTS flag a bit more often, we will narrow that down to a minimum again in a later patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-14 18:09:49 +02:00
Mathias Fröhlich	df50af19d3	mesa: Make _mesa_set_varying_vp_inputs static in state.c. Is no longer used outside that file. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-14 18:09:49 +02:00
Mathias Fröhlich	99952579f3	mesa: Fix old outdated variable name in a comment. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-14 18:09:49 +02:00
Mathias Fröhlich	e634ba5116	mesa/vbo: Update Comment to what is actually happening. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2019-05-14 18:09:49 +02:00
Jonas Ådahl	903ad59407	wayland/egl: Ensure correct buffer size when allocating Whenever a buffer is allocated, e.g. by the first draw call or EGL call after a buffer swap, make sure the size is up to date. Prior to this commit, we failed to do so when querying the buffer age, or swapping buffers without any prior EGL call or draw call. Signed-off-by: Jonas Ådahl <jadahl@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-14 15:33:35 +00:00
Paulo Zanoni	73055ae1c9	egl: check if a window/pixmap is already used on surface creation The spec says we can't create another surface if we already created a surface with the given window or pixmap. Implement this check. This behavior is exercised by piglit/egl-create-surface. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-05-14 12:41:14 +00:00
Paulo Zanoni	04ecda3b3c	egl: store the native surface pointer in struct _egl_surface Each platform stores this in a different place: - platform_drm uses dri2_surf->gbm_surf->base - platform_android uses dri2_surf->window - platform_wayland uses dri2_surf->wl_win - platform_x11 uses dri2_surf->drawable - platform_x11_dri3 uses dri3_surf->loader_drawable.drawable - haiku doesn't even store it! We need access to the native surface since the specification asks us to refuse creating a new surface if there's already an EGLSurface associated with native_surface. An alternative to this patch would be to create a new API.GetNativeWindow callback that each platform would have to implement. While that's something we can definitely do, I prefer this approach. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-05-14 12:41:14 +00:00
Samuel Pitoiset	9520e7c1e9	radv: add support for VK_KHR_uniform_buffer_standard_layout Nothing to do. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-14 09:15:28 +02:00
Gert Wollny	865b9ddae4	softpipe/buffer: load only as many components as the buffer resource type provides Otherwise we risk reading past the end of the buffer. In addition, change the loop counters to unsigned to be consistent with the types. Fixes: `afa8707ba9` softpipe: add SSBO/shader atomics support. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-05-14 06:49:43 +00:00
Tomeu Vizoso	1050273094	panfrost: ci: Reduce batch size to 3000 As with the previous value of 5000 we seemed to be reaching OOM in some circumstances. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-14 07:43:11 +02:00
Tomeu Vizoso	9beb8aedeb	panfrost: ci: Update expectations Since last Friday, these two tests have been fixed: dEQP-GLES2.functional.shaders.functions.control_flow.return_in_nested_loop_fragment dEQP-GLES2.functional.shaders.linkage.varying_7 Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-14 07:43:06 +02:00
Eric Anholt	db329260bf	freedreno: Fix warning on printing a uint64_t using %llx. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-13 15:37:01 -07:00
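One common way to avoid this class of warning is to let `<inttypes.h>` pick the conversion specifier. The sketch below is illustrative only and does not reproduce the freedreno change; `format_hex64` is a hypothetical helper.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format a uint64_t as hex without assuming that
 * uint64_t is `unsigned long long`.  PRIx64 expands to the specifier
 * that matches uint64_t on the target platform. */
static int format_hex64(char *buf, size_t len, uint64_t value)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "0x%" PRIx64, value);
}
```

For a value whose type really is `unsigned long long`, keeping `%llx` is correct; for a `uint64_t`, an explicit cast to `unsigned long long` is a portable alternative to `PRIx64`.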
Eric Anholt	40dd28acc3	freedreno: Silence compiler warnings about "*" in boolean context. It sure looks like we just want both of them to be nonzero, and && is probably going to be cheaper than * anyway. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-13 15:37:01 -07:00
Eric Anholt	06168d3f6a	freedreno: Silence compiler warnings about uninit 'layers' My gcc can't see that the uninitialized value from the PIPE_BUFFER case isn't used from the !PIPE_BUFFER cases later. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-13 15:37:01 -07:00
Eric Anholt	c49f0159bd	freedreno: Quiet compiler warnings on 64-bit. __u64 is a ulonglong on x86_64, not uint64_t, so my gcc was complaining about the wrong type being passed in. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-13 15:37:01 -07:00
Eric Anholt	0734905d9a	freedreno: Make emacs indent the way robclark's eclipse does. The .editorconfig helps with the tabs, but we've got this two-tabs-from-previous-indentation line continuation style that requires whacking the c-file-offsets. This will throw emacs warnings when first opening a file in the directory, press '!' to shut it up for the future. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-13 15:37:01 -07:00
Eric Anholt	257999d9a8	freedreno: Make .editorconfig match .dir-locals.el. The editorconfig takes precedence over dir-locals in emacs26 with editorconfig enabled, so the /.editorconfig was affecting these directories. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-05-13 15:37:01 -07:00
Jason Ekstrand	0745d4bd96	anv: Implement VK_KHR_uniform_buffer_standard_layout There's no real work to do here since we already support scalar block layout which is a direct superset of what this extension allows. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-13 17:20:33 -05:00
Jason Ekstrand	b464504777	vulkan: Update the XML and headers to 1.1.108 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-13 17:20:33 -05:00
Jason Ekstrand	072227da0a	tu/entrypoints: Import copy It's used without being imported	2019-05-13 17:20:33 -05:00
Karol Herbst	fc800af83b	nv50/ir/nir: make use of SYSTEM_VALUE_MAX when iterating read sysvals Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Pierre Moreau <dev@pmoreau.org>	2019-05-13 23:40:40 +02:00
Karol Herbst	358e52383c	nv50/ir/nir: prefer to shift 1ull instead of 1ll Signed-off-by: Karol Herbst <kherbst@redhat.com> Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Pierre Moreau <dev@pmoreau.org>	2019-05-13 23:40:40 +02:00
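The preference for an unsigned literal can be shown with a short sketch in C. `bit_mask` is a hypothetical helper, not code from the nv50 backend: in C, left-shifting the signed `1ll` by 63 produces a value that `long long` cannot represent, which is undefined behavior, whereas `1ull << 63` is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a single-bit 64-bit mask.  Using 1ull
 * keeps the shift in unsigned arithmetic, so bit 63 is safe to set. */
static uint64_t bit_mask(unsigned bit)
{
    assert(bit < 64); /* shifting by >= the type width is undefined */
    return 1ull << bit;
}
```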
Bas Nieuwenhuizen	1619f20883	radv: Clean up signalled and submitted fields from winsys fences. Other types like syncobj do not need it, so let's make things a bit more uniform. Also reduce confusion about what the signalled/submitted referred to (especially with imported fences) Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-05-13 20:36:29 +00:00
Samuel Pitoiset	5555db103e	radv: bump reported version to 1.1.107 VK_AMD_draw_indirect_count has been promoted with the suffix changed to KHR. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-13 21:38:01 +02:00
Eric Anholt	60a64f028d	v3d: Use driconf to expose non-MSAA texture limits for Xorg. The V3D 4.2 HW has a limit to MSAA texture sizes of 4096. With non-MSAA, we can go up to 7680 (actually probably 8138, but that hasn't been validated by the HW team). Exposing 7680 in X11 will allow dual 4k displays.	2019-05-13 12:03:11 -07:00
Eric Anholt	0c31fe9ee7	gallium: Redefine the max texture 2d cap from _LEVELS to _SIZE. The _LEVELS assumes that the max is always power of two. For V3D 4.2, we can support up to 7680 non-power-of-two MSAA textures, which will let X11 support dual 4k displays on newer hardware. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-13 12:03:08 -07:00
Eric Anholt	f33cb272f0	mesa: Replace MaxTextureLevels with MaxTextureSize. In most places (glGetInteger, max_legal_texture_dimensions), we wanted the number of pixels, not the number of levels. Number of levels is easily recovered with util_next_power_of_two() and ffs(). More importantly, for V3D we want to be able to expose a non-power-of-two maximum texture size to cover 2x4k displays on HW that can't quite do 8192 wide. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-13 12:03:05 -07:00
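The recovery of a level count from a maximum size, as mentioned in this message, can be sketched as follows. `next_pow2` is a local stand-in for `util_next_power_of_two()` and `max_levels_from_size` is a hypothetical name; only `ffs()` from POSIX `<strings.h>` is a real library call here.

```c
#include <assert.h>
#include <strings.h> /* ffs() */

/* Local stand-in for util_next_power_of_two(): smallest power of two
 * that is >= x, for x >= 1. */
static unsigned next_pow2(unsigned x)
{
    unsigned p = 1;
    while (p < x)
        p <<= 1;
    return p;
}

/* Hypothetical helper: number of mip levels for a maximum texture
 * size.  ffs() returns the 1-based index of the lowest set bit, so a
 * size of 2^n yields n + 1 levels (2^n, 2^(n-1), ..., 1). */
static int max_levels_from_size(unsigned max_size)
{
    assert(max_size >= 1);
    return ffs((int)next_pow2(max_size));
}
```

Under these assumptions a 4096-pixel limit gives 13 levels, and a non-power-of-two limit of 7680 rounds up to 8192 and gives 14.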
Eric Anholt	ce6dbc0417	mesa: Remove proxy image checks for maximum level. We've already verified this by _mesa_legal_texture_dimensions() before this call. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-13 12:03:03 -07:00
Eric Anholt	d88f3392ff	mesa: Reuse _mesa_max_texture_levels() instead of open-coding it. The shared function has some extension presence checks, but other than that has the same switch statement contents. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-13 12:02:59 -07:00
Vinson Lee	20b42fad9b	intel/tools: Fix build with glibc < 2.27. glibc < 2.27 defines OVERFLOW in /usr/include/math.h. This patch fixes this build error. In file included from ../include/c99_math.h:37:0, from ../src/util/u_math.h:44, from ../src/mesa/main/macros.h:35, from ../src/intel/compiler/brw_reg.h:47, from ../src/intel/tools/i965_asm.h:32, from ../src/intel/tools/i965_gram.y:29: src/intel/tools/i965_gram.tab.c:562:5: error: expected identifier before numeric constant OVERFLOW = 412, ^ Fixes: `70308a5a8a` ("intel/tools: New i965 instruction assembler tool") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110656 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-05-13 11:05:48 -07:00
Marek Olšák	84816d1464	st/mesa: enable the ST_DEBUG env var in release and debugoptimized builds Useful for dumping shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-13 13:01:01 -04:00
Nicolai Hähnle	d814c21b1b	radeonsi: overhaul the vertex fetch fixup mechanism The overall goal is to support unaligned loads from vertex buffers natively on SI. In the unaligned case, we fall back to the general case implementation in ac_build_opencoded_load_format. Since this function is fully general, we will also use it going forward for cases requiring fully manual format conversions of dwords anyway. This requires a different encoding of the fix_fetch array, which will now contain the entire format information if a fixup is required. Having to check the alignment of vertex buffers is awkward. To keep the impact on the fast path minimal, the si_context will keep track of which vertex buffers are (not) at least dword-aligned, while the si_vertex_elements will note which vertex buffers have some (at most dword) alignment requirement. Vertex buffers should be dword-aligned most of the time, which allows a fast early-out in almost all cases. Add the radeonsi_vs_fetch_always_opencode configuration variable for testing purposes. Note that it can only be used reliably on LLVM >= 9, because support for byte and short load is required. v2: - add a missing check to si_bind_vertex_elements Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-13 17:07:23 +02:00
Nicolai Hähnle	8a951c3d2f	radeonsi: store sctx->vertex_elements in a local in si_shader_selector_key_vs Purely as a shorthand in the remainder of the function. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-13 17:07:23 +02:00
Nicolai Hähnle	81fe33735a	amd/common: add ac_build_opencoded_fetch_format Implement software emulation of buffer_load_format for all types required by vertex buffer fetches. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-13 17:07:23 +02:00
Jason Ekstrand	712f99934c	nir/validate: Use a single set for SSA def validation The current SSA def validation we do in nir_validate validates three things: 1. That each SSA def is only ever used in the function in which it is defined. 2. That an nir_src exists in an SSA def's use list if and only if it points to that SSA def. 3. That each nir_src is in the correct use list (uses or if_uses) based on whether it's an if condition or not. The way we were doing this before was that we had a hash table which provided a map from SSA def to a small ssa_def_validate_state data structure which contained a pointer to the nir_function_impl and two hash sets, one for each use list. This meant piles of allocation and creating of little hash sets. It also meant one hash lookup for each SSA def plus one per use as well as two per src (because we have to look up the ssa_def_validate_state and then look up the use.) It also involved a second walk over the instructions as a post-validate step. This commit changes us to use a single low-collision hash set of SSA sources for all of this by being a bit more clever. We accomplish the objectives above as follows: 1. The list is clear when we start validating a function. If the nir_src references an SSA def which is defined in a different function, it simply won't be in the set. 2. When validating the SSA defs, we walk the uses and verify that they have is_ssa set and that the SSA def points to the SSA def we're validating. This catches the case of a nir_src being in the wrong list. We then put the nir_src in the set and, when we validate the nir_src, we assert that it's in the set. This takes care of any cases where a nir_src isn't in the use list. After checking that the nir_src is in the set, we remove it from the set and, at the end of nir_function_impl validation, we assert that the set is empty. This takes care of any cases where a nir_src is in a use list but the instruction is no longer in the shader. 3. When we put a nir_src in the set, we set the bottom bit of the pointer to 1 if it's the condition of an if. This lets us detect whether or not a nir_src is in the right list. When running shader-db with an optimized debug build of mesa on my laptop, I get the following shader-db CPU times: With NIR_VALIDATE=0 3033.34 seconds Before this commit 20224.83 seconds After this commit 6255.50 seconds Assuming shader-db is a representative sampling of GLSL shaders, this means that making this change yields an 81% reduction in the time spent in nir_validate. It still isn't cheap but enabling validation now only increases compile times by 2x instead of 6.6x. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2019-05-13 14:43:47 +00:00
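The "bottom bit of the pointer" trick from step 3 of the commit above can be sketched in C as follows. This is only an illustration under assumptions: the helper names (`tag_src`, `tagged_is_if_condition`, `untag_src`) are hypothetical and are not taken from nir_validate, and the sketch relies on the tagged object being at least 2-byte aligned so that its low address bit is always zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers; the actual nir_validate code may look different. */

/* Fold the "is an if condition" flag into the low bit of the pointer.
 * Assumes the pointed-to object is at least 2-byte aligned. */
static inline const void *tag_src(const void *src, bool is_if_condition)
{
   assert(((uintptr_t)src & 1) == 0);
   return (const void *)((uintptr_t)src | (is_if_condition ? 1u : 0u));
}

/* Read the flag back from a tagged pointer. */
static inline bool tagged_is_if_condition(const void *tagged)
{
   return ((uintptr_t)tagged & 1) != 0;
}

/* Recover the original pointer by clearing the low bit. */
static inline const void *untag_src(const void *tagged)
{
   return (const void *)((uintptr_t)tagged & ~(uintptr_t)1);
}
```

Because the tagged and untagged values differ, a single set keyed on the tagged pointer can tell a source filed under the regular use list apart from the same source filed under the if-condition list.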
Jason Ekstrand	bab08c791d	util/set: Add a helper to resize a set Often times you don't know how big a set will be and you want the code to just grow it as needed. However, sometimes you do know and you can avoid a lot of rehashing if you just specify a size up-front. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2019-05-13 14:43:47 +00:00
Jason Ekstrand	abb450870e	util/set: Add a search_and_add function This function is identical to _mesa_set_add except that it takes an extra out parameter that lets the caller detect if a replacement happened. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2019-05-13 14:43:47 +00:00
Jason Ekstrand	460567eabf	nir/validate: Use a ralloc context for our temporary data All of our hash tables and sets are already using ralloc. There's really no good reason why we don't just make a ralloc context rather than try to remember to clean everything up manually. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2019-05-13 14:43:47 +00:00
Patrick Lerda	6963f59cae	lima: add Allwinner H5 support The H5 hardware variant requires a specific plb_max_blk number. This value can't be probed at the hardware level. Signed-off-by: Patrick Lerda <patrick9876@free.fr> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-05-13 13:32:55 +02:00
Patrick Lerda	38c5a5a8b5	lima: refactor plb_max_blk Move plb_max_blk to lima_screen, and add a new debug option: LIMA_PLB_MAX_BLK Signed-off-by: Patrick Lerda <patrick9876@free.fr> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-05-13 13:32:55 +02:00
Bas Nieuwenhuizen	f53ebfb450	radv: Do not use extra descriptor space for the 3rd plane. While ImageFormatProperties returns the number of internal descriptors, it turns out that applications do not need to actually allocate more descriptors in the descriptor pool. So if we make descriptors with more planes larger we have to be conservative and always allocate space for the larger descriptors which is a waste given the low usage of this ext. So let us make use of the fact that 3plane formats all have the same formats & dimensions for the last two planes. This way we only need the first half of the descriptor of the 3rd plane and can share the second half of the second plane. This allows us to use 16 bytes for the descriptor which nicely fits into the 16 bytes that are unused right next to the sampler. Fixes: `5564c38212` "radv: Update descriptor sets for multiple planes." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-12 23:02:44 +00:00
Bas Nieuwenhuizen	d6dfb2cf50	radv: Add support for icd loader interface v4. Adds support for physical device functions unknown to the loader. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-13 00:41:31 +02:00
Alyssa Rosenzweig	726f0263e1	panfrost/midgard: Handle csel correctly We use an algebraic pass for the csel optimizations, and use proper vectorized csel ops (i/fcsel_v) for mixed, rather than lowering. To avoid regressions along the way, we fix an issue with the copy propagation pass (it should not attempt to propagate constants). Similarly, we take care to break bundles when using csel to fix some scheduler corner cases. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-12 22:21:49 +00:00
Illia Iorin	a35269cf44	iris: Implement ARB_indirect_parameters iris_draw_vbo is divided into two functions to remove unnecessary operations from the loop. This implementation of ARB_indirect_parameters takes into account NV_conditional_render by saving MI_PREDICATE_RESULT at the start of a draw call and restoring it at the end. The result of NV_conditional_render is also taken into account when computing predicates that limit draw calls for ARB_indirect_parameters in a similar way to `1952fd8d` in ANV. v2: Optimize indirect draws (suggested by Kenneth Graunke) v3: (by Kenneth Graunke) - Fix an issue where indirect draws wouldn't set patch information before updating the compiled TCS. - Move some code back to iris_draw_vbo to avoid duplicating it. - Fix minor indentation issues. Signed-off-by: Illia Iorin <illia.iorin@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-11 23:56:52 -07:00
Kenneth Graunke	21a0be4a79	iris: Split iris_update_draw_info into two functions. Shader draw parameters need updating on each iteration of a multidraw loop, but the primitive based information only needs to be updated once. Also, patch information needs to be recorded before filling out the TCS program key, as it determines the number of HS instances.	2019-05-11 23:54:15 -07:00
Ruslan Kabatsayev	974c4d679c	nir: Fix wrong sign in lower_rcp The nested fma calls were supposed to implement x_new = x + x * (1 - x*src), but instead current code is equivalent to x_new = x - x * (1 - x*src). The result is that Newton-Raphson steps don't improve precision at all. This patch fixes this problem. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110435 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-11 09:25:22 -07:00
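The effect of the sign in the Newton-Raphson step for a reciprocal can be checked with a few lines of C. This is a standalone sketch, not the NIR lowering code: it uses plain double arithmetic rather than nested fma calls, and the function names are hypothetical. The intended update is x_new = x + x * (1 - x*src), where x approximates 1/src.

```c
#include <assert.h>

/* One Newton-Raphson refinement of x ~= 1/src:
 * x_new = x + x * (1 - x*src). */
static double rcp_refine(double x, double src)
{
   double err = 1.0 - x * src;   /* residual, zero when x == 1/src */
   return x + x * err;
}

/* The wrong-sign variant described in the commit message:
 * x_new = x - x * (1 - x*src). It steps away from 1/src. */
static double rcp_refine_wrong_sign(double x, double src)
{
   double err = 1.0 - x * src;
   return x - x * err;
}

/* Absolute distance between an estimate and the true reciprocal. */
static double rcp_error(double x, double src)
{
   double d = x - 1.0 / src;
   return d < 0.0 ? -d : d;
}
```

For example, with src = 3 and a starting estimate of 0.3, the correct step gives 0.33 (closer to 1/3), while the wrong-sign step gives 0.27 (further away).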
Mike Blumenkrantz	7b2468bf6e	intel: drop misleading driver name from gen_get_device_info()	2019-05-11 04:14:06 +00:00
Józef Kucia	24af0f1318	radv: clear vertex bindings while resetting command buffer Only vertex inputs accessed by vertex shader must have valid buffers bound. Signed-off-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `5010436e09` "radv: bail out when binding the same vertex buffers"	2019-05-11 02:51:00 +02:00
Marek Olšák	83435e748f	st/mesa: fix 2 crashes in st_tgsi_lower_yuv src/mesa/state_tracker/st_tgsi_lower_yuv.c:68: void reg_dst(struct tgsi_full_dst_register *, const struct tgsi_full_dst_register *, unsigned int): assertion "dst->Register.WriteMask" failed The second crash was due to insufficient allocated size for TGSI instructions. Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-05-10 20:51:16 -04:00
Kenneth Graunke	72ccefb529	iris: Use full ways for L3 cache setup on Icelake. Anuj fixed this in i965 and anv, but the fix never landed in iris. Fixes tessellation corruption on Icelake. Thanks to Rafael for bisecting this and tracking it down. Fixes: `d0996d5fab` iris: Emit default L3 config for the render pipeline Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-05-10 16:50:14 -07:00
Caio Marcelo de Oliveira Filho	3610081daa	anv: Fix limits when VK_EXT_descriptor_indexing is used Update various limits in VkPhysicalDeviceDescriptorIndexingPropertiesEXT that were previously zero to their values from VkPhysicalDeviceLimits. When using VK_EXT_descriptor_indexing, the former limits will apply to all the descriptor layout sets -- not only those using the new feature bits. For the reference, VK_EXT_descriptor_indexing says "There are new descriptor set layout and descriptor pool creation flags that are required to opt in to the update-after-bind functionality, and there are separate maxPerStage* and maxDescriptorSet* limits that apply to these descriptor set layouts which may be much higher than the pre-existing limits. The old limits only count descriptors in non-updateAfterBind descriptor set layouts, and the new limits count descriptors in all descriptor set layouts in the pipeline layout." Fixes: `6e230d7607` "anv: Implement VK_EXT_descriptor_indexing" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-10 15:15:11 -07:00
Lionel Landwerlin	ad2b4aa378	vulkan/overlay: keep allocating draw data until it can be reused The original implementation assumed that we could allocate the same amount of command buffers as the number of images in the swapchain. But the application could potentially render much faster and rerender into images that have been submitted for presentation but not yet presented. This change keeps on allocating command buffers, vertex buffer, vertex indices as well as a semaphore and a fence for as long as we can't reuse a previously submitted one. This fixes rendering issues in the overlay at high frame rates. v2: Don't recreate semaphores constantly (Józef) v3: Drop useless surface & FreeCommandBuffers (Józef) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110655 Cc: 19.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Józef Kucia <joseph.kucia@gmail.com>	2019-05-10 21:54:48 +01:00
Lionel Landwerlin	877b371cbb	vulkan/overlay: fix truncating error on 32bit platforms Non dispatchable handles can be uint64_t. When compiling the layer on a 32bit platform, this will lead to casting uint64_t into (void *) which is 32bit, leading to incorrect handles being mapped internally in the layer. v2: Use more HKEY() (Eric) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Józef Kucia <joseph.kucia@gmail.com> Fixes: `2d2927938f` ("vulkan/overlay-layer: fix cast errors") Reviewed-by: Józef Kucia <joseph.kucia@gmail.com>	2019-05-10 21:54:48 +01:00
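The truncation described in the commit above can be modelled without a 32-bit build. The sketch below is an assumption-laden illustration, not the overlay layer's code: `key_via_32bit_pointer` stands in for what a cast to a 32-bit `(void *)` would keep, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models using a 64-bit non-dispatchable handle as a map key after
 * squeezing it through a 32-bit pointer: the upper 32 bits are lost. */
static uint64_t key_via_32bit_pointer(uint64_t handle)
{
   uint32_t as_pointer = (uint32_t)handle;
   return as_pointer;
}

/* Keeping the key as a full 64-bit integer preserves every bit. */
static uint64_t key_as_u64(uint64_t handle)
{
   return handle;
}
```

Two handles that differ only in their upper 32 bits collapse to the same key in the first function, which is how distinct objects could end up mapped to the same internal entry.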
Kenneth Graunke	3f60810de0	i965: Fix memory leaks in brw_upload_cs_work_groups_surface(). This was taking a reference to the 64kB upload buffer and never returning it, leaking a reference each time this atom triggered. This leaked lots of 64kB upload BOs, eventually running us out of VMA space. This would usually happen when using mpv to watch a movie, after 20-40 minutes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110134 Fixes: `63d7b33f51` i965/cs: Setup surface binding for gl_NumWorkGroups Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-10 12:50:19 -07:00
Julien Isorce	98b852cd07	st/va: set the visible image dimensions in vlVaDeriveImage This fixes video being rendered incorrectly. User wants height of 360 but internally pipe_video_buffer 's height is 368 in the test below. Test: GST_GL_PLATFORM=egl gst-launch-1.0 videotestsrc ! video/x-raw, width=868, height=360, format=NV12 ! vaapipostproc ! glimagesink Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443 Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2019-05-10 17:13:31 +00:00
Alyssa Rosenzweig	292187afcc	swrast: Rename blend_func->swrast_blend_func This avoids a conflict with the new (driver-agnostic) blend_func enum in shader_enum.h, which broke the build of swrast (and i965 by extension). My apologies :( Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Fixes: `f41be53a` ("compiler: Add enums for blend state") Cc: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-10 09:34:55 -07:00
Eric Engestrom	6e5728e5c9	travis: fix syntax, and drop unused stuff Fixes: `a988d95389` "ci: Delete autotools build jobs" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-10 17:26:53 +01:00
Alyssa Rosenzweig	006cafc243	nir: Add blend_const_color_rgba sysval This represents a float vec4 constant color, as passed to glBlendColor. While the existing 4 shader sysvals are retained to minimize code churn, a single vectorized intrinsic is required for efficient blending on vector architectures. (This may also apply to architectures like Bifrost where ALU is scalar but load/store is vector; it largely depends on how blending is implemented per-driver.) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-10 15:49:28 +00:00
Alyssa Rosenzweig	6b0472b181	gallium: Add helper to convert PIPE blending to shader_enum style Complementing the new API-agnostic shader_enum blending style, we add helpers to translate between the two forms. Ideally, we could just use PIPE blending directly, but that makes Vulkan support challenging. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-10 15:49:16 +00:00
Alyssa Rosenzweig	f41be53a17	compiler: Add enums for blend state We add enums corresponding to (GLES) blend state to shader_enums.h, complementing the existing advanced blending enums in the file. This allows us to represent blending state in a driver-agnostic, API-agnostic way to permit lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-05-10 15:49:01 +00:00
Jonathan Marek	d0bff89159	nir: allow specifying a set of opcodes in lower_alu_to_scalar This can be used by both etnaviv and freedreno/a2xx as they are both vec4 architectures with some instructions being scalar-only. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-05-10 15:10:41 +00:00
Jason Ekstrand	f8bda81887	intel/fs/copy-prop: Don't walk all the ACPs for each instruction In order to set up KILL sets, the dataflow code was walking the entire array of ACPs for every instruction. If you assume the number of ACPs increases roughly with the number of instructions, this is O(n^2). As it turns out, regions_overlap() is not nearly as cheap as one would like and shows up as a significant chunk on perf traces. This commit changes things around and instead first builds an array of exec_lists which it uses like a hash table (keyed off ACP source or destination) similar to what's done in the rest of the copy-prop code. By first walking the list of ACPs and populating the table and then walking instructions and only looking at ACPs which probably have the same VGRF number, we can reduce the complexity to O(n). This takes the execution time of the piglit vs-isnan-dvec test from about 56.4 seconds on an unoptimized debug build (what we run in CI) with NIR_VALIDATE=0 to about 38.7 seconds. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-10 09:10:17 -05:00
Jason Ekstrand	20bbc175a4	intel/fs/copy-prop: Purge unused ACPs If the destination of an ACP entry exists only within this block, then there's no need to keep it for dataflow analysis. We can delete it from the out_acp table and avoid growing the bitsets any bigger than we absolutely have to. This reduces the maximum number of global ACP entries in the vs-isnan-dvec with software fp64 on Kaby Lake from 8630 to 3942 and takes the execution time of the piglit vs-isnan-dvec test from about 1:16.2 on an unoptimized debug build (what we run in CI) with NIR_VALIDATE=0 to about 56.4 seconds. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-10 09:10:17 -05:00
Jason Ekstrand	0b6da5bac6	intel/fs/copy-prop: Bump the hash table size to 64 While the number of ACPs is generally not huge compared to the number of blocks, 16 does seem a bit small. Bumping it to 64 takes the execution time of the piglit vs-isnan-dvec test from about 1:18.1 on an unoptimized debug build (what we run in CI) with NIR_VALIDATE=0 to about 1:16.2. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-10 09:10:17 -05:00
Leo Liu	ceba9ff294	winsys/amdgpu: add VCN JPEG to no user fence group There is no user fence for JPEG, the bug triggering kernel WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT) Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: mesa-stable@lists.freedesktop.org	2019-05-10 08:24:49 -04:00
Qiang Yu	e2fc0c4a0c	lima: fix width 4096 resolution GP fail When width=4096 and shift_w=0, block_w=0x100 which overflow the PLBU_CMD 8 bits for it. Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Qiang Yu <yuq825@gmail.com>	2019-05-10 16:07:40 +08:00
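The overflow in the commit above is simple arithmetic: 0x100 needs nine bits, so an 8-bit field stores it as 0. The sketch below reproduces the numbers under one labeled assumption, namely that the width is counted in 16-pixel tiles before being shifted by shift_w; the helper names are hypothetical and this is not the lima driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the framebuffer width is measured in 16-pixel-wide tiles,
 * then scaled down by shift_w to get the block width. */
static uint32_t block_w_for(uint32_t width, uint32_t shift_w)
{
   uint32_t tiles = (width + 15) / 16;
   return tiles >> shift_w;
}

/* An 8-bit command field can hold values 0..255 only. */
static bool fits_in_8_bits(uint32_t value)
{
   return value <= 0xff;
}
```

With width = 4096 and shift_w = 0 this yields 0x100, which does not fit; with shift_w = 1 it yields 0x80, which does.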
Tomeu Vizoso	1b97d9c180	panfrost: Add CAPFs for conservative rasterization Just do what everybody else but Nouveau does and return 0.0f. This prevents the repeated logging of these messages on startup: Unexpected PIPE_CAPF 6 query Unexpected PIPE_CAPF 7 query Unexpected PIPE_CAPF 8 query Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-10 07:40:52 +02:00
Tomeu Vizoso	c3538ab570	panfrost: Only take the fast paths on buffers aligned to block size As the functions operate on 16-byte blocks. Fixes this Valgrind error: Invalid read of size 4 at 0x5857568: swizzle_bpp1_align16 (pan_swizzle.c:85) by 0x585780F: panfrost_texture_swizzle (pan_swizzle.c:171) by 0x584F587: panfrost_tile_texture (pan_resource.c:489) by 0x584F641: panfrost_transfer_unmap (pan_resource.c:525) by 0x587718D: u_transfer_helper_transfer_unmap (u_transfer_helper.c:516) by 0x5875D85: pipe_transfer_unmap (u_inlines.h:515) by 0x5875F13: u_default_texture_subdata (u_transfer.c:80) by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480) by 0x54005BB: st_TexImage (st_cb_texture.c:1709) by 0x5391353: teximage (teximage.c:3105) by 0x5391353: teximage_err (teximage.c:3132) by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170) by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833) Address 0x1e94f1e8 is 0 bytes after a block of size 16 alloc'd at 0x483F5C8: malloc (vg_replace_malloc.c:299) by 0x584F47D: panfrost_transfer_map (pan_resource.c:467) by 0x587694D: u_transfer_helper_transfer_map (u_transfer_helper.c:243) by 0x5875EA7: u_default_texture_subdata (u_transfer.c:59) by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480) by 0x54005BB: st_TexImage (st_cb_texture.c:1709) by 0x5391353: teximage (teximage.c:3105) by 0x5391353: teximage_err (teximage.c:3132) by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170) by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833) by 0x4DA8AB: glu::CallLogWrapper::glTexImage2D(unsigned int, int, int, int, int, int, unsigned int, unsigned int, void const*) (in /home/tomeu/deqp-build/modules/gles2/deqp-gles2) Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: 19.1 <mesa-stable@lists.freedesktop.org>	2019-05-10 07:39:39 +02:00
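The general pattern behind the fix above, taking a block-based fast path only when the size is a multiple of the block size, might look like the sketch below. It is a generic illustration, not the panfrost swizzle code: `copy_row` and `BLOCK_SIZE` are hypothetical names, and a plain copy stands in for the real per-block work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16  /* the commit says the fast-path functions work on 16-byte blocks */

/* Copy len bytes. The fast path touches whole 16-byte blocks, so it is
 * only safe when len is a multiple of BLOCK_SIZE; otherwise fall back
 * to a byte-wise loop that never reads or writes past len. */
static void copy_row(uint8_t *dst, const uint8_t *src, size_t len)
{
   if (len % BLOCK_SIZE == 0) {
      for (size_t i = 0; i < len; i += BLOCK_SIZE)
         memcpy(dst + i, src + i, BLOCK_SIZE);
   } else {
      for (size_t i = 0; i < len; i++)
         dst[i] = src[i];
   }
}
```

Without the alignment check, a 20-byte row handled in 16-byte blocks would touch 32 bytes, which is the kind of out-of-bounds access Valgrind reports in the trace above.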
Tomeu Vizoso	554975bafa	panfrost: Fix two uninitialized accesses in compiler Valgrind was complaining of those. NIR_PASS only sets progress to TRUE if there was progress. nir_const_load_to_arr() only sets as many constants as the instruction has components. This was causing some dEQP tests to flip-flop, such as: dEQP-GLES2.functional.fragment_ops.blend.equation_src_func_dst_func.add_src_color_constant_color Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Fixes: `14531d676b` ("nir: make nir_const_value scalar")	2019-05-10 07:37:57 +02:00
Tomeu Vizoso	67b9c196d0	panfrost: ci: Skip running some tests These tests add too much time to the total run time, and some of them even hang the DUTs, even if I haven't been able to reproduce it locally. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-10 07:37:47 +02:00
Tomeu Vizoso	a94cf20051	panfrost: ci: Don't restart Weston There don't seem to actually be any noticeable memory leaks on Weston when running dEQP. We do seem to leak quite a bit in the client, so we still have to run the dEQP runner in batches. This removes the risk of Weston not restarting properly and introducing spurious failures. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-10 07:37:30 +02:00
Tomeu Vizoso	0d0823638f	panfrost: ci: Update list of expected failures This matches the current state of things on both RK3288 and RK3399. Hopefully, from now on we'll only remove stuff from this list. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-10 07:37:23 +02:00
Tomeu Vizoso	8a328c725a	panfrost: ci: Tweak dEQP to improve throughput Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-10 07:37:18 +02:00
Tomeu Vizoso	bbed39bbf2	panfrost: ci: Fix list of tests to run Make sure we have only test case names in the list, excluding names of test groups. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-10 07:37:13 +02:00
Tomeu Vizoso	7842fe3a45	panfrost: ci: Check for incomplete runs To improve robustness, check that we got the expected number of results. Right now we hard-code the expected number of tests run, but with some effort we may be able to infer it. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-10 07:37:05 +02:00
Tomeu Vizoso	8e139250aa	panfrost: ci: Add tests to flip-flop list These tests aren't giving reliable results. Mask them for now. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-10 07:37:00 +02:00
Tomeu Vizoso	dab01348d0	panfrost: ci: Add support for running the tests on RK3288 Build artifacts for armhf and schedule them on a Veyron Chromebook with RK3288. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-10 07:32:29 +02:00
Vasily Khoruzhick	e44a4bae52	lima: fix tile buffer reloading Buffer needs to be reloaded every time unless explicit clear() was called. Fixes rendering issues with wayland compositors. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-05-09 21:45:04 -07:00
Caio Marcelo de Oliveira Filho	f7d53fffa2	anv: Remove special allocation for anv_push_constants The key reason for that mechanism is gone: all the extra optional data that could be in the anv_push_constants was moved elsewhere. At this point, just put anv_push_constants directly in anv_cmd_state (part of anv_cmd_buffer). v2: Remove a NULL check we don't need anymore in anv_cmd_buffer_push_constants(). (Lionel) Fix size we consider for valid push params. (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-09 19:01:14 -07:00
Kenneth Graunke	c61862ddfc	iris: Expose PIPE_CAP_DEVICE_RESET_STATUS_QUERY This provides a way for the application to query whether any resets have happened, which lets us expose "robust" contexts. This also enables the KHR_robust_buffer_access_behavior tests.	2019-05-09 16:49:07 -07:00
Kenneth Graunke	343f41781c	iris: Hook up device reset callbacks This mechanism lets the driver inform the state tracker about GPU resets, say for destroying a robust API context and reporting a "device lost" error to the application, making it take action to deal with this.	2019-05-09 16:49:07 -07:00
Kenneth Graunke	c5c12bdd00	iris: Try to recover from GPU hangs. The iris batch module now tries to detect that the kernel has banned our GEM context, creates a new non-banned context, and informs the iris context module that all assumptions about state are now invalid and it needs to reinitialize the relevant state. Based on Chris Wilson's work, but significantly rewritten by me.	2019-05-09 16:49:07 -07:00
Chris Wilson	7402564c07	iris: Add helpers to clone a hardware context. (Chris Wilson wrote this code in a patch titled "i965: Be resilient in the face of GPU hangs"; Ken fixed a bug and copied it to iris.)	2019-05-09 16:49:07 -07:00
Kenneth Graunke	c3701e9070	iris: Mark render batches as non-recoverable. Adapted from Chris Wilson's patch. The comment is largely his. Currently, when iris hangs the GPU, it will continue sending batches which incrementally update the state, assuming it's preserved across batches. However, the kernel's GPU reset support reinitializes the guilty context to the default GPU state (reasonably not wanting to trust the current state). This ends up resetting critical things like STATE_BASE_ADDRESS, causing memory accesses in all subsequent batches to be garbage, and almost certainly result in more hangs until we're banned or we kill the machine. We now ask the kernel to ban our render context immediately, so we notice we've gone off the rails as fast as possible. Eventually, we'll attempt to recover and continue. For now, we just avoid torching the GPU over and over.	2019-05-09 16:49:07 -07:00
Rob Clark	9faf218b8c	freedreno/ir3: fix rasterflat/glxgears Ofc legacy gl features that are broken don't trigger fails in deqp. I should remember to test glxgears more often. Fixes: `7ff6705b8d` freedreno/ir3: convert to "new style" frag inputs Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-09 16:21:05 -07:00
Lionel Landwerlin	f2f6ac1c08	anv: Use corresponding type from the vector allocation We didn't notice this issue much because the 2 structs share a similar layout, except for the additional fields... We run into that issue in Anv : ==15236== Invalid write of size 8 ==15236== at 0x8CF3939C: anv_state_table_expand_range (anv_allocator.c:211) ==15236== by 0x8CF394D5: anv_state_table_grow (anv_allocator.c:264) ==15236== by 0x8CF3967E: anv_state_table_add (anv_allocator.c:312) ==15236== by 0x8CF3B13C: anv_state_pool_alloc_no_vg (anv_allocator.c:1167) ==15236== by 0x8CF3B2B0: anv_state_pool_alloc (anv_allocator.c:1190) ==15236== by 0x8CF60871: alloc_surface_state (anv_image.c:1122) ==15236== by 0x8CF61FF9: anv_CreateImageView (anv_image.c:1519) ==15236== by 0x8BCBD2ED: vkCreateImageView (trampoline.c:1358) ==15236== Address 0x8994ef10 is 0 bytes after a block of size 128 alloc'd ==15236== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==15236== by 0x8D2578E6: u_vector_init (u_vector.c:47) ==15236== by 0x8CF3929A: anv_state_table_init (anv_allocator.c:168) ==15236== by 0x8CF3A99A: anv_state_pool_init (anv_allocator.c:921) ==15236== by 0x8CF56517: anv_CreateDevice (anv_device.c:1909) ==15236== by 0x8BCB4FBA: terminator_CreateDevice (loader.c:6073) ==15236== by 0x8DD2CB3D: ??? (in /home/djdeath/.steam/ubuntu12_64/libVkLayer_steam_fossilize.so) ==15236== by 0x8DF4D241: vkCreateDevice (in /home/djdeath/.steam/ubuntu12_64/steamoverlayvulkanlayer.so) ==15236== by 0x8BCB35C6: loader_create_device_chain (loader.c:5449) ==15236== by 0x8BCBC230: vkCreateDevice (trampoline.c:838) v2: Rename mmap_cleanups to avoid confusion (Caio) v3: s/fail_mmap_cleanups/fail_cleanups/ (Caio) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110648 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-09 21:57:26 +01:00
Dylan Baker	79ad8acd01	docs: update calendar, and news item and link release notes for 19.0.4	2019-05-09 13:48:47 -07:00
Dylan Baker	723f74c270	docs: Add SHA256 sums for mesa 19.0.4	2019-05-09 13:46:30 -07:00
Dylan Baker	ce32b71a8c	Docs: add 19.0.4 release notes	2019-05-09 13:46:26 -07:00
Pierre-Eric Pelloux-Prayer	62ed82ea1a	mesa: fix GL_PROGRAM_BINARY_RETRIEVABLE_HINT handling When first implemented in `fefd03e16c` Mesa's behavior was aligned on behavior of Nvidia's driver. This caused a failing test in piglit but was ok since the specification is unclear on this subject. Nvidia's driver behavior has been modified because using version 410.104, the problematic test (program_binary_retrievable_hint) now passes. This commit defers BinaryRetrievableHint update until the next linking so the test passes on Mesa as well. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-05-09 16:15:20 -04:00
Ian Romanick	1f1007a4ed	nir: Initialize lower_flrp_progress everywhere I don't know why I thought NIR_PASS always set the progress variable. Derp. Fixes: `d41cdef2a5` ("nir: Use the flrp lowering pass instead of nir_opt_algebraic") Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Coverity CID: 1444996 Coverity CID: 1444995 Coverity CID: 1444994 Coverity CID: 1444993 Coverity CID: 1444991 Coverity CID: 1444989	2019-05-09 10:03:51 -07:00
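The pitfall behind this fix (and the similar panfrost fix above) is a pass-runner macro that only ever writes true to its progress variable. The sketch below uses a hypothetical `RUN_PASS` macro and toy passes to show the behaviour described in the commit messages; it is not the real NIR_PASS definition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a pass-runner macro: it sets progress to
 * true when the pass reports progress and otherwise leaves it alone,
 * so the caller must initialize progress to false beforehand. */
#define RUN_PASS(progress, pass, arg) \
   do {                               \
      if (pass(arg))                  \
         (progress) = true;           \
   } while (0)

/* Toy pass that changes nothing and reports no progress. */
static bool pass_no_change(int *value)
{
   (void)value;
   return false;
}

/* Toy pass that modifies its input and reports progress. */
static bool pass_increment(int *value)
{
   (*value)++;
   return true;
}
```

If the progress variable starts out uninitialized and no pass makes progress, it is never written, and any later read of it is undefined behaviour; initializing it to false at the declaration avoids that.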
Eric Engestrom	8b3baa2744	gallium: fix typo in comment Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-09 11:14:37 +01:00
Eric Engestrom	86628ed79f	meson: fix a couple typos in comments Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-09 11:14:37 +01:00
Eric Engestrom	6c6af0c8b0	i965_asm: avoid free()ing uninitialized pointers Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-09 10:03:15 +00:00
Eric Engestrom	51597eca84	i965_asm: fix memleak Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-09 10:03:15 +00:00
Samuel Pitoiset	53dfff1c4d	radv: fix setting the number of rectangles when it's dynamic We need to know the number of rectangles. This fixes new CTS dEQP-VK.draw.discard_rectangles.dynamic_*. Fixes: `5db0bf9994` ("radv: Implement VK_EXT_discard_rectangles.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-09 11:42:25 +02:00
Chris Wilson	8b81256469	iris: Reorganise execbuf to have a single point of failure Propagate the failure from GEM_EXECBUFFER2, cleanup then report failure if need be. We retain the current behaviour to abort() at the first sign of trouble -- for a non-robustness context, arguably this is the right thing to do as the client cannot recover, and the system state is lost. How to properly integrate with KHR_robustness and reset-strategy is left as a future exercise. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-08 17:21:07 -07:00
Chris Wilson	8b7e19dbc5	drm-uapi: Update i915_drm.h for I915_CONTEXT_PARAM_RECOVERABLE Pull i915_drm.h to include kernel commit ba4fda620a5f7db521aa9e0262cf49854c1b1d9c Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Feb 18 10:58:21 2019 +0000 drm/i915: Optionally disable automatic recovery after a GPU reset for improved resilience in handling GPU hangs.	2019-05-08 17:21:07 -07:00
Dave Airlie	0a42d5b98b	kmsro: add _dri.so to two of the kmsro drivers. Fixes: `8cfc17bdda` (kmsro: Add the rest of the current set of tinydrm drivers.) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-09 07:15:26 +10:00
Kenneth Graunke	d9b9bb91ff	iris: Report the same video memory settings as i965. This just copy and pastes Ian's code from i965.	2019-05-08 12:43:08 -07:00
Eric Engestrom	5f8d29ab4b	gitlab-ci: add the vulkan overlay layer to the vulkan build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-05-08 19:51:46 +02:00
Eric Engestrom	c6306125b5	gitlab-ci: add the vulkan overlay layer to the vulkan build Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> [ Michel Dänzer: Take changes affecting the docker image from !299, plus remove the unzip package again before generating the image ]	2019-05-08 16:59:02 +00:00
Michel Dänzer	fcf75534ec	gitlab-ci: Don't install WINE packages They were just making the docker image larger for no benefit at this point. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-08 16:59:02 +00:00
Michel Dänzer	82b30094ed	gitlab-ci: Reorder jobs a bit to be generally ordered longer => shorter This makes the longer jobs likely to run earlier, which can help the overall pipeline duration. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-08 16:59:02 +00:00
Michel Dänzer	6897715770	gitlab-ci: Build clover against all supported versions of LLVM And consolidate it all into a single job. It doesn't take much longer than a single version, thanks to ccache. Overall, this single job might be faster or at least use fewer CPU cycles than the two jobs before, while covering thrice as many versions of LLVM. v2: * Move "rm -rf _build" to meson-build.sh. * Set GALLIUM_DRIVERS the same way both times in the meson-clover job, for symmetry. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> # v1	2019-05-08 16:59:02 +00:00
Michel Dänzer	cc2b3a99cc	gitlab-ci: Move meson job script to separate file No functional change intended (except for no longer running meson --version separately, as the version appears early in meson's output anyway). Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-08 16:59:02 +00:00
Michel Dänzer	d0b9a7f0d7	gitlab-ci: Remove superfluous comment about image tag counter suffix We really shouldn't ever need a suffix, otherwise it indicates a failure in coordination. :) In which case, it doesn't really matter how the tag is disambiguated. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-08 16:59:02 +00:00
Dylan Baker	0d59459432	meson: Force the use of config-tool for llvm meson git now has a cmake find method for llvm, but it lacks a couple of features that we use from the config tool version. Until that reaches parity we need to use the config-tool version. CC: 19.0 19.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-08 09:39:03 -07:00
Brian Paul	a17c1ae165	gallium/util: fix two MSVC compiler warnings Remove stray const qualifier. s/unsigned/enum tgsi_semantic/ Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-08 10:05:42 -06:00
Brian Paul	4f54e550e9	gallium/pp: s/uint/enum tgsi_semantic/ to fix MSVC warning Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-08 10:05:42 -06:00
Brian Paul	cf5c7beb63	noop: s/enum pipe_transfer_usage/unsigned/ to fix MSVC warning The function pointer declaration in pipe_context uses unsigned for the bitmask. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-08 10:05:41 -06:00
Brian Paul	bc517dbbf7	ddebug: fix a few MSVC compiler warnings Don't return an expression in void functions. Replace an unsigned int with proper enum. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-08 10:05:41 -06:00
Brian Paul	2e28983ed2	glsl: s/GLboolean/bool/ to silence MSVC compiler warning It complains about mixing GLboolean and bool in the |= expression. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-08 10:05:41 -06:00
Ian Romanick	ed5f024515	nir/flrp: Reassociate add in flrp(±1, b, c) lowering path With this reassociation, this lowering path is still beneficial. Ice Lake total instructions in shared programs: 17220191 -> 17207181 (-0.08%) instructions in affected programs: 999871 -> 986861 (-1.30%) helped: 3703 HURT: 17 helped stats (abs) min: 1 max: 686 x̄: 3.52 x̃: 3 helped stats (rel) min: 0.09% max: 51.97% x̄: 2.21% x̃: 1.35% HURT stats (abs) min: 1 max: 9 x̄: 1.47 x̃: 1 HURT stats (rel) min: 0.08% max: 4.55% x̄: 0.78% x̃: 0.55% 95% mean confidence interval for instructions value: -4.01 -2.99 95% mean confidence interval for instructions %-change: -2.29% -2.11% Instructions are helped. total cycles in shared programs: 360871298 -> 360755040 (-0.03%) cycles in affected programs: 9931334 -> 9815076 (-1.17%) helped: 2388 HURT: 1569 helped stats (abs) min: 1 max: 10228 x̄: 93.54 x̃: 18 helped stats (rel) min: <.01% max: 74.11% x̄: 3.36% x̃: 1.07% HURT stats (abs) min: 1 max: 1917 x̄: 68.27 x̃: 22 HURT stats (rel) min: <.01% max: 44.90% x̄: 3.44% x̃: 1.72% 95% mean confidence interval for cycles value: -39.48 -19.28 95% mean confidence interval for cycles %-change: -0.86% -0.46% Cycles are helped. total spills in shared programs: 12355 -> 12159 (-1.59%) spills in affected programs: 295 -> 99 (-66.44%) helped: 2 HURT: 1 total fills in shared programs: 25398 -> 25207 (-0.75%) fills in affected programs: 288 -> 97 (-66.32%) helped: 2 HURT: 1 LOST: 3 GAINED: 44 Iron Lake total instructions in shared programs: 8169225 -> 8159729 (-0.12%) instructions in affected programs: 1025712 -> 1016216 (-0.93%) helped: 3352 HURT: 0 helped stats (abs) min: 1 max: 6 x̄: 2.83 x̃: 3 helped stats (rel) min: 0.15% max: 12.00% x̄: 1.51% x̃: 1.05% 95% mean confidence interval for instructions value: -2.86 -2.80 95% mean confidence interval for instructions %-change: -1.56% -1.46% Instructions are helped. 
total cycles in shared programs: 188656796 -> 188612280 (-0.02%) cycles in affected programs: 18633584 -> 18589068 (-0.24%) helped: 3085 HURT: 14 helped stats (abs) min: 2 max: 72 x̄: 14.45 x̃: 12 helped stats (rel) min: 0.02% max: 5.73% x̄: 0.73% x̃: 0.31% HURT stats (abs) min: 2 max: 4 x̄: 3.71 x̃: 4 HURT stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for cycles value: -14.55 -14.18 95% mean confidence interval for cycles %-change: -0.76% -0.69% Cycles are helped. GM45 total instructions in shared programs: 5026905 -> 5021856 (-0.10%) instructions in affected programs: 584169 -> 579120 (-0.86%) helped: 1776 HURT: 0 helped stats (abs) min: 1 max: 6 x̄: 2.84 x̃: 3 helped stats (rel) min: 0.15% max: 11.11% x̄: 1.43% x̃: 0.98% 95% mean confidence interval for instructions value: -2.88 -2.80 95% mean confidence interval for instructions %-change: -1.50% -1.37% Instructions are helped. total cycles in shared programs: 129047376 -> 129018918 (-0.02%) cycles in affected programs: 12941924 -> 12913466 (-0.22%) helped: 1722 HURT: 14 helped stats (abs) min: 4 max: 72 x̄: 16.56 x̃: 18 helped stats (rel) min: 0.02% max: 5.73% x̄: 0.72% x̃: 0.30% HURT stats (abs) min: 2 max: 4 x̄: 3.71 x̃: 4 HURT stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for cycles value: -16.65 -16.13 95% mean confidence interval for cycles %-change: -0.76% -0.66% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-08 07:41:54 -07:00
Ian Romanick	ba203a3cd7	nir/flrp: Fix typo on the flrp(±1, b, c) path After Samuel reported the bisect, I was able to find the bug by inspection. Good thing for well-named variables. :) Unfortunately, this undoes almost all of the benefit of the original patch. Ice Lake total instructions in shared programs: 17183159 -> 17218166 (0.20%) instructions in affected programs: 1308722 -> 1343729 (2.67%) helped: 98 HURT: 4746 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.47% max: 2.70% x̄: 0.60% x̃: 0.57% HURT stats (abs) min: 1 max: 691 x̄: 7.40 x̃: 8 HURT stats (rel) min: 0.10% max: 700.00% x̄: 5.82% x̃: 2.83% 95% mean confidence interval for instructions value: 6.82 7.64 95% mean confidence interval for instructions %-change: 5.22% 6.15% Instructions are HURT. total cycles in shared programs: 360705959 -> 360853522 (0.04%) cycles in affected programs: 10754380 -> 10901943 (1.37%) helped: 1594 HURT: 3331 helped stats (abs) min: 1 max: 1896 x̄: 119.81 x̃: 60 helped stats (rel) min: <.01% max: 35.48% x̄: 5.06% x̃: 3.64% HURT stats (abs) min: 1 max: 10208 x̄: 101.63 x̃: 38 HURT stats (rel) min: 0.01% max: 878.95% x̄: 9.01% x̃: 2.78% 95% mean confidence interval for cycles value: 21.11 38.81 95% mean confidence interval for cycles %-change: 3.76% 5.15% Cycles are HURT. 
total spills in shared programs: 12158 -> 12355 (1.62%) spills in affected programs: 98 -> 295 (201.02%) helped: 1 HURT: 2 total fills in shared programs: 25204 -> 25398 (0.77%) fills in affected programs: 94 -> 288 (206.38%) helped: 0 HURT: 3 LOST: 15 GAINED: 8 Iron Lake total instructions in shared programs: 8121430 -> 8166733 (0.56%) instructions in affected programs: 1148353 -> 1193656 (3.95%) helped: 2 HURT: 4046 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.85% max: 1.92% x̄: 1.89% x̃: 1.89% HURT stats (abs) min: 1 max: 43 x̄: 11.20 x̃: 11 HURT stats (rel) min: 0.20% max: 716.67% x̄: 7.40% x̃: 3.87% 95% mean confidence interval for instructions value: 11.02 11.37 95% mean confidence interval for instructions %-change: 6.84% 7.94% Instructions are HURT. total cycles in shared programs: 188376326 -> 188601568 (0.12%) cycles in affected programs: 27416674 -> 27641916 (0.82%) helped: 68 HURT: 3947 helped stats (abs) min: 2 max: 222 x̄: 13.88 x̃: 6 helped stats (rel) min: <.01% max: 1.28% x̄: 0.15% x̃: 0.01% HURT stats (abs) min: 2 max: 670 x̄: 57.31 x̃: 64 HURT stats (rel) min: <.01% max: 1811.11% x̄: 4.11% x̃: 1.09% 95% mean confidence interval for cycles value: 55.01 57.20 95% mean confidence interval for cycles %-change: 2.88% 5.19% Cycles are HURT. LOST: 35 GAINED: 3 GM45 total instructions in shared programs: 4979794 -> 5003551 (0.48%) instructions in affected programs: 635174 -> 658931 (3.74%) helped: 1 HURT: 2142 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.85% max: 1.85% x̄: 1.85% x̃: 1.85% HURT stats (abs) min: 1 max: 43 x̄: 11.09 x̃: 11 HURT stats (rel) min: 0.20% max: 716.67% x̄: 7.00% x̃: 3.53% 95% mean confidence interval for instructions value: 10.85 11.33 95% mean confidence interval for instructions %-change: 6.25% 7.74% Instructions are HURT. 
total cycles in shared programs: 128519586 -> 128654990 (0.11%) cycles in affected programs: 17635304 -> 17770708 (0.77%) helped: 46 HURT: 2088 helped stats (abs) min: 4 max: 220 x̄: 18.13 x̃: 6 helped stats (rel) min: <.01% max: 1.28% x̄: 0.15% x̃: 0.01% HURT stats (abs) min: 2 max: 670 x̄: 65.25 x̃: 66 HURT stats (rel) min: <.01% max: 1464.29% x̄: 4.05% x̃: 0.99% 95% mean confidence interval for cycles value: 61.75 65.15 95% mean confidence interval for cycles %-change: 2.58% 5.34% Cycles are HURT. LOST: 38 GAINED: 38 Fixes: `5b908db604` ("nir/flrp: Lower flrp(±1, b, c) and flrp(a, ±1, c) differently") Reported-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-08 07:41:26 -07:00
Lionel Landwerlin	43596e5f34	anv: fix use after free Once mem->bo is removed from the cache, it is likely to be freed. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `b80930a6fe` ("anv: add support for VK_EXT_memory_budget") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-08 12:02:13 +01:00
Lionel Landwerlin	a07d06f103	anv: rework queries writes to ensure ordering memory writes We use a mix of MI & PIPE_CONTROL commands to write our queries' data (results & availability). Those commands' memory write order is not guaranteed with regard to their order in the command stream, unless CS stalls are inserted between them. This is problematic for 2 reasons : 1. We copy results from the device using MI commands even though the values are generated from PIPE_CONTROL, meaning we could copy unlanded values into the results and then copy the availability that is inconsistent with the values. 2. We allow the user to poll on the availability values of the query pool from the CPU. If the availability lands in memory before the values then we could return invalid values. This change does 2 things to address this problem : - We use either PIPE_CONTROL or MI commands to write both queries values and availability, so that the ordering of the memory writes guarantees that if availability is visible, results are also visible. - For the occlusion & timestamp queries we apply a CS stall before copying the results on the device, to ensure copying with MI commands see the correct values of previous PIPE_CONTROL writes of availability (required by the Vulkan spec). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reported-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-08 09:49:09 +00:00
Timothy Arceri	e19a8fe033	radv: call constant folding before opt algebraic The pattern of calling opt algebraic first seems to have originated in i965. The order in OpenGL drivers generally doesn't matter because the GLSL IR optimisations do constant folding before opt algebraic. However in Vulkan drivers calling opt algebraic first can result in missed constant folding opportunities. vkpipeline-db results (VEGA64): Totals from affected shaders: SGPRS: 3160 -> 3176 (0.51 %) VGPRS: 3588 -> 3580 (-0.22 %) Spilled SGPRs: 52 -> 44 (-15.38 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 12 -> 12 (0.00 %) dwords per thread Code Size: 261812 -> 261036 (-0.30 %) bytes LDS: 7 -> 7 (0.00 %) blocks Max Waves: 346 -> 348 (0.58 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-08 19:45:01 +10:00
Erik Faye-Lund	ecdab0dfea	docs: drop h1 in header It's generally frowned upon to have more than one H1 per document in HTML4. So let's put the text directly inside the header. This means we can drop the flex-based centering, which makes things a bit easier. We also need to change the padding to rem instead of em, because the em has now changed. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-08 07:18:15 +00:00
Erik Faye-Lund	6e0e550904	docs: harmonize headings and titles We're pretty inconsistent in what the headings and titles are, especially compared to what the articles are listed as in the sidebar. Let's harmonize this. There's a notable exception for meson.html, where the sidebar uses a short-hand form that makes sense in the sidebar, but not in the article due to the visible context being different. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-08 07:18:15 +00:00
Erik Faye-Lund	269474b428	docs: renumber headings It's generally frowned upon to have multiple H1 headings in HTML4. So let's make sure each article has a primary heading for the article, and that that heading is the title that is used in the sidebar. While we're at it, let's update the title in the articles to match the title from the sidebar as well. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-08 07:18:15 +00:00
Erik Faye-Lund	87683ba058	docs: give download-article a primary heading It's generally frowned upon to have multiple H1 headings in HTML4. So let's add a primary heading for the article, and source that from the title used in the sidebar. While we're at it, let's update the title in the article to match the title from the sidebar as well. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-08 07:18:15 +00:00
Erik Faye-Lund	a8df27b0b2	docs: use title-casing for all headings in sidebar We generally use title-casing for headings in the sidebar. But not all headings were consistently cased like that. Let's make sure this is consistent. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-08 07:18:15 +00:00
Erik Faye-Lund	f06e698aad	docs: spell out "and" in sidebar There's no need to keep this short, we can just spell out "and" here. Besides, a slash kind of implies "or", but these articles are about both of these, not either. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-08 07:18:15 +00:00
Erik Faye-Lund	7809331cb9	docs: remove pointless list-entry It's quite visible that there's more docs below, we don't need to spell it out for the reader. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-08 07:18:15 +00:00
Erik Faye-Lund	7421fdf68a	docs: spell out faq in sidebar We're not short on space here, so there's little point in abbreviating this. This also matches the heading in the article. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-08 07:18:15 +00:00
Erik Faye-Lund	e4fe83c8a0	docs: spell out "and" in sidebar We're not short on space here, so let's just spell out "and" instead of using the ampersand. This is more consistent with the entry above in the sidebar. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-08 07:18:15 +00:00
Timothy Arceri	4fd8161773	glsl_to_nir: remove unused type_is_int() This was missed in `e00fa99b08`. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-08 14:11:38 +10:00
Timothy Arceri	a01b393c39	Revert "glx: Fix synthetic error generation in __glXSendError" This reverts commit `e91ee763c3`. This seems to have broken a number of wine games. Let's revert everything for now and try again later. Acked-by: Adam Jackson <ajax@redhat.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110632 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110590	2019-05-08 13:16:44 +10:00
Timothy Arceri	024232b26c	radeonsi: add an AMD_TEX_ANISO environment variable This brings it in line with the recently added AMD_DEBUG. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109619	2019-05-08 09:32:25 +10:00
Kenneth Graunke	d568fcd0a0	i965: leave the top 4Gb of the high heap VMA unused This ports commit `9e7b0988d6` from anv to i965. Thanks to Lionel for noticing that it was missing! Fixes: `01058a5522` i965: Add virtual memory allocator infrastructure to brw_bufmgr. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-07 15:45:56 -07:00
Kenneth Graunke	17210c63a9	i965: Force VMA alignment to be a multiple of the page size. This should happen regardless, but let's be paranoid. Fixes: `01058a5522` i965: Add virtual memory allocator infrastructure to brw_bufmgr. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-07 15:45:56 -07:00
Kenneth Graunke	15f134c628	i965: Fix BRW_MEMZONE_LOW_4G heap size. The STATE_BASE_ADDRESS "Size" fields can only hold 0xfffff in pages, and 0xfffff * 4096 = 4294963200, which is 1 page shy of 4GB. So we can't use the top page. Fixes: `01058a5522` i965: Add virtual memory allocator infrastructure to brw_bufmgr. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-07 15:45:56 -07:00
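The arithmetic in the BRW_MEMZONE_LOW_4G commit message above can be checked with a short sketch. The constant names below are hypothetical and chosen for illustration only; they are not identifiers from the i965 source.

```python
# Hypothetical names; only the numbers come from the commit message.
PAGE_SIZE = 4096                  # bytes per page
MAX_SIZE_FIELD_PAGES = 0xfffff    # largest page count the "Size" field can hold

max_heap_bytes = MAX_SIZE_FIELD_PAGES * PAGE_SIZE
four_gib = 1 << 32

# How far short of 4 GiB the largest encodable size falls.
shortfall_bytes = four_gib - max_heap_bytes

print(max_heap_bytes)    # 4294963200
print(shortfall_bytes)   # 4096, i.e. exactly one page
```

Because the shortfall is exactly one page, a heap sized from this field cannot cover the top page of the 4 GiB range, which matches the reasoning given in the message.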
Matt Turner	e8c74a1e16	intel/compiler: Unset flag reg when FB write is not predicated In the FS IR we pretend that the instruction is predicated with (+f0.1) just for flag dependency tracking purposes. Since the instruction doesn't support predication before Haswell, we unset the predicate so we should also unset the flag register so that we can round-trip the disassembly. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-07 14:33:48 -07:00
Sagar Ghuge	5d7a9e0811	intel/disasm: Disassemble immediate value properly for dim On Haswell, for the dim instruction we encode the immediate float value operand into a double float. v2: Fix comment (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-07 14:33:48 -07:00
Sagar Ghuge	6c83a68ebc	intel/disasm: Disassemble JIP offset for while Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-07 14:33:48 -07:00
Sagar Ghuge	9db616e8a2	intel/compiler: Replicate 16 bit immediate value correctly For the W or UW (signed or unsigned word) source types, the 16-bit value must be replicated in both the low and high words of the 32-bit immediate value. v2: Fix replication in other places as well V3: fix a few nits (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-07 14:33:48 -07:00
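The replication rule described in the 16-bit immediate commit above can be sketched as follows. The function name `replicate_word` is an assumption made for this illustration and is not the helper used in the Intel compiler.

```python
def replicate_word(value):
    """Copy a 16-bit W/UW value into both halves of a 32-bit immediate.

    Hypothetical helper for illustration; negative (signed word) inputs are
    first reduced to their 16-bit two's-complement bit pattern.
    """
    word = value & 0xffff          # keep only the low 16 bits
    return (word << 16) | word     # same word in the high and low halves


print(hex(replicate_word(0x1234)))  # 0x12341234
print(hex(replicate_word(-1)))      # 0xffffffff
```

Masking first means signed and unsigned words share one code path, since only the bit pattern is replicated.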
Sagar Ghuge	5211159b5b	intel/compiler: Print quad value in hex format Print quad value same as unsigned quad so that we can distinguish between quarter control disassembled values, e.g. 1/2/3[Q], and immediate quad values, e.g. 1Q. This allows round-tripping through the assembler/disassembler. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-07 14:33:48 -07:00
Sagar Ghuge	4e828bb48a	intel/tools: Add unit tests for assembler v1: Pass executable object from meson to test(Dylan Baker) v2: Ignore generated output files from git status(Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-05-07 14:33:48 -07:00
Mika Kuoppala	1fb5ce0a11	intel/tools: Initialize offset correctly for i965_asm If we leave offset uninitialized, access to store will be random depending on stack value and can segfault. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-07 14:33:48 -07:00
Mika Kuoppala	85da1194ec	intel/tools: Add meson pthread dependency for i965_asm Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-07 14:33:48 -07:00
Sagar Ghuge	70308a5a8a	intel/tools: New i965 instruction assembler tool Tool is inspired by igt's assembler tool. Thanks to Matt Turner, who mentored me throughout this project. v2: Fix memory leaks and naming convention (Caio) v3: Fix meson changes (Dylan Baker) v4: Fix usage options (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/141	2019-05-07 14:33:38 -07:00
Kenneth Graunke	a232aa5c50	iris: Also handle res->offset for buffer sampler/image views	2019-05-07 13:36:18 -07:00
Mike Blumenkrantz	ddd716e746	iris: support dmabuf imports with offsets this adds support for imports where the image data begins at an offset from the start of the buffer, as used in h/x264 fixes kwg/mesa#47 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-05-07 13:36:08 -07:00
Roland Scheidegger	748f603390	gallivm: fix broken 8-wide s3tc decoding Brian noticed there was an uninitialized var for the 8-wide case and 128 bit blocks, which made it always crash. Likewise, the 64bit block case had another crash bug due to type mismatch. Color decode (used for all s3tc formats) also had a bogus shuffle for this case, leading to decode artifacts. Fix these all up, which makes the code actually work 8-wide. Note that it's still not used - I've verified it works, and the generated assembly does look quite a bit simpler actually (20-30% less instructions for the s3tc decode part with avx2), however in practice it still seems to be slightly slower for some unknown reason (tested with openarena) on my haswell box, so for now continue to split things into 4-wide vectors before decoding. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2019-05-07 18:59:38 +02:00
Juan A. Suarez Romero	92dba1c66e	docs: Add relnotes stub for 19.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-05-07 16:07:29 +00:00
Juan A. Suarez Romero	14a7959cfa	Bump version for 19.1 branch Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2019-05-07 16:02:34 +00:00
Vasily Khoruzhick	6b46399e2f	lima: enable sin and cos lowering for GP GP doesn't support sin/cos natively, so we have to lower them. Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-05-07 15:25:21 +00:00
Vasily Khoruzhick	e67e4e90b2	nir: implement lowering for fsin and fcos Lower sin and cos using Nick's fast sin/cos approximation from https://web.archive.org/web/20180105155939/http://forum.devmaster.net/t/fast-and-accurate-sine-cosine/9648 It's suitable for GLES2, but it throws warnings in dEQP GLES3 precision tests. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-05-07 15:25:21 +00:00
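As a rough illustration of the kind of approximation the fsin/fcos lowering commit above refers to, here is the parabola-based fast sine as it is commonly presented from the devmaster thread. This is a sketch under assumptions: the commit message does not spell out the exact form used by the NIR pass, and the function names and the range handling in `fast_cos` are chosen for this example.

```python
import math

def fast_sin(x):
    """Approximate sin(x) for x in [-pi, pi] with a refined parabola."""
    b = 4.0 / math.pi
    c = -4.0 / (math.pi * math.pi)
    y = b * x + c * x * abs(x)     # parabola through (0, 0), (pi/2, 1), (pi, 0)
    p = 0.225                      # blend weight for the refinement step
    return p * (y * abs(y) - y) + y

def fast_cos(x):
    """Approximate cos(x) for x in [-pi, pi] via sin(x + pi/2)."""
    x += math.pi / 2.0
    if x > math.pi:                # wrap back into [-pi, pi]
        x -= 2.0 * math.pi
    return fast_sin(x)

# Largest deviation from math.sin over a sampled grid of [-pi, pi].
worst = max(abs(fast_sin(i * math.pi / 1000.0) - math.sin(i * math.pi / 1000.0))
            for i in range(-1000, 1001))
print(worst)
```

The error is small but not zero, which fits the note in the message that the lowering suits GLES2 while triggering warnings in the dEQP GLES3 precision tests.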
Rob Clark	b15c46e6bf	freedreno/ir3: move const_state to ir3_shader For a6xx, we construct/emit a single VS const state used for both binning pass and draw pass. So far we were mostly getting lucky that there were not (obvious) mismatches between the const_state (like different lowered immediates) between the binning and draw pass VS ir3_shader_variant. And I guess this situation will come up more as GS and tess is added into the equation. Since really everything about the const state is not specific to the variant, move this. The main exception is lowered immediates, but these are the last to appear in the layout, and it doesn't hurt for each new shader variant to just append any immed's it lowers to the end of the immediate state. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-07 07:26:00 -07:00
Rob Clark	5690f83bb5	freedreno/ir3: split out const_state setup Next patch moves const_state to ir3_shader, before the compile context is created. So move the code around in prep to call it earlier. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-07 07:26:00 -07:00
Rob Clark	9403184ddd	freedreno/ir3: move immediates to const_state They are really part of the constant state, and it will simplify moving things from ir3_shader_variant to ir3_shader if we combine them. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-07 07:26:00 -07:00
Rob Clark	23e7a34466	freedreno/ir3: consolidate const state Combine the offsets of different parts of the constant space with (what was formerly known as) ir3_driver_const_layout. Bunch of churn, but no functional change. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-07 07:26:00 -07:00
Rob Clark	ef3eecd66b	freedreno/ir3: move ir3_pointer_size() Move to ir3_compiler so it doesn't depend on the compile context. Prep work for moving constant state from variant (where we have compile context) to shader (where we do not). Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-05-07 07:26:00 -07:00
Lionel Landwerlin	2d2927938f	vulkan/overlay-layer: fix cast errors Not quite sure what version of GCC/Clang produces errors (8.3.0 locally was fine). v2: also fix an integer literal issue (Karol) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-07 10:45:45 +01:00
Samuel Iglesias Gonsálvez	bc66cebc0d	anv: fix alphaToCoverage when there is no color attachment There are tests in CTS for alpha to coverage without a color attachment that are failing. This happens because we remove the shader color outputs when we don't have a valid color attachment for them, but when alpha to coverage is enabled we still want to preserve the output at location 0 since we need the alpha component. In that case we will also need to create a null render target for RT 0. v2: - We already create a null rt when we don't have any, so reuse that for this case (Jason) - Simplify the code a bit (Iago) v3: - Take alpha to coverage from the key and don't tie this to depth-only rendering only, we want the same behavior if we have multiple render targets but the one at location 0 is not used. (Jason). - Rewrite commit message (Iago) v4: - Make sure we take into account the array length of the shader outputs, which we were not handling correctly either and make sure we also create null render targets for any invalid array entries too. v5: - Simplify removal of unused outputs by using rt_used[] so we don't have to special case alpha to coverage there too. Fixes the following CTS tests: dEQP-VK.pipeline.multisample.alpha_to_coverage_no_color_attachment.* Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-07 09:35:47 +02:00
Ian Romanick	c866500525	intel/compiler: Don't always require precise lowering of flrp No changes on any other Intel platforms. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8164367 -> 8135551 (-0.35%) instructions in affected programs: 3271235 -> 3242419 (-0.88%) helped: 13636 HURT: 90 helped stats (abs) min: 1 max: 30 x̄: 2.13 x̃: 1 helped stats (rel) min: 0.04% max: 10.77% x̄: 1.16% x̃: 0.97% HURT stats (abs) min: 1 max: 4 x̄: 1.80 x̃: 2 HURT stats (rel) min: 0.26% max: 11.11% x̄: 1.76% x̃: 0.78% 95% mean confidence interval for instructions value: -2.13 -2.07 95% mean confidence interval for instructions %-change: -1.16% -1.13% Instructions are helped. total cycles in shared programs: 188719974 -> 188586222 (-0.07%) cycles in affected programs: 70415766 -> 70282014 (-0.19%) helped: 12563 HURT: 515 helped stats (abs) min: 2 max: 600 x̄: 10.90 x̃: 6 helped stats (rel) min: <.01% max: 5.48% x̄: 0.48% x̃: 0.27% HURT stats (abs) min: 2 max: 54 x̄: 6.07 x̃: 4 HURT stats (rel) min: 0.01% max: 4.48% x̄: 0.24% x̃: 0.08% 95% mean confidence interval for cycles value: -10.56 -9.90 95% mean confidence interval for cycles %-change: -0.47% -0.45% Cycles are helped. LOST: 0 GAINED: 13 Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-06 22:52:29 -07:00
Ian Romanick	ab86926156	nir/algebraic: Reassociate open-coded flrp(1, b, c) In a previous version of this patch, Jason commented, "Re-associating based on whether or not something has a constant value of 1.0 seems a bit sneaky. I think it's well within the rules but it seems like something that could bite you." That is possibly true. The reassociation will generate different results if fabs(b) >= 2**24 and fabs(c) < 0.5. The delta increases as fabs(c) approaches 0. However, i965 has done this same reassociation indirectly for years. We would previously allow nir_op_flrp on all pre-Gen11 hardware even though Gen4 and Gen5 do not have a LRP instruction. Optimizations in nir_opt_algebraic would convert expressions like a+c(b-a) into flrp(a, b, c). On Gen7+, the hardware performs the same arithmetic as a(1-c)+bc. Gen6 seems to implement LRP as a+c(b-a). On Gen4 and Gen5, we would lower LRP to a sequence of instructions that implement a(1-c)+bc. The lowering happens after all constant folding, so we would literally generate a 1+(-1) instruction sequence in this scenario: one instruction to load either 1 or -1 in a register, and another instruction to add either -1 or 1 to it. This patch just cuts out the middle man. Do the reassociation that we've always done, but do it explicitly at a time when we can benefit from other optimizations. A few cases that were hurt by "nir: Lower flrp(±1, b, c) and flrp(a, ±1, c) differently" are restored by this patch. This includes a few shaders in ET:QW. I tried a similar thing for open-coded flrp(-1, b, c), and it hurt instructions on 35 shaders for ILK without helping any. The helped / hurt cycles was about even. No changes on any other Intel platforms. Iron Lake and GM45 had similar results. 
(Iron Lake shown) total instructions in shared programs: 8172020 -> 8164367 (-0.09%) instructions in affected programs: 1089851 -> 1082198 (-0.70%) helped: 3285 HURT: 64 helped stats (abs) min: 1 max: 6 x̄: 2.35 x̃: 2 helped stats (rel) min: 0.13% max: 12.00% x̄: 1.15% x̃: 0.83% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.24% max: 0.64% x̄: 0.39% x̃: 0.38% 95% mean confidence interval for instructions value: -2.32 -2.25 95% mean confidence interval for instructions %-change: -1.16% -1.09% Instructions are helped. total cycles in shared programs: 188758338 -> 188719974 (-0.02%) cycles in affected programs: 20004922 -> 19966558 (-0.19%) helped: 3012 HURT: 477 helped stats (abs) min: 2 max: 142 x̄: 13.41 x̃: 12 helped stats (rel) min: 0.01% max: 6.37% x̄: 0.52% x̃: 0.24% HURT stats (abs) min: 2 max: 328 x̄: 4.27 x̃: 4 HURT stats (rel) min: <.01% max: 1.55% x̄: 0.14% x̃: 0.11% 95% mean confidence interval for cycles value: -11.38 -10.62 95% mean confidence interval for cycles %-change: -0.46% -0.41% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-06 22:52:29 -07:00
Ian Romanick	c995d1ca3a	nir/flrp: Lower flrp(a, b, #c) differently This doesn't help on Intel GPUs now because we always take the "always_precise" path first. It may help on other GPUs, and it does prevent a bunch of regressions in "intel/compiler: Don't always require precise lowering of flrp". Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-06 22:52:29 -07:00
Ian Romanick	ae02622d8f	nir/flrp: Lower flrp(a, b, c) differently if another flrp(_, b, c) exists There is little effect on Intel GPUs now because we almost always take the "always_precise" path first. It may help on other GPUs, and it does prevent a bunch of regressions in "intel/compiler: Don't always require precise lowering of flrp". No changes on any other Intel platforms. GM45 and Iron Lake had similar results. (Iron Lake shown) total cycles in shared programs: 188852500 -> 188852484 (<.01%) cycles in affected programs: 14612 -> 14596 (-0.11%) helped: 4 HURT: 0 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 0.09% max: 0.13% x̄: 0.11% x̃: 0.11% 95% mean confidence interval for cycles value: -4.00 -4.00 95% mean confidence interval for cycles %-change: -0.13% -0.09% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-06 22:52:29 -07:00
Ian Romanick	6698d861a5	nir/flrp: Lower flrp(a, b, c) differently if another flrp(a, _, c) exists This doesn't help on Intel GPUs now because we always take the "always_precise" path first. It may help on other GPUs, and it does prevent a bunch of regressions in "intel/compiler: Don't always require precise lowering of flrp". No changes on any Intel platform. Before a number of large rebases this helped cycles in a couple shaders on Iron Lake and GM45. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-06 22:52:29 -07:00
Ian Romanick	5b908db604	nir/flrp: Lower flrp(±1, b, c) and flrp(a, ±1, c) differently No changes on any other Intel platforms. v2: Rebase on 424372e5dd5 ("nir: Use the flrp lowering pass instead of nir_opt_algebraic") Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8189888 -> 8153912 (-0.44%) instructions in affected programs: 1199037 -> 1163061 (-3.00%) helped: 4124 HURT: 10 helped stats (abs) min: 1 max: 40 x̄: 8.73 x̃: 9 helped stats (rel) min: 0.20% max: 86.96% x̄: 4.96% x̃: 3.02% HURT stats (abs) min: 1 max: 2 x̄: 1.20 x̃: 1 HURT stats (rel) min: 1.06% max: 3.92% x̄: 1.62% x̃: 1.06% 95% mean confidence interval for instructions value: -8.84 -8.56 95% mean confidence interval for instructions %-change: -5.12% -4.77% Instructions are helped. total cycles in shared programs: 188606710 -> 188426964 (-0.10%) cycles in affected programs: 27505596 -> 27325850 (-0.65%) helped: 4026 HURT: 77 helped stats (abs) min: 2 max: 646 x̄: 44.99 x̃: 46 helped stats (rel) min: <.01% max: 94.58% x̄: 2.35% x̃: 0.85% HURT stats (abs) min: 2 max: 376 x̄: 17.79 x̃: 6 HURT stats (rel) min: <.01% max: 2.60% x̄: 0.22% x̃: 0.04% 95% mean confidence interval for cycles value: -44.75 -42.87 95% mean confidence interval for cycles %-change: -2.44% -2.17% Cycles are helped. LOST: 3 GAINED: 35 Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-06 22:52:29 -07:00
Ian Romanick	23c5501b77	nir/flrp: Lower flrp(#a, #b, c) differently If the magnitudes of #a and #b are such that (b-a) won't lose too much precision, lower as a+c(b-a). No changes on any other Intel platforms. v2: Rebase on 424372e5dd5 ("nir: Use the flrp lowering pass instead of nir_opt_algebraic") Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8192503 -> 8192383 (<.01%) instructions in affected programs: 18417 -> 18297 (-0.65%) helped: 68 HURT: 0 helped stats (abs) min: 1 max: 18 x̄: 1.76 x̃: 1 helped stats (rel) min: 0.19% max: 7.89% x̄: 1.10% x̃: 0.43% 95% mean confidence interval for instructions value: -2.48 -1.05 95% mean confidence interval for instructions %-change: -1.56% -0.63% Instructions are helped. total cycles in shared programs: 188662536 -> 188661956 (<.01%) cycles in affected programs: 744476 -> 743896 (-0.08%) helped: 62 HURT: 0 helped stats (abs) min: 4 max: 60 x̄: 9.35 x̃: 6 helped stats (rel) min: 0.02% max: 4.84% x̄: 0.27% x̃: 0.06% 95% mean confidence interval for cycles value: -12.37 -6.34 95% mean confidence interval for cycles %-change: -0.48% -0.06% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-06 22:52:29 -07:00
Ian Romanick	dd7135d55d	intel/compiler: Use the flrp lowering pass for all stages on Gen4 and Gen5 Previously lower_flrp32 was only set for vertex shaders. Fragment shaders performed a(1-c)+bc lowering during code generation. The shaders with loops hurt are SIMD8 and SIMD16 shaders for a text-identical fragment shader. v2: Rebase on `26391cceaa` ("intel/compiler: Lower ffma on Gen4 and Gen5"). v3: Rebase on `a004e95dd7` ("radeonsi/nir: create si_nir_opts() helper") Iron Lake total instructions in shared programs: 8211385 -> 8185974 (-0.31%) instructions in affected programs: 2503898 -> 2478487 (-1.01%) helped: 9936 HURT: 921 helped stats (abs) min: 1 max: 155 x̄: 2.86 x̃: 2 helped stats (rel) min: 0.10% max: 35.48% x̄: 1.67% x̃: 1.11% HURT stats (abs) min: 1 max: 12 x̄: 3.24 x̃: 2 HURT stats (rel) min: 0.21% max: 13.64% x̄: 1.86% x̃: 0.89% 95% mean confidence interval for instructions value: -2.43 -2.25 95% mean confidence interval for instructions %-change: -1.41% -1.33% Instructions are helped. total cycles in shared programs: 188523186 -> 188401198 (-0.06%) cycles in affected programs: 71541604 -> 71419616 (-0.17%) helped: 11649 HURT: 1871 helped stats (abs) min: 2 max: 930 x̄: 12.62 x̃: 6 helped stats (rel) min: <.01% max: 44.61% x̄: 0.68% x̃: 0.25% HURT stats (abs) min: 2 max: 138 x̄: 13.38 x̃: 8 HURT stats (rel) min: <.01% max: 10.99% x̄: 0.49% x̃: 0.17% 95% mean confidence interval for cycles value: -9.42 -8.63 95% mean confidence interval for cycles %-change: -0.54% -0.50% Cycles are helped. total loops in shared programs: 852 -> 856 (0.47%) loops in affected programs: 0 -> 4 helped: 0 HURT: 4 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.00% max: 0.00% x̄: 0.00% x̃: 0.00% 95% mean confidence interval for loops value: 1.00 1.00 95% mean confidence interval for loops %-change: 0.00% 0.00% Loops are HURT. 
LOST: 3 GAINED: 12 GM45 total instructions in shared programs: 5046407 -> 5033694 (-0.25%) instructions in affected programs: 1303584 -> 1290871 (-0.98%) helped: 5010 HURT: 464 helped stats (abs) min: 1 max: 155 x̄: 2.85 x̃: 2 helped stats (rel) min: 0.10% max: 34.38% x̄: 1.63% x̃: 1.08% HURT stats (abs) min: 1 max: 75 x̄: 3.39 x̃: 2 HURT stats (rel) min: 0.20% max: 13.04% x̄: 1.84% x̃: 0.87% 95% mean confidence interval for instructions value: -2.45 -2.20 95% mean confidence interval for instructions %-change: -1.40% -1.28% Instructions are helped. total cycles in shared programs: 128889476 -> 128812366 (-0.06%) cycles in affected programs: 44845402 -> 44768292 (-0.17%) helped: 6079 HURT: 940 helped stats (abs) min: 2 max: 930 x̄: 15.16 x̃: 8 helped stats (rel) min: <.01% max: 41.03% x̄: 0.71% x̃: 0.25% HURT stats (abs) min: 2 max: 138 x̄: 16.01 x̃: 8 HURT stats (rel) min: <.01% max: 10.99% x̄: 0.50% x̃: 0.17% 95% mean confidence interval for cycles value: -11.63 -10.34 95% mean confidence interval for cycles %-change: -0.58% -0.52% Cycles are helped. total loops in shared programs: 633 -> 635 (0.32%) loops in affected programs: 0 -> 2 helped: 0 HURT: 2 total spills in shared programs: 60 -> 69 (15.00%) spills in affected programs: 54 -> 63 (16.67%) helped: 0 HURT: 1 total fills in shared programs: 92 -> 105 (14.13%) fills in affected programs: 80 -> 93 (16.25%) helped: 0 HURT: 1 LOST: 15 GAINED: 15 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v2] Reviewed-by: Matt Turner <mattst88@gmail.com> [v2]	2019-05-06 22:52:29 -07:00
Ian Romanick	d41cdef2a5	nir: Use the flrp lowering pass instead of nir_opt_algebraic I tried to be very careful while updating all the various drivers, but I don't have any of that hardware for testing. :( i965 is the only platform that sets always_precise = true, and it is only set true for fragment shaders. Gen4 and Gen5 both set lower_flrp32 only for vertex shaders. For fragment shaders, nir_op_flrp is lowered during code generation as a(1-c)+bc. On all other platforms 64-bit nir_op_flrp and on Gen11 32-bit nir_op_flrp are lowered using the old nir_opt_algebraic method. No changes on any other Intel platforms. v2: Add panfrost changes. Iron Lake and GM45 had similar results. (Iron Lake shown) total cycles in shared programs: 188647754 -> 188647748 (<.01%) cycles in affected programs: 5096 -> 5090 (-0.12%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12% Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-06 22:52:29 -07:00
Ian Romanick	158370ed2a	nir/flrp: Add new lowering pass for flrp instructions This pass will soon grow to include some optimizations that are difficult or impossible to implement correctly within nir_opt_algebraic. It also includes the ability to generate strictly correct code, which the current nir_opt_algebraic lowering lacks (though that could be changed). v2: Document the parameters to nir_lower_flrp. Rebase on top of `3766334923` ("compiler/nir: add lowering for 16-bit flrp") Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-06 22:52:28 -07:00
Ian Romanick	dc566a033c	nir/algebraic: Pull common multiplication out of flrp arguments All Intel platforms had similar results. (Skylake shown) total instructions in shared programs: 15342485 -> 15337495 (-0.03%) instructions in affected programs: 217456 -> 212466 (-2.29%) helped: 1539 HURT: 1 helped stats (abs) min: 1 max: 17 x̄: 3.24 x̃: 3 helped stats (rel) min: 0.22% max: 18.75% x̄: 3.10% x̃: 1.91% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.56% max: 0.56% x̄: 0.56% x̃: 0.56% 95% mean confidence interval for instructions value: -3.39 -3.09 95% mean confidence interval for instructions %-change: -3.24% -2.96% Instructions are helped. total cycles in shared programs: 355734320 -> 355728237 (<.01%) cycles in affected programs: 1851555 -> 1845472 (-0.33%) helped: 835 HURT: 575 helped stats (abs) min: 1 max: 658 x̄: 40.62 x̃: 14 helped stats (rel) min: <.01% max: 35.69% x̄: 3.78% x̃: 1.81% HURT stats (abs) min: 1 max: 322 x̄: 48.40 x̃: 14 HURT stats (rel) min: 0.04% max: 71.02% x̄: 8.06% x̃: 2.43% 95% mean confidence interval for cycles value: -8.50 -0.13 95% mean confidence interval for cycles %-change: 0.48% 1.62% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-06 22:52:28 -07:00
Ian Romanick	a83a6e9690	nir/algebraic: Pull common addition out of flrp arguments v2: Augment the late optimization patterns with a couple pre-ffma pass patterns. All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 15342982 -> 15342485 (<.01%) instructions in affected programs: 56304 -> 55807 (-0.88%) helped: 235 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.11 x̃: 1 helped stats (rel) min: 0.11% max: 8.82% x̄: 1.27% x̃: 0.74% 95% mean confidence interval for instructions value: -2.31 -1.92 95% mean confidence interval for instructions %-change: -1.46% -1.09% Instructions are helped. total cycles in shared programs: 355734740 -> 355734320 (<.01%) cycles in affected programs: 1028807 -> 1028387 (-0.04%) helped: 134 HURT: 104 helped stats (abs) min: 1 max: 212 x̄: 25.69 x̃: 8 helped stats (rel) min: <.01% max: 9.36% x̄: 1.33% x̃: 0.61% HURT stats (abs) min: 1 max: 203 x̄: 29.06 x̃: 8 HURT stats (rel) min: 0.02% max: 15.76% x̄: 1.76% x̃: 0.46% 95% mean confidence interval for cycles value: -8.51 4.98 95% mean confidence interval for cycles %-change: -0.35% 0.39% Inconclusive result (value mean confidence interval includes 0). Sandy Bridge total instructions in shared programs: 10886815 -> 10886390 (<.01%) instructions in affected programs: 36883 -> 36458 (-1.15%) helped: 147 HURT: 0 helped stats (abs) min: 1 max: 7 x̄: 2.89 x̃: 3 helped stats (rel) min: 0.35% max: 8.00% x̄: 1.60% x̃: 1.23% 95% mean confidence interval for instructions value: -3.12 -2.67 95% mean confidence interval for instructions %-change: -1.83% -1.38% Instructions are helped. 
total cycles in shared programs: 154188360 -> 154186902 (<.01%) cycles in affected programs: 388094 -> 386636 (-0.38%) helped: 90 HURT: 58 helped stats (abs) min: 1 max: 243 x̄: 36.80 x̃: 15 helped stats (rel) min: 0.04% max: 9.23% x̄: 1.26% x̃: 0.83% HURT stats (abs) min: 1 max: 684 x̄: 31.97 x̃: 10 HURT stats (rel) min: 0.03% max: 13.50% x̄: 1.15% x̃: 0.51% 95% mean confidence interval for cycles value: -22.62 2.92 95% mean confidence interval for cycles %-change: -0.68% 0.05% Inconclusive result (value mean confidence interval includes 0). Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8221239 -> 8220357 (-0.01%) instructions in affected programs: 54560 -> 53678 (-1.62%) helped: 186 HURT: 0 helped stats (abs) min: 1 max: 14 x̄: 4.74 x̃: 3 helped stats (rel) min: 0.34% max: 10.77% x̄: 1.97% x̃: 1.17% 95% mean confidence interval for instructions value: -5.21 -4.28 95% mean confidence interval for instructions %-change: -2.23% -1.72% Instructions are helped. total cycles in shared programs: 188654442 -> 188650364 (<.01%) cycles in affected programs: 1454384 -> 1450306 (-0.28%) helped: 204 HURT: 0 helped stats (abs) min: 2 max: 84 x̄: 19.99 x̃: 18 helped stats (rel) min: 0.02% max: 4.69% x̄: 0.56% x̃: 0.22% 95% mean confidence interval for cycles value: -22.38 -17.60 95% mean confidence interval for cycles %-change: -0.67% -0.46% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-06 22:52:28 -07:00
Christian Gmeiner	e00fa99b08	glsl_to_nir: drop supports_ints At initial nir level all drivers are supporting ints. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-07 07:35:59 +02:00
Christian Gmeiner	4e110eca42	nir: nir_shader_compiler_options: drop native_integers Drivers that do not support native integers should use a lowering pass to go from integers to floats. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-07 07:35:52 +02:00
Alyssa Rosenzweig	050b934a24	panfrost: Refactor blend descriptors This commit does a fairly large cleanup of blend descriptors, although there should not be any functional changes. In particular, we split apart the Midgard and Bifrost blend descriptors, since they are radically different. From there, we can identify that the Midgard descriptor as previously written was really two render targets' descriptors stuck together. From this observation, we split the Midgard descriptor into what a single RT actually needs. This enables us to correctly dump blending configuration for MRT samples on Midgard. It also allows the Midgard and Bifrost blend code to peacefully coexist, with runtime selection rather than a #ifdef. So, as a bonus, this will help the future Bifrost effort, eliminating one major source of compile-time architectural divergence. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-05-07 03:21:08 +00:00
Vasily Khoruzhick	d4a249aa09	lima/gpir: enable lowering for ftrunc Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-05-07 01:07:27 +00:00
Vasily Khoruzhick	f4659bea7c	lima/gpir: implement nir_op_fmov Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-05-07 01:07:27 +00:00
Vasily Khoruzhick	cf1ab4b96b	lima: use int_to_float lowering pass Neither GP nor PP in Mali4x0 support integers, so utilize new pass and set native_integers to true for now until this flag is dropped. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-05-07 01:07:27 +00:00
Vasily Khoruzhick	443c5a3cd6	nir: add int_to_float lowering pass This new pass lowers ints and bools to floats. It allows hardware that doesn't have native integers (e.g. Mali4x0) to use the same code paths as modern hardware. It uses a newly introduced pass to gather SSA types and should be used as late as possible. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-05-07 01:07:27 +00:00
Timothy Arceri	49025292fb	radeonsi: add config entry for Counter-Strike Global Offensive This fixes rendering issues with gun scopes which is rather important. Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100239	2019-05-07 09:42:09 +10:00
Vasily Khoruzhick	d085920b64	lima/gpir: fix float uniform alignment issue If PIPE_CAP_PACKED_UNIFORMS is not set uniforms are vec4 aligned, so lima_nir_lower_uniform_to_scalar should use first channel of vec4 for float uniforms. Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-05-06 14:08:09 -07:00
Erik Faye-Lund	d84b85bc28	draw: flush when setting stream-out targets We need to re-prepare the middle-end state to pick up changes to this state to react correctly to pausing/resuming stream-out. So let's add a flush here. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `ec8cbd79ac` "draw/softpipe: EXT_transform_feedback support (v2)" Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-06 22:42:37 +02:00
Erik Faye-Lund	ed53e61bec	llvmpipe: pass stream-out targets to draw-module early We currently set this state in the draw-module twice on each draw, which trashes this state. So far that's not a problem, because we don't really do much from that function. But it turns out, we're going to have to do more; namely flush when the state changes. This will incur a large performance penalty due to the excessive setting. Instead, let's rely on the CSO caching making sure that llvmpipe_set_so_targets doesn't get called needlessly, and set up the state directly there instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-05-06 22:42:37 +02:00
Uros Bizjak	fc7649c4b7	doc: Update GL_KHR_robustness in features.txt for r600 glxinfo for Cypress XT [Radeon HD 5870] lists GL_KHR_robustness as a supported extension. This was the last missing extension for GL 4.5, so mark GL 4.5 as all DONE for r600. Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-05-07 06:21:48 +10:00
Chia-I Wu	c7078397ca	virgl: do not use inline writes for subdata Inline writes skip transfer map/unmap at the cost of an extra copy on the data during execbuffer. That is generally a win for small transfers. But the heuristic to use inline writes based on buffer sizes rather than transfer sizes makes little sense. More importantly, inline writes miss optimizations that are done for buffer transfers. Let's just use transfers. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2019-05-06 10:31:56 -07:00
Chia-I Wu	898be8036d	virgl: rework queries virglrenderer has been changed such that - VIRGL_CCMD_GET_QUERY_RESULT is fenced - query buffers (PIPE_BIND_CUSTOM) are coherent We can check if a query is ready using DRM_IOCTL_VIRTGPU_WAIT, and also avoid a synchronized transfer to retrieve the query result. When running against an older virglrenderer, it falls back to the old behavior automatically. TF2 @ 640x480 for pts4.dem went from 17fps to 40fps on my testing machine. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-06 10:20:40 -07:00
Chia-I Wu	b4da53b0c3	virgl: export resource_is_busy from winsys Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-05-06 10:20:38 -07:00
Samuel Pitoiset	c10808441c	radv: fix rowPitch for R32G32B32 formats on GFX9 The pitch is actually the number of components per row. We found the problem when we implemented some meta operations for these formats and the wrong pitch has been confirmed with a small test case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108325 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-05-06 19:07:44 +02:00
Kenneth Graunke	a032a9665f	iris: Enable PIPE_CAP_SURFACE_REINTERPRET_BLOCKS This makes CompressedTexSubImage from a PBO source do proper GPU rendering to upload instead of stalling to map the PBO source on the CPU (then copying it on the CPU). Thanks Bas Nieuwenhuizen for pointing out that Vulkan includes this functionality, and to Jason Ekstrand for writing the code I adapted. Vulkan only supports a single layer, however, and this code tries to support multiple layers as long as it's miplevel 0. Improves performance in Sid Meier's Civilization VI: Average frame time (ms): -3.67423% +/- 1.46201% (n=5) 99th percentile frame time (ms): -5.09910% +/- 3.87874% (n=5)	2019-05-06 09:50:32 -07:00
Bas Nieuwenhuizen	8139efbbbd	radv: Use given stride for images imported from Android. Handled similarly as radeonsi. I checked the offsets are actually used. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-06 15:36:39 +00:00
Erico Nunes	11602ccd5d	lima/ppir: abort compilation in case of unsupported intrinsic Currently ppir continues compilation when there is an unsupported intrinsic, resulting in a shader that will surely not work as intended. This is a problem during piglit runs as some tests don't compile properly due to this but actually still get submitted to the gpu and leave the system in an unstable state after executing, causing further tests to fail. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-05-06 17:15:27 +02:00
Erico Nunes	60a128fe81	lima/ir: print names of unsupported intrinsics While lima still doesn't support some kinds of intrinsics, it is more helpful to display the name of the unsupported instr->intrinsic to make debugging easier. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com>	2019-05-06 17:15:06 +02:00
John Stultz	c7f2145b4b	mesa: Makefile.sources: Add nir_lower_fb_read.c to Makefile.sources list In commit `a99c360a46` (nir: add pass to lower fb reads), a new file was added that needs to also be added to the Makefile.sources list used by the Android and SCons build system. Cc: Rob Clark <robdclark@chromium.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Amit Pundir <amit.pundir@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Alistair Strachan <astrachan@google.com> Cc: Greg Hartman <ghartman@google.com> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Fixes: `a99c360a46` ("nir: add pass to lower fb reads") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2019-05-06 11:29:26 +00:00
John Stultz	d04f44a459	mesa: Makefile.sources: Add ir3_nir_lower_load_barycentric_at_sample/offset to Makefile.sources In commit `2f0b9d2249` ("freedreno/ir3: lower load_barycentric_at_offset") a new file was added that needs to also be added to the Makefile.sources list used by Android and SCons build system. Cc: Rob Clark <robdclark@chromium.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Amit Pundir <amit.pundir@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Alistair Strachan <astrachan@google.com> Cc: Greg Hartman <ghartman@google.com> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Fixes: `2f0b9d2249` ("freedreno/ir3: lower load_barycentric_at_offset") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2019-05-06 11:29:26 +00:00
John Stultz	c935862127	mesa: android: freedreno: Fix build failure due to path change The ir3_nir_trig.py file was moved in a previous commit, `aa0fed10d3` (freedreno: move ir3 to common location), so update the Android.gen.mk file to match. Cc: Rob Clark <robdclark@chromium.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Amit Pundir <amit.pundir@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Alistair Strachan <astrachan@google.com> Cc: Greg Hartman <ghartman@google.com> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Fixes: `aa0fed10d3` ("freedreno: move ir3 to common location") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: John Stultz <john.stultz@linaro.org>	2019-05-06 11:29:26 +00:00
Amit Pundir	88105375c9	mesa: android: freedreno: build libfreedreno_{drm,ir3} static libs Add libfreedreno_drm/ir3 to the build Cc: Rob Clark <robdclark@chromium.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Amit Pundir <amit.pundir@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Alistair Strachan <astrachan@google.com> Cc: Greg Hartman <ghartman@google.com> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Fixes: `b4476138d5` ("freedreno: move drm to common location") Fixes: `aa0fed10d3` ("freedreno: move ir3 to common location") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> [jstultz: Tweaked to add extra ir3 files from master] Signed-off-by: John Stultz <john.stultz@linaro.org>	2019-05-06 11:29:26 +00:00
Alistair Strachan	0fda3eac31	mesa: android: Remove unnecessary dependency tracking rules The current AOSP master build system breaks building mesa due to the following error: external/mesa3d/src/compiler/Android.glsl.gen.mk:94: error: writing to readonly directory: "external/mesa3d/src/compiler/glsl/ir.h" This error is bogus -- nothing "writes" to ir.h -- but the rule is unnecessary because the generated header that is a dependency of the non-generated header should be added to LOCAL_GENERATED_SOURCES and this will track if the dependency needs to be regenerated. (This change fixes a similar problem affecting nir.h too.) Cc: Rob Clark <robdclark@chromium.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Amit Pundir <amit.pundir@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Alistair Strachan <astrachan@google.com> Cc: Greg Hartman <ghartman@google.com> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Alistair Strachan <astrachan@google.com> [jstultz: Forward ported and tweaked commit subject] Signed-off-by: John Stultz <john.stultz@linaro.org>	2019-05-06 11:29:25 +00:00
Bas Nieuwenhuizen	5692351264	radv: Implement cosited_even sampling. Apparently cosited_even was the required one instead of midpoint. This adds slight offset of 0.5 pixels to the coordinates (+ we need the image size to convert to normalized coords) Fixes: `91702374d5` "radv: Add ycbcr lowering pass." Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-06 11:09:30 +00:00
Michel Dänzer	28784e494e	Restore erroneously removed .gitignore entry for "build" directory It was removed in "delete autotools .gitignore files", but the build directory is created by scons. [Skip CI] Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-05-06 12:11:44 +02:00
Bas Nieuwenhuizen	5cbe12ad1b	radv: Disable subsampled formats. Broken on Polaris and since I discovered NV12 is not subsampled, but a 2-plane format I decided I don't really care. Work to do to re-enable: 1) Figure out which devices support it natively. 2) Write some software emulation for the others. Fixes: `52c1adda21` "radv: Add ycbcr format features." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-05-06 09:53:37 +00:00
Timothy Arceri	1af72fa4d6	util/drirc: add workarounds for bugs in Doom 3: BFG This makes the game playable on radeonsi. Cc: "19.0" "19.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110143	2019-05-06 17:32:36 +10:00
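
Several of the flrp commits in the log above choose between two lowerings of flrp(a, b, c): a+c(b-a) and a(1-c)+bc. The following is a minimal sketch, not Mesa code, of why the precision of (b-a) matters when picking one. It emulates 32-bit float arithmetic in plain Python; the helper names `f32`, `lerp_fast`, and `lerp_precise` are hypothetical and exist only for this illustration.

```python
import struct


def f32(x):
    """Round a Python float to the nearest IEEE-754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def lerp_fast(a, b, c):
    # a + c*(b - a): fewer operations, but (b - a) is rounded first.
    return f32(a + f32(c * f32(b - a)))


def lerp_precise(a, b, c):
    # a*(1 - c) + b*c: returns exactly b when c == 1.
    return f32(f32(a * f32(1.0 - c)) + f32(b * c))


# Inputs of similar magnitude: both forms agree.
print(lerp_fast(1.0, 3.0, 0.5), lerp_precise(1.0, 3.0, 0.5))

# Inputs of very different magnitude: (b - a) rounds to -a in binary32,
# so the fast form loses b entirely at c == 1.
print(lerp_fast(1e8, 1.0, 1.0), lerp_precise(1e8, 1.0, 1.0))
```

With a = 1e8 and b = 1, the subtraction b-a rounds to -1e8 in binary32, so the first form yields 0 at c = 1 while the second yields b. This is the kind of case the "won't lose too much precision" condition in the flrp(#a, #b, c) commit guards against.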

3842 changed files with 458047 additions and 156914 deletions

									
.appveyor/appveyor_msvc.bat
									
				@@ -0,0 +1,66 @@

				goto %1

				:install

				rem Check pip

				if "%buildsystem%" == "scons" (

				    python --version

				    python -m pip --version

				    rem Install Mako

				    python -m pip install Mako==1.0.7

				    rem Install pywin32 extensions, needed by SCons

				    python -m pip install pypiwin32

				    rem Install python wheels, necessary to install SCons via pip

				    python -m pip install wheel

				    rem Install SCons

				    python -m pip install scons==3.0.1

				    call scons --version

				) else (

				    python --version

				    python -m pip install Mako meson

				    meson --version

				    rem Install pkg-config, which meson requires even on windows

				    cinst -y pkgconfiglite

				)

				rem Install flex/bison

				set WINFLEXBISON_ARCHIVE=win_flex_bison-%WINFLEXBISON_VERSION%.zip

				if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "https://github.com/lexxmark/winflexbison/releases/download/v%WINFLEXBISON_VERSION%/%WINFLEXBISON_ARCHIVE%"

				7z x -y -owinflexbison\ "%WINFLEXBISON_ARCHIVE%" > nul

				set Path=%CD%\winflexbison;%Path%

				win_flex --version

				win_bison --version

				rem Download and extract LLVM

				if not exist "%LLVM_ARCHIVE%" appveyor DownloadFile "https://people.freedesktop.org/~jrfonseca/llvm/%LLVM_ARCHIVE%"

				7z x -y "%LLVM_ARCHIVE%" > nul

				if "%buildsystem%" == "scons" (

				    mkdir llvm\bin

				    set LLVM=%CD%\llvm

				) else (

				    move llvm subprojects\

				    copy .appveyor\llvm-wrap.meson subprojects\llvm\meson.build

				)

				goto :eof

				:build_script

				if "%buildsystem%" == "scons" (

				    call scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.1 llvm=1

				) else (

				    call "C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\Tools\VsDevCmd.bat" -arch=x86

				    rem We use default-library as static to affect any wraps (such as expat and zlib)

				    rem it would be better if we could set subprojects buildtype independently,

				    rem but I haven't written that patch yet :)

				    call meson builddir --backend=vs2017 --default-library=static -Dbuild-tests=true -Db_vscrt=mtd --buildtype=release -Dllvm=true -Dgallium-drivers=swrast -Dosmesa=gallium

				    pushd builddir

				    call msbuild mesa.sln /m

				    popd

				)

				goto :eof

				:test_script

				if "%buildsystem%" == "scons" (

				    call scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.1 llvm=1 check

				) else (

				    call meson test -C builddir

				)

				goto :eof

36

.appveyor/llvm-wrap.meson Normal file

@@ -0,0 +1,36 @@
 # A meson.build file for binary wrapping the LLVM used in the appveyor CI
 project('llvm', ['cpp'])
 cpp = meson.get_compiler('cpp')
 _deps = []
 _search = join_paths(meson.current_source_dir(), 'lib')
 foreach d : ['LLVMAnalysis', 'LLVMAsmParser', 'LLVMAsmPrinter',
              'LLVMBinaryFormat', 'LLVMBitReader', 'LLVMBitWriter',
              'LLVMCodeGen', 'LLVMCore', 'LLVMCoroutines', 'LLVMCoverage',
              'LLVMDebugInfoCodeView', 'LLVMDebugInfoDWARF',
              'LLVMDebugInfoMSF', 'LLVMDebugInfoPDB', 'LLVMDemangle',
              'LLVMDlltoolDriver', 'LLVMExecutionEngine', 'LLVMGlobalISel',
              'LLVMInstCombine', 'LLVMInstrumentation', 'LLVMInterpreter',
              'LLVMipo', 'LLVMIRReader', 'LLVMLibDriver', 'LLVMLineEditor',
              'LLVMLinker', 'LLVMLTO', 'LLVMMCDisassembler', 'LLVMMCJIT',
              'LLVMMC', 'LLVMMCParser', 'LLVMMIRParser', 'LLVMObjCARCOpts',
              'LLVMObject', 'LLVMObjectYAML', 'LLVMOption', 'LLVMOrcJIT',
              'LLVMPasses', 'LLVMProfileData', 'LLVMRuntimeDyld',
              'LLVMScalarOpts', 'LLVMSelectionDAG', 'LLVMSupport',
              'LLVMSymbolize', 'LLVMTableGen', 'LLVMTarget',
              'LLVMTransformUtils', 'LLVMVectorize', 'LLVMX86AsmParser',
              'LLVMX86AsmPrinter', 'LLVMX86CodeGen', 'LLVMX86Desc',
              'LLVMX86Disassembler', 'LLVMX86Info', 'LLVMX86Utils',
              'LLVMXRay']
   _deps += cpp.find_library(d, dirs : _search)
 endforeach
 dep_llvm = declare_dependency(
   include_directories : include_directories('include'),
   dependencies : _deps,
   version : '5.0.1',
 )
 has_rtti = false
 irbuilder_h = files('include/llvm/IR/IRBuilder.h')

									
										6

.editorconfig
									
				@@ -32,9 +32,13 @@ indent_size = 2

				indent_style = space

				indent_size = 2

				[*.html]

				indent_style = space

				indent_size = 2

				[*.patch]

				trim_trailing_whitespace = false

				[meson.build,meson_options.txt]

				[{meson.build,meson_options.txt}]

				indent_style = space

				indent_size = 2

4

.gitattributes vendored

@@ -1,4 +0,0 @@
 *.dsp -crlf
 *.dsw -crlf
 *.sln -crlf
 *.vcproj -crlf

2

.gitignore vendored

@@ -1,2 +1,4 @@
 *.pyc
 *.pyo
 *.out
 build

									
										772

.gitlab-ci.yml
									
				@@ -1,5 +1,88 @@

				# This is the tag of the docker image used for the build jobs. If the

				# image doesn't exist yet, the containers-build stage generates it.

				variables:

				  UPSTREAM_REPO: mesa/mesa

				include:

				  - project: 'wayland/ci-templates'

				    # Must be the same as in .gitlab-ci/lava-gitlab-ci.yml

				    ref: 0a9bdd33a98f05af6761ab118b5074952242aab0

				    file: '/templates/debian.yml'

				  - local: '.gitlab-ci/lava-gitlab-ci.yml'

				stages:

				  - container

				  - build

				  - test

				  - success

				# When to automatically run the CI

				.ci-run-policy:

				  rules:

				    # Run pipeline by default for merge requests changing files affecting it

				    - if: '$CI_MERGE_REQUEST_SOURCE_BRANCH_NAME == $CI_COMMIT_REF_NAME'

				      changes: &paths

				      - VERSION

				      - bin/**/*

				      # GitLab CI

				      - .gitlab-ci.yml

				      - .gitlab-ci/**/*

				      # Meson

				      - meson*

				      - build-support/**/*

				      - subprojects/**/*

				      # SCons

				      - SConstruct

				      - scons/**/*

				      - common.py

				      # Source code

				      - include/**/*

				      - src/**/*

				      when: on_success

				    # Run pipeline by default in the main project if files affecting it were

				    # changed

				    - if: '$CI_PROJECT_PATH == "mesa/mesa"'

				      changes:

				        *paths

				      when: on_success

				    # Allow triggering jobs manually on branches of forked projects

				    - if: '$CI_PROJECT_PATH != "mesa/mesa" && $CI_MERGE_REQUEST_SOURCE_BRANCH_NAME != $CI_COMMIT_REF_NAME'

				      when: manual

				    # Otherwise, most jobs won't run

				    - when: never

				  retry:

				    max: 2

				    when:

				      - runner_system_failure

				  # Cancel CI run if a newer commit is pushed to the same branch

				  interruptible: true

				success:

				  stage: success

				  image: debian:stable-slim

				  only:

				    - merge_requests

				  except:

				    changes:

				      *paths

				  variables:

				    GIT_STRATEGY: none

				  script:

				    - echo "Dummy job to make sure every merge request pipeline runs at least one job"

				.ci-deqp-artifacts:

				  artifacts:

				    when: always

				    untracked: false

				    paths:

				      # Watch out!  Artifacts are relative to the build dir.

				      # https://gitlab.com/gitlab-org/gitlab-ce/commit/8788fb925706cad594adf6917a6c5f6587dd1521

				      - artifacts

				# Build the CI docker images.

				#

				# DEBIAN_TAG is the tag of the docker image used by later stage jobs. If the

				# image doesn't exist yet, the container stage job generates it.

				#

				# In order to generate a new image, one should generally change the tag.

				# While removing the image from the registry would also work, that's not

				@@ -12,157 +95,335 @@

				# main repository, it's recommended to remove the image from the source

				# repository's container registry, so that the image from the main

				# repository's registry will be used there as well.

				#

				# The format of the tag is "%Y-%m-%d-${counter}" where ${counter} stays

				# at "01" unless you have multiple updates on the same day :)

				variables:

				  UPSTREAM_REPO: mesa/mesa

				  DEBIAN_TAG: "2019-05-01"

				  DEBIAN_VERSION: stretch-slim

				  DEBIAN_IMAGE: "$CI_REGISTRY_IMAGE/debian/$DEBIAN_VERSION:$DEBIAN_TAG"

				include:

				  - project: 'wayland/ci-templates'

				    ref: c73dae8b84697ef18e2dbbf4fed7386d9652b0cd

				    file: '/templates/debian.yml'

				stages:

				  - containers-build

				  - build+test

				# When to automatically run the CI

				.ci-run-policy: &ci-run-policy

				  only:

				    - branches@mesa/mesa

				    - merge_requests

				    - /^ci([-/].*)?$/

				  retry:

				    max: 2

				    when:

				      - runner_system_failure

				# CONTAINERS

				debian:

				  extends: .debian@container-ifnot-exists

				  stage: containers-build

				  <<: *ci-run-policy

				.container:

				  stage: container

				  extends:

				    - .ci-run-policy

				  variables:

				    GIT_STRATEGY: none # no need to pull the whole tree for rebuilding the image

				    DEBIAN_EXEC: 'bash .gitlab-ci/debian-install.sh'

				    DEBIAN_VERSION: buster-slim

				    REPO_SUFFIX: $CI_JOB_NAME

				    DEBIAN_EXEC: 'bash .gitlab-ci/container/${CI_JOB_NAME}.sh'

				    # no need to pull the whole repo to build the container image

				    GIT_STRATEGY: none

				# Debian 10 based x86 build image

				x86_build:

				  extends:

				    - .debian@container-ifnot-exists

				    - .container

				  variables:

				    DEBIAN_TAG: &x86_build "2020-01-14"

				.use-x86_build:

				  variables:

				    TAG: *x86_build

				  image: "$CI_REGISTRY_IMAGE/debian/x86_build:$TAG"

				  needs:

				    - x86_build

				# Debian 10 based x86 test image for GL

				x86_test-gl:

				  extends: x86_build

				  variables:

				    DEBIAN_TAG: &x86_test-gl "2020-01-14"

				# Debian 10 based x86 test image for VK

				x86_test-vk:

				  extends: x86_build

				  variables:

				    DEBIAN_TAG: &x86_test-vk "2020-01-14"

				  # Can only be triggered manually on personal branches because RADV is the only

				  # driver that does Vulkan testing at the moment.

				  rules:

				    # Never build the test image for VK by default in the main project.

				    - if: '$CI_PROJECT_PATH == "mesa/mesa"'

				      when: never

				    # Never build the test image for VK by default for merge requests.

				    - if: '$CI_MERGE_REQUEST_SOURCE_BRANCH_NAME == $CI_COMMIT_REF_NAME'

				      when: never

				    # Otherwise, allow building it manually for personal branches.

				    - when: manual

				# Debian 9 based x86 build image (old LLVM)

				x86_build_old:

				  extends: x86_build

				  variables:

				    DEBIAN_TAG: &x86_build_old "2019-09-18"

				    DEBIAN_VERSION: stretch-slim

				.use-x86_build_old:

				  variables:

				    TAG: *x86_build_old

				  image: "$CI_REGISTRY_IMAGE/debian/x86_build_old:$TAG"

				  needs:

				    - x86_build_old

				# Debian 10 based ARM build image

				arm_build:

				  extends:

				    - .debian@container-ifnot-exists@arm64v8

				    - .container

				  variables:

				    DEBIAN_TAG: &arm_build "2020-01-14"

				.use-arm_build:

				  variables:

				    TAG: *arm_build

				  image: "$CI_REGISTRY_IMAGE/debian/arm_build:$TAG"

				  needs:

				    - arm_build

				# Debian 10 based ARM test image

				arm_test:

				  extends: arm_build

				  variables:

				    DEBIAN_TAG: &arm_test "2019-12-18"

				.use-arm_test:

				  variables:

				    TAG: *arm_test

				  image: "$CI_REGISTRY_IMAGE/debian/arm_test:$TAG"

				  needs:

				    - meson-arm64

				    - arm_test

				# BUILD

				.build:

				  <<: *ci-run-policy

				  image: $DEBIAN_IMAGE

				  stage: build+test

				  cache:

				    paths:

				      - ccache

				# Shared between windows and Linux

				.build-common:

				  extends: .ci-run-policy

				  stage: build

				  artifacts:

				    when: on_failure

				    untracked: true

				    when: always

				    paths:

				      - _build/meson-logs/*.txt

				      # scons:

				      - build/*/config.log

				      - shader-db

				# Just Linux

				.build-linux:

				  extends: .build-common

				  variables:

				    CCACHE_COMPILERCHECK: "content"

				    CCACHE_COMPRESS: "true"

				    CCACHE_DIR: /cache/mesa/ccache

				  # Use ccache transparently, and print stats before/after

				  before_script:

				    - export PATH="/usr/lib/ccache:$PATH"

				    - export CCACHE_BASEDIR="$PWD"

				    - export CCACHE_DIR="$PWD/ccache"

				    - ccache --zero-stats || true

				    - ccache --show-stats || true

				    - ccache --show-stats

				  after_script:

				    - export CCACHE_DIR="$PWD/ccache"

				    - ccache --show-stats

				.build-windows:

				  extends: .build-common

				  tags:

				    - mesa-windows

				  cache:

				    key: ${CI_JOB_NAME}

				    paths:

				      - subprojects/packagecache

				.meson-build:

				  extends: .build

				  extends:

				    - .build-linux

				    - .use-x86_build

				  variables:

				    LLVM_VERSION: 9

				  script:

				    # We need to control the version of llvm-config we're using, so we'll

				    # generate a native file to do so. This requires meson >=0.49

				    - if test -n "$LLVM_VERSION"; then

				        LLVM_CONFIG="llvm-config-${LLVM_VERSION}";

				        echo -e "[binaries]\nllvm-config = '`which $LLVM_CONFIG`'" > native.file;

				        $LLVM_CONFIG --version;

				      else

				        touch native.file;

				      fi

				    - meson --version

				    - meson _build

				            --native-file=native.file

				            -D buildtype=debug

				            -D build-tests=true

				            -D libunwind=${UNWIND}

				            ${DRI_LOADERS}

				            -D dri-drivers=${DRI_DRIVERS:-[]}

				            ${GALLIUM_ST}

				            -D gallium-drivers=${GALLIUM_DRIVERS:-[]}

				            -D vulkan-drivers=${VULKAN_DRIVERS:-[]}

				            -D I-love-half-baked-turnips=true

				    - cd _build

				    - meson configure

				    - ninja -j4

				    - LC_ALL=C.UTF-8 ninja test

				    - .gitlab-ci/meson-build.sh

				.scons-build:

				  extends: .build

				  extends:

				    - .build-linux

				    - .use-x86_build

				  variables:

				    SCONSFLAGS: "-j4"

				  script:

				    - if test -n "$LLVM_VERSION"; then

				        export LLVM_CONFIG="llvm-config-${LLVM_VERSION}";

				      fi

				    - scons $SCONS_TARGET

				    - eval $SCONS_CHECK_COMMAND

				    - .gitlab-ci/scons-build.sh

				# NOTE: Building SWR is 2x (yes two) times slower than all the other

				# gallium drivers combined.

				# Start this early so that it doesn't limit the total run time.

				#

				# We also put softpipe (and therefore gallium nine, which requires

				# it) here, since softpipe/llvmpipe can't be built alongside classic

				# swrast.

				#

				# Putting glvnd here is arbitrary, but we want it in one of the builds

				# for coverage.

				meson-swr-glvnd:

				meson-testing:

				  extends:

				    - .meson-build

				    - .ci-deqp-artifacts

				  variables:

				    UNWIND: "true"

				    DRI_LOADERS: >

				      -D glx=dri

				      -D gbm=true

				      -D egl=true

				      -D platforms=x11,drm,surfaceless

				    GALLIUM_ST: >

				      -D dri3=true

				    GALLIUM_DRIVERS: "swrast"

				    VULKAN_DRIVERS: amd

				    BUILDTYPE: "debugoptimized"

				  script:

				    - .gitlab-ci/meson-build.sh

				    - .gitlab-ci/prepare-artifacts.sh

				meson-main:

				  extends: .meson-build

				  variables:

				    UNWIND: "true"

				    DRI_LOADERS: >

				      -D glvnd=true

				      -D glx=dri

				      -D gbm=true

				      -D egl=true

				      -D platforms=x11,wayland,drm,surfaceless

				    DRI_DRIVERS: "i915,i965,r100,r200,nouveau"

				    GALLIUM_ST: >

				      -D dri3=true

				      -D gallium-extra-hud=true

				      -D gallium-vdpau=true

				      -D gallium-xvmc=true

				      -D gallium-omx=bellagio

				      -D gallium-va=true

				      -D gallium-xa=true

				      -D gallium-nine=true

				      -D gallium-opencl=disabled

				    GALLIUM_DRIVERS: "iris,nouveau,kmsro,r300,r600,freedreno,swr,swrast,svga,v3d,vc4,virgl,etnaviv,panfrost,lima,zink"

				    EXTRA_OPTION: >

				      -D osmesa=gallium

				      -D tools=all

				  script:

				    - .gitlab-ci/meson-build.sh

				    - .gitlab-ci/run-shader-db.sh

				.meson-cross:

				  extends:

				    - .meson-build

				  variables:

				    UNWIND: "false"

				    DRI_LOADERS: >

				      -D glx=disabled

				      -D gbm=false

				      -D egl=true

				      -D platforms=surfaceless

				      -D osmesa=none

				    GALLIUM_ST: >

				      -D dri3=false

				      -D gallium-vdpau=false

				      -D gallium-xvmc=false

				      -D gallium-omx=disabled

				      -D gallium-va=false

				      -D gallium-xa=false

				      -D gallium-nine=true

				      -D gallium-opencl=disabled

				      -D osmesa=gallium

				    GALLIUM_DRIVERS: "swr,swrast,iris"

				    LLVM_VERSION: "6.0"

				      -D gallium-nine=false

				.meson-arm:

				  extends:

				    - .meson-cross

				    - .use-arm_build

				  variables:

				    VULKAN_DRIVERS: freedreno

				    GALLIUM_DRIVERS: "etnaviv,freedreno,kmsro,lima,nouveau,panfrost,swrast,tegra,v3d,vc4"

				    BUILDTYPE: "debugoptimized"

				    EXTRA_OPTION: >

				      -D I-love-half-baked-turnips=true

				  tags:

				    - aarch64

				meson-armhf:

				  extends:

				    - .meson-arm

				    - .ci-deqp-artifacts

				  variables:

				    CROSS: armhf

				    LLVM_VERSION: "7"

				  script:

				    - .gitlab-ci/meson-build.sh

				    - .gitlab-ci/prepare-artifacts.sh

				meson-arm64:

				  extends:

				    - .meson-arm

				    - .ci-deqp-artifacts

				  variables:

				    LLVM_VERSION: "8"

				    VULKAN_DRIVERS: "freedreno,amd"

				  script:

				    - .gitlab-ci/meson-build.sh

				    - .gitlab-ci/prepare-artifacts.sh

				meson-clang:

				  extends: .meson-build

				  variables:

				    UNWIND: "true"

				    DRI_LOADERS: >

				      -D glvnd=true

				    DRI_DRIVERS: "auto"

				    GALLIUM_DRIVERS: "auto"

				    VULKAN_DRIVERS: intel,amd,freedreno

				    CC: "ccache clang-8"

				    CXX: "ccache clang++-8"

				    CC: "ccache clang-9"

				    CXX: "ccache clang++-9"

				.meson-windows:

				  extends:

				    - .build-windows

				  before_script:

				    - export CCACHE_BASEDIR="$PWD" CCACHE_DIR="$PWD/ccache"

				    - ccache --zero-stats --show-stats || true

				     # clang++ breaks if it picks up the GCC 8 directory without libstdc++.so

				    - apt-get remove -y libgcc-8-dev

				    - $ENV:ARCH = "x86"

				    - $ENV:VERSION = "2019\Community"

				  script:

				    - cmd /C .gitlab-ci\meson-build.bat

				scons-swr:

				  extends: .scons-build

				  variables:

				    SCONS_TARGET: "swr=1"

				    SCONS_CHECK_COMMAND: "true"

				    LLVM_VERSION: "6.0"

				scons-win64:

				  extends: .scons-build

				  variables:

				    SCONS_TARGET: platform=windows machine=x86_64

				    SCONS_CHECK_COMMAND: "true"

				meson-clover:

				  extends: .meson-build

				  variables:

				    UNWIND: "true"

				    DRI_LOADERS: >

				      -D glx=disabled

				      -D egl=false

				      -D gbm=false

				    GALLIUM_ST: >

				      -D dri3=false

				      -D gallium-vdpau=false

				      -D gallium-xvmc=false

				      -D gallium-omx=disabled

				      -D gallium-va=false

				      -D gallium-xa=false

				      -D gallium-nine=false

				      -D gallium-opencl=icd

				  script:

				    - export GALLIUM_DRIVERS="r600,radeonsi"

				    - .gitlab-ci/meson-build.sh

				    - LLVM_VERSION=8 .gitlab-ci/meson-build.sh

				    - export GALLIUM_DRIVERS="i915,r600"

				    - LLVM_VERSION=6.0 .gitlab-ci/meson-build.sh

				    - LLVM_VERSION=7 .gitlab-ci/meson-build.sh

				meson-clover-old-llvm:

				  extends:

				    - meson-clover

				    - .use-x86_build_old

				  variables:

				    UNWIND: "false"

				    DRI_LOADERS: >

				      -D glx=disabled

				      -D egl=false

				      -D gbm=false

				      -D platforms=drm,surfaceless

				    GALLIUM_DRIVERS: "i915,r600"

				  script:

				    - LLVM_VERSION=3.9 .gitlab-ci/meson-build.sh

				    - LLVM_VERSION=4.0 .gitlab-ci/meson-build.sh

				    - LLVM_VERSION=5.0 .gitlab-ci/meson-build.sh

				meson-vulkan:

				  extends: .meson-build

				@@ -183,82 +444,247 @@ meson-vulkan:

				      -D gallium-xa=false

				      -D gallium-nine=false

				      -D gallium-opencl=disabled

				      -D b_sanitize=undefined

				      -D c_args=-fno-sanitize-recover=all

				      -D cpp_args=-fno-sanitize-recover=all

				    UBSAN_OPTIONS: "print_stacktrace=1"

				    VULKAN_DRIVERS: intel,amd,freedreno

				    LLVM_VERSION: "7"

				    EXTRA_OPTION: >

				      -D vulkan-overlay-layer=true

				meson-main:

				  extends: .meson-build

				# While the main point of this build is testing the i386 cross build,

				# we also use this one to test some other options that are exclusive

				# with meson-main's choices (classic swrast and osmesa)

				meson-i386:

				  extends: .meson-cross

				  variables:

				    UNWIND: "true"

				    DRI_LOADERS: >

				      -D glx=dri

				      -D gbm=true

				      -D egl=true

				      -D platforms=x11,wayland,drm,surfaceless

				    CROSS: i386

				    VULKAN_DRIVERS: intel

				    DRI_DRIVERS: "swrast"

				    GALLIUM_DRIVERS: "iris"

				    EXTRA_OPTION: >

				      -D vulkan-overlay-layer=true

				      -D llvm=false

				      -D osmesa=classic

				    DRI_DRIVERS: "i915,i965,r100,r200,swrast,nouveau"

				    GALLIUM_ST: >

				      -D dri3=true

				      -D gallium-extra-hud=true

				      -D gallium-vdpau=true

				      -D gallium-xvmc=true

				      -D gallium-omx=bellagio

				      -D gallium-va=true

				      -D gallium-xa=true

				      -D gallium-nine=false

				      -D gallium-opencl=disabled

				    GALLIUM_DRIVERS: "iris,nouveau,kmsro,r300,r600,freedreno,svga,v3d,vc4,virgl,etnaviv,panfrost,lima"

				    LLVM_VERSION: "7"

				      -D werror=true

				meson-clover-llvm:

				meson-mingw32-x86_64:

				  extends: .meson-build

				  variables:

				    UNWIND: "true"

				    DRI_LOADERS: >

				      -D glx=disabled

				      -D egl=false

				      -D gbm=false

				    GALLIUM_ST: >

				      -D dri3=false

				      -D gallium-vdpau=false

				      -D gallium-xvmc=false

				      -D gallium-omx=disabled

				      -D gallium-va=false

				      -D gallium-xa=false

				      -D gallium-nine=false

				      -D gallium-opencl=icd

				    GALLIUM_DRIVERS: "r600,radeonsi"

				    UNWIND: "false"

				    DRI_DRIVERS: ""

				    GALLIUM_DRIVERS: "swrast"

				    EXTRA_OPTION: >

				      -Dllvm=false

				      -Dosmesa=gallium

				      --cross-file=.gitlab-ci/x86_64-w64-mingw32

				meson-clover-llvm39:

				  extends: meson-clover-llvm

				  variables:

				    GALLIUM_DRIVERS: "i915,r600"

				    LLVM_VERSION: "3.9"

				scons-nollvm:

				  extends: .scons-build

				  variables:

				    SCONS_TARGET: "llvm=0"

				    SCONS_CHECK_COMMAND: "scons llvm=0 check"

				scons-llvm:

				scons:

				  extends: .scons-build

				  variables:

				    SCONS_TARGET: "llvm=1"

				    SCONS_CHECK_COMMAND: "scons llvm=1 check"

				    LLVM_VERSION: "3.4"

				    # LLVM 3.4 packages were built with an old libstdc++ ABI

				    CXX: "g++ -D_GLIBCXX_USE_CXX11_ABI=0"

				    SCONS_CHECK_COMMAND: "scons llvm=1 force_scons=1 check"

				  script:

				    - SCONS_TARGET="" SCONS_CHECK_COMMAND="scons check force_scons=1" .gitlab-ci/scons-build.sh

				    - LLVM_VERSION=9 .gitlab-ci/scons-build.sh

				scons-swr:

				  extends: .scons-build

				  variables:

				    SCONS_TARGET: "swr=1"

				    SCONS_CHECK_COMMAND: "true"

				    LLVM_VERSION: "6.0"

				scons-old-llvm:

				  extends:

				    - scons

				    - .use-x86_build_old

				  script:

				    - LLVM_VERSION=3.9 .gitlab-ci/scons-build.sh

				scons-win64:

				  extends: .scons-build

				.test:

				  extends:

				    - .ci-run-policy

				  stage: test

				  variables:

				    SCONS_TARGET: platform=windows machine=x86_64

				    SCONS_CHECK_COMMAND: "true"

				    GIT_STRATEGY: none # testing doesn't build anything from source

				  before_script:

				    # Note: Build dir (and thus install) may be dirty due to GIT_STRATEGY

				    - rm -rf install

				    - tar -xf artifacts/install.tar

				    - LD_LIBRARY_PATH=install/lib find install/lib -name "*.so" -print -exec ldd {} \;

				  artifacts:

				    when: always

				    name: "$CI_JOB_NAME-$CI_COMMIT_REF_NAME"

				    paths:

				      - results/

				  dependencies:

				    - meson-testing

				.test-gl:

				  extends:

				    - .test

				  variables:

				    TAG: *x86_test-gl

				  image: "$CI_REGISTRY_IMAGE/debian/x86_test-gl:$TAG"

				  needs:

				    - meson-testing

				    - x86_test-gl

				.test-vk:

				  extends:

				    - .test

				  variables:

				    TAG: *x86_test-vk

				  image: "$CI_REGISTRY_IMAGE/debian/x86_test-vk:$TAG"

				  needs:

				    - meson-testing

				    - x86_test-vk

				.piglit-test:

				  extends: .test-gl

				  artifacts:

				    when: on_failure

				    name: "$CI_JOB_NAME-$CI_COMMIT_REF_NAME"

				    paths:

				      - summary/

				  variables:

				    LIBGL_ALWAYS_SOFTWARE: 1

				    PIGLIT_NO_WINDOW: 1

				  script:

				    - artifacts/piglit/run.sh

				piglit-quick_gl:

				  extends: .piglit-test

				  variables:

				    LP_NUM_THREADS: 0

				    NIR_VALIDATE: 0

				    PIGLIT_OPTIONS: >

				      --process-isolation false

				      -x arb_gpu_shader5

				      -x egl_ext_device_

				      -x egl_ext_platform_device

				      -x ext_timer_query@time-elapsed

				      -x glx-multithread-clearbuffer

				      -x glx-multithread-shader-compile

				      -x max-texture-size

				      -x maxsize

				    PIGLIT_PROFILES: quick_gl

				piglit-glslparser:

				  extends: .piglit-test

				  variables:

				    LP_NUM_THREADS: 0

				    NIR_VALIDATE: 0

				    PIGLIT_PROFILES: glslparser

				piglit-quick_shader:

				  extends: .piglit-test

				  variables:

				    LP_NUM_THREADS: 1

				    NIR_VALIDATE: 0

				    PIGLIT_PROFILES: quick_shader

				.deqp-test:

				  variables:

				    DEQP_SKIPS: deqp-default-skips.txt

				  script:

				    - ./artifacts/deqp-runner.sh

				.deqp-test-gl:

				  extends:

				    - .test-gl

				    - .deqp-test

				.deqp-test-vk:

				  extends:

				    - .test-vk

				    - .deqp-test

				  variables:

				    DEQP_VER: vk

				test-llvmpipe-gles2:

				  variables:

				    DEQP_VER: gles2

				    DEQP_PARALLEL: 4

				    NIR_VALIDATE: 0

				    # Don't use threads inside llvmpipe, we've already got all 4 cores

				    # busy with DEQP_PARALLEL.

				    LP_NUM_THREADS: 0

				    DEQP_EXPECTED_FAILS: deqp-llvmpipe-fails.txt

				    LIBGL_ALWAYS_SOFTWARE: "true"

				  extends: .deqp-test-gl

				test-softpipe-gles2:

				  extends: test-llvmpipe-gles2

				  variables:

				    DEQP_EXPECTED_FAILS: deqp-softpipe-fails.txt

				    DEQP_SKIPS: deqp-softpipe-skips.txt

				    GALLIUM_DRIVER: "softpipe"

				test-softpipe-gles3:

				  parallel: 2

				  variables:

				    DEQP_VER: gles3

				  extends: test-softpipe-gles2

				test-softpipe-gles31:

				  parallel: 4

				  variables:

				    DEQP_VER: gles31

				  extends: test-softpipe-gles2

				arm64_a630_gles2:

				  extends:

				    - .deqp-test-gl

				    - .use-arm_test

				  variables:

				    DEQP_VER: gles2

				    DEQP_EXPECTED_FAILS: deqp-freedreno-a630-fails.txt

				    DEQP_SKIPS: deqp-freedreno-a630-skips.txt

				    NIR_VALIDATE: 0

				    DEQP_PARALLEL: 4

				    FLAKES_CHANNEL: "#freedreno-ci"

				  tags:

				    - mesa-cheza

				  dependencies:

				    - meson-arm64

				arm64_a630_gles31:

				  extends: arm64_a630_gles2

				  variables:

				    DEQP_VER: gles31

				arm64_a630_gles3:

				  extends: arm64_a630_gles2

				  variables:

				    DEQP_VER: gles3

				arm64_a306_gles2:

				  extends: arm64_a630_gles2

				  variables:

				    DEQP_EXPECTED_FAILS: deqp-freedreno-a307-fails.txt

				    DEQP_SKIPS: deqp-default-skips.txt

				  tags:

				    - db410c

				# RADV CI

				.test-radv:

				  variables:

				    VK_DRIVER: radeon

				    RADV_DEBUG: checkir

				  # Can only be triggered manually on personal branches because RADV is the only

				  # driver that does Vulkan testing at the moment.

				  rules:

				    # Never test RADV by default in the main project.

				    - if: '$CI_PROJECT_PATH == "mesa/mesa"'

				      when: never

				    # Never test RADV by default for merge requests.

				    - if: '$CI_MERGE_REQUEST_SOURCE_BRANCH_NAME == $CI_COMMIT_REF_NAME'

				      when: never

				    # Otherwise, allow testing RADV if the test image for VK has been manually

				    # started.

				    - when: on_success

				radv_polaris10_vkcts:

				  extends:

				    - .deqp-test-vk

				    - .test-radv

				  variables:

				    DEQP_PARALLEL: 4

				    DEQP_SKIPS: deqp-radv-polaris10-skips.txt

				  tags:

				    - polaris10

									
										122

.gitlab-ci/README.md
									
										Normal file
									
				@@ -0,0 +1,122 @@

				## Mesa testing using gitlab-runner

				The goal of the "test" stage of the .gitlab-ci.yml is to do pre-merge

				testing of Mesa drivers on various platforms, so that we can ensure no

				regressions are merged, as long as developers are merging code using

				the "Merge when pipeline succeeds" button.

				This document only covers the CI from .gitlab-ci.yml and this

				directory.  For other CI systems, see Intel's [Mesa

				CI](https://gitlab.freedesktop.org/Mesa_CI) or panfrost's LAVA-based

				CI (`src/gallium/drivers/panfrost/ci/`).

				### Software architecture

				For freedreno and llvmpipe CI, we're using gitlab-runner on the test

				devices (DUTs), cached docker containers with VK-GL-CTS, and the

				normal shared x86_64 runners to build the Mesa drivers to be run

				inside of those containers on the DUTs.

				The docker containers are rebuilt from the debian-install.sh script

				when DEBIAN\_TAG is changed in .gitlab-ci.yml, and

				debian-test-install.sh when DEBIAN\_ARM64\_TAG is changed in

				.gitlab-ci.yml.  The resulting images are around 500MB, and are

				expected to change approximately weekly (though an individual

				developer working on them may produce many more images while trying to

				come up with a working MR!).

				gitlab-runner is a client that polls gitlab.freedesktop.org for

				available jobs, with no inbound networking requirements.  Jobs can

				have tags, so we can have DUT-specific jobs that only run on runners

				with that tag marked in the gitlab UI.

				Since dEQP takes a long time to run, we mark the job as "parallel" at

				some level, which spawns multiple jobs from one definition, and then

				deqp-runner.sh takes the corresponding fraction of the test list for

				that job.
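
Editorial sketch (not part of the diff): the sharding described in the paragraph above could look roughly like the following. It assumes GitLab's predefined `CI_NODE_INDEX` (1-based) and `CI_NODE_TOTAL` variables for `parallel` jobs; the function names and the round-robin split are illustrative assumptions, not necessarily what deqp-runner.sh does.

```python
import os


def shard(tests, node_index, node_total):
    """Return the slice of `tests` owned by one parallel job.

    node_index is 1-based, matching GitLab's CI_NODE_INDEX.
    Tests are dealt out round-robin so that slow test groups that sit
    next to each other in the list get spread across jobs.
    """
    if node_total < 1 or not 1 <= node_index <= node_total:
        raise ValueError("node_index must be within 1..node_total")
    return tests[node_index - 1::node_total]


def shard_from_env(tests, env=os.environ):
    """Pick this job's shard; a non-parallel job runs the whole list."""
    index = int(env.get("CI_NODE_INDEX", "1"))
    total = int(env.get("CI_NODE_TOTAL", "1"))
    return shard(tests, index, total)


if __name__ == "__main__":
    # Hypothetical test names, for illustration only.
    all_tests = [f"dEQP-GLES2.case_{i}" for i in range(10)]
    for line in shard_from_env(all_tests):
        print(line)
```

Every test lands in exactly one shard, so the union of all parallel jobs covers the full list without overlap.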

				To reduce dEQP runtime (or avoid tests with unreliable results), a

				deqp-runner.sh invocation can provide a list of tests to skip.  If

				your driver is not yet conformant, you can pass a list of expected

				failures, and the job will only fail on tests that aren't listed (look

				at the job's log for which specific tests failed).
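
Editorial sketch (not part of the diff): the skip list and expected-failure list described above could be applied along these lines. The function names, the `"Pass"`/`"Fail"` status strings, and exact-name matching are assumptions made for illustration; the real deqp-runner.sh may match entries differently.

```python
def select_tests(all_tests, skips):
    """Drop skipped tests before the run, preserving order."""
    skip_set = set(skips)
    return [name for name in all_tests if name not in skip_set]


def unexpected_failures(results, expected_fails):
    """Return failing tests that are not on the expected-failure list.

    `results` maps test name -> status string ("Pass" or "Fail").
    """
    expected = set(expected_fails)
    return sorted(
        name
        for name, status in results.items()
        if status == "Fail" and name not in expected
    )


def job_passes(results, expected_fails):
    """The job fails only when some failure was not listed in advance."""
    return not unexpected_failures(results, expected_fails)
```

With this shape, a driver that is not yet conformant keeps its known failures in the expected list, and only a newly failing test turns the job red.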

				### DUT requirements

				#### DUTs must have a stable kernel and GPU reset.

				If the system goes down during a test run, that job will eventually

				time out and fail (default 1 hour).  However, if the kernel can't

				reliably reset the GPU on failure, bugs in one MR may leak into

				spurious failures in another MR.  This would be an unacceptable impact

				on Mesa developers working on other drivers.

				#### DUTs must be able to run docker

				The Mesa gitlab-runner based test architecture is built around docker,

				so that we can cache the debian package installation and CTS build

				step across multiple test runs.  Since the images are large and change

				approximately weekly, the DUTs also need to be running some script to

				prune stale docker images periodically in order to not run out of disk

				space as we rev those containers (perhaps [this

				script](https://gitlab.com/gitlab-org/gitlab-runner/issues/2980#note_169233611)).
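
As one possible shape for such a pruning script (not the linked one), the sketch below wraps `docker image prune` with an age filter. The one-week default, the function name, and the `DRY_RUN` switch are assumptions for illustration. Note that the `until` filter goes by image creation time, not by when the image was last used.

```shell
#!/bin/sh
# Hypothetical pruning helper for a DUT; the name, the default age and the
# DRY_RUN switch are illustrative assumptions.
#   $1: maximum image age as a duration (default: 168h, one week)
prune_stale_images() {
    max_age="${1:-168h}"
    # --all also removes tagged images that no container is using.
    set -- docker image prune --all --force --filter "until=${max_age}"
    if [ -n "${DRY_RUN:-}" ]; then
        # Only show what would be run.
        echo "$*"
    else
        "$@"
    fi
}
```

Something like this could be called from a daily cron job or systemd timer on each DUT. Running it once with `DRY_RUN=1` shows the command without deleting anything.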

				Note that docker doesn't allow containers to be stored on NFS, and

				doesn't allow multiple docker daemons to interact with the same

				network block device, so you will probably need some sort of physical

				storage on your DUTs.

				#### DUTs must be public

				By including your device in .gitlab-ci.yml, you're effectively letting

				anyone on the internet run code on your device.  docker containers may

				provide some limited protection, but how much you trust that and what

				you do to mitigate hostile access is up to you.

				#### DUTs must expose the dri device nodes to the containers

				Obviously, to get access to the HW, we need to pass the render node

				through.  This is done by adding `devices = ["/dev/dri"]` to the

				`runners.docker` section of /etc/gitlab-runner/config.toml.
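
To confirm from inside a container that the passthrough worked, a quick check along these lines may help. It assumes render nodes follow the usual `renderD<N>` naming under `/dev/dri`; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical sanity check: list DRM render nodes under a directory
# (default /dev/dri) and return non-zero if there are none.
list_render_nodes() {
    dri_dir="${1:-/dev/dri}"
    found=1
    for node in "$dri_dir"/renderD*; do
        # An unmatched glob stays literal, so test that the path exists.
        if [ -e "$node" ]; then
            echo "$node"
            found=0
        fi
    done
    return "$found"
}
```

Running `list_render_nodes || echo "no render node visible"` early in a test job would make a missing `devices` entry fail fast, instead of surfacing later as a driver error.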

				### HW CI farm expectations

				To make sure that testing of one vendor's drivers doesn't block

				unrelated work by other vendors, we require that a given driver's test

				farm produces a spurious failure no more than once a week.  If every

				driver had CI and failed once a week, we would be seeing someone's

				code getting blocked on a spurious failure daily, which is an

				unacceptable cost to the project.

				Additionally, the test farm needs to be able to provide a short enough

				turnaround time that people can regularly use the "Merge when pipeline

				succeeds" button successfully (until we get

				[marge-bot](https://github.com/smarkets/marge-bot) in place on

				freedesktop.org).  As a result, we require that the test farm be able

				to handle a whole pipeline's worth of jobs in less than 5 minutes (to

				compare, the build stage is about 10 minutes, if you could get all

				your jobs scheduled on the shared runners in time).

				If a test farm is short the HW to provide these guarantees, consider

				dropping tests to reduce runtime.

				`VK-GL-CTS/scripts/log/bottleneck_report.py` can help you find what

				tests were slow in a `results.qpa` file.  Or, you can have a job with

				no `parallel` field set and:

				```

				  variables:

				    CI_NODE_INDEX: 1

				    CI_NODE_TOTAL: 10

				```

				to just run 1/10th of the test list.

				If a HW CI farm goes offline (network dies and all CI pipelines end up

				stalled) or its runners are consistently spuriously failing (disk

				full?), and the maintainer is not immediately available to fix the

				issue, please push through an MR disabling that farm's jobs by adding

				'.' to the front of the job names until the maintainer can bring

				things back up.  If this happens, the farm maintainer should provide a

				report to mesa-dev@lists.freedesktop.org after the fact explaining

				what happened and what the mitigation plan is for that failure next

				time.
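
GitLab CI treats jobs whose names start with '.' as hidden and does not run them, which is why the prefix disables a farm's jobs. A minimal sketch of the rename follows. It assumes each job is a top-level key in `.gitlab-ci.yml` written as `name:` at the start of a line; the function name and the job name in the usage line are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: copy the YAML on stdin to stdout with the top-level
# job named $1 renamed to ".$1" so that GitLab CI skips it.
hide_job() {
    # index() does a plain string match, so dots or dashes in the job
    # name need no regex escaping; == 1 means "at the start of the line".
    awk -v job="$1" 'index($0, job ":") == 1 { print "." $0; next } { print }'
}
```

For example, `hide_job my-farm-test < .gitlab-ci.yml > .gitlab-ci.yml.new`, followed by a review of the result before it goes into the MR. Other jobs that name the renamed one in `extends` or `needs` would also have to be updated by hand.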

46

.gitlab-ci/arm.config Normal file


@@ -0,0 +1,46 @@
 CONFIG_LOCALVERSION="ccu"
 CONFIG_DEBUG_KERNEL=y
 CONFIG_DEVFREQ_GOV_PERFORMANCE=y
 CONFIG_DEVFREQ_GOV_POWERSAVE=y
 CONFIG_DEVFREQ_GOV_USERSPACE=y
 CONFIG_DEVFREQ_GOV_PASSIVE=y
 CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND=y
 CONFIG_DRM=y
 CONFIG_DRM_ROCKCHIP=y
 CONFIG_DRM_PANFROST=y
 CONFIG_DRM_LIMA=y
 CONFIG_DRM_PANEL_SIMPLE=y
 CONFIG_PWM_CROS_EC=y
 CONFIG_BACKLIGHT_PWM=y
 CONFIG_ROCKCHIP_CDN_DP=n
 CONFIG_SPI_ROCKCHIP=y
 CONFIG_PWM_ROCKCHIP=y
 CONFIG_PHY_ROCKCHIP_DP=y
 CONFIG_DWMAC_ROCKCHIP=y
 CONFIG_MFD_RK808=y
 CONFIG_REGULATOR_RK808=y
 CONFIG_RTC_DRV_RK808=y
 CONFIG_COMMON_CLK_RK808=y
 CONFIG_REGULATOR_FAN53555=y
 CONFIG_REGULATOR=y
 CONFIG_REGULATOR_VCTRL=y
 CONFIG_KASAN=n
 CONFIG_KASAN_INLINE=n
 CONFIG_STACKTRACE=n
 CONFIG_TMPFS=y
 CONFIG_PROVE_LOCKING=n
 CONFIG_DEBUG_LOCKDEP=n
 CONFIG_SOFTLOCKUP_DETECTOR=n
 CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=n

9

src/gallium/drivers/panfrost/ci/arm64.config → .gitlab-ci/arm64.config


@@ -10,6 +10,7 @@ CONFIG_DEVFREQ_GOV_PASSIVE=y
 CONFIG_DRM=y
 CONFIG_DRM_ROCKCHIP=y
 CONFIG_DRM_PANFROST=y
 CONFIG_DRM_LIMA=y
 CONFIG_DRM_PANEL_SIMPLE=y
 CONFIG_PWM_CROS_EC=y
 CONFIG_BACKLIGHT_PWM=y
@@ -25,7 +26,6 @@ CONFIG_TYPEC_FUSB302=y
 CONFIG_TYPEC=y
 CONFIG_TYPEC_TCPM=y
 CONFIG_ARCH_SUNXI=n
 CONFIG_ARCH_ALPINE=n
 CONFIG_ARCH_BCM2835=n
 CONFIG_ARCH_BCM_IPROC=n
@@ -37,7 +37,6 @@ CONFIG_ARCH_LAYERSCAPE=n
 CONFIG_ARCH_LG1K=n
 CONFIG_ARCH_HISI=n
 CONFIG_ARCH_MEDIATEK=n
 CONFIG_ARCH_MESON=n
 CONFIG_ARCH_MVEBU=n
 CONFIG_ARCH_QCOM=n
 CONFIG_ARCH_SEATTLE=n
@@ -78,5 +77,7 @@ CONFIG_TMPFS=y
 CONFIG_PROVE_LOCKING=n
 CONFIG_DEBUG_LOCKDEP=n
 CONFIG_SOFTLOCKUP_DETECTOR=n
 CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=n
 CONFIG_SOFTLOCKUP_DETECTOR=y
 CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
 CONFIG_DETECT_HUNG_TASK=y

									
										10

.gitlab-ci/build-cts-runner.sh
									
										Normal file
									
												
				@@ -0,0 +1,10 @@

				#!/bin/bash

				set -ex

				git clone https://gitlab.freedesktop.org/mesa/parallel-deqp-runner.git --depth 1 -b mesa-ci-2019-12-17

				cd parallel-deqp-runner

				meson build/ $EXTRA_MESON_ARGS

				ninja -C build -j4 install

				cd ..

				rm -rf parallel-deqp-runner

									
										61

.gitlab-ci/build-deqp-gl.sh
									
										Normal file
									
												
				@@ -0,0 +1,61 @@

				git config --global user.email "mesa@example.com"

				git config --global user.name "Mesa CI"

				# XXX: Use --depth 1 once we can drop the cherry-picks.

				git clone \

				    https://github.com/KhronosGroup/VK-GL-CTS.git \

				    -b opengl-es-cts-3.2.5.1 \

				    /VK-GL-CTS

				pushd /VK-GL-CTS

				# Fix surfaceless build

				git cherry-pick -x 22f41e5e321c6dcd8569c4dad91bce89f06b3670

				git cherry-pick -x 1daa8dff73161ea60ead965bd6c9f2a0a2165648

				# surfaceless links against libkms and such despite not using it.

				sed -i '/gbm/d' targets/surfaceless/surfaceless.cmake

				sed -i '/libkms/d' targets/surfaceless/surfaceless.cmake

				sed -i '/libgbm/d' targets/surfaceless/surfaceless.cmake

				# --insecure is due to SSL cert failures hitting sourceforge for zlib and

				# libpng (sigh).  The archives get their checksums checked anyway, and git

				# always goes through ssh or https.

				python3 external/fetch_sources.py --insecure

				mkdir -p /deqp

				# Save the testlog stylesheets:

				cp doc/testlog-stylesheet/testlog.{css,xsl} /deqp

				popd

				pushd /deqp

				cmake -G Ninja \

				      -DDEQP_TARGET=surfaceless               \

				      -DCMAKE_BUILD_TYPE=Release              \

				      $EXTRA_CMAKE_ARGS                       \

				      /VK-GL-CTS

				ninja

				# Copy out the mustpass lists we want from a bunch of other junk.

				mkdir /deqp/mustpass

				for gles in gles2 gles3 gles31; do

				    cp \

				        /deqp/external/openglcts/modules/gl_cts/data/mustpass/gles/aosp_mustpass/3.2.5.x/$gles-master.txt \

				        /deqp/mustpass/$gles-master.txt

				done

				# Save *some* executor utils, but otherwise strip things down

				# to reduce deqp build size:

				mkdir /deqp/executor.save

				cp /deqp/executor/testlog-to-* /deqp/executor.save

				rm -rf /deqp/executor

				mv /deqp/executor.save /deqp/executor

				rm -rf /deqp/external

				rm -rf /deqp/modules/internal

				rm -rf /deqp/execserver

				rm -rf /deqp/modules/egl

				rm -rf /deqp/framework

				find -iname '*cmake*' -o -name '*ninja*' -o -name '*.o' -o -name '*.a' | xargs rm -rf

				${STRIP_CMD:-strip} modules/*/deqp-*

				du -sh *

				rm -rf /VK-GL-CTS

				popd

									
										33

.gitlab-ci/build-deqp-vk.sh
									
										Normal file
									
												
				@@ -0,0 +1,33 @@

				git clone --depth 1 \

				    https://github.com/KhronosGroup/VK-GL-CTS.git \

				    -b vulkan-cts-1.1.6.0 \

				    /VK-GL-CTS

				cd /VK-GL-CTS

				# --insecure is due to SSL cert failures hitting sourceforge for zlib and

				# libpng (sigh).  The archives get their checksums checked anyway, and git

				# always goes through ssh or https.

				python3 external/fetch_sources.py --insecure

				mkdir -p /deqp

				cd /deqp

				cmake -G Ninja \

				      -DDEQP_TARGET=x11_glx \

				      -DCMAKE_BUILD_TYPE=Release \

				      /VK-GL-CTS

				ninja -j4

				# Copy out the mustpass list we want.

				mkdir /deqp/mustpass

				cp /VK-GL-CTS/external/vulkancts/mustpass/master/vk-default.txt \

				   /deqp/mustpass/vk-master.txt

				rm -rf /deqp/modules/internal

				rm -rf /deqp/executor

				rm -rf /deqp/execserver

				rm -rf /deqp/modules/egl

				rm -rf /deqp/framework

				find -iname '*cmake*' -o -name '*ninja*' -o -name '*.o' -o -name '*.a' | xargs rm -rf

				strip external/vulkancts/modules/vulkan/deqp-vk

				du -sh *

				rm -rf /VK-GL-CTS

									
										13

.gitlab-ci/build-piglit.sh
									
										Normal file
									
												
				@@ -0,0 +1,13 @@

				#!/bin/bash

				set -ex

				git clone https://gitlab.freedesktop.org/mesa/piglit.git --single-branch --no-checkout /piglit

				pushd /piglit

				git checkout 8771c3860505db2bcf4877216221d774bf90af6b

				patch -p1 <$OLDPWD/.gitlab-ci/piglit/disable-vs_in.diff

				cmake -G Ninja -DCMAKE_BUILD_TYPE=Release

				ninja -j4

				find -name .git -o -name '*ninja*' -o -iname '*cmake*' -o -name '*.[chao]' | xargs rm -rf

				rm -rf target_api

				popd

									
										74

.gitlab-ci/container/arm_build.sh
									
										Normal file
									
												
				@@ -0,0 +1,74 @@

				#!/bin/bash

				set -e

				set -o xtrace

				############### Install packages for building

				apt-get -y install ca-certificates

				sed -i -e 's/http:\/\/deb/https:\/\/deb/g' /etc/apt/sources.list

				echo 'deb https://deb.debian.org/debian buster-backports main' >/etc/apt/sources.list.d/backports.list

				dpkg --add-architecture armhf

				apt-get update

				apt-get -y install \

					bc \

					bison \

					ccache \

					cmake \

					cpio \

					crossbuild-essential-armhf \

					debootstrap \

					flex \

					g++ \

					gettext \

					git \

					lavacli \

					libdrm-dev:armhf \

					libegl1-mesa-dev \

					libegl1-mesa-dev:armhf \

					libelf-dev \

					libelf-dev:armhf \

					libexpat1-dev \

					libexpat1-dev:armhf \

					libgles2-mesa-dev \

					libgles2-mesa-dev:armhf \

					libpng-dev \

					libpng-dev:armhf \

					libssl-dev \

					libvulkan-dev \

					libvulkan-dev:armhf \

					llvm-7-dev:armhf \

					llvm-8-dev \

					meson \

					pkg-config \

					python \

					python3-mako \

					unzip \

					wget \

					zlib1g-dev

				# dependencies where we want a specific version

				export LIBDRM_VERSION=libdrm-2.4.100

				wget https://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2

				tar -xvf $LIBDRM_VERSION.tar.bz2 && rm $LIBDRM_VERSION.tar.bz2

				cd $LIBDRM_VERSION; meson build -D vc4=true -D freedreno=true -D etnaviv=true; ninja -j4 -C build install; cd ..

				rm -rf $LIBDRM_VERSION

				############### Generate cross build file for Meson

				cross_file="/cross_file-armhf.txt"

				/usr/share/meson/debcrossgen --arch armhf -o "$cross_file"

				# Explicitly set ccache path for cross compilers

				sed -i "s|/usr/bin/\([^-]*\)-linux-gnu\([^-]*\)-g|/usr/lib/ccache/\\1-linux-gnu\\2-g|g" "$cross_file"

				# Don't need wrapper for armhf executables

				sed -i -e '/\[properties\]/a\' -e "needs_exe_wrapper = False" "$cross_file"

				############### Generate kernel, ramdisk, test suites, etc for LAVA jobs

				DEBIAN_ARCH=arm64 . .gitlab-ci/container/lava_arm.sh

				DEBIAN_ARCH=armhf . .gitlab-ci/container/lava_arm.sh

				apt-get purge -y \

				        wget

				apt-get autoremove -y --purge

									
										64

.gitlab-ci/container/arm_test.sh
									
										Normal file
									
												
				@@ -0,0 +1,64 @@

				#!/bin/bash

				set -e

				set -o xtrace

				############### Install packages for building

				apt-get -y install ca-certificates

				sed -i -e 's/http:\/\/deb/https:\/\/deb/g' /etc/apt/sources.list

				echo 'deb https://deb.debian.org/debian buster-backports main' >/etc/apt/sources.list.d/backports.list

				apt-get update

				apt-get -y install \

					bzip2 \

					cmake \

					g++ \

					gcc \

					git \

					libc6-dev \

					libdrm-nouveau2 \

					libexpat1 \

					libgbm-dev \

					libgles2-mesa-dev \

					libllvm8 \

					libpng16-16 \

					libpng-dev \

					libvulkan-dev \

					libvulkan1 \

					meson \

					netcat \

					pkg-config \

					procps \

					python \

					waffle-utils \

					wget \

					zlib1g

				############### Build dEQP runner

				. .gitlab-ci/build-cts-runner.sh

				############### Build dEQP GL

				. .gitlab-ci/build-deqp-gl.sh

				############### Uninstall the build software

				apt-get purge -y \

				        bzip2 \

				        cmake \

				        g++ \

				        gcc \

				        git \

				        libc6-dev \

				        libgbm-dev \

				        libgles2-mesa-dev \

				        libpng-dev \

				        libvulkan-dev \

				        meson \

				        pkg-config \

				        python \

				        wget

				apt-get autoremove -y --purge

									
										63

.gitlab-ci/container/lava_arm.sh
									
										Normal file
									
												
				@@ -0,0 +1,63 @@

				#!/bin/bash

				set -e

				set -o xtrace

				if [[ "$DEBIAN_ARCH" = "arm64" ]]; then

				    GCC_ARCH="aarch64-linux-gnu"

				    KERNEL_ARCH="arm64"

				    DEFCONFIG="arch/arm64/configs/defconfig"

				    DEVICE_TREES="arch/arm64/boot/dts/rockchip/rk3399-gru-kevin.dtb arch/arm64/boot/dts/amlogic/meson-gxl-s905x-libretech-cc.dtb arch/arm64/boot/dts/allwinner/sun50i-h6-pine-h64.dtb arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dtb"

				    KERNEL_IMAGE_NAME="Image"

				else

				    GCC_ARCH="arm-linux-gnueabihf"

				    KERNEL_ARCH="arm"

				    DEFCONFIG="arch/arm/configs/multi_v7_defconfig"

				    DEVICE_TREES="arch/arm/boot/dts/rk3288-veyron-jaq.dtb arch/arm/boot/dts/sun8i-h3-libretech-all-h3-cc.dtb"

				    KERNEL_IMAGE_NAME="zImage"

				fi

				############### Build dEQP runner

				if [[ "$DEBIAN_ARCH" = "armhf" ]]; then

				    EXTRA_MESON_ARGS="--cross-file /cross_file-armhf.txt"

				fi

				. .gitlab-ci/build-cts-runner.sh

				mkdir -p /lava-files/rootfs-${DEBIAN_ARCH}/usr/bin

				mv /usr/local/bin/deqp-runner /lava-files/rootfs-${DEBIAN_ARCH}/usr/bin/.

				############### Build dEQP

				EXTRA_CMAKE_ARGS="-DCMAKE_C_COMPILER=${GCC_ARCH}-gcc -DCMAKE_CXX_COMPILER=${GCC_ARCH}-g++"

				STRIP_CMD="${GCC_ARCH}-strip"

				. .gitlab-ci/build-deqp-gl.sh

				mv /deqp /lava-files/rootfs-${DEBIAN_ARCH}/.

				############### Cross-build kernel

				KERNEL_URL="https://gitlab.freedesktop.org/tomeu/linux/-/archive/v5.5-rc5-panfrost-fixes/linux-v5.5-rc5-panfrost-fixes.tar.gz"

				if [[ "$DEBIAN_ARCH" = "armhf" ]]; then

				    export ARCH=${KERNEL_ARCH}

				    export CROSS_COMPILE="${GCC_ARCH}-"

				fi

				mkdir -p kernel

				wget -qO- ${KERNEL_URL} | tar -xz --strip-components=1 -C kernel

				pushd kernel

				./scripts/kconfig/merge_config.sh ${DEFCONFIG} ../.gitlab-ci/${KERNEL_ARCH}.config

				make -j12 ${KERNEL_IMAGE_NAME} dtbs

				cp arch/${KERNEL_ARCH}/boot/${KERNEL_IMAGE_NAME} /lava-files/.

				cp ${DEVICE_TREES} /lava-files/.

				popd

				rm -rf kernel

				############### Create rootfs

				set +e

				debootstrap --variant=minbase --arch=${DEBIAN_ARCH} testing /lava-files/rootfs-${DEBIAN_ARCH}/ http://deb.debian.org/debian

				cat /lava-files/rootfs-${DEBIAN_ARCH}/debootstrap/debootstrap.log

				set -e

				cp .gitlab-ci/create-rootfs.sh /lava-files/rootfs-${DEBIAN_ARCH}/.

				chroot /lava-files/rootfs-${DEBIAN_ARCH} sh /create-rootfs.sh

				rm /lava-files/rootfs-${DEBIAN_ARCH}/create-rootfs.sh

52

.gitlab-ci/container/llvm-snapshot.gpg.key Normal file


@@ -0,0 +1,52 @@
 -----BEGIN PGP PUBLIC KEY BLOCK-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 mQINBFE9lCwBEADi0WUAApM/mgHJRU8lVkkw0CHsZNpqaQDNaHefD6Rw3S4LxNmM
 EZaOTkhP200XZM8lVdbfUW9xSjA3oPldc1HG26NjbqqCmWpdo2fb+r7VmU2dq3NM
 R18ZlKixiLDE6OUfaXWKamZsXb6ITTYmgTO6orQWYrnW6ckYHSeaAkW0wkDAryl2
 B5v8aoFnQ1rFiVEMo4NGzw4UX+MelF7rxaaregmKVTPiqCOSPJ1McC1dHFN533FY
 Wh/RVLKWo6npu+owtwYFQW+zyQhKzSIMvNujFRzhIxzxR9Gn87MoLAyfgKEzrbbT
 DhqqNXTxS4UMUKCQaO93TzetX/EBrRpJj+vP640yio80h4Dr5pAd7+LnKwgpTDk1
 G88bBXJAcPZnTSKu9I2c6KY4iRNbvRz4i+ZdwwZtdW4nSdl2792L7Sl7Nc44uLL/
 ZqkKDXEBF6lsX5XpABwyK89S/SbHOytXv9o4puv+65Ac5/UShspQTMSKGZgvDauU
 cs8kE1U9dPOqVNCYq9Nfwinkf6RxV1k1+gwtclxQuY7UpKXP0hNAXjAiA5KS5Crq
 aaJg9q2F4bub0mNU6n7UI6vXguF2n4SEtzPRk6RP+4TiT3bZUsmr+1ktogyOJCc
 Ha8G5VdL+NBIYQthOcieYCBnTeIH7D3Sp6FYQTYtVbKFzmMK+36ERreL/wARAQAB
 tD1TeWx2ZXN0cmUgTGVkcnUgLSBEZWJpYW4gTExWTSBwYWNrYWdlcyA8c3lsdmVz
 dHJlQGRlYmlhbi5vcmc+iQI4BBMBAgAiBQJRPZQsAhsDBgsJCAcDAgYVCAIJCgsE
 FgIDAQIeAQIXgAAKCRAVz00Yr090Ibx+EADArS/hvkDF8juWMXxh17CgR0WZlHCC
 CTBWkg5a0bNN/3bb97cPQt/vIKWjQtkQpav6/5JTVCSx2riL4FHYhH0iuo4iAPR
 udC7Cvg8g7bSPrKO6tenQZNvQm+tUmBHgFiMBJi92AjZ/Qn1Shg7p9ITivFxpLyX
 wpmnF1OKyI2Kof2rm4BFwfSWuf8Fvh7kDMRLHv+MlnK/7j/BNpKdozXxLcwoFBmn
 l0WjpAH3OFF7Pvm1LJdf1DjWKH0Dc3sc6zxtmBR/KHHg6kK4BGQNnFKujcP7TVdv
 gMYv84kun14pnwjZcqOtN3UJtcx22880DOQzinoMs3Q4w4o05oIF+sSgHViFpc3W
 R0v+RllnH05vKZo+LDzc83DQVrdwliV12eHxrMQ8UYg88zCbF/cHHnlzZWAJgftg
 hB08v1BKPgYRUzwJ6VdVqXYcZWEaUJmQAPuAALyZESw94hSo28FAn0/gzEc5uOYx
 K+xG/lFwgAGYNb3uGM5m0P6LVTfdg6vDwwOeTNIExVk3KVFXeSQef2ZMkhwA7wya
 KJptkb62wBHFE+o9TUdtMCY6qONxMMdwioRE5BYNwAsS1PnRD2+jtlI0DzvKHt7B
 MWd8hnoUKhMeZ9TNmo+8CpsAtXZcBho0zPGz/R8NlJhAWpdAZ1CmcPo83EW86Yq7
 BxQUKnNHcwj2ebkCDQRRPZQsARAA4jxYmbTHwmMjqSizlMJYNuGOpIidEdx9zQ5g
 zOr431/VfWq4S+VhMDhs15j9lyml0y4ok215VRFwrAREDg6UPMr7ajLmBQGau0Fc
 bvZJ90l4NjXp5p0NEE/qOb9UEHT7EGkEhaZ1ekkWFTWCgsy7rRXfZLxB6sk7pzLC
 DshyW3zjIakWAnpQ5j5obiDy708pReAuGB94NSyb1HoW/xGsGgvvCw4r0w3xPStw
 F1PhmScE6NTBIfLliea3pl8vhKPlCh54Hk7I8QGjo1ETlRP4Qll1ZxHJ8u25f/ta
 RES2Aw8Hi7j0EVcZ6MT9JWTI83yUcnUlZPZS2HyeWcUj+8nUC8W4N8An+aNps9l/
 inIl2TbGo3Yn1JQLnA1YCoGwC34g8QZTJhElEQBN0X29ayWW6OdFx8MDvllbBV
 ymmKq2lK1U55mQTfDli7S3vfGz9Gp/oQwZ8bQpOeUkc5hbZszYwP4RX+68xDPfn+
 M9udl+qW9wu+LyePbW6HX90LmkhNkkY2ZzUPRPDHZANU5btaPXc2H7edX4y4maQa
 xenqD0lGh9LGz/mps4HEZtCI5CY8o0uCMF3lT0XfXhuLksr7Pxv57yue8LLTItOJ
 d9Hmzp9G97SRYYeqU+8lyNXtU2PdrLLq7QHkzrsloG78lCpQcalHGACJzrlUWVP/
 fN3Ht3kAEQEAAYkCHwQYAQIACQUCUT2ULAIbDAAKCRAVz00Yr090IbhWEADbr50X
 OEXMIMGRLe+YMjeMX9NG4jxs0jZaWHc/WrGR+CCSUb9r6aPXeLo+45949uEfdSsB
 pbaEdNWxF5Vr1CSjuO5siIlgDjmT655voXo67xVpEN4HhMrxugDJfCa6z97P0+ML
 PdDxim57uNqkam9XIq9hKQaurxMAECDPmlEXI4QT3eu5qw5/knMzDMZj4Vi6hovL
 wvvAeLHO/jsyfIdNmhBGU2RWCEZ9uo/MeerPHtRPfg74g+9PPfP6nyHD2Wes6yGd
 oVQwtPNAQD6Cj7EaA2xdZYLJ7/jW6yiPu98FFWP74FN2dlyEA2uVziLsfBrgpS4l
 tVOlrO2YzkkqUGrybzbLpj6eeHx+Cd7wcjI8CalsqtL6cG8cUEjtWQUHyTbQWAgG
 VPEgIAVhJ6RTZ26i/G+4J8neKyRs4vz+57UGwY6zI4AB1ZcWGEE3Bf+CDEDgmnP
 LSwbnHefK9IljT9XU98PelSryUO/5UPw7leE0akXKB4DtekToO226px1VnGp3Bov
 GBGvpHvL2WizEwdk+nfk8LtrLzej+9FtIcq3uIrYnsac47Pf7p0otcFeTJTjSq3
 krCaoG4Hx0zGQG2ZFpHrSrZTVy6lxvIdfi0beMgY6h78p6M9eYZHQHc02DjFkQXN
 bXb5c6gCHESH5PXwPU4jQEE7Ib9J6sbk7ZT2Mw==
 =j+4q
 -----END PGP PUBLIC KEY BLOCK-----

									
										220

.gitlab-ci/container/x86_build.sh
									
										Normal file
									
												
				@@ -0,0 +1,220 @@

				#!/bin/bash

				set -e

				set -o xtrace

				export DEBIAN_FRONTEND=noninteractive

				CROSS_ARCHITECTURES="i386"

				for arch in $CROSS_ARCHITECTURES; do

				    dpkg --add-architecture $arch

				done

				apt-get install -y \

				      ca-certificates \

				      gnupg \

				      unzip \

				      wget

				# Upstream LLVM package repository

				apt-key add .gitlab-ci/container/llvm-snapshot.gpg.key

				echo "deb https://apt.llvm.org/buster/ llvm-toolchain-buster-9 main" >/etc/apt/sources.list.d/llvm9.list

				sed -i -e 's/http:\/\/deb/https:\/\/deb/g' /etc/apt/sources.list

				echo 'deb https://deb.debian.org/debian buster-backports main' >/etc/apt/sources.list.d/backports.list

				apt-get update

				# Use newer packages from backports by default

				cat >/etc/apt/preferences <<EOF

				Package: *

				Pin: release a=buster-backports

				Pin-Priority: 500

				EOF

				apt-get dist-upgrade -y

				apt-get install -y --no-remove \

				      autoconf \

				      automake \

				      autotools-dev \

				      bison \

				      clang-9 \

				      cmake \

				      flex \

				      g++ \

				      gcc \

				      gettext \

				      git \

				      libclang-6.0-dev \

				      libclang-7-dev \

				      libclang-8-dev \

				      libclang-9-dev \

				      libclc-dev \

				      libelf-dev \

				      libepoxy-dev \

				      libexpat1-dev \

				      libgbm-dev \

				      libgtk-3-dev \

				      libomxil-bellagio-dev \

				      libpciaccess-dev \

				      libtool \

				      libunwind-dev \

				      libva-dev \

				      libvdpau-dev \

				      libvulkan-dev \

				      libx11-dev \

				      libx11-xcb-dev \

				      libxdamage-dev \

				      libxext-dev \

				      libxrandr-dev \

				      libxrender-dev \

				      libxshmfence-dev \

				      libxvmc-dev \

				      libxxf86vm-dev \

				      llvm-6.0-dev \

				      llvm-7-dev \

				      llvm-8-dev \

				      llvm-9-dev \

				      meson \

				      pkg-config \

				      python-mako \

				      python3-mako \

				      scons \

				      x11proto-dri2-dev \

				      x11proto-gl-dev \

				      x11proto-randr-dev \

				      xz-utils \

				      zlib1g-dev

				# Cross-build Mesa deps

				for arch in $CROSS_ARCHITECTURES; do

				    apt-get install -y --no-remove \

				            crossbuild-essential-${arch} \

				            libdrm-dev:${arch} \

				            libelf-dev:${arch} \

				            libexpat1-dev:${arch}

				done

				# for 64bit windows cross-builds

				apt-get install -y --no-remove \

				    libz-mingw-w64-dev \

				    mingw-w64 \

				    wine \

				    wine32 \

				    wine64

				# Debian's pkg-config wrappers for mingw are broken, and there's no sign that

				# they're going to be fixed, so we'll just have to fix it ourselves

				# https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=930492

				cat >/usr/local/bin/x86_64-w64-mingw32-pkg-config <<EOF

				#!/bin/sh

				PKG_CONFIG_LIBDIR=/usr/x86_64-w64-mingw32/lib/pkgconfig pkg-config \$@

				EOF

				chmod +x /usr/local/bin/x86_64-w64-mingw32-pkg-config

				# for the vulkan overlay layer

				wget https://github.com/KhronosGroup/glslang/releases/download/master-tot/glslang-master-linux-Release.zip

				unzip glslang-master-linux-Release.zip bin/glslangValidator

				install -m755 bin/glslangValidator /usr/local/bin/

				rm bin/glslangValidator glslang-master-linux-Release.zip

				# dependencies where we want a specific version

				export              XORG_RELEASES=https://xorg.freedesktop.org/releases/individual

				export               XCB_RELEASES=https://xcb.freedesktop.org/dist

				export           WAYLAND_RELEASES=https://wayland.freedesktop.org/releases

				export         XORGMACROS_VERSION=util-macros-1.19.0

				export             LIBDRM_VERSION=libdrm-2.4.100

				export           XCBPROTO_VERSION=xcb-proto-1.13

				export             LIBXCB_VERSION=libxcb-1.13

				export         LIBWAYLAND_VERSION=wayland-1.15.0

				export  WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.12

				wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2

				tar -xvf $XORGMACROS_VERSION.tar.bz2 && rm $XORGMACROS_VERSION.tar.bz2

				cd $XORGMACROS_VERSION; ./configure; make install; cd ..

				rm -rf $XORGMACROS_VERSION

				wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2

				tar -xvf $XCBPROTO_VERSION.tar.bz2 && rm $XCBPROTO_VERSION.tar.bz2

				cd $XCBPROTO_VERSION; ./configure; make install; cd ..

				rm -rf $XCBPROTO_VERSION

				wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2

				tar -xvf $LIBXCB_VERSION.tar.bz2 && rm $LIBXCB_VERSION.tar.bz2

				cd $LIBXCB_VERSION; ./configure; make install; cd ..

				rm -rf $LIBXCB_VERSION

				wget https://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2

				tar -xvf $LIBDRM_VERSION.tar.bz2 && rm $LIBDRM_VERSION.tar.bz2

				cd $LIBDRM_VERSION; meson build -D vc4=true -D freedreno=true -D etnaviv=true; ninja -j4 -C build install; cd ..

				rm -rf $LIBDRM_VERSION

				wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz

				tar -xvf $LIBWAYLAND_VERSION.tar.xz && rm $LIBWAYLAND_VERSION.tar.xz

				cd $LIBWAYLAND_VERSION; ./configure --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation; make install; cd ..

				rm -rf $LIBWAYLAND_VERSION

				wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz

				tar -xvf $WAYLAND_PROTOCOLS_VERSION.tar.xz && rm $WAYLAND_PROTOCOLS_VERSION.tar.xz

				cd $WAYLAND_PROTOCOLS_VERSION; ./configure; make install; cd ..

				rm -rf $WAYLAND_PROTOCOLS_VERSION

				# The version of libglvnd-dev in debian is too old

				# Check this page to see when this local compilation can be dropped in favour of the package:

				# https://packages.debian.org/libglvnd-dev

				GLVND_VERSION=1.2.0

				wget https://gitlab.freedesktop.org/glvnd/libglvnd/-/archive/v$GLVND_VERSION/libglvnd-v$GLVND_VERSION.tar.gz

				tar -xvf libglvnd-v$GLVND_VERSION.tar.gz && rm libglvnd-v$GLVND_VERSION.tar.gz

				pushd libglvnd-v$GLVND_VERSION; ./autogen.sh; ./configure; make install; popd

				rm -rf libglvnd-v$GLVND_VERSION

				pushd /usr/local

				git clone https://gitlab.freedesktop.org/mesa/shader-db.git --depth 1

				rm -rf shader-db/.git

				cd shader-db

				make

				popd

				# Use ccache to speed up builds

				apt-get install -y --no-remove ccache

				# We need xmllint to validate the XML files in Mesa

				apt-get install -y --no-remove libxml2-utils

				# Generate cross build files for Meson

				for arch in $CROSS_ARCHITECTURES; do

				  cross_file="/cross_file-$arch.txt"

				  /usr/share/meson/debcrossgen --arch "$arch" -o "$cross_file"

				  # Explicitly set ccache path for cross compilers

				  sed -i "s|/usr/bin/\([^-]*\)-linux-gnu\([^-]*\)-g|/usr/lib/ccache/\\1-linux-gnu\\2-g|g" "$cross_file"

				  if [ "$arch" = "i386" ]; then

				    # Work around a bug in debcrossgen that should be fixed in the next release

				    sed -i "s|cpu_family = 'i686'|cpu_family = 'x86'|g" "$cross_file"

				    # Don't need wrapper for i386 executables

				    sed -i -e '/\[properties\]/a\' -e "needs_exe_wrapper = False" "$cross_file"

				  fi

				done

				############### Uninstall the build software

				apt-get purge -y \

				      autoconf \

				      automake \

				      autotools-dev \

				      cmake \

				      git \

				      gnupg \

				      libgbm-dev \

				      libtool \

				      unzip \

				      wget

				apt-get autoremove -y --purge

									
										59

.gitlab-ci/container/x86_build_old.sh
									
										Normal file
									
												
				@@ -0,0 +1,59 @@

				#!/bin/bash

				set -e

				set -o xtrace

				export DEBIAN_FRONTEND=noninteractive

				apt-get install -y \

				      apt-transport-https \

				      ca-certificates

				sed -i -e 's/http:\/\/deb/https:\/\/deb/g' /etc/apt/sources.list

				echo 'deb https://deb.debian.org/debian stretch-backports main' >/etc/apt/sources.list.d/backports.list

				apt-get update

				# Use newer packages from backports by default

				cat >/etc/apt/preferences <<EOF

				Package: *

				Pin: release a=stretch-backports

				Pin-Priority: 500

				EOF

				apt-get dist-upgrade -y

				apt-get install -y --no-remove \

				      llvm-3.9-dev \

				      libclang-3.9-dev \

				      llvm-4.0-dev \

				      libclang-4.0-dev \

				      llvm-5.0-dev \

				      libclang-5.0-dev \

				      g++ \

				      bzip2 \

				      ccache \

				      zlib1g-dev \

				      pkg-config \

				      gcc \

				      git \

				      libepoxy-dev \

				      libclc-dev \

				      xz-utils \

				      libdrm-dev \

				      libexpat1-dev \

				      libelf-dev \

				      libunwind-dev \

				      libpng-dev \

				      python-mako \

				      python3-mako \

				      bison \

				      flex \

				      gettext \

				      scons \

				      meson

				############### Uninstall unused packages

				apt-get autoremove -y --purge

									
										96

.gitlab-ci/container/x86_test-gl.sh
									
										Normal file
									
												
				@@ -0,0 +1,96 @@

				#!/bin/bash

				set -e

				set -o xtrace

				export DEBIAN_FRONTEND=noninteractive

				apt-get install -y \

				        ca-certificates \

				        gnupg

				# Upstream LLVM package repository

				apt-key add .gitlab-ci/container/llvm-snapshot.gpg.key

				echo "deb https://apt.llvm.org/buster/ llvm-toolchain-buster-9 main" >/etc/apt/sources.list.d/llvm9.list

				sed -i -e 's/http:\/\/deb/https:\/\/deb/g' /etc/apt/sources.list

				echo 'deb https://deb.debian.org/debian buster-backports main' >/etc/apt/sources.list.d/backports.list

				apt-get update

				# Use newer packages from backports by default

				cat >/etc/apt/preferences <<EOF

				Package: *

				Pin: release a=buster-backports

				Pin-Priority: 500

				EOF

				apt-get dist-upgrade -y

				apt-get install -y --no-remove \

				      cmake \

				      g++ \

				      git \

				      gcc \

				      libexpat1 \

				      libgbm-dev \

				      libgles2-mesa-dev \

				      libpng16-16 \

				      libpng-dev \

				      libvulkan1 \

				      libvulkan-dev \

				      libwaffle-dev \

				      libwayland-server0 \

				      libxcb-xfixes0 \

				      libxkbcommon0 \

				      libxkbcommon-dev \

				      libxrender1 \

				      libxrender-dev \

				      libllvm9 \

				      meson \

				      patch \

				      pkg-config \

				      python3-mako \

				      python3-numpy \

				      python3-six \

				      python \

				      waffle-utils \

				      xauth \

				      xvfb \

				      zlib1g

				############### Build piglit

				. .gitlab-ci/build-piglit.sh

				############### Build dEQP runner

				. .gitlab-ci/build-cts-runner.sh

				############### Build dEQP GL

				. .gitlab-ci/build-deqp-gl.sh

				############### Uninstall the build software

				apt-get purge -y \

				      cmake \

				      g++ \

				      gcc \

				      git \

				      gnupg \

				      libc6-dev \

				      libgbm-dev \

				      libgles2-mesa-dev \

				      libpng-dev \

				      libwaffle-dev \

				      libxkbcommon-dev \

				      libxrender-dev \

				      meson \

				      patch \

				      pkg-config \

				      python

				apt-get autoremove -y --purge

									
										87

.gitlab-ci/container/x86_test-vk.sh
									
										Normal file
									
												
				@@ -0,0 +1,87 @@

				#!/bin/bash

				set -e

				set -o xtrace

				export DEBIAN_FRONTEND=noninteractive

				apt-get install -y \

				        ca-certificates \

				        gnupg

				# Upstream LLVM package repository

				apt-key add .gitlab-ci/container/llvm-snapshot.gpg.key

				echo "deb https://apt.llvm.org/buster/ llvm-toolchain-buster-9 main" >/etc/apt/sources.list.d/llvm9.list

				sed -i -e 's/http:\/\/deb/https:\/\/deb/g' /etc/apt/sources.list

				echo 'deb https://deb.debian.org/debian buster-backports main' >/etc/apt/sources.list.d/backports.list

				apt-get update

				# Use newer packages from backports by default

				cat >/etc/apt/preferences <<EOF

				Package: *

				Pin: release a=buster-backports

				Pin-Priority: 500

				EOF

				apt-get dist-upgrade -y

				apt-get install -y --no-remove \

				      cmake \

				      g++ \

				      git \

				      gcc \

				      libexpat1 \

				      libgbm-dev \

				      libgles2-mesa-dev \

				      libpng16-16 \

				      libpng-dev \

				      libvulkan1 \

				      libvulkan-dev \

				      libwayland-server0 \

				      libxcb-randr0 \

				      libxcb-xfixes0 \

				      libxkbcommon0 \

				      libxkbcommon-dev \

				      libxrender1 \

				      libxrender-dev \

				      libllvm9 \

				      meson \

				      patch \

				      pkg-config \

				      python3-distutils \

				      python \

				      xauth \

				      xvfb

				############### Build dEQP runner

				. .gitlab-ci/build-cts-runner.sh

				############### Build dEQP VK

				. .gitlab-ci/build-deqp-vk.sh

				############### Uninstall the build software

				apt-get purge -y \

				      cmake \

				      g++ \

				      gcc \

				      git \

				      gnupg \

				      libgbm-dev \

				      libgles2-mesa-dev \

				      libpng-dev \

				      libvulkan-dev \

				      libxkbcommon-dev \

				      libxrender-dev \

				      meson \

				      patch \

				      pkg-config \

				      python

				apt-get autoremove -y --purge

									
32

src/gallium/drivers/panfrost/ci/create-rootfs.sh → .gitlab-ci/create-rootfs.sh

				@@ -1,8 +1,23 @@

				#!/bin/sh

				#!/bin/bash

				set -ex

				apt-get -y install --no-install-recommends initramfs-tools libpng16-16 weston strace libsensors5

				LLVM=libllvm8

				# LLVMPipe on armhf is broken with LLVM 8

				if [ `dpkg --print-architecture` = "armhf" ]; then

				        LLVM=libllvm7

				fi

				apt-get -y install --no-install-recommends \

				    initramfs-tools \

				    libpng16-16 \

				    strace \

				    libsensors5 \

				    libexpat1 \

				    libdrm2 \

				    libdrm-nouveau2 \

				    $LLVM

				passwd root -d

				chsh -s /bin/sh

				ln -s /bin/sh /init

				@@ -15,9 +30,9 @@ ln -s /bin/sh /init

				rm -rf /etc/localtime

				cp /usr/share/zoneinfo/Etc/UTC /etc/localtime

				UNNEEDED_PACKAGES=" libfdisk1"\

				" tzdata"\

				UNNEEDED_PACKAGES="libfdisk1

				                   tzdata

				                   diffutils"

				export DEBIAN_FRONTEND=noninteractive

				@@ -82,15 +97,10 @@ UNNEEDED_PACKAGES="apt libapt-pkg5.0 "\

				"libsemanage1 libsemanage-common "\

				"libsepol1 "\

				"gzip "\

				"gnupg "\

				"gpgv "\

				"hostname "\

				"adduser "\

				"debian-archive-keyring "\

				"libgl1 libgl1-mesa-dri libglapi-mesa libglvnd0 libglx-mesa0 libegl-mesa0 libgles2 "\

				"libllvm7 "\

				"libx11-data libthai-data "\

				"systemd dbus "\

				# Removing unneeded packages

				for PACKAGE in ${UNNEEDED_PACKAGES}

				@@ -182,4 +192,4 @@ rm usr/lib/*/libdb-5.3.so

				rm usr/lib/*/libnss_hesiod*

				rm usr/lib/*/libnss_nis*

				rm usr/bin/tar

				rm bin/tar

1

.gitlab-ci/cross-xfail-i386 Normal file

				@@ -0,0 +1 @@

				u_format_test

									
181

.gitlab-ci/debian-install.sh

				@@ -1,181 +0,0 @@

				#!/bin/bash

				set -e

				set -o xtrace

				export DEBIAN_FRONTEND=noninteractive

				apt-get install -y \

				      apt-transport-https \

				      ca-certificates \

				      curl \

				      wget \

				      gnupg \

				      software-properties-common

				curl -fsSL https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -

				add-apt-repository "deb https://apt.llvm.org/stretch/ llvm-toolchain-stretch-7 main"

				add-apt-repository "deb https://apt.llvm.org/stretch/ llvm-toolchain-stretch-8 main"

				sed -i -e 's/http:\/\/deb/https:\/\/deb/g' /etc/apt/sources.list

				echo 'deb https://deb.debian.org/debian stretch-backports main' >/etc/apt/sources.list.d/backports.list

				echo 'deb https://deb.debian.org/debian jessie main' >/etc/apt/sources.list.d/jessie.list

				apt-get update

				apt-get install -y -t stretch-backports \

				      llvm-3.4-dev \

				      llvm-3.9-dev \

				      libclang-3.9-dev \

				      llvm-5.0-dev \

				      llvm-6.0-dev \

				      llvm-7-dev \

				      g++ \

				      clang-8 \

				      libclang-7-dev

				# Install remaining packages from Debian buster to get newer versions

				add-apt-repository "deb https://deb.debian.org/debian/ buster main"

				add-apt-repository "deb https://deb.debian.org/debian/ buster-updates main"

				apt-get update

				apt-get install -y \

				      bzip2 \

				      zlib1g-dev \

				      pkg-config \

				      libxrender-dev \

				      libxdamage-dev \

				      libxxf86vm-dev \

				      gcc \

				      libclc-dev \

				      libxvmc-dev \

				      libomxil-bellagio-dev \

				      xz-utils \

				      libexpat1-dev \

				      libx11-xcb-dev \

				      libelf-dev \

				      libunwind-dev \

				      libglvnd-dev \

				      python-mako \

				      python3-mako \

				      meson \

				      scons

				# autotools build deps

				apt-get install -y \

				      automake \

				      libtool \

				      bison \

				      flex \

				      gettext \

				      make

				# for 64bit windows cross-builds

				apt-get install -y \

				      wine64 \

				      mingw-w64

				# dependencies where we want a specific version

				export              XORG_RELEASES=https://xorg.freedesktop.org/releases/individual

				export               XCB_RELEASES=https://xcb.freedesktop.org/dist

				export           WAYLAND_RELEASES=https://wayland.freedesktop.org/releases

				export         XORGMACROS_VERSION=util-macros-1.19.0

				export            GLPROTO_VERSION=glproto-1.4.17

				export          DRI2PROTO_VERSION=dri2proto-2.8

				export       LIBPCIACCESS_VERSION=libpciaccess-0.13.4

				export             LIBDRM_VERSION=libdrm-2.4.97

				export           XCBPROTO_VERSION=xcb-proto-1.13

				export         RANDRPROTO_VERSION=randrproto-1.3.0

				export          LIBXRANDR_VERSION=libXrandr-1.3.0

				export             LIBXCB_VERSION=libxcb-1.13

				export       LIBXSHMFENCE_VERSION=libxshmfence-1.3

				export           LIBVDPAU_VERSION=libvdpau-1.1

				export              LIBVA_VERSION=libva-1.7.0

				export         LIBWAYLAND_VERSION=wayland-1.15.0

				export  WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.8

				wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2

				tar -xvf $XORGMACROS_VERSION.tar.bz2 && rm $XORGMACROS_VERSION.tar.bz2

				cd $XORGMACROS_VERSION; ./configure; make install; cd ..

				rm -rf $XORGMACROS_VERSION

				wget $XORG_RELEASES/proto/$GLPROTO_VERSION.tar.bz2

				tar -xvf $GLPROTO_VERSION.tar.bz2 && rm $GLPROTO_VERSION.tar.bz2

				cd $GLPROTO_VERSION; ./configure; make install; cd ..

				rm -rf $GLPROTO_VERSION

				wget $XORG_RELEASES/proto/$DRI2PROTO_VERSION.tar.bz2

				tar -xvf $DRI2PROTO_VERSION.tar.bz2 && rm $DRI2PROTO_VERSION.tar.bz2

				cd $DRI2PROTO_VERSION; ./configure; make install; cd ..

				rm -rf $DRI2PROTO_VERSION

				wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2

				tar -xvf $XCBPROTO_VERSION.tar.bz2 && rm $XCBPROTO_VERSION.tar.bz2

				cd $XCBPROTO_VERSION; ./configure; make install; cd ..

				rm -rf $XCBPROTO_VERSION

				wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2

				tar -xvf $LIBXCB_VERSION.tar.bz2 && rm $LIBXCB_VERSION.tar.bz2

				cd $LIBXCB_VERSION; ./configure; make install; cd ..

				rm -rf $LIBXCB_VERSION

				wget $XORG_RELEASES/lib/$LIBPCIACCESS_VERSION.tar.bz2

				tar -xvf $LIBPCIACCESS_VERSION.tar.bz2 && rm $LIBPCIACCESS_VERSION.tar.bz2

				cd $LIBPCIACCESS_VERSION; ./configure; make install; cd ..

				rm -rf $LIBPCIACCESS_VERSION

				wget https://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2

				tar -xvf $LIBDRM_VERSION.tar.bz2 && rm $LIBDRM_VERSION.tar.bz2

				cd $LIBDRM_VERSION; ./configure --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api; make install; cd ..

				rm -rf $LIBDRM_VERSION

				wget $XORG_RELEASES/proto/$RANDRPROTO_VERSION.tar.bz2

				tar -xvf $RANDRPROTO_VERSION.tar.bz2 && rm $RANDRPROTO_VERSION.tar.bz2

				cd $RANDRPROTO_VERSION; ./configure; make install; cd ..

				rm -rf $RANDRPROTO_VERSION

				wget $XORG_RELEASES/lib/$LIBXRANDR_VERSION.tar.bz2

				tar -xvf $LIBXRANDR_VERSION.tar.bz2 && rm $LIBXRANDR_VERSION.tar.bz2

				cd $LIBXRANDR_VERSION; ./configure; make install; cd ..

				rm -rf $LIBXRANDR_VERSION

				wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2

				tar -xvf $LIBXSHMFENCE_VERSION.tar.bz2 && rm $LIBXSHMFENCE_VERSION.tar.bz2

				cd $LIBXSHMFENCE_VERSION; ./configure; make install; cd ..

				rm -rf $LIBXSHMFENCE_VERSION

				wget https://people.freedesktop.org/~aplattner/vdpau/$LIBVDPAU_VERSION.tar.bz2

				tar -xvf $LIBVDPAU_VERSION.tar.bz2 && rm $LIBVDPAU_VERSION.tar.bz2

				cd $LIBVDPAU_VERSION; ./configure; make install; cd ..

				rm -rf $LIBVDPAU_VERSION

				wget https://www.freedesktop.org/software/vaapi/releases/libva/$LIBVA_VERSION.tar.bz2

				tar -xvf $LIBVA_VERSION.tar.bz2 && rm $LIBVA_VERSION.tar.bz2

				cd $LIBVA_VERSION; ./configure --disable-wayland --disable-dummy-driver; make install; cd ..

				rm -rf $LIBVA_VERSION

				wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz

				tar -xvf $LIBWAYLAND_VERSION.tar.xz && rm $LIBWAYLAND_VERSION.tar.xz

				cd $LIBWAYLAND_VERSION; ./configure --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation; make install; cd ..

				rm -rf $LIBWAYLAND_VERSION

				wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz

				tar -xvf $WAYLAND_PROTOCOLS_VERSION.tar.xz && rm $WAYLAND_PROTOCOLS_VERSION.tar.xz

				cd $WAYLAND_PROTOCOLS_VERSION; ./configure; make install; cd ..

				rm -rf $WAYLAND_PROTOCOLS_VERSION

				# Use ccache to speed up builds

				apt-get install -y ccache

				# We need xmllint to validate the XML files in Mesa

				apt-get install -y libxml2-utils

				# Remove unused packages

				apt-get purge -y \

				      automake \

				      libtool \

				      make \

				      curl \

				      wget \

				      gnupg \

				      software-properties-common

				apt-get autoremove -y --purge

10

.gitlab-ci/deqp-default-skips.txt Normal file

@@ -0,0 +1,10 @@
 # Note: skips lists for CI are just a list of lines that, when
 # non-zero-length and not starting with '#', will regex match to
 # delete lines from the test list.  Be careful.
 # Skip the perf/stress tests to keep runtime manageable
 dEQP-GLES[0-9]*.performance.*
 dEQP-GLES[0-9]*.stress.*
 # These are really slow on tiling architectures (including llvmpipe).
 dEQP-GLES[0-9]*.functional.flush_finish.*
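
The header comment of this skips file describes the filtering rule: every non-empty line that does not start with '#' is treated as a regex, and matching lines are deleted from the test list. A minimal sketch of that rule follows, assuming the test list is a plain text file with one test name per line. The function name `apply_skips` and its arguments are hypothetical and are not taken from this commit; the CI scripts may implement the filtering differently.

```shell
#!/bin/sh
# Sketch only: apply_skips is a hypothetical helper, not a script from this commit.
# Usage: apply_skips SKIPS_FILE CASELIST_FILE   (prints the surviving test names)
apply_skips() {
    skips_file=$1
    caselist=$2
    patterns=$(mktemp)

    # Keep only the active lines: strip leading whitespace, then drop
    # empty lines and lines starting with '#'.
    sed -e 's/^[[:space:]]*//' -e '/^$/d' -e '/^#/d' "$skips_file" > "$patterns"

    if [ -s "$patterns" ]; then
        # Each remaining line is an unanchored regex; drop every matching test.
        # grep exits 1 when nothing survives, which is not an error here.
        grep -v -f "$patterns" "$caselist" || [ "$?" -eq 1 ]
    else
        # No active patterns: nothing is skipped.
        cat "$caselist"
    fi
    rm -f "$patterns"
}
```

Because the patterns are unanchored and '.' matches any character, an entry ending in `.9` would also remove a test whose name ends in `.94`. That is presumably what the "Be careful" in the header warns about; appending `$` to an entry restricts it to an exact suffix.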

33

.gitlab-ci/deqp-freedreno-a307-fails.txt Normal file

@@ -0,0 +1,33 @@
 dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_center
 dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_corner
 dEQP-GLES2.functional.clipping.point.wide_point_clip
 dEQP-GLES2.functional.clipping.point.wide_point_clip_viewport_center
 dEQP-GLES2.functional.clipping.point.wide_point_clip_viewport_corner
 dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z
 dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.rebind_rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.rebind_rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.rebind_rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.rebind_tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.rebind_tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.rebind_rbo_rgb565_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.rebind_rbo_rgb5_a1_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.rebind_rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.rebind_tex2d_rgba_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.rebind_tex2d_rgb_stencil_index8
 dEQP-GLES2.functional.polygon_offset.fixed16_displacement_with_units
 dEQP-GLES2.functional.texture.filtering.2d.linear_nearest_clamp_l8_npot
 dEQP-GLES2.functional.texture.filtering.2d.linear_nearest_clamp_rgb888_npot
 dEQP-GLES2.functional.texture.filtering.2d.linear_nearest_clamp_rgba4444_npot
 dEQP-GLES2.functional.texture.filtering.2d.linear_nearest_clamp_rgba8888_npot
 dEQP-GLES2.functional.texture.filtering.2d.nearest_linear_clamp_l8_npot
 dEQP-GLES2.functional.texture.filtering.2d.nearest_linear_clamp_rgb888_npot
 dEQP-GLES2.functional.texture.filtering.2d.nearest_linear_clamp_rgba4444_npot
 dEQP-GLES2.functional.texture.filtering.2d.nearest_linear_clamp_rgba8888_npot
 dEQP-GLES2.functional.texture.filtering.cube.linear_nearest_clamp_l8_npot
 dEQP-GLES2.functional.texture.filtering.cube.linear_nearest_clamp_rgb888_npot
 dEQP-GLES2.functional.texture.filtering.cube.linear_nearest_clamp_rgba4444_npot
 dEQP-GLES2.functional.texture.filtering.cube.linear_nearest_clamp_rgba8888_npot
 dEQP-GLES2.functional.texture.filtering.cube.nearest_linear_clamp_l8_npot
 dEQP-GLES2.functional.texture.filtering.cube.nearest_linear_clamp_rgb888_npot
 dEQP-GLES2.functional.texture.filtering.cube.nearest_linear_clamp_rgba4444_npot
 dEQP-GLES2.functional.texture.filtering.cube.nearest_linear_clamp_rgba8888_npot

3

.gitlab-ci/deqp-freedreno-a630-fails.txt Normal file

@@ -0,0 +1,3 @@
 dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z
 dEQP-GLES31.functional.stencil_texturing.render.depth24_stencil8_clear
 dEQP-GLES31.functional.stencil_texturing.render.depth24_stencil8_draw

21

.gitlab-ci/deqp-freedreno-a630-skips.txt Normal file

@@ -0,0 +1,21 @@
 # Note: skips lists for CI are just a list of lines that, when
 # non-zero-length and not starting with '#', will regex match to
 # delete lines from the test list.  Be careful.
 # Skip the perf/stress tests to keep runtime manageable
 dEQP-GLES[0-9]*.performance.*
 dEQP-GLES[0-9]*.stress.*
 # These are really slow on tiling architectures (including llvmpipe).
 dEQP-GLES[0-9]*.functional.flush_finish.*
 # Unstable test results
 #dEQP-GLES3.functional.fragment_out.random.*
 dEQP-GLES3.functional.transform_feedback.*points.*
 dEQP-GLES3.functional.transform_feedback.*lines.*
 dEQP-GLES31.functional.primitive_bounding_box.*
 #dEQP-GLES31.functional.layout_binding.ssbo.fragment_binding_array.*
 # Intermittent timeout
 dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23

205

.gitlab-ci/deqp-lima-fails.txt Normal file

@@ -0,0 +1,205 @@
 dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_center
 dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_corner
 dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_neg_x_neg_y_pos_z_and_pos_x_pos_y_neg_z
 dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_neg_x_pos_y_pos_z_and_pos_x_neg_y_neg_z
 dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_pos_x_neg_y_pos_z_and_neg_x_pos_y_neg_z
 dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_pos_x_pos_y_pos_z_and_neg_x_neg_y_neg_z
 dEQP-GLES2.functional.depth_stencil_clear.depth_stencil_masked
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_stencil_index8
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb565_stencil_index8
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb5_a1_stencil_index8
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgba_stencil_index8
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgb_stencil_index8
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.0
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.1
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.10
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.11
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.12
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.13
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.14
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.15
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.16
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.17
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.18
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.19
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.2
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.20
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.21
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.22
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.23
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.24
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.3
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.4
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.5
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.6
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.7
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.8
 dEQP-GLES2.functional.fragment_ops.depth_stencil.random.9
 dEQP-GLES2.functional.fragment_ops.depth_stencil.write_mask.stencil
 dEQP-GLES2.functional.shaders.algorithm.hsl_to_rgb_vertex
 dEQP-GLES2.functional.shaders.functions.array_arguments.global_in_int_vertex
 dEQP-GLES2.functional.shaders.functions.array_arguments.local_in_int_vertex
 dEQP-GLES2.functional.shaders.functions.datatypes.int_int_vertex
 dEQP-GLES2.functional.shaders.functions.overloading.builtin_sin_vertex
 dEQP-GLES2.functional.shaders.functions.overloading.builtin_step_vertex
 dEQP-GLES2.functional.shaders.functions.overloading.user_func_arg_int_types_vertex
 dEQP-GLES2.functional.shaders.functions.qualifiers.inout_highp_int_vertex
 dEQP-GLES2.functional.shaders.functions.qualifiers.inout_int_vertex
 dEQP-GLES2.functional.shaders.functions.qualifiers.inout_lowp_int_vertex
 dEQP-GLES2.functional.shaders.functions.qualifiers.out_highp_int_vertex
 dEQP-GLES2.functional.shaders.functions.qualifiers.out_int_vertex
 dEQP-GLES2.functional.shaders.functions.qualifiers.out_lowp_int_vertex
 dEQP-GLES2.functional.shaders.indexing.matrix_subscript.mat3_dynamic_loop_write_static_loop_read_vertex
 dEQP-GLES2.functional.shaders.indexing.matrix_subscript.mat3_dynamic_loop_write_static_read_vertex
 dEQP-GLES2.functional.shaders.indexing.matrix_subscript.mat3_dynamic_write_dynamic_loop_read_vertex
 dEQP-GLES2.functional.shaders.loops.do_while_constant_iterations.conditional_body_vertex
 dEQP-GLES2.functional.shaders.loops.do_while_dynamic_iterations.vector_counter_fragment
 dEQP-GLES2.functional.shaders.loops.do_while_uniform_iterations.conditional_body_vertex
 dEQP-GLES2.functional.shaders.loops.do_while_uniform_iterations.nested_tricky_dataflow_2_vertex
 dEQP-GLES2.functional.shaders.loops.for_dynamic_iterations.vector_counter_fragment
 dEQP-GLES2.functional.shaders.loops.while_constant_iterations.compound_statement_vertex
 dEQP-GLES2.functional.shaders.loops.while_constant_iterations.sequence_statement_vertex
 dEQP-GLES2.functional.shaders.loops.while_dynamic_iterations.nested_sequence_vertex
 dEQP-GLES2.functional.shaders.loops.while_dynamic_iterations.vector_counter_fragment
 dEQP-GLES2.functional.shaders.loops.while_uniform_iterations.nested_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.highp_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.highp_ivec2_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.highp_ivec2_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.highp_ivec3_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.highp_ivec3_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.highp_ivec4_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.highp_ivec4_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.lowp_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.lowp_ivec2_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.lowp_ivec2_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.lowp_ivec3_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.lowp_ivec3_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.lowp_ivec4_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.lowp_ivec4_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.mediump_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.mediump_ivec2_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.mediump_ivec2_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.mediump_ivec3_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.mediump_ivec3_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.mediump_ivec4_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_effect.mediump_ivec4_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.highp_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.highp_ivec2_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.highp_ivec2_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.highp_ivec3_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.highp_ivec3_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.highp_ivec4_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.highp_ivec4_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.lowp_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.lowp_ivec2_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.lowp_ivec2_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.lowp_ivec3_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.lowp_ivec3_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.lowp_ivec4_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.lowp_ivec4_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.mediump_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.mediump_ivec2_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.mediump_ivec2_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.mediump_ivec3_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.mediump_ivec3_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.mediump_ivec4_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub_assign_result.mediump_ivec4_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_int_ivec2_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_int_ivec3_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_int_ivec4_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_ivec2_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_ivec2_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_ivec3_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_ivec3_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_ivec4_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.highp_ivec4_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_int_ivec2_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_int_ivec3_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_int_ivec4_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_ivec2_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_ivec2_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_ivec3_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_ivec3_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_ivec4_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.lowp_ivec4_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_int_ivec2_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_int_ivec3_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_int_ivec4_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_ivec2_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_ivec2_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_ivec3_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_ivec3_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_ivec4_int_vertex
 dEQP-GLES2.functional.shaders.operator.binary_operator.sub.mediump_ivec4_vertex
 dEQP-GLES2.functional.shaders.operator.unary_operator.minus.highp_int_vertex
 dEQP-GLES2.functional.shaders.operator.unary_operator.minus.highp_ivec2_vertex
 dEQP-GLES2.functional.shaders.operator.unary_operator.minus.highp_ivec3_vertex
 dEQP-GLES2.functional.shaders.operator.unary_operator.minus.highp_ivec4_vertex
 dEQP-GLES2.functional.shaders.operator.unary_operator.minus.lowp_int_vertex
 dEQP-GLES2.functional.shaders.operator.unary_operator.minus.lowp_ivec2_vertex
 dEQP-GLES2.functional.shaders.operator.unary_operator.minus.lowp_ivec3_vertex
 dEQP-GLES2.functional.shaders.operator.unary_operator.minus.lowp_ivec4_vertex
 dEQP-GLES2.functional.shaders.operator.unary_operator.minus.mediump_int_vertex
 dEQP-GLES2.functional.shaders.operator.unary_operator.minus.mediump_ivec2_vertex
 dEQP-GLES2.functional.shaders.operator.unary_operator.minus.mediump_ivec3_vertex
 dEQP-GLES2.functional.shaders.operator.unary_operator.minus.mediump_ivec4_vertex
 dEQP-GLES2.functional.shaders.random.all_features.fragment.37
 dEQP-GLES2.functional.shaders.random.exponential.fragment.11
 dEQP-GLES2.functional.shaders.random.exponential.fragment.12
 dEQP-GLES2.functional.shaders.random.exponential.fragment.14
 dEQP-GLES2.functional.shaders.random.exponential.fragment.37
 dEQP-GLES2.functional.shaders.random.exponential.fragment.5
 dEQP-GLES2.functional.shaders.random.exponential.fragment.74
 dEQP-GLES2.functional.shaders.random.texture.fragment.28
 dEQP-GLES2.functional.shaders.random.trigonometric.fragment.1
 dEQP-GLES2.functional.shaders.random.trigonometric.fragment.65
 dEQP-GLES2.functional.shaders.random.trigonometric.fragment.69
 dEQP-GLES2.functional.shaders.texture_functions.fragment.texture2d_bias
 dEQP-GLES2.functional.shaders.texture_functions.fragment.texture2dproj_vec4_bias
 dEQP-GLES2.functional.shaders.texture_functions.fragment.texturecube_bias
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_clamp_etc1
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_clamp_rgba8888
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_mirror_etc1
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_mirror_rgba8888
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_clamp_etc1
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_clamp_rgba8888
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_mirror_etc1
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_mirror_rgba8888
 dEQP-GLES2.functional.texture.mipmap.cube.basic.linear_linear
 dEQP-GLES2.functional.texture.mipmap.cube.basic.linear_nearest
 dEQP-GLES2.functional.texture.mipmap.cube.bias.linear_linear
 dEQP-GLES2.functional.texture.mipmap.cube.bias.linear_nearest
 dEQP-GLES2.functional.texture.mipmap.cube.projected.linear_linear
 dEQP-GLES2.functional.texture.mipmap.cube.projected.linear_nearest
 dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.2d_rgb
 dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.2d_rgba
 dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.cube_rgb
 dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.cube_rgba

46

.gitlab-ci/deqp-lima-skips.txt Normal file

@@ -0,0 +1,46 @@
 # Note: skips lists for CI are just a list of lines that, when
 # non-zero-length and not starting with '#', will regex match to
 # delete lines from the test list.  Be careful.
 # Skip the perf/stress tests to keep runtime manageable
 dEQP-GLES[0-9]*.performance
 dEQP-GLES[0-9]*.stress
 # These are really slow on tiling architectures (including llvmpipe).
 dEQP-GLES[0-9]*.functional.flush_finish
 # Crashes
 dEQP-GLES2.functional.shaders.invariance.highp.common_subexpression_1
 dEQP-GLES2.functional.shaders.invariance.mediump.common_subexpression_1
 dEQP-GLES2.functional.shaders.invariance.lowp.common_subexpression_1
 # Flaky
 dEQP-GLES2.functional.fbo.completeness.size.distinct
 dEQP-GLES2.functional.negative_api.shader.uniform_matrixfv_invalid_transpose
 dEQP-GLES2.functional.negative_api.texture.generatemipmap_zero_level_array_compressed
 dEQP-GLES2.functional.shaders.random.exponential.fragment.94
 dEQP-GLES2.functional.shaders.random.all_features.fragment.55
 dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z
 # Driver bugs causing GPU errors
 dEQP-GLES2.functional.shaders.loops.while_constant_iterations.nested_sequence_vertex
 dEQP-GLES2.functional.shaders.loops.while_constant_iterations.conditional_body_vertex
 dEQP-GLES2.functional.shaders.loops.while_uniform_iterations.conditional_continue_vertex
 dEQP-GLES2.functional.shaders.loops.while_uniform_iterations.double_continue_vertex
 # Hangs / OOM
 dEQP-GLES2.functional.shaders.indexing.varying_array.vec4_dynamic_loop_write_static_read
 dEQP-GLES2.functional.shaders.indexing.varying_array.vec4_dynamic_loop_write_dynamic_read
 dEQP-GLES2.functional.shaders.indexing.varying_array.vec4_dynamic_loop_write_static_loop_read
 dEQP-GLES2.functional.shaders.indexing.varying_array.vec4_dynamic_loop_write_dynamic_loop_read
 dEQP-GLES2.functional.shaders.indexing.tmp_array.vec3_dynamic_loop_write_dynamic_read_vertex
 dEQP-GLES2.functional.shaders.indexing.tmp_array.vec4_dynamic_loop_write_static_read_vertex
 dEQP-GLES2.functional.shaders.indexing.tmp_array.vec4_dynamic_loop_write_dynamic_read_vertex
 dEQP-GLES2.functional.shaders.indexing.tmp_array.vec4_dynamic_loop_write_static_loop_read_vertex
 dEQP-GLES2.functional.shaders.indexing.tmp_array.vec4_dynamic_loop_write_dynamic_loop_read_vertex
 dEQP-GLES2.functional.shaders.indexing.matrix_subscript.mat4_dynamic_loop_write_static_read_vertex
 dEQP-GLES2.functional.shaders.indexing.matrix_subscript.mat4_dynamic_loop_write_dynamic_read_vertex
 dEQP-GLES2.functional.shaders.indexing.matrix_subscript.mat4_dynamic_loop_write_static_loop_read_vertex
 dEQP-GLES2.functional.shaders.indexing.matrix_subscript.mat4_dynamic_loop_write_dynamic_loop_read_vertex
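
The header comment of this skips file describes the filtering rule: every non-empty line that does not start with '#' is a regex, and matching cases are deleted from the test list. A minimal sketch of that rule, assuming unanchored POSIX basic regex matching as done by grep (the actual runner may match differently); the temporary file names and the sample case names are made up for illustration:

```shell
#!/bin/sh
# Hypothetical sketch: apply a skips file to a case list with grep.
# Assumption: each active line is an unanchored POSIX basic regex.
set -e

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

cat > "$workdir/skips.txt" <<'EOF'
# Skip the perf/stress tests to keep runtime manageable
dEQP-GLES[0-9]*.performance

dEQP-GLES[0-9]*.stress
EOF

cat > "$workdir/caselist.txt" <<'EOF'
dEQP-GLES2.performance.shaders.operator.add
dEQP-GLES2.functional.color_clear.single_rgb
dEQP-GLES3.stress.long_running_shaders.while_vertex
dEQP-GLES2.functional.fbo.completeness.size.distinct
EOF

# Keep only active patterns: drop empty lines and '#' comments.
grep -v -e '^$' -e '^#' "$workdir/skips.txt" > "$workdir/patterns.txt"

# Delete every case that matches any of the remaining patterns.
filtered=$(grep -v -f "$workdir/patterns.txt" "$workdir/caselist.txt")
printf '%s\n' "$filtered"
```

Only the two functional cases survive the filter; the performance and stress cases are removed.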

.gitlab-ci/deqp-llvmpipe-fails.txt Normal file

@@ -0,0 +1,124 @@
 dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_center
 dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_corner
 dEQP-GLES2.functional.clipping.point.wide_point_clip
 dEQP-GLES2.functional.clipping.point.wide_point_clip_viewport_center
 dEQP-GLES2.functional.clipping.point.wide_point_clip_viewport_corner
 dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_neg_y_neg_z_and_neg_x_neg_y_pos_z
 dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_pos_y_pos_z_and_neg_x_neg_y_neg_z
 dEQP-GLES2.functional.fbo.render.color_clear.rbo_rgba4
 dEQP-GLES2.functional.fbo.render.color_clear.rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.color_clear.rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.depth.rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.no_rebind_rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.rebind_rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.no_rebind_rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.rebind_rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgba4
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgba4_depth_component16
 dEQP-GLES2.functional.polygon_offset.default_displacement_with_units
 dEQP-GLES2.functional.polygon_offset.fixed16_displacement_with_units
 dEQP-GLES2.functional.rasterization.interpolation.basic.line_loop_wide
 dEQP-GLES2.functional.rasterization.interpolation.basic.line_strip_wide
 dEQP-GLES2.functional.rasterization.interpolation.basic.lines_wide
 dEQP-GLES2.functional.rasterization.interpolation.projected.line_loop_wide
 dEQP-GLES2.functional.rasterization.interpolation.projected.line_strip_wide
 dEQP-GLES2.functional.rasterization.interpolation.projected.lines_wide
 dEQP-GLES2.functional.rasterization.limits.points
 dEQP-GLES2.functional.shaders.texture_functions.fragment.texture2d_bias
 dEQP-GLES2.functional.shaders.texture_functions.fragment.texture2dproj_vec3_bias
 dEQP-GLES2.functional.shaders.texture_functions.fragment.texture2dproj_vec4_bias
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_clamp_rgba8888
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_mirror_etc1
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_mirror_rgba8888
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_repeat_etc1
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_linear_repeat_rgba8888
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_clamp_rgba8888
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_mirror_etc1
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_mirror_rgba8888
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_repeat_etc1
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_repeat_l8
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_repeat_rgb888
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_repeat_rgba4444
 dEQP-GLES2.functional.texture.filtering.2d.linear_mipmap_linear_nearest_repeat_rgba8888
 dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_linear_clamp_rgba8888
 dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_linear_mirror_etc1
 dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_linear_mirror_rgba8888
 dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_linear_repeat_etc1
 dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_linear_repeat_rgba8888
 dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_clamp_rgba8888
 dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_mirror_etc1
 dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_mirror_rgba8888
 dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_repeat_etc1
 dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_repeat_l8
 dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_repeat_rgb888
 dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_repeat_rgba4444
 dEQP-GLES2.functional.texture.filtering.2d.nearest_mipmap_linear_nearest_repeat_rgba8888
 dEQP-GLES2.functional.texture.mipmap.2d.affine.linear_linear_repeat
 dEQP-GLES2.functional.texture.mipmap.2d.affine.nearest_linear_clamp
 dEQP-GLES2.functional.texture.mipmap.2d.affine.nearest_linear_mirror
 dEQP-GLES2.functional.texture.mipmap.2d.affine.nearest_linear_repeat
 dEQP-GLES2.functional.texture.mipmap.2d.basic.linear_linear_repeat
 dEQP-GLES2.functional.texture.mipmap.2d.basic.linear_linear_repeat_non_square
 dEQP-GLES2.functional.texture.mipmap.2d.basic.nearest_linear_clamp
 dEQP-GLES2.functional.texture.mipmap.2d.basic.nearest_linear_clamp_non_square
 dEQP-GLES2.functional.texture.mipmap.2d.basic.nearest_linear_mirror
 dEQP-GLES2.functional.texture.mipmap.2d.basic.nearest_linear_mirror_non_square
 dEQP-GLES2.functional.texture.mipmap.2d.basic.nearest_linear_repeat
 dEQP-GLES2.functional.texture.mipmap.2d.basic.nearest_linear_repeat_non_square
 dEQP-GLES2.functional.texture.mipmap.2d.projected.linear_linear_repeat
 dEQP-GLES2.functional.texture.mipmap.2d.projected.nearest_linear_clamp
 dEQP-GLES2.functional.texture.mipmap.2d.projected.nearest_linear_mirror
 dEQP-GLES2.functional.texture.mipmap.2d.projected.nearest_linear_repeat
 dEQP-GLES2.functional.texture.mipmap.cube.basic.linear_linear
 dEQP-GLES2.functional.texture.mipmap.cube.basic.linear_nearest
 dEQP-GLES2.functional.texture.mipmap.cube.bias.linear_linear
 dEQP-GLES2.functional.texture.mipmap.cube.bias.linear_nearest
 dEQP-GLES2.functional.texture.mipmap.cube.projected.linear_linear
 dEQP-GLES2.functional.texture.mipmap.cube.projected.linear_nearest
 dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_linear_clamp
 dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_linear_mirror
 dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_linear_repeat
 dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_nearest_clamp
 dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_nearest_mirror
 dEQP-GLES2.functional.texture.vertex.2d.filtering.linear_mipmap_linear_nearest_repeat
 dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_linear_clamp
 dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_linear_mirror
 dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_linear_repeat
 dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_nearest_clamp
 dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_nearest_mirror
 dEQP-GLES2.functional.texture.vertex.2d.filtering.nearest_mipmap_linear_nearest_repeat
 dEQP-GLES2.functional.texture.vertex.2d.wrap.clamp_clamp
 dEQP-GLES2.functional.texture.vertex.2d.wrap.clamp_mirror
 dEQP-GLES2.functional.texture.vertex.2d.wrap.clamp_repeat
 dEQP-GLES2.functional.texture.vertex.2d.wrap.mirror_clamp
 dEQP-GLES2.functional.texture.vertex.2d.wrap.mirror_mirror
 dEQP-GLES2.functional.texture.vertex.2d.wrap.mirror_repeat
 dEQP-GLES2.functional.texture.vertex.2d.wrap.repeat_clamp
 dEQP-GLES2.functional.texture.vertex.2d.wrap.repeat_mirror
 dEQP-GLES2.functional.texture.vertex.2d.wrap.repeat_repeat
 dEQP-GLES2.functional.texture.vertex.cube.filtering.linear_mipmap_linear_linear_clamp
 dEQP-GLES2.functional.texture.vertex.cube.filtering.linear_mipmap_linear_linear_mirror
 dEQP-GLES2.functional.texture.vertex.cube.filtering.linear_mipmap_linear_linear_repeat
 dEQP-GLES2.functional.texture.vertex.cube.filtering.linear_mipmap_linear_nearest_clamp
 dEQP-GLES2.functional.texture.vertex.cube.filtering.linear_mipmap_linear_nearest_mirror
 dEQP-GLES2.functional.texture.vertex.cube.filtering.linear_mipmap_linear_nearest_repeat
 dEQP-GLES2.functional.texture.vertex.cube.filtering.nearest_mipmap_linear_linear_clamp
 dEQP-GLES2.functional.texture.vertex.cube.filtering.nearest_mipmap_linear_linear_mirror
 dEQP-GLES2.functional.texture.vertex.cube.filtering.nearest_mipmap_linear_linear_repeat
 dEQP-GLES2.functional.texture.vertex.cube.filtering.nearest_mipmap_linear_nearest_clamp
 dEQP-GLES2.functional.texture.vertex.cube.filtering.nearest_mipmap_linear_nearest_mirror
 dEQP-GLES2.functional.texture.vertex.cube.filtering.nearest_mipmap_linear_nearest_repeat
 dEQP-GLES2.functional.texture.vertex.cube.wrap.clamp_clamp
 dEQP-GLES2.functional.texture.vertex.cube.wrap.clamp_mirror
 dEQP-GLES2.functional.texture.vertex.cube.wrap.clamp_repeat
 dEQP-GLES2.functional.texture.vertex.cube.wrap.mirror_clamp
 dEQP-GLES2.functional.texture.vertex.cube.wrap.mirror_mirror
 dEQP-GLES2.functional.texture.vertex.cube.wrap.mirror_repeat
 dEQP-GLES2.functional.texture.vertex.cube.wrap.repeat_clamp
 dEQP-GLES2.functional.texture.vertex.cube.wrap.repeat_mirror
 dEQP-GLES2.functional.texture.vertex.cube.wrap.repeat_repeat

.gitlab-ci/deqp-panfrost-t720-fails.txt Normal file

@@ -0,0 +1,31 @@
 dEQP-GLES2.functional.depth_stencil_clear.depth_stencil_masked
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_stencil_index8
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgba_depth_component16

.gitlab-ci/deqp-panfrost-t720-skips.txt Normal file

@@ -0,0 +1,14 @@
 # Note: skips lists for CI are just a list of lines that, when
 # non-zero-length and not starting with '#', will regex match to
 # delete lines from the test list.  Be careful.
 # Skip the perf/stress tests to keep runtime manageable
 dEQP-GLES[0-9]*.performance.*
 dEQP-GLES[0-9]*.stress.*
 # These are really slow on tiling architectures (including llvmpipe).
 dEQP-GLES[0-9]*.functional.flush_finish.*
 # XXX: Why does this flake?
 dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z
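
The "Be careful" in the note at the top of this file matters because a full test name used as a pattern is still a regex. The sketch below shows one way this can over-match, assuming the patterns are applied as unanchored regexes (as grep would do); the case names are hypothetical and do not come from the dEQP case list:

```shell
#!/bin/sh
# Illustrative only: how an unanchored pattern can remove more than intended.
# The case names below are hypothetical.
cases='dEQP-GLES2.functional.example.case_1
dEQP-GLES2.functional.example.case_10
dEQP-GLES2.functional.example.case_2'

pattern='dEQP-GLES2.functional.example.case_1'

# Unanchored: matches case_1 and also case_10, which shares the prefix.
loose=$(printf '%s\n' "$cases" | grep -c -e "$pattern")

# Escape the dots and anchor both ends: matches case_1 only.
strict_pattern=$(printf '%s' "$pattern" | sed 's/\./\\./g')
strict=$(printf '%s\n' "$cases" | grep -c -e "^${strict_pattern}\$")

echo "loose=$loose strict=$strict"
```

Whether anchoring is needed in practice depends on how the runner applies the list, so this is a caution rather than a required change.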

.gitlab-ci/deqp-panfrost-t760-fails.txt Normal file

@@ -0,0 +1,31 @@
 dEQP-GLES2.functional.depth_stencil_clear.depth_stencil_masked
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_stencil_index8
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgba_depth_component16

.gitlab-ci/deqp-panfrost-t760-skips.txt Normal file

@@ -0,0 +1,10 @@
 # Note: skips lists for CI are just a list of lines that, when
 # non-zero-length and not starting with '#', will regex match to
 # delete lines from the test list.  Be careful.
 # Skip the perf/stress tests to keep runtime manageable
 dEQP-GLES[0-9]*.performance.*
 dEQP-GLES[0-9]*.stress.*
 # These are really slow on tiling architectures (including llvmpipe).
 dEQP-GLES[0-9]*.functional.flush_finish.*

.gitlab-ci/deqp-panfrost-t820-fails.txt Normal file

@@ -0,0 +1,31 @@
 dEQP-GLES2.functional.depth_stencil_clear.depth_stencil_masked
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_stencil_index8
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgba_depth_component16

.gitlab-ci/deqp-panfrost-t820-skips.txt Normal file

@@ -0,0 +1,13 @@
 # Note: skips lists for CI are just a list of lines that, when
 # non-zero-length and not starting with '#', will regex match to
 # delete lines from the test list.  Be careful.
 # Skip the perf/stress tests to keep runtime manageable
 dEQP-GLES[0-9]*.performance.*
 dEQP-GLES[0-9]*.stress.*
 # These are really slow on tiling architectures (including llvmpipe).
 dEQP-GLES[0-9]*.functional.flush_finish.*
 # XXX: Why does this flake?
 dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z

.gitlab-ci/deqp-panfrost-t860-fails.txt Normal file

@@ -0,0 +1,31 @@
 dEQP-GLES2.functional.depth_stencil_clear.depth_stencil_masked
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb565_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgb5_a1_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgb_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_tex2d_rgba_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb565_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgb5_a1_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgb_stencil_index8
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_tex2d_rgba_stencil_index8
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_colorbuffer.tex2d_rgba_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb565_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgb5_a1_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.rbo_rgba4_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgb_depth_component16
 dEQP-GLES2.functional.fbo.render.shared_depthbuffer.tex2d_rgba_depth_component16

.gitlab-ci/deqp-panfrost-t860-skips.txt Normal file

@@ -0,0 +1,13 @@
 # Note: skips lists for CI are just a list of lines that, when
 # non-zero-length and not starting with '#', will regex match to
 # delete lines from the test list.  Be careful.
 # Skip the perf/stress tests to keep runtime manageable
 dEQP-GLES[0-9]*.performance.*
 dEQP-GLES[0-9]*.stress.*
 # These are really slow on tiling architectures (including llvmpipe).
 dEQP-GLES[0-9]*.functional.flush_finish.*
 # XXX: Why does this flake?
 dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z

.gitlab-ci/deqp-radv-polaris10-skips.txt Normal file

@@ -0,0 +1,31 @@
 # Disable a TON of tests to keep the run around 5-10 minutes because my runner is
 # slow.
 dEQP-VK.api.*
 dEQP-VK.binding_model.*
 dEQP-VK.clipping.*
 dEQP-VK.compute.*
 dEQP-VK.conditional_rendering.*
 dEQP-VK.descriptor_indexing.*
 dEQP-VK.device_group.*
 dEQP-VK.fragment_operations.*
 dEQP-VK.fragment_shader_interlock.*
 dEQP-VK.graphicsfuzz.*
 dEQP-VK.image.*
 dEQP-VK.imageless_framebuffer.*
 dEQP-VK.info.*
 dEQP-VK.memory.*
 dEQP-VK.memory_model.*
 dEQP-VK.multiview.*
 dEQP-VK.pipeline.*
 dEQP-VK.protected_memory.*
 dEQP-VK.query_pool.*
 dEQP-VK.robustness.*
 dEQP-VK.sparse_resources.*
 dEQP-VK.spirv_assembly.*
 dEQP-VK.subgroups.*
 dEQP-VK.synchronization.*
 dEQP-VK.texture.*
 dEQP-VK.transform_feedback.*
 dEQP-VK.ubo.*
 dEQP-VK.wsi.*
 dEQP-VK.ycbcr.*
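
The deqp-runner.sh script in this change splits the case list across parallel CI jobs with GNU sed's first~step address, keyed on CI_NODE_INDEX and CI_NODE_TOTAL. A small sketch of that sharding on a toy list; shard_caselist is a hypothetical helper that reads stdin rather than editing a file in place, and it assumes GNU sed is installed:

```shell
#!/bin/sh
# Sketch only: shard_caselist is a hypothetical helper, not part of the script.
# Requires GNU sed: the first~step address form is a GNU extension.
shard_caselist() {
    index=$1   # 1-based shard number, playing the role of CI_NODE_INDEX
    total=$2   # number of shards, playing the role of CI_NODE_TOTAL
    # Print line $index and every $total-th line after it.
    sed -n "${index}~${total}p"
}

# Seven toy cases, shard 2 of 3: lines 2 and 5 are selected.
shard=$(printf 'case-%s\n' 1 2 3 4 5 6 7 | shard_caselist 2 3)
printf '%s\n' "$shard"
```

Because the selection is by line position, each shard gets an interleaved slice of the list rather than a contiguous block.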

									
.gitlab-ci/deqp-runner.sh Executable file

@@ -0,0 +1,237 @@

 #!/bin/sh

 set -ex

 DEQP_OPTIONS="--deqp-surface-width=256 --deqp-surface-height=256"
 DEQP_OPTIONS="$DEQP_OPTIONS --deqp-surface-type=pbuffer"
 DEQP_OPTIONS="$DEQP_OPTIONS --deqp-gl-config-name=rgba8888d24s8ms0"
 DEQP_OPTIONS="$DEQP_OPTIONS --deqp-visibility=hidden"

 # It would be nice to be able to enable the watchdog, so that hangs in a test
 # don't need to wait the full hour for the run to time out.  However, some
 # shaders end up taking long enough to compile
 # (dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20 for example)
 # that they'll sporadically trigger the watchdog.
 #DEQP_OPTIONS="$DEQP_OPTIONS --deqp-watchdog=enable"

 if [ -z "$DEQP_VER" ]; then
    echo 'DEQP_VER must be set to something like "gles2", "gles31" or "vk" for the test run'
    exit 1
 fi

 if [ "$DEQP_VER" = "vk" ]; then
    if [ -z "$VK_DRIVER" ]; then
       echo 'VK_DRIVER must be set to something like "radeon" or "intel" for the test run'
       exit 1
    fi
 fi

 if [ -z "$DEQP_SKIPS" ]; then
    echo 'DEQP_SKIPS must be set to something like "deqp-default-skips.txt"'
    exit 1
 fi

 ARTIFACTS=`pwd`/artifacts

 # Set up the driver environment.
 export LD_LIBRARY_PATH=`pwd`/install/lib/
 export EGL_PLATFORM=surfaceless
 export VK_ICD_FILENAMES=`pwd`/install/share/vulkan/icd.d/"$VK_DRIVER"_icd.x86_64.json

 # the runner was failing to look for libkms in /usr/local/lib for some reason
 # I never figured out.
 export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib

 RESULTS=`pwd`/results
 mkdir -p $RESULTS

 # Generate test case list file.
 if [ "$DEQP_VER" = "vk" ]; then
    cp /deqp/mustpass/vk-master.txt /tmp/case-list.txt
    DEQP=/deqp/external/vulkancts/modules/vulkan/deqp-vk
 else
    cp /deqp/mustpass/$DEQP_VER-master.txt /tmp/case-list.txt
    DEQP=/deqp/modules/$DEQP_VER/deqp-$DEQP_VER
 fi

 # If the job is parallel, take the corresponding fraction of the caselist.
 # Note: N~M is a gnu sed extension to match every nth line (first line is #1).
 if [ -n "$CI_NODE_INDEX" ]; then
    sed -ni $CI_NODE_INDEX~$CI_NODE_TOTAL"p" /tmp/case-list.txt
 fi

 if [ ! -s /tmp/case-list.txt ]; then
     echo "Caselist generation failed"
     exit 1
 fi

 if [ -n "$DEQP_EXPECTED_FAILS" ]; then
     XFAIL="--xfail-list $ARTIFACTS/$DEQP_EXPECTED_FAILS"
 fi

 set +e

 run_cts() {
     deqp=$1
     caselist=$2
     output=$3
     deqp-runner \
         --deqp $deqp \
         --output $output \
         --caselist $caselist \
         --exclude-list $ARTIFACTS/$DEQP_SKIPS \
         $XFAIL \
         --job ${DEQP_PARALLEL:-1} \
         --allow-flakes true \
         $DEQP_RUNNER_OPTIONS \
         -- \
         $DEQP_OPTIONS
 }

 report_flakes() {
     if [ -z "$FLAKES_CHANNEL" ]; then
         return 0
     fi
     flakes=$1
     bot="$CI_RUNNER_DESCRIPTION-$CI_PIPELINE_ID"
     channel="$FLAKES_CHANNEL"
     (
     echo NICK $bot
     echo USER $bot unused unused :Gitlab CI Notifier
     sleep 10
     echo "JOIN $channel"
     sleep 1
     desc="Flakes detected in job: $CI_JOB_URL on $CI_RUNNER_DESCRIPTION"
     # Only mention the branch when the job runs for a merge request.
     if [ -n "$CI_MERGE_REQUEST_SOURCE_BRANCH_NAME" ]; then
         desc="$desc on branch $CI_MERGE_REQUEST_SOURCE_BRANCH_NAME ($CI_MERGE_REQUEST_TITLE)"
     fi
     echo "PRIVMSG $channel :$desc"
     for flake in `cat $flakes`; do
         echo "PRIVMSG $channel :$flake"
     done
     echo "PRIVMSG $channel :See $CI_JOB_URL/artifacts/browse/results/"
     echo "QUIT"
     ) | nc irc.freenode.net 6667 > /dev/null
 }

 extract_xml_result() {
     testcase=$1
     shift 1
     qpas=$*
     start="#beginTestCaseResult $testcase"
     for qpa in $qpas; do
         while IFS= read -r line; do
             if [ "$line" = "$start" ]; then
                 dst="$testcase.qpa"
                 echo "#beginSession" > $dst
                 echo $line >> $dst
                 while IFS= read -r line; do
                     if [ "$line" = "#endTestCaseResult" ]; then
                         echo $line >> $dst
                         echo "#endSession" >> $dst
                         /deqp/executor/testlog-to-xml $dst "$RESULTS/$testcase.xml"
                         # copy the stylesheets here so they only end up in artifacts
                         # if we have one or more result xml in artifacts
                         cp /deqp/testlog.css "$RESULTS/"
                         cp /deqp/testlog.xsl "$RESULTS/"
                         return 0
                     fi
                     echo $line >> $dst
                 done
                 return 1
             fi
         done < $qpa
     done
 }

 extract_xml_results() {
     qpas=$*
     while IFS= read -r testcase; do
         testcase=${testcase%,*}
         extract_xml_result $testcase $qpas
     done
 }

				# Generate junit results

				generate_junit() {

				    results=$1

				    echo "<?xml version=\"1.0\" encoding=\"utf-8\"?>"

				    echo "<testsuites>"

				    echo "<testsuite name=\"$DEQP_VER-$CI_NODE_INDEX\">"

				    while read line; do

				        testcase=${line%,*}

				        result=${line#*,}

				        # avoid counting Skip's in the # of tests:

				        if [ "$result" = "Skip" ]; then

				            continue;

				        fi

				        echo "<testcase name=\"$testcase\">"

				        if [ "$result" != "Pass" ]; then

				            echo "<failure type=\"$result\">"

				            echo "$result: See $CI_JOB_URL/artifacts/results/$testcase.xml"

				            echo "</failure>"

				        fi

				        echo "</testcase>"

				    done < $results

				    echo "</testsuite>"

				    echo "</testsuites>"

				}

				# wrapper to suppress +x to avoid spamming the log

				quiet() {

				    set +x

				    "$@"

				    set -x

				}

				run_cts $DEQP /tmp/case-list.txt $RESULTS/cts-runner-results.txt

				DEQP_EXITCODE=$?

				quiet generate_junit $RESULTS/cts-runner-results.txt > $RESULTS/results.xml

				if [ $DEQP_EXITCODE -ne 0 ]; then

				    # preserve caselist files in case of failures:

				    cp /tmp/deqp_runner.*.txt $RESULTS/

				    echo "Some unexpected results found (see cts-runner-results.txt in artifacts for full results):"

				    cat $RESULTS/cts-runner-results.txt | \

				        grep -v ",Pass" | \

				        grep -v ",Skip" | \

				        grep -v ",ExpectedFail" > \

				        $RESULTS/cts-runner-unexpected-results.txt

				    head -n 50 $RESULTS/cts-runner-unexpected-results.txt

				    if [ -z "$DEQP_NO_SAVE_RESULTS" ]; then

				        # Save the logs for up to the first 50 unexpected results:

				        head -n 50 $RESULTS/cts-runner-unexpected-results.txt | quiet extract_xml_results /tmp/*.qpa

				    fi

				    count=`cat $RESULTS/cts-runner-unexpected-results.txt | wc -l`

				    # Re-run fails to detect flakes.  But use a small threshold, if

				    # something was fundamentally broken, we don't want to re-run

				    # the entire caselist

				else

				    cat $RESULTS/cts-runner-results.txt | \

				        grep ",Flake" > \

				        $RESULTS/cts-runner-flakes.txt

				    count=`cat $RESULTS/cts-runner-flakes.txt | wc -l`

				    if [ $count -gt 0 ]; then

				        echo "Some flakes found (see cts-runner-flakes.txt in artifacts for full results):"

				        head -n 50 $RESULTS/cts-runner-flakes.txt

				        if [ -z "$DEQP_NO_SAVE_RESULTS" ]; then

				            # Save the logs for up to the first 50 flakes:

				            head -n 50 $RESULTS/cts-runner-flakes.txt | quiet extract_xml_results /tmp/*.qpa

				        fi

				        # Report the flakes to IRC channel for monitoring (if configured):

				        quiet report_flakes $RESULTS/cts-runner-flakes.txt

				    else

				        # no flakes, so clean-up:

				        rm $RESULTS/cts-runner-flakes.txt

				    fi

				fi

				exit $DEQP_EXITCODE

.gitlab-ci/deqp-softpipe-fails.txt Normal file

@@ -0,0 +1,844 @@
 dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_center
 dEQP-GLES2.functional.clipping.line.wide_line_clip_viewport_corner
 dEQP-GLES2.functional.clipping.point.wide_point_clip
 dEQP-GLES2.functional.clipping.point.wide_point_clip_viewport_center
 dEQP-GLES2.functional.clipping.point.wide_point_clip_viewport_corner
 dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_neg_y_neg_z_and_neg_x_neg_y_pos_z
 dEQP-GLES2.functional.clipping.triangle_vertex.clip_two.clip_pos_y_pos_z_and_neg_x_neg_y_neg_z
 dEQP-GLES2.functional.polygon_offset.default_displacement_with_units
 dEQP-GLES2.functional.polygon_offset.fixed16_displacement_with_units
 dEQP-GLES2.functional.rasterization.interpolation.basic.line_loop_wide
 dEQP-GLES2.functional.rasterization.interpolation.basic.line_strip_wide
 dEQP-GLES2.functional.rasterization.interpolation.basic.lines_wide
 dEQP-GLES2.functional.rasterization.interpolation.projected.line_loop_wide
 dEQP-GLES2.functional.rasterization.interpolation.projected.line_strip_wide
 dEQP-GLES2.functional.rasterization.interpolation.projected.lines_wide
 dEQP-GLES2.functional.rasterization.limits.points
 dEQP-GLES2.functional.rasterization.primitives.points
 dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_center
 dEQP-GLES3.functional.clipping.line.wide_line_clip_viewport_corner
 dEQP-GLES3.functional.clipping.point.wide_point_clip
 dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center
 dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner
 dEQP-GLES3.functional.clipping.triangle_vertex.clip_two.clip_neg_y_neg_z_and_neg_x_neg_y_pos_z
 dEQP-GLES3.functional.clipping.triangle_vertex.clip_two.clip_pos_y_pos_z_and_neg_x_neg_y_neg_z
 dEQP-GLES3.functional.draw.random.124
 dEQP-GLES3.functional.fbo.depth.depth_test_clamp.depth24_stencil8
 dEQP-GLES3.functional.fbo.depth.depth_test_clamp.depth32f_stencil8
 dEQP-GLES3.functional.fbo.depth.depth_test_clamp.depth_component16
 dEQP-GLES3.functional.fbo.depth.depth_test_clamp.depth_component24
 dEQP-GLES3.functional.fbo.depth.depth_test_clamp.depth_component32f
 dEQP-GLES3.functional.fbo.depth.depth_write_clamp.depth32f_stencil8
 dEQP-GLES3.functional.fbo.depth.depth_write_clamp.depth_component32f
 dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_color
 dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_depth
 dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_depth_stencil
 dEQP-GLES3.functional.fbo.invalidate.sub.unbind_blit_msaa_stencil
 dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_color
 dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_depth
 dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_depth_stencil
 dEQP-GLES3.functional.fbo.invalidate.whole.unbind_blit_msaa_stencil
 dEQP-GLES3.functional.fbo.msaa.2_samples.depth24_stencil8
 dEQP-GLES3.functional.fbo.msaa.2_samples.depth32f_stencil8
 dEQP-GLES3.functional.fbo.msaa.2_samples.depth_component16
 dEQP-GLES3.functional.fbo.msaa.2_samples.depth_component24
 dEQP-GLES3.functional.fbo.msaa.2_samples.depth_component32f
 dEQP-GLES3.functional.fbo.msaa.2_samples.r11f_g11f_b10f
 dEQP-GLES3.functional.fbo.msaa.2_samples.r16f
 dEQP-GLES3.functional.fbo.msaa.2_samples.r8
 dEQP-GLES3.functional.fbo.msaa.2_samples.rg16f
 dEQP-GLES3.functional.fbo.msaa.2_samples.rg8
 dEQP-GLES3.functional.fbo.msaa.2_samples.rgb10_a2
 dEQP-GLES3.functional.fbo.msaa.2_samples.rgb565
 dEQP-GLES3.functional.fbo.msaa.2_samples.rgb5_a1
 dEQP-GLES3.functional.fbo.msaa.2_samples.rgb8
 dEQP-GLES3.functional.fbo.msaa.2_samples.rgba4
 dEQP-GLES3.functional.fbo.msaa.2_samples.rgba8
 dEQP-GLES3.functional.fbo.msaa.2_samples.srgb8_alpha8
 dEQP-GLES3.functional.fbo.msaa.2_samples.stencil_index8
 dEQP-GLES3.functional.fbo.msaa.4_samples.depth24_stencil8
 dEQP-GLES3.functional.fbo.msaa.4_samples.depth32f_stencil8
 dEQP-GLES3.functional.fbo.msaa.4_samples.depth_component16
 dEQP-GLES3.functional.fbo.msaa.4_samples.depth_component24
 dEQP-GLES3.functional.fbo.msaa.4_samples.depth_component32f
 dEQP-GLES3.functional.fbo.msaa.4_samples.r11f_g11f_b10f
 dEQP-GLES3.functional.fbo.msaa.4_samples.r16f
 dEQP-GLES3.functional.fbo.msaa.4_samples.r8
 dEQP-GLES3.functional.fbo.msaa.4_samples.rg16f
 dEQP-GLES3.functional.fbo.msaa.4_samples.rg8
 dEQP-GLES3.functional.fbo.msaa.4_samples.rgb10_a2
 dEQP-GLES3.functional.fbo.msaa.4_samples.rgb565
 dEQP-GLES3.functional.fbo.msaa.4_samples.rgb5_a1
 dEQP-GLES3.functional.fbo.msaa.4_samples.rgb8
 dEQP-GLES3.functional.fbo.msaa.4_samples.rgba4
 dEQP-GLES3.functional.fbo.msaa.4_samples.rgba8
 dEQP-GLES3.functional.fbo.msaa.4_samples.srgb8_alpha8
 dEQP-GLES3.functional.fbo.msaa.4_samples.stencil_index8
 dEQP-GLES3.functional.multisample.fbo_max_samples.proportionality_alpha_to_coverage
 dEQP-GLES3.functional.multisample.fbo_max_samples.proportionality_sample_coverage
 dEQP-GLES3.functional.multisample.fbo_max_samples.proportionality_sample_coverage_inverted
 dEQP-GLES3.functional.multisample.fbo_max_samples.sample_coverage_invert
 dEQP-GLES3.functional.negative_api.buffer.blit_framebuffer_multisample
 dEQP-GLES3.functional.negative_api.buffer.read_pixels_fbo_format_mismatch
 dEQP-GLES3.functional.polygon_offset.default_displacement_with_units
 dEQP-GLES3.functional.polygon_offset.fixed16_displacement_with_units
 dEQP-GLES3.functional.polygon_offset.fixed24_displacement_with_units
 dEQP-GLES3.functional.polygon_offset.float32_displacement_with_units
 dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_max.interpolation.lines_wide
 dEQP-GLES3.functional.rasterization.fbo.rbo_multisample_max.primitives.lines_wide
 dEQP-GLES3.functional.rasterization.fbo.rbo_singlesample.interpolation.lines_wide
 dEQP-GLES3.functional.rasterization.fbo.rbo_singlesample.primitives.points
 dEQP-GLES3.functional.rasterization.fbo.texture_2d.interpolation.lines_wide
 dEQP-GLES3.functional.rasterization.fbo.texture_2d.primitives.points
 dEQP-GLES3.functional.rasterization.interpolation.basic.line_loop_wide
 dEQP-GLES3.functional.rasterization.interpolation.basic.line_strip_wide
 dEQP-GLES3.functional.rasterization.interpolation.basic.lines_wide
 dEQP-GLES3.functional.rasterization.interpolation.projected.line_loop_wide
 dEQP-GLES3.functional.rasterization.interpolation.projected.line_strip_wide
 dEQP-GLES3.functional.rasterization.interpolation.projected.lines_wide
 dEQP-GLES3.functional.rasterization.primitives.points
 dEQP-GLES3.functional.rasterizer_discard.basic.write_depth_points
 dEQP-GLES3.functional.rasterizer_discard.basic.write_stencil_points
 dEQP-GLES3.functional.rasterizer_discard.fbo.write_depth_points
 dEQP-GLES3.functional.rasterizer_discard.fbo.write_stencil_points
 dEQP-GLES3.functional.rasterizer_discard.scissor.write_depth_points
 dEQP-GLES3.functional.rasterizer_discard.scissor.write_stencil_points
 dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.float_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.float_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.vec2_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.vec2_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.vec3_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.vec3_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.vec4_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.fastest.fbo_msaa4.vec4_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.float_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.float_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.vec2_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.vec2_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.vec3_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.vec3_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.vec4_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa2.vec4_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.float_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.float_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.vec2_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.vec2_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.vec3_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.vec3_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.vec4_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.fbo_msaa4.vec4_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.float_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.float_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.vec2_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.vec2_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.vec3_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.vec3_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.vec4_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.nicest.fbo_msaa4.vec4_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.float_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.float_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.vec2_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.vec2_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.vec3_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.vec3_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.vec4_highp
 dEQP-GLES3.functional.shaders.derivate.dfdx.texture.msaa4.vec4_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.float_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.float_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.vec2_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.vec2_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.vec3_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.vec3_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.vec4_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.fastest.fbo_msaa4.vec4_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.float_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.float_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.vec2_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.vec2_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.vec3_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.vec3_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.vec4_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa2.vec4_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.float_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.float_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.vec2_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.vec2_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.vec3_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.vec3_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.vec4_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.fbo_msaa4.vec4_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.float_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.float_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.vec2_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.vec2_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.vec3_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.vec3_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.vec4_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.nicest.fbo_msaa4.vec4_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.float_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.float_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.vec2_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.vec2_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.vec3_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.vec3_mediump
 dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.vec4_highp
 dEQP-GLES3.functional.shaders.derivate.dfdy.texture.msaa4.vec4_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.float_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.float_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.vec2_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.vec2_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.vec3_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.vec3_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.vec4_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.fastest.fbo_msaa4.vec4_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.float_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.float_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.vec2_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.vec2_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.vec3_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.vec3_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.vec4_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa2.vec4_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.float_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.float_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.vec2_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.vec2_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.vec3_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.vec3_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.vec4_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.fbo_msaa4.vec4_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.float_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.float_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.vec2_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.vec2_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.vec3_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.vec3_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.vec4_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.nicest.fbo_msaa4.vec4_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.float_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.float_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.vec2_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.vec2_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.vec3_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.vec3_mediump
 dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.vec4_highp
 dEQP-GLES3.functional.shaders.derivate.fwidth.texture.msaa4.vec4_mediump
 dEQP-GLES3.functional.state_query.integers.max_samples_getfloat
 dEQP-GLES3.functional.state_query.integers.max_samples_getinteger64
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_clamp_clamp_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_clamp_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_clamp_mirror_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_clamp_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_clamp_repeat_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_clamp_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_mirror_clamp_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_mirror_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_mirror_mirror_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_mirror_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_mirror_repeat_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_mirror_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_repeat_clamp_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_repeat_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_repeat_mirror_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_repeat_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_repeat_repeat_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_linear_repeat_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_clamp_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_clamp_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_clamp_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_mirror_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_mirror_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_mirror_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_repeat_clamp_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_repeat_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_repeat_mirror_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_repeat_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_repeat_repeat_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_linear_repeat_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_nearest_repeat_clamp_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_nearest_repeat_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_nearest_repeat_mirror_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_nearest_repeat_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_nearest_repeat_repeat_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_linear_nearest_repeat_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_clamp_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_clamp_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_clamp_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_mirror_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_mirror_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_mirror_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_repeat_clamp_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_repeat_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_repeat_mirror_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_repeat_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_repeat_repeat_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_linear_repeat_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_nearest_repeat_clamp_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_nearest_repeat_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_nearest_repeat_mirror_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_nearest_repeat_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_nearest_repeat_repeat_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_mipmap_nearest_nearest_repeat_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_clamp_clamp_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_clamp_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_clamp_mirror_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_clamp_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_clamp_repeat_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_clamp_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_mirror_clamp_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_mirror_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_mirror_mirror_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_mirror_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_mirror_repeat_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_mirror_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_repeat_clamp_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_repeat_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_repeat_mirror_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_repeat_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_repeat_repeat_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.linear_nearest_repeat_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_clamp_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_clamp_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_clamp_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_mirror_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_mirror_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_mirror_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_repeat_clamp_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_repeat_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_repeat_mirror_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_repeat_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_repeat_repeat_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_linear_repeat_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_clamp_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_clamp_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_clamp_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_mirror_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_mirror_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_mirror_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_repeat_clamp_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_repeat_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_repeat_mirror_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_repeat_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_repeat_repeat_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_linear_linear_repeat_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_clamp_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_clamp_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_clamp_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_mirror_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_mirror_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_mirror_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_repeat_clamp_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_repeat_clamp_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_repeat_mirror_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_repeat_mirror_repeat
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_repeat_repeat_mirror
 dEQP-GLES3.functional.texture.filtering.3d.combinations.nearest_mipmap_nearest_linear_repeat_repeat_repeat
 dEQP-GLES3.functional.texture.filtering.3d.formats.r11f_g11f_b10f_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.r11f_g11f_b10f_linear_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.r11f_g11f_b10f_linear_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.r11f_g11f_b10f_nearest_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.r11f_g11f_b10f_nearest_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb10_a2_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb10_a2_linear_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb10_a2_linear_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb10_a2_nearest_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb10_a2_nearest_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb565_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb565_linear_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb565_linear_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb565_nearest_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb565_nearest_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb5_a1_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb5_a1_linear_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb5_a1_linear_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb5_a1_nearest_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb5_a1_nearest_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb9_e5_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb9_e5_linear_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb9_e5_linear_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb9_e5_nearest_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgb9_e5_nearest_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba16f_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba16f_linear_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba16f_linear_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba16f_nearest_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba16f_nearest_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba4_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba4_linear_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba4_linear_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba4_nearest_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba4_nearest_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_linear_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_linear_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_nearest_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_nearest_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_snorm_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_snorm_linear_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_snorm_linear_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_snorm_nearest_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.rgba8_snorm_nearest_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.srgb8_alpha8_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.srgb8_alpha8_linear_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.srgb8_alpha8_linear_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.srgb8_alpha8_nearest_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.srgb8_alpha8_nearest_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.srgb_r8_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.srgb_r8_linear_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.srgb_r8_linear_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.formats.srgb_r8_nearest_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.formats.srgb_r8_nearest_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.sizes.128x32x64_linear
 dEQP-GLES3.functional.texture.filtering.3d.sizes.128x32x64_linear_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.sizes.128x32x64_linear_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.sizes.128x32x64_nearest_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.sizes.128x32x64_nearest_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.sizes.63x63x63_linear
 dEQP-GLES3.functional.texture.filtering.3d.sizes.63x63x63_linear_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.sizes.63x63x63_linear_mipmap_nearest
 dEQP-GLES3.functional.texture.filtering.3d.sizes.63x63x63_nearest_mipmap_linear
 dEQP-GLES3.functional.texture.filtering.3d.sizes.63x63x63_nearest_mipmap_nearest
 dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_linear_clamp
 dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_linear_mirror
 dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_linear_repeat
 dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_mipmap_linear_linear_clamp
 dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_mipmap_linear_linear_mirror
 dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_mipmap_linear_linear_repeat
 dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_mipmap_linear_nearest_clamp
 dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_mipmap_linear_nearest_mirror
 dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_mipmap_linear_nearest_repeat
 dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_mipmap_nearest_linear_repeat
 dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_nearest_clamp
 dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_nearest_mirror
 dEQP-GLES3.functional.texture.vertex.3d.filtering.linear_nearest_repeat
 dEQP-GLES3.functional.texture.vertex.3d.filtering.nearest_linear_repeat
 dEQP-GLES3.functional.texture.vertex.3d.filtering.nearest_mipmap_linear_linear_repeat
 dEQP-GLES3.functional.texture.vertex.3d.filtering.nearest_mipmap_nearest_linear_repeat
 dEQP-GLES3.functional.texture.vertex.3d.wrap.clamp_clamp_clamp
 dEQP-GLES3.functional.texture.vertex.3d.wrap.clamp_clamp_mirror
 dEQP-GLES3.functional.texture.vertex.3d.wrap.clamp_clamp_repeat
 dEQP-GLES3.functional.texture.vertex.3d.wrap.clamp_mirror_mirror
 dEQP-GLES3.functional.texture.vertex.3d.wrap.clamp_mirror_repeat
 dEQP-GLES3.functional.texture.vertex.3d.wrap.clamp_repeat_mirror
 dEQP-GLES3.functional.texture.vertex.3d.wrap.clamp_repeat_repeat
 dEQP-GLES3.functional.texture.vertex.3d.wrap.mirror_clamp_clamp
 dEQP-GLES3.functional.texture.vertex.3d.wrap.mirror_clamp_mirror
 dEQP-GLES3.functional.texture.vertex.3d.wrap.mirror_clamp_repeat
 dEQP-GLES3.functional.texture.vertex.3d.wrap.mirror_mirror_mirror
 dEQP-GLES3.functional.texture.vertex.3d.wrap.mirror_mirror_repeat
 dEQP-GLES3.functional.texture.vertex.3d.wrap.mirror_repeat_mirror
 dEQP-GLES3.functional.texture.vertex.3d.wrap.mirror_repeat_repeat
 dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_clamp_clamp
 dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_clamp_mirror
 dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_clamp_repeat
 dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_mirror_clamp
 dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_mirror_mirror
 dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_mirror_repeat
 dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_repeat_clamp
 dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_repeat_mirror
 dEQP-GLES3.functional.texture.vertex.3d.wrap.repeat_repeat_repeat
 dEQP-GLES3.functional.texture.wrap.astc_8x8.repeat_repeat_linear_divisible
 dEQP-GLES3.functional.texture.wrap.astc_8x8.repeat_repeat_linear_not_divisible
 dEQP-GLES3.functional.texture.wrap.astc_8x8_srgb.repeat_repeat_linear_divisible
 dEQP-GLES3.functional.texture.wrap.astc_8x8_srgb.repeat_repeat_linear_not_divisible
 dEQP-GLES3.functional.vertex_arrays.single_attribute.normalize.int2_10_10_10.components4_quads1
 dEQP-GLES3.functional.vertex_arrays.single_attribute.normalize.int2_10_10_10.components4_quads256
 dEQP-GLES31.functional.debug.error_filters.case_29
 dEQP-GLES31.functional.debug.negative_coverage.callbacks.buffer.read_pixels_fbo_format_mismatch
 dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.blit_framebuffer_multisample
 dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.read_pixels_fbo_format_mismatch
 dEQP-GLES31.functional.debug.negative_coverage.log.buffer.read_pixels_fbo_format_mismatch
 dEQP-GLES31.functional.draw_base_vertex.draw_elements_instanced_base_vertex.line_loop.instanced_attributes
 dEQP-GLES31.functional.draw_buffers_indexed.overwrite_indexed.common_color_mask_buffer_color_mask
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.0
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.1
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.10
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.11
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.12
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.14
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.16
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.17
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.19
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.2
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.3
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.4
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.5
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.6
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.7
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.8
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_implementation_draw_buffers.9
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.0
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.1
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.14
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.15
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.16
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.17
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.19
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.2
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.4
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.5
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.7
 dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.9
 dEQP-GLES31.functional.draw_indirect.draw_arrays_indirect.line_strip.multiple_attributes
 dEQP-GLES31.functional.fbo.no_attachments.interaction.17x512ms4_default_16x16ms2
 dEQP-GLES31.functional.fbo.no_attachments.interaction.1x1ms0_default_2048x2048ms4
 dEQP-GLES31.functional.fbo.no_attachments.interaction.2048x2048ms4_default_1x1ms0
 dEQP-GLES31.functional.fbo.no_attachments.interaction.256x256ms0_default_512x512ms2
 dEQP-GLES31.functional.fbo.no_attachments.interaction.256x256ms2_default_128x512ms0
 dEQP-GLES31.functional.fbo.no_attachments.multisample.samples2
 dEQP-GLES31.functional.fbo.no_attachments.multisample.samples3
 dEQP-GLES31.functional.fbo.no_attachments.multisample.samples4
 dEQP-GLES31.functional.fbo.no_attachments.random.1
 dEQP-GLES31.functional.fbo.no_attachments.random.11
 dEQP-GLES31.functional.fbo.no_attachments.random.14
 dEQP-GLES31.functional.fbo.no_attachments.random.15
 dEQP-GLES31.functional.fbo.no_attachments.random.4
 dEQP-GLES31.functional.fbo.no_attachments.random.9
 dEQP-GLES31.functional.geometry_shading.query.primitives_generated_amplification
 dEQP-GLES31.functional.geometry_shading.query.primitives_generated_instanced
 dEQP-GLES31.functional.geometry_shading.query.primitives_generated_no_amplification
 dEQP-GLES31.functional.geometry_shading.query.primitives_generated_no_geometry
 dEQP-GLES31.functional.geometry_shading.query.primitives_generated_partial_primitives
 dEQP-GLES31.functional.image_load_store.early_fragment_tests.early_fragment_tests_stencil
 dEQP-GLES31.functional.image_load_store.early_fragment_tests.early_fragment_tests_stencil_fbo
 dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth
 dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo
 dEQP-GLES31.functional.shaders.opaque_type_indexing.ubo.dynamically_uniform_geometry
 dEQP-GLES31.functional.state_query.integer.max_framebuffer_samples_getfloat
 dEQP-GLES31.functional.state_query.integer.max_framebuffer_samples_getinteger
 dEQP-GLES31.functional.state_query.integer.max_framebuffer_samples_getinteger64
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_format_float
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_format_integer
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_format_pure_int
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_format_pure_uint
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_levels_float
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_levels_integer
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_levels_pure_int
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.texture_immutable_levels_pure_uint
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_format_float
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_format_integer
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_format_pure_int
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_format_pure_uint
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_levels_float
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_levels_integer
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_levels_pure_int
 dEQP-GLES31.functional.state_query.texture.texture_2d_multisample_array.texture_immutable_levels_pure_uint
 dEQP-GLES31.functional.texture.border_clamp.depth_compare_mode.depth32f_stencil8.linear_size_npot
 dEQP-GLES31.functional.texture.border_clamp.depth_compare_mode.depth32f_stencil8.linear_size_pot
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_linear_clamp_repeat
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_linear_mirror_repeat
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_linear_repeat_clamp
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_linear_repeat_mirror
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_linear_repeat_repeat
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_linear_clamp_repeat
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_linear_mirror_repeat
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_linear_repeat_clamp
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_linear_repeat_repeat
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_nearest_clamp_repeat
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_nearest_mirror_repeat
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_nearest_repeat_clamp
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_linear_nearest_repeat_mirror
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_nearest_linear_clamp_repeat
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_nearest_linear_mirror_repeat
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_nearest_linear_repeat_clamp
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_nearest_linear_repeat_mirror
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_nearest_nearest_repeat_clamp
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_nearest_nearest_repeat_mirror
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_nearest_nearest_repeat_repeat
 dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_nearest_repeat_mirror
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb10_a2_linear_mipmap_linear
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb10_a2_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb10_a2_nearest_mipmap_linear
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb10_a2_nearest_mipmap_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb565_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb565_nearest_mipmap_linear
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb5_a1_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb5_a1_nearest_mipmap_linear
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb9_e5_nearest_mipmap_linear
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgb9_e5_nearest_mipmap_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba16f_nearest_mipmap_linear
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba16f_nearest_mipmap_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba4_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba4_nearest_mipmap_linear
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba8_nearest_mipmap_linear
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba8_nearest_mipmap_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba8_snorm_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.rgba8_snorm_nearest_mipmap_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.sr8_nearest_mipmap_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.srgb8_alpha8_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.formats.srgb8_alpha8_nearest_mipmap_linear
 dEQP-GLES31.functional.texture.filtering.cube_array.sizes.128x128x12_linear
 dEQP-GLES31.functional.texture.filtering.cube_array.sizes.128x128x12_linear_mipmap_linear
 dEQP-GLES31.functional.texture.filtering.cube_array.sizes.128x128x12_linear_mipmap_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.sizes.128x128x12_nearest_mipmap_linear
 dEQP-GLES31.functional.texture.filtering.cube_array.sizes.128x128x12_nearest_mipmap_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.sizes.63x63x18_nearest_mipmap_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.sizes.64x64x12_nearest_mipmap_linear
 dEQP-GLES31.functional.texture.filtering.cube_array.sizes.64x64x12_nearest_mipmap_nearest
 dEQP-GLES31.functional.texture.filtering.cube_array.sizes.8x8x6_nearest
 dEQP-GLES31.functional.texture.gather.basic.2d.rgba8i.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.basic.2d.rgba8i.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.basic.2d.rgba8i.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.basic.2d.rgba8i.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.basic.2d.rgba8ui.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.basic.2d.rgba8ui.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.basic.2d.rgba8ui.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.basic.2d.rgba8ui.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8i.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8i.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8i.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8i.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8ui.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8ui.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8ui.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.basic.2d_array.rgba8ui.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8.no_corners.size_pot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8.no_corners.size_pot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8.no_corners.size_pot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i.no_corners.size_pot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i.no_corners.size_pot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i.no_corners.size_pot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8ui.no_corners.size_pot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8ui.no_corners.size_pot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8ui.no_corners.size_pot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8ui.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8ui.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8ui.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.basic.cube.rgba8ui.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8i.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8i.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8i.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8i.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8ui.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8ui.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8ui.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d.rgba8ui.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8i.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8i.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8i.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8i.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8ui.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8ui.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8ui.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.offset.implementation_offset.2d_array.rgba8ui.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.base_level.level_1
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.base_level.level_2
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.filter_mode.min_linear_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.filter_mode.min_linear_mipmap_linear_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.filter_mode.min_linear_mipmap_nearest_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.filter_mode.min_nearest_mipmap_linear_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.filter_mode.min_nearest_mipmap_nearest_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_npot.compare_greater.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_npot.compare_greater.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_npot.compare_greater.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_npot.compare_less.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_npot.compare_less.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_npot.compare_less.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_pot.compare_greater.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_pot.compare_greater.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_pot.compare_greater.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_pot.compare_less.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_pot.compare_less.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.depth32f.size_pot.compare_less.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.base_level.level_1
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.base_level.level_2
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.filter_mode.min_linear_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.filter_mode.min_linear_mipmap_linear_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.filter_mode.min_linear_mipmap_nearest_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.filter_mode.min_nearest_mipmap_linear_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.filter_mode.min_nearest_mipmap_nearest_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.size_npot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.size_npot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.size_npot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.size_pot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.size_pot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.size_pot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.texture_swizzle.green_blue_alpha_zero
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.texture_swizzle.red_green_blue_alpha
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.base_level.level_1
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.base_level.level_2
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.filter_mode.min_nearest_mipmap_nearest_mag_nearest
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.size_npot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.size_npot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.size_npot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.size_pot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.size_pot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.size_pot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.texture_swizzle.green_blue_alpha_zero
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.texture_swizzle.red_green_blue_alpha
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8i.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.base_level.level_1
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.base_level.level_2
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.filter_mode.min_nearest_mipmap_nearest_mag_nearest
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.size_npot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.size_npot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.size_npot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.size_pot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.size_pot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.size_pot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.texture_swizzle.green_blue_alpha_zero
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.texture_swizzle.red_green_blue_alpha
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d.rgba8ui.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.base_level.level_1
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.base_level.level_2
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.filter_mode.min_linear_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.filter_mode.min_linear_mipmap_linear_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.filter_mode.min_linear_mipmap_nearest_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.filter_mode.min_nearest_mipmap_linear_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.filter_mode.min_nearest_mipmap_nearest_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_npot.compare_greater.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_npot.compare_greater.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_npot.compare_greater.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_npot.compare_less.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_npot.compare_less.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_npot.compare_less.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_pot.compare_greater.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_pot.compare_greater.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_pot.compare_greater.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_pot.compare_less.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_pot.compare_less.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.depth32f.size_pot.compare_less.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.base_level.level_1
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.base_level.level_2
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.filter_mode.min_linear_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.filter_mode.min_linear_mipmap_linear_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.filter_mode.min_linear_mipmap_nearest_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.filter_mode.min_nearest_mipmap_linear_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.filter_mode.min_nearest_mipmap_nearest_mag_linear
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.size_npot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.size_npot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.size_npot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.size_pot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.size_pot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.size_pot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.texture_swizzle.green_blue_alpha_zero
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.texture_swizzle.red_green_blue_alpha
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.base_level.level_1
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.base_level.level_2
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.filter_mode.min_nearest_mipmap_nearest_mag_nearest
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.size_npot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.size_npot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.size_npot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.size_pot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.size_pot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.size_pot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.texture_swizzle.green_blue_alpha_zero
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.texture_swizzle.red_green_blue_alpha
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8i.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.base_level.level_1
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.base_level.level_2
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.filter_mode.min_nearest_mipmap_nearest_mag_nearest
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.size_npot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.size_npot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.size_npot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.size_pot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.size_pot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.size_pot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.texture_swizzle.green_blue_alpha_zero
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.texture_swizzle.red_green_blue_alpha
 dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.2d_array.rgba8ui.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_npot.compare_greater.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_npot.compare_greater.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_npot.compare_greater.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_npot.compare_less.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_npot.compare_less.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_npot.compare_less.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_pot.compare_greater.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_pot.compare_greater.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_pot.compare_greater.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_pot.compare_less.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_pot.compare_less.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_pot.compare_less.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_npot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_npot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_npot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_pot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_pot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_pot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8i.size_npot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8i.size_npot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8i.size_npot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8i.size_pot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8i.size_pot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8i.size_pot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8ui.size_npot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8ui.size_npot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8ui.size_npot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8ui.size_pot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8ui.size_pot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8ui.size_pot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_npot.compare_greater.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_npot.compare_greater.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_npot.compare_greater.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_npot.compare_less.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_npot.compare_less.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_npot.compare_less.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_pot.compare_greater.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_pot.compare_greater.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_pot.compare_greater.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_pot.compare_less.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_pot.compare_less.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.depth32f.size_pot.compare_less.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8.size_npot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8.size_npot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8.size_npot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8.size_pot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8.size_pot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8.size_pot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8i.size_npot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8i.size_npot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8i.size_npot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8i.size_pot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8i.size_pot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8i.size_pot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8ui.size_npot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8ui.size_npot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8ui.size_npot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8ui.size_pot.clamp_to_edge_repeat
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8ui.size_pot.mirrored_repeat_clamp_to_edge
 dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d_array.rgba8ui.size_pot.repeat_mirrored_repeat
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8i.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8i.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8i.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8i.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8ui.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8ui.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8ui.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d.rgba8ui.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8i.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8i.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8i.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8i.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8ui.texture_swizzle.alpha_zero_one_red
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8ui.texture_swizzle.blue_alpha_zero_one
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8ui.texture_swizzle.one_red_green_blue
 dEQP-GLES31.functional.texture.gather.offsets.implementation_offset.2d_array.rgba8ui.texture_swizzle.zero_one_red_green
 dEQP-GLES31.functional.texture.multisample.samples_1.sample_mask_and_alpha_to_coverage
 dEQP-GLES31.functional.texture.multisample.samples_1.sample_mask_and_sample_coverage
 dEQP-GLES31.functional.texture.multisample.samples_1.sample_mask_and_sample_coverage_and_alpha_to_coverage
 dEQP-GLES31.functional.texture.multisample.samples_1.sample_mask_non_effective_bits
 dEQP-GLES31.functional.texture.multisample.samples_1.sample_mask_only

16

.gitlab-ci/deqp-softpipe-skips.txt Normal file

@@ -0,0 +1,16 @@
 # Note: skips lists for CI are just a list of lines that, when
 # non-zero-length and not starting with '#', will regex match to
 # delete lines from the test list.  Be careful.
 # Skip the perf/stress tests to keep runtime manageable
 dEQP-GLES[0-9]*.performance.*
 dEQP-GLES[0-9]*.stress.*
 # These are really slow on tiling architectures (including llvmpipe).
 dEQP-GLES[0-9]*.functional.flush_finish.*
 # Random failures
 dEQP-GLES31.functional.shaders.builtin_functions.*geometry
 dEQP-GLES31.functional.fbo.no_attachments.maximums.all
 dEQP-GLES31.functional.fbo.no_attachments.maximums.size
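The header comment of the skips file describes a simple filter: every non-empty line that does not start with '#' is a regex, and any test name it matches is dropped from the test list. A minimal Python sketch of that behaviour follows. It assumes grep-like "match anywhere in the line" semantics; the helper names and the sample test names are hypothetical, and the real runner script may apply the patterns differently.

```python
import re


def load_skip_patterns(text):
    """Return the regexes in a skips file, ignoring blank lines and '#' comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def apply_skips(tests, patterns):
    """Drop every test name that any skip regex matches (search, not full match)."""
    compiled = [re.compile(p) for p in patterns]
    return [t for t in tests if not any(c.search(t) for c in compiled)]


SKIPS = """\
# Skip the perf/stress tests to keep runtime manageable
dEQP-GLES[0-9]*.performance.*
dEQP-GLES[0-9]*.stress.*

dEQP-GLES31.functional.fbo.no_attachments.maximums.all
"""

# Hypothetical test names, for illustration only.
TESTS = [
    "dEQP-GLES2.performance.shaders.foo",
    "dEQP-GLES31.functional.fbo.no_attachments.maximums.all",
    "dEQP-GLES31.functional.fbo.no_attachments.maximums.all_extra",
    "dEQP-GLES31.functional.fbo.no_attachments.maximums.size",
    "dEQP-GLES3.functional.texture.basic",
]

remaining = apply_skips(TESTS, load_skip_patterns(SKIPS))
print(remaining)
```

Under these assumptions the unanchored `maximums.all` pattern also removes the hypothetical `maximums.all_extra` entry, which is the kind of over-matching the "Be careful" note warns about.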

									
45

.gitlab-ci/generate_lava.py Executable file

@@ -0,0 +1,45 @@
 #!/usr/bin/env python3
 from jinja2 import Environment, FileSystemLoader
 import argparse
 import os
 parser = argparse.ArgumentParser()
 parser.add_argument("--template")
 parser.add_argument("--pipeline-info")
 parser.add_argument("--base-artifacts-url")
 parser.add_argument("--device-type")
 parser.add_argument("--kernel-image-name")
 parser.add_argument("--kernel-image-type", nargs='?', default="")
 parser.add_argument("--gpu-version")
 parser.add_argument("--boot-method")
 parser.add_argument("--lava-tags", nargs='?', default="")
 parser.add_argument("--env-vars", nargs='?', default="")
 parser.add_argument("--deqp-version")
 parser.add_argument("--arch")
 parser.add_argument("--ci-node-index")
 parser.add_argument("--ci-node-total")
 args = parser.parse_args()
 env = Environment(loader = FileSystemLoader(os.path.dirname(args.template)), trim_blocks=True, lstrip_blocks=True)
 template = env.get_template(os.path.basename(args.template))
 values = {}
 values['pipeline_info'] = args.pipeline_info
 values['base_artifacts_url'] = args.base_artifacts_url
 values['device_type'] = args.device_type
 values['kernel_image_name'] = args.kernel_image_name
 values['kernel_image_type'] = args.kernel_image_type
 values['gpu_version'] = args.gpu_version
 values['boot_method'] = args.boot_method
 values['tags'] = args.lava_tags
 values['env_vars'] = args.env_vars
 values['deqp_version'] = args.deqp_version
 values['arch'] = args.arch
 values['ci_node_index'] = args.ci_node_index
 values['ci_node_total'] = args.ci_node_total
 f = open('lava-deqp.yml', "w")
 f.write(template.render(values))
 f.close()
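The optional arguments in generate_lava.py are declared with `nargs='?'` and `default=""`. The sketch below, using only the standard library, shows what value such an argument takes in the three possible cases. It reuses the `--lava-tags` flag from the script; the comma separator in the last line is an assumption for illustration and is not something the script does.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--lava-tags", nargs='?', default="")

# Flag not passed at all: the default applies.
absent = parser.parse_args([]).lava_tags
# Flag passed with no value: argparse falls back to `const`, which is None here.
bare = parser.parse_args(["--lava-tags"]).lava_tags
# Flag passed with a value: the value is stored as a plain string.
given = parser.parse_args(["--lava-tags", "panfrost"]).lava_tags

print(repr(absent), repr(bare), repr(given))

# The stored value is a string, so iterating it yields single characters.
# A template loop over the same value would see letters, not whole tags,
# unless the string is split first (the separator here is hypothetical).
chars = list(given)
tags = given.split(",")
print(chars[:3], tags)
```

Both the empty string and `None` are falsy, so a template guard such as `{% if tags %}` skips the block in the first two cases.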

93

.gitlab-ci/lava-deqp.yml.jinja2 Normal file

@@ -0,0 +1,93 @@
 job_name: mesa-deqp-{{ gpu_version }} {{ pipeline_info }}
 device_type: {{ device_type }}
 timeouts:
   job:
     minutes: 40
   action:
    minutes: 10
   actions:
     power-off:
       seconds: 30
 priority: 75
 visibility: public
 {% if tags %}
 tags:
 {% for tag in tags %}
   - {{ tag }}
 {% endfor %}
 {% endif %}
 actions:
 - deploy:
     timeout:
       minutes: 10
     to: tftp
     kernel:
       url: {{ base_artifacts_url }}/{{ kernel_image_name }}
 {% if kernel_image_type %}
       {{ kernel_image_type }}
 {% endif %}
     ramdisk:
       url: {{ base_artifacts_url }}/lava-rootfs-{{ arch }}.cpio.gz
       compression: gz
     dtb:
       url: {{ base_artifacts_url }}/{{ device_type }}.dtb
     os: oe
 - boot:
     timeout:
       minutes: 5
     method: {{ boot_method }}
     commands: ramdisk
     prompts:
       - '#'
 - test:
     timeout:
       minutes: 60
     definitions:
     - repository:
         metadata:
           format: Lava-Test Test Definition 1.0
           name: deqp
           description: "Mesa dEQP test plan"
           os:
           - oe
           scope:
           - functional
         run:
           steps:
           - mount -t proc none /proc
           - mount -t sysfs none /sys
           - mount -t devtmpfs none /dev
           - mkdir -p /dev/pts
           - mount -t devpts devpts /dev/pts
 {% if env_vars %}
           - export {{ env_vars }}
 {% endif %}
           - export DEQP_NO_SAVE_RESULTS=1
           - 'export DEQP_RUNNER_OPTIONS="--compact-display false --shuffle false"'
           - export DEQP_EXPECTED_FAILS=deqp-{{ gpu_version }}-fails.txt
           - export DEQP_SKIPS=deqp-{{ gpu_version }}-skips.txt
           - export DEQP_VER={{ deqp_version }}
           - export LIBGL_DRIVERS_PATH=`pwd`/install/lib/dri
           - export CI_NODE_INDEX={{ ci_node_index }}
           - export CI_NODE_TOTAL={{ ci_node_total }}
           # Put stuff where the runner script expects it
           - mkdir artifacts
           - mkdir results
           - mkdir -p install/lib
           - cp /deqp/$DEQP_EXPECTED_FAILS artifacts/.
           - cp /deqp/$DEQP_SKIPS artifacts/.
           - mv /mesa/lib/* install/lib/.
           - "if sh /deqp/deqp-runner.sh; then
                   echo 'deqp: pass';
              else
                   echo 'deqp: fail';
              fi"
         parse:
           pattern: '(?P<test_case_id>\S*):\s+(?P<result>(pass|fail))'
       from: inline
       name: deqp
       path: inline/mesa-deqp.yaml
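The `parse` pattern in the template turns a line such as `deqp: pass`, echoed by the final run step, into a named test case with a result. A small sketch of how that regex behaves, using Python's `re` module, is shown below. The `parse_results` helper and the sample log are hypothetical, and searching line by line is an assumption about how the pattern is applied.

```python
import re

# Pattern copied from the job template's `parse` section.
PATTERN = re.compile(r'(?P<test_case_id>\S*):\s+(?P<result>(pass|fail))')


def parse_results(log_text):
    """Map each test case id found in the log to its 'pass' or 'fail' result."""
    results = {}
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            results[match.group("test_case_id")] = match.group("result")
    return results


# Hypothetical log excerpt.
LOG = "starting run\ndeqp: pass\nsome other output\n"
results = parse_results(LOG)
print(results)
```

The lava-gitlab-ci.yml job later queries a single `deqp` test case, which matches the one result line this pattern is written to catch.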

									
122

.gitlab-ci/lava-gitlab-ci.yml Normal file

@@ -0,0 +1,122 @@
 .lava-test:
   extends:
     - .ci-run-policy
   stage: test
   variables:
     GIT_STRATEGY: none # testing doesn't build anything from source
     ENV_VARS: "MESA_GLES_VERSION_OVERRIDE=3.0 DEQP_PARALLEL=6"
   script:
     - BUILD_JOB_ID=`cat artifacts/build_job_id.txt`
     - >
       artifacts/generate_lava.py \
         --template artifacts/lava-deqp.yml.jinja2 \
         --pipeline-info "$CI_PIPELINE_URL on $CI_COMMIT_REF_NAME ${CI_NODE_INDEX}/${CI_NODE_TOTAL}" \
         --base-artifacts-url $CI_PROJECT_URL/-/jobs/$BUILD_JOB_ID/artifacts/raw/artifacts \
         --device-type ${DEVICE_TYPE} \
         --env-vars "${ENV_VARS}" \
         --arch ${ARCH} \
         --deqp-version gles2 \
         --kernel-image-name ${KERNEL_IMAGE_NAME} \
         --kernel-image-type "${KERNEL_IMAGE_TYPE}" \
         --gpu-version ${GPU_VERSION} \
         --boot-method ${BOOT_METHOD} \
         --lava-tags "${LAVA_TAGS}" \
         --ci-node-index "${CI_NODE_INDEX}" \
         --ci-node-total "${CI_NODE_TOTAL}"
     - lava_job_id=`lavacli jobs submit lava-deqp.yml`
     - echo $lava_job_id
     - rm -rf artifacts/*
     - cp lava-deqp.yml artifacts/.
     - lavacli jobs logs $lava_job_id | grep -a -v "{'case':" | tee artifacts/lava-deqp-$lava_job_id.log
     - lavacli jobs show $lava_job_id
     - result=`lavacli results $lava_job_id 0_deqp deqp | head -1`
     - echo $result
     - '[[ "$result" == "pass" ]]'
   artifacts:
     when: always
     paths:
       - artifacts/
 .lava-test:armhf:
   variables:
     ARCH: armhf
     KERNEL_IMAGE_NAME: zImage
     KERNEL_IMAGE_TYPE: "type:\ zimage"
     BOOT_METHOD: u-boot
   extends:
     - .lava-test
     - .use-arm_build
   dependencies:
     - meson-armhf
   needs:
     - meson-armhf
 .lava-test:arm64:
   variables:
     ARCH: arm64
     KERNEL_IMAGE_NAME: Image
     KERNEL_IMAGE_TYPE: "type:\ image"
     BOOT_METHOD: u-boot
   extends:
     - .lava-test
     - .use-arm_build
   dependencies:
     - meson-arm64
   needs:
     - meson-arm64
 panfrost-t720-test:arm64:
   extends: .lava-test:arm64
   variables:
     DEVICE_TYPE: sun50i-h6-pine-h64
     GPU_VERSION: panfrost-t720
   tags:
     - lava-sun50i-h6-pine-h64
 panfrost-t760-test:armhf:
   extends: .lava-test:armhf
   variables:
     DEVICE_TYPE: rk3288-veyron-jaq
     GPU_VERSION: panfrost-t760
     BOOT_METHOD: depthcharge
     KERNEL_IMAGE_TYPE: ""
   tags:
     - lava-rk3288-veyron-jaq
 panfrost-t860-test:arm64:
   extends: .lava-test:arm64
   variables:
     DEVICE_TYPE: rk3399-gru-kevin
     GPU_VERSION: panfrost-t860
     BOOT_METHOD: depthcharge
     KERNEL_IMAGE_TYPE: ""
   tags:
     - lava-rk3399-gru-kevin
 .panfrost-t820-test:arm64:
   extends: .lava-test:arm64
   variables:
     DEVICE_TYPE: meson-gxm-khadas-vim2
     GPU_VERSION: panfrost-t820
     LAVA_TAGS: panfrost
   tags:
     - lava-meson-gxm-khadas-vim2
 .lima-mali400-test:armhf:
   parallel: 2
   extends: .lava-test:armhf
   variables:
     DEVICE_TYPE: sun8i-h3-libretech-all-h3-cc
     GPU_VERSION: lima
     ENV_VARS: "DEQP_PARALLEL=3"
   tags:
     - lava-sun8i-h3-libretech-all-h3-cc
 lima-mali450-test:arm64:
   extends: .lava-test:arm64
   variables:
     DEVICE_TYPE: meson-gxl-s905x-libretech-cc
     GPU_VERSION: lima
     ENV_VARS: "DEQP_PARALLEL=6"
   tags:
     - lava-meson-gxl-s905x-libretech-cc

									
13

.gitlab-ci/meson-build.bat Normal file

@@ -0,0 +1,13 @@
 call "C:\Program Files (x86)\Microsoft Visual Studio\%VERSION%\Common7\Tools\VsDevCmd.bat" -arch=%ARCH%
 del /Q /S _build
 meson _build ^
         -Dbuild-tests=true ^
         -Db_vscrt=mtd ^
         -Dbuildtype=release ^
         -Dllvm=false ^
         -Dgallium-drivers=swrast ^
         -Dosmesa=gallium
 meson configure _build
 ninja -C _build
 ninja -C _build test

									
64

.gitlab-ci/meson-build.sh Executable file

@@ -0,0 +1,64 @@
 #!/bin/bash
 set -e
 set -o xtrace
 CROSS_FILE=/cross_file-"$CROSS".txt
 # We need to control the version of llvm-config we're using, so we'll
 # tweak the cross file or generate a native file to do so.
 if test -n "$LLVM_VERSION"; then
     LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
     echo -e "[binaries]\nllvm-config = '`which $LLVM_CONFIG`'" > native.file
     if [ -n "$CROSS" ]; then
         sed -i -e '/\[binaries\]/a\' -e "llvm-config = '`which $LLVM_CONFIG`'" $CROSS_FILE
     fi
     $LLVM_CONFIG --version
 else
     rm -f native.file
     touch native.file
 fi
 # cross-xfail-$CROSS, if it exists, contains a list of tests that are expected
 # to fail for the $CROSS configuration, one per line. you can then mark those
 # tests in their meson.build with:
 #
 # test(...,
 #      should_fail: meson.get_cross_property('xfail', '').contains(t),
 #     )
 #
 # where t is the name of the test, and the '' is the string to search when
 # not cross-compiling (which is empty, because for amd64 everything is
 # expected to pass).
 if [ -n "$CROSS" ]; then
     CROSS_XFAIL=.gitlab-ci/cross-xfail-"$CROSS"
     if [ -s "$CROSS_XFAIL" ]; then
         sed -i \
             -e '/\[properties\]/a\' \
             -e "xfail = '$(tr '\n' , < $CROSS_XFAIL)'" \
             "$CROSS_FILE"
     fi
 fi
 rm -rf _build
 meson _build --native-file=native.file \
       --wrap-mode=nofallback \
       ${CROSS+--cross "$CROSS_FILE"} \
       -D prefix=`pwd`/install \
       -D libdir=lib \
       -D buildtype=${BUILDTYPE:-debug} \
       -D build-tests=true \
       -D libunwind=${UNWIND} \
       ${DRI_LOADERS} \
       -D dri-drivers=${DRI_DRIVERS:-[]} \
       ${GALLIUM_ST} \
       -D gallium-drivers=${GALLIUM_DRIVERS:-[]} \
       -D vulkan-drivers=${VULKAN_DRIVERS:-[]} \
       -D I-love-half-baked-turnips=true \
       ${EXTRA_OPTION}
 cd _build
 meson configure
 ninja -j4
 LC_ALL=C.UTF-8 ninja test
 ninja install
 cd ..
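The cross-xfail comment in meson-build.sh describes a two-step mechanism: the per-line list of expected failures is flattened into one comma-joined string (`tr '\n' ,`), and each test then asks whether its name is contained in that string. A rough Python model of both steps is sketched below. The function names and test names are hypothetical, and the containment check is modelled as a plain substring test, which is how Meson's string `contains()` is understood to behave.

```python
def xfail_property(file_text):
    """Model `tr '\\n' ,`: every newline in the xfail file becomes a comma."""
    return file_text.replace("\n", ",")


def should_fail(test_name, xfail):
    """Model `xfail.contains(t)` as a substring check on the joined string."""
    return test_name in xfail


# Hypothetical contents of a cross-xfail-$CROSS file, one test per line.
XFAIL_FILE = "roundtrip_test\nu_format_test\n"

xfail = xfail_property(XFAIL_FILE)
print(xfail)

print(should_fail("u_format_test", xfail))
print(should_fail("glsl_test", xfail))
# Native builds use the empty default, so nothing is expected to fail.
print(should_fail("u_format_test", ""))
```

Because the check is a substring match, a test whose name is a prefix of a listed one (for example a hypothetical `roundtrip`) would also be marked as expected to fail.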

									
36

.gitlab-ci/piglit/disable-vs_in.diff Normal file

@@ -0,0 +1,36 @@
 diff --git a/generated_tests/CMakeLists.txt b/generated_tests/CMakeLists.txt
 index 738526546..6f89048cd 100644
 --- a/generated_tests/CMakeLists.txt
 +++ b/generated_tests/CMakeLists.txt
 @@ -206,11 +206,6 @@ piglit_make_generated_tests(
 	templates/gen_variable_index_write_tests/vs.shader_test.mako
 	templates/gen_variable_index_write_tests/fs.shader_test.mako
 	templates/gen_variable_index_write_tests/helpers.mako)
 -piglit_make_generated_tests(
 -	vs_in_fp64.list
 -	gen_vs_in_fp64.py
 -	templates/gen_vs_in_fp64/columns.shader_test.mako
 -	templates/gen_vs_in_fp64/regular.shader_test.mako)
  piglit_make_generated_tests(
 	shader_framebuffer_fetch_tests.list
 	gen_shader_framebuffer_fetch_tests.py)
 @@ -279,7 +274,6 @@ add_custom_target(gen-gl-tests
 			gen_extensions_defined.list
 			vp-tex.list
 			variable_index_write_tests.list
 -			vs_in_fp64.list
 			gpu_shader4_tests.list
  )
 diff --git a/tests/sanity.py b/tests/sanity.py
 index 12f1614c9..9019087e2 100644
 --- a/tests/sanity.py
 +++ b/tests/sanity.py
 @@ -100,7 +100,6 @@ shader_tests = (
      'spec/arb_tessellation_shader/execution/barrier-patch.shader_test',
      'spec/arb_tessellation_shader/execution/built-in-functions/tcs-any-bvec4-using-if.shader_test',
      'spec/arb_tessellation_shader/execution/sanity.shader_test',
 -    'spec/arb_vertex_attrib_64bit/execution/vs_in/vs-input-uint_uvec4-double_dmat3x4_array2-position.shader_test',
      'spec/glsl-1.50/execution/geometry-basic.shader_test',
      'spec/oes_viewport_array/viewport-gs-write-simple.shader_test',
  )

5117

.gitlab-ci/piglit/glslparser.txt Normal file

File diff suppressed because it is too large

2220

.gitlab-ci/piglit/quick_gl.txt Normal file

File diff suppressed because it is too large

6418

.gitlab-ci/piglit/quick_shader.txt Normal file

File diff suppressed because it is too large

									
										29

.gitlab-ci/piglit/run.sh
									
										Executable file
									
												View File
												
				@@ -0,0 +1,29 @@

				#!/bin/bash

				set -e

				set -o xtrace

				VERSION=`cat artifacts/VERSION`

				cd /piglit

				PIGLIT_OPTIONS=$(echo $PIGLIT_OPTIONS | head -n 1)

				xvfb-run --server-args="-noreset" sh -c \

				         "export LD_LIBRARY_PATH=$OLDPWD/install/lib;

				         wflinfo --platform glx --api gl --profile core | grep \"Mesa $VERSION\\\$\" &&

				         ./piglit run -j4 $PIGLIT_OPTIONS $PIGLIT_PROFILES $OLDPWD/results"

				PIGLIT_RESULTS=${PIGLIT_RESULTS:-$PIGLIT_PROFILES}

				mkdir -p .gitlab-ci/piglit

				cp $OLDPWD/artifacts/piglit/$PIGLIT_RESULTS.txt .gitlab-ci/piglit/$PIGLIT_RESULTS.txt.baseline

				./piglit summary console $OLDPWD/results | head -n -1 | grep -v ": pass" >.gitlab-ci/piglit/$PIGLIT_RESULTS.txt

				if diff -q .gitlab-ci/piglit/$PIGLIT_RESULTS.txt{.baseline,}; then

				    exit 0

				fi

				./piglit summary html --exclude-details=pass $OLDPWD/summary $OLDPWD/results

				echo Unexpected change in results:

				diff -u .gitlab-ci/piglit/$PIGLIT_RESULTS.txt{.baseline,}

				exit 1
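
The baseline check that `run.sh` performs can be sketched in Python. This is an illustrative sketch only and not part of the commit: the function names `non_pass_lines` and `compare_to_baseline` are hypothetical, and the sample result lines in the usage note are made up, so the real `piglit summary console` output may look different.

```python
import difflib
from typing import List


def non_pass_lines(summary: str) -> List[str]:
    """Keep only the non-passing results from a console summary."""
    lines = summary.splitlines()[:-1]  # drop the last line, like `head -n -1`
    return [line for line in lines if ": pass" not in line]  # like `grep -v ": pass"`


def compare_to_baseline(baseline: List[str], current: List[str]) -> List[str]:
    """Return a unified diff; an empty list means the results match the baseline."""
    return list(difflib.unified_diff(baseline, current,
                                     "baseline", "current", lineterm=""))
```

For example, `compare_to_baseline(["b: fail"], non_pass_lines("a: pass\nb: fail\nc: skip\ntotal\n"))` yields a diff containing `+c: skip`, which corresponds to the case where the script prints "Unexpected change in results" and exits with status 1.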

									
										59

.gitlab-ci/prepare-artifacts.sh
									
										Executable file
									
												
				@@ -0,0 +1,59 @@

				#!/bin/bash

				set -e

				set -o xtrace

				CROSS_FILE=/cross_file-"$CROSS".txt

				# Delete unused bin and includes from artifacts to save space.

				rm -rf install/bin install/include

				# Strip the drivers in the artifacts to cut 80% of the artifacts size.

				if [ -n "$CROSS" ]; then

				    STRIP=`sed -n -E "s/strip\s*=\s*'(.*)'/\1/p" "$CROSS_FILE"`

				    if [ -z "$STRIP" ]; then

				        echo "Failed to find strip command in cross file"

				        exit 1

				    fi

				else

				    STRIP="strip"

				fi

				find install -name \*.so -exec $STRIP {} \;

				# Test runs don't pull down the git tree, so put the dEQP helper

				# script and associated bits there.

				mkdir -p artifacts/

				cp VERSION artifacts/

				cp -Rp .gitlab-ci/deqp* artifacts/

				cp -Rp .gitlab-ci/piglit artifacts/

				# Tar up the install dir so that symlinks and hardlinks aren't each

				# packed separately in the zip file.

				tar -cf artifacts/install.tar install

				# If the container has LAVA stuff, prepare the artifacts for LAVA jobs

				if [ -d /lava-files ]; then

				        # Copy kernel and device trees for LAVA

				        cp /lava-files/*Image artifacts/.

				        cp /lava-files/*.dtb artifacts/.

				        # Pack ramdisk for LAVA

				        mkdir -p /lava-files/rootfs-${CROSS:-arm64}/mesa

				        cp -a install/* /lava-files/rootfs-${CROSS:-arm64}/mesa/.

				        cp .gitlab-ci/deqp-runner.sh /lava-files/rootfs-${CROSS:-arm64}/deqp/.

				        cp .gitlab-ci/deqp-*-fails.txt /lava-files/rootfs-${CROSS:-arm64}/deqp/.

				        cp .gitlab-ci/deqp-*-skips.txt /lava-files/rootfs-${CROSS:-arm64}/deqp/.

				        find /lava-files/rootfs-${CROSS:-arm64}/ -type f -printf "%s\t%i\t%p\n" | sort -n | tail -100

				        pushd /lava-files/rootfs-${CROSS:-arm64}/

				        find -H  |  cpio -H newc -o | gzip -c - > $CI_PROJECT_DIR/artifacts/lava-rootfs-${CROSS:-arm64}.cpio.gz

				        popd

				        # Store job ID so the test stage can build URLs to the artifacts

				        echo $CI_JOB_ID > artifacts/build_job_id.txt

				        # Pass needed files to the test stage

				        cp $CI_PROJECT_DIR/.gitlab-ci/generate_lava.py artifacts/.

				        cp $CI_PROJECT_DIR/.gitlab-ci/lava-deqp.yml.jinja2 artifacts/.

				fi
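
The `sed` expression that `prepare-artifacts.sh` uses to pull the strip command out of the cross file can be approximated in Python. This is a sketch under stated assumptions: `find_strip` is a hypothetical helper, it returns only the first match, and it returns just the quoted value, whereas `sed` prints every matching line with the matched portion substituted.

```python
import re
from typing import Optional

# Same pattern as the sed expression: strip\s*=\s*'(.*)'
STRIP_RE = re.compile(r"strip\s*=\s*'(.*)'")


def find_strip(cross_file_text: str) -> Optional[str]:
    """Return the value of the `strip` entry in a cross file, or None."""
    for line in cross_file_text.splitlines():
        match = STRIP_RE.search(line)
        if match:
            return match.group(1)
    return None  # the script treats this case as a fatal error
```

Given a cross file containing the line `strip = 'x86_64-w64-mingw32-strip'`, the helper returns `x86_64-w64-mingw32-strip`.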

									
										17

.gitlab-ci/run-shader-db.sh
									
										Executable file
									
												
				@@ -0,0 +1,17 @@

				set -e

				set -v

				ARTIFACTSDIR=`pwd`/shader-db

				mkdir -p $ARTIFACTSDIR

				export DRM_SHIM_DEBUG=true

				LIBDIR=`pwd`/install/lib

				export LD_LIBRARY_PATH=$LIBDIR

				cd /usr/local/shader-db

				for driver in freedreno v3d; do

				    env LD_PRELOAD=$LIBDIR/lib${driver}_noop_drm_shim.so \

				        ./run -j 4 ./shaders \

				            > $ARTIFACTSDIR/${driver}-shader-db.txt

				done

									
										12

.gitlab-ci/scons-build.sh
									
										Executable file
									
												
				@@ -0,0 +1,12 @@

				#!/bin/bash

				set -e

				set -o xtrace

				if test -n "$LLVM_VERSION"; then

				    export LLVM_CONFIG="llvm-config-${LLVM_VERSION}"

				fi

				rm -rf build

				scons $SCONS_TARGET force_scons=on

				eval $SCONS_CHECK_COMMAND

20

.gitlab-ci/x86_64-w64-mingw32 Normal file


@@ -0,0 +1,20 @@
 [binaries]
 c = ['ccache', 'x86_64-w64-mingw32-gcc']
 cpp = ['ccache', 'x86_64-w64-mingw32-g++']
 ar = 'x86_64-w64-mingw32-ar'
 strip = 'x86_64-w64-mingw32-strip'
 pkgconfig = '/usr/local/bin/x86_64-w64-mingw32-pkg-config'
 windres = 'x86_64-w64-mingw32-windres'
 exe_wrapper = ['wine64']
 [properties]
 needs_exe_wrapper = True
 sys_root = '/usr/x86_64-w64-mingw32/'
 [host_machine]
 system = 'windows'
 cpu_family = 'x86_64'
 cpu = 'x86_64'
 endian = 'little'
 ; vim: ft=dosini

20

.mailmap


@@ -26,6 +26,8 @@ Alexander Monakov <amonakov@gmail.com> <amonakov@ispras.ru>
 Alexander von Gluck IV <kallisti5@unixzen.com> Alexander von Gluck <kallisti5@unixzen.com>
 Alexandros Frantzis <alexandros.frantzis@collabora.com> <Alexandros.Frantzis@canonical.com>
 Alex Corscadden <alexc@vmware.com> <alexc@alexc-dev1.prom.eng.vmware.com>
 Alex Corscadden <alexc@vmware.com> <alexc@alexc-dev1.vmware.com>
@@ -50,6 +52,8 @@ Andrew Randrianasulu <randrianasulu@gmail.com> <randrik@mail.ru>
 Arthur Huillet <arthur.huillet@free.fr> Arthur HUILLET <arthur.huillet@free.fr>
 Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> <basni@chromium.org>
 Benjamin Franzke <benjaminfranzke@googlemail.com> ben <benjaminfranzke@googlemail.com>
 Ben Skeggs <bskeggs@redhat.com> <darktama@beleth.(none)>
@@ -129,8 +133,8 @@ David Miller <davem@davemloft.net> David S. Miller <davem@davemloft.net>
 David Miller <davem@davemloft.net> Dave Miller <davem@davemloft.net>
 David Miller <davem@davemloft.net> davem69 <davem69>
-David Heidelberger <david.heidelberger@ixit.cz> David Heidelberg <david@ixit.cz>
-David Heidelberger <david.heidelberger@ixit.cz> <d.okias@gmail.com>
+David Heidelberg <david@ixit.cz> David Heidelberger <david.heidelberger@ixit.cz>
+David Heidelberg <david@ixit.cz> <d.okias@gmail.com>
 David Reveman <reveman@chromium.org> <c99drn@cs.umu.se>
@@ -142,6 +146,8 @@ Dylan Baker <dylanx.c.baker@intel.com> <baker.dylan.c@gmail.com>
 Edward O'Callaghan <funfunctor@folklore1984.net> <eocallaghan@alterapraxis.com>
 Elie Tournier <tournier.elie@gmail.com>
 Emeric Grange <emeric.grange@gmail.com> Emeric <emeric.grange@gmail.com>
 Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.com>
@@ -154,6 +160,7 @@ Emil Velikov <emil.l.velikov@gmail.com> <emmil.velikov@collabora.com>
 Eric Anholt <eric@anholt.net> Eric Anholt <anholt@FreeBSD.org>
 Eric Engestrom <eric@engestrom.ch> <eric.engestrom@imgtec.com>
 Eric Engestrom <eric@engestrom.ch> <eric.engestrom@intel.com>
 Eugeni Dodonov <eugeni.dodonov@intel.com> <eugeni@mandriva.com>
@@ -162,10 +169,14 @@ Fabian Bieler <der.fabe@gmx.net> <&lt;der.fabe@gmx.net&gt>
 Feng, Haitao <haitao.feng@intel.com> Haitao Feng <haitao.feng@intel.com>
 Frank Binns <frank.binns@imgtec.com> <francisbinns@gmail.com>
 Frank Henigman <fjhenigman@google.com> <fjhenigman@chromium.org>
 George Sapountzis <gsapountzis@gmail.com> George Sapountzis <gsap7@yahoo.gr>
 Gert Wollny <gert.wollny@collabora.com> <gw.fossdev@gmail.com>
 Gwenole Beauchesne <gwenole.beauchesne@intel.com> <gb.devel@gmail.com>
 Hamish Marson <hmarson@users.sourceforge.net> hmarson <hmarson>
@@ -184,6 +195,8 @@ Jakob Bornecrantz <wallbraker@gmail.com> <jakob@aurora.(none)>
 Jakob Bornecrantz <wallbraker@gmail.com> <jakob@aurora.walkyrie.se>
 Jakob Bornecrantz <wallbraker@gmail.com> <jakob@tungstengraphics.com>
 Jakob Bornecrantz <wallbraker@gmail.com> <wallbraker 'at' gmail 'dot' com>
 Jakob Bornecrantz <wallbraker@gmail.com> <jakob.bornecrantz@collabora.com>
 Jakob Bornecrantz <wallbraker@gmail.com> <jakob@collabora.com>
 Jakub Bogusz <qboosh@pld-linux.org> <gboosh@pld-linux.org>
@@ -328,6 +341,7 @@ Michel Dänzer <michel@daenzer.net> <daenzer@vmware.com>
 Michel Dänzer <michel@daenzer.net> <michel@tungstengraphics.com>
 Michel Dänzer <michel@daenzer.net> Michel Daenzer <michel.daenzer@amd.com>
 Michel Dänzer <michel@daenzer.net> Michel Daenzer <daenzer@localhost.(none)>
 Michel Dänzer <michel@daenzer.net> <mdaenzer@redhat.com>
 Mike Kaplinskiy <mike.kaplinskiy@gmail.com> Mike Kaplinksiy <mike.kaplinskiy@gmail.com>
 Mike Kaplinskiy <mike.kaplinskiy@gmail.com> <mike.kaplinskiy@gmai.com>
@@ -453,6 +467,8 @@ Tom Fogal <tfogal@alumni.unh.edu> <tfogal@sci.utah.edu>
 Tom Stellard <thomas.stellard@amd.com> <tstellar@gmail.com>
 Tom Stellard <thomas.stellard@amd.com> Thomas Stellard <tom.stellard@amd.com>
 Tomeu Vizoso <tomeu.vizoso@collabora.com> <tomeu@tomeuvizoso.net>
 Tormod Volden <debian.tormod@gmail.com> <lists.tormod@gmail.com>
 Török Edwin <edwin+mesa@etorok.net> Török Edvin <edwintorok@gmail.com>

									
										44

.travis.yml
									
												
				@@ -9,10 +9,25 @@ env:

				  global:

				    - PKG_CONFIG_PATH=""

				matrix:

				  include:

				    - env:

				      - BUILD=meson

				    - env:

				      - BUILD=scons

				before_install:

				  - HOMEBREW_NO_AUTO_UPDATE=1 brew install python3 ninja expat gettext

				  - HOMEBREW_NO_AUTO_UPDATE=1 brew install expat gettext

				  - if test "x$BUILD" = xmeson; then

				      HOMEBREW_NO_AUTO_UPDATE=1 brew install ninja;

				    fi

				  - if test "x$BUILD" = xscons; then

				      HOMEBREW_NO_AUTO_UPDATE=1 brew install scons;

				    fi

				  # Set PATH for homebrew pip3 installs

				  - PATH="$HOME/Library/Python/3.6/bin:${PATH}"

				  - PYTHON_VERSION=$(python3 -V | awk '{print $2}' | cut -d. -f1-2)

				  - PATH="$HOME/Library/Python/$PYTHON_VERSION/bin:${PATH}"

				  # Set PKG_CONFIG_PATH for keg-only expat

				  - PKG_CONFIG_PATH="/usr/local/opt/expat/lib/pkgconfig:${PKG_CONFIG_PATH}"

				  # Set PATH for keg-only gettext

				@@ -28,13 +43,22 @@ before_install:

				  - PKG_CONFIG_PATH="/opt/X11/share/pkgconfig:/opt/X11/lib/pkgconfig:${PKG_CONFIG_PATH}"

				install:

				  - pip3 install --user meson

				  - pip3 install --user mako

				  - if test "x$BUILD" = xmeson; then

				      pip3 install --user meson;

				      pip3 install --user mako;

				    fi

				  - if test "x$BUILD" = xscons; then

				      pip2 install --user mako;

				    fi

				script:

				  - meson _build

				      -Dbuild-tests=true

				      -Dplatforms=x11

				      -Dgallium-drivers=swrast

				  - ninja -C _build

				  - ninja -C _build test

				  - if test "x$BUILD" = xmeson; then

				      meson _build -Dbuild-tests=true;

				      ninja -C _build || travis_terminate 1;

				      ninja -C _build test || travis_terminate 1;

				      ninja -C _build install || travis_terminate 1;

				    fi

				  - if test "x$BUILD" = xscons; then

				      scons force_scons=1 || travis_terminate 1;

				      scons force_scons=1 check || travis_terminate 1;

				    fi

									
										10

Android.common.mk
									
												
				@@ -39,7 +39,7 @@ LOCAL_CFLAGS += \

					-Wno-initializer-overrides \

					-Wno-mismatched-tags \

					-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \

					-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\"

					-DPACKAGE_BUGREPORT=\"https://gitlab.freedesktop.org/mesa/mesa/issues\"

				# XXX: The following __STDC_*_MACROS defines should not be needed.

				# It's likely due to a bug elsewhere, but let's temporarily add them

				@@ -98,12 +98,14 @@ ifeq ($(filter 5 6 7 8 9, $(MESA_ANDROID_MAJOR_VERSION)),)

				LOCAL_CFLAGS += -DHAVE_TIMESPEC_GET

				endif

				ifeq ($(strip $(MESA_ENABLE_ASM)),true)

				# Android's libc began supporting shm in Oreo

				ifeq ($(shell test $(PLATFORM_SDK_VERSION) -ge 26 && echo true),true)

				LOCAL_CFLAGS += -DHAVE_SYS_SHM_H

				endif

				ifeq ($(TARGET_ARCH),x86)

				LOCAL_CFLAGS += \

					-DUSE_X86_ASM

				endif

				endif

				ifeq ($(ARCH_ARM_HAVE_NEON),true)

				LOCAL_CFLAGS_arm += -DUSE_ARM_ASM

									
										27

Android.mk
									
												
				@@ -24,7 +24,7 @@

				# BOARD_GPU_DRIVERS should be defined.  The valid values are

				#

				#   classic drivers: i915 i965

				#   gallium drivers: swrast freedreno i915g nouveau kmsro r300g r600g radeonsi vc4 virgl vmwgfx etnaviv iris lima

				#   gallium drivers: swrast freedreno i915g nouveau kmsro r300g r600g radeonsi vc4 virgl vmwgfx etnaviv iris lima panfrost

				#

				# The main target is libGLES_mesa.  For each classic driver enabled, a DRI

				# module will also be built.  DRI modules will be loaded by libGLES_mesa.

				@@ -61,7 +61,8 @@ gallium_drivers := \

					virgl.HAVE_GALLIUM_VIRGL \

					etnaviv.HAVE_GALLIUM_ETNAVIV \

					iris.HAVE_GALLIUM_IRIS \

					lima.HAVE_GALLIUM_LIMA

					lima.HAVE_GALLIUM_LIMA \

					panfrost.HAVE_GALLIUM_PANFROST

				ifeq ($(BOARD_GPU_DRIVERS),all)

				MESA_BUILD_CLASSIC := $(filter HAVE_%, $(subst ., , $(classic_drivers)))

				@@ -83,33 +84,20 @@ endif

				$(foreach d, $(MESA_BUILD_CLASSIC) $(MESA_BUILD_GALLIUM), $(eval $(d) := true))

				# host and target must be the same arch to generate matypes.h

				ifeq ($(TARGET_ARCH),$(HOST_ARCH))

				MESA_ENABLE_ASM := true

				else

				MESA_ENABLE_ASM := false

				endif

				ifneq ($(filter true, $(HAVE_GALLIUM_RADEONSI)),)

				MESA_ENABLE_LLVM := true

				endif

				define mesa-build-with-llvm

				  $(if $(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5), \

				  $(if $(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5 6 7), \

				    $(warning Unsupported LLVM version in Android $(MESA_ANDROID_MAJOR_VERSION)),) \

				  $(if $(filter 6,$(MESA_ANDROID_MAJOR_VERSION)), \

				    $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0307 -DMESA_LLVM_VERSION_STRING=\"3.7\")) \

				  $(if $(filter 7,$(MESA_ANDROID_MAJOR_VERSION)), \

				    $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308 -DMESA_LLVM_VERSION_STRING=\"3.8\")) \

				  $(if $(filter 8,$(MESA_ANDROID_MAJOR_VERSION)), \

				    $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_STRING=\"3.9\")) \

				  $(if $(filter P,$(MESA_ANDROID_MAJOR_VERSION)), \

				    $(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_STRING=\"3.9\")) \

				  $(eval LOCAL_CFLAGS += -DLLVM_AVAILABLE -DMESA_LLVM_VERSION_STRING=\"3.9\") \

				  $(eval LOCAL_SHARED_LIBRARIES += libLLVM)

				endef

				# add subdirectories

				SUBDIRS := \

					src/etnaviv \

					src/freedreno \

					src/gbm \

					src/loader \

				@@ -122,7 +110,8 @@ SUBDIRS := \

					src/broadcom \

					src/intel \

					src/mesa/drivers/dri \

					src/vulkan

					src/vulkan \

					src/panfrost \

				INC_DIRS := $(call all-named-subdir-makefiles,$(SUBDIRS))

				INC_DIRS += $(call all-named-subdir-makefiles,src/gallium)

									
										3

README.rst
									
												
				@@ -56,5 +56,4 @@ Contributions are welcome, and step-by-step instructions can be found in our

				documentation (`docs/submittingpatches.html

				<https://mesa3d.org/submittingpatches.html>`_).

				-Note that Mesa uses email mailing-lists for patches submission, review and

				-discussions.

				+Note that Mesa uses gitlab for patches submission, review and discussions.

41

REVIEWERS


@@ -1,30 +1,11 @@
 Overview:
 	This file is similar in syntax (or more precisly a subset) of what is
-	used by the MAINTAINERS file in the linux kernel.  Some fields do not
-	apply, for example, in all cases, send patches to:
-		mesa-dev@lists.freedesktop.org
-	and in all cases the patchwork instance is:
-		https://patchwork.freedesktop.org/project/mesa/
+	used by the MAINTAINERS file in the linux kernel.
 	The purpose is not exactly the same the MAINTAINERS file in the linux
 	kernel, as there are not official/formal maintainers of different
 	subsystems in mesa, but is meant to give an idea of who to CC for
-	various patches for review, and to allow the use of
-	scripts/get_reviewer.pl as git --cc-cmd.
-Usage:
-	When sending patches:
-		git send-email --cc-cmd ./scripts/get_reviewer.pl ...
-	Or to configure as default:
-		git config sendemail.cccmd ./scripts/get_reviewer.pl
+	various patches for review.
 Descriptions of section entries:
@@ -36,14 +17,6 @@ Descriptions of section entries:
 	   F:	drivers/net/*	all files in drivers/net, but not below
 	   F:	*/net/*		all files in "any top level directory"/net
 	   One pattern per line.  Multiple F: lines acceptable.
-	N: Files and directories with regex patterns.
-	   N:	[^a-z]tegra	all files whose path contains the word tegra
-	   One pattern per line.  Multiple N: lines acceptable.
-	   scripts/get_maintainer.pl has different behavior for files that
-	   match F: pattern and matches of N: patterns.  By default,
-	   get_maintainer will not look at git log history when an F: pattern
-	   match occurs.  When an N: match occurs, git log history is used
-	   to also notify the people that have git commit signatures.
 Maintainers List (try to look for most precise areas first)
@@ -135,3 +108,13 @@ VULKAN
 R: Eric Engestrom <eric@engestrom.ch>
 F: src/vulkan/
 F: include/vulkan/
+VMWARE DRIVER
+R: Brian Paul <brianp@vmware.com>
+R: Charmaine Lee <charmainel@vmware.com>
+F: src/gallium/drivers/svga/
+VMWARE WINSYS CODE
+R: Thomas Hellstrom <thellstrom@vmware.com>
+R: Deepak Rawat <drawat@vmware.com>
+F: src/gallium/winsys/svga/

									
										23

SConstruct
									
												
				@@ -20,6 +20,7 @@

				# to get the full list of options. See scons manpage for more info.

				#

				from __future__ import print_function

				import os

				import os.path

				import sys

				@@ -66,6 +67,26 @@ else:

				Help(opts.GenerateHelpText(env))

				#######################################################################

				# Print a deprecation warning for using scons on non-windows

				if common.host_platform != 'windows' and env['platform'] != 'windows':

				    if env['force_scons']:

				        print("WARNING: Scons is deprecated for non-windows platforms (including cygwin) "

				              "please use meson instead.", file=sys.stderr)

				    else:

				        print("ERROR: Scons is deprecated for non-windows platforms (including cygwin) "

				              "please use meson instead. If you really need to use scons you "

				              "can add `force_scons=1` to the scons command line.", file=sys.stderr)

				        sys.exit(1)

				else:

				    print("WARNING: Scons support is in the process of being deprecated on "

				          "on windows platforms (including mingw). If you haven't already "

				          "please try using meson for windows builds. Be sure to report any "

				          "issues you run into", file=sys.stderr)

				#######################################################################

				# Environment setup

				@@ -73,7 +94,7 @@ with open("VERSION") as f:

				  mesa_version = f.read().strip()

				env.Append(CPPDEFINES = [

				    ('PACKAGE_VERSION', '\\"%s\\"' % mesa_version),

				    ('PACKAGE_BUGREPORT', '\\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\\"'),

				    ('PACKAGE_BUGREPORT', '\\"https://gitlab.freedesktop.org/mesa/mesa/issues\\"'),

				])

				# Includes
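
The deprecation check added to `SConstruct` boils down to a small decision table, which can be sketched as a standalone Python function. This is only an illustration: `scons_deprecation` is a hypothetical name, and the real code prints the full messages to stderr and calls `sys.exit(1)` instead of returning a tuple.

```python
from typing import Tuple


def scons_deprecation(host_platform: str, target_platform: str,
                      force_scons: bool) -> Tuple[str, bool]:
    """Return (message level, whether the build should abort)."""
    if host_platform != 'windows' and target_platform != 'windows':
        if force_scons:
            # Non-Windows build explicitly forced: warn but continue.
            return ('WARNING', False)
        # Non-Windows build without force_scons: refuse to build.
        return ('ERROR', True)
    # Any build involving Windows: warn that deprecation is in progress.
    return ('WARNING', False)
```

Under this reading, a Linux-to-Linux build aborts unless `force_scons=1` is passed, while a build with a Windows host or a Windows target only ever produces a warning.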

2

VERSION


@@ -1 +1 @@
 .1.6
 .0.0-devel

									
										41

appveyor.yml
									
												
				@@ -38,6 +38,7 @@ cache:

				- '%LOCALAPPDATA%\pip\Cache -> appveyor.yml'

				- win_flex_bison-2.5.15.zip

				- llvm-5.0.1-msvc2017-mtd.7z

				- subprojects\packagecache -> subprojects\*.wrap

				os: Visual Studio 2017

				@@ -49,41 +50,21 @@ init:

				environment:

				  WINFLEXBISON_VERSION: 2.5.15

				  LLVM_ARCHIVE: llvm-5.0.1-msvc2017-mtd.7z

				  matrix:

				  - compiler: msvc

				    buildsystem: scons

				  - compiler: msvc

				    buildsystem: meson

				    path: C:\Python38-x64;C:\Python38-x64\Scripts;%path%

				install:

				# Check git config

				- git config core.autocrlf

				# Check pip

				- python --version

				- python -m pip --version

				# Install Mako

				- python -m pip install Mako==1.0.7

				# Install pywin32 extensions, needed by SCons

				- python -m pip install pypiwin32

				# Install python wheels, necessary to install SCons via pip

				- python -m pip install wheel

				# Install SCons

				- python -m pip install scons==3.0.1

				- scons --version

				# Install flex/bison

				- set WINFLEXBISON_ARCHIVE=win_flex_bison-%WINFLEXBISON_VERSION%.zip

				- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "https://github.com/lexxmark/winflexbison/releases/download/v%WINFLEXBISON_VERSION%/%WINFLEXBISON_ARCHIVE%"

				- 7z x -y -owinflexbison\ "%WINFLEXBISON_ARCHIVE%" > nul

				- set Path=%CD%\winflexbison;%Path%

				- win_flex --version

				- win_bison --version

				# Download and extract LLVM

				- if not exist "%LLVM_ARCHIVE%" appveyor DownloadFile "https://people.freedesktop.org/~jrfonseca/llvm/%LLVM_ARCHIVE%"

				- 7z x -y "%LLVM_ARCHIVE%" > nul

				- mkdir llvm\bin

				- set LLVM=%CD%\llvm

				- cmd: .appveyor\appveyor_msvc.bat install

				build_script:

				- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.1 llvm=1

				after_build:

				- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.1 llvm=1 check

				- cmd: .appveyor\appveyor_msvc.bat build_script

				test_script:

				- cmd: .appveyor\appveyor_msvc.bat test_script

				# It's possible to setup notification here, as described in

				# http://www.appveyor.com/docs/notifications#appveyor-yml-configuration , but

15

bin/.cherry-ignore


@@ -1,15 +0,0 @@
 # fixes: The following commits do not apply cleanly on 19.1 branch, as they
 #        depend on other commits not present in the branch.
 b00e1ff24f974bc99e7ca9a720518da0ce5b89 panfrost: Make ctx->job useful
 f6c44549ee2dd0f218deea1feba3965523609406 iris: Replace devinfo->gen with GEN_GEN
 cd13ccee7bc2733e7a56284dc02bdb1b1c40081 iris: Update fast clear colors on Gen9 with direct immediate writes.
 # fixes: The following commit depends on commits 77a1070d366a and df4c2ec5e19b
 #        in order to compile, which did not land in the branch.
 d799250346331a93b21678dc5605cff74dfa3a1 iris: Avoid unnecessary resolves on transfer maps
 # stable: Explicit 19.2 only nominations.
 e73d863a66caac796ed5fb543a77f0b892df8573 radv: allow to enable VK_AMD_shader_ballot only on GFX8+
 f202ac27a99caf9009aa9d60e2e0d7f3b528e99f radv: add a new debug option called RADV_DEBUG=noshaderballot
 a6ad9e8ccf970a0da68508eb2ce26b316045b9f0 radv: force enable VK_AMD_shader_ballot for Wolfenstein Youngblood
 c27d8d4a7e9372a8a86d970b598fc4e3bfd1 radv/gfx10: don't initialize VGT_INSTANCE_STEP_RATE_0
 a4e6e59db82e61b47ef905f28dde80ae36a67d35 radv/gfx10: do not use NGG with NAVI14
 fe0ec41c4d36fd5a82e7579d89e34cce7423c4e5 radv: Change memory type order for GPUs without dedicated VRAM

0

src/gallium/drivers/panfrost/include/meson.build → bin/init.py


									
										35

bin/bugzilla_mesa.sh
									
												View File
											
				@@ -1,35 +0,0 @@

				#!/bin/sh

				# This script is used to generate the list of fixed bugs that

				# appears in the release notes files, with HTML formatting.

				#

				# Note: This script could take a while until all details have

				#       been fetched from bugzilla.

				#

				# Usage examples:

				#

				# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3

				# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 > bugfixes

				# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee bugfixes

				# regex pattern: trim before bug number

				trim_before='s/.*show_bug.cgi?id=\([0-9]*\).*/\1/'

				# regex pattern: reconstruct the url

				use_after='s,^,https://bugs.freedesktop.org/show_bug.cgi?id=,'

				echo "<ul>"

				echo ""

				# extract fdo urls from commit log

				git log --pretty=medium $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before | sort -n -u | sed -e $use_after |\

				while read url

				do

					id=$(echo $url | cut -d'=' -f2)

					summary=$(wget --quiet -O - $url | grep -e '<title>.*</title>' | sed -e 's/ *<title>[0-9]\+ &ndash; \(.*\)<\/title>/\1/')

					echo "<li><a href=\"$url\">Bug $id</a> - $summary</li>"

					echo ""

				done

				echo "</ul>"
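
For reference, the URL extraction step of the removed `bugzilla_mesa.sh` can be sketched in Python. This is a rough equivalent under stated assumptions: `bug_urls` is a hypothetical helper, it takes the first bug ID on each line (the greedy `sed` pattern would take the last), and it does not fetch bug titles over the network as the script did with `wget`.

```python
import re
from typing import List

BUG_RE = re.compile(r"show_bug\.cgi\?id=([0-9]+)")


def bug_urls(log_text: str) -> List[str]:
    """Collect unique Bugzilla URLs from a git log, sorted by bug number."""
    ids = set()
    for line in log_text.splitlines():
        if "bugs.freedesktop.org/show_bug" not in line:
            continue  # same filter as the `grep` stage
        match = BUG_RE.search(line)
        if match:
            ids.add(int(match.group(1)))
    # sorted() on a set of ints mirrors `sort -n -u`
    return [f"https://bugs.freedesktop.org/show_bug.cgi?id={i}"
            for i in sorted(ids)]
```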

									
										272

bin/gen_release_notes.py
									
										Executable file
									
												
				@@ -0,0 +1,272 @@

				#!/usr/bin/env python3

				# Copyright © 2019 Intel Corporation

				# Permission is hereby granted, free of charge, to any person obtaining a copy

				# of this software and associated documentation files (the "Software"), to deal

				# in the Software without restriction, including without limitation the rights

				# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell

				# copies of the Software, and to permit persons to whom the Software is

				# furnished to do so, subject to the following conditions:

				# The above copyright notice and this permission notice shall be included in

				# all copies or substantial portions of the Software.

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE

				# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,

				# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE

				# SOFTWARE.

				"""Generates release notes for a given version of mesa."""

				import asyncio

				import datetime

				import os

				import pathlib

				import sys

				import textwrap

				import typing

				import urllib.parse

				import aiohttp

				from mako.template import Template

				from mako import exceptions

				CURRENT_GL_VERSION = '4.6'

				CURRENT_VK_VERSION = '1.1'

				TEMPLATE = Template(textwrap.dedent("""\

				    <%!

				        import html

				    %>

				    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				    <html lang="en">

				    <head>

				    <meta http-equiv="content-type" content="text/html; charset=utf-8">

				    <title>Mesa Release Notes</title>

				    <link rel="stylesheet" type="text/css" href="../mesa.css">

				    </head>

				    <body>

				    <div class="header">

				    <h1>The Mesa 3D Graphics Library</h1>

				    </div>

				    <iframe src="../contents.html"></iframe>

				    <div class="content">

				    <h1>Mesa ${next_version} Release Notes / ${today}</h1>

				    <p>

				    %if not bugfix:

				        Mesa ${next_version} is a new development release. People who are concerned

				        with stability and reliability should stick with a previous release or

				        wait for Mesa ${version[:-1]}1.

				    %else:

				        Mesa ${next_version} is a bug fix release which fixes bugs found since the ${version} release.

				    %endif

				    </p>

				    <p>

				    Mesa ${next_version} implements the OpenGL ${gl_version} API, but the version reported by

				    glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				    glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				    Some drivers don't support all the features required in OpenGL ${gl_version}. OpenGL

				    ${gl_version} is <strong>only</strong> available if requested at context creation.

				    Compatibility contexts may report a lower version depending on each driver.

				    </p>

				    <p>

				    Mesa ${next_version} implements the Vulkan ${vk_version} API, but the version reported by

				    the apiVersion property of the VkPhysicalDeviceProperties struct

				    depends on the particular driver being used.

				    </p>

				    <h2>SHA256 checksum</h2>

				    <pre>

				    TBD.

				    </pre>

				    <h2>New features</h2>

				    <ul>

				    %for f in features:

				        <li>${html.escape(f)}</li>

				    %endfor

				    </ul>

				    <h2>Bug fixes</h2>

				    <ul>

				    %for b in bugs:

				        <li>${html.escape(b)}</li>

				    %endfor

				    </ul>

				    <h2>Changes</h2>

				    <ul>

				    %for c, author in changes:

				      %if author:

				        <p>${html.escape(c)}</p>

				      %else:

				        <li>${html.escape(c)}</li>

				      %endif

				    %endfor

				    </ul>

				    </div>

				    </body>

				    </html>

				    """))

				async def gather_commits(version: str) -> str:

				    p = await asyncio.create_subprocess_exec(

				        'git', 'log', f'mesa-{version}..', '--grep', r'Closes: \(https\|#\).*',

				        stdout=asyncio.subprocess.PIPE)

				    out, _ = await p.communicate()

				    assert p.returncode == 0, f"git log didn't work: {version}"

				    return out.decode().strip()

				async def gather_bugs(version: str) -> typing.List[str]:

				    commits = await gather_commits(version)

				    issues: typing.List[str] = []

				    for commit in commits.split('\n'):

				        sha, message = commit.split(maxsplit=1)

				        p = await asyncio.create_subprocess_exec(

				            'git', 'log', '--max-count', '1', r'--format=%b', sha,

				            stdout=asyncio.subprocess.PIPE)

				        _out, _ = await p.communicate()

				        out = _out.decode().split('\n')

				        for line in reversed(out):

				            if line.startswith('Closes:'):

				                bug = line.lstrip('Closes:').strip()

				                break

				        else:

				            raise Exception('No closes found?')

				        if bug.startswith('h'):

				            # This means we have a bug in the form "Closes: https://..."

				            issues.append(os.path.basename(urllib.parse.urlparse(bug).path))

				        else:

				            issues.append(bug.lstrip('#'))

				    loop = asyncio.get_event_loop()

				    async with aiohttp.ClientSession(loop=loop) as session:

				        results = await asyncio.gather(*[get_bug(session, i) for i in issues])

				    typing.cast(typing.Tuple[str, ...], results)

				    return list(results)

				async def get_bug(session: aiohttp.ClientSession, bug_id: str) -> str:

				    """Query gitlab to get the name of the issue that was closed."""

				    # Mesa's gitlab id is 176,

				    url = 'https://gitlab.freedesktop.org/api/v4/projects/176/issues'

				    params = {'iids[]': bug_id}

				    async with session.get(url, params=params) as response:

				        content = await response.json()

				    return content[0]['title']

				async def get_shortlog(version: str) -> str:

				    """Call git shortlog."""

				    p = await asyncio.create_subprocess_exec('git', 'shortlog', f'mesa-{version}..',

				                                             stdout=asyncio.subprocess.PIPE)

				    out, _ = await p.communicate()

				    assert p.returncode == 0, 'error getting shortlog'

				    assert out is not None, 'just for mypy'

				    return out.decode()

				def walk_shortlog(log: str) -> typing.Generator[typing.Tuple[str, bool], None, None]:

				    for l in log.split('\n'):

				        if l.startswith(' '): # this means we have a patch description

				            yield l, False

				        else:

				            yield l, True

				def calculate_next_version(version: str, is_point: bool) -> str:

				    """Calculate the version about to be released."""

				    if '-' in version:

				        version = version.split('-')[0]

				    if is_point:

				        base = version.split('.')

				        base[2] = str(int(base[2]) + 1)

				        return '.'.join(base)

				    return version

				def calculate_previous_version(version: str, is_point: bool) -> str:

				    """Calculate the previous version to compare to.

				    In the case of -rc to final that version is the previous .0 release,

				    (19.3.0 in the case of 20.0.0, for example). For point releases that is

				    the last point release. This value will be the same as the input value

				    for a point release, but different for a major release.

				    """

				    if '-' in version:

				        version = version.split('-')[0]

				    if is_point:

				        return version

				    base = version.split('.')

				    if base[1] == '0':

				        base[0] = str(int(base[0]) - 1)

				        base[1] = '3'

				    else:

				        base[1] = str(int(base[1]) - 1)

				    return '.'.join(base)

				def get_features(is_point_release: bool) -> typing.Generator[str, None, None]:

				    p = pathlib.Path(__file__).parent.parent / 'docs' / 'relnotes' / 'new_features.txt'

				    if p.exists():

				        if is_point_release:

				            print("WARNING: new features being introduced in a point release", file=sys.stderr)

				        with p.open('rt') as f:

				            for line in f:

				                yield line

				    else:

				        yield "None"

				async def main() -> None:

				    v = pathlib.Path(__file__).parent.parent / 'VERSION'

				    with v.open('rt') as f:

				        raw_version = f.read().strip()

				    is_point_release = '-rc' not in raw_version

				    assert '-devel' not in raw_version, 'Do not run this script on -devel'

				    version = raw_version.split('-')[0]

				    previous_version = calculate_previous_version(version, is_point_release)

				    next_version = calculate_next_version(version, is_point_release)

				    shortlog, bugs = await asyncio.gather(

				        get_shortlog(previous_version),

				        gather_bugs(previous_version),

				    )

				    final = pathlib.Path(__file__).parent.parent / 'docs' / 'relnotes' / f'{next_version}.html'

				    with final.open('wt') as f:

				        try:

				            f.write(TEMPLATE.render(

				                bugfix=is_point_release,

				                bugs=bugs,

				                changes=walk_shortlog(shortlog),

				                features=get_features(is_point_release),

				                gl_version=CURRENT_GL_VERSION,

				                next_version=next_version,

				                today=datetime.date.today(),

				                version=previous_version,

				                vk_version=CURRENT_VK_VERSION,

				            ))

				        except:

				            print(exceptions.text_error_template().render())

				if __name__ == "__main__":

				    loop = asyncio.get_event_loop()

				    loop.run_until_complete(main())
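The helpers in this file can be exercised without git or network access. The following is a minimal sketch, not the script itself: it re-implements `walk_shortlog` and `calculate_previous_version` with the same logic as the diff above and runs them on invented input. The sample shortlog text and the author name in it are made up for illustration.

```python
import typing


def walk_shortlog(log: str) -> typing.Generator[typing.Tuple[str, bool], None, None]:
    """Yield (line, is_header) pairs with the same rule as the diff above."""
    for line in log.split('\n'):
        # git shortlog indents patch subjects; any line that does not start
        # with a space (author headers, and also blank lines) is flagged True.
        yield line, not line.startswith(' ')


def calculate_previous_version(version: str, is_point: bool) -> str:
    """Same logic as the function of the same name in the diff."""
    if '-' in version:
        version = version.split('-')[0]
    if is_point:
        return version
    base = version.split('.')
    if base[1] == '0':
        # Roll back to the last minor series of the previous major version.
        base[0] = str(int(base[0]) - 1)
        base[1] = '3'
    else:
        base[1] = str(int(base[1]) - 1)
    return '.'.join(base)


# Invented sample input, for illustration only.
SAMPLE_LOG = "Jane Doe (2):\n      foo: fix a leak\n      bar: add a test"

entries = list(walk_shortlog(SAMPLE_LOG))
```

Note that the rollover branch hard-codes the minor number 3, so the sketch, like the original, assumes the previous major series ended at a .3 release.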

									
										62

bin/gen_release_notes_test.py
									
										Normal file
									
				@@ -0,0 +1,62 @@

				# Copyright © 2019 Intel Corporation

				# Permission is hereby granted, free of charge, to any person obtaining a copy

				# of this software and associated documentation files (the "Software"), to deal

				# in the Software without restriction, including without limitation the rights

				# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell

				# copies of the Software, and to permit persons to whom the Software is

				# furnished to do so, subject to the following conditions:

				# The above copyright notice and this permission notice shall be included in

				# all copies or substantial portions of the Software.

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE

				# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,

				# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE

				# SOFTWARE.

				from unittest import mock

				import pytest

				from .gen_release_notes import *

				@pytest.mark.parametrize(

				    'current, is_point, expected',

				    [

				        ('19.2.0', True, '19.2.1'),

				        ('19.3.6', True, '19.3.7'),

				        ('20.0.0-rc4', False, '20.0.0'),

				    ])

				def test_next_version(current: str, is_point: bool, expected: str) -> None:

				    assert calculate_next_version(current, is_point) == expected

				@pytest.mark.parametrize(

				    'current, is_point, expected',

				    [

				        ('19.3.6', True, '19.3.6'),

				        ('20.0.0-rc4', False, '19.3.0'),

				    ])

				def test_previous_version(current: str, is_point: bool, expected: str) -> None:

				    assert calculate_previous_version(current, is_point) == expected

				@pytest.mark.asyncio

				async def test_get_shortlog():

				    # Certainly not perfect, but it's something

				    version = '19.2.0'

				    out = await get_shortlog(version)

				    assert out

				@pytest.mark.asyncio

				async def test_gather_commits():

				    # Certainly not perfect, but it's something

				    version = '19.2.0'

				    out = await gather_commits(version)

				    assert out

									
										4

bin/get-pick-list.sh
									
				@@ -32,7 +32,7 @@ is_sha_nomination()

				{

					fixes=`git show --pretty=medium -s $1 | tr -d "\n" | \

						sed -e 's/'"$2"'/\nfixes:/Ig' | \

						grep -Eo 'fixes:[a-f0-9]{8,40}'`

						grep -Eo 'fixes:[a-f0-9]{4,40}'`

					fixes_count=`echo "$fixes" | grep "fixes:" | wc -l`

					if test $fixes_count -eq 0; then

				@@ -143,7 +143,7 @@ do

					esac

					printf "[ %8s ] " "$tag"

					git --no-pager show --no-patch --oneline $sha

					git --no-pager show --no-patch --pretty=oneline $sha

				done

				rm -f already_picked
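The first hunk above shows two variants of the `grep -Eo` pattern that differ only in the minimum number of hex digits accepted after `fixes:`. The sketch below illustrates the practical difference using Python's `re` module as a stand-in for `grep -E`; for a pattern this simple the two behave the same. It assumes the `sed` step has already rewritten the trailer into `fixes:<sha>` form, and the sample text is made up for illustration.

```python
import re

# The two quantifier variants that appear in the hunk.
PATTERN_MIN8 = re.compile(r'fixes:[a-f0-9]{8,40}')
PATTERN_MIN4 = re.compile(r'fixes:[a-f0-9]{4,40}')

# Illustrative, already-normalised text: one 11-digit and one 6-digit
# abbreviated commit id.
text = "fixes:f93bb2fb102\nfixes:1a3e2a\n"

# findall() plays the role of grep's -o flag: it returns only the matches.
long_only = PATTERN_MIN8.findall(text)
short_too = PATTERN_MIN4.findall(text)
```

With the 8-digit minimum only the longer id is picked up, while the 4-digit minimum also accepts the shorter abbreviation.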

									
										3

bin/install_megadrivers.py
									
				@@ -1,5 +1,6 @@

				#!/usr/bin/env python3

				# encoding=utf-8

				# Copyright © 2017-2018 Intel Corporation

				# Copyright 2017-2018 Intel Corporation

				# Permission is hereby granted, free of charge, to any person obtaining a copy

				# of this software and associated documentation files (the "Software"), to deal

									
										2

bin/meson.build
									
				@@ -19,3 +19,5 @@

				# SOFTWARE.

				git_sha1_gen_py = files('git_sha1_gen.py')

				symbols_check = find_program('symbols-check.py')

				install_megadrivers_py = find_program('install_megadrivers.py')

									
										117

bin/post_version.py
									
										Executable file
									
				@@ -0,0 +1,117 @@

				#!/usr/bin/env python3

				# Copyright © 2019 Intel Corporation

				# Permission is hereby granted, free of charge, to any person obtaining a copy

				# of this software and associated documentation files (the "Software"), to deal

				# in the Software without restriction, including without limitation the rights

				# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell

				# copies of the Software, and to permit persons to whom the Software is

				# furnished to do so, subject to the following conditions:

				# The above copyright notice and this permission notice shall be included in

				# all copies or substantial portions of the Software.

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE

				# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,

				# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE

				# SOFTWARE.

				"""Update the main page, release notes, and calendar."""

				import argparse

				import calendar

				import datetime

				import pathlib

				from lxml import (

				    etree,

				    html,

				)

				def calculate_previous_version(version: str, is_point: bool) -> str:

				    """Calculate the previous version to compare to.

				    In the case of -rc to final that version is the previous .0 release,

				    (19.3.0 in the case of 20.0.0, for example). For point releases that is

				    the last point release. This value will be the same as the input value

				    for a point release, but different for a major release.

				    """

				    if '-' in version:

				        version = version.split('-')[0]

				    if is_point:

				        return version

				    base = version.split('.')

				    if base[1] == '0':

				        base[0] = str(int(base[0]) - 1)

				        base[1] = '3'

				    else:

				        base[1] = str(int(base[1]) - 1)

				    return '.'.join(base)

				def is_point_release(version: str) -> bool:

				    return not version.endswith('.0')

				def update_index(is_point: bool, version: str, previous_version: str) -> None:

				    p = pathlib.Path(__file__).parent.parent / 'docs' / 'index.html'

				    with p.open('rt') as f:

				        tree = html.parse(f)

				    news = tree.xpath('.//h1')[0]

				    date = datetime.date.today()

				    month = calendar.month_name[date.month]

				    header = etree.Element('h2')

				    header.text = f"{month} {date.day}, {date.year}"

				    body = etree.Element('p')

				    a = etree.SubElement(

				        body, 'a', attrib={'href': f'relnotes/{previous_version}.html'})

				    a.text = f"Mesa {previous_version}"

				    if is_point:

				        a.tail = " is released. This is a bug fix release."

				    else:

				        a.tail = (" is released. This is a new development release. "

				                  "See the release notes for more information about this release.")

				    root = news.getparent()

				    index = root.index(news) + 1

				    root.insert(index, body)

				    root.insert(index, header)

				    tree.write(p.as_posix(), method='html')

				def update_release_notes(previous_version: str) -> None:

				    p = pathlib.Path(__file__).parent.parent / 'docs' / 'relnotes.html'

				    with p.open('rt') as f:

				        tree = html.parse(f)

				    li = etree.Element('li')

				    a = etree.SubElement(li, 'a', href=f'relnotes/{previous_version}.html')

				    a.text = f'{previous_version} release notes'

				    ul = tree.xpath('.//ul')[0]

				    ul.insert(0, li)

				    tree.write(p.as_posix(), method='html')

				def main() -> None:

				    parser = argparse.ArgumentParser()

				    parser.add_argument('version', help="The released version.")

				    args = parser.parse_args()

				    is_point = is_point_release(args.version)

				    previous_version = calculate_previous_version(args.version, is_point)

				    update_index(is_point, args.version, previous_version)

				    update_release_notes(previous_version)

				if __name__ == "__main__":

				    main()
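`update_index` inserts the paragraph first and the date header second, both at the same index, which leaves the header in front of the paragraph. The sketch below reproduces that ordering with the standard library's `xml.etree.ElementTree` instead of `lxml` (an intentional substitution so that it runs without third-party packages). The tiny document and its text are invented stand-ins for `docs/index.html`.

```python
import xml.etree.ElementTree as ET


def is_point_release(version: str) -> bool:
    """Same rule as in the diff: anything not ending in '.0' is a point release."""
    return not version.endswith('.0')


# Made-up stand-in for docs/index.html: a heading followed by one older entry.
root = ET.fromstring(
    '<div><h1>News</h1><h2>older date</h2><p>older entry</p></div>')
news = root.find('h1')

header = ET.Element('h2')
header.text = 'new date'
body = ET.Element('p')
body.text = 'new entry'

# Insert position: directly after the <h1>.
index = list(root).index(news) + 1

# Inserting the body first and the header second at the same index pushes
# the body down by one, so the header ends up in front of it.
root.insert(index, body)
root.insert(index, header)

order = [(child.tag, child.text) for child in root]
```

One caveat that follows from the `endswith('.0')` rule: a release-candidate string such as `20.0.0-rc4` would be classified as a point release, so the sketch assumes a final version number is passed in.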

									
										29

bin/shortlog_mesa.sh
									
				@@ -1,29 +0,0 @@

				#!/bin/sh

				# This script is used to generate the list of changes that

				# appears in the release notes files, with HTML formatting.

				#

				# Usage examples:

				#

				# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3

				# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 > changes

				# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee changes

				in_log=0

				git shortlog $* | while read l

				do

				    if [ $in_log -eq 0 ]; then

					echo '<p>'$l'</p>'

					echo '<ul>'

					in_log=1

				    elif echo "$l" | egrep -q '^$' ; then

					echo '</ul>'

					echo

					in_log=0

				    else

				        mesg=$(echo $l | sed 's/ (cherry picked from commit [0-9a-f]\+)//;s/\&/&amp;/g;s/</\&lt;/g;s/>/\&gt;/g')

					echo '  <li>'${mesg}'</li>'

				    fi

				done

									
										172

bin/symbols-check.py
									
										Normal file
									
				@@ -0,0 +1,172 @@

				#!/usr/bin/env python

				import argparse

				import os

				import platform

				import subprocess

				# This list contains symbols that _might_ be exported for some platforms

				PLATFORM_SYMBOLS = [

				    '__bss_end__',

				    '__bss_start__',

				    '__bss_start',

				    '__end__',

				    '_bss_end__',

				    '_edata',

				    '_end',

				    '_fini',

				    '_init',

				]

				def get_symbols_nm(nm, lib):

				    '''

				    List all the (non platform-specific) symbols exported by the library

				    using `nm`

				    '''

				    symbols = []

				    platform_name = platform.system()

				    output = subprocess.check_output([nm, '-gP', lib],

				                                     stderr=open(os.devnull, 'w')).decode("ascii")

				    for line in output.splitlines():

				        fields = line.split()

				        if len(fields) == 2 or fields[1] == 'U':

				            continue

				        symbol_name = fields[0]

				        if platform_name == 'Linux':

				            if symbol_name in PLATFORM_SYMBOLS:

				                continue

				        elif platform_name == 'Darwin':

				            assert symbol_name[0] == '_'

				            symbol_name = symbol_name[1:]

				        symbols.append(symbol_name)

				    return symbols

				def get_symbols_dumpbin(dumpbin, lib):

				    '''

				    List all the (non platform-specific) symbols exported by the library

				    using `dumpbin`

				    '''

				    symbols = []

				    output = subprocess.check_output([dumpbin, '/exports', lib],

				                                     stderr=open(os.devnull, 'w')).decode("ascii")

				    for line in output.splitlines():

				        fields = line.split()

				        # The lines with the symbols are made of at least 4 columns; see details below

				        if len(fields) < 4:

				            continue

				        try:

				            # Making sure the first 3 columns are a dec counter, a hex counter

				            # and a hex address

				            _ = int(fields[0], 10)

				            _ = int(fields[1], 16)

				            _ = int(fields[2], 16)

				        except ValueError:

				            continue

				        symbol_name = fields[3]

				        # De-mangle symbols

				        if symbol_name[0] == '_':

				            symbol_name = symbol_name[1:].split('@')[0]

				        symbols.append(symbol_name)

				    return symbols

				def main():

				    parser = argparse.ArgumentParser()

				    parser.add_argument('--symbols-file',

				                        action='store',

				                        required=True,

				                        help='path to file containing symbols')

				    parser.add_argument('--lib',

				                        action='store',

				                        required=True,

				                        help='path to library')

				    parser.add_argument('--nm',

				                        action='store',

				                        help='path to binary (or name in $PATH)')

				    parser.add_argument('--dumpbin',

				                        action='store',

				                        help='path to binary (or name in $PATH)')

				    args = parser.parse_args()

				    try:

				        if platform.system() == 'Windows':

				            if not args.dumpbin:

				                parser.error('--dumpbin is mandatory')

				            lib_symbols = get_symbols_dumpbin(args.dumpbin, args.lib)

				        else:

				            if not args.nm:

				                parser.error('--nm is mandatory')

				            lib_symbols = get_symbols_nm(args.nm, args.lib)

				    except:

				        # We can't run this test, but we haven't technically failed it either

				        # Return the GNU "skip" error code

				        exit(77)

				    mandatory_symbols = []

				    optional_symbols = []

				    with open(args.symbols_file) as symbols_file:

				        qualifier_optional = '(optional)'

				        for line in symbols_file.readlines():

				            # Strip comments

				            line = line.split('#')[0]

				            line = line.strip()

				            if not line:

				                continue

				            # Line format:

				            # [qualifier] symbol

				            qualifier = None

				            symbol = None

				            fields = line.split()

				            if len(fields) == 1:

				                symbol = fields[0]

				            elif len(fields) == 2:

				                qualifier = fields[0]

				                symbol = fields[1]

				            else:

				                print(args.symbols_file + ': invalid format: ' + line)

				                exit(1)

				            # The only supported qualifier is 'optional', which means the

				            # symbol doesn't have to be exported by the library

				            if qualifier and not qualifier == qualifier_optional:

				                print(args.symbols_file + ': invalid qualifier: ' + qualifier)

				                exit(1)

				            if qualifier == qualifier_optional:

				                optional_symbols.append(symbol)

				            else:

				                mandatory_symbols.append(symbol)

				    unknown_symbols = []

				    for symbol in lib_symbols:

				        if symbol in mandatory_symbols:

				            continue

				        if symbol in optional_symbols:

				            continue

				        if symbol[:2] == '_Z':

				            # Ignore random C++ symbols

				            #TODO: figure out if there's any way to avoid exporting them in the first place

				            continue

				        unknown_symbols.append(symbol)

				    missing_symbols = [

				        sym for sym in mandatory_symbols if sym not in lib_symbols

				    ]

				    for symbol in unknown_symbols:

				        print(args.lib + ': unknown symbol exported: ' + symbol)

				    for symbol in missing_symbols:

				        print(args.lib + ': missing symbol: ' + symbol)

				    if unknown_symbols or missing_symbols:

				        exit(1)

				    exit(0)

				if __name__ == '__main__':

				    main()
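The core of the check above is independent of `nm` and `dumpbin`: parse the symbols file into mandatory and optional names, then compare them with the exported list. The following is a minimal sketch of that logic under simplifying assumptions. It raises `ValueError` where the script prints a message and exits, and the helper names `parse_symbols` and `check`, as well as all symbol names, are hypothetical.

```python
def parse_symbols(lines):
    """Split symbols-file lines into (mandatory, optional) lists."""
    mandatory, optional = [], []
    for raw in lines:
        line = raw.split('#')[0].strip()  # strip comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) == 1:
            mandatory.append(fields[0])
        elif len(fields) == 2 and fields[0] == '(optional)':
            optional.append(fields[1])
        else:
            # Simplification: the script prints an error and exits with 1.
            raise ValueError('invalid line: ' + line)
    return mandatory, optional


def check(lib_symbols, mandatory, optional):
    """Return (unknown, missing) in the same sense as the script."""
    unknown = [s for s in lib_symbols
               if s not in mandatory and s not in optional
               and not s.startswith('_Z')]  # C++ mangled names are ignored
    missing = [s for s in mandatory if s not in lib_symbols]
    return unknown, missing


# Hypothetical symbols file and export list, for illustration only.
SYMBOLS_FILE = [
    '# core entry points',
    'glFooBar',
    '(optional) glFooBaz  # only on some builds',
    'glFooQux',
]
EXPORTED = ['glFooBar', '_ZN3fooC1Ev', 'glStray']

mandatory, optional = parse_symbols(SYMBOLS_FILE)
unknown, missing = check(EXPORTED, mandatory, optional)
```

In this example the library exports one name that the file does not list and lacks one mandatory name, so the real script would report both and exit with status 1.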

									
										13

common.py
									
				@@ -17,6 +17,9 @@ import SCons.Script.SConscript

				host_platform = _platform.system().lower()

				if host_platform.startswith('cygwin'):

				    host_platform = 'cygwin'

				# MSYS2 default platform selection.

				if host_platform.startswith('mingw'):

				    host_platform = 'windows'

				# Search sys.argv[] for a "platform=foo" argument since we don't have

				# an 'env' variable at this point.

				@@ -49,9 +52,18 @@ if 'PROCESSOR_ARCHITECTURE' in os.environ:

				else:

				    host_machine = _platform.machine()

				host_machine = _machine_map.get(host_machine, 'generic')

				# MSYS2 default machine selection.

				if _platform.system().lower().startswith('mingw') and 'MSYSTEM' in os.environ:

				    if os.environ['MSYSTEM'] == 'MINGW32':

				        host_machine = 'x86'

				    if os.environ['MSYSTEM'] == 'MINGW64':

				        host_machine = 'x86_64'

				default_machine = host_machine

				default_toolchain = 'default'

				# MSYS2 default toolchain selection.

				if _platform.system().lower().startswith('mingw'):

				    default_toolchain = 'mingw'

				if target_platform == 'windows' and host_platform != 'windows':

				    default_machine = 'x86'

				@@ -100,6 +112,7 @@ def AddOptions(opts):

				    opts.Add(BoolOption('asan', 'enable Address Sanitizer', 'no'))

				    opts.Add('toolchain', 'compiler toolchain', default_toolchain)

				    opts.Add(BoolOption('llvm', 'use LLVM', default_llvm))

				    opts.Add(BoolOption('force_scons', 'Force enable scons on deprecated platforms', 'false'))

				    opts.Add(BoolOption('openmp', 'EXPERIMENTAL: compile with openmp (swrast)',

				                        'no'))

				    opts.Add(BoolOption('debug', 'DEPRECATED: debug build', 'yes'))
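The MSYS2 hunks above derive three defaults from the platform name and the `MSYSTEM` environment variable. The sketch below gathers that decision into one pure function so it can be tried without SCons. The function name `msys2_defaults` is hypothetical, and the sample platform string is an assumption about what `platform.system()` returns under MSYS2.

```python
def msys2_defaults(system_name, environ):
    """Return (host_platform, machine_override, default_toolchain).

    machine_override is None when the MSYS2 rule does not apply, in which
    case the caller would keep the machine it detected by other means.
    """
    system = system_name.lower()
    host_platform = system
    machine = None
    toolchain = 'default'
    if system.startswith('cygwin'):
        host_platform = 'cygwin'
    if system.startswith('mingw'):
        # MSYS2 shells are treated as a Windows host using the mingw toolchain.
        host_platform = 'windows'
        toolchain = 'mingw'
        machine = {'MINGW32': 'x86', 'MINGW64': 'x86_64'}.get(
            environ.get('MSYSTEM'))
    return host_platform, machine, toolchain


# Assumed example platform string for an MSYS2 MINGW64 shell.
result_msys = msys2_defaults('MINGW64_NT-10.0', {'MSYSTEM': 'MINGW64'})
result_linux = msys2_defaults('Linux', {})
```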

									
										12

docs/application-issues.html
									
				@@ -8,7 +8,7 @@

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				@@ -48,16 +48,16 @@ start-up because of an extension string buffer-overflow problem.

				<p>

				The problem is a modern OpenGL driver will return a very long string

				for the glGetString(GL_EXTENSIONS) query and if the application

				for the <code>glGetString(GL_EXTENSIONS)</code> query and if the application

				naively copies the string into a fixed-size buffer it can overflow the

				buffer and crash the application.

				</p>

				<p>

				The work-around is to set the MESA_EXTENSION_MAX_YEAR environment variable

				to the approximate release year of the game.

				This will cause the glGetString(GL_EXTENSIONS) query to only report extensions

				older than the given year.

				The work-around is to set the <code>MESA_EXTENSION_MAX_YEAR</code>

				environment variable to the approximate release year of the game.

				This will cause the <code>glGetString(GL_EXTENSIONS)</code> query to only report

				extensions older than the given year.

				</p>

				<p>
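The work-around described in this hunk amounts to setting one environment variable before the application starts. The sketch below shows one way to build such an environment from Python. The helper name `env_with_max_year`, the year 2001 and the `./old-game` path are hypothetical.

```python
import os


def env_with_max_year(year, base_env=None):
    """Return a copy of the environment with MESA_EXTENSION_MAX_YEAR set."""
    env = dict(os.environ if base_env is None else base_env)
    env['MESA_EXTENSION_MAX_YEAR'] = str(year)
    return env


# Hypothetical usage: launch an old game with extensions capped at 2001.
# import subprocess
# subprocess.run(['./old-game'], env=env_with_max_year(2001))
env = env_with_max_year(2001, base_env={'HOME': '/home/user'})
```

Returning a copy rather than mutating `os.environ` keeps the setting scoped to the launched process.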

									
										10

docs/bugs.html
									
				@@ -2,19 +2,19 @@

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Bug Reporting</title>

				  <title>Report a Bug</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Report a bug</h1>

				<h1>Report a Bug</h1>

				<p>

				The Mesa bug database is hosted on

				@@ -24,8 +24,8 @@ The old bug database on SourceForge is no longer used.

				<p>

				To file a Mesa bug, go to

				<a href="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa">

				Bugzilla on freedesktop.org</a>

				<a href="https://gitlab.freedesktop.org/mesa/mesa/issues">

				GitLab on freedesktop.org</a>

				</p>

				<p>

									
										116

docs/codingstyle.html
									
				@@ -8,7 +8,7 @@

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				@@ -41,69 +41,69 @@ as if you're defining a large, static table of information.

				<li>Opening braces go on the same line as the if/for/while statement.

				For example:

				<pre>

				   if (condition) {

				      foo;

				   } else {

				      bar;

				   }

				if (condition) {

				   foo;

				} else {

				   bar;

				}

				</pre>

				<li>Put a space before/after operators.  For example, <tt>a = b + c;</tt>

				and not <tt>a=b+c;</tt>

				<li>Put a space before/after operators.  For example, <code>a = b + c;</code>

				and not <code>a=b+c;</code>

				<li>This GNU indent command generally does the right thing for formatting:

				<pre>

				   indent -br -i3 -npcs --no-tabs infile.c -o outfile.c

				indent -br -i3 -npcs --no-tabs infile.c -o outfile.c

				</pre>

				<li>Use comments wherever you think it would be helpful for other developers.

				<li>

				<p>Use comments wherever you think it would be helpful for other developers.

				Several specific cases and style examples follow.  Note that we roughly

				follow <a href="https://www.stack.nl/~dimitri/doxygen/">Doxygen</a> conventions.

				<br>

				<br>

				follow <a href="http://www.doxygen.nl">Doxygen</a> conventions.

				</p>

				Single-line comments:

				<pre>

				   /* null-out pointer to prevent dangling reference below */

				   bufferObj = NULL;

				/* null-out pointer to prevent dangling reference below */

				bufferObj = NULL;

				</pre>

				Or,

				<pre>

				   bufferObj = NULL;  /* prevent dangling reference below */

				bufferObj = NULL;  /* prevent dangling reference below */

				</pre>

				Multi-line comment:

				<pre>

				   /* If this is a new buffer object id, or one which was generated but

				    * never used before, allocate a buffer object now.

				    */

				/* If this is a new buffer object id, or one which was generated but

				 * never used before, allocate a buffer object now.

				 */

				</pre>

				We try to quote the OpenGL specification where prudent:

				<pre>

				   /* Page 38 of the PDF of the OpenGL ES 3.0 spec says:

				    *

				    *     "An INVALID_OPERATION error is generated for any of the following

				    *     conditions:

				    *

				    *     * &lt;length&gt; is zero."

				    *

				    * Additionally, page 94 of the PDF of the OpenGL 4.5 core spec

				    * (30.10.2014) also says this, so it's no longer allowed for desktop GL,

				    * either.

				    */

				/* Page 38 of the PDF of the OpenGL ES 3.0 spec says:

				 *

				 *     "An INVALID_OPERATION error is generated for any of the following

				 *     conditions:

				 *

				 *     * &lt;length&gt; is zero."

				 *

				 * Additionally, page 94 of the PDF of the OpenGL 4.5 core spec

				 * (30.10.2014) also says this, so it's no longer allowed for desktop GL,

				 * either.

				 */

				</pre>

				Function comment example:

				<pre>

				   /**

				    * Create and initialize a new buffer object.  Called via the

				    * ctx-&gt;Driver.CreateObject() driver callback function.

				    * \param  name  integer name of the object

				    * \param  type  one of GL_FOO, GL_BAR, etc.

				    * \return  pointer to new object or NULL if error

				    */

				   struct gl_object *

				   _mesa_create_object(GLuint name, GLenum type)

				   {

				      /* function body */

				   }

				/**

				 * Create and initialize a new buffer object.  Called via the

				 * ctx-&gt;Driver.CreateObject() driver callback function.

				 * \param  name  integer name of the object

				 * \param  type  one of GL_FOO, GL_BAR, etc.

				 * \return  pointer to new object or NULL if error

				 */

				struct gl_object *

				_mesa_create_object(GLuint name, GLenum type)

				{

				   /* function body */

				}

				</pre>

				<li>Put the function return type and qualifiers on one line and the function

				@@ -113,26 +113,28 @@ the opening brace goes on the next line by itself (see above.)

				<li>Function names follow various conventions depending on the type of function:

				<pre>

				   glFooBar()       - a public GL entry point (in glapi_dispatch.c)

				   _mesa_FooBar()   - the internal immediate mode function

				   save_FooBar()    - retained mode (display list) function in dlist.c

				   foo_bar()        - a static (private) function

				   _mesa_foo_bar()  - an internal non-static Mesa function

				glFooBar()       - a public GL entry point (in glapi_dispatch.c)

				_mesa_FooBar()   - the internal immediate mode function

				save_FooBar()    - retained mode (display list) function in dlist.c

				foo_bar()        - a static (private) function

				_mesa_foo_bar()  - an internal non-static Mesa function

				</pre>

				<li>Constants, macros and enum names are ALL_UPPERCASE, with _ between

				words.

				<li>Mesa usually uses camel case for local variables (Ex: "localVarname")

				while gallium typically uses underscores (Ex: "local_var_name").

				<li>Constants, macros and enum names are <code>ALL_UPPERCASE</code>, with _

				between words.

				<li>Mesa usually uses camel case for local variables (Ex:

				<code>localVarname</code>) while gallium typically uses underscores (Ex:

				<code>local_var_name</code>).

				<li>Global variables are almost never used because Mesa should be thread-safe.

				<li>Booleans.  Places that are not directly visible to the GL API

				should prefer the use of <tt>bool</tt>, <tt>true</tt>, and

				<tt>false</tt> over <tt>GLboolean</tt>, <tt>GL_TRUE</tt>, and

				<tt>GL_FALSE</tt>.  In C code, this may mean that

				<tt>#include &lt;stdbool.h&gt;</tt> needs to be added.  The

				<tt>try_emit_</tt>* methods in src/mesa/program/ir_to_mesa.cpp and

				src/mesa/state_tracker/st_glsl_to_tgsi.cpp can serve as examples.

				should prefer the use of <code>bool</code>, <code>true</code>, and

				<code>false</code> over <code>GLboolean</code>, <code>GL_TRUE</code>, and

				<code>GL_FALSE</code>.  In C code, this may mean that

				<code>#include &lt;stdbool.h&gt;</code> needs to be added.  The

				<code>try_emit_*</code> methods in <code>src/mesa/program/ir_to_mesa.cpp</code>

				and <code>src/mesa/state_tracker/st_glsl_to_tgsi.cpp</code> can serve as

				examples.

				</ul>

									
										6

docs/conform.html
									
				@@ -2,19 +2,19 @@

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Conformance</title>

				  <title>Conformance Testing</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Conformance</h1>

				<h1>Conformance Testing</h1>

				<p>

				The SGI OpenGL conformance tests verify correct operation of OpenGL

									
										33

docs/contents.html
									
				@@ -33,18 +33,17 @@

				<li><a href="index.html" target="_parent">News</a>

				<li><a href="developers.html" target="_parent">Developers</a>

				<li><a href="systems.html" target="_parent">Platforms and Drivers</a>

				<li><a href="license.html" target="_parent">License &amp; Copyright</a>

				<li><a href="faq.html" target="_parent">FAQ</a>

				<li><a href="license.html" target="_parent">License and Copyright</a>

				<li><a href="faq.html" target="_parent">Frequently Asked Questions</a>

				<li><a href="relnotes.html" target="_parent">Release Notes</a>

				<li><a href="thanks.html" target="_parent">Acknowledgements</a>

				<li><a href="conform.html" target="_parent">Conformance Testing</a>

				<li>more docs below...

				</ul>

				<h2>Download / Install</h2>

				<h2>Download and Install</h2>

				<ul>

				<li><a href="download.html" target="_parent">Downloading / Unpacking</a>

				<li><a href="install.html" target="_parent">Compiling / Installing</a>

				<li><a href="download.html" target="_parent">Downloading and Unpacking</a>

				<li><a href="install.html" target="_parent">Compiling and Installing</a>

				  <ul>

				    <li><a href="meson.html" target="_parent">Meson</a></li>

				  </ul>

				@@ -66,13 +65,13 @@

				<li><a href="egl.html" target="_parent">EGL</a>

				<li><a href="opengles.html" target="_parent">OpenGL ES</a>

				<li><a href="envvars.html" target="_parent">Environment Variables</a>

				<li><a href="osmesa.html" target="_parent">Off-Screen Rendering</a>

				<li><a href="osmesa.html" target="_parent">Off-screen Rendering</a>

				<li><a href="debugging.html" target="_parent">Debugging Tips</a>

				<li><a href="perf.html" target="_parent">Performance Tips</a>

				<li><a href="extensions.html" target="_parent">Mesa Extensions</a>

				<li><a href="llvmpipe.html" target="_parent">Gallium llvmpipe driver</a>

				<li><a href="vmware-guest.html" target="_parent">VMware SVGA3D guest driver</a>

				<li><a href="postprocess.html" target="_parent">Gallium post-processing</a>

				<li><a href="llvmpipe.html" target="_parent">Gallium LLVMpipe Driver</a>

				<li><a href="vmware-guest.html" target="_parent">VMware SVGA3D Guest Driver</a>

				<li><a href="postprocess.html" target="_parent">Gallium Post-processing</a>

				<li><a href="application-issues.html" target="_parent">Application Issues</a>

				<li><a href="viewperf.html" target="_parent">Viewperf Issues</a>

				</ul>

				@@ -85,24 +84,24 @@

				<li><a href="helpwanted.html" target="_parent">Help Wanted</a>

				<li><a href="devinfo.html" target="_parent">Development Notes</a>

				<li><a href="codingstyle.html" target="_parent">Coding Style</a>

				<li><a href="submittingpatches.html" target="_parent">Submitting patches</a>

				<li><a href="releasing.html" target="_parent">Releasing process</a>

				<li><a href="release-calendar.html" target="_parent">Release calendar</a>

				<li><a href="submittingpatches.html" target="_parent">Submitting Patches</a>

				<li><a href="releasing.html" target="_parent">Releasing Process</a>

				<li><a href="release-calendar.html" target="_parent">Release Calendar</a>

				<li><a href="sourcedocs.html" target="_parent">Source Documentation</a>

				<li><a href="dispatch.html" target="_parent">GL Dispatch</a>

				</ul>

				<h2>Links</h2>

				<ul>

				<li><a href="https://www.opengl.org" target="_parent">OpenGL website</a>

				<li><a href="https://dri.freedesktop.org" target="_parent">DRI website</a>

				<li><a href="https://www.opengl.org" target="_parent">OpenGL Website</a>

				<li><a href="https://dri.freedesktop.org" target="_parent">DRI Website</a>

				<li><a href="https://www.freedesktop.org" target="_parent">freedesktop.org</a>

				<li><a href="https://planet.freedesktop.org" target="_parent">Developer blogs</a>

				<li><a href="https://planet.freedesktop.org" target="_parent">Developer Blogs</a>

				</ul>

				<h2>Hosted by:</h2>

				<dl>

				<dd><a href="https://freedesktop.org" target="_parent">freedesktop.org</a>

				<dd><a href="https://www.freedesktop.org" target="_parent">freedesktop.org</a>

				</dl>

				</body>

									
										14

docs/debugging.html
									
												
				@@ -8,7 +8,7 @@

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				@@ -20,9 +20,9 @@

				   Normally Mesa (and OpenGL) records but does not notify the user of

				   errors.  It is up to the application to call

				   <code>glGetError</code> to check for errors.  Mesa supports an

				   environment variable, MESA_DEBUG, to help with debugging.  If

				   MESA_DEBUG is defined, a message will be printed to stdout whenever

				   an error occurs.

				   environment variable, <code>MESA_DEBUG</code>, to help with debugging.  If

				   <code>MESA_DEBUG</code> is defined, a message will be printed to stdout

				   whenever an error occurs.

				</p>

				<p>

				@@ -30,12 +30,12 @@

				   (<code>--buildtype debug</code> for meson, <code>build=debug</code> for scons).

				</p>

				<p>

				   In your debugger you can set a breakpoint in _mesa_error() to trap Mesa

				   errors.

				   In your debugger you can set a breakpoint in <code>_mesa_error()</code> to trap

				   Mesa errors.

				</p>

				<p>

				   There is a display list printing/debugging facility.  See the end of

				   src/dlist.c for details.

				   <code>src/dlist.c</code> for details.

				</p>

				</div>
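The "record but do not notify" behaviour described in the docs/debugging.html hunk above can be modelled with a short C sketch. This is a simplified illustration, not Mesa's code: `demo_record_error`, `demo_get_error`, the `DEMO_*` constants, and the `DEMO_DEBUG` environment variable are all hypothetical stand-ins for `_mesa_error()`, `glGetError`, and `MESA_DEBUG`.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error codes; the values are placeholders for this sketch. */
enum { DEMO_NO_ERROR = 0, DEMO_INVALID_ENUM = 1, DEMO_INVALID_VALUE = 2 };

/* The single pending error flag: recorded silently until it is queried. */
static int demo_pending_error = DEMO_NO_ERROR;

/* Record an error. A message is printed only when the (hypothetical)
 * DEMO_DEBUG environment variable is defined. */
void demo_record_error(int code, const char *where)
{
    if (demo_pending_error == DEMO_NO_ERROR)
        demo_pending_error = code;      /* keep the first unqueried error */

    if (getenv("DEMO_DEBUG") != NULL)
        printf("demo error %d in %s\n", code, where);
}

/* Return the pending error and reset the flag, as an error query would. */
int demo_get_error(void)
{
    int code = demo_pending_error;
    demo_pending_error = DEMO_NO_ERROR;
    return code;
}
```

With this model an application sees nothing unless it calls `demo_get_error()` itself. Defining the debug variable, or setting a debugger breakpoint in `demo_record_error`, surfaces the error at the point where it is raised.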

									
										2

docs/developers.html
									
												
				@@ -8,7 +8,7 @@

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

									
										37

docs/devinfo.html
									
												
				@@ -8,7 +8,7 @@

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				@@ -29,8 +29,8 @@ To add a new GL extension to Mesa you have to do at least the following.

				<ul>

				<li>

				   If glext.h doesn't define the extension, edit include/GL/gl.h and add

				   code like this:

				   If <code>glext.h</code> doesn't define the extension, edit

				   <code>include/GL/gl.h</code> and add code like this:

				   <pre>

				     #ifndef GL_EXT_the_extension_name

				     #define GL_EXT_the_extension_name 1

				@@ -41,18 +41,18 @@ To add a new GL extension to Mesa you have to do at least the following.

				   </pre>

				</li>

				<li>

				   In the src/mapi/glapi/gen/ directory, add the new extension functions and

				   enums to the gl_API.xml file.

				   In the <code>src/mapi/glapi/gen/</code> directory, add the new extension

				   functions and enums to the <code>gl_API.xml</code> file.

				   Then, a bunch of source files must be regenerated by executing the

				   corresponding Python scripts.

				</li>

				<li>

				   Add a new entry to the <code>gl_extensions</code> struct in mtypes.h

				   if the extension requires driver capabilities not already exposed by

				   another extension.

				   Add a new entry to the <code>gl_extensions</code> struct in

				   <code>mtypes.h</code> if the extension requires driver capabilities not

				   already exposed by another extension.

				</li>

				<li>

				   Add a new entry to the src/mesa/main/extensions_table.h file.

				   Add a new entry to the <code>src/mesa/main/extensions_table.h</code> file.

				</li>

				<li>

				   From this point, the best way to proceed is to find another extension,

				@@ -60,24 +60,23 @@ To add a new GL extension to Mesa you have to do at least the following.

				   as an example.

				</li>

				<li>

				   If the new extension adds new GL state, the functions in get.c, enable.c

				   and attrib.c will most likely require new code.

				   If the new extension adds new GL state, the functions in

				   <code>get.c</code>, <code>enable.c</code> and <code>attrib.c</code>

				   will most likely require new code.

				</li>

				<li>

				   To determine if the new extension is active in the current context,

				   use the auto-generated _mesa_has_##name_str() function defined in

				   src/mesa/main/extensions.h.

				   use the auto-generated <code>_mesa_has_##name_str()</code> function

				   defined in <code>src/mesa/main/extensions.h</code>.

				</li>

				<li>

				   The dispatch tests check_table.cpp and dispatch_sanity.cpp

				   should be updated with details about the new extensions functions. These

				   tests are run using 'meson test'

				   The dispatch tests <code>check_table.cpp</code> and

				   <code>dispatch_sanity.cpp</code> should be updated with details about

				   the new extensions functions. These tests are run using

				   <code>meson test</code>.

				</li>

				</ul>

				</div>

				</body>

				</html>
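The `_mesa_has_##name_str()` naming in the docs/devinfo.html hunk above suggests query functions generated by preprocessor token pasting. A minimal sketch of that general technique follows. It assumes a hypothetical flag struct and extension table (`demo_extensions`, `DEMO_EXTENSION_TABLE`, `EXT_demo_alpha`, `EXT_demo_beta`) and is not a copy of `src/mesa/main/extensions.h`.

```c
#include <stdbool.h>

/* Hypothetical per-context flags, standing in for the gl_extensions struct. */
struct demo_extensions {
    bool EXT_demo_alpha;
    bool EXT_demo_beta;
};

/* Hypothetical extension table: one EXT(name) entry per extension. */
#define DEMO_EXTENSION_TABLE \
    EXT(EXT_demo_alpha)      \
    EXT(EXT_demo_beta)

/* Expand every table entry into a demo_has_<name>() query function.
 * The ## operator pastes the entry name onto the function prefix. */
#define EXT(name_str)                                                       \
    static inline bool demo_has_##name_str(const struct demo_extensions *e) \
    {                                                                       \
        return e->name_str;                                                 \
    }
DEMO_EXTENSION_TABLE
#undef EXT
```

Adding a line to the table is then enough to obtain a matching query function, for example `demo_has_EXT_demo_alpha(&ext)`.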

									
										139

docs/dispatch.html
									
												
				@@ -2,19 +2,19 @@

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>GL Dispatch in Mesa</title>

				  <title>GL Dispatch</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>GL Dispatch in Mesa</h1>

				<h1>GL Dispatch</h1>

				<p>Several factors combine to make efficient dispatch of OpenGL functions

				fairly complicated.  This document attempts to explain some of the issues

				@@ -30,28 +30,28 @@ of the GL related state for the application.  Every texture, every buffer

				object, every enable, and much, much more is stored in the context.  Since

				an application can have more than one context, the context to be used is

				selected by a window-system dependent function such as

				<tt>glXMakeContextCurrent</tt>.</p>

				<code>glXMakeContextCurrent</code>.</p>

				<p>In environments that implement OpenGL with X-Windows using GLX, every GL

				function, including the pointers returned by <tt>glXGetProcAddress</tt>, are

				function, including the pointers returned by <code>glXGetProcAddress</code>, are

				<em>context independent</em>.  This means that no matter what context is

				currently active, the same <tt>glVertex3fv</tt> function is used.</p>

				currently active, the same <code>glVertex3fv</code> function is used.</p>

				<p>This creates the first bit of dispatch complexity.  An application can

				have two GL contexts.  One context is a direct rendering context where

				function calls are routed directly to a driver loaded within the

				application's address space.  The other context is an indirect rendering

				context where function calls are converted to GLX protocol and sent to a

				server.  The same <tt>glVertex3fv</tt> has to do the right thing depending

				server.  The same <code>glVertex3fv</code> has to do the right thing depending

				on which context is current.</p>

				<p>Highly optimized drivers or GLX protocol implementations may want to

				change the behavior of GL functions depending on current state.  For

				example, <tt>glFogCoordf</tt> may operate differently depending on whether

				example, <code>glFogCoordf</code> may operate differently depending on whether

				or not fog is enabled.</p>

				<p>In multi-threaded environments, it is possible for each thread to have a

				different GL context current.  This means that poor old <tt>glVertex3fv</tt>

				different GL context current.  This means that poor old <code>glVertex3fv</code>

				has to know which GL context is current in the thread where it is being

				called.</p>

				@@ -64,38 +64,38 @@ dispatch table stores pointers to functions that actually implement

				specific GL functions.  Each time a new context is made current in a thread,

				these pointers are updated.</p>

				<p>The implementation of functions such as <tt>glVertex3fv</tt> becomes

				<p>The implementation of functions such as <code>glVertex3fv</code> becomes

				conceptually simple:</p>

				<ul>

				<li>Fetch the current dispatch table pointer.</li>

				<li>Fetch the pointer to the real <tt>glVertex3fv</tt> function from the

				<li>Fetch the pointer to the real <code>glVertex3fv</code> function from the

				table.</li>

				<li>Call the real function.</li>

				</ul>

				<p>This can be implemented in just a few lines of C code.  The file

				<tt>src/mesa/glapi/glapitemp.h</tt> contains code very similar to this.</p>

				<code>src/mesa/glapi/glapitemp.h</code> contains code very similar to this.</p>

				<blockquote>

				<table border="1">

				<tr><td><pre>

				<figure>

				<pre>

				void glVertex3f(GLfloat x, GLfloat y, GLfloat z)

				{

				    const struct _glapi_table * const dispatch = GET_DISPATCH();

				    (*dispatch-&gt;Vertex3f)(x, y, z);

				}</pre></td></tr>

				<tr><td>Sample dispatch function</td></tr></table>

				</blockquote>

				}

				</pre>

				<figcaption>Sample dispatch function</figcaption>

				</figure>

				<p>The problem with this simple implementation is the large amount of

				overhead that it adds to every GL function call.</p>

				<p>In a multithreaded environment, a naive implementation of

				<tt>GET_DISPATCH</tt> involves a call to <tt>pthread_getspecific</tt> or a

				<code>GET_DISPATCH</code> involves a call to <code>pthread_getspecific</code> or a

				similar function.  Mesa provides a wrapper function called

				<tt>_glapi_get_dispatch</tt> that is used by default.</p>

				<code>_glapi_get_dispatch</code> that is used by default.</p>

				<h2>3. Optimizations</h2>

				@@ -109,7 +109,7 @@ each can or cannot be used are listed.</p>

				<p>The vast majority of OpenGL applications use the API in a single threaded

				manner.  That is, the application has only one thread that makes calls into

				the GL.  In these cases, not only do the calls to

				<tt>pthread_getspecific</tt> hurt performance, but they are completely

				<code>pthread_getspecific</code> hurt performance, but they are completely

				unnecessary!  It is possible to detect this common case and avoid these

				calls.</p>

				@@ -118,56 +118,54 @@ of the executing thread.  If the same thread ID is always seen, Mesa knows

				that the application is, from OpenGL's point of view, single threaded.</p>

				<p>As long as an application is single threaded, Mesa stores a pointer to

				the dispatch table in a global variable called <tt>_glapi_Dispatch</tt>.

				the dispatch table in a global variable called <code>_glapi_Dispatch</code>.

				The pointer is also stored in a per-thread location via

				<tt>pthread_setspecific</tt>.  When Mesa detects that an application has

				become multithreaded, <tt>NULL</tt> is stored in <tt>_glapi_Dispatch</tt>.</p>

				<code>pthread_setspecific</code>.  When Mesa detects that an application has

				become multithreaded, <code>NULL</code> is stored in <code>_glapi_Dispatch</code>.</p>

				<p>Using this simple mechanism the dispatch functions can detect the

				multithreaded case by comparing <tt>_glapi_Dispatch</tt> to <tt>NULL</tt>.

				The resulting implementation of <tt>GET_DISPATCH</tt> is slightly more

				complex, but it avoids the expensive <tt>pthread_getspecific</tt> call in

				multithreaded case by comparing <code>_glapi_Dispatch</code> to <code>NULL</code>.

				The resulting implementation of <code>GET_DISPATCH</code> is slightly more

				complex, but it avoids the expensive <code>pthread_getspecific</code> call in

				the common case.</p>

				<blockquote>

				<table border="1">

				<tr><td><pre>

				<figure>

				<pre>

				#define GET_DISPATCH() \

				    (_glapi_Dispatch != NULL) \

				        ? _glapi_Dispatch : pthread_getspecific(&amp;_glapi_Dispatch_key)

				</pre></td></tr>

				<tr><td>Improved <tt>GET_DISPATCH</tt> Implementation</td></tr></table>

				</blockquote>

				</pre>

				<figcaption>Improved <code>GET_DISPATCH</code> Implementation</figcaption>

				</figure>

				<h3>3.2. ELF TLS</h3>

				<p>Starting with the 2.4.20 Linux kernel, each thread is allocated an area

				of per-thread, global storage.  Variables can be put in this area using some

				extensions to GCC.  By storing the dispatch table pointer in this area, the

				expensive call to <tt>pthread_getspecific</tt> and the test of

				<tt>_glapi_Dispatch</tt> can be avoided.</p>

				expensive call to <code>pthread_getspecific</code> and the test of

				<code>_glapi_Dispatch</code> can be avoided.</p>

				<p>The dispatch table pointer is stored in a new variable called

				<tt>_glapi_tls_Dispatch</tt>.  A new variable name is used so that a single

				<code>_glapi_tls_Dispatch</code>.  A new variable name is used so that a single

				libGL can implement both interfaces.  This allows the libGL to operate with

				direct rendering drivers that use either interface.  Once the pointer is

				properly declared, <tt>GET_DISPATCH</tt> becomes a simple variable

				properly declared, <code>GET_DISPATCH</code> becomes a simple variable

				reference.</p>

				<blockquote>

				<table border="1">

				<tr><td><pre>

				<figure>

				<pre>

				extern __thread struct _glapi_table *_glapi_tls_Dispatch

				    __attribute__((tls_model("initial-exec")));

				#define GET_DISPATCH() _glapi_tls_Dispatch

				</pre></td></tr>

				<tr><td>TLS <tt>GET_DISPATCH</tt> Implementation</td></tr></table>

				</blockquote>

				</pre>

				<figcaption>TLS <code>GET_DISPATCH</code> Implementation</figcaption>

				</figure>

				<p>Use of this path is controlled by the preprocessor define

				<tt>GLX_USE_TLS</tt>.  Any platform capable of using TLS should use this as

				the default dispatch method.</p>

				<code>USE_ELF_TLS</code>.  Any platform capable of using ELF TLS should use this

				as the default dispatch method.</p>

				<h3>3.3. Assembly Language Dispatch Stubs</h3>

				@@ -185,13 +183,13 @@ ways that the dispatch table pointer can be accessed.  There are four

				different methods that can be used:</p>

				<ol>

				<li>Using <tt>_glapi_Dispatch</tt> directly in builds for non-multithreaded

				<li>Using <code>_glapi_Dispatch</code> directly in builds for non-multithreaded

				environments.</li>

				<li>Using <tt>_glapi_Dispatch</tt> and <tt>_glapi_get_dispatch</tt> in

				<li>Using <code>_glapi_Dispatch</code> and <code>_glapi_get_dispatch</code> in

				multithreaded environments.</li>

				<li>Using <tt>_glapi_Dispatch</tt> and <tt>pthread_getspecific</tt> in

				<li>Using <code>_glapi_Dispatch</code> and <code>pthread_getspecific</code> in

				multithreaded environments.</li>

				<li>Using <tt>_glapi_tls_Dispatch</tt> directly in TLS enabled

				<li>Using <code>_glapi_tls_Dispatch</code> directly in TLS enabled

				multithreaded environments.</li>

				</ol>

				@@ -204,24 +202,23 @@ terribly relevant.</p>

				few preprocessor defines.</p>

				<ul>

				<li>If <tt>GLX_USE_TLS</tt> is defined, method #3 is used.</li>

				<li>If <tt>HAVE_PTHREAD</tt> is defined, method #2 is used.</li>

				<li>If <code>USE_ELF_TLS</code> is defined, method #3 is used.</li>

				<li>If <code>HAVE_PTHREAD</code> is defined, method #2 is used.</li>

				<li>If none of the preceding are defined, method #1 is used.</li>

				</ul>

				<p>Two different techniques are used to handle the various different cases.

				On x86 and SPARC, a macro called <tt>GL_STUB</tt> is used.  In the preamble

				On x86 and SPARC, a macro called <code>GL_STUB</code> is used.  In the preamble

				of the assembly source file different implementations of the macro are

				selected based on the defined preprocessor variables.  The assembly code

				then consists of a series of invocations of the macros such as:

				<blockquote>

				<table border="1">

				<tr><td><pre>

				<figure>

				<pre>

				GL_STUB(Color3fv, _gloffset_Color3fv)

				</pre></td></tr>

				<tr><td>SPARC Assembly Implementation of <tt>glColor3fv</tt></td></tr></table>

				</blockquote>

				</pre>

				<figcaption>SPARC Assembly Implementation of <code>glColor3fv</code></figcaption>

				</figure>

				<p>The benefit of this technique is that changes to the calling pattern

				(i.e., addition of a new dispatch table pointer access method) require fewer

				@@ -231,32 +228,32 @@ changed lines in the assembly code.</p>

				implementation does not change based on the parameters passed to the

				function.  For example, since x86 passes all parameters on the stack, no

				additional code is needed to save and restore function parameters around a

				call to <tt>pthread_getspecific</tt>.  Since x86-64 passes parameters in

				call to <code>pthread_getspecific</code>.  Since x86-64 passes parameters in

				registers, varying amounts of code need to be inserted around the call to

				<tt>pthread_getspecific</tt> to save and restore the GL function's

				<code>pthread_getspecific</code> to save and restore the GL function's

				parameters.</p>

				<p>The other technique, used by platforms like x86-64 that cannot use the

				first technique, is to insert <tt>#ifdef</tt> within the assembly

				first technique, is to insert <code>#ifdef</code> within the assembly

				implementation of each function.  This makes the assembly file considerably

				larger (e.g., 29,332 lines for <tt>glapi_x86-64.S</tt> versus 1,155 lines for

				<tt>glapi_x86.S</tt>) and causes simple changes to the function

				larger (e.g., 29,332 lines for <code>glapi_x86-64.S</code> versus 1,155 lines for

				<code>glapi_x86.S</code>) and causes simple changes to the function

				implementation to generate many lines of diffs.  Since the assembly files

				are typically generated by scripts (see <a href="#autogen">below</a>), this

				isn't a significant problem.</p>

				<p>Once a new assembly file is created, it must be inserted in the build

				system.  There are two steps to this.  The file must first be added to

				<tt>src/mesa/sources</tt>.  That gets the file built and linked.  The second

				step is to add the correct <tt>#ifdef</tt> magic to

				<tt>src/mesa/glapi/glapi_dispatch.c</tt> to prevent the C version of the

				<code>src/mesa/sources</code>.  That gets the file built and linked.  The second

				step is to add the correct <code>#ifdef</code> magic to

				<code>src/mesa/glapi/glapi_dispatch.c</code> to prevent the C version of the

				dispatch functions from being built.</p>

				<h3 id="fixedsize">3.4. Fixed-Length Dispatch Stubs</h3>

				<p>To implement <tt>glXGetProcAddress</tt>, Mesa stores a table that

				<p>To implement <code>glXGetProcAddress</code>, Mesa stores a table that

				associates function names with pointers to those functions.  This table is

				stored in <tt>src/mesa/glapi/glprocs.h</tt>.  For different reasons on

				stored in <code>src/mesa/glapi/glprocs.h</code>.  For different reasons on

				different platforms, storing all of those pointers is inefficient.  On most

				platforms, including all known platforms that support TLS, we can avoid this

				added overhead.</p>

				@@ -267,12 +264,10 @@ calculated by multiplying the size of the dispatch stub by the offset of the

				function in the table.  This value is then added to the address of the first

				dispatch stub.</p>

				<p>This path is activated by adding the correct <tt>#ifdef</tt> magic to

				<tt>src/mesa/glapi/glapi.c</tt> just before <tt>glprocs.h</tt> is

				<p>This path is activated by adding the correct <code>#ifdef</code> magic to

				<code>src/mesa/glapi/glapi.c</code> just before <code>glprocs.h</code> is

				included.</p>

				<h2 id="autogen">4. Automatic Generation of Dispatch Stubs</h2>

				</div>

				</body>

				</html>
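The table-based dispatch and the TLS variant of `GET_DISPATCH` described in the docs/dispatch.html hunks above can be tried out with a small C sketch. It is an illustration under stated assumptions, not Mesa's implementation: `demo_table`, `demo_tls_Dispatch`, `DEMO_GET_DISPATCH`, `demo_make_current`, and the two `demo_*_vertex3f` back ends are hypothetical names. The sketch relies on the GCC `__thread` extension and omits the `tls_model` attribute for portability.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct _glapi_table: one slot per entry point. */
struct demo_table {
    void (*Vertex3f)(float x, float y, float z);
};

/* Counters that record which back end handled a call. */
static int demo_direct_calls;
static int demo_indirect_calls;

/* Back end for a "direct rendering" context. */
void demo_direct_vertex3f(float x, float y, float z)
{
    (void) x; (void) y; (void) z;
    demo_direct_calls++;
}

/* Back end for an "indirect rendering" context. */
void demo_indirect_vertex3f(float x, float y, float z)
{
    (void) x; (void) y; (void) z;
    demo_indirect_calls++;
}

/* One dispatch table pointer per thread (GCC thread-local extension). */
static __thread struct demo_table *demo_tls_Dispatch = NULL;

/* With TLS, fetching the table is a plain variable reference. */
#define DEMO_GET_DISPATCH() demo_tls_Dispatch

/* Called when a context becomes current in the calling thread. */
void demo_make_current(struct demo_table *table)
{
    demo_tls_Dispatch = table;
}

/* Public entry point: fetch the table, look up the slot, call through it. */
void demo_vertex3f(float x, float y, float z)
{
    const struct demo_table *const dispatch = DEMO_GET_DISPATCH();
    assert(dispatch != NULL);   /* a context must be current */
    (*dispatch->Vertex3f)(x, y, z);
}
```

The same `demo_vertex3f` entry point reaches a different back end after each `demo_make_current` call, which is the context-independence property the document describes. Because the pointer is per-thread, two threads can have different tables current at once.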

									
										32

docs/download.html
									
												
				@@ -2,46 +2,48 @@

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Getting Mesa</title>

				  <title>Downloading and Unpacking</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Downloading</h1>

				<h1>Downloading and Unpacking</h1>

				<h2>Downloading</h2>

				<p>

				Primary Mesa download site:

				<a href="ftp://ftp.freedesktop.org/pub/mesa/">ftp.freedesktop.org</a> (FTP)

				or <a href="https://mesa.freedesktop.org/archive/">mesa.freedesktop.org</a>

				(HTTPS).

				You can download the released versions of Mesa via

				<a href="https://mesa.freedesktop.org/archive/">HTTPS</a>

				or

				<a href="ftp://ftp.freedesktop.org/pub/mesa/">FTP</a>.

				</p>

				<p>

				Starting with the first release of 2017, Mesa's version scheme is

				year-based. Filenames are in the form <tt>mesa-Y.N.P.tar.gz</tt>, where

				<tt>Y</tt> is the year (two digits), <tt>N</tt> is an incremental number

				(starting at 0) and <tt>P</tt> is the patch number (0 for the first

				year-based. Filenames are in the form <code>mesa-Y.N.P.tar.gz</code>, where

				<code>Y</code> is the year (two digits), <code>N</code> is an incremental number

				(starting at 0) and <code>P</code> is the patch number (0 for the first

				release, 1 for the first patch after that).

				</p>

				<p>

				When a new release is coming, release candidates (betas) may be found

				in the same directory, and are recognisable by the

				<tt>mesa-Y.N.P-<b>rc</b>X.tar.gz</tt> filename.

				<code>mesa-Y.N.P-<b>rc</b>X.tar.gz</code> filename.

				</p>

				<h1>Unpacking</h1>

				<h2>Unpacking</h2>

				<p>

				Mesa releases are available in two formats: <tt>.tar.xz</tt> and <tt>.tar.gz</tt>.

				Mesa releases are available in two formats: <code>.tar.xz</code> and <code>.tar.gz</code>.

				</p>

				<p>

				@@ -56,7 +58,7 @@ To unpack the tarball:

				</pre>

				<h1>Contents</h1>

				<h2>Contents</h2>

				<p>

				Proceed to the <a href="install.html">compilation and installation

				@@ -64,7 +66,7 @@ instructions</a>.

				</p>

				<h1>Demos, GLUT, and GLU</h1>

				<h2>Demos, GLUT, and GLU</h2>

				<p>

				A package of SGI's GLU library is available

									
										10

docs/egl.html
									
												
				@@ -2,19 +2,19 @@

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa EGL</title>

				  <title>EGL</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Mesa EGL</h1>

				<h1>EGL</h1>

				<p>The current version of EGL in Mesa implements EGL 1.4.  More information

				about EGL can be found at

				@@ -86,14 +86,12 @@ and <code>haiku</code>.

				The <code>android</code> platform can either be built as a system

				component, part of AOSP, using <code>Android.mk</code> files, or

				cross-compiled using appropriate options.

				The <code>haiku</code> platform can only be built with SCons or Meson.

				Unless for special needs, the build system should

				select the right platforms automatically.</p>

				</dd>

				<dt><code>-D gles1=true</code></dt>

				<dt><code>-D gles2=true</code></dt>

				<dt><code>-D gles1=true</code> and <code>-D gles2=true</code></dt>

				<dd>

				<p>These options enable OpenGL ES support in OpenGL.  The result is one big

947

docs/envvars.html


File diff suppressed because it is too large

									
										2

docs/extensions.html
									
												
				@@ -8,7 +8,7 @@

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

									
										142

docs/faq.html
									
												
				@@ -2,23 +2,21 @@

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa FAQ</title>

				  <title>Frequently Asked Questions</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Mesa Frequently Asked Questions</h1>

				<h1>Frequently Asked Questions</h1>

				Last updated: 19 September 2018

				<br>

				<br>

				<h2>Index</h2>

				<ol>

				  <li><a href="#part1">High-level Questions and Answers</a></li>

				@@ -26,14 +24,10 @@ Last updated: 19 September 2018

				  <li><a href="#part3">Runtime / Rendering Problems</a></li>

				  <li><a href="#part4">Developer Questions</a></li>

				</ol>

				<br>

				<br>

				<h2 id="part1">1. High-level Questions and Answers</h2>

				<h1 id="part1">1. High-level Questions and Answers</h1>

				<h2>1.1 What is Mesa?</h2>

				<h3>1.1 What is Mesa?</h3>

				<p>

				Mesa is an open-source implementation of the OpenGL specification.

				OpenGL is a programming library for writing interactive 3D applications.

				@@ -102,17 +96,17 @@ the Xlib API:

				<li>The GLX wire protocol is not supported and there's no OpenGL extension

				    loaded by the X server.

				<li>There is no hardware acceleration.

				<li>The OpenGL library, libGL.so, contains everything (the programming API,

				    the GLX functions and all the rendering code).

				<li>The OpenGL library, <code>libGL.so</code>, contains everything (the

				    programming API, the GLX functions and all the rendering code).

				</ul>

				<p>

				Alternately, Mesa acts as the core for a number of OpenGL hardware drivers

				within the DRI (Direct Rendering Infrastructure):

				<ul>

				<li>The libGL.so library provides the GL and GLX API functions, a GLX

				    protocol encoder, and a device driver loader.

				<li>The device driver modules (such as r200_dri.so) contain a built-in

				    copy of the core Mesa code.

				<li>The <code>libGL.so</code> library provides the GL and GLX API functions,

				    a GLX protocol encoder, and a device driver loader.

				<li>The device driver modules (such as <code>r200_dri.so</code>) contain

				    a built-in copy of the core Mesa code.

				<li>The X server loads the GLX module.

				    The GLX module decodes incoming GLX protocol and dispatches the commands

				    to a rendering module.

				@@ -132,7 +126,7 @@ Just follow the Mesa <a href="install.html">compilation instructions</a>.

				<h2>1.6 Are there other open-source implementations of OpenGL?</h2>

				<p>

				Yes, SGI's <a href="http://oss.sgi.com/projects/ogl-sample/index.html">

				Yes, SGI's <a href="http://web.archive.org/web/20171010115110_/http://oss.sgi.com/projects/ogl-sample/index.html">

				OpenGL Sample Implementation (SI)</a> is available.

				The SI was written during the time that OpenGL was originally designed.

				Unfortunately, development of the SI has stagnated.

				@@ -144,8 +138,9 @@ Mesa is much more up to date with modern features and extensions.

				an open-source implementation of OpenGL ES for mobile devices.

				<p>

				<a href="http://www.dsbox.com/minigl.html">miniGL</a>

				is a subset of OpenGL for PalmOS devices.

				<a href="http://web.archive.org/web/20130830162848/http://www.dsbox.com/minigl.html">miniGL</a>

				is a subset of OpenGL for PalmOS devices. The website is gone, but the source

				code can still be found on <a href="https://sourceforge.net/projects/minigl/">sourceforge.net</a>.

				<p>

				<a href="http://bellard.org/TinyGL/">TinyGL</a>

				@@ -175,22 +170,16 @@ popular and feature-complete.

				</p>

				<h2 id="part2">2. Compilation and Installation Problems</h2>

				<br>

				<br>

				<h1 id="part2">2. Compilation and Installation Problems</h1>

				<h2>2.1 What's the easiest way to install Mesa?</h2>

				<h3>2.1 What's the easiest way to install Mesa?</h3>

				<p>

				If you're using a Linux-based system, your distro CD most likely already

				has Mesa packages (like RPM or DEB) which you can easily install.

				</p>

				<h2>2.2 I get undefined symbols such as bgnpolygon, v3f, etc...</h2>

				<h3>2.2 I get undefined symbols such as bgnpolygon, v3f, etc...</h3>

				<p>

				Your application is written in IRIS GL, not OpenGL.

				IRIS GL was the predecessor to OpenGL and is a different thing (almost)

				@@ -199,38 +188,49 @@ Mesa's not the solution.

				</p>

				<h2>2.3 Where is the GLUT library?</h2>

				<h3>2.3 Where is the GLUT library?</h3>

				<p>

				GLUT (OpenGL Utility Toolkit) is no longer in the separate MesaGLUT-x.y.z.tar.gz file.

				GLUT (OpenGL Utility Toolkit) is no longer in the separate

				<code>MesaGLUT-x.y.z.tar.gz</code> file.

				If you don't already have GLUT installed, you should grab 

				<a href="http://freeglut.sourceforge.net/">freeglut</a>.

				</p>

				<h2>2.4 Where is the GLw library?</h2>

				<h3>2.4 Where is the GLw library?</h3>

				<p>

				GLw (OpenGL widget library) is now available from a separate <a href="https://cgit.freedesktop.org/mesa/glw/">git repository</a>.  Unless you're using very old Xt/Motif applications with OpenGL, you shouldn't need it.

				GLw (OpenGL widget library) is now available from a separate <a href="https://gitlab.freedesktop.org/mesa/glw">git repository</a>.  Unless you're using very old Xt/Motif applications with OpenGL, you shouldn't need it.

				</p>

				<h2>2.5 What's the proper place for the libraries and headers?</h2>

				<p>

				On Linux-based systems you'll want to follow the

				<a href="http://oss.sgi.com/projects/ogl-sample/ABI/index.html">Linux ABI</a> standard.

				<a href="https://www.khronos.org/registry/OpenGL/ABI/">Linux ABI</a> standard.

				Basically you'll want the following:

				</p>

				<ul>

				<li>/usr/include/GL/gl.h - the main OpenGL header

				</li><li>/usr/include/GL/glu.h - the OpenGL GLU (utility) header

				</li><li>/usr/include/GL/glx.h - the OpenGL GLX header

				</li><li>/usr/include/GL/glext.h - the OpenGL extensions header

				</li><li>/usr/include/GL/glxext.h - the OpenGL GLX extensions header

				</li><li>/usr/include/GL/osmesa.h - the Mesa off-screen rendering header

				</li><li>/usr/lib/libGL.so - a symlink to libGL.so.1

				</li><li>/usr/lib/libGL.so.1 - a symlink to libGL.so.1.xyz

				</li><li>/usr/lib/libGL.so.xyz - the actual OpenGL/Mesa library.  xyz denotes the

				<dl>

				<dt><code>/usr/include/GL/gl.h</code></dt>

				<dd>the main OpenGL header</dd>

				<dt><code>/usr/include/GL/glu.h</code></dt>

				<dd>the OpenGL GLU (utility) header</dd>

				<dt><code>/usr/include/GL/glx.h</code></dt>

				<dd>the OpenGL GLX header</dd>

				<dt><code>/usr/include/GL/glext.h</code></dt>

				<dd>the OpenGL extensions header</dd>

				<dt><code>/usr/include/GL/glxext.h</code></dt>

				<dd>the OpenGL GLX extensions header</dd>

				<dt><code>/usr/include/GL/osmesa.h</code></dt>

				<dd>the Mesa off-screen rendering header</dd>

				<dt><code>/usr/lib/libGL.so</code></dt>

				<dd>a symlink to <code>libGL.so.1</code></dd>

				<dt><code>/usr/lib/libGL.so.1</code></dt>

				<dd>a symlink to <code>libGL.so.1.xyz</code></dd>

				<dt><code>/usr/lib/libGL.so.xyz</code></dt>

				<dd>the actual OpenGL/Mesa library.  xyz denotes the

				Mesa version number.

				</li></ul>

				</dd>

				</dl>

				<p>

				When configuring Mesa, there are three meson options that affect the install

				location that you should take care with: <code>--prefix</code>,

				@@ -249,13 +249,11 @@ After determining the correct values for the install location, configure Mesa

				with <code>meson configure --prefix=/usr --libdir=xxx -D dri-drivers-path=xxx</code>

				and then install with <code>sudo ninja install</code>.

				</p>

				<br>

				<br>

				<h1 id="part3">3. Runtime / Rendering Problems</h1>

				<h2 id="part3">3. Runtime / Rendering Problems</h2>

				<h2>3.1 Rendering is slow / why isn't my graphics hardware being used?</h2>

				<h3>3.1 Rendering is slow / why isn't my graphics hardware being used?</h3>

				<p>

				If Mesa can't use its hardware accelerated drivers it falls back on one of its software renderers.

				(e.g. classic swrast, softpipe or llvmpipe)

				@@ -276,60 +274,57 @@ If your DRI-based driver isn't working, go to the

				</p>

				<h2>3.2 I'm seeing errors in depth (Z) buffering.  Why?</h2>

				<h3>3.2 I'm seeing errors in depth (Z) buffering.  Why?</h3>

				<p>

				Make sure the ratio of the far to near clipping planes isn't too great.

				Look

				<a href="https://www.opengl.org/resources/faq/technical/depthbuffer.htm#0040">here</a>

				<a href="https://www.opengl.org/archives/resources/faq/technical/depthbuffer.htm#0040">here</a>

				for details.

				</p>

				<p>

				Mesa uses a 16-bit depth buffer by default which is smaller and faster

				to clear than a 32-bit buffer but not as accurate.

				If you need a deeper buffer you can modify the parameters to

				<code> glXChooseVisual</code> in your code.

				<code>glXChooseVisual</code> in your code.

				</p>
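To make the far/near advice concrete, here is a rough sketch (not Mesa code) of how depth resolution degrades as the ratio grows. It assumes a standard perspective projection with the default depth range and an integer depth buffer. The helper names `window_depth` and `depth_step` are made up for this illustration.

```python
def window_depth(d, near, far):
    """Map an eye-space distance d (near <= d <= far) to window depth in [0, 1].

    Assumes a standard perspective projection and the default depth range.
    """
    return (1.0 / near - 1.0 / d) / (1.0 / near - 1.0 / far)


def depth_step(d, near, far, bits):
    """Approximate eye-space distance covered by one depth-buffer step at d.

    First-order estimate: one step is 1 / (2**bits - 1) in window depth,
    divided by the slope of window_depth at d.
    """
    levels = (1 << bits) - 1
    return d * d * (1.0 / near - 1.0 / far) / levels


if __name__ == "__main__":
    # Compare a small near plane against a larger one, at 16 and 24 bits.
    for near in (0.01, 1.0):
        for bits in (16, 24):
            step = depth_step(500.0, near, 1000.0, bits)
            print(f"near={near} far=1000 bits={bits}: step at d=500 is {step:.4g}")
```

Under these assumptions, moving the near plane closer to the eye costs far more precision than extending the far plane, and a deeper buffer shrinks the step by the ratio of available depth levels.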

				<h2>3.3 Why isn't depth buffering working at all?</h2>

				<h3>3.3 Why isn't depth buffering working at all?</h3>

				<p>

				Be sure you're requesting a depth-buffered visual.  If you set the MESA_DEBUG

				environment variable it will warn you about trying to enable depth testing

				when you don't have a depth buffer.

				Be sure you're requesting a depth-buffered visual.  If you set the

				<code>MESA_DEBUG</code> environment variable it will warn you about trying

				to enable depth testing when you don't have a depth buffer.

				</p>

				<p>Specifically, make sure <code>glutInitDisplayMode</code> is being called

				with <code>GLUT_DEPTH</code> or <code>glXChooseVisual</code> is being

				called with a non-zero value for GLX_DEPTH_SIZE.

				called with a non-zero value for <code>GLX_DEPTH_SIZE</code>.

				</p>

				<p>This discussion applies to stencil buffers, accumulation buffers and

				alpha channels too.

				</p>

				<h2>3.4 Why does glGetString() always return NULL?</h2>

				<h3>3.4 Why does <code>glGetString()</code> always return <code>NULL</code>?</h3>

				<p>

				Be sure you have an active/current OpenGL rendering context before

				calling glGetString.

				calling <code>glGetString</code>.

				</p>

				<h2>3.5 GL_POINTS and GL_LINES don't touch the right pixels</h2>

				<h3>3.5 <code>GL_POINTS</code> and <code>GL_LINES</code> don't touch the

				right pixels</h3>

				<p>

				If you're trying to draw a filled region by using GL_POINTS or GL_LINES

				and seeing holes or gaps it's because of a float-to-int rounding problem.

				But this is not a bug.

				See Appendix H of the OpenGL Programming Guide - "OpenGL Correctness Tips".

				Basically, applying a translation of (0.375, 0.375, 0.0) to your coordinates

				will fix the problem.

				If you're trying to draw a filled region by using <code>GL_POINTS</code> or

				<code>GL_LINES</code> and seeing holes or gaps it's because of a float-to-int

				rounding problem. But this is not a bug. See Appendix H of the OpenGL

				Programming Guide - "OpenGL Correctness Tips". Basically, applying a

				translation of (0.375, 0.375, 0.0) to your coordinates will fix the problem.

				</p>

				<br>

				<br>

				<h2 id="part4">4. Developer Questions</h2>

				<h1 id="part4">4. Developer Questions</h1>

				<h2>4.1 How can I contribute?</h2>

				<h3>4.1 How can I contribute?</h3>

				<p>

				First, join the <a href="lists.html">mesa-dev mailing list</a>.

				That's where Mesa development is discussed.

				@@ -343,7 +338,7 @@ You should read it.

				extensions, writing hardware drivers (for the DRI), and code optimization.

				</p>

				<h2>4.2 How do I write a new device driver?</h2>

				<h3>4.2 How do I write a new device driver?</h3>

				<p>

				Unfortunately, writing a device driver isn't easy.

				It requires detailed understanding of OpenGL, the Mesa code, and your

				@@ -367,7 +362,8 @@ the archives) is a good way to get information.

				</p>

				<h2>4.3 Why isn't GL_EXT_texture_compression_s3tc implemented in Mesa?</h2>

				<h3>4.3 Why isn't <code>GL_EXT_texture_compression_s3tc</code> implemented in

				Mesa?</h3>

				<p>

				Oh but it is! Prior to 2nd October 2017, the Mesa project did not include s3tc

				support due to intellectual property (IP) and/or patent issues around the s3tc


docs/features.txt


@@ -118,23 +118,23 @@ GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
   - 'precise' qualifier                                 DONE (softpipe)
   - Dynamically uniform sampler array indices           DONE (softpipe)
   - Dynamically uniform UBO array indices               DONE (freedreno, softpipe)
   - Implicit signed -> unsigned conversions             DONE (softpipe)
   - Fused multiply-add                                  DONE (softpipe)
   - Packing/bitfield/conversion functions               DONE (freedreno, softpipe)
   - Enhanced textureGather                              DONE (freedreno, softpipe)
   - Geometry shader instancing                          DONE (llvmpipe, softpipe)
   - Geometry shader multiple streams                    DONE (softpipe)
   - Implicit signed -> unsigned conversions             DONE (softpipe, swr)
   - Fused multiply-add                                  DONE (softpipe, swr)
   - Packing/bitfield/conversion functions               DONE (freedreno, softpipe, swr)
   - Enhanced textureGather                              DONE (freedreno, softpipe, swr)
   - Geometry shader instancing                          DONE (llvmpipe, softpipe, swr)
   - Geometry shader multiple streams                    DONE (softpipe, swr)
   - Enhanced per-sample shading                         DONE ()
   - Interpolation functions                             DONE (softpipe)
   - New overload resolution rules                       DONE (softpipe)
   GL_ARB_gpu_shader_fp64                                DONE (i965/gen7+, llvmpipe, softpipe)
   GL_ARB_gpu_shader_fp64                                DONE (i965/gen7+, llvmpipe, softpipe, swr)
   GL_ARB_sample_shading                                 DONE (freedreno/a6xx, i965/gen6+, nv50)
   GL_ARB_shader_subroutine                              DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_tessellation_shader                            DONE (i965/gen7+)
   GL_ARB_tessellation_shader                            DONE (i965/gen7+, swr)
   GL_ARB_texture_buffer_object_rgb32                    DONE (freedreno, i965/gen6+, llvmpipe, softpipe, swr)
   GL_ARB_texture_cube_map_array                         DONE (i965/gen6+, nv50, llvmpipe, softpipe)
   GL_ARB_texture_cube_map_array                         DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_texture_gather                                 DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_texture_query_lod                              DONE (freedreno, i965, nv50, llvmpipe, softpipe)
   GL_ARB_texture_query_lod                              DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_transform_feedback2                            DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_transform_feedback3                            DONE (i965/gen7+, llvmpipe, softpipe, swr)
@@ -145,19 +145,19 @@ GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
   GL_ARB_get_program_binary                             DONE (0 or 1 binary formats)
   GL_ARB_separate_shader_objects                        DONE (all drivers)
   GL_ARB_shader_precision                               DONE (i965/gen7+, all drivers that support GLSL 4.10)
   GL_ARB_vertex_attrib_64bit                            DONE (i965/gen7+, llvmpipe, softpipe)
   GL_ARB_viewport_array                                 DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_vertex_attrib_64bit                            DONE (i965/gen7+, llvmpipe, softpipe, swr)
   GL_ARB_viewport_array                                 DONE (i965, nv50, llvmpipe, softpipe, swr)
 GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
   GL_ARB_texture_compression_bptc                       DONE (freedreno, i965)
   GL_ARB_texture_compression_bptc                       DONE (freedreno, i965, llvmpipe, softpipe, swr)
   GL_ARB_compressed_texture_pixel_storage               DONE (all drivers)
   GL_ARB_shader_atomic_counters                         DONE (freedreno/a5xx+, i965, softpipe)
   GL_ARB_shader_atomic_counters                         DONE (freedreno/a5xx+, i965, llvmpipe, softpipe)
   GL_ARB_texture_storage                                DONE (all drivers)
   GL_ARB_transform_feedback_instanced                   DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_base_instance                                  DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_shader_image_load_store                        DONE (freedreno/a5xx+, i965, softpipe)
   GL_ARB_shader_image_load_store                        DONE (freedreno/a5xx+, i965, llvmpipe, softpipe)
   GL_ARB_conservative_depth                             DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_420pack                       DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_packing                       DONE (all drivers)
@@ -170,21 +170,21 @@ GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, r600, radeonsi, virgl
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_ES3_compatibility                              DONE (all drivers that support GLSL 3.30)
   GL_ARB_clear_buffer_object                            DONE (all drivers)
   GL_ARB_compute_shader                                 DONE (freedreno/a5xx+, i965, softpipe)
   GL_ARB_copy_image                                     DONE (i965, nv50, softpipe, llvmpipe)
   GL_ARB_compute_shader                                 DONE (freedreno/a5xx+, i965, llvmpipe, softpipe)
   GL_ARB_copy_image                                     DONE (i965, nv50, softpipe, llvmpipe, swr)
   GL_KHR_debug                                          DONE (all drivers)
   GL_ARB_explicit_uniform_location                      DONE (all drivers that support GLSL)
   GL_ARB_fragment_layer_viewport                        DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_framebuffer_no_attachments                     DONE (freedreno, i965, softpipe)
   GL_ARB_fragment_layer_viewport                        DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_framebuffer_no_attachments                     DONE (freedreno, i965, llvmpipe, softpipe)
   GL_ARB_internalformat_query2                          DONE (all drivers)
   GL_ARB_invalidate_subdata                             DONE (all drivers)
   GL_ARB_multi_draw_indirect                            DONE (freedreno, i965, llvmpipe, softpipe, swr)
   GL_ARB_program_interface_query                        DONE (all drivers)
   GL_ARB_robust_buffer_access_behavior                  DONE (i965)
   GL_ARB_shader_image_size                              DONE (freedreno/a5xx+, i965, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (freedreno/a5xx+, i965, softpipe)
   GL_ARB_shader_image_size                              DONE (freedreno/a5xx+, i965, llvmpipe, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (freedreno/a5xx+, i965, llvmpipe, softpipe)
   GL_ARB_stencil_texturing                              DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_texture_buffer_range                           DONE (freedreno, nv50, i965, softpipe, llvmpipe)
   GL_ARB_texture_buffer_range                           DONE (freedreno, nv50, i965, softpipe, llvmpipe, swr)
   GL_ARB_texture_query_levels                           DONE (all drivers that support GLSL 1.30)
   GL_ARB_texture_storage_multisample                    DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_texture_view                                   DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
@@ -204,38 +204,38 @@ GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, r600, radeonsi
   - specified transform/feedback layout                 DONE
   - input/output block locations                        DONE
   GL_ARB_multi_bind                                     DONE (all drivers)
   GL_ARB_query_buffer_object                            DONE (i965/hsw+, virgl)
   GL_ARB_query_buffer_object                            DONE (i965/hsw+, llvmpipe, virgl)
   GL_ARB_texture_mirror_clamp_to_edge                   DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
   GL_ARB_texture_stencil8                               DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr, virgl)
   GL_ARB_vertex_type_10f_11f_11f_rev                    DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
 GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi
 GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi, r600
   GL_ARB_ES3_1_compatibility                            DONE (i965/hsw+, r600, softpipe, virgl)
   GL_ARB_clip_control                                   DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_conditional_render_inverted                    DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr, virgl)
   GL_ARB_cull_distance                                  DONE (i965, nv50, r600, llvmpipe, softpipe, swr, virgl)
   GL_ARB_derivative_control                             DONE (i965, nv50, r600, virgl)
   GL_ARB_ES3_1_compatibility                            DONE (i965/hsw+, softpipe, virgl)
   GL_ARB_clip_control                                   DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_conditional_render_inverted                    DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr, virgl)
   GL_ARB_cull_distance                                  DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
   GL_ARB_derivative_control                             DONE (i965, nv50, llvmpipe, softpipe, virgl)
   GL_ARB_direct_state_access                            DONE (all drivers)
   GL_ARB_get_texture_sub_image                          DONE (all drivers)
   GL_ARB_shader_texture_image_samples                   DONE (i965, nv50, r600, virgl)
   GL_ARB_texture_barrier                                DONE (freedreno, i965, nv50, r600, virgl)
   GL_ARB_shader_texture_image_samples                   DONE (i965, nv50, virgl)
   GL_ARB_texture_barrier                                DONE (freedreno, i965, nv50, virgl)
   GL_KHR_context_flush_control                          DONE (all - but needs GLX/EGL extension to be useful)
   GL_KHR_robustness                                     DONE (freedreno, i965)
   GL_EXT_shader_integer_mix                             DONE (all drivers that support GLSL)
 GL 4.6, GLSL 4.60
 GL 4.6, GLSL 4.60 -- all DONE: radeonsi
   GL_ARB_gl_spirv                                       in progress (Nicolai Hähnle, Ian Romanick)
   GL_ARB_indirect_parameters                            DONE (i965/gen7+, nvc0, radeonsi, virgl)
   GL_ARB_pipeline_statistics_query                      DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_polygon_offset_clamp                           DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, swr, virgl)
   GL_ARB_shader_atomic_counter_ops                      DONE (freedreno/a5xx+, i965/gen7+, nvc0, r600, radeonsi, softpipe, virgl)
   GL_ARB_shader_draw_parameters                         DONE (i965, nvc0, radeonsi)
   GL_ARB_shader_group_vote                              DONE (i965, nvc0, radeonsi)
   GL_ARB_spirv_extensions                               in progress (Nicolai Hähnle, Ian Romanick)
   GL_ARB_texture_filter_anisotropic                     DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, softpipe (*), llvmpipe (*))
   GL_ARB_transform_feedback_overflow_query              DONE (i965/gen6+, nvc0, radeonsi, llvmpipe, softpipe, virgl)
   GL_ARB_gl_spirv                                       DONE (i965/gen7+)
   GL_ARB_indirect_parameters                            DONE (i965/gen7+, nvc0, llvmpipe, virgl)
   GL_ARB_pipeline_statistics_query                      DONE (i965, nvc0, r600, llvmpipe, softpipe, swr)
   GL_ARB_polygon_offset_clamp                           DONE (freedreno, i965, nv50, nvc0, r600, llvmpipe, swr, virgl)
   GL_ARB_shader_atomic_counter_ops                      DONE (freedreno/a5xx+, i965/gen7+, nvc0, r600, llvmpipe, softpipe, virgl)
   GL_ARB_shader_draw_parameters                         DONE (i965, llvmpipe, nvc0)
   GL_ARB_shader_group_vote                              DONE (i965, nvc0, llvmpipe)
   GL_ARB_spirv_extensions                               DONE (i965/gen7+)
   GL_ARB_texture_filter_anisotropic                     DONE (freedreno, i965, nv50, nvc0, r600, softpipe (*), llvmpipe (*))
   GL_ARB_transform_feedback_overflow_query              DONE (i965/gen6+, nvc0, llvmpipe, softpipe, virgl)
   GL_KHR_no_error                                       DONE (all drivers)
 (*) softpipe and llvmpipe advertise 16x anisotropy but simply ignore the setting
@@ -244,15 +244,15 @@ These are the extensions cherry-picked to make GLES 3.1
 GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi, virgl
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_compute_shader                                 DONE (freedreno/a5xx+, i965/gen7+, softpipe)
   GL_ARB_compute_shader                                 DONE (freedreno/a5xx+, i965/gen7+, llvmpipe, softpipe)
   GL_ARB_draw_indirect                                  DONE (freedreno, i965/gen7+, llvmpipe, softpipe, swr)
   GL_ARB_explicit_uniform_location                      DONE (all drivers that support GLSL)
   GL_ARB_framebuffer_no_attachments                     DONE (freedreno, i965/gen7+, softpipe)
   GL_ARB_framebuffer_no_attachments                     DONE (freedreno, i965/gen7+, llvmpipe, softpipe)
   GL_ARB_program_interface_query                        DONE (all drivers)
   GL_ARB_shader_atomic_counters                         DONE (freedreno/a5xx+, i965/gen7+, softpipe)
   GL_ARB_shader_image_load_store                        DONE (freedreno/a5xx+, i965/gen7+, softpipe)
   GL_ARB_shader_image_size                              DONE (freedreno/a5xx+, i965/gen7+, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (freedreno/a5xx+, i965/gen7+, softpipe)
   GL_ARB_shader_atomic_counters                         DONE (freedreno/a5xx+, i965/gen7+, llvmpipe, softpipe)
   GL_ARB_shader_image_load_store                        DONE (freedreno/a5xx+, i965/gen7+, llvmpipe, softpipe)
   GL_ARB_shader_image_size                              DONE (freedreno/a5xx+, i965/gen7+, llvmpipe, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (freedreno/a5xx+, i965/gen7+, llvmpipe, softpipe)
   GL_ARB_shading_language_packing                       DONE (all drivers)
   GL_ARB_separate_shader_objects                        DONE (all drivers)
   GL_ARB_stencil_texturing                              DONE (freedreno, nv50, llvmpipe, softpipe, swr)
@@ -303,14 +303,15 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
   GL_ARB_fragment_shader_interlock                      DONE (i965)
   GL_ARB_gpu_shader_int64                               DONE (i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe)
   GL_ARB_parallel_shader_compile                        DONE (all drivers)
   GL_ARB_post_depth_coverage                            DONE (i965, nvc0)
   GL_ARB_post_depth_coverage                            DONE (i965, nvc0, radeonsi)
   GL_ARB_robustness_isolation                           not started
   GL_ARB_sample_locations                               DONE (nvc0)
   GL_ARB_seamless_cubemap_per_texture                   DONE (freedreno, i965, nvc0, radeonsi, r600, softpipe, swr, virgl)
   GL_ARB_seamless_cubemap_per_texture                   DONE (etnaviv/SEAMLESS_CUBE_MAP, freedreno, i965, nvc0, radeonsi, r600, softpipe, swr, virgl)
   GL_ARB_shader_ballot                                  DONE (i965/gen8+, nvc0, radeonsi)
   GL_ARB_shader_clock                                   DONE (i965/gen7+, nv50, nvc0, r600, radeonsi, virgl)
   GL_ARB_shader_stencil_export                          DONE (i965/gen9+, r600, radeonsi, softpipe, llvmpipe, swr, virgl)
   GL_ARB_shader_viewport_layer_array                    DONE (i965/gen6+, nvc0, radeonsi)
   GL_ARB_shading_language_include                       DONE
   GL_ARB_sparse_buffer                                  DONE (radeonsi/CIK+)
   GL_ARB_sparse_texture                                 not started
   GL_ARB_sparse_texture2                                not started
@@ -347,54 +348,54 @@ Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES ve
   GLX_ARB_robustness_share_group_isolation              not started
 GL_EXT_direct_state_access subfeatures (in the spec order):
   GL 1.1: Client commands                               not started
   GL 1.0-1.3: Matrix and transpose matrix commands      not started
   GL 1.1-1.2: Texture commands                          not started
   GL 1.2: 3D texture commands                           not started
   GL 1.2.1: Multitexture commands                       not started
   GL 1.2.1-3.0: Indexed texture commands                not started
   GL 1.2.1-3.0: Indexed generic queries                 not started
   GL 1.2.1: EnableIndexed.. Get*Indexed                 not started
   GL_ARB_vertex_program                                 not started
   GL 1.3: Compressed texture and multitexture commands  not started
   GL 1.5: Buffer commands                               not started
   GL 2.0-2.1: Uniform and uniform matrix commands       not started
   GL_EXT_texture_buffer_object                          not started
   GL_EXT_texture_integer                                not started
   GL_EXT_gpu_shader4                                    not started
   GL_EXT_gpu_program_parameters                         not started
   GL 1.1: Client commands                               DONE
   GL 1.0-1.3: Matrix and transpose matrix commands      DONE
   GL 1.1-1.2: Texture commands                          DONE
   GL 1.2: 3D texture commands                           DONE
   GL 1.2.1: Multitexture commands                       DONE
   GL 1.2.1-3.0: Indexed texture commands                DONE
   GL 1.2.1-3.0: Indexed generic queries                 DONE
   GL 1.2.1: EnableIndexed.. Get*Indexed                 DONE
   GL_ARB_vertex_program                                 DONE
   GL 1.3: Compressed texture and multitexture commands  DONE
   GL 1.5: Buffer commands                               DONE
   GL 2.0-2.1: Uniform and uniform matrix commands       DONE
   GL_EXT_texture_buffer_object                          DONE
   GL_EXT_texture_integer                                DONE
   GL_EXT_gpu_shader4                                    DONE
   GL_EXT_gpu_program_parameters                         DONE
   GL_NV_gpu_program4                                    n/a
   GL_NV_framebuffer_multisample_coverage                n/a
   GL 3.0: Renderbuffer/framebuffer commands, Gen*Mipmap not started
   GL 3.0: CopyBuffer command                            not started
   GL_EXT_geometry_shader4 commands (expose in GL 3.2)   not started
   GL 3.0: Renderbuffer/framebuffer commands, Gen*Mipmap DONE
   GL 3.0: CopyBuffer command                            DONE
   GL_EXT_geometry_shader4 commands (expose in GL 3.2)   DONE
   GL_NV_explicit_multisample                            n/a
   GL 3.0: Vertex array/attrib/query/map commands        not started
   Matrix GL tokens                                      not started
   GL 3.0: Vertex array/attrib/query/map commands        DONE
   Matrix GL tokens                                      DONE
 GL_EXT_direct_state_access additions from other extensions (complete list):
   GL_AMD_framebuffer_sample_positions                   n/a
   GL_AMD_gpu_shader_int64                               not started
   GL_ARB_bindless_texture                               not started
   GL_ARB_buffer_storage                                 not started
   GL_ARB_clear_buffer_object                            not started
   GL_ARB_framebuffer_no_attachments                     not started
   GL_ARB_gpu_shader_fp64                                not started
   GL_ARB_instanced_arrays                               not started
   GL_ARB_internalformat_query2                          not started
   GL_AMD_gpu_shader_int64                               n/a (not enabled in compat profile)
   GL_ARB_bindless_texture                               DONE
   GL_ARB_buffer_storage                                 DONE
   GL_ARB_clear_buffer_object                            DONE
   GL_ARB_framebuffer_no_attachments                     DONE
   GL_ARB_gpu_shader_fp64                                DONE
   GL_ARB_instanced_arrays                               DONE
   GL_ARB_internalformat_query2                          DONE
   GL_ARB_sparse_texture                                 n/a
   GL_ARB_sparse_buffer                                  not started
   GL_ARB_texture_buffer_range                           not started
   GL_ARB_texture_storage                                not started
   GL_ARB_texture_storage_multisample                    not started
   GL_ARB_vertex_attrib_64bit                            not started
   GL_ARB_vertex_attrib_binding                          not started
   GL_EXT_buffer_storage                                 not started
   GL_EXT_external_buffer                                not started
   GL_ARB_sparse_buffer                                  DONE
   GL_ARB_texture_buffer_range                           DONE
   GL_ARB_texture_storage                                DONE
   GL_ARB_texture_storage_multisample                    DONE
   GL_ARB_vertex_attrib_64bit                            DONE
   GL_ARB_vertex_attrib_binding                          DONE
   GL_EXT_buffer_storage                                 DONE
   GL_EXT_external_buffer                                n/a
   GL_EXT_separate_shader_objects                        n/a
   GL_EXT_sparse_texture                                 n/a
   GL_EXT_texture_storage                                n/a
   GL_EXT_vertex_attrib_64bit                            not started
   GL_EXT_vertex_attrib_64bit                            DONE
   GL_EXT_EGL_image_storage                              n/a
   GL_NV_bindless_texture                                n/a
   GL_NV_gpu_shader5                                     n/a
@@ -408,7 +409,6 @@ we DO NOT WANT implementations of these extensions for Mesa.
   GL_ARB_geometry_shader4                               Superseded by GL 3.2 geometry shaders
   GL_ARB_matrix_palette                                 Superseded by GL_ARB_vertex_program
   GL_ARB_shading_language_include                       Not interesting
   GL_ARB_shadow_ambient                                 Superseded by GL_ARB_fragment_program
   GL_ARB_vertex_blend                                   Superseded by GL_ARB_vertex_program
@@ -416,7 +416,7 @@ Vulkan 1.0 -- all DONE: anv, radv
 Vulkan 1.1 -- all DONE: anv, radv
   VK_KHR_16bit_storage                                  in progress (Alejandro)
   VK_KHR_16bit_storage                                  DONE (anv/gen8+, radv)
   VK_KHR_bind_memory2                                   DONE (anv, radv)
   VK_KHR_dedicated_allocation                           DONE (anv, radv)
   VK_KHR_descriptor_update_template                     DONE (anv, radv)
@@ -435,18 +435,21 @@ Vulkan 1.1 -- all DONE: anv, radv
   VK_KHR_maintenance3                                   DONE (anv, radv)
   VK_KHR_multiview                                      DONE (anv, radv)
   VK_KHR_relaxed_block_layout                           DONE (anv, radv)
   VK_KHR_sampler_ycbcr_conversion                       DONE (anv)
   VK_KHR_sampler_ycbcr_conversion                       DONE (anv, radv)
   VK_KHR_shader_draw_parameters                         DONE (anv, radv)
   VK_KHR_storage_buffer_storage_class                   DONE (anv, radv)
   VK_KHR_variable_pointers                              DONE (anv, radv)
 Khronos extensions that are not part of any Vulkan version:
   VK_KHR_8bit_storage                                   DONE (anv, radv)
   VK_KHR_8bit_storage                                   DONE (anv/gen8+, radv)
   VK_KHR_android_surface                                not started
   VK_KHR_create_renderpass2                             DONE (anv, radv)
   VK_KHR_depth_stencil_resolve                          DONE (anv, radv)
   VK_KHR_display                                        DONE (anv, radv)
   VK_KHR_display_swapchain                              DONE (anv, radv)
   VK_KHR_draw_indirect_count                            DONE (radv)
   VK_KHR_display_swapchain                              not started
   VK_KHR_draw_indirect_count                            DONE (anv, radv)
   VK_KHR_driver_properties                              DONE (anv, radv)
   VK_KHR_external_fence_fd                              DONE (anv, radv)
   VK_KHR_external_fence_win32                           not started
   VK_KHR_external_memory_fd                             DONE (anv, radv)
@@ -456,13 +459,23 @@ Khronos extensions that are not part of any Vulkan version:
   VK_KHR_get_display_properties2                        DONE (anv, radv)
   VK_KHR_get_surface_capabilities2                      DONE (anv, radv)
   VK_KHR_image_format_list                              DONE (anv, radv)
   VK_KHR_imageless_framebuffer                          DONE (anv, radv)
   VK_KHR_incremental_present                            DONE (anv, radv)
   VK_KHR_mir_surface                                    not started
   VK_KHR_pipeline_executable_properties                 DONE (anv, radv)
   VK_KHR_push_descriptor                                DONE (anv, radv)
   VK_KHR_sampler_mirror_clamp_to_edge                   DONE (anv, radv)
   VK_KHR_shader_atomic_int64                            DONE (anv, radv)
   VK_KHR_shader_float16_int8                            DONE (anv/gen8+, radv)
   VK_KHR_shader_float_controls                          DONE (anv/gen8+, radv)
   VK_KHR_shader_subgroup_extended_types                 DONE (radv)
   VK_KHR_shared_presentable_image                       not started
   VK_KHR_surface                                        DONE (anv, radv)
   VK_KHR_surface_protected_capabilities                 DONE (anv, radv)
   VK_KHR_swapchain                                      DONE (anv, radv)
   VK_KHR_swapchain_mutable_format                       DONE (anv, radv)
   VK_KHR_uniform_buffer_standard_layout                 DONE (anv, radv)
   VK_KHR_vulkan_memory_model                            not started
   VK_KHR_wayland_surface                                DONE (anv, radv)
   VK_KHR_win32_keyed_mutex                              not started
   VK_KHR_win32_surface                                  not started
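
The status lines in this features.txt hunk follow a regular shape: an extension name, then `not started`, `in progress (...)`, or `DONE (driver list)`. A minimal sketch of parsing them, assuming that shape holds for every line. The function name `parse_feature_line` and the returned tuple layout are illustrative and are not part of Mesa:

```python
import re

# Assumed line shape: "<indent><NAME><spaces><STATUS> [(<detail>, ...)]".
# This is inferred from the lines shown in the hunk, not from any Mesa tooling.
_STATUS_LINE = re.compile(
    r"^\s*(?P<name>VK_\w+)\s+(?P<status>DONE|in progress|not started)"
    r"(?:\s*\((?P<detail>[^)]*)\))?\s*$"
)


def parse_feature_line(line):
    """Return (name, status, details), or None if this is not a status line."""
    match = _STATUS_LINE.match(line)
    if match is None:
        return None
    detail = match.group("detail")
    # Details are the comma-separated entries inside the parentheses, if any.
    details = [item.strip() for item in detail.split(",")] if detail else []
    return match.group("name"), match.group("status"), details


if __name__ == "__main__":
    print(parse_feature_line("  VK_KHR_8bit_storage    DONE (anv/gen8+, radv)"))
    print(parse_feature_line("  VK_KHR_mir_surface     not started"))
```

Hunk headers and section titles such as "Khronos extensions that are not part of any Vulkan version:" do not match the pattern and return `None`, so a caller can skip them.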

									
										17

docs/helpwanted.html
									
												
				@@ -8,13 +8,13 @@

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Help Wanted / To-Do List</h1>

				<h1>Help Wanted</h1>

				<p>

				We can always use more help with the Mesa project.

				@@ -29,11 +29,11 @@ immediately checked into git because not enough people are testing them.

				Just applying patches, testing and reporting back is helpful.

				<li>

				<b>Driver debugging.</b>

				There are plenty of open bugs in the <a href="https://bugs.freedesktop.org/describecomponents.cgi?product=Mesa">bug database</a>.

				There are plenty of open bugs in the <a href="https://gitlab.freedesktop.org/mesa/mesa/issues">bug database</a>.

				<li>

				<b>Remove aliasing warnings.</b>

				Enable gcc -Wstrict-aliasing=2 -fstrict-aliasing and track down aliasing

				issues in the code.

				Enable gcc's <code>-Wstrict-aliasing=2 -fstrict-aliasing</code> arguments, and

				track down aliasing issues in the code.

				<li>

				<b>Contribute more tests to

				<a href="https://piglit.freedesktop.org/">Piglit</a>.</b>

				@@ -48,7 +48,8 @@ You can find some further To-do lists here:

				</p>

				<ul>

				  <li><a href="https://gitlab.freedesktop.org/mesa/mesa/blob/master/docs/features.txt">

				    <b>features.txt</b></a> - Status of OpenGL 3.x / 4.x features in Mesa.</li>

				    <code>features.txt</code></a> - Status of OpenGL 3.x / 4.x features in

				    Mesa.</li>

				</ul>

				<p>

				@@ -56,9 +57,9 @@ You can find some further To-do lists here:

				</p>

				<ul>

				  <li><a href="https://dri.freedesktop.org/wiki/R600ToDo">

				    <b>r600g</b></a> - Driver for ATI/AMD R600 - Northern Island.</li>

				    <code>r600g</code></a> - Driver for ATI/AMD R600 - Northern Island.</li>

				  <li><a href="https://dri.freedesktop.org/wiki/R300ToDo">

				    <b>r300g</b></a> - Driver for ATI R300 - R500.</li>

				    <code>r300g</code></a> - Driver for ATI R300 - R500.</li>

				</ul>

				<p>

									
										212

docs/index.html
									
												
				@@ -8,13 +8,117 @@

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>News</h1>

				<h2>January 28, 2020</h2><p><a href="relnotes/19.3.3.html">Mesa 19.3.3</a> is released. This is a bug fix release.</p><h2>January 9, 2020</h2><p><a href="relnotes/19.3.2.html">Mesa 19.3.2</a> is released. This is a bug fix release.</p><h2>December 18, 2019</h2><p><a href="relnotes/19.2.8.html">Mesa 19.2.8</a> is released. This is a bug fix release.</p><h2>December 18, 2019</h2><p><a href="relnotes/19.3.1.html">Mesa 19.3.1</a> is released. This is a bug fix release.</p><h2>December 12, 2019</h2><p><a href="relnotes/19.3.0.html">Mesa 19.3.0</a> is released. This is a new development release. See the release notes for mor information about this release.</p><h2>December 4, 2019</h2><p><a href="relnotes/19.2.7.html">Mesa 19.2.7</a> is released. This is a bug fix release.</p><h2>November 21, 2019</h2><p><a href="relnotes/19.2.6.html">Mesa 19.2.6</a> is released. This is a bug fix release.</p><h2>November 20, 2019</h2><p><a href="relnotes/19.2.5.html">Mesa 19.2.5</a> is released. This is a bug fix release.</p><h2>November 13, 2019</h2><p><a href="relnotes/19.2.4.html">Mesa 19.2.4</a> is released. This is an emergency bugfix release, all users of 19.2.3 are recomended to upgrade immediately.</p>

				<h2>November 6, 2019</h2><p><a href="relnotes/19.2.3.html">Mesa 19.2.3</a> is released. This is a bug fix release.</p><h2>October 24, 2019</h2><p><a href="relnotes/19.2.2.html">Mesa 19.2.2</a> is released. This is a bug fix release.</p><h2>October 21, 2019</h2>

				<p>

				<a href="relnotes/19.1.8.html">Mesa 19.1.8</a> is released.

				This is a bug-fix release.

				</p>

				<p>

				NOTE: It is anticipated that 19.1.8 will be the final release in the

				19.1 series. Users of 19.1 are encouraged to migrate to the 19.2

				series in order to obtain future fixes.

				</p>

				<h2>October 9, 2019</h2><p><a href="relnotes/19.2.1.html">Mesa 19.2.1</a> is released. This is a bug fix release.</p><h2>September 25, 2019</h2>

				<p>

				<a href="relnotes/19.2.0.html">Mesa 19.2.0</a> is released.

				This is a new development release. See the release notes for more

				information about this release

				</p>

				<h2>September 17, 2019</h2>

				<p>

				<a href="relnotes/19.1.7.html">Mesa 19.1.7</a> is released.

				This is a bug-fix release.

				</p>

				<h2>September 3, 2019</h2>

				<p>

				<a href="relnotes/19.1.6.html">Mesa 19.1.6</a> is released.

				This is a bug-fix release.

				</p>

				<h2>August 23, 2019</h2>

				<p>

				<a href="relnotes/19.1.5.html">Mesa 19.1.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>August 7, 2019</h2>

				<p>

				<a href="relnotes/19.1.4.html">Mesa 19.1.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>July 23, 2019</h2>

				<p>

				<a href="relnotes/19.1.3.html">Mesa 19.1.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>July 9, 2019</h2>

				<p>

				<a href="relnotes/19.1.2.html">Mesa 19.1.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>June 26, 2019</h2>

				<p>

				<a href="relnotes/19.0.8.html">Mesa 19.0.8</a> is released.

				This is an emergency bug fix release. Users of 19.0.7 should updated to 19.0.8

				or 19.1.1 immediately.

				</p>

				<h2>June 25, 2019</h2>

				<p>

				<a href="relnotes/19.1.1.html">Mesa 19.1.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>June 24, 2019</h2>

				<p>

				<a href="relnotes/19.0.7.html">Mesa 19.0.7</a> is released.

				This is a bug-fix release.

				</p>

				<p>

				NOTE: It is anticipated that 19.0.7 will be the final release in the

				19.0 series. Users of 19.0 are encouraged to migrate to the 19.1

				series in order to obtain future fixes.

				</p>

				<h2>June 11, 2019</h2>

				<p>

				<a href="relnotes/19.1.0.html">Mesa 19.1.0</a> is released.

				This is a new development release. See the release notes for more

				information about this release

				</p>

				<h2>June 5, 2019</h2>

				<p>

				<a href="relnotes/19.0.6.html">Mesa 19.0.6</a> is released.

				This is a bug-fix release.

				</p>

				<h2>May 21, 2019</h2>

				<p>

				<a href="relnotes/19.0.5.html">Mesa 19.0.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>May 9, 2019</h2>

				<p>

				<a href="relnotes/19.0.4.html">Mesa 19.0.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>April 24, 2019</h2>

				<p>

				<a href="relnotes/19.0.3.html">Mesa 19.0.3</a> is released.

				@@ -31,7 +135,8 @@ This is a bug-fix release.

				<p>

				<a href="relnotes/18.3.6.html">Mesa 18.3.6</a> is released.

				This is a bug-fix release.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 18.3.6 will be the final release in the

				18.3 series. Users of 18.3 are encouraged to migrate to the 19.0

				series in order to obtain future fixes.

				@@ -78,7 +183,8 @@ This is a bug-fix release.

				<p>

				<a href="relnotes/18.2.8.html">Mesa 18.2.8</a> is released.

				This is a bug-fix release.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 18.2.8 will be the final release in the

				18.2 series. Users of 18.2 are encouraged to migrate to the 18.3

				series in order to obtain future fixes.

				@@ -137,7 +243,8 @@ This is a bug-fix release.

				<p>

				<a href="relnotes/18.1.9.html">Mesa 18.1.9</a> is released.

				This is a bug-fix release.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 18.1.9 will be the final release in the

				18.1 series. Users of 18.1 are encouraged to migrate to the 18.2

				series in order to obtain future fixes.

				@@ -199,7 +306,8 @@ This is a bug-fix release.

				<p>

				<a href="relnotes/18.0.5.html">Mesa 18.0.5</a> is released.

				This is a bug-fix release.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 18.0.5 will be the final release in the

				18.0 series. Users of 18.0 are encouraged to migrate to the 18.1

				series in order to obtain future fixes.

				@@ -246,7 +354,8 @@ This is a bug-fix release.

				<p>

				<a href="relnotes/17.3.9.html">Mesa 17.3.9</a> is released.

				This is a bug-fix release.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 17.3.9 will be the final release in the

				17.3 series. Users of 17.3 are encouraged to migrate to the 18.0

				series in order to obtain future fixes.

				@@ -305,7 +414,8 @@ This is a bug-fix release.

				<p>

				<a href="relnotes/17.2.8.html">Mesa 17.2.8</a> is released.

				This is a bug-fix release.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 17.2.8 will be the final release in the

				17.2 series. Users of 17.2 are encouraged to migrate to the 17.3

				series in order to obtain future fixes.

				@@ -364,7 +474,8 @@ This is a bug-fix release.

				<p>

				<a href="relnotes/17.1.10.html">Mesa 17.1.10</a> is released.

				This is a bug-fix release.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 17.1.10 will be the final release in the

				17.1 series. Users of 17.1 are encouraged to migrate to the 17.2

				series in order to obtain future fixes.

				@@ -435,7 +546,8 @@ This is a bug-fix release.

				<p>

				<a href="relnotes/17.0.7.html">Mesa 17.0.7</a> is released.

				This is a bug-fix release.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 17.0.7 will be the final release in the 17.0

				series. Users of 17.0 are encouraged to migrate to the 17.1 series in order

				to obtain future fixes.

				@@ -484,7 +596,8 @@ This is a bug-fix release.

				<a href="relnotes/17.0.2.html">Mesa 17.0.2</a> are released.

				These are bug-fix releases from the 13.0 and 17.0 branches, respectively.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 13.0.6 will be the final release in the 13.0

				series. Users of 13.0 are encouraged to migrate to the 17.0 series in order

				to obtain future fixes.

				@@ -519,7 +632,8 @@ This is a bug-fix release.

				<p>

				<a href="relnotes/12.0.6.html">Mesa 12.0.6</a> is released.

				This is a bug-fix release.

				<br>

				</p>

				<p>

				NOTE: This is an extra release for the 12.0 stable branch, as per developers'

				feedback. It is anticipated that 12.0.6 will be the final release in the 12.0

				series. Users of 12.0 are encouraged to migrate to the 13.0 series in order

				@@ -536,7 +650,8 @@ This is a bug-fix release.

				<p>

				<a href="relnotes/12.0.5.html">Mesa 12.0.5</a> is released.

				This is a bug-fix release.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 12.0.5 will be the final release in the 12.0

				series. Users of 12.0 are encouraged to migrate to the 13.0 series in order

				to obtain future fixes.

				@@ -598,7 +713,8 @@ about the release.

				<a href="relnotes/11.2.2.html">Mesa 11.2.2</a> are released.

				These are bug-fix releases from the 11.1 and 11.2 branches, respectively.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 11.1.4 will be the final release in the 11.1.4

				series. Users of 11.1 are encouraged to migrate to the 11.2 series in order

				to obtain future fixes.

				@@ -629,7 +745,8 @@ This is a bug-fix release.

				<p>

				<a href="relnotes/11.0.9.html">Mesa 11.0.9</a> is released.

				This is a bug-fix release.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 11.0.9 will be the final release in the 11.0

				series. Users of 11.0 are encouraged to migrate to the 11.1 series in order

				to obtain future fixes.

				@@ -693,7 +810,8 @@ This is a bug-fix release.

				<p>

				<a href="relnotes/10.6.9.html">Mesa 10.6.9</a> is released.

				This is a bug-fix release.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 10.6.9 will be the final release in the 10.6

				series. Users of 10.6 are encouraged to migrate to the 11.0 series in order

				to obtain future fixes.

				@@ -764,7 +882,8 @@ This is a bug-fix release.

				<p>

				<a href="relnotes/10.5.9.html">Mesa 10.5.9</a> is released.

				This is a bug-fix release.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 10.5.9 will be the final release in the 10.5

				series. Users of 10.5 are encouraged to migrate to the 10.6 series in order

				to obtain future fixes.

				@@ -874,7 +993,8 @@ This is a bug-fix release.

				and <a href="relnotes/10.4.2.html">Mesa 10.4.2</a> are released.

				These are bug-fix releases from the 10.3 and 10.4 branches, respectively.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 10.3.7 will be the final release in the 10.3

				series. Users of 10.3 are encouraged to migrate to the 10.4 series in order

				to obtain future fixes.

				@@ -925,7 +1045,8 @@ This is a bug-fix release.

				and <a href="relnotes/10.3.1.html">Mesa 10.3.1</a> are released.

				These are bug-fix releases from the 10.2 and 10.3 branches, respectively.

				<br>

				</p>

				<p>

				NOTE: It is anticipated that 10.2.9 will be the final release in the 10.2

				series. Users of 10.2 are encouraged to migrate to the 10.3 series in order

				to obtain future fixes.

				@@ -1037,7 +1158,8 @@ This is a bug-fix release.

				<p>

				<a href="relnotes/10.0.5.html">Mesa 10.0.5</a> is released.

				This is a bug-fix release.

				<br>

				</p>

				<p>

				NOTE: Since the 10.1.1 release is being released concurrently, it is

				anticipated that 10.0.5 will be the final release in the 10.0

				series. Users of 10.0 are encouraged to migrate to the 10.1 series in

				@@ -1516,9 +1638,9 @@ with a new test that does over 130 tests of the

				shading language and built-in functions.

				</p>

				<h2>April 2007</h2>

				<h2>April 4, 2007</h2>

				<p>

				Thomas Hellstr&ouml;m of Tungsten Graphics has written a whitepaper

				Thomas Hellstr&#246;m of Tungsten Graphics has written a whitepaper

				describing the new DRI memory management system.

				</p>

				@@ -1916,7 +2038,7 @@ d2b5ba32b53e0ad0576c637a4cc1fb41  MesaDemos-5.1.zip

				</pre>

				<H2>November 12, 2003</H2>

				<h2>November 12, 2003</h2>

				<p>

				New Mesa 5.0.2 tarballs have been uploaded to SourceForge which fix a

				@@ -1969,7 +2091,7 @@ Mesa 5.0.2 has been released.  This is a stable, bug-fix release.

				</pre>

				<h2>June 2003</h2>

				<h2>June 8, 2003</h2>

				<p>

				Mesa's directory tree has been overhauled.

				@@ -2346,7 +2468,7 @@ Here's what's new:</p>

				<h2>April 29, 2001</h2>

				<p>New Mesa website</p>

				<p>Mark Manning produced the new website.<br>Thanks, Mark!</p>

				<p>Mark Manning produced the new website. Thanks, Mark!</p>

				<h2>February 14, 2001</h2>

				@@ -2465,8 +2587,9 @@ just bug fixes.</p>

				</pre>

				<p>Please report any problems with this release ASAP. Bugs should be filed on the

				Mesa3D website at sourceforge.<br>

				After 3.2 is wrapped up I hope to release 3.3 beta 1 soon afterward.</p>

				Mesa3D website at sourceforge.

				</p>

				<p>After 3.2 is wrapped up I hope to release 3.3 beta 1 soon afterward.</p>

				<p>-- Brian</p>

				<h2>December 17, 1999</h2>

				@@ -2511,21 +2634,27 @@ ftp, and CVS services aren't fully restored yet. Please be patient.</p>

				<p>-Brian</p>

				<h2>June 7, 1999</h2>

				<p>RPMS of the nVidia RIVA server can be found at <code>ftp://ftp.mesa3d.org/mesa/misc/nVidia/</code>.</p>

				<p>RPMS of the nVidia RIVA server can be found at

				<a href="ftp://ftp.mesa3d.org/mesa/misc/nVidia/">

				ftp://ftp.mesa3d.org/mesa/misc/nVidia/</a>.</p>

				<h2>June 2, 1999</h2>

				<p><a href="https://www.nvidia.com/">nVidia</a> has released some Linux binaries for

				xfree86 3.3.3.1, along with the <b>full source</b>, which includes GLX acceleration

				based on Mesa 3.0. They can be downloaded from <code>https://www.nvidia.com/Products.nsf/htmlmedia/software_drivers.html</code>.</p>

				based on Mesa 3.0. They can be downloaded from

				<a href="https://www.nvidia.com/Products.nsf/htmlmedia/software_drivers.html">

				https://www.nvidia.com/Products.nsf/htmlmedia/software_drivers.html</a>.</p>

				<h2>May 24, 1999</h2>

				<p>Beta 2 of Mesa 3.1 has been make available at <code>ftp://ftp.mesa3d.org/mesa/beta/</code>.

				If you are into the quake scene, you may want to try this out, as it contains some

				optimizations specifically in the Q3A rendering path.

				<p>Beta 2 of Mesa 3.1 has been make available at

				<a href="ftp://ftp.mesa3d.org/mesa/beta/">ftp://ftp.mesa3d.org/mesa/beta/</a>. If you are into the

				quake scene, you may want to try this out, as it contains some optimizations

				specifically in the Q3A rendering path.

				<h2>May 13, 1999</h2>

				</p><h2>May 13, 1999</h2>

				<p>For those interested in the integration of Mesa into XFree86 4.0, Precision Insight

				has posted their lowlevel design documents at <code>http://www.precisioninsight.com</code>.</p>

				has posted their lowlevel design documents at

				<a href="http://www.precisioninsight.com">www.precisioninsight.com</a>.</p>

				<h2>May 13, 1999</h2>

				<pre>May 1999 - John Carmack of id Software, Inc. has made a donation of

				@@ -2551,11 +2680,11 @@ grateful.

				<h2>May 1, 1999</h2>

				<p>John Carmack made an interesting .plan update yesterday:</p>

				<blockquote>

				    <i>"I put together a document on optimizing OpenGL drivers for Q3 that

				    should be helpful to the various Linux 3D teams.</i><br>

				    http://www.quake3arena.com/news/glopt.html"

				</blockquote>

				<pre>

				I put together a document on optimizing OpenGL drivers for Q3 that should be helpful to the various Linux 3D teams.

				http://www.quake3arena.com/news/glopt.html

				</pre>

				<h2>April 7, 1999</h2>

				<p>Updated the Mesa contributors section and added links to RPM Mesa packages.</p>

				@@ -2563,13 +2692,14 @@ grateful.

				<h2>March 18, 1999</h2>

				<p>The new webpages are now online. Enjoy, and let me know if you find any errors.

				<h2>February 16, 1999</h2>

				</p><h2>February 16, 1999</h2>

				<p><a href="https://www.sgi.com/">SGI</a> releases its

				<a href="https://www.sgi.com/software/opensource/glx/">GLX source code</a>.</p>

				<a href="http://web.archive.org/web/20040805154836/http://www.sgi.com/software/opensource/glx/download.html">GLX source code</a>.

				</p>

				<h2>January 22, 1999</h2>

				<p><a href="https://www.mesa3d.org">www.mesa3d.org</a> established</p>

				</div>

				</body>

				</html>

				</html>

									
										73

docs/install.html
									
												
				@@ -8,7 +8,7 @@

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				@@ -23,7 +23,6 @@

				  <li><a href="#prereq-dri">For DRI and hardware acceleration</a>

				  </ul>

				<li><a href="#meson">Building with meson</a>

				<li><a href="#autoconf">Building with autoconf (Linux/Unix/X11)</a>

				<li><a href="#scons">Building with SCons (Windows/Linux)</a>

				<li><a href="#android">Building with AOSP (Android)</a>

				<li><a href="#libs">Library Information</a>

				@@ -31,26 +30,23 @@

				</ol>

				<h1 id="prereq-general">1. Prerequisites for building</h1>

				<h2 id="prereq-general">1. Prerequisites for building</h2>

				<h2>1.1 General</h2>

				<h3>1.1 General</h3>

				<p>

				Build system.

				</p>

				<h4>Build system</h4>

				<ul>

				<li><a href="https://mesonbuild.com">meson</a> is required when building on *nix platforms.

				<li>Autoconf was removed in 19.1.0, use meson instead

				<li><a href="http://www.scons.org/">SCons</a> is required for building on

				Windows and optional for Linux (it's an alternative to meson.)

				<li><a href="https://mesonbuild.com">meson</a> is required when building on *nix platforms and is supported on windows.

				<li><a href="http://www.scons.org/">SCons</a> is an alternative for building on

				Windows and Linux.

				</li>

				<li>Android Build system when building as native Android component. Autoconf

				<li>Android Build system when building as native Android component. Meson

				is used when when building ARC.

				</li>

				</ul>

				<h4>Compiler</h4>

				<p>

				The following compilers are known to work, if you know of others or you're

				willing to maintain support for other compiler get in touch.

				@@ -63,14 +59,7 @@ willing to maintain support for other compiler get in touch.

				</ul>

				<p>

				Third party/extra tools.

				<br>

				<strong>Note</strong>: These should not be required, when building from a release tarball. If

				you think you've spotted a bug let developers know by filing a

				<a href="bugs.html">bug report</a>.

				</p>

				<h4>Third party/extra tools.</h4>

				<ul>

				<li><a href="https://www.python.org/">Python</a> - Python is required.

				@@ -81,14 +70,16 @@ When building with meson 3.5 or newer is required.

				Python Mako module is required. Version 0.8.0 or later should work.

				</li>

				<li>lex / yacc - for building the Mesa IR and GLSL compiler.

				<div>

				<p>

				On Linux systems, flex and bison versions 2.5.35 and 2.4.1, respectively,

				(or later) should work.

				On Windows with MinGW, install flex and bison with:

				</p>

				<pre>mingw-get install msys-flex msys-bison</pre>

				<p>

				For MSVC on Windows, install

				<a href="http://winflexbison.sourceforge.net/">Win flex-bison</a>.

				</div>

				</p>

				</ul>

				<p><strong>Note</strong>: Some versions can be buggy (eg. flex 2.6.2) so do try others if things fail.</p>

				@@ -114,11 +105,14 @@ the packaging tool used by your distro.

				  ... # others

				</pre>

				<h1 id="meson">2. Building with meson</h1>

				<h2 id="meson">2. Building with meson</h2>

				<p><strong>Meson &gt;= 0.46.0 is required</strong></p>

				<p>

				Meson is the latest build system in mesa, it is currently able to build for

				*nix systems like Linux and BSD, and will be able to build for windows as well.

				*nix systems like Linux and BSD, macOS, Haiku, and Windows.

				</p>

				<p>

				@@ -129,20 +123,22 @@ The general approach is:

				  ninja -C builddir/

				  sudo ninja -C builddir/ install

				</pre>

				<p>On windows you can also use the visual studio backend</p>

				<pre>

				  meson builddir --backend=vs

				  cd builddir

				  msbuild mesa.sln /m

				</pre>

				<p>

				Please read the <a href="meson.html">detailed meson instructions</a>

				for more information

				</p>

				<h1 id="autoconf">3. Building with autoconf (Linux/Unix/X11)</h1>

				<p>

				  Autoconf support was removed in Mesa 19.1.0. Please use meson instead.

				</p>

				<h1 id="scons">4. Building with SCons (Windows/Linux)</h1>

				<h2 id="scons">3. Building with SCons (Windows/Linux)</h2>

				<p>

				To build Mesa with SCons on Linux or Windows do

				@@ -178,7 +174,7 @@ Additional information is available in <a href="README.WIN32">README.WIN32</a>.

				<h1 id="android">5. Building with AOSP (Android)</h1>

				<h2 id="android">4. Building with AOSP (Android)</h2>

				<p>

				Currently one can build Mesa for Android as part of the AOSP project, yet

				@@ -197,7 +193,7 @@ Android-x86 and/or other resources.

				</p>

				<h1 id="libs">6. Library Information</h1>

				<h2 id="libs">5. Library Information</h2>

				<p>

				When compilation has finished, look in the top-level <code>lib/</code>

				@@ -214,9 +210,8 @@ lrwxrwxrwx    1 brian    users          23 Mar 26 07:53 libOSMesa.so.6 -&gt; lib

				</pre>

				<p>

				<b>libGL</b> is the main OpenGL library (i.e. Mesa).

				<br>

				<b>libOSMesa</b> is the OSMesa (Off-Screen) interface library.

				<b>libGL</b> is the main OpenGL library (i.e. Mesa), while <b>libOSMesa</b>

				is the OSMesa (Off-Screen) interface library.

				</p>

				<p>

				@@ -235,7 +230,7 @@ versions of libGL and device drivers.

				</p>

				<h1 id="pkg-config">7. Building OpenGL programs with pkg-config</h1>

				<h2 id="pkg-config">6. Building OpenGL programs with pkg-config</h2>

				<p>

				Running <code>ninja install</code> will install package configuration files

				@@ -254,8 +249,6 @@ For example, compiling and linking a GLUT application can be done with:

				   gcc `pkg-config --cflags --libs glut` mydemo.c -o mydemo

				</pre>

				<br>

				</div>

				</body>

				</html>
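
The install.html text in this diff states several minimum tool versions: Meson >= 0.46.0, flex 2.5.35 and bison 2.4.1 (or later) on Linux, and Mako 0.8.0 or later. A minimal sketch of checking a reported version string against such a minimum, assuming plain dotted numeric versions. `MINIMUM_VERSIONS` and `meets_minimum` are hypothetical names and are not part of Mesa or Meson:

```python
# Minimums quoted from the install.html text; the dictionary itself is illustrative.
MINIMUM_VERSIONS = {
    "meson": "0.46.0",
    "flex": "2.5.35",
    "bison": "2.4.1",
    "mako": "0.8.0",
}


def _as_tuple(version):
    """Convert '2.5.35' to (2, 5, 35); assumes every component is an integer."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(tool, version):
    """Return True if `version` is at least the documented minimum for `tool`."""
    have = _as_tuple(version)
    need = _as_tuple(MINIMUM_VERSIONS[tool])
    # Pad the shorter tuple with zeros so that "0.46" compares equal to "0.46.0".
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


if __name__ == "__main__":
    print(meets_minimum("meson", "0.45.1"))
    print(meets_minimum("flex", "2.6.4"))
```

A minimum check of this kind does not catch individual bad releases. The page itself notes that some versions (e.g. flex 2.6.2) can be buggy, so a passing check is not a guarantee that the build will succeed.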

									
										104

docs/intro.html
									
												
				@@ -2,13 +2,13 @@

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Introduction</title>

				  <title>Introduction</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				@@ -50,7 +50,7 @@ systems.

				<h1>Project History</h1>

				<h2>Project History</h2>

				<p>

				The Mesa project was originally started by Brian Paul.

				@@ -185,7 +185,7 @@ of the OpenGL, OpenGL ES and Vulkan specifications.

				<h1>Major Versions</h1>

				<h2>Major Versions</h2>

				<p>

				This is a summary of the major versions of Mesa.

				@@ -194,7 +194,7 @@ of the OpenGL specification is implemented.

				</p>

				<h2>Version 12.x features</h2>

				<h3>Version 12.x features</h3>

				<p>

				Version 12.x of Mesa implements the OpenGL 4.3 API, but not all drivers

				support OpenGL 4.3.

				@@ -204,21 +204,21 @@ Initial support for Vulkan is also included.

				</p>

				<h2>Version 11.x features</h2>

				<h3>Version 11.x features</h3>

				<p>

				Version 11.x of Mesa implements the OpenGL 4.1 API, but not all drivers

				support OpenGL 4.1.

				</p>

				<h2>Version 10.x features</h2>

				<h3>Version 10.x features</h3>

				<p>

				Version 10.x of Mesa implements the OpenGL 3.3 API, but not all drivers

				support OpenGL 3.3.

				</p>

				<h2>Version 9.x features</h2>

				<h3>Version 9.x features</h3>

				<p>

				Version 9.x of Mesa implements the OpenGL 3.1 API.

				While the driver for Intel Sandy Bridge and Ivy Bridge is the only

				@@ -233,7 +233,7 @@ tracker for OpenCL.

				</p>

				<h2>Version 8.x features</h2>

				<h3>Version 8.x features</h3>

				<p>

				Version 8.x of Mesa implements the OpenGL 3.0 API.

				The developers at Intel deserve a lot of credit for implementing most

				@@ -242,14 +242,14 @@ the i965 driver.

				</p>

				<h2>Version 7.x features</h2>

				<h3>Version 7.x features</h3>

				<p>

				Version 7.x of Mesa implements the OpenGL 2.1 API.  The main feature

				of OpenGL 2.x is the OpenGL Shading Language.

				</p>

				<h2>Version 6.x features</h2>

				<h3>Version 6.x features</h3>

				<p>

				Version 6.x of Mesa implements the OpenGL 1.5 API with the following

				extensions incorporated as standard features:

				@@ -289,7 +289,7 @@ OpenGL specification</a> for more details.

				<h2>Version 5.x features</h2>

				<h3>Version 5.x features</h3>

				<p>

				Version 5.x of Mesa implements the OpenGL 1.4 API with the following

				extensions incorporated as standard features:

				@@ -315,7 +315,7 @@ extensions incorporated as standard features:

				</ul>

				<h2>Version 4.x features</h2>

				<h3>Version 4.x features</h3>

				<p>

				Version 4.x of Mesa implements the OpenGL 1.3 API with the following

				@@ -334,7 +334,7 @@ extensions incorporated as standard features:

				<li>GL_ARB_transpose_matrix

				</ul>

				<h2>Version 3.x features</h2>

				<h3>Version 3.x features</h3>

				<p>

				Version 3.x of Mesa implements the OpenGL 1.2 API with the following

				@@ -350,53 +350,53 @@ features:

				</ul>

				<h2>Version 2.x features</h2>

				<h3>Version 2.x features</h3>

				<p>

				Version 2.x of Mesa implements the OpenGL 1.1 API with the following

				features.

				</p>

				<ul>

				<li>Texture mapping:

					<ul>

					<li>glAreTexturesResident

					<li>glBindTexture

					<li>glCopyTexImage1D

					<li>glCopyTexImage2D

					<li>glCopyTexSubImage1D

					<li>glCopyTexSubImage2D

					<li>glDeleteTextures

					<li>glGenTextures

					<li>glIsTexture

					<li>glPrioritizeTextures

					<li>glTexSubImage1D

					<li>glTexSubImage2D

					</ul>

				  <ul>

				  <li>glAreTexturesResident

				  <li>glBindTexture

				  <li>glCopyTexImage1D

				  <li>glCopyTexImage2D

				  <li>glCopyTexSubImage1D

				  <li>glCopyTexSubImage2D

				  <li>glDeleteTextures

				  <li>glGenTextures

				  <li>glIsTexture

				  <li>glPrioritizeTextures

				  <li>glTexSubImage1D

				  <li>glTexSubImage2D

				  </ul>

				<li>Vertex Arrays:

					<ul>

					<li>glArrayElement

					<li>glColorPointer

					<li>glDrawElements

					<li>glEdgeFlagPointer

					<li>glIndexPointer

					<li>glInterleavedArrays

					<li>glNormalPointer

					<li>glTexCoordPointer

					<li>glVertexPointer

					</ul>

				  <ul>

				  <li>glArrayElement

				  <li>glColorPointer

				  <li>glDrawElements

				  <li>glEdgeFlagPointer

				  <li>glIndexPointer

				  <li>glInterleavedArrays

				  <li>glNormalPointer

				  <li>glTexCoordPointer

				  <li>glVertexPointer

				  </ul>

				<li>Client state management:

					<ul>

					<li>glDisableClientState

					<li>glEnableClientState

					<li>glPopClientAttrib

					<li>glPushClientAttrib

					</ul>

				  <ul>

				  <li>glDisableClientState

				  <li>glEnableClientState

				  <li>glPopClientAttrib

				  <li>glPushClientAttrib

				  </ul>

				<li>Misc:

					<ul>

					<li>glGetPointer

					<li>glIndexub

					<li>glIndexubv

					<li>glPolygonOffset

					</ul>

				  <ul>

				  <li>glGetPointer

				  <li>glIndexub

				  <li>glIndexubv

				  <li>glPolygonOffset

				  </ul>

				</ul>

				</div>

									
										22

docs/license.html
									
												
				@@ -2,23 +2,25 @@

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>License / Copyright Information</title>

				  <title>License and Copyright</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Disclaimer</h1>

				<h1>License and Copyright</h1>

				<h2>Disclaimer</h2>

				<p>

				Mesa is a 3-D graphics library with an API which is very similar to

				that of <a href="https://www.opengl.org/">OpenGL</a>.*

				that of <a href="https://www.opengl.org/">OpenGL</a><sup>[<a href="#trademark">1</a>]</sup>.

				To the extent that Mesa utilizes the OpenGL command syntax or state

				machine, it is being used with authorization from <a

				href="https://www.sgi.com/">Silicon Graphics,

				@@ -32,17 +34,17 @@ vendor.

				<p>

				Please do not refer to the library as <em>MesaGL</em> (for legal

				reasons). It's just <em>Mesa</em> or <em>The Mesa 3-D graphics

				library</em>. <br>

				library</em>.

				</p>

				<p>

				* OpenGL is a trademark of <a href="https://www.sgi.com/"

				>Silicon Graphics Incorporated</a>.

				<a id="trademark">[1]</a>: OpenGL is a trademark of <a

				href="https://www.sgi.com/">Silicon Graphics Incorporated</a>.

				</p>

				<h1>License / Copyright Information</h1>

				<h2>License / Copyright Information</h2>

				<p>

				The Mesa distribution consists of several components.  Different copyrights

				@@ -82,7 +84,7 @@ SOFTWARE.

				</pre>

				<h1>Attention, Contributors</h1>

				<h2>Attention, Contributors</h2>

				<p>

				When contributing to the Mesa project you must agree to the licensing terms

				@@ -92,7 +94,7 @@ and their respective licenses.

				</p>

				<h1>Mesa Component Licenses</h1>

				<h2>Mesa Component Licenses</h2>

				<pre>

				Component         Location               License

									
										8

docs/lists.html
									
												
				@@ -2,13 +2,13 @@

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Mailing Lists</title>

				  <title>Mailing Lists</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				@@ -68,14 +68,14 @@ kernels, see the

				</p>

				<h1>IRC</h1>

				<h2>IRC</h2>

				<p>join <a href="irc://chat.freenode.net#dri-devel">#dri-devel channel</a>

				on <a href="https://webchat.freenode.net/">irc.freenode.net</a>

				</p>

				<h1>OpenGL Forums</h1>

				<h2>OpenGL Forums</h2>

				<p>

				Here are some other OpenGL-related forums you might find useful:

									
										111

docs/llvmpipe.html
									
												
				@@ -2,19 +2,21 @@

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>llvmpipe</title>

				  <title>Gallium LLVMpipe Driver</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				  The Mesa 3D Graphics Library

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Introduction</h1>

				<h1>Gallium LLVMpipe Driver</h1>

				<h2>Introduction</h2>

				<p>

				The Gallium llvmpipe driver is a software rasterizer that uses LLVM to

				@@ -28,7 +30,7 @@ It's the fastest software rasterizer for Mesa.

				</p>

				<h1>Requirements</h1>

				<h2>Requirements</h2>

				<ul>

				<li>

				@@ -45,7 +47,7 @@ It's the fastest software rasterizer for Mesa.

				   built with LLVM version 4.0 or later.

				   </p>

				   <p>

				   See /proc/cpuinfo to know what your CPU supports.

				   See <code>/proc/cpuinfo</code> to know what your CPU supports.

				   </p>

				</li>

				<li>

				@@ -54,7 +56,7 @@ It's the fastest software rasterizer for Mesa.

				   For Linux, on a recent Debian based distribution do:

				   </p>

				<pre>

				     aptitude install llvm-dev

				aptitude install llvm-dev

				</pre>

				   <p>

				   If you want development snapshot builds of LLVM for Debian and derived

				@@ -66,13 +68,14 @@ It's the fastest software rasterizer for Mesa.

				   For a RPM-based distribution do:

				   </p>

				<pre>

				     yum install llvm-devel

				yum install llvm-devel

				</pre>

				   <p>

				   For Windows you will need to build LLVM from source with MSVC or MINGW

				   (either natively or through cross compilers) and CMake, and set the LLVM

				   environment variable to the directory you installed it to.

				   (either natively or through cross compilers) and CMake, and set the

				   <code>LLVM</code> environment variable to the directory you installed

				   it to.

				   LLVM will be statically linked, so when building on MSVC it needs to be

				   built with a matching CRT as Mesa, and you'll need to pass

				@@ -101,8 +104,8 @@ It's the fastest software rasterizer for Mesa.

				   </table>

				   <p>

				   You can build only the x86 target by passing -DLLVM_TARGETS_TO_BUILD=X86

				   to cmake.

				   You can build only the x86 target by passing

				   <code>-DLLVM_TARGETS_TO_BUILD=X86</code> to cmake.

				   </p>

				</li>

				@@ -112,20 +115,20 @@ It's the fastest software rasterizer for Mesa.

				</ul>

				<h1>Building</h1>

				<h2>Building</h2>

				To build everything on Linux invoke scons as:

				<pre>

				  scons build=debug libgl-xlib

				scons build=debug libgl-xlib

				</pre>

				Alternatively, you can build it with meson with:

				<pre>

				  mkdir build

				  cd build

				  meson -D glx=gallium-xlib -D gallium-drivers=swrast

				  ninja

				mkdir build

				cd build

				meson -D glx=gallium-xlib -D gallium-drivers=swrast

				ninja

				</pre>

				but the rest of these instructions assume that scons is used.

				@@ -133,31 +136,34 @@ but the rest of these instructions assume that scons is used.

				For Windows the procedure is similar except the target:

				<pre>

				  scons platform=windows build=debug libgl-gdi

				scons platform=windows build=debug libgl-gdi

				</pre>

				<h1>Using</h1>

				<h2>Using</h2>

				<h2>Linux</h2>

				<h3>Linux</h3>

				<p>On Linux, building will create a drop-in alternative for libGL.so into</p>

				<p>On Linux, building will create a drop-in alternative for

				<code>libGL.so</code> into</p>

				<pre>

				  build/foo/gallium/targets/libgl-xlib/libGL.so

				build/foo/gallium/targets/libgl-xlib/libGL.so

				</pre>

				or

				<pre>

				  lib/gallium/libGL.so

				lib/gallium/libGL.so

				</pre>

				<p>To use it set the LD_LIBRARY_PATH environment variable accordingly.</p>

				<p>To use it set the <code>LD_LIBRARY_PATH</code> environment variable

				accordingly.</p>

				<p>For performance evaluation pass build=release to scons, and use the corresponding

				lib directory without the "-debug" suffix.</p>

				<p>For performance evaluation pass <code>build=release</code> to scons,

				and use the corresponding lib directory without the <code>-debug</code>

				suffix.</p>

				<h2>Windows</h2>

				<h3>Windows</h3>

				<p>

				On Windows, building will create

				@@ -175,7 +181,9 @@ any OpenGL drivers):

				</p>

				<ul>

				  <li><p>copy build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll to C:\Windows\SysWOW64\mesadrv.dll</p></li>

				  <li><p>copy <code>build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll</code>

				         to <code>C:\Windows\SysWOW64\mesadrv.dll</code>

				  </p></li>

				  <li><p>load this registry settings:</p>

				  <pre>REGEDIT4

				@@ -192,13 +200,13 @@ any OpenGL drivers):

				</ul>

				<h1>Profiling</h1>

				<h2>Profiling</h2>

				<p>

				To profile llvmpipe you should build as

				</p>

				<pre>

				  scons build=profile &lt;same-as-before&gt;

				scons build=profile &lt;same-as-before&gt;

				</pre>

				<p>

				@@ -206,39 +214,40 @@ This will ensure that frame pointers are used both in C and JIT functions, and

				that no tail call optimizations are done by gcc.

				</p>

				<h2>Linux perf integration</h2>

				<h3>Linux perf integration</h3>

				<p>

				On Linux, it is possible to have symbol resolution of JIT code with <a href="https://perf.wiki.kernel.org/">Linux perf</a>:

				</p>

				<pre>

					perf record -g /my/application

					perf report

				perf record -g /my/application

				perf report

				</pre>

				<p>

				When run inside Linux perf, llvmpipe will create a /tmp/perf-XXXXX.map file with

				symbol address table.  It also dumps assembly code to /tmp/perf-XXXXX.map.asm,

				which can be used by the bin/perf-annotate-jit.py script to produce disassembly of

				the generated code annotated with the samples.

				When run inside Linux perf, llvmpipe will create a

				<code>/tmp/perf-XXXXX.map</code> file with symbol address table.  It also

				dumps assembly code to <code>/tmp/perf-XXXXX.map.asm</code>, which can be

				used by the <code>bin/perf-annotate-jit.py</code> script to produce

				disassembly of the generated code annotated with the samples.

				</p>

				<p>You can obtain a call graph via

				<a href="https://github.com/jrfonseca/gprof2dot#linux-perf">Gprof2Dot</a>.</p>

				<h1>Unit testing</h1>

				<h2>Unit testing</h2>

				<p>

				Building will also create several unit tests in

				build/linux-???-debug/gallium/drivers/llvmpipe:

				<code>build/linux-???-debug/gallium/drivers/llvmpipe</code>:

				</p>

				<ul>

				<li> lp_test_blend: blending

				<li> lp_test_conv: SIMD vector conversion

				<li> lp_test_format: pixel unpacking/packing

				<li> <code>lp_test_blend</code>: blending

				<li> <code>lp_test_conv</code>: SIMD vector conversion

				<li> <code>lp_test_format</code>: pixel unpacking/packing

				</ul>

				<p>

				@@ -246,33 +255,35 @@ Some of these tests can output results and benchmarks to a tab-separated file

				for later analysis, e.g.:

				</p>

				<pre>

				  build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv

				build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv

				</pre>

				<h1>Development Notes</h1>

				<h2>Development Notes</h2>

				<ul>

				<li>

				  When looking at this code for the first time, start in lp_state_fs.c, and

				  then skim through the lp_bld_* functions called there, and the comments

				  at the top of the lp_bld_*.c functions.

				  then skim through the <code>lp_bld_*</code> functions called there, and

				  the comments at the top of the <code>lp_bld_*.c</code> functions.

				</li>

				<li>

				  The driver-independent parts of the LLVM / Gallium code are found in

				  src/gallium/auxiliary/gallivm/.  The filenames and function prefixes

				  need to be renamed from "lp_bld_" to something else though.

				  <code>src/gallium/auxiliary/gallivm/</code>.  The filenames and function

				  prefixes need to be renamed from <code>lp_bld_</code> to something else

				  though.

				</li>

				<li>

				  We use LLVM-C bindings for now. They are not documented, but follow the C++

				  interfaces very closely, and appear to be complete enough for code

				  generation. See 

				  <a href="https://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html">

				  this stand-alone example</a>.  See the llvm-c/Core.h file for reference.

				  this stand-alone example</a>.  See the <code>llvm-c/Core.h</code> file for

				  reference.

				</li>

				</ul>

				<h1 id="recommended_reading">Recommended Reading</h1>

				<h2 id="recommended_reading">Recommended Reading</h2>

				<ul>

				  <li>

Compare commits

9055 Commits mesa-19.1. ... 20.0-branc

66 .appveyor/appveyor_msvc.bat Normal file

36 .appveyor/llvm-wrap.meson Normal file

6 .editorconfig

4 .gitattributes vendored

2 .gitignore vendored

772 .gitlab-ci.yml

122 .gitlab-ci/README.md Normal file

46 .gitlab-ci/arm.config Normal file

9 src/gallium/drivers/panfrost/ci/arm64.config → .gitlab-ci/arm64.config

10 .gitlab-ci/build-cts-runner.sh Normal file

61 .gitlab-ci/build-deqp-gl.sh Normal file

33 .gitlab-ci/build-deqp-vk.sh Normal file

13 .gitlab-ci/build-piglit.sh Normal file

74 .gitlab-ci/container/arm_build.sh Normal file

64 .gitlab-ci/container/arm_test.sh Normal file

63 .gitlab-ci/container/lava_arm.sh Normal file

52 .gitlab-ci/container/llvm-snapshot.gpg.key Normal file

220 .gitlab-ci/container/x86_build.sh Normal file

59 .gitlab-ci/container/x86_build_old.sh Normal file

96 .gitlab-ci/container/x86_test-gl.sh Normal file

87 .gitlab-ci/container/x86_test-vk.sh Normal file

32 src/gallium/drivers/panfrost/ci/create-rootfs.sh → .gitlab-ci/create-rootfs.sh

1 .gitlab-ci/cross-xfail-i386 Normal file

181 .gitlab-ci/debian-install.sh

10 .gitlab-ci/deqp-default-skips.txt Normal file

33 .gitlab-ci/deqp-freedreno-a307-fails.txt Normal file

3 .gitlab-ci/deqp-freedreno-a630-fails.txt Normal file

21 .gitlab-ci/deqp-freedreno-a630-skips.txt Normal file

205 .gitlab-ci/deqp-lima-fails.txt Normal file

46 .gitlab-ci/deqp-lima-skips.txt Normal file

124 .gitlab-ci/deqp-llvmpipe-fails.txt Normal file

31 .gitlab-ci/deqp-panfrost-t720-fails.txt Normal file

14 .gitlab-ci/deqp-panfrost-t720-skips.txt Normal file

31 .gitlab-ci/deqp-panfrost-t760-fails.txt Normal file

10 .gitlab-ci/deqp-panfrost-t760-skips.txt Normal file

31 .gitlab-ci/deqp-panfrost-t820-fails.txt Normal file

13 .gitlab-ci/deqp-panfrost-t820-skips.txt Normal file

31 .gitlab-ci/deqp-panfrost-t860-fails.txt Normal file

13 .gitlab-ci/deqp-panfrost-t860-skips.txt Normal file

31 .gitlab-ci/deqp-radv-polaris10-skips.txt Normal file

237 .gitlab-ci/deqp-runner.sh Executable file

844 .gitlab-ci/deqp-softpipe-fails.txt Normal file

16 .gitlab-ci/deqp-softpipe-skips.txt Normal file

45 .gitlab-ci/generate_lava.py Executable file

93 .gitlab-ci/lava-deqp.yml.jinja2 Normal file

122 .gitlab-ci/lava-gitlab-ci.yml Normal file

13 .gitlab-ci/meson-build.bat Normal file

64 .gitlab-ci/meson-build.sh Executable file

36 .gitlab-ci/piglit/disable-vs_in.diff Normal file

5117 .gitlab-ci/piglit/glslparser.txt Normal file

2220 .gitlab-ci/piglit/quick_gl.txt Normal file

6418 .gitlab-ci/piglit/quick_shader.txt Normal file

29 .gitlab-ci/piglit/run.sh Executable file

59 .gitlab-ci/prepare-artifacts.sh Executable file

17 .gitlab-ci/run-shader-db.sh Executable file

12 .gitlab-ci/scons-build.sh Executable file

20 .gitlab-ci/x86_64-w64-mingw32 Normal file

20 .mailmap

44 .travis.yml

10 Android.common.mk

27 Android.mk

3 README.rst

41 REVIEWERS

23 SConstruct

2 VERSION

41 appveyor.yml

15 bin/.cherry-ignore

0 src/gallium/drivers/panfrost/include/meson.build → bin/__init__.py

35 bin/bugzilla_mesa.sh

272 bin/gen_release_notes.py Executable file

62 bin/gen_release_notes_test.py Normal file

4 bin/get-pick-list.sh

3 bin/install_megadrivers.py

2 bin/meson.build

117 bin/post_version.py Executable file

29 bin/shortlog_mesa.sh

172 bin/symbols-check.py Normal file

13 common.py


12

docs/application-issues.html
