Comparing 11adea4b24...2e7833ad91 - mesa

fran/mesa

Author	SHA1	Message	Date
Dylan Baker	2e7833ad91	Version: update to 19.0-rc5	2019-02-19 11:15:18 -08:00
Tapani Pälli	0a2e4b02ca	mesa: return NULL if we exceed MaxColorAttachments in get_fb_attachment This fixes invalid access to Attachment array which would occur if caller would exceed MaxColorAttachments. In practice this should not ever happen because DiscardFramebufferEXT specifies only GL_COLOR_ATTACHMENT0 to be valid and InvalidateFramebuffer will error out before but this should make coverity happy. v2: const, remove _EXT (Ian) CID: 1442559 Fixes: `0c42b5f3cb` "mesa: wire up InvalidateFramebuffer" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit `9762a9f893`)	2019-02-19 07:08:42 -08:00
Rhys Perry	c7fc61d15b	radv: ensure export arguments are always float So that the signature is correct and consistent, the inputs to an export intrinsic should always be 32-bit floats. This and the previous commit fix a large number of crashes in the dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_int_* tests. Fixes: `b722b29f10` ('radv: add support for 16bit input/output') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `0ca550e01a`)	2019-02-19 07:08:23 -08:00
Rhys Perry	1b093b567f	radv: bitcast 16-bit outputs to integers 16-bit outputs are stored as 16-bit floats in the outputs array, so they have to be bitcast. Fixes: `b722b29f10` ('radv: add support for 16bit input/output') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `64065aa504`)	2019-02-19 07:08:11 -08:00
Eric Anholt	d73e48b63f	v3d: Fix the check for "is the last thrsw inside control flow" The execute.file check used to be good enough, until I stopped setting up the execute mask for uniform ifs. No known tests fixed, noticed while doing a refactor. Fixes: `0805060573` ("v3d: Handle dynamically uniform IF statements with uniform control flow.") (cherry picked from commit `441294962c`)	2019-02-19 07:07:54 -08:00
Eric Anholt	ba24ca67f6	v3d: Use the early_fragment_tests flag for the shader's disable-EZ field. Apparently we need disable-EZ flagged, not just "does Z writes". Fixes dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo on 7278, even though it passed in simulation. Signed-off-by: Eric Anholt <eric@anholt.net> Fixes: `051a41d3d5` ("v3d: Add support for the early_fragment_tests flag.") (cherry picked from commit `cd5e0b2729`)	2019-02-19 07:07:38 -08:00
Samuel Pitoiset	110500cc8a	radv: fix writing the alpha channel of MRT0 when alpha coverage is enabled This version is better and safer. Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `47616810ed`)	2019-02-19 07:07:22 -08:00
Samuel Pitoiset	0b9f6ebfbb	radv: write the alpha channel of MRT0 when alpha coverage is enabled Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109597 Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `0d8f096293`)	2019-02-19 07:07:13 -08:00
Kenneth Graunke	69ebf4569a	nir: Don't reassociate add/mul chains containing only constants The idea here is to reassociate a * (b * c) into (a * c) * b, when b is a non-constant value, but a and c are constants, allowing them to be combined. But nothing was enforcing that 'b' must be non-constant, which meant that running opt_algebraic in a loop would never terminate if the IR contained non-folded constant expressions like 256 * 0.5 * 2. Normally, we call constant folding in such a loop too, but IMO it's better for nir_opt_algebraic to be robust and not rely on that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109581 Fixes: `32e266a9a5` i965: Compile fp64 funcs only if we do not have 64-bit hardware support Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit `535251487b`)	2019-02-19 07:07:04 -08:00
Matt Turner	385b736238	intel/compiler/test: Add unit test for mismatched signedness comparison v2 (idr): Move adding the test to after adding the fix. Reordering the two commits prevents possible headaches for git-bisect with scripts that always do 'ninja check'. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109404 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit `ac21dd4aee`)	2019-02-15 12:03:53 -08:00
Matt Turner	4cf1a40f9a	intel/compiler: Avoid propagating inequality cmods if types are different v2: Fix silly bug in logic. s/\|\|/&&/ All but one of the affected shaders is in an Unreal4 demo. The other is in Tomb Raider. All of the cases that Ian investigated appear to be sequences like the following if (int(uint(some_float)) < 0) /* other relations too */ ... At least in Tomb Raider, it's not obvious that this sequence came from the original shader. In some of the Unreal demos, the shader contains code like if (int(uint(textureLod(...))) > 0) ... which explicitly generates the offending sequence. All Gen6+ platforms had similar results (Skylake shown): total instructions in shared programs: 15437170 -> 15437187 (<.01%) instructions in affected programs: 4492 -> 4509 (0.38%) helped: 0 HURT: 17 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.05% max: 0.73% x̄: 0.66% x̃: 0.73% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %-change: 0.57% 0.75% Instructions are HURT. total cycles in shared programs: 383007996 -> 383007992 (<.01%) cycles in affected programs: 20542 -> 20538 (-0.02%) helped: 6 HURT: 7 helped stats (abs) min: 2 max: 6 x̄: 5.33 x̃: 6 helped stats (rel) min: 0.11% max: 0.36% x̄: 0.32% x̃: 0.36% HURT stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.27% max: 0.27% x̄: 0.27% x̃: 0.27% 95% mean confidence interval for cycles value: -3.30 2.69 95% mean confidence interval for cycles %-change: -0.19% 0.19% Inconclusive result (value mean confidence interval includes 0). No changes on Iron Lake or GM45. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109404 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: nagrigoriadis@gmail.com Tested-by: Danylo Piliaiev <danylo.piliaiev@gmail.com> (cherry picked from commit `2dff9a66b6`)	2019-02-15 12:03:44 -08:00
Jason Ekstrand	81e053b757	intel/fs: Bail in optimize_extract_to_float if we have modifiers This fixes a bug in RuneScape where we were optimizing x >> 16 to an extract and then negating and converting to float. The NIR to fs pass was dropping the negate on the floor, breaking a geometry shader and causing it to render nothing. Fixes: `1f862e923c` "i965/fs: Optimize float conversions of byte/word..." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109601 Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (cherry picked from commit `367b0ede4d`)	2019-02-15 10:03:26 -08:00
Ilia Mirkin	f30fb27665	swr: set PIPE_CAP_MAX_VARYINGS correctly Unfortunately swr was missed in the original commit. The number of varyings should generally match up to what's reported as the shader caps for fragment inputs. Fixes: `6010d7b8e8` (gallium: add PIPE_CAP_MAX_VARYINGS) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Alok Hota <alok.hota@intel.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `8c859367df`)	2019-02-15 10:03:20 -08:00
Kenneth Graunke	1039285288	anv: Put MOCS in the correct location My patch to switch from struct-based MOCS to numeric MOCS accidentally divided all MOCS entries by 2 in the Vulkan driver. MOCS on Gen9+ is just an array index into a table. But in the hardware packets, the index starts at bit 1. So we need to shift it. Fixes: `0b44644ca6` (genxml: Consistently use a numeric "MOCS" field) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `39aee57523`)	2019-02-15 10:03:08 -08:00
Ian Romanick	59812ac38d	spirv: Add missing break Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: `c6465fec0c` ("spirv: add SpvCapabilityInt64Atomics") CID: 1442555 (cherry picked from commit `9a918050e0`)	2019-02-14 09:30:54 -08:00
Dylan Baker	c19ce6e5e2	meson: Add dependency on genxml to anvil Currently the Intel "anvil" driver races with the generation of genxml files, while i965 has an explicit dependency. This patch adds the same dependency to anvil. Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `279060cd32`)	2019-02-14 09:30:44 -08:00
Samuel Pitoiset	eba57c29b0	radv: always export gl_SampleMask when the fragment shader uses it For some reasons, this breaks trees rendering in Project Cars. Fixes: `85010585cd` ("radv: only enable gl_SampleMask if MSAA is enabled too") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109401 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `334da034d8`)	2019-02-14 09:30:38 -08:00
Samuel Pitoiset	e304007d87	radv/winsys: fix BO list creation when RADV_DEBUG=allbos is set Fixes: `50fd253bd6` ("radv/winsys: Add priority handling during submit.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `5e18000d1b`)	2019-02-14 09:30:33 -08:00
Dylan Baker	b4419fdba5	get-pick-list: Add --pretty=medium to the arguments for Cc patches Because none of them have been picked up for 19.0 due to this bug being reintroduced. v2: - Fix fixes tags Fixes: `e6b3a3b201` ("bin/get-pick-list.sh: handle "typod" usecase.") Fixes: `fac10169bb` ("bin/get-pick-list.sh: prefix output with "[stable] "") Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `aff52dd2c6`)	2019-02-13 14:14:35 -08:00
Karol Herbst	7ac15d9e42	nir/opt_if: don't mark progress if nothing changes If we have something like this: loop { ... if x { break; } else { continue; } } opt_if_loop_last_continue returns true, marking progress although nothing changes. Fixes: `5921a19d4b` "nir: add if opt opt_if_loop_last_continue()" Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `7e08f22a72`)	2019-02-13 14:14:35 -08:00
Oscar Blumberg	6b48451110	radeonsi: Fix guardband computation for large render targets Stop using 12.12 quantization for viewports that are not contained in the lower 4k corner of the render target as the hardware needs to keep both absolute and relative coordinates representable. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `3c540e0a74`)	2019-02-13 14:14:35 -08:00
Dylan Baker	838baab472	version: bump for 19.0-rc4	2019-02-13 09:11:02 -08:00
Juan A. Suarez Romero	d8534f931c	anv/cmd_buffer: check for NULL framebuffer This can happen when we record a VkCmdDraw in a secondary buffer that was created inheriting from the primary buffer, but with the framebuffer set to NULL in the VkCommandBufferInheritanceInfo. Vulkan 1.1.81 spec says that "the application must ensure (using scissor if necessary) that all rendering is contained in the render area [...] [which] must be contained within the framebuffer dimensions". While this should be done by the application, commit `465e5a86` added the clamp to the framebuffer size, in case the application does not do it. But this requires knowing the framebuffer dimensions. If we do not have a framebuffer at that moment, the best compromise we can make is to just apply the scissor as it is, and let the application ensure the rendering is contained in the render area. v2: do not clamp to framebuffer if there isn't a framebuffer v3 (Jason): - clamp earlier in the conditional - clamp to render area if command buffer is primary v4: clamp also x and y to render area (Jason) v5: rename used variables (Jason) Fixes: `465e5a86` ("anv: Clamp scissors to the framebuffer boundary") CC: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `1ad26f9417`)	2019-02-12 14:19:52 -08:00
Samuel Pitoiset	1f33f3cf3a	radv: fix using LOAD_CONTEXT_REG with old GFX ME firmwares on GFX8 This fixes a critical issue. Cc: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109575 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `1b8983c25b`)	2019-02-12 14:19:52 -08:00
Samuel Pitoiset	fbcd1ad42c	radv: fix compiler issues with GCC 9 "The C standard says that compound literals which occur inside of the body of a function have automatic storage duration associated with the enclosing block. Older GCC releases were putting such compound literals into the scope of the whole function, so their lifetime actually ended at the end of containing function. This has been fixed in GCC 9. Code that relied on this extended lifetime needs to be fixed, move the compound literals to whatever scope they need to accessible in." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109543 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `129a9f4937`)	2019-02-12 14:19:52 -08:00
Kenneth Graunke	9a5c8d2aab	st/mesa: Limit GL_MAX_[NATIVE_]PROGRAM_PARAMETERS_ARB to 2048 Piglit's vp-max-array test creates a vertex program containing a uniform array sized to the value of GL_MAX_NATIVE_PROGRAM_PARAMETERS_ARB. Mesa will then add additional state-var parameters for things like the MVP matrix. radeonsi currently exposes a value of 4096, derived from constant buffer upload size. This means the array will have 4096 elements, and the extra MVP state-vars would get a prog_src_register::Index of over 4096. Unfortunately, prog_src_register::Index is a signed 13-bit integer, so values beyond 4096 end up turning into negative numbers. Negative source indexes are only valid for relative addressing, so this ends up generating illegal IR. In prog_to_nir, this would cause an out of bounds array access. st_mesa_to_tgsi checks for a negative value, assumes it's bogus, and remaps it to parameter 0 in order to get something in-range. This isn't right - instead of reading the MVP matrix, it would read the first element of the vertex program's large array. But the test only checks that the program compiles, so we never noticed that it was broken. This patch limits the size of the program limits, with the understanding that we may need to generate additional state-vars internally. i965 has exposed 1024 for this limit for years, so I don't expect lowering it to 2048 will cause any practical problems for radeonsi or other drivers. Fixes vp-max-array with prog_to_nir.c. Cc: "19.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `f45dd6d31b`)	2019-02-12 14:19:52 -08:00
Leo Liu	c55008e5a0	st/va/vp9: set max reference as default of VP9 reference number If there is no information about number of render targets Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a0a52a0367`)	2019-02-12 14:19:52 -08:00
Leo Liu	ab585817e6	st/va: fix the incorrect max profiles report Add "PIPE_VIDEO_PROFILE_MAX" to enum, so it will make sure here will be correct when adding more profiles in the future. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109107 Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `21cdb828a3`)	2019-02-12 14:19:52 -08:00
Marek Olšák	75bec50c2a	winsys/amdgpu: don't drop manually added fence dependencies wow, it's hard to believe that fence and syncobjs dependencies were ignored. Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `ddfe209a0d`)	2019-02-12 14:19:52 -08:00
Marek Olšák	62b3bd8cd1	radeonsi: fix EXPLICIT_FLUSH for flush offsets > 0 Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `61c678d4bc`)	2019-02-12 14:19:52 -08:00
Marek Olšák	fb3485bc92	gallium/u_threaded: fix EXPLICIT_FLUSH for flush offsets > 0 Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (cherry picked from commit `4522f01d4e`)	2019-02-12 14:19:52 -08:00
Ilia Mirkin	2a97a3a8e7	nvc0: we have 16k-sized framebuffers, fix default scissors For some reason we don't use view volume clipping by default, and use scissors instead. These scissors were set to an 8k max fb size, while the driver advertises 16k-sized framebuffers. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `cc79a1483f`)	2019-02-12 14:19:52 -08:00
Karol Herbst	ab70eccc75	st/mesa: require RGBA2, RGB4, and RGBA4 to be renderable If the driver does not support rendering to these formats but does support texturing, we can end up in incompatibilities between textures and renderbuffers that are then copied to. Fixes KHR-GL45.copy_image.functional on nvc0 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `cbd1ad6165`)	2019-02-12 14:19:52 -08:00
Karol Herbst	24bb2771b6	gallium: add PIPE_CAP_MAX_VARYINGS Some NVIDIA hardware can accept 128 fragment shader input components, but only have up to 124 varying-interpolated input components. We add a new cap to express this cleanly. For most drivers, this will have the same value as PIPE_SHADER_CAP_MAX_INPUTS for the fragment shader. Fixes KHR-GL45.limits.max_fragment_input_components Conflicts resolved by Dylan Signed-off-by: Karol Herbst <karolherbst@gmail.com> [imirkin: rebased, improved docs/commit message] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `6010d7b8e8`)	2019-02-12 14:19:52 -08:00
Karol Herbst	7b5e0f8316	gm107/ir: add fp64 rsq Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `cce4955721`)	2019-02-12 14:19:52 -08:00
Karol Herbst	77102d0151	gm107/ir: add fp64 rcp Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `815a8e59c6`)	2019-02-12 14:19:52 -08:00
Karol Herbst	c96d433105	gk104/ir: Use the new rcp/rsq in library [imirkin: add a few more "long" prefixes to safen things up] Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `12669d2970`)	2019-02-12 14:19:52 -08:00
Boyan Ding	81810fa5db	gk110/ir: Use the new rcp/rsq in library v2: (Karol Herbst <kherbst@redhat.com> * fix Value setup for the builtins Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> [imirkin: track the fp64 flag when switching ops to calls] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `656ad06051`)	2019-02-12 14:19:52 -08:00
Boyan Ding	c5b9774eb4	gk110/ir: Add rsq f64 implementation Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `7937408052`)	2019-02-12 14:19:52 -08:00
Boyan Ding	a08aba86da	gk110/ir: Add rcp f64 implementation Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `04593d9a73`)	2019-02-12 14:19:52 -08:00
Ilia Mirkin	d278b3c187	nvc0: stick zero values for the compute invocation counts Not quite perfect, but at least we don't end up with random values in the query buffer. Fixes KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `6adb9b38bf`)	2019-02-12 14:19:52 -08:00
Ilia Mirkin	5a9b7bce9c	nv50,nvc0: use condition for occlusion queries when already complete For the NO_WAIT variants, we would jump into the ALWAYS case for both nested and inverted occlusion queries. However if the query had previously completed, the application could reasonably expect that the render condition would follow that result. To resolve this, we remove the nesting distinction which unnecessarily created an imbalance between the regular and inverted cases (since there's no "zero" condition mode). We also use the proper comparison if we know that the query has completed (which could happen as a result of an earlier get_query_result call). Fixes KHR-GL45.conditional_render_inverted.functional Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `e00799d3dc`)	2019-02-12 14:19:52 -08:00
Ilia Mirkin	b9e5e15f87	nvc0: fix 3d images on kepler Looks like SUBFM.3D and SUEAU are perfectly capable of dealing with 3d tiling, they just need the correct inputs. Supply them. We also have to deal with the case where a 2d "layer" of a 3d image is bound. In this case, we supply the z coordinate separately to the shader, which has to optionally treat every 2d case as if it could be a slice of a 3d texture. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `162352e671`)	2019-02-12 14:19:52 -08:00
Ilia Mirkin	f305135e0b	nvc0/ir: always use CG mode for loads from atomic-only buffers Atomic operations don't update the local cache, which means that we would have to issue CCTL operations in order to get the updated values. When we know that a buffer is primarily used for atomic operations, it's easier to just avoid the caching at that level entirely. The same issue persists for non-atomic buffers, which will have to be fixed separately. Fixes the failing dEQP-GLES31.functional.atomic_counter.* tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `4443b6ddf2`)	2019-02-12 14:19:52 -08:00
Ilia Mirkin	eb766a259e	nvc0: add support for handling indirect draws with attrib conversion The hardware does not natively support FIXED and DOUBLE formats. If those are used in an indirect draw, they have to be converted. Our conversion tries to be clever about only converting the data that's needed. However for indirect, that won't work. Given that DOUBLE or FIXED are highly unlikely to ever be used with indirect draws, read the indirect buffer on the CPU and issue draws directly. Fixes the failing dEQP-GLES31.functional.draw_indirect.random.* tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `399215eb7a`)	2019-02-12 14:19:52 -08:00
Bas Nieuwenhuizen	a1ae60e9a3	amd/common: Use correct writemask for shared memory stores. The check was for 1 bit being set, which is clearly not what we want. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `3c24fc64c7`)	2019-02-12 14:19:52 -08:00
Bas Nieuwenhuizen	37ade3a566	radv: Only look at pImmutableSamples if the descriptor has a sampler. Equivalent of ANV patch `c7f4a2867c` CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `39ab4e12f7`)	2019-02-12 14:19:52 -08:00
Bart Oldeman	92fa6d6959	gallium-xlib: query MIT-SHM before using it. When Mesa is compiled for gallium-xlib using e.g. ./configure --enable-glx=gallium-xlib --disable-dri --disable-gbm -disable-egl and is used by an X server (usually remotely via SSH X11 forwarding) that does not support MIT-SHM such as XMing or MobaXterm, OpenGL clients report error messages such as Xlib: extension "MIT-SHM" missing on display "localhost:11.0". ad infinitum. The reason is that the code in src/gallium/winsys/sw/xlib uses MIT-SHM without checking for its existence, unlike the code in src/glx/drisw_glx.c and src/mesa/drivers/x11/xm_api.c. I copied the same check using XQueryExtension, and tested with glxgears on MobaXterm. This issue was reported before here: https://lists.freedesktop.org/archives/mesa-users/2016-July/001183.html Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a203eaa4f4`)	2019-02-12 14:19:52 -08:00
Ilia Mirkin	5e85df1cfd	nv50,nvc0: add explicit settings for recent caps Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `38f542783f`)	2019-02-12 14:19:52 -08:00
Marek Olšák	e9dc4e252f	meson: drop the xcb-xrandr version requirement autotools doesn't have any requirement. This fixes meson on Ubuntu 16.04. Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (cherry picked from commit `1e85cfb91a`)	2019-02-12 14:19:52 -08:00
Dylan Baker	56a47e3421	Bump version for 19.0-rc3	2019-02-12 12:39:36 -08:00
Dylan Baker	ca36eb12fd	Revert "intel/compiler: More peephole select" This reverts commit `8fb8ebfbb0`.	2019-02-12 09:42:59 -08:00
Dylan Baker	9dd433dfa7	Revert "nir/opt_peephole_select: Don't peephole_select expensive math instructions" This reverts commit `378f996771`. This also removes the default true argument from the a2xx nir backend, which was introduced after this commit. There should be no change in functionality.	2019-02-12 09:42:16 -08:00
Dylan Baker	f59c77ef8c	Revert "intel/compiler: More peephole_select for pre-Gen6" This reverts commit `af07141b33`.	2019-02-11 16:26:01 -08:00
Dylan Baker	61c22ba94b	cherry-ignore: Add some patches	2019-02-11 16:24:42 -08:00
Jason Ekstrand	ad2b712a56	nir/deref: Rematerialize parents in rematerialize_derefs_in_use_blocks When nir_rematerialize_derefs_in_use_blocks_impl was first written, I attempted to optimize things a bit by not bothering to re-materialize the sources of deref instructions, figuring that the final caller would take care of that. However, in the case of more complex deref chains where the first link or two lives in block A and then another link and the load/store_deref intrinsic live in block B, it doesn't work. The code in rematerialize_deref_in_block looks at the tail of the chain, sees that it's already in block B and skips it, not realizing that part of the chain also lives in block A. The easy solution here is to just rematerialize deref sources of deref instructions as well. This may potentially lead to a few more deref instructions being created, but the conditions required for that to actually happen are fairly unlikely and, thanks to the caching, it's all linear time regardless. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109603 Fixes: `7d1d1208c2` "nir: Add a small pass to rematerialize derefs per-block" Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (cherry picked from commit `9e6a6ef0d4`)	2019-02-11 16:24:42 -08:00
Ian Romanick	07e299a0a0	nir: Silence zillions of unused parameter warnings in release builds Fixes: `cd56d79b59` "nir: check NIR_SKIP to skip passes by name" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (cherry picked from commit `78169870e4`)	2019-02-11 09:07:09 -08:00
Ilia Mirkin	36d99d9ad0	nvc0/ir: fix second tex argument after levelZero optimization We used to pre-set a bunch of extra arguments to a texture instruction in order to force the RA to allocate a register at the boundary of 4. However with the levelZero optimization, which removes a LOD argument when it's uniformly equal to zero, we undid that logic by removing an extra argument. As a result, we could end up with insufficient alignment on the second wide texture argument. Instead we switch to a different method of achieving the same result. The logic runs during the constraint analysis of the RA, and adds unset sources as necessary right before being merged into a wide argument. Fixes MISALIGNED_REG errors in Hitman when run with bindless textures enabled on a GK208. Fixes: `9145873b15` ("nvc0/ir: use levelZero flag when the lod is set to 0") Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `5de5beedf2`)	2019-02-07 09:51:39 -08:00
Kristian H. Kristensen	94f0908216	freedreno/a6xx: Emit blitter dst with OUT_RELOCW We're writing to the bo and the kernel needs to know for fd_bo_cpu_prep() to work. Fixes: `f93e431272` ("freedreno/a6xx: Enable blitter") Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> (cherry picked from commit `357ea7da51`)	2019-02-07 09:51:39 -08:00
Bas Nieuwenhuizen	f880c74717	amd/common: handle nir_deref_cast for shared memory from integers. Can happen e.g. after a phi. Fixes: `a2b5cc3c39` "radv: enable variable pointers" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `8d1718590b`)	2019-02-07 09:51:39 -08:00
Bas Nieuwenhuizen	6f36d3bbc0	amd/common: Handle nir_deref_type_ptr_as_array for shared memory. Fixes: `a2b5cc3c39` "radv: enable variable pointers" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `830fd0efc1`)	2019-02-07 09:51:39 -08:00
Bas Nieuwenhuizen	b4e8a3294c	amd/common: Add gep helper for pointer increment. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `e00d9a9a72`)	2019-02-07 09:51:39 -08:00
Bas Nieuwenhuizen	ef6809ba88	amd/common: Fix stores to derefs with unknown variable. Fixes: `a2b5cc3c39` "radv: enable variable pointers" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `dbdb44d575`)	2019-02-07 09:38:23 -08:00
Bas Nieuwenhuizen	7254d2f4a3	radv: Fix the shader info pass for not having the variable. For example with VK_EXT_buffer_device_address or VK_KHR_variable_pointers. Fixes: `a2b5cc3c39` "radv: enable variable pointers" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `00253ab2c4`)	2019-02-07 09:37:37 -08:00
Eric Engestrom	dbc43e3897	xvmc: fix string comparison Fixes: `6fca18696d` "g3dvl: Update XvMC unit tests." Cc: Younes Manton <younes.m@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `40b53a7203`)	2019-02-07 09:37:17 -08:00
Eric Engestrom	262fd16b99	xvmc: fix string comparison Fixes: `c7b65dcaff` "xvmc: Define some Xv attribs to allow users to specify color standard and procamp" Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit `110a6e1839`)	2019-02-07 09:37:07 -08:00
Jonathan Marek	452f9b9984	freedreno: a2xx: fix fast clear Fixes: `912a9c8d` Signed-off-by: Jonathan Marek <jonathan@marek.ca> Cc: 19.0 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `3361305f57`)	2019-02-06 09:54:31 -08:00
Dylan Baker	131f12d49f	Version: Bump for rc2	2019-02-05 11:49:03 -08:00
Emil Velikov	f8f68c41a1	anv: wire up the state_pool_padding test Cc: Jason Ekstrand <jason@jlekstrand.net> Fixes: `927ba12b53` ("anv/tests: Adding test for the state_pool padding.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (cherry picked from commit `8943eb8f03`)	2019-02-05 11:41:54 -08:00
Michel Dänzer	15e2fc16e9	loader/dri3: Use strlen instead of sizeof for creating VRR property atom sizeof counts the terminating null character as well, so that also contributed to the ID computed for the X11 atom. But the convention is for only the non-null characters to contribute to the atom ID. Fixes: `2e12fe425f` "loader/dri3: Enable adaptive_sync via _VARIABLE_REFRESH property" Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net> (cherry picked from commit `c0a540f320`)	2019-02-05 11:41:48 -08:00
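The sizeof/strlen difference described in the commit above can be illustrated with a minimal C sketch. The helper names are hypothetical and the snippet is not the loader/dri3 code; it only shows that `sizeof` on a char array counts the terminating null while `strlen` does not, so the two differ by exactly one.

```c
#include <stddef.h>
#include <string.h>

/* Property name taken from the commit title; the array form is an
 * assumption made for illustration. */
static const char prop_name[] = "_VARIABLE_REFRESH";

/* Length as sizeof sees it: includes the terminating null character. */
static size_t name_len_sizeof(void)
{
    return sizeof(prop_name);
}

/* Length as strlen sees it: only the non-null characters, which is
 * what the atom naming convention expects. */
static size_t name_len_strlen(void)
{
    return strlen(prop_name);
}
```

Passing the `sizeof` value as a name length would make the null byte part of the name, which is the mismatch the commit message describes.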
Marek Olšák	3f5099180d	radeonsi: fix crashing performance counters (division by zero) Fixes: `e2b9329f17` "radeonsi: move remaining perfcounter code into si_perfcounter.c" (cherry picked from commit `742d6cdb42`)	2019-02-05 09:05:51 -08:00
Danylo Piliaiev	9667d89fe6	anv: Fix VK_EXT_transform_feedback working with varyings packed in PSIZ Transform feedback did not set correct SO_DECL.ComponentMask for varyings packed in VARYING_SLOT_PSIZ: gl_Layer - VARYING_SLOT_LAYER in VARYING_SLOT_PSIZ.y gl_ViewportIndex - VARYING_SLOT_VIEWPORT in VARYING_SLOT_PSIZ.z gl_PointSize - VARYING_SLOT_PSIZ in VARYING_SLOT_PSIZ.w Fixes: `36ee2fd61c` "anv: Implement the basic form of VK_EXT_transform_feedback" Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (cherry picked from commit `64d3b148fe`)	2019-02-04 09:16:37 -08:00
Jason Ekstrand	c6649ca94d	intel/fs: Do the grf127 hack on SIMD8 instructions in SIMD16 mode Previously, we only applied the fix to shaders with a dispatch mode of SIMD8 but the code it relies on for SIMD16 mode only applies to SIMD16 instructions. If you have a SIMD8 instruction in a SIMD16 shader, neither would trigger and the restriction could still be hit. Fixes: `232ed89802` "i965/fs: Register allocator shoudn't use grf127..." Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit `b4f0d062cd`)	2019-02-04 09:16:21 -08:00
Neha Bhende	89f84f98e0	st/mesa: Fix topogun-1.06-orc-84k-resize.trace crash We need to initialize all fields in rs->prim explicitly while creating new rastpos stage. Fixes: `bac8534267` ("st/mesa: allow glDrawElements to work with GL_SELECT feedback") v2: Initializing all fields in rs->prim as per Ilia. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (cherry picked from commit `69d736b17a`)	2019-02-01 09:19:29 -08:00
Ernestas Kulik	c824f8031c	v3d: Fix leak in resource setup error path Reported by Coverity: in the case of unsupported modifier request, the code does not jump to the “fail” label to destroy the acquired resource. CID: 1435704 Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com> Fixes: `45bb8f2957` ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.") (cherry picked from commit `90458bef54`)	2019-01-31 11:12:29 -08:00
Eric Anholt	7fdb08375f	v3d: Fix image_load_store clamping of signed integer stores. This was copy-and-paste fail, that oddly showed up in the CTS's reinterprets of r32f, rgba8, and srgba8 to rgba8i, but not r32ui and r32i to rgba8i or reinterprets to other signed int formats. Fixes: `6281f26f06` ("v3d: Add support for shader_image_load_store.") (cherry picked from commit `ab4d5775b0`)	2019-01-31 11:09:28 -08:00
Eric Anholt	535cc4f1d5	mesa: Skip partial InvalidateFramebuffer of packed depth/stencil. One of the CTS cases tries to invalidate just stencil of packed depth/stencil, and we incorrectly lost the depth contents. Fixes dEQP-GLES3.functional.fbo.invalidate.whole.unbind_read_stencil Fixes: `0c42b5f3cb` ("mesa: wire up InvalidateFramebuffer") Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `db2ae51121`)	2019-01-31 11:09:05 -08:00
Rob Clark	7f91ae20b9	freedreno: more fixing release tarball Fixes: `aa0fed10d3` freedreno: move ir3 to common location Signed-off-by: Rob Clark <robdclark@gmail.com> (cherry picked from commit `39cfdf9930`)	2019-01-31 11:08:53 -08:00
Rob Clark	0a72505a9e	freedreno: fix release tarball Fixes: `b4476138d5` freedreno: move drm to common location Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Rob Clark <robdclark@gmail.com> (cherry picked from commit `e252656d14`)	2019-01-31 11:08:11 -08:00
Samuel Pitoiset	31d0079a20	radv/winsys: fix hash when adding internal buffers This fixes serious stuttering in Shadow Of The Tomb Raider. Fixes: `50fd253bd6` ("radv/winsys: Add priority handling during submit.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `9c762c01c8`)	2019-01-31 11:07:40 -08:00
Ernestas Kulik	4d1dd3b0cd	vc4: Fix leak in HW queries error path Reported by Coverity: in the case where there exist hardware and non-hardware queries, the code does not jump to err_free_query and leaks the query. CID: 1430194 Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com> Fixes: `9ea90ffb98` ("broadcom/vc4: Add support for HW perfmon") (cherry picked from commit `f6e49d5ad0`)	2019-01-31 11:07:26 -08:00
Emil Velikov	45d1aa2f6c	vc4: Declare the last cpu pointer as being modified in NEON asm. Earlier commit addressed 7 of the 8 instances available. v2: Rebase patch back to master (by anholt) Cc: Carsten Haitzler (Rasterman) <raster@rasterman.com> Cc: Eric Anholt <eric@anholt.net> Fixes: `300d3ae8b1` ("vc4: Declare the cpu pointers as being modified in NEON asm.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `385843ac3c`)	2019-01-31 10:59:58 -08:00
Dylan Baker	2fddad9e3f	VERSION: bump to 19.0.0-rc1	2019-01-30 14:10:12 -08:00
Dylan Baker	2b603ee4f1	android,autotools,i965: Fix location of float64_glsl.h Android.mk and autotools disagree about where generated files should go, which wasn't a problem until we wanted to build a dist tarball. This corrects the problem by changing the output and include paths to be the same on android and autotools (meson already has the correct include path). Fixes: `7d7b30835c` ("automake: Fix path to generated source")	2019-01-30 14:10:12 -08:00
Dylan Baker	e7f6a5d17f	automake: Add --enable-autotools to distcheck flags Fixes: `e68777c87c` ("autotools: Deprecate the use of autotools")	2019-01-30 09:45:14 -08:00
Dylan Baker	1f5f12687f	configure: Bump SWR LLVM requirement to 7 It is currently impossible to build a dist tarball that works when SWR requires LLVM 6. To generate the tarball we'd need to configure with LLVM 6, which is fine. But to build the dist check we need LLVM 7, as RadeonSI and RadV require that version. Unfortunately the headers generated with LLVM 6 don't compile with LLVM 7, because the API has changed between the two versions. I weighed a couple of options here. One would be to ship an unbootstrapped tarball generated with meson. This would fix the issue by not bootstrapping, so whatever version of LLVM used would work because the SWR headers would be generated at compile time. Unfortunately this would involve some heavy modifications to the infrastructure used to upload the tarballs, and I've decided not to pursue this.	2019-01-30 09:27:14 -08:00
Dylan Baker	90a7a9c973	automake: Add include dir for nir src directory Fixes: `6281f26f06` ("v3d: Add support for shader_image_load_store.") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-01-29 23:24:57 +00:00
Dylan Baker	82365595e9	automake: Add float64.glsl to dist tarball Fixes: `b63a1f8e40` ("glsl: Create file to contain software fp64 functions") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-01-29 23:24:57 +00:00
Dylan Baker	7d7b30835c	automake: Fix path to generated source Fixes: `b63a1f8e40` ("glsl: Create file to contain software fp64 functions") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-01-29 23:24:57 +00:00
Matt Turner	9de90caca8	nir: Optimize double-precision lower_round_even() Use the trick of adding and then subtracting 2**52 (52 is the number of explicit mantissa bits a double-precision floating-point value has) to implement round-to-even. Cuts the number of instructions on SKL of the piglit test fs-roundEven-double.shader_test from 109 to 21. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-01-29 15:02:23 -08:00
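The add-then-subtract 2**52 trick mentioned in the commit above can be sketched in plain C. This is an illustration only, not the NIR lowering itself: the function name is hypothetical, and it assumes IEEE 754 doubles with the default round-to-nearest-even rounding mode. At magnitude 2**52 the spacing between doubles is exactly 1, so the addition forces the hardware to round the fractional part away, with ties going to the even integer.

```c
/* Hypothetical helper: round-to-nearest-even for doubles using the
 * 2^52 trick. Assumes the default IEEE 754 rounding mode. */
static double round_even_magic(double x)
{
    const double two52 = 4503599627370496.0; /* 2^52 */
    double ax = x < 0.0 ? -x : x;            /* magnitude of x */

    /* Values of magnitude >= 2^52 are already integers; NaN also
     * fails the comparison and is returned unchanged. */
    if (!(ax < two52))
        return x;

    /* volatile keeps the compiler from folding (ax + 2^52) - 2^52
     * back into ax. */
    volatile double t = ax + two52;
    double r = t - two52;

    /* Restore the sign. Note: this sketch returns +0.0 for -0.0. */
    return x < 0.0 ? -r : r;
}
```

For example, 2.5 rounds to 2.0 and 3.5 rounds to 4.0, since ties go to the even neighbour.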
Marek Olšák	3e249b853e	ac: use the correct LLVM processor name on Raven2 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-29 17:46:55 -05:00
Eric Anholt	f7769b5121	v3d: Fix the autotools build. Noticed while looking at the gitlab-CI MR.	2019-01-29 14:00:27 -08:00
Jonathan Marek	31a1348a66	freedreno: fix sysmem rendering being used when clear is used This batch->cleared value is only used to decide to use sysmem rendering or not, so it should include any buffers that are affected by a clear. This is required because the a2xx fast clear doesn't work with sysmem rendering. The a22x "normal" clear path doesn't work with sysmem either. Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-01-29 20:22:33 +00:00
Jonathan Marek	c93d77431f	freedreno: fix depth usage logic Depth can be used even when there is no restore/resolve of depth. This happens when the depth buffer is invalidated after rendering to avoid the resolve operation. Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-01-29 20:22:33 +00:00
Jonathan Marek	bcefa0f1cb	freedreno: fix invalidate logic Set dirty bits on invalidate to trigger invalidate logic in fd_draw_vbo. Also, resource_written for color needs to be after the invalidate logic. Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-01-29 20:22:32 +00:00
Jonathan Marek	786f9639d6	mesa/st: wire up DiscardFramebuffer Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-01-29 20:22:32 +00:00
Rob Clark	0c42b5f3cb	mesa: wire up InvalidateFramebuffer And before someone actually starts implementing DiscardFramebuffer() lets rework the interface to something that is actually usable. Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-01-29 20:22:32 +00:00
Jonathan Marek	e685566612	st/dri: invalidate_resource depth/stencil before flush_resource This allows freedreno to be aware of the depth invalidate when flushing batches on flush_resource. AFAIK, the only other driver which might care about this change is vc4, where I think it should help by allowing the depth invalidate to work with GALLIUM_HUD. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-01-29 20:22:32 +00:00
Mario Kleiner	820dfcea43	egl/wayland-drm: Only announce formats via wl_drm which the driver supports. Check if a pixel format is supported by the Wayland server's gpu driver before exposing it to the client via wl_drm, so we avoid reporting formats to the client which the server gpu can't handle. Restrict this reporting to the new color depth 30 formats for now, as the ARGB/XRGB8888 and RGB565 formats are probably supported by every gpu under the sun. Atm. this is mostly useful to allow proper PRIME renderoffload for depth 30 formats on the typical Intel iGPU + NVidia dGPU "NVidia Optimus" laptop combo. Tested on Intel, AMD, NVidia with single-gpu setup and on an Intel + NVidia Optimus setup. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-01-29 20:03:20 +00:00
Mario Kleiner	a34b0d68bb	egl/wayland: Allow client->server format conversion for PRIME offload. (v2) Support PRIME render offload between a Wayland server gpu and a Wayland client gpu with different channel ordering for their color formats, e.g., between Intel drivers which currently only support ARGB2101010 and XRGB2101010 import/display and nouveau which only supports ABGR2101010 rendering and display on nv-50 and later. In the wl_visuals table, we also store for each format an alternate sibling format which stores colors at the same precision, but with different channel ordering, e.g., ARGB2101010 <-> ABGR2101010. If a given client-gpu renderable format is not supported by the server for import, but the alternate format is supported by the server, expose the client-gpu renderable format as a valid EGLConfig to the client. At eglSwapBuffers time, during the blitImage() detiling blit from the client backbuffer to the linear buffer, the client format is converted to the server supported format. As we have to do a copy for PRIME anyway, this channel swizzling conversion comes essentially for free. Note that even if a server gpu in principle does support sampling from the clients native format, this conversion will be a performance advantage if it allows to convert to the servers preferred format for direct scanout, as the Wayland compositor may then be able to directly page-flip a fullscreen client wl_buffer onto the primary plane, or onto a hardware overlay plane, avoiding an extra data copy for desktop composition. Tested so far under Weston with: nouveau single-gpu, Intel single-gpu, AMD single-gpu, "Optimus" Intel server iGPU for display + NVidia client dGPU for rendering. v2: Implement minor review comments by Eric Engestrom: Add some comment and assert, and some style fixes for clarity. No functional change. 
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2019-01-29 20:03:20 +00:00
Jason Ekstrand	a920979d4f	intel/fs: Use split sends for surface writes on gen9+ Surface reads don't need them because they just have the one address payload. With surface writes, on the other hand, we can put the address and the data in the different halves and avoid building the payload all together. The decrease in register pressure and added freedom in register allocation resulting from this change reduces spilling enough to improve the performance of one customer benchmark by about 2x. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	014edff0d2	intel/fs: Add interference between SENDS sources Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	eab1c55590	intel/fs: Support SENDS in SHADER_OPCODE_SEND Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	cca199fd85	intel/disasm: Properly disassemble split sends Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	8babaa84e8	intel/eu: Add support for the SENDS[C] messages Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	d6a6e10390	intel/inst: Indent some code We're about to add some more if cases so let's have the giant re-indent in its own patch to make review easier. Acked-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	d96969120d	intel/inst: Fix the ia16_addr_imm helpers These have clearly never seen any use.... On gen8, the bottom 4 bits are missing so we need to shift them off before we call set_bits and shift again when we get the bits. Found by inspection. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	e46fb33143	intel/disasm: Rework SEND decoding to use descriptors Instead of fetching the information out of the instruction directly, fetch the descriptor and then pluck the information out of the descriptor. The current scheme works ok for SEND but with SENDS, it all falls to pieces because the descriptor is completely shuffled around. This commit doesn't actually convert everything. One notable exception is URB messages which don't even use descriptors in emit_urb_WRITE yet. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	13a6fabc62	intel/eu: Add more message descriptor helpers We want to be able to extract data from descriptors as well as unify a bit of the descriptor construction. One of the unifications we do is to unify the read/write and dataport descriptors. On gen4-5, read/write are substantially different and the read descriptors change between gen4 and gen4.x. On gen6, they unified layouts between read, write, and dataport. Then, on gen8, they added one bit to the message type field but left it reserved MBZ for read/write messages. This commit chooses to treat that as if they expanded the field everywhere and just didn't have enough enum values for read/write to bother with the extra bit. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	c3aa436bfe	intel/eu/validate: SEND restrictions also apply to SENDC Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	fee6bd8d8e	intel/eu: Use GET_BITS in brw_inst_set_send_ex_desc It's a bit more readable Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	b284d222db	intel/fs: Use SHADER_OPCODE_SEND for varying UBO pulls on gen7+ Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	8514eba693	intel/fs: Use SHADER_OPCODE_SEND for texturing on gen7+ Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	f547cebbe0	intel/fs: Use a logical opcode for IMAGE_SIZE Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	d2d3e04501	intel/fs: Use SHADER_OPCODE_SEND for surface messages Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	7f1cf046cd	intel/fs: Add a generic SEND opcode Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	ba3c5300f9	intel/eu: Rework surface descriptor helpers This commit pulls the surface descriptor helpers out into brw_eu.h and makes them no longer depend on the codegen infrastructure. This should allow us to use them directly from the IR code instead of the generator. This change is unfortunately less mechanical than perhaps one would like but it should be fairly straightforward. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	5b17379631	intel/eu: Add has_simd4x2 bools to surface_write functions Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	2ce93b88c0	intel/fs: Take an explicit exec size in brw_surface_payload_size() Instead of magically falling back to SIMD8 for atomics and typed messages on Ivy Bridge, explicitly figure out the exec size and pass that into brw_surface_payload_size. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	cf42b0f9e2	intel/fs: Handle IMAGE_SIZE in size_read() and is_send_from_grf() Like all the other sends, it's just mlen * REG_SIZE. Fixes: `3cbc02e469` "intel: Use TXS for image_size when we have..." Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Jason Ekstrand	009c0bd840	intel/defines: Explicitly cast to uint32_t in SET_FIELD and SET_BITS If you pass a bool in as the value to set, the C standard says that it gets converted to an int prior to shifting. If you try to set a bool to bit 31, this lands you in undefined behavior. It's better just to add the explicit cast and let the compiler delete it for us. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
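The undefined behavior described in the commit above comes from C's integer promotions: a `bool` operand of `<<` is promoted to `int`, and shifting 1 into bit 31 of a 32-bit signed `int` overflows. A minimal sketch of the cast-first pattern follows. The function name is hypothetical, and this is not the actual `SET_FIELD`/`SET_BITS` macro body.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: place a boolean flag at the given bit position.
 * Casting to uint32_t before shifting keeps the shift in unsigned
 * arithmetic, where shifting into bit 31 is well defined. Without the
 * cast, `value << 31` would be evaluated as a signed int shift. */
static uint32_t set_flag_bit(bool value, unsigned bit)
{
    return (uint32_t)value << bit;
}
```

As the commit message notes, the cast costs nothing at runtime when the operand is already 32 bits wide, since the compiler can drop it.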
Jason Ekstrand	077b9557a4	intel/fs: Get rid of fs_inst::equals There are piles of fields that it doesn't check so using it is a lie. The only reason why it's not causing problem is because it has exactly one user which only uses it for MOV instructions (which aren't very interesting) and only on Sandy Bridge and earlier hardware. Just get rid of it and inline it in the one place that it's actually used. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-29 18:43:55 +00:00
Rob Clark	446a14bc0a	freedreno: minor cleanups Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-29 12:30:50 -05:00
Rob Clark	c3baa077bf	freedreno: stop frob'ing pipe_resource::nr_samples Previously we tried to normalize nr_samples to MAX2(1, nr_samples) to avoid having to deal with 0 vs 1 everywhere. But this causes problems in mesa/st, for example st_finalize_texture() will think there is a nr_samples mismatch and recreate the texture. Somehow this manifests as corrupt x11 font rendering on generations that do not support MSAA (but apparently works fine on a5xx and a6xx which do support MSAA.) Fixes: `cf0c7258ee` freedreno/a5xx: MSAA Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-29 12:30:50 -05:00
Rob Clark	1a6ddfe5ee	freedreno/a6xx: fix blitter nr_samples check nr_samples for non-MSAA case could be either zero or one. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-29 12:22:08 -05:00
Rob Clark	9106a0fe33	freedreno/a5xx: fix blitter nr_samples check nr_samples for non-MSAA case could be either zero or one. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-29 12:21:19 -05:00
Bas Nieuwenhuizen	69edc972fc	radv: Enable VK_EXT_memory_priority. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-29 15:56:56 +01:00
Bas Nieuwenhuizen	50fd253bd6	radv/winsys: Add priority handling during submit. Switched to the raw bo list api to avoid having to use 2 arrays for everything. This was introduced in libdrm 2.4.97 which we already depend upon. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-29 15:56:52 +01:00
Bas Nieuwenhuizen	ead54d4a42	radv/winsys: Set winsys bo priority on creation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-29 15:56:41 +01:00
Samuel Pitoiset	3a8d6c0880	radv: re-enable fast depth clears for 16-bit surfaces on VI This has been disabled some months ago because it introduced rendering issues with Shadow Of Warrier II (DXVK). This game is no longer affected, I wonder if `824cfc1ee5` ("radv: rework the TC-compat HTILE hardware bug with COND_EXEC") fixed the problem. I checked The Forest on my Polaris, and it renders fine too. According to Phillip, this gives +5.5% with Rise Of The Tomb Raider and DXVK. This is because DXVK uses 16-bit depth surfaces while the native port from Feral uses 32-bit depth surfaces. Unfortunately, Shadow Of The Tomb Raider isn't affected because it clears each layer of a D16 array texture individually. So it doesn't hit the fast clear path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-29 15:20:55 +01:00
Eric Anholt	932ed9c00b	vc4: Enable NEON asm on meson cross-builds. The core Mesa with_asm_arch and USE_ARM_ASM flags are disabled for meson cross-builds because of the need to run host binaries on the build system. vc4 doesn't need to do that, so skip with_asm_arch to enable NEON on my cross-builds. Fixes: `ebcb4c2156` ("meson: Enable VC4's NEON assembly support.")	2019-01-28 16:45:48 -08:00
Carsten Haitzler (Rasterman)	300d3ae8b1	vc4: Declare the cpu pointers as being modified in NEON asm. Otherwise, the compiler is free to reuse the register containing the input for another call and assume that the value hasn't been modified. Fixes crashes on texture upload/download with current gcc. We now have to have a temporary for the cpu2 value, since outputs must be lvalues. (commit message by anholt) Fixes: `4d30024238` ("vc4: Use NEON to speed up utile loads on Pi2.")	2019-01-28 16:45:45 -08:00
Carsten Haitzler (Rasterman)	522f688471	vc4: Use named parameters for the NEON inline asm. This makes the asm code more intelligible and clarifies the functional change in the next commit. (commit message and commit squashing by anholt)	2019-01-28 16:40:46 -08:00
Jonathan Marek	f6292c32cc	kmsro: Add freedreno renderonly support Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-01-28 18:25:27 -05:00
Jonathan Marek	7d458c0c69	freedreno: a2xx: add perfcntrs Based on a5xx perfcntrs implementation. Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-01-28 18:21:16 -05:00
Jonathan Marek	cccec0b457	freedreno: a2xx: minor solid_vertexbuf fixups The big thing here is the 0x60 offset for the mem2gmem copy which I missed in my last patch. Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-01-28 18:21:16 -05:00
Jonathan Marek	912a9c8d8c	freedreno: a2xx: clear fixes and fast clear path This fixes the depth/stencil clear on a20x, and adds a fast clear path. The fast clear path is only used for a20x, needs performance tests on a22x. Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-01-28 18:21:16 -05:00
Jonathan Marek	cb2322c7c0	freedreno: a2xx: a20x hw binning Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-01-28 18:21:16 -05:00
Jonathan Marek	501c6e70d4	freedreno: update a2xx registers Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-01-28 18:21:16 -05:00
Timothy Arceri	fb78a6cb72	glsl: use remap location when serialising uniform program resource data This allows us to avoid expensive string compares since we already have a map to the pointers. These compares were taking ~30 seconds for a single shader compile in Godot due to it using 64,000+ uniforms. Fixes: `c4cff5f402` ("glsl: add basic support for resource list to shader cache") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109229	2019-01-29 09:39:54 +11:00
Vinson Lee	be5b271ea7	meson: Fix typo. meson.build:166:21: ERROR: Unknown method "verson_compare" for a string. Fixes: `c1efa240c9` ("meson: Add warnings and errors when using ICC") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Cc: 18.3 <mesa-stable@lists.freedesktop.org>	2019-01-28 10:47:32 -08:00
Jonathan Marek	7c930d99ad	freedreno: a2xx: enable early-Z testing Enable earlyZ when alpha test is disabled. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-01-28 13:04:41 -05:00
Jonathan Marek	32b1d2d716	freedreno: a2xx: ir2 cleanup Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-01-28 13:04:41 -05:00
Rob Herring	41a0acd6a1	Switch imx to kmsro and remove the imx winsys The kmsro winsys is equivalent to the imx winsys, so we can switch to it and remove the imx one. Signed-off-by: Rob Herring <robh@kernel.org>	2019-01-28 11:50:08 -06:00
Rob Herring	827e0d6654	kmsro: Add etnaviv renderonly support Enable using etnaviv for KMS renderonly. This still needs KMS driver name mapping to kmsro to be used automatically. Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Rob Herring <robh@kernel.org>	2019-01-28 11:45:43 -06:00
Eric Anholt	272b6cf58f	kmsro: Extend to include hx8357d. This allows vc4 to initialize on the Adafruit PiTFT 3.5" touchscreen with the hx8357d tinydrm driver v2: Whitespace fix noted by Eric Engestrom, update commit message for the driver being merged. v3: Rebase on Rob Herring's pipe-loader changes. Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1) Acked-by: Emil Velikov <emil.velikov@collabora.com> (v1)	2019-01-28 09:35:45 -08:00
Rob Herring	511e7b6f61	pipe-loader: Fallback to kmsro driver when no matching driver name found If we can't find a driver matching by name, then use the kmsro driver. This removes the need for needing a driver descriptor for every possible KMS driver. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-01-28 09:35:45 -08:00
Eric Anholt	ed65aeec78	pl111: Rename the pl111 driver to "kmsro". The vc4 driver can do prime sharing to many different KMS-only devices, such as the various tinydrm drivers for SPI-attached displays. Rename the driver away from "pl111" to represent what it will actually support: various sorts of KMS displays with the renderonly layer used to attach a GPU. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-28 09:35:45 -08:00
Samuel Pitoiset	afeef3cacf	radv: set noalias/dereferenceable LLVM attributes based on param types Instead of using this useless array_params_mask variable. This should set these two attributes to streamout buffers too. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-28 16:30:38 +01:00
Samuel Pitoiset	320b058d32	radv: simplify allocating user SGPRS for descriptor sets Unnecessary to check the current stages if desc_set_used_mask is used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-28 16:30:36 +01:00
Samuel Pitoiset	d1994ed229	radv: remove radv_userdata_info::indirect field Always false. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-28 16:30:33 +01:00
Gert Wollny	212c0c630a	mesa/main: Expose EXT_sRGB_write_control Use EXT_framebuffer_sRGB to expose EXT_sRGB_write_control on GLES. Remove the checks for desktop GL in the enable calls, since EXT_framebuffer_sRGB now also indicates support for switching the linear-sRGB color space conversion on GLES. Thanks to Ilia Mirkin for all the helpful discussions that helped to rework this series. v2: Fix alphabetical listing of extensions (Tapani Pälli) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2019-01-28 12:18:40 +01:00
Gert Wollny	1013dfece1	mesa/main/version: Lower the requirements for GLES 3.0 GLES 3.0 does not actually require support for EXT_framebuffer_sRGB, it only needs support for sRGB attachments to framebuffers and framebuffer objects as defined in ARB_framebuffer_objects. v2: Clarify that ARB_framebuffer_objects is needed. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-28 12:18:40 +01:00
Gert Wollny	76c3f6fb3f	mesa/main: Use flag for EXT_sRGB instead of EXT_framebuffer_sRGB where possible All drivers that support EXT_framebuffer_sRGB also support EXT_sRGB, but in order to keep this commit minimal and not break any drivers, both flags are checked. v2: - Use only EXT_sRGB (Ilia Mirkin) - Move adding the flag EXT_sRGB to gl_extensions to a separate patch v3: use _mesa_has_EXT_framebuffer_sRGB instead of extension flag The _mesa_has function also checks for the correct versions and should be preferred over using the flags directly (Erik) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-28 12:18:40 +01:00
Gert Wollny	8f9dfb7d88	mesa/st: rework support for sRGB framebuffer attachements For GLES, sRGB framebuffer attachment support is provided in two steps: sRGB attachments as described in EXT_sRGB (and GLES 3.0) that enable linear to sRGB color space transformation automatically, and the ability to switch formats of the render target surface between sRGB and linear, which introduces full support for EXT_framebuffer_sRGB. Set the corresponding flags to reflect these two levels of sRGB support. As a difference between desktop GL and GLES, on desktop GL the linear-sRGB conversion for an sRGB framebuffer attachment is turned off by default, and on GLES it is turned on. This needs to be taken into account when initially creating a surface, i.e. on desktop GL creation of an sRGB surface is preferred, but on GLES sRGB surfaces are only created when explicitly requested. v2: - Use the new CAPS name Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-28 12:18:40 +01:00
Gert Wollny	385081cd17	i965: Set flag for EXT_sRGB Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-01-28 12:18:40 +01:00
Gert Wollny	7577c82fed	mesa:main: Add flag for EXT_sRGB to gl_extensions EXT_sRGB is an (incomplete) GLES extension that provides support for sRGB framebuffer attachments, hence it can be used to check for this support as an alternative to EXT_framebuffer_sRGB that provides the same functionality but also sRGB write control support. However, since EXT_sRGB is incomplete and superseded by GLES 3.0 it will not be exposed as an extension. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-28 12:18:40 +01:00
Gert Wollny	2845939d6a	virgl: Set sRGB write control CAP based on host capabilities v2: - Use the renamed CAPS - add assertions to make sure that mesa doesn't try to switch destination surface formats when it is not supported. (Ilia Mirkin) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2019-01-28 12:18:40 +01:00
Gert Wollny	8021f1875e	Gallium: Add new CAPS to indicate whether a driver can switch SRGB write Add a new cap that indicates whether the driver supports enabling/disabling the conversion from linear space to sRGB for a framebuffer attachment. In driver terms, this CAP indicates whether the driver can switch between a linear and an sRGB surface format for draw destinations without changing the surface itself. v2: rename CAP to DEST_SURFACE_SRGB_CONTROL to reflect its purpose better (pointed out by Ilia Mirkin) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-28 12:18:40 +01:00
Neil Roberts	75b3719c4f	spirv: Don't use special semantics when counting vertex attribute size Under Vulkan, the double vertex attributes take up the same size regardless of whether they are vertex inputs or any other stage interface. Under OpenGL (ARB_gl_spirv), from GLSL 4.60 spec, section 4.3.9 Interface Blocks: "It is a compile-time error to have an input block in a vertex shader or an output block in a fragment shader. These uses are reserved for future use." So we also don't need to check if it is a vertex input or not, and use false in any case. v2: (changes made by Alejandro Piñeiro) * Update required after "spirv: Handle location decorations on block interface members" own updates (original patch was sent several months ago) * After Neil suggesting it, confirm that this change can be also done for OpenGL (ARB_gl_spirv). Expand commit message. v3: update after changing name of main method on a previous patch Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-01-28 11:42:46 +01:00
Neil Roberts	5c797f7354	glsl_types: Rename parameter of glsl_count_attribute_slots glsl_count_attribute_slots takes a parameter to specify whether the type is being used as a vertex input because on GL double attributes only take up one slot. Vulkan doesn’t make this distinction so this patch renames the argument to is_gl_vertex_input in order to make it more clear that it should always be false on Vulkan. v2: minor variable renaming (s/member/member_type) (Tapani) Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-01-28 11:42:46 +01:00
Neil Roberts	dfc3a7cb3c	spirv/nir: handle location decorations on block interface members Previously the code was taking any location decoration on the block and using that to calculate the member locations for all of the members. I think this was assuming that there would only be one location decoration for the entire block. According to the Vulkan spec it is possible to add location decorations to individual members: “If the structure type is a Block but without a Location, then each of its members must have a Location decoration. If it is a Block with a Location decoration, then its members are assigned consecutive locations in declaration order, starting from the first member which is initially the Block. Any member with its own Location decoration is assigned that location. Each remaining member is assigned the location after the immediately preceding member in declaration order.” This patch makes it instead keep track of which members have been assigned an explicit location. It also has a space to store the location for the struct as a whole. Once all the decorations have been processed it iterates over each member to fill in the missing locations using the rules described above. So, this commit is needed to get working a case like this, on both Vulkan and OpenGL using SPIR-V (ARB_gl_spirv): out block { layout(location = 2) vec4 c; layout(location = 3) vec4 d; layout(location = 0) vec4 a; layout(location = 1) vec4 b; } name; v2: (changes made by Alejandro Piñeiro) * Update after introducing struct member splitting (See commit `b0c643d`) * Update after only exposing interface_type for blocks, not to any struct * Update after last changes done for xfb support v3: use "assign" instead of "add" on the new method added (Tapani) Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-01-28 11:42:46 +01:00
Christian Gmeiner	34458c1cf6	etnaviv: add linear sampling support Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-01-28 07:36:12 +01:00
Christian Gmeiner	42ca4dda2d	etnaviv: update headers from rnndb Update to etna_viv commit 4d2f857. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-01-28 07:36:09 +01:00
Christian Gmeiner	5b4a155d2b	etnaviv: extend etna_resource with an addressing mode Defines how sampler (and pixel pipes) needs to access the data represented with a resource. The default mode is ETNA_ADDRESSING_MODE_TILED. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2019-01-28 07:36:05 +01:00
Ilia Mirkin	d1d2bb8c07	nvc0: don't put text segment into bufctx The text segment is shared among multiple contexts, while each one has its own bufctx. So when reallocating the text segment, some contexts may end up with stale values in their bufctx's. Instead limit the exposure to the bufctx to within a single draw. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-01-27 21:47:09 -05:00
Timothy Arceri	0907ae35ad	radv/ac: fix some fp16 handling Fixes: `b722b29f10` ("radv: add support for 16bit input/output") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-28 10:41:48 +11:00
Eric Anholt	c496b60ed8	v3d: Create separate sampler states for the various blend formats. The sampler border color is encoded in the TMU's blending format (half floats, 32-bit floats, or integers) and must be clamped to the format's unorm/snorm/int range by the driver. Additionally, the TMU doesn't know about how we're abusing the swizzle to support BGRA, A, and LA, so we have to pre-swizzle the border color for those. We don't really want to spend half a kb on sampler states in most cases, so skip generating the variants when the border color is unused or is 0,0,0,0.	2019-01-27 08:30:03 -08:00
Eric Anholt	5fe4250a2c	v3d: Move the sampler state to the long-lived state uploader. Samplers are small (8-24 bytes), so allocating 4k for them is a huge waste.	2019-01-27 08:30:03 -08:00
Eric Anholt	09472006ff	v3d: Use the symbolic names for wrap modes from the XML.	2019-01-27 08:30:03 -08:00
Eric Anholt	c51d125d18	v3d: Fix stencil sampling from a separate-stencil buffer. When the sampler view is in sample-stencil mode, we need to return uint stencil values. To do that, fill in the format table to return R8I, and have the sampler view point at the separate stencil buffer. Fixes dEQP-GLES31.functional.stencil_texturing.format.depth32f_stencil8_2d	2019-01-27 08:30:03 -08:00
Eric Anholt	8a0b0a8f37	v3d: Fix stencil sampling from packed depth/stencil. We need to pick the 8-bit unorm value out, not the depth component.	2019-01-27 08:30:03 -08:00
Eric Anholt	fcdbd441a2	v3d: Fix release-build warning about utile_h.	2019-01-27 08:30:03 -08:00
Eric Anholt	edb1fcd963	v3d: Flush blit jobs immediately after generating them. Fixes OOMs in the CTS's packed_pixels.varied_rectangle.* tests -- the series of texture uploads at the start before texturing occurred would end up all sitting around as cached jobs for reuse. By flushing immediately, peak active BO usage goes from 150M to 40M. We could maybe put some limits on how many jobs we keep around, but blits seem particularly unlikely to get reused for other drawing.	2019-01-27 08:30:03 -08:00
Eric Anholt	ac333ffa59	v3d: Fix BO stats accounting for imported buffers.	2019-01-27 08:30:03 -08:00
Eric Anholt	060575bea8	v3d: Drop maximum number of texture units down to 16. This is the GLES 3.2 minmax, and also what the closed source driver does. Avoids hitting OOMs in the CTS's dEQP-GLES3.functional.texture.units.all_units.only_cube.1.	2019-01-27 08:30:03 -08:00
Eric Anholt	3e743d8cd8	v3d: Avoid duplicating limits defines between gallium and v3d core. We don't want to pull the compiler into every include in the gallium driver, so just make a new little header to store the limits.	2019-01-27 08:30:03 -08:00
Eric Anholt	fe6a21c867	v3d: Fix overly-large vattr_sizes structs. We want one vector size per vector, not per component.	2019-01-27 08:30:03 -08:00
Eric Anholt	533b3f0541	v3d: Rename gallium-local limits defines from VC5 to V3D. The compiler has its limits under V3D_* (like most V3D stuff), so sync up with that.	2019-01-27 08:30:03 -08:00
Bas Nieuwenhuizen	b4870a15ae	radv: Remove unused variable. Trivial.	2019-01-27 13:51:35 +01:00
Niklas Haas	804cc44d09	radv: add device->instance extension dependencies From the vulkan spec 33.3 "Extension Dependencies": "Any device extension that has an instance extension dependency that is not enabled by vkCreateInstance is considered to be unsupported, hence it must not be returned by vkEnumerateDeviceExtensionProperties for any VkPhysicalDevice child of the instance." Therefore we need to check whether the instance-level extensions are actually enabled when deciding to support a device-level extension or not. Furthermore, we need to do this for all instance-level extensions of any (transitive) device-level extension dependency, due to the following paragraph: "If an extension is supported (as queried by vkEnumerateInstanceExtensionProperties or vkEnumerateDeviceExtensionProperties), then required extensions of that extension must also be supported for the same instance or physical device." Finally, because some of these vulkan extensions may be implicitly promoted to future vulkan core API versions, we can also satisfy the dependency if the vulkan API version is high enough. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-27 13:50:35 +01:00
Niklas Haas	d12dc39396	radv: correctly use vulkan 1.0 by default From the vulkan spec 3.2 "Instances": "Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing an apiVersion of 0 is equivalent to providing an apiVersion of VK_MAKE_VERSION(1,0,0)." Fixes: `ffa15861ef` "radv: UseEnumerateInstanceVersion for the default version." Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-27 12:49:28 +01:00
Niklas Haas	d9bd3b1cb8	glsl: fix block member alignment validation for vec3 Section 7.6.2.2 (Standard Uniform Block Layout) of the GL spec says: The base offset of the first member of a structure is taken from the aligned offset of the structure itself. The base offset of all other structure members is derived by taking the offset of the last basic machine unit consumed by the previous member and adding one. The current code does not reflect this last sentence - it effectively instead aligns up the next offset up to the alignment of the previous member. This causes an issue in exactly one case: layout(std140) uniform block { layout(offset=0) vec3 var1; layout(offset=12) float var2; }; As per section 7.6.2.1 (Uniform Buffer Object Storage) and elsewhere, a vec3 consumes 3 floats, i.e. 12 basic machine units. Therefore, `var1` in the example above consumes units 0-11, with 12 being the first available offset afterwards. However, before this commit, mesa incorrectly assumes `var2` must start at offset=16 when using explicit offsets, which results in a compile-time error. Without explicit offsets, the shaders actually work fine, indicating that mesa is already correctly aligning these fields internally. (Just not in the code that handles explicit buffer offset parsing) This patch should fix piglit tests: ssbo-explicit-offset-vec3.vert ubo-explicit-offset-vec3.vert Signed-off-by: Niklas Haas <git@haasn.xyz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-01-27 03:00:03 -05:00
Jason Ekstrand	86e5f76d3d	spirv: Add support for SPV_EXT_physical_storage_buffer Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-26 13:41:50 -06:00
Jason Ekstrand	fb282a68bc	spirv: Implement OpConvertPtrToU and OpConvertUToPtr This only implements the actual opcodes and does not implement support for using them with specialization constants. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-26 13:41:50 -06:00
Jason Ekstrand	837ed2ba51	spirv: Handle OpTypeForwardPointer We handle forward declarations by creating the pointer type with its storage type based on storage class and just waiting to fill out the actual deref type until we get the OpTypePointer. Because any composites using the forward declared type only care about the storage type (i.e. uint64_t, uvec2, etc.) when creating their glsl_type, this works fine and we can defer the actual deref_type as far as we need. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-01-26 13:41:50 -06:00
Jason Ekstrand	4602e705e4	spirv: Drop a bogus assert This was valid back when the only valid types of pointers were uint32 and uvec2. Now that we're allowing more variety, it could be just about anything so we'll just drop the assert. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-01-26 13:41:50 -06:00
Jason Ekstrand	9e34781aef	nir: Allow SSBOs and global to alias Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-26 13:41:50 -06:00
Jason Ekstrand	9839ce8bf9	nir/validate: Allow array derefs of vectors for nir_var_mem_global Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-01-26 13:39:18 -06:00
Jason Ekstrand	5f5503d498	nir/lower_io: Add support for nir_var_mem_global Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-01-26 13:39:18 -06:00
Jason Ekstrand	314d2c90c3	nir/lower_io: Add a 32 and 64-bit global address formats These are simple scalar addresses. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-26 13:39:18 -06:00
Jason Ekstrand	e461926ef2	nir: Add load/store/atomic global intrinsics These correspond roughly to reading/writing OpenCL global pointers. The idea is that they just take a bare address and load/store from it. Of course, exactly what this address means is driver-dependent. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-01-26 13:39:18 -06:00
Axel Davy	6380fedb60	st/nine: Enable debug info if NDEBUG is not set We want to have debug info as well if using meson's debugoptimized when ndebug is off. v2: use u_debug functions that do something even if DEBUG is not set. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2019-01-26 19:53:19 +01:00
Axel Davy	d7433c22e6	st/nine: Immediately upload user provided textures Fixes regression caused by `42d672fa6a` st/nine: Bind src not dst in nine_context_box_upload Before that patch, for user provided textures, when the texture was destroyed, the safety check for pending uploads, which according to the code "Following condition cannot happen currently", was flushing the queue and thus triggering the upload. After the patch, the texture destruction was delayed after the upload. However the user frees the texture buffer, as it thinks the texture is released. Instead of reverting the faulty patch, this patch instead flushes the csmt queue right away after queuing the upload for this type of textures. This is more future-proof, as we may want to bind the surface for other reasons in the future. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Cc: 18.3 <mesa-stable@lists.freedesktop.org>	2019-01-26 19:53:00 +01:00
Matt Turner	a7d629a590	i965: Always compile fp64 funcs when needed Compilation of user-specified shaders with software fp64 works by compiling on demand an "fp64-funcs" shader implementing various fp64 operations and then linking it into the "user shader". In commit `64b8c86d37` Author: Timothy Arceri <tarceri@itsqueeze.com> Date: Thu Jan 17 17:16:29 2019 +1100 glsl: be much more aggressive when skipping shader compilation we changed the behavior of the shader cache to skip compilation earlier when we get a cache hit. After the aforementioned commit, compiling a user program using fp64 would store into the cache an entry for the fp64-funcs shader. Subsequent compilations of uncached user shaders using fp64 would fail in compile_fp64_funcs() after finding a cache entry for the fp64-funcs, but being unprepared to read from the cache. It's unclear to me how to retrieve the cached NIR of the fp64-funcs (if it even is cached), so just call _mesa_glsl_compile_shader() with force_recompile=true in order to ensure we generate the fp64-funcs successfully. Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-26 10:33:22 -08:00
Matt Turner	18b467c066	intel/compiler: Add a file-level description of brw_eu_validate.c Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-01-26 10:33:22 -08:00
Jonathan Marek	41ddf1d150	freedreno: add renderonly scanout This allows creating a fd_screen with a renderonly object which will be used to allocate scanout resources. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Eric Anholt <eric@anholt.net> [slight tweak to fix uninitialized 'prsc' in debug print] Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-26 10:47:21 -05:00
Rob Clark	cd79b5e0c2	freedreno/a2xx: fix unused variable warning Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-26 10:44:31 -05:00
Timothy Arceri	8e9ad592c3	tgsi: remove culldist semantic from docs The semantic was removed in `e6d9389366`. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-26 12:04:53 +11:00
Timothy Arceri	5d66f7103f	ac/nir_to_llvm: fix clamp shadow reference for more hardware Fixes the following piglit test on my VEGA and matches the behaviour in the tgsi backend. tests/spec/glsl-1.10/execution/samplers/glsl-fs-shadow2D-clamp-z.shader_test Fixes: `625dcbbc45` ("amd/common: pass address components individually to ac_build_image_intrinsic") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-26 12:03:24 +11:00
Eric Anholt	08f4a904b3	gallium: Make sure we return is_unorm/is_snorm for compressed formats. The util helpers were looking for a non-void channel in a non-mixed format and returning its snorm/unorm state. However, compressed formats don't have non-void channels, so they always returned false. V3D wants to use util_format_is_[su]norm for its border color clamping workarounds, so fix the functions to return the right answer for these. This now means that we ignore .is_mixed. I could retain the is_mixed check, but it doesn't seem like a useful feature -- the only code I could find that might care is freedreno's blit, which has some notes about how things are wonky in this area anyway. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-01-25 13:06:50 -08:00
Eric Anholt	104c7883e7	gallium: Fix comment about possible colorspaces. Two typos, and missing one of the colorspaces. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-01-25 13:06:47 -08:00
Eric Anholt	54abd2e084	gallium: Enable unit tests as actual meson unit tests. These tests don't need swrast, so we can always enable them when build_tests is set. Most of them run to successful completion quickly (.9s on my SKL). Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-01-25 13:06:45 -08:00
Emil Velikov	3b6aaab7e9	mapi: print function declarations for shared glapi Earlier commit aimed to remove unneeded function declarations. Namely OpenGL entrypoints which are not applicable for OpenGLES* Although it did not consider the shared glapi which needs all, including hidden ones. Resulting in warning/errors like the following ../build/src/mapi/shared-glapi/glapi_mapi_tmp.h:26014:15: error: no previous prototype for ‘shared_dispatch_stub_1414’ [-Werror=missing-prototypes] This patch addressed that. Cc: Erik Faye-Lund <erik.faye-lund@collabora.com> Reported-by: Eric Anholt <eric@anholt.net> Fixes: `6148cce388` ("mapi: drop unneeded gl_dispatch_stub declarations") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Eric Anholt <eric@anholt.net>	2019-01-25 13:04:04 -08:00
Rob Clark	4aa64940c6	freedreno: limit tiling to PIPE_BIND_SAMPLER_VIEW `1ce5d757d0` dropped this limit.. which is probably the right thing to do. But it results in an extra tiled->linear blit for glReadPixels() (ie. dEQP/piglit) which is hitting some intermittent corruption (looks like cache) on a6xx, causing a lot of spurious fails. Since we are getting close to 19.0 branchpoint, re-instate this limit for now, until the blitter problems are resolved. Fixes: `1ce5d757d0` freedreno: core buffer modifier support Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-25 10:20:05 -05:00
Samuel Pitoiset	378e2d2414	radv: fix computing number of user SGPRs for streamout buffers Streamout buffers are emitted like push constants. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-25 15:36:16 +01:00
Jose Fonseca	65b8d723fd	appveyor: Revert commits adding Cygwin support. This reverts commits `00ad77b9f6` and `5334dafee2`. This avoids Appveyor build breakage due to Cygwin, but more importantly, there are several problems with these patches, as highlighted in my recent mesa-dev mail. So better to revert for now, and pursue Cygwin support after these have been addressed.	2019-01-25 14:13:26 +00:00
Tapani Pälli	540939ecee	android: fix build issues with libmesa_anv_gen* libraries We need this include path to find nir/nir_xfb_info.h. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-01-25 15:21:06 +02:00
Andrii Simiklit	4759bb2fcf	intel/batch-decoder: fix a vb end address calculation According to the loop implementation (in 'ctx_print_buffer' function), which advances dword by dword over vertex buffer(vb), the vb size should be aligned by 4 bytes too. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109449 Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-25 15:12:30 +02:00
Andrii Simiklit	db39a44f10	intel/batch-decoder: fix vertex buffer size calculation for gen<8 It should be incremented by one according to how it is calculated by 'emit_vertex_buffer_state': "#if GEN_GEN < 8 .BufferAccessType = step_rate ? INSTANCEDATA : VERTEXDATA, .InstanceDataStepRate = step_rate, #if GEN_GEN >= 5 .EndAddress = ro_bo(bo, end_offset - 1), #endif #endif" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109449 Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-25 15:12:07 +02:00
Eric Engestrom	69e9440367	meson/vdpau: add missing soversion This mirrors what autotools does in src/gallium/state_trackers/vdpau/Makefile.am and src/gallium/targets/vdpau/Makefile.am: VDPAU_MAJOR = 1 VDPAU_MINOR = 0 libvdpau_gallium_la_LDFLAGS = -version-number $(VDPAU_MAJOR):$(VDPAU_MINOR) Reported-by: Igor Gnatenko <i.gnatenko.brain@gmail.com> Fixes: `68076b8747` "meson: build gallium vdpau state tracker" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-01-25 12:10:00 +00:00
Eric Engestrom	9af77fcf98	anv: drop always-successful VkResult Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-25 09:45:27 +00:00
Rafael Antognolli	f2ece26601	anv/allocator: Avoid race condition in anv_block_pool_map. Accessing bo->map and then pool->center_bo_offset without a lock is racy. One way of avoiding such race condition is to store the bo->map + center_bo_offset into pool->map at the time the block pool is growing, which happens within a lock. v2: Only set pool->map if not using softpin (Jason). v3: Move things around and only update center_bo_offset if not using softpin too (Jason). Cc: Jason Ekstrand <jason@jlekstrand.net> Reported-by: Ian Romanick <idr@freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109442 Fixes: `fc3f588320` Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-24 17:39:40 -08:00
Dylan Baker	c1efa240c9	meson: Add warnings and errors when using ICC ICC tries to be helpful by not erroring when it sees something that it doesn't understand, which is completely the opposite of helpful. Meson 0.49.0 does much better at handling this by really trying to make ICC error, but there are some things in mesa that still get ignored until 0.49.1 v2: - Fix id check, which is 'intel' not 'icc' Cc: 18.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)	2019-01-24 19:14:50 +00:00
Dylan Baker	7cb7f35bc7	meson: Fix compiler checks for SWR with ICC This is a bit fragile, as the way this "fixes" the check is to move the one that we know is correct before the one that is incorrectly reported as working. In meson 0.49.1 (which isn't out yet) this is fixed so that the incorrect check is reported as a failure. Fixes: `e0b037d697` ("meson: Build SWR driver") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109129 Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-01-24 19:14:50 +00:00
Dylan Baker	3ba7ab8d2c	meson: fix swr KNL build There's a typo in one of the #defines that breaks compilation. Fixes: `e0b037d697` ("meson: Build SWR driver") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109023 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-01-24 19:14:50 +00:00
Matt Turner	70a7ece035	gallivm: Return true from arch_rounding_available() if NEON is available LLVM uses the single instruction "FRINTI" to implement llvm.nearbyint. Fixes the rounding tests of lp_test_arit. Bug: https://bugs.gentoo.org/665570 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2019-01-24 11:07:24 -08:00
Matt Turner	385ee7c3d0	gallium: Enable ASIMD/NEON on aarch64. NEON (now called ASIMD) is available on all aarch64 CPUs. Our code was missing an aarch64 path, leading to util_cpu_caps.has_neon always being false on aarch64. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-01-24 11:07:24 -08:00
Dave Airlie	1f6b92b476	gallium: use put image shm2 path (v2) This fixes the drisw paths to use the new shm2 interface, so that we don't trigger the X server overflow checks when the x offset is non-zero. This just hides the versioning in drisw, and either passes the src_x or adds the offset fixup for the fallback path. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-01-25 04:27:45 +10:00
Dave Airlie	00af91ca46	glx: add support for putimageshm2 path (v2) v2: pass x,0 in as the offset coords at glx level not earlier Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-01-25 04:27:45 +10:00
Dave Airlie	db83a2b40f	dri_interface: add put shm image2 (v2) This adds a new interface to the swrast interface to fix an shm put image bug. The current code adds the x,y src offsets into the offset parameters, however if the x offset is > 0, and the put image copies up to the height of the image, this can trigger an X server validation check to fail and the rendering to get BadMatch. This patch fixes it to pass the x offset coord in as a src x. We cannot pass the Y coordinate due to the horrible code mangling the image w/h vs stride in swrastXPutImage. v2: drop srcx,y from api Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Adam Jackson <ajax@redhat.com>	2019-01-25 04:27:45 +10:00
Emil Velikov	281421e1bc	mapi: remove machinery handling CSV files We haven't had one in years, so just drop the code. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	8a0012692a	mapi: remove old, unused ES* generator code As of earlier commit, everyone has switched to the new script for the ES dispatch. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	a41214ca3e	mapi/es2api: remove no longer present entrypoints With the previous scripts, API from the following extensions was incorrectly exported. Drop them from the list, since they're no longer around. GL_EXT_blend_func_extended GL_EXT_texture_integer Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	05f8558b27	mapi/es*api: remove GL_EXT_multi_draw_arrays entrypoints Now we use the upstream XML file and a cleaner generator. Thus the symbols are no longer exported and we can drop them from this list. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	5661ce6c64	mapi/es*api: remove GL_OES_EGL_image entrypoints At some point in the past we fixed the scripts, so these are no longer exported. Drop them from the list. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	9f86f1da7c	Revert "mapi/new: sort by slot number" This reverts commit a1f5d9412cf7cacb3534635f6c2409fafbe6574e. We no longer need to sort - it was meant only to ease comparison against the old generated files.	2019-01-24 18:13:25 +00:00
Emil Velikov	3bf08292d2	scons: wire the new generator for es1 and es2 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	0842bc879b	meson: wire the new generator for es1 and es2 v2: use ${foo})_py naming (Dylan) v3: use symbolic name for genCommon.py Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (v2)	2019-01-24 18:13:25 +00:00
Emil Velikov	656845301d	autotools: wire the new generator for es1 and es2 The output produced functionally identical, with the following changes: - A cosmetic: swapped ABI compatible types [ GLclampf -> GLfloat, etc ] - B cosmetic: renamed parameters [ zNear -> n, etc ] - C dropped extension entrypoints - invalid/incorrect To make things easier to validate, normalise both old/new headers run the sed patterns A, B and C to both sets. A s/\<GLclampf\>/GLfloat/g; s/\<GLclampx\>/GLfixed/g; s/\<GLvoid\>/void/g; B s/\ \* / */g; s/\<texture\>/target/g; s/\<plane\>/p/g; s/\<depth\>/d/g; s/\<modeAlpha\>/modeA/g; s/\<shader\>/program/g; s/\<obj\>/shaders/g; s/\<equation\>/eqn/g; s/\<param\>/data/g; s/\<params\>/data/g; s/\<buffers\>/buffer/g; s/\<src\>/mode/g; s/\<count\>/n/g; s/\<zNear\>/n/g; s/\<zFar\>/f/g; s/\<zfail\>/dpfail/g; s/\<zpass\>/dppass/g; s/\<buf\>/index/g; s/\<value\>/target/g; s/\<cap\>/target/g; s/\<maskNumber\>/index/g; s/\<srcRGB\>/sfactorRGB/g; s/\<dstRGB\>/dfactorRGB/g; s/\<srcAlpha\>/sfactorAlpha/g; s/\<dstAlpha\>/dfactorAlpha/g; s/\<primitiveMode\>/mode/g; s/\<primcount\>/instancecount/g; s/\<top\>/t/g; s/\<bottom\>/b/g; s/\<left\>/l/g; s/\<right\>/r/g; s/\<x\>/v0/g; s/\<y\>/v1/g; s/\<z\>/v2/g; s/\<w\>/v3/g; s/\<sfactor\>/mode/g; s/\<dfactor\>/dst/g; s/\<attribindex\>/bindingindex/g; s/\<internalFormat\>/internalformat/g; s/\<bufSize\>/bufsize/g; C glMultiDrawArraysEXT glMultiDrawElementsEXT glBindFragDataLocationEXT glGetTexParameterIivEXT glGetTexParameterIuivEXT glTexParameterIivEXT glTexParameterIuivEXT v2: - gl_dispatch_stub declarations are addressed with previous patch - the public_entries table is no longer generated Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	389bc2bc6e	mapi/new: remove duplicate GLvoid/void substitution We already do it a few lines above - drop the duplicate. Note that for consistency's sake, we keep the substitution since the GL API is a mixed bag - some use GLvoid while others a normal void. We might want to merge this back in GLVND. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	5fa6c34949	mapi/new: fixup the GLDEBUGPROCKHR typedef to the non KHR one This way we can reuse the latter, which is already present in the headers that we use. Thus we can drop the manual typedef we generate. We might want to merge this back in GLVND. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	babec55f7e	mapi/new: don't print info we don't need for ES1/ES2 There is no need for the noop functions, the public_stubs and public_entries table or table size defines. Remove those. Pretty much all of this is applicable to GLVND, although it requires preparatory work. v2: - python style fixes (Dylan) - use "gldispatch" instead of not "glesv1" "glesv2" - remove the public_entries table/array (Erik) v3: - use if == "gldispatch", instead of "in" (Kyle) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (v2)	2019-01-24 18:13:25 +00:00
Emil Velikov	5b1bdce156	mapi/new: split out public_entries handling The only instance that requires the public_entries table is the dispatch library - split that into another function. We have to be careful with when undefining the guard, so split it out. We might want to merge this back in GLVND. Minor GLVND cleanup will be needed first. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	313f977224	mapi/new: reinstate _NO_HIDDEN suffixes in the new generator Strictly speaking we can rework the rest of the code so we do not need those. That said, this will require a series on its own so let's carry this local quirk for now. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	451805f810	mapi/new: use the static_data offsets in the new generator Otherwise the incorrect ones will be used, effectively breaking the ABI. Note: some entries in static_data.py list a suffixed API, while (for ES* at least) we expect the one w/o suffix. v2: - rework path handling (Dylan) - use else if chain (Erik) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	bba375c016	mapi/new: sort by slot number Makes it easier to compare the newly generated header against the old one. Will be reverted after the transition.	2019-01-24 18:13:25 +00:00
Emil Velikov	06eb3fe371	mapi/new: import mapi scripts from glvnd Currently we have over 20 scripts that generate the libGL* dispatch and various other functionality. More importantly we're using local XML files instead of the Khronos provided one(s), resulting in increasing complexity of writing, maintaining and bugfixing. One fairly annoying bug is handling of statically exported symbols. Today, if we enable a GL extension for GLES1/2, we add a special tag to the xml. Thus the ES dispatch gets generated, but also since we have no separate notion of GL/ES1/ES2 static functions it also gets exported statically. This commit adds step one towards clearing and simplifying our setup. It imports the mapi generator from GLVND. 012fe39 ("Remove a couple of duplicate typedefs.") v2: use local genCommon.py Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	cd0f11bac5	mapi: move genCommon.py to src/mapi/new The helper will also be used by the new Khronos gl.xml aware generator. v2: Move existing one, instead of duplicating it. v3: Correct genCommon.py references in meson [Erik] v4: Drop the file from the EGL EXTRA_DIST [Erik] Suggested-by: Kyle Brenneman <kbrenneman@nvidia.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	a08a793180	genCommon.py: Fix typo in _LIBRARY_FEATURE_NAMES. Port glvnd commit 37fc6caa4b8 ("Fix typo in _LIBRARY_FEATURE_NAMES.") from Michal Srb. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	cf317bf093	mapi: add all _glapi_table entrypoints to static_data.py Currently various parts of mesa use the glapi_table differently. Some use _glapi_get_proc_offset() to get the offset, while others directly reference the specific offset via _gloffset_Function. Add all static entries, to ensure things don't break as we flip to the upstream XML + new mapi generator. Note: the offsets are also used for the alias remap table, thus we need to ensure we honour the correct offsets range or it will break. Currently this is done via MAX_OFFSETS constant, although a better solution is in the works. v2: add FramebufferTexture2DMultisampleEXT v3: add MAX_OFFSETS guard Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (v1) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	fe9f5c0e21	mapi: sort static entrypoints numerically A few of the entrypoints were incorrectly placed. Sort those to align with the rest of the list. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:25 +00:00
Emil Velikov	5a81e8d40e	Revert "mesa/main: remove ARB suffix from glGetnTexImage" This reverts commit `f1998e15ff`. This changes the ABI, such that glGetnTexImageARB entry-point from the GLAPI gets removed. Thus accessing many functions by offset (as we do) will result in getting the wrong one. Follow-up work will swap the by-offset handling, but for now revert this patch. Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:25 +00:00
Erik Faye-Lund	6148cce388	mapi: drop unneeded gl_dispatch_stub declarations These declarations are not used anywhere - be that generated code or otherwise. [Emil: format the hunk from Erik into a patch] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-24 18:13:24 +00:00
Emil Velikov	ca152234e1	mesa: correctly use os.path.join in our python scripts With Windows in mind, using forward slash isn't the right thing to do. Even if it just works, we might want to fix it. As here, use __file__ instead of argv[0] and sys.path.insert over sys.path.append. With the path tweak being reportedly faster. Suggested-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-24 18:13:24 +00:00
Emil Velikov	9cc8e12505	freedreno: automake: ship ir3_nir_trig.py in the tarball Fixes: `aa0fed10d3` ("freedreno: move ir3 to common location") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-24 18:13:24 +00:00
Eric Engestrom	8ed966b506	egl/glvnd: sync egl.xml from Khronos Fixes: `98984b7cdd` "egl: add glvnd entrypoints for EGL_MESA_query_driver" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-24 16:55:21 +00:00
Eric Engestrom	d2ca270511	travis: bump libdrm to 2.4.97 Fixes: `c02f761bdf` "winsys/amdgpu: use the new BO list API" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2019-01-24 14:50:33 +00:00
Veluri Mithun	85edfc04b8	egl: Implementation of egl dri2 drivers for MESA_query_driver Signed-off-by: Veluri Mithun <velurimithun38@gmail.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-24 14:37:52 +00:00
Eric Engestrom	98984b7cdd	egl: add glvnd entrypoints for EGL_MESA_query_driver Fixes: fbdd7bde29863935106c "egl: Implement EGL API for MESA_query_driver" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-24 14:37:47 +00:00
Veluri Mithun	6afce78128	egl: Implement EGL API for MESA_query_driver Signed-off-by: Veluri Mithun <velurimithun38@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-24 14:37:47 +00:00
Eric Engestrom	7d9274388b	egl: update headers from Khronos Cheating a tiny bit as these headers aren't in the Khronos repo yet, but I expect them to be within a couple days. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-24 14:37:44 +00:00
Eric Engestrom	381d0e753a	egl: finalize EGL_MESA_query_driver Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-24 14:37:36 +00:00
Matt Turner	e166003cb7	intel/compiler: Reset default flag register in brw_find_live_channel() emit_uniformize() emits SHADER_OPCODE_FIND_LIVE_CHANNEL with its flag_subreg set, so that the IR knows which flag is accessed. However the flag is only used on Gen7 in Align1 mode. To avoid setting unnecessary bits in the instruction words, get the information we need and reset the default flag register. This allows round-tripping through the assembler/disassembler. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-01-23 22:48:29 -08:00
Kenneth Graunke	74c9c906f9	gallium: Add forgotten docs for PIPE_CAP_GLSL_TESS_LEVELS_AS_INPUTS. Thanks to Ilia for catching this.	2019-01-23 17:16:22 -08:00
Mark Janes	022800a058	Revert "Implement EGL API for MESA_query_driver" This reverts commit `ff621a5055`. with default warnings configuration, this commit generates: ../src/egl/main/eglapi.c:2654:1: error: no previous prototype for ‘eglGetDisplayDriverConfig’ [-Werror=missing-prototypes]	2019-01-23 16:29:13 -08:00
Mark Janes	9e9fa13c81	Revert "Implementation of egl dri2 drivers for MESA_query_driver" This reverts commit `2720f78ef2`.	2019-01-23 16:28:47 -08:00
Veluri Mithun	2720f78ef2	Implementation of egl dri2 drivers for MESA_query_driver Signed-off-by: Veluri Mithun <velurimithun38@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-23 22:29:14 +00:00
Veluri Mithun	ff621a5055	Implement EGL API for MESA_query_driver Signed-off-by: Veluri Mithun <velurimithun38@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-01-23 22:29:14 +00:00
Veluri Mithun	499869908b	Add extension doc for MESA_query_driver Signed-off-by: Veluri Mithun <velurimithun38@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2019-01-23 22:29:14 +00:00
Sergii Romantsov	cfca5cd958	nir: Length of boolean vtn_value now is 1 During conversion type-length was lost due to math. v2 (Jason Ekstrand): - Use a size/offset of 4 bytes Fixes: `44227453ec` (nir: Switch to using 1-bit Booleans for almost everything) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109353 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Tested-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-23 15:43:06 -06:00
Marek Olšák	42aea4f1a7	st/mesa: fix PRIMITIVES_GENERATED query after the "pipeline stat single" changes When this functionality was added, the PRIMITIVES_GENERATED query was accidentally omitted. This causes issues for drivers that support transform feedback. Fixes: `d644698b44` ("gallium: Add the ability to query a single pipeline statistics counter") Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-23 14:32:57 -05:00
Marek Olšák	c89e8470e5	st/mesa: purge framebuffers when unbinding a context This fixes pipe_surface "leaks". Cc: 18.3 <mesa-stable@lists.freedesktop.org> Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-01-23 14:32:57 -05:00
Erik Faye-Lund	5c17c01815	docs: add note about sending merge-requests from forks Sending MRs from the main Mesa repository increases clutter in the repository, and decreases visibility of project-wide branches. So it's better if MRs are sent from forks instead. Let's add a note about this, in case it's not obvious to everyone. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-23 18:14:06 +01:00
Rob Clark	5a4af871e3	freedreno: set modifier when exporting buffer Fixes an assert we start hitting with kms/gbm: #0 0x0000007fbf3d6e3c in raise () from /lib64/libc.so.6 #1 0x0000007fbf3c4a68 in abort () from /lib64/libc.so.6 #2 0x0000007fbf3d04e8 in __assert_fail_base () from /lib64/libc.so.6 #3 0x0000007fbf3d0550 in __assert_fail () from /lib64/libc.so.6 #4 0x0000007fbf5a73c4 in gbm_dri_bo_create (gbm=0x5820f0, width=2160, height=1440, format=875713112, usage=0, modifiers=0x695e00, count=1) at ../src/gbm/backends/dri/gbm_dri.c:1150 #5 0x0000007fbf5a49c4 in gbm_bo_create_with_modifiers (gbm=0x5820f0, width=2160, height=1440, format=875713112, modifiers=0x695e00, count=1) at ../src/gbm/main/gbm.c:491 #6 0x0000007fbbac3d64 in get_back_bo (dri2_surf=0x6f4cc0) at ../src/egl/drivers/dri2/platform_drm.c:258 #7 0x0000007fbbac4318 in dri2_drm_image_get_buffers (driDrawable=0x704490, format=4098, stamp=0x6fc730, loaderPrivate=0x6f4cc0, buffer_mask=1, buffers=0x7fffffe210) at ../src/egl/drivers/dri2/platform_drm.c:409 #8 0x0000007fbf5a5318 in image_get_buffers (driDrawable=0x704490, format=4098, stamp=0x6fc730, loaderPrivate=0x70e150, buffer_mask=1, buffers=0x7fffffe210) at ../src/gbm/backends/dri/gbm_dri.c:135 #9 0x0000007fbe4308c4 in dri_image_drawable_get_buffers (drawable=0x6fc730, images=0x7fffffe210, statts=0x6f2660, statts_count=1) at ../src/gallium/state_trackers/dri/dri2.c:339 #10 0x0000007fbe430c44 in dri2_allocate_textures (ctx=0x614b30, drawable=0x6fc730, statts=0x6f2660, statts_count=1) at ../src/gallium/state_trackers/dri/dri2.c:466 #11 0x0000007fbe435580 in dri_st_framebuffer_validate (stctx=0x714160, stfbi=0x6fc730, statts=0x6f2660, count=1, out=0x7fffffe3b8) at ../src/gallium/state_trackers/dri/dri_drawable.c:85 #12 0x0000007fbe7b2c84 in st_framebuffer_validate (stfb=0x6f2190, st=0x714160) at ../src/mesa/state_tracker/st_manager.c:222 #13 0x0000007fbe7b4884 in st_api_make_current (stapi=0x7fbf0430d8 <st_gl_api>, stctxi=0x714160, stdrawi=0x6fc730, streadi=0x6fc730) at ../src/mesa/state_tracker/st_manager.c:1074 #14 0x0000007fbe434f44 in dri_make_current (cPriv=0x703c20, driDrawPriv=0x704490, driReadPriv=0x704490) at ../src/gallium/state_trackers/dri/dri_context.c:301 #15 0x0000007fbe42c910 in driBindContext (pcp=0x703c20, pdp=0x704490, prp=0x704490) at ../src/mesa/drivers/dri/common/dri_util.c:579 #16 0x0000007fbbabab40 in dri2_make_current (drv=0x69d170, disp=0x69c6e0, dsurf=0x6f4cc0, rsurf=0x6f4cc0, ctx=0x70cb40) at ../src/egl/drivers/dri2/egl_dri2.c:1456 #17 0x0000007fbbaa8ef4 in eglMakeCurrent (dpy=0x69c6e0, draw=0x6f4cc0, read=0x6f4cc0, ctx=0x70cb40) at ../src/egl/main/eglapi.c:862 #18 0x0000007fbf5736ac in InternalMakeCurrentVendor (dpy=dpy@entry=0x614fb0, draw=draw@entry=0x6f4cc0, read=read@entry=0x6f4cc0, context=context@entry=0x70cb40, apiState=apiState@entry=0x6fc940, vendor=0x6975f0) at libegl.c:861 #19 0x0000007fbf573764 in InternalMakeCurrentDispatch (dpy=0x614fb0, draw=0x6f4cc0, read=0x6f4cc0, context=0x70cb40, vendor=0x6975f0) at libegl.c:630 #20 0x0000000000403640 in init_egl (egl=0x5805a8 <gl>, gbm=0x580528 <gbm>, samples=0) at ../common.c:263 #21 0x0000000000403c1c in init_cube_smooth (gbm=0x580528 <gbm>, samples=0) at ../cube-smooth.c:225 #22 0x0000000000408618 in main (argc=1, argv=0x7fffffe8d8) at ../kmscube.c:145 Fixes: `1ce5d757d0` freedreno: core buffer modifier support Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-23 10:21:00 -05:00
Samuel Pitoiset	963c044c55	radv: always pass the GFX9 fence data to si_cs_emit_cache_flush() Remove two useless checks. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-23 11:31:14 +01:00
Samuel Pitoiset	5f0b17d581	radv: compute the GFX9 fence VA at allocation time Instead of doing it every time we emit cache flushes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-23 11:31:12 +01:00
Samuel Pitoiset	e7ac792400	radv: only allocate the GFX9 fence and EOP BOs for the gfx queue It's invalid to emit a ZPASS_DONE event on the compute queue, and the fence BO is unused on the compute queue (ie. we don't flush CB or DB caches). This saves some space in the upload BO. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-23 11:31:09 +01:00
Samuel Pitoiset	bd098884f1	radv: remove old_fence parameter from si_cs_emit_write_event_eop() This parameter is actually useless as the immediate value can always be zero. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-23 11:31:07 +01:00
Samuel Pitoiset	698afa177e	radv: improve gathering of load_push_constants with dynamic bindings For example, if a pipeline has two stages, VS and FS, and only the fragment stage needs dynamic bindings, we shouldn't allocate an extra user SGPR for the vertex stage. Of course, if the vertex stage loads constants, it needs a user SGPR. This should reduce the number of SET_SH_REG packets that are emitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-23 09:43:53 +01:00
Caio Marcelo de Oliveira Filho	e0485a1dd7	gallium: Add PIPE_CAP_GLSL_TESS_LEVELS_AS_INPUTS In the Intel backend, it makes the most sense to treat gl_TessLevelInner and gl_TessLevelOuter as ordinary shader inputs. For Radeon, it makes more sense to treat them as system values which get special handling. We already have a compiler option for this, but the Iris driver will need a capability bit so we can set it appropriately. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-01-23 00:35:56 -08:00
Ilia Mirkin	8e26d534be	nv50,nvc0: mark textures dirty on fb update We may have to flush the cache if there are any textures presently bound that refer to the outgoing framebuffer. This is only checked at validation time. Fixes a number of dEQP-GLES3.functional.fbo.color.repeated_clear.sample.* tests, which would bind a texture, then clear it while the binding was in effect, and then render to a different texture. This seems legal under the "no feedback loops" rule. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-01-22 23:16:01 -05:00
Timothy Arceri	678ef2a4a5	ac/nir_to_llvm: fix interpolateAt* for structs This fixes the arb_gpu_shader5 interpolateAt* tests that contain structs. Acked-by: Marek Olšák <marek.olsak@amd.com>	2019-01-23 10:41:37 +11:00
Timothy Arceri	559e5b0408	ac/nir_to_llvm: add bindless support for uniform handles Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-23 10:41:37 +11:00
Timothy Arceri	f0ed59076f	radeonsi/nir: add missing piece for bindless image support This fixes some piglit tests and is what TGSI does. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-23 10:41:37 +11:00
Rob Clark	1ce5d757d0	freedreno: core buffer modifier support Split out of a patch from Fritz Koenig to decouple from a6xx UBWC enablement, and added fd_resource_create_with_modifiers().	2019-01-22 16:33:27 -05:00
Rob Clark	c56fe4118a	loader: fix the no-modifiers case Normally modifiers take precedence over use flags, as they are more explicit. But if the driver supports modifiers, but the xserver does not, then we should fall back to the old mechanism of allocating a buffer using 'use' flags. Fixes: `069fdd5f9f` Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-01-22 16:33:27 -05:00
Fritz Koenig	7c4b9510d1	freedreno: add query for dmabuf modifiers	2019-01-22 16:33:27 -05:00
Fritz Koenig	ddbe6171e6	freedreno: drm_fourcc.h header include Add Qualcomm modifier for UBWC	2019-01-22 16:33:27 -05:00
Brian Paul	956c219c8f	svga: add new gallium formats to the format conversion table Fixes a static assertion which broke the build. Fixes: `3ee240890` "gallium: add SINT formats to have exact counterparts to SNORM formats" Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-01-22 12:58:04 -07:00
Marek Olšák	d85917deaf	radeonsi: rename rfence -> sfence Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 13:34:03 -05:00
Marek Olšák	260ff57647	radeonsi: rename rbo, rbuffer to buf or buffer Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 13:34:01 -05:00
Marek Olšák	63b91f25bc	radeonsi: rename rsrc -> ssrc, rdst -> sdst Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 13:33:04 -05:00
Marek Olšák	4666f36c04	radeonsi: rename rquery -> squery Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 13:32:59 -05:00
Marek Olšák	501ff90a95	radeonsi: rename r600_resource -> si_resource Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 13:32:18 -05:00
Lionel Landwerlin	a75b12ce66	vulkan: make generated enum to strings helpers available from c++ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-22 18:20:53 +00:00
Marek Olšák	1cfbed7587	radeonsi: remove r600 from comments Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 12:26:45 -05:00
Marek Olšák	e0a6399eb4	winsys/amdgpu: rename rfence, rsrc, rdst -> afence, asrc, adst Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 12:26:45 -05:00
Marek Olšák	2792ec2cdd	radeonsi: rename rview -> sview Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 12:26:45 -05:00
Marek Olšák	96610f625d	radeonsi: rename rscreen -> sscreen Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 12:25:57 -05:00
Marek Olšák	86e25ed5a3	radeonsi: disable render cond & pipeline stats for internal compute dispatches	2019-01-22 12:24:35 -05:00
Sonny Jiang	1b25d340b7	radeonsi: use compute for resource_copy_region when possible v2: marek: fix snorm8 blits Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-01-22 12:24:35 -05:00
Jiang, Sonny	8daf5bb209	radeonsi: add compute_last_block to configure the partial block fields	2019-01-22 12:22:46 -05:00
Marek Olšák	b443465fb9	gallium/util: add util_format_snorm8_to_sint8 (from radeonsi)	2019-01-22 12:21:43 -05:00
Marek Olšák	3ee240890c	gallium: add SINT formats to have exact counterparts to SNORM formats for radeonsi	2019-01-22 12:21:43 -05:00
Marek Olšák	4d5f8f39f3	radeonsi: move PKT3_WRITE_DATA generation into a helper function Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 12:14:26 -05:00
Marek Olšák	c252273f98	radeonsi: don't use WRITE_DATA.DST_SEL == MEM_GRBM on >= CIK Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 12:14:26 -05:00
Marek Olšák	a545415eb9	radeonsi: fix the top-of-pipe fence on SI SI doesn't have MEM. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 12:14:26 -05:00
Marek Olšák	e402961e1d	radeonsi: correct WRITE_DATA.DST_SEL definitions Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 12:14:26 -05:00
Marek Olšák	c605738113	radeonsi: compile clear and copy buffer compute shaders on demand same as all other shaders	2019-01-22 11:59:27 -05:00
Marek Olšák	f139589069	radeonsi: remove redundant call to emit_cache_flush in compute clear/copy launch_grid calls it.	2019-01-22 11:59:27 -05:00
Marek Olšák	e3d283eaca	radeonsi: use buffer_store_format_x & xy	2019-01-22 11:59:27 -05:00
Marek Olšák	4c4c8bb1f0	radeonsi: fix rendering to tiny viewports where the viewport center is > 8K This fixes an assertion failure with GL CTS when cts-runner is used. (not a specific test) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108877 Cc: 18.3 <mesa-stable@lists.freedesktop.org>	2019-01-22 11:59:27 -05:00
Marek Olšák	caa2dcd730	radeonsi: fix a u_blitter crash after a shader with FBFETCH This fixes an assertion failure with GL CTS when cts-runner is used. (not a specific test) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108877 Cc: 18.3 <mesa-stable@lists.freedesktop.org>	2019-01-22 11:59:27 -05:00
Marek Olšák	c02f761bdf	winsys/amdgpu: use the new BO list API	2019-01-22 11:59:27 -05:00
Jason Ekstrand	ac0f8a6ea0	anv: Implement transform feedback queries Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:57 -06:00
Jason Ekstrand	7f4d9bb7b8	genxml: Add SO_PRIM_STORAGE_NEEDED and SO_NUM_PRIMS_WRITTEN Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:57 -06:00
Jason Ekstrand	673f33c77d	anv: Implement CmdBegin/EndQueryIndexed Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:57 -06:00
Jason Ekstrand	2be89cbd82	anv: Implement vkCmdDrawIndirectByteCountEXT Annoyingly, this requires that we implement integer division on the command streamer. Fortunately, we're only ever dividing by constants so we can use the mulh+add+shift trick and it's not as bad as it sounds. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	36ee2fd61c	anv: Implement the basic form of VK_EXT_transform_feedback Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	39925d60ec	anv: Add pipeline cache support for xfb_info Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	e3bd49eaa7	anv: Add but do not enable VK_EXT_transform_feedback Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:56 -06:00
Alejandro Piñeiro	6b50b0a4a8	nir/xfb: distinguish array of structs vs array of blocks Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	ac704e777c	nir/xfb: Properly handle arrays of blocks Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-22 10:42:56 -06:00
Alejandro Piñeiro	5649a0a6e8	nir/xfb: don't assert when xfb_buffer/stride is present but not xfb_offset This allows nir_gather_xfb_info to be used on OpenGL, specifically ARB_gl_spirv. From the OpenGL 4.6 spec, section 11.1.2.1, "Output Variables": "outputs specifying both an XfbBuffer and an Offset are captured, while outputs not specifying both of these are not captured. Values are captured each time the shader writes to such a decorated object." This implies that outputs are captured if both are present, and not if either is lacking. Technically, it doesn't explicitly say that having just one or the other is a mistake. In some cases, glslang adds an extra XfbBuffer without XfbOffset, and mentions that technically this is not a bug (see issue#1526). For Vulkan, as the same glslang issue mentions, it is not clear whether that should be a mistake or not. But even if it is a mistake, it does not really need to be checked in the driver, and we can let the validation layers check that. v2: simplify explicit_xfb_buffer and explicit_offset checks (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	4f99ac9144	nir/xfb: Fix offset accounting for dvec3/4 Before, we were double-counting the component slots when we had a dvec3 or dvec4. Instead, just add them in once and manually offset the recorded output offset. Fixes: `19064b8c` "nir: Add a pass for gathering transform feedback info" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	96fa23bca5	nir: Preserve offsets in lower_io_to_scalar_early Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-22 10:42:56 -06:00
Samuel Pitoiset	b2bbd978d0	nir: fix lowering arrays to elements for XFB outputs If we have a transform feedback output like: float[2] x2_out (VARYING_SLOT_VAR1.x, 0, 0) which is lowered by nir_lower_io_arrays_to_elements to, float x2_out (VARYING_SLOT_VAR1.x, 0, 0) float x2_out@5 (VARYING_SLOT_VAR2.x, 0, 0) We have to update the destination offset to avoid overwriting the same value. v2 (Jason Ekstrand): - Compute the correct offsets for arrays of vectors and/or doubles Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-22 10:42:56 -06:00
Samuel Pitoiset	9f4e0aa7c1	nir: do not remove varyings used for transform feedback When an xfb buffer is explicitly declared on a varying variable, we shouldn't remove it at link time. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	9c14440e81	spirv: Only set interface_type on blocks Instead of setting interface_type to whatever the per-vertex type is, we only set it on blocks. This allows later passes to tell the difference between variables that are in blocks and those that aren't. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	da29594636	spirv: Only split blocks Instead of splitting every per-vertex struct, just split the ones that are actually blocks. The reason for the split is so that we have separate variables for separate locations, qualifiers, and builtin decorations. The vulkan spec only allows these on members of blocks. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	662cfb121b	spirv: Initialize struct member offsets to -1 This is the "no offset specified" value. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:56 -06:00
Jason Ekstrand	b4eae8444e	anv: Always emit at least one vertex element This seems to make the simulator happier. The early return wasn't really protecting anything and the code that follows will happily initialize the dummy element to STORE_0 and emit it. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-22 10:42:56 -06:00
Eric Engestrom	610f956fde	configure: EGL requirements only apply if EGL is built Issue was hit with this configuration: --disable-{egl,gbm} --with-platform=drm Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Fixes: `3208fd2e46` ("configure: move platform handling further up") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-22 16:12:40 +00:00
Jonathan Marek	fc4f6b2f12	freedreno: a2xx: add partial lower_scalar pass for ir2 Some instructions can only be scalar on a2xx, lower these only Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-01-22 14:45:03 +00:00
Jonathan Marek	9f614c74b7	freedreno: a2xx: add ir2 copy propagation Two cases: * replacing srcs which refer to MOV instructions * replacing MOVs used to write to exports Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-01-22 14:45:03 +00:00
Jonathan Marek	c7dbf0b280	freedreno: a2xx: insert scalar MOV to allow 2 source scalar If we want to use a scalar instruction with two sources, both sources have to be in the same register. This covers a common case by inserting a scalar MOV into a previous instruction with only a vector alu instruction. A better method would be to have the sources end up in the same register in the first place, but when one source is a constant this is the only way. Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-01-22 14:45:03 +00:00
Jonathan Marek	67610a0323	freedreno: a2xx: NIR backend This patch replaces the a2xx TGSI compiler with a NIR compiler. It also adds several new features: -gl_FrontFacing, gl_FragCoord, gl_PointCoord, gl_PointSize -control flow (including loops) -texture related features (LOD/bias, cubemaps) -filling scalar ALU slot when possible Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2019-01-22 14:45:03 +00:00
Tapani Pälli	da3ca69afa	nir: cleanup glsl_get_struct_field_offset, glsl_get_explicit_stride Take away const qualifier from return type of these functions as -Wignored-qualifiers points out it is ignored for these cases. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 13:09:15 +02:00
Eric Engestrom	41a0c00392	travis: fix autotools build after --enable-autotools switch addition Fixes: `e68777c87c` "autotools: Deprecate the use of autotools" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 10:29:19 +00:00
Jason Ekstrand	27af1cc2a6	spirv: Update the JSON and headers from Khronos master This corresponds to commit 79b6681aadcb53c27d1052e on GitHub. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 18:55:05 -06:00
Jason Ekstrand	ca8c6c9781	nir: Mark deref UBO and SSBO access as non-scalar Fixes: `63b9aa2e25` "spirv: Add support for using derefs for..." Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 18:41:47 -06:00
Karol Herbst	5ee0adfb6e	nir/spirv: handle ContractionOff execution mode Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 20:36:41 +01:00
Rob Clark	fa737042ad	nir/vtn: add caps for some cl related capabilities vtn supports these, so don't squawk if user is happy with enabling these. v2: add new members sorted Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 20:36:41 +01:00
Karol Herbst	ce08e5f39c	vtn: handle SpvExecutionModelKernel Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 20:36:41 +01:00
Karol Herbst	8bb46de08b	mesa: add MESA_SHADER_KERNEL used for CL kernels Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 20:36:41 +01:00
Jason Ekstrand	2aa78e46e9	anv/pipeline: Add a pdevice helper variable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-21 11:57:00 -06:00
Jason Ekstrand	344171b9ee	relnotes: Add newly added Vulkan extensions Both the Intel and RADV people have been really bad about adding things to the release notes. We should start actually paying attention. Acked-by: Tapani Pälli <tapani.palli@intel.com>	2019-01-21 11:46:06 -06:00
Jason Ekstrand	c7f4a2867c	anv: Only parse pImmutableSamplers if the descriptor has samplers Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-21 11:45:58 -06:00
Rhys Perry	f0ba826054	radv: prevent dirtying of dynamic state when it does not change DXVK often sets dynamic state without actually changing it. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 14:37:53 +00:00
Rhys Perry	e4c6423c5e	radv: avoid context rolls when binding graphics pipelines It's common in some applications to bind a new graphics pipeline without ending up changing any context registers. This has a pipeline have two command buffers: one for setting context registers and one for everything else. The context register command buffer is only emitted if it differs from the previous pipeline's. v2: ensure late scissor emission is done when radv_emit_rbplus_state() is called v2: make use of cmd_buffer->state.workaround_scissor_bug v3: rename "workaround_scissor_bug" to "context_roll_without_scissor_emitted" Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 14:37:53 +00:00
Rhys Perry	5564a797f2	radv: add missed situations for scissor bug workaround v2: rename "workaround_scissor_bug" to "context_roll_without_scissor_emitted" Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 14:37:53 +00:00
Rhys Perry	5d1a29071a	radv: pass radv_draw_info to radv_emit_draw_registers() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 14:37:53 +00:00
Jonathan Marek	5886c5d092	freedreno: a2xx: sysmem rendering Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-01-21 09:22:34 -05:00
Jonathan Marek	bec6e4b054	freedreno: a2xx: fix non-zero texture base offsets Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-01-21 09:22:27 -05:00
Jonathan Marek	02ab85afd8	freedreno: a2xx: fix VERTEX_REUSE/DEALLOC on a20x On a20x, set VGT_VERTEX_REUSE_BLOCK_CNTL to 2 and don't change it. Small rearrangement on a220 to reduce the size of draw commands. Only set DEALLOC_CNTL on a20x because the correct a220 value is not known. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-01-21 09:22:22 -05:00
Jonathan Marek	0286a11b7e	freedreno: a2xx: fix gmem2mem viewport Fixes cases where previous viewport values might cause gmem2mem to fail. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-01-21 09:22:16 -05:00
Jonathan Marek	64b12520a2	freedreno: a2xx: cleanup REG_A2XX_PA_CL_VTE_CNTL Doesn't change much, but reduces the size of fd2_emit_state gmem2mem does not need to change the value: no Z clipping on resolve mem2gmem now needs to restore the common value after rendering Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-01-21 09:22:10 -05:00
Jonathan Marek	6ef7700ac6	freedreno: a2xx: cleanup init_shader_const Only 3 vertices are used so we can drop the data for vertex 4 It doesn't make sense to have 1.1 for some coordinates, use 1.0 instead Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-01-21 09:21:51 -05:00
Karol Herbst	0a793c78a3	nir: add bit_size parameter to system values with multiple allowed bit sizes v2: add assert to verify we have at least one valid bit_size v3: fix use of load_front_face in nir_lower_two_sided_color and tgsi_to_nir Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 00:17:18 +01:00
Karol Herbst	4125211e9c	nir: add legal bit_sizes to intrinsics With OpenCL some system values match the address bits, but in GLSL we also have some system values being 64 bit like subgroup masks. With this it is possible to adjust the builder functions so that depending on the bit_sizes the correct bit_size is used or an additional argument is added in case of multiple possible values. v2: validate dest bit_size v3: generate hex values in python code remove useless imports rename and move bit_sizes v4: add 1 to legal bit_sizes for front_face Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 00:16:51 +01:00
Karol Herbst	27bd07e230	nir/validate: allow to check against a bitmask of bit_sizes Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 00:16:51 +01:00
Karol Herbst	b9fec2b38c	nir: replace more nir_load_system_value calls with builder functions Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-21 00:16:51 +01:00
Karol Herbst	987744be98	glsl/lower_output_reads: set invariant and precise flags on temporaries fixes a couple of deqp tests (on nvc0 and potential other drivers): dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_1 dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_2 dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_3 dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_1 dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_2 dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_3 dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_1 dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_2 dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_3 CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-01-21 00:16:50 +01:00
Rhys Kidd	8002eaab6c	nv50,nvc0: add missing CAPs for unsupported features Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-01-20 13:51:01 -05:00
Karol Herbst	acdad24585	nir/spirv: handle SpvStorageClassCrossWorkgroup v2: rename nir_var_global to nir_var_mem_global Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:42 +01:00
Karol Herbst	36a76b7192	nir: rename nir_var_shared to nir_var_mem_shared Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:41 +01:00
Karol Herbst	6fefd69724	nir: rename nir_var_ssbo to nir_var_mem_ssbo Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:41 +01:00
Karol Herbst	3afc1e068f	nir: rename nir_var_ubo to nir_var_mem_ubo Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:41 +01:00
Karol Herbst	9b24028426	nir: rename nir_var_function to nir_var_function_temp Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:41 +01:00
Karol Herbst	e5daef9587	nir: rename nir_var_private to nir_var_shader_temp Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 20:01:41 +01:00
Lionel Landwerlin	ad99c1670a	intel/genxml: add missing MI_PREDICATE compare operations Doesn't save us a great deal of lines but at least they get decoded in aubinators. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-01-19 15:47:36 +00:00
Lionel Landwerlin	79514cc5fb	anv: document cache flushes & invalidations A little bit of explanation regarding how vkCmdPipelineBarrier() works. v2: Avoid referring to data port cache when it's actually sampler caches (Jason) Complete explanation for indirect draws (Jason) v3: s/samplers/sampler/ (Jason) s/UBOs/data port/ Add documentation for VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT_EXT (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)	2019-01-19 15:45:41 +00:00
Lionel Landwerlin	3c4c18341a	anv: narrow flushing of the render target to buffer writes In commit `9a7b319903` ("anv/query: flush render target before copying results") we tracked all the render target writes to apply flushes in vkCopyQueryResults(). But we can narrow this down to only when we write a buffer (which is the only input of vkCopyQueryResults). v2: Drop newer render target write flags introduced by `1952fd8d2c` ("anv: Implement VK_EXT_conditional_rendering for gen 7.5+") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)	2019-01-19 15:45:41 +00:00
Timothy Arceri	6ca652faf3	glsl: be much more aggressive when skipping shader compilation Currently we only add a cache key for a shader once it is linked. However games like Team Fortress 2 compile a whole bunch of shaders which are never actually linked. These compiled shaders can take up a bunch of memory. This patch changes things so that we add the key for the shader to the cache as soon as it is compiled. This means on a warm cache we can avoid the wasted memory from these shaders. Worst case scenario is we need to compile the shaders at link time but this can happen anyway if the shader has been evicted from the cache. Reduces memory use in Team Fortress 2 from 1.3GB -> 770MB on a warm cache from start up to the game menu. V2: only add key to cache when compilation is successful. Acked-by: Marek Olšák <marek.olsak@amd.com>	2019-01-19 13:12:25 +11:00
Francisco Jerez	c84ec70b3a	intel/fs: Promote execution type to 32-bit when any half-float conversion is needed. The docs are fairly incomplete and inconsistent about it, but this seems to be the reason why half-float destinations are required to be DWORD-aligned on BDW+ projects. This way the regioning lowering pass will make sure that the destination components of W to HF and HF to W conversions are aligned like the corresponding conversion operation with 32-bit execution data type. Tested-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-18 16:09:39 -08:00
Timothy Arceri	9e669ed22b	ac/nir_to_llvm: fix interpolateAt* for arrays This builds on the recent interpolate fix by Rhys `ee8488ea3b`. This fixes the arb_gpu_shader5 interpolateAt* tests that contain arrays. Fixes: `ee8488ea3b` ("ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsics") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-19 10:59:38 +11:00
Timothy Arceri	860a9e4849	Revert "glsl: be much more aggressive when skipping shader compilation" This reverts commit `64b8c86d37`. Reverting for now as it was causing some segfaults.	2019-01-19 10:45:07 +11:00
Kristian H. Kristensen	5486c9d526	freedreno/a6xx: Turn on texture tiling by default The color swap isn't available for tiled formats and it's not needed either. We pick one channel order and use for all non-linear formats. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-01-18 14:27:15 -08:00
Kristian H. Kristensen	60c6778dda	freedreno: Synchronize batch and flush for staging resource Staging blit downloads would wait on the src resource instead of the staging resource and didn't make sure to submit the blit batch first. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-01-18 14:27:12 -08:00
Timothy Arceri	64b8c86d37	glsl: be much more aggressive when skipping shader compilation Currently we only add a cache key for a shader once it is linked. However games like Team Fortress 2 compile a whole bunch of shaders which are never actually linked. These compiled shaders can take up a bunch of memory. This patch changes things so that we add the key for the shader to the cache as soon as it is compiled. This means on a warm cache we can avoid the wasted memory from these shaders. Worst case scenario is we need to compile the shaders at link time but this can happen anyway if the shader has been evicted from the cache. Reduces memory use in Team Fortress 2 from 1.3GB -> 770MB on a warm cache from start up to the game menu. Acked-by: Marek Olšák <marek.olsak@amd.com>	2019-01-19 08:24:47 +11:00
Timothy Arceri	c9d7b0f184	glsl: don't skip GLSL IR opts on first-time compiles This basically reverts `c2bc0aa7b1`. By running the opts we reduce memory use in Team Fortress 2 from 1.5GB -> 1.3GB from start-up to game menu. This will likely increase Deus Ex start up times as per commit `c2bc0aa7b1`. However currently 32bit games like Team Fortress 2 can run out of memory on low memory systems, so that seems more important. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-19 08:24:43 +11:00
Caio Marcelo de Oliveira Filho	cd56d79b59	nir: check NIR_SKIP to skip passes by name Passes' function names, separated by comma, listed in NIR_SKIP environment variable will be skipped in debug mode. The mechanism is hooked into the _PASS macro, like NIR_PRINT. The extra macro NIR_SKIP is available as a developer convenience, to skip at points other than the passes' entry points. v2: Fix typo in NIR_SKIP macro. (Bas) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-18 12:31:49 -08:00
Danylo Piliaiev	1952fd8d2c	anv: Implement VK_EXT_conditional_rendering for gen 7.5+ Conditional rendering affects next functions: - vkCmdDraw, vkCmdDrawIndexed, vkCmdDrawIndirect, vkCmdDrawIndexedIndirect - vkCmdDrawIndirectCountKHR, vkCmdDrawIndexedIndirectCountKHR - vkCmdDispatch, vkCmdDispatchIndirect, vkCmdDispatchBase - vkCmdClearAttachments Value from conditional buffer is cached into designated register, MI_PREDICATE is emitted every time conditional rendering is enabled and command requires it. v2: by Jason Ekstrand - Use vk_find_struct_const instead of manually looping - Move draw count loading to prepare function - Zero the top 32-bits of MI_ALU_REG15 v3: Apply pipeline flush before accessing conditional buffer (The issue was found by Samuel Iglesias) v4: - Remove support of Haswell due to possible hardware bug - Made TMP_REG_PREDICATE and TMP_REG_DRAW_COUNT defines to define registers in one place. v5: thanks to Jason Ekstrand and Lionel Landwerlin - Workaround the fact that MI_PREDICATE_RESULT is not accessible on Haswell by manually calculating MI_PREDICATE_RESULT and re-emitting MI_PREDICATE when necessary. v6: suggested by Lionel Landwerlin - Instead of calculating the result of predicate once - re-emit MI_PREDICATE to make it easier to investigate error states. v7: suggested by Jason - Make anv_pipe_invalidate_bits_for_access_flag add CS_STALL if VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT is set. v8: suggested by Lionel - Precompute conditional predicate's result to support secondary command buffers. - Make prepare_for_draw_count_predicate more readable. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-18 18:31:44 +00:00
Danylo Piliaiev	ed6e2bf263	anv: Implement VK_KHR_draw_indirect_count for gen 7+ v2: by Jason Ekstrand - Move out of the draw loop population of registers which aren't changed in it. - Remove dependency on ALU registers. - Clarify usage of PIPE_CONTROL - Without usage of ALU registers patch works for gen7+ v3: set pending_pipe_bits \|= ANV_PIPE_RENDER_TARGET_WRITES Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-18 18:31:44 +00:00
Dylan Baker	9e989b860a	bin/meson-cmd-extract: Also handle cross and native files Native file support in command line serialization isn't present in meson 0.49, but will be for 0.49.1 and 0.50 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-01-18 09:37:01 -08:00
Jason Ekstrand	b54df1b6df	anv: Re-sort the extensions list I like to keep things in good order so that you can find them. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-18 10:32:23 -06:00
Jason Ekstrand	eb32dad07c	intel/fs: Don't touch accumulator destination while applying regioning alignment rule In some shaders, you can end up with a stride in the source of a SHADER_OPCODE_MULH. One way this can happen is if the MULH is acting on the top bits of a 64-bit value due to 64-bit integer lowering. In this case, the compiler will produce something like this: mul(8) acc0<1>UD g5<8,4,2>UD 0x0004UW { align1 1Q }; mach(8) g6<1>UD g5<8,4,2>UD 0x00000004UD { align1 1Q AccWrEnable }; The new region fixup pass looks at the MUL and sees a strided source and unstrided destination and determines that the sequence is illegal. It then attempts to fix the illegal stride by replacing the destination of the MUL with a temporary and emitting a MOV into the accumulator: mul(8) g9<2>UD g5<8,4,2>UD 0x0004UW { align1 1Q }; mov(8) acc0<1>UD g9<8,4,2>UD { align1 1Q }; mach(8) g6<1>UD g5<8,4,2>UD 0x00000004UD { align1 1Q AccWrEnable }; Unfortunately, this new sequence isn't correct because MOV accesses the accumulator with a different precision to MUL and, instead of filling the bottom 32 bits with the source and zeroing the top 32 bits, it leaves the top 32 (or maybe 31) bits alone and full of garbage. When the MACH comes along and tries to complete the multiplication, the result is correct in the bottom 32 bits (which we throw away) and garbage in the top 32 bits which are actually returned by MACH. This commit does two things: First, it adds an assert to ensure that we don't try to rewrite accumulator destinations of MUL instructions so we can avoid this precision issue. 
Second, it modifies required_dst_byte_stride to require a tightly packed stride so that we fix up the sources instead and the actual code which gets emitted is this: mov(8) g9<1>UD g5<8,4,2>UD { align1 1Q }; mul(8) acc0<1>UD g9<8,8,1>UD 0x0004UW { align1 1Q }; mach(8) g6<1>UD g5<8,4,2>UD 0x00000004UD { align1 1Q AccWrEnable }; Fixes: `efa4e4bc5f` "intel/fs: Introduce regioning lowering pass" Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2019-01-18 10:18:52 -06:00
Jason Ekstrand	0a7ac6d543	intel/eu: Stop overriding exec sizes in send_indirect_message For a long time, we based exec sizes on destination register widths. We've not been doing that since `1ca3a94427` but a few remnants accidentally remained. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2019-01-18 10:18:52 -06:00
Samuel Pitoiset	f682ed11c3	radv: initialize the per-queue descriptor BO only once Totally useless to write the descriptors inside the loop. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-18 13:26:32 +01:00
Samuel Pitoiset	72d9745a40	radv: do not write unused descriptors to the per-queue BO Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-18 13:26:30 +01:00
Samuel Pitoiset	8c164ea8f5	radv: reduce size of the per-queue descriptor BO Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-18 13:26:28 +01:00
Samuel Pitoiset	83cc87ead4	radv: drop unused code related to 16 sample locations The driver only supports up to 8 sample locations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-18 13:26:24 +01:00
Karol Herbst	80dae7022e	gm107/ir: disable TEXS for tex with derivAll set fixes deqp tests: dEQP-GLES3.functional.shaders.texture_functions.texturegrad.samplercube_fixed_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.samplercube_float_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.isamplercube_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.usamplercube_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler3d_fixed_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler3d_float_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.isampler3d_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.usampler3d_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler2dshadow_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler3d_fixed_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler3d_float_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.isampler3d_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.usampler3d_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler2dshadow_vertex Fixes: `f821e80213` "gm107/ir: use scalar tex instructions where possible" Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-01-18 03:27:51 +01:00
Karol Herbst	30b5c9eda2	nv50/ir: disable tryCollapseChainedMULs in ConstantFolding for precise instructions fixes dEQP-GLES2.functional.shaders.invariance.mediump.loop_3 CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-01-18 02:03:30 +01:00
Bas Nieuwenhuizen	8424cd8fbd	nir: Account for atomics in copy propagation. Otherwise writes get propagated across atomics if no barrier is used. Without barrier writes should still be visible in the same invocation, so an atomic has to be considered a write. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Fixes: `b3c6146925` "nir: Copy propagation between blocks" Fixes: `62332d139c` "nir: Add a local variable-based copy propagation pass"	2019-01-18 00:55:35 +01:00
Rafael Antognolli	927ba12b53	anv/tests: Adding test for the state_pool padding. Add a test that checks that we can use the extra space allocated for padding while allocating larger anv_states. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:08:26 -08:00
Rafael Antognolli	731c4adcf9	anv/allocator: Add support for non-userptr. If softpin is supported, create new BOs for the required size and add the respective BO maps. The other main change of this commit is that anv_block_pool_map() now returns the map for the BO that the given offset is part of. So there's no block_pool->map access anymore (when softpin is used). v3: - set fd to -1 on softpin case (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:08:24 -08:00
Rafael Antognolli	643248b66a	anv: Remove state flush. We have all the state buffers snooped, so we don't need to clflush everything anymore. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:08:22 -08:00
Rafael Antognolli	5d61c74f3d	anv/allocator: Enable snooping on block pool and anv_bo_pool BOs. We are not going to use userptr for anv block pool BOs anymore. However, so far we have been relying on the fact that userptr BOs are snooped on non-llc platforms. Let's make sure that the block pool BOs are still snooped, and we can also remove the clflush'ing that we do on all state buffers. And since we plan to remove the flushes, set the anv_bo_pool BOs to cached (snooped on non-LLC platforms) too. For LLC platforms, they are all cached by default, so this becomes a no-op. v5: - Add snooping to anv_bo_pool BOs too (Jason). - Remove anv_gem_set_domain. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:08:20 -08:00
Rafael Antognolli	dfc9ab2ccd	anv/allocator: Add padding information. It's possible that we still have some space left in the block pool, but we try to allocate a state larger than that state. This means such state would start somewhere within the range of the old block_pool, and end after that range, within the range of the new size. That's fine when we use userptr, since the memory in the block pool is CPU mapped continuously. However, by the end of this series, we will have the block_pool split into different BOs, with different CPU mapping ranges that are not necessarily continuous. So we must avoid such case of a given state being part of two different BOs in the block pool. This commit solves the issue by detecting that we are growing the block_pool even though we are not at the end of the range. If that happens, we don't use the space left at the end of the old size, and consider it as "padding" that can't be used in the allocation. We update the size requested from the block pool to take the padding into account, and return the offset after the padding, which happens to be at the start of the new address range. Additionally, we return the amount of padding we used, so the caller knows that this happens and can return that padding back into a list of free states, that can be reused later. This way we hopefully don't waste any space, but also avoid having a state split between two different BOs. v3: - Calculate offset + padding at anv_block_pool_alloc_new (Jason). v4: - Remove extra "leftover". Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:08:19 -08:00
Rafael Antognolli	7ed0898a8d	anv/allocator: Rework chunk return to the state pool. This commit tries to rework the code that splits and returns chunks back to the state pool, while still keeping the same logic. The original code would get a chunk larger than we need and split it into pool->block_size. Then it would return all but the first one, and would split that first one into alloc_size chunks. Then it would keep the first one (for the allocation), and return the others back to the pool. The new anv_state_pool_return_chunk() function will take a chunk (with the alloc_size part removed), and a small_size hint. It then splits that chunk into pool->block_size'd chunks, and if there's some space still left, split that into small_size chunks. small_size in this case is the same size as alloc_size. The idea is to keep the same logic, but make it in a way we can reuse it to return other chunks to the pool when we are growing the buffer. v2: - Include Jason's suggestions to the algorithm that returns chunks. - Update comments. v3: - Disallow returning 0 blocks (Jason). - fix min_size in the loop (Jason). - remove temporary variables (Jason) v4: - return_chunk() should never return blocks larger than pool->block_size. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:08:17 -08:00
Rafael Antognolli	6a1f4c96cc	anv: Remove some asserts. They won't be true anymore once we add support for multiple BOs with non-userptr. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:08:14 -08:00
Rafael Antognolli	f39dad7e4e	anv: Validate the list of BOs from the block pool. We now have multiple BOs in the block pool, but sometimes we still reference only the first one in some instructions, and use relative offsets in others. So we must be sure to add all the BOs from the block pool to the validation list when submitting commands. v2: - Don't add block pool BOs to the dependency list right before execbuf (Jason) - Call anv_execbuf_add_bo() to each BO in the block pools (Jason) - Use anv_execbuf_add_bo_set() to add surface state dependencies to execbuf. v3: - Add comment to the non-softpin case (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:08:10 -08:00
Rafael Antognolli	11a5d4620b	anv: Split code to add BO dependencies to execbuf. This part of the anv_execbuf_add_bo() code is totally independent of the BO being added. Let's split it out, so we can reuse it later. v3: rename to anv_execbuf_add_bo_set (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:08:08 -08:00
Rafael Antognolli	f874604f45	anv/allocator: Add support for a list of BOs in block pool. So far we use only one BO (the last one created) in the block pool. When we switch to not use the userptr API, we will need multiple BOs. So add code now to store multiple BOs in the block pool. This has several implications, the main one being that we can't use pool->map as before. For that reason we update the getter to find which BO a given offset is part of, and return the respective map. v3: - Simplify anv_block_pool_map (Jason). - Use fixed size array for anv_bo's (Jason) v4: - Respect the order (item, container) in anv_block_pool_foreach_bo (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:08:04 -08:00
Rafael Antognolli	e3dc56d731	anv: Update usage of block_pool->bo. Change block_pool->bo to be a pointer, and update its usage everywhere. This makes it simpler to switch it later to a list of BOs. v3: - Use a static "bos" field in the struct, instead of malloc'ing it. This will be later changed to a fixed length array of BOs. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:08:02 -08:00
Rafael Antognolli	fc3f588320	anv/allocator: Remove pool->map. After switching to using anv_state_table, there are very few places left still using pool->map directly. We want to avoid that because it won't be always the right map once we split it into multiple BOs. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:08:00 -08:00
Rafael Antognolli	54e21e145e	anv/allocator: Rename anv_free_list2 to anv_free_list. Now that we removed the original anv_free_list, we can now use its name. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:07:58 -08:00
Rafael Antognolli	234c9d8a40	anv/allocator: Remove anv_free_list. The next commit already renames anv_free_list2 -> anv_free_list since the old one is gone. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:07:56 -08:00
Rafael Antognolli	e2179aceaf	anv/allocator: Use anv_state_table on back_alloc too. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:07:52 -08:00
Rafael Antognolli	d18267fb48	anv/allocator: Use anv_state_table on anv_state_pool_alloc. Use anv_state_pool_return_blocks() to return blocks to the pool, instead of manually pushing them. v3: - return blocks from the end of the chunk (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:07:50 -08:00
Rafael Antognolli	6a1dcfe73d	anv/allocator: Add helper to push states back to the state table. The use of anv_state_table_add() combined with anv_state_table_push(), especially when adding a bunch of states to the table, is very verbose. So we add this helper that makes things easier to digest. We also already add the anv_state_table member in this commit, so things can compile properly, even though it's not used. v2: assert that the states are always aligned to their size (Jason) v3: Add "table" member to anv_state_pool in this commit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:07:47 -08:00
Rafael Antognolli	e8b6e0a5ba	anv/allocator: Add getter for anv_block_pool. We will need the anv_block_pool_map to find the map relative to some BO that is not at the start of the block pool. v2: just return a pointer instead of a struct (Jason) v4: Update comment (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:07:43 -08:00
Rafael Antognolli	6a2d5ae305	anv/allocator: Add anv_state_table. Add a structure to hold anv_states. This table will initially be used to recycle anv_states, instead of relying on a linked list implemented in GPU memory. Later it could be used so that all anv_states just point to the content of this struct, instead of making copies of anv_states everywhere. One has to call anv_state_table_add(), which returns an index for the state in the table, and then get a pointer to such index, and finally fill in the rest of the struct. TODO: 1) There's a lot of common code between this table backing store memory and the anv_block_pool buffer, due to how we grow it. I think it's possible to refactor this and reuse code in both places. 2) Add unit tests. v3: - Rename state table memfd (Jason) - Return VK_ERROR_OUT_OF_HOST_MEMORY on more places (Jason) - anv_state_table_grow returns VkResult (Jason) - Rename variables to be more informative (Jason) - Return errors on state table grow. - Rename anv_state_table_push/pop to anv_free_list_push2/pop2 This will be renamed again to remove the trailing "2" later. v4: - Remove exit(-1) from anv_state_table (Jason). - Use uint32_t "next" field in anv_free_entry (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:07:34 -08:00
Rafael Antognolli	27478ce00e	anv/tests: Fix block_pool_no_free test. There were 2 problems with this test. First it was comparing highest, which was -1, with an uint32_t. So the current value would never be higher than that, and the assert would always be false. It just never reached this point because of the next problem. It was always looking for the highest value of each thread and storing it in thread_max. So a test case like this wouldn't work: [Thread]: [Blocks] [0]: [0, 32, 64, 96] [1]: [128, 160, 192, 224] [2]: [256, 288, 320, 352] Not only that would skip values and iterate only over thread number 2, instead of walking through all of them, but thread_max was also initialized to -1. And then compared to unsigned blocks[i][next[i]]. We fix that by getting the smallest value of each thread, and checking if it is lower than thread_min, which is initialized to INT32_MAX. And then we end up walking through all the blocks of all threads. We also change "blocks" to be int32_t instead of uint32_t, since in some places (alloc_blocks) it was already referenced as int32_t, and that fixes the comparison to -1. v2: - keep highest initialized to -1, and change blocks to be int32_t. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 15:05:58 -08:00
Lionel Landwerlin	4149d41f2e	anv: fix invalid binding table index computation The ++ operator strikes again. Fixes: `f92c5bc8f3` ("anv/device: fix maximum number of images supported") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-17 11:49:10 -08:00
Eric Engestrom	c4c5c90255	docs: explain how to see what meson options exist Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-01-17 17:05:41 +00:00
Emil Velikov	406623f5b1	docs: update calendar, add news item and link release notes for 18.3.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-17 11:37:41 +00:00
Emil Velikov	9d58641bf2	docs: add sha256 checksums for 18.3.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `8320a07221`)	2019-01-17 11:32:20 +00:00
Emil Velikov	2dad014496	docs: add release notes for 18.3.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `95a3b709c0`)	2019-01-17 11:32:19 +00:00
Iago Toral Quiroga	f92c5bc8f3	anv/device: fix maximum number of images supported We had defined MAX_IMAGES as 8, which we used to size the array for image push constant data. The comment there stated that this was for gen8, but anv_nir_apply_pipeline_layout runs for all gens and writes that array, asserting that we don't exceed that number of images, which imposes a limit of MAX_IMAGES on all gens. Furthermore, despite this, we are exposing up to 64 images per shader stage on all gens, gen8 included. This patch lowers the number of images we expose in gen8 to 8 and keeps 64 images for gen9+ while making sure that only pre-SKL gens use push constant space to handle images. v2: - <= instead of < in the assert (Eric, Lionel) - Change the way the assertion is written (Eric) v3: - Revert the way the assertion is written to the form it had in v1, the version in v2 was not equivalent and was incorrect. (Lionel) v4: - gen9+ doesn't need push constants for images at all (Jason) Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v3)	2019-01-17 07:59:00 +01:00
Tapani Pälli	a311aa631d	anv: do not advertise AHW support if extension not enabled Fixes following failing vk-gl-cts cases on Linux desktop: dEQP-VK.api.external.memory.android_hardware_buffer.suballocated.buffer.info dEQP-VK.api.external.memory.android_hardware_buffer.suballocated.image.info dEQP-VK.api.external.memory.android_hardware_buffer.dedicated.image.info dEQP-VK.api.external.memory.android_hardware_buffer.dedicated.buffer.info Fixes: `517103abf1` "anv/android: add ahardwarebuffer external memory properties" Reported-by: Juan A. Suarez <jasuarez@igalia.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-01-17 07:22:02 +02:00
Eric Anholt	99ef66c325	vc4: Don't leak the GPU fd for renderonly usage. Noticed while debugging V3D -- the ro->gpu_fd was freshly opened in ro setup, and it needs to stay open until screen close (since it may be used by renderonly) and should be the same one used by the vc4 screen. Fixes: `7029ec05e2` ("gallium: Add renderonly-based support for pl111+vc4.")	2019-01-16 16:28:41 -08:00
Eric Anholt	0605726776	v3d: Don't leak the GPU fd for renderonly usage. The CTS was running out of fds, because of the ro->gpu_fd never being closed. ro->gpu_fd should match the screen (in case the caller of v3d_drm_screen_create_renderonly() has a scanout_for_resource() that uses gpu_fd) and the screen is expected to close its fd at the end, fixing the resource leak. Fixes: `e113b21cb7` ("v3d: Add renderonly support.")	2019-01-16 16:28:41 -08:00
Eric Anholt	59527a36e9	v3d: Restructure RO allocations using resource_from_handle. I had bugs in the old path where I was laying out as tiled (so we'd render tiled) but then only allocating space in the shared object for linear rendering. The resource_from_handle makes it so the same layout choices are made in both the import and export scanout cases. Also, fixes a leak of the fd that was tripping up the CTS. Now that we're checking PIPE_BIND_SHARED to choose to use RO, the DRM_FORMAT_MOD_LINEAR check wasn't needed any more. Fixes visual corruption and MMU faults in X in renderonly mode. Fixes: `bd09bb1629` ("v3d: SHARED but not necessarily SCANOUT buffers on RO must be linear.")	2019-01-16 16:28:41 -08:00
Eric Anholt	d70eb2302b	v3d: If the modifier is not known on BO import, default to linear for RO. Part of fixing DRI3 rendering with RO on X11. Fixes: `e113b21cb7` ("v3d: Add renderonly support.")	2019-01-16 16:28:41 -08:00
Timothy Arceri	cb527d2c4c	ac/nir_to_llvm: add support for structs to get_sampler_desc() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-17 10:35:36 +11:00
Timothy Arceri	b12316cc92	ac/nir_to_llvm: fix regression in bindless support This wasn't ported over when deref support was implemented. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-17 10:35:36 +11:00
Timothy Arceri	e106e0f2dd	radeonsi/nir: get correct type for images inside structs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-17 10:35:36 +11:00
Timothy Arceri	292887ac0d	ac/nir_to_llvm: fix type handling in image code The current code only strips off arrays and cannot find the type for images that are struct members. Instead of trying to get the image type from the variable, we just get it directly from the deref instruction. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-17 10:35:36 +11:00
Rhys Perry	8a52e4cc4f	radv: use dithered alpha-to-coverage This matches the behaviour of AMDVLK and hides banding. It also seems to be allowed by the Vulkan spec. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-16 20:49:23 +00:00
Alok Hota	187a6506a3	swr/rast: Store cached files in multiple subdirs This improves cache filesystem performance, especially during CI tests Also updated jitcache magic number due to codegen parameter changes Removed 2 `if constexpr` to prevent C++17 requirement	2019-01-16 13:53:30 -06:00
Alok Hota	bb98be61f4	swr/rast: New execution engine per JIT Fixes relocation errors with LLVM 7.0.0	2019-01-16 13:53:30 -06:00
Alok Hota	b135db5d58	swr/rast: Scope MEM_CLIENT enum for mem usages Avoids confusion with other defaulted integer parameters - fixed some unspecified usages - removed unnecessary includes - removed unnecessary protected access specifier in buckets framework	2019-01-16 13:53:30 -06:00
Alok Hota	c722ad7379	swr/rast: Unaligned and translations in gathers - added graphics address translation in odd gathers - added support for unaligned gathers in fetch shader - changed how 2+ GB offsets are handled to make them compatible with unaligned offsets	2019-01-16 13:53:30 -06:00
Alok Hota	9459863dfa	swr/rast: partial support for Tiled Resources - updated sample from TRTT surfaces correctly - implemented mapped status return for TRTT surfaces - implemented per-sample instruction minLod clamp - updated bilinear filter weight calculation to be closer to D3D specs - implemented "ReducedTexcoordRange" operation from D3D specs to avoid loss of precision on high-value normalized coordinates	2019-01-16 13:53:30 -06:00
Alok Hota	9cacf9d877	swr/rast: Add annotator to interleave isa text To make debugging simpler	2019-01-16 13:53:30 -06:00
Alok Hota	c9fa2ee343	swr/rast: Use gfxptr_t value in JitGatherVertices Use gfxptr_t type value for stream pointer uses in gather and similar calls	2019-01-16 13:53:30 -06:00
Gert Wollny	e68777c87c	autotools: Deprecate the use of autotools Since Meson will eventually be the only build system deprecate autotools now. It can still be used by invoking configure with the flag --enable-autotools NAKed-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2019-01-16 09:52:42 -08:00
Dylan Baker	431e9abaab	meson: allow building dri driver without window system if osmesa is classic This was already enabled for gallium based osmesa with gallium drivers in `9d10581897`, so do the same for classic driver with classic osmesa. Fixes: `cbbd5bb889` ("meson: build classic osmesa") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-01-16 17:49:51 +00:00
Bruce Cherniak	ed7673afd2	gallium/swr: Fix multi-context sync fence deadlock. Various recreation scenarios lead to API thread getting stuck in swr_fence_finish(). This is a multi-context issue, whereby one context overwrites the fence read-value with a previous sync's lesser value. The fence sync value is supposed to be always increasing. In swr_fence_cb(), only update the "read" value if the new value is greater. (This may seem like we're not waiting on the other context to finish, but had we needed for it to finish there would have been a wait prior to submitting a new sync.) cc: mesa-stable@lists.freedesktop.org	2019-01-16 09:26:36 -06:00
Samuel Pitoiset	d5d7b5e950	ac/nir: don't trash L1 caches for store operations with writeonly memory Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-16 13:57:22 +01:00
Kenneth Graunke	5b51d754d0	st/mesa: Optionally override RGB/RGBX dst alpha blend factors Intel's blending hardware does not properly return 1.0 for destination alpha for RGBX formats; it requires the factors to be overridden to either zero or one. Broadcom vc4 and v3d also could use this override. While overriding these factors is safe in general, Nouveau and Radeon would prefer not to. Their blending hardware already returns correct values for RGB/RGBX formats, and would like to avoid the resulting per-buffer blending and independent blend factors (rgb != a) since it can cause additional overhead. I considered simply handling this in the driver, but it's not as nice. pipe_blend_state doesn't have any format information, so we'd need the hardware blend state to depend on both pipe_blend_state and pipe_framebuffer_state. Furthermore, Intel GPUs don't have a native RGBX_SNORM format, so I avoid exposing one, which makes Gallium fall back to RGBA_SNORM. The pipe_surfaces we get in the driver have an RGBA format, making it impossible to tell that there shouldn't be an alpha channel. One could argue that st not handling it in that case is a bug. To work around this, we'd have to expose RGBX pipe formats, mapped to RGBA hardware formats, and add format swizzling special cases. All doable, but it ends up being more code than I'd like. st_atom_blend already has access to the right information and it's trivial to accomplish there, so we just add a cap bit and do that. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-01-15 20:53:44 -08:00
Marek Olšák	11735d6c9c	winsys/amdgpu: fix whitespace	2019-01-15 19:10:16 -05:00
Pierre Moreau	0b736f7fd4	meson: Fix with_gallium_icd to with_opencl_icd `with_gallium_icd` is never used throughout the different Meson build files, whereas `with_opencl_icd` tracks whether or not `gallium-opencl` was set to "icd". Fixes: `42ea0631f1` ("meson: build clover") Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-01-15 13:06:50 -08:00
Kenneth Graunke	d644698b44	gallium: Add the ability to query a single pipeline statistics counter Gallium historically has treated pipeline statistics queries as a single query, PIPE_QUERY_PIPELINE_STATISTICS, which returns a block of 11 values. This was originally patterned after the D3D1x API. Much later, Brian introduced an OpenGL extension that exposed these counters - but it exposes 11 separate queries, each of which returns a single value. Today, st/mesa simply queries all 11 values, and returns a single value. While pipeline statistics counters aren't typically performance critical, this is still not a great fit. A D3D1x->GL translator might request all 11 counters by creating 11 separate GL queries...which Gallium would map to reads of all 11 values each time, resulting in a total 121 counter reads. That's not ideal. This patch adds a new cap, PIPE_CAP_QUERY_PIPELINE_STATISTICS_SINGLE, and corresponding query type PIPE_QUERY_PIPELINE_STATISTICS_SINGLE. When calling create_query(), q->index should be set to one of the PIPE_STAT_QUERY_* enums to select a counter. Unlike the block query, this returns the value in pipe_query_result::u64 (as it's a single value) instead of the pipe_query_data_pipeline_statistics group. We update st/mesa to expose ARB_pipeline_statistics_query if either capability is set, preferring the new SINGLE variant when available. Thanks to Roland, Ilia, and Marek for helping me sort this out. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-01-15 11:43:04 -08:00
Kenneth Graunke	f967273fb4	st/mesa: Rearrange PIPE_QUERY_PIPELINE_STATISTICS result fetching. This just changes the order of the switch statements, so we only look at target if the query type is PIPE_QUERY_PIPELINE_STATISTICS. The next commit will introduce a new SINGLE query type which can be used for the same GL query types, and it won't want this processing. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-01-15 11:43:04 -08:00
Kenneth Graunke	e760be08b4	st/mesa: Make an enum for pipeline statistics query result indices. Gallium handles pipeline statistics queries as a single query (PIPE_QUERY_PIPELINE_STATISTICS) which returns a struct with 11 values. Sometimes it's useful to refer to each of those values individually, rather than as a group. To avoid hardcoding numbers, we define a new enum for each value. Here, the name and enum value correspond to the index in the struct pipe_query_data_pipeline_statistics result. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-01-15 11:43:04 -08:00
Dylan Baker	4a131a1330	meson: Add a script to extract the cmd line used for meson Upstream I'm pursuing a more comprehensive solution, but this should prove a suitable stop-gap measure in the meantime. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109325 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Tapani Pälli <tapani.palli@intel.com>	2019-01-15 17:38:47 +00:00
Samuel Pitoiset	7bef192018	radv: add support for VK_EXT_memory_budget A simple Vulkan extension that allows apps to query size and usage of all exposed memory heaps. The different usage values are not really accurate because they are per drm-fd, but they should be close enough. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-15 11:18:37 +01:00
Samuel Pitoiset	9784400a6b	radv: add two small helpers for getting VRAM and visible VRAM sizes Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-15 11:18:35 +01:00
Samuel Pitoiset	a6e5ce5130	radv: remove unnecessary returns in GetPhysicalDevice*Properties() These functions return nothing. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-15 11:18:17 +01:00
Bas Nieuwenhuizen	568e7a2998	radv: Set partial_vs_wave for pipelines with just GS, not tess. Looking at -pro we need to enable it for pipelines with just a GS too. This seems to reduce the hangs from https://bugs.freedesktop.org/show_bug.cgi?id=109242 on a RX 550 to the point where I can't reproduce, after the false start with the wd_switch_on_eop patch due to flakiness. (but people are reporting it does not fix the issue completely for them on polaris 11) CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-15 10:22:30 +01:00
Marek Olšák	5183e794af	radeonsi: also apply the GS hang workaround to draws without tessellation ported from AMDVLK. Cc: 18.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-14 18:55:58 -05:00
Eric Anholt	bd09bb1629	v3d: SHARED but not necessarily SCANOUT buffers on RO must be linear. We don't have a way to talk to RO about modifiers it can do yet, so assume the minimum.	2019-01-14 15:40:55 -08:00
Eric Anholt	f72820c851	v3d: Add support for CS barrier() intrinsics.	2019-01-14 15:40:55 -08:00
Eric Anholt	9b45b06d7c	v3d: Add support for CS shared variable load/store/atomics. CS shared variables are handled effectively as SSBO access to a temporary buffer that will be allocated at CS dispatch time.	2019-01-14 15:40:55 -08:00
Eric Anholt	01d913cf90	v3d: Add support for CS workgroup/invocation id intrinsics. We get a payload for the ivec3 workgroup and an int local invocation index, and we use the core lowering to turn into the global invocation id and the local invocation id ivec3s.	2019-01-14 15:40:55 -08:00
Eric Anholt	6281f26f06	v3d: Add support for shader_image_load_store. This is only exposed on V3D 4.1+, because we didn't have the TMU write operations for images on 3.3 (To do GLES 3.1 there, you have to lower it to SSBO load/stores, which is a problem to solve later).	2019-01-14 15:40:55 -08:00
Eric Anholt	5932c2f0b9	v3d: Add SSBO/atomic counters support. So far I assume that all the buffers get written. If they weren't, you'd probably be using UBOs instead.	2019-01-14 15:40:55 -08:00
Eric Anholt	6c8edcb89c	v3d: Drop the GLSL version level. This was an arbitrary "we support lots of stuff" value when I started the driver. However, at 400 we expose OES_gpu_shader5, which claims support for dynamically indexing samplers, which the driver doesn't do yet.	2019-01-14 13:18:02 -08:00
Eric Anholt	1a63227ea0	v3d: Add support for matrix inputs to the FS. We've been relying on linking splitting up our varying matrices into separate vectors, but with SSO that doesn't happen. Supporting matrix inputs isn't too hard, though.	2019-01-14 13:18:02 -08:00
Eric Anholt	49b7e26fac	v3d: Add an isr to the simulator to catch GMP violations. Otherwise, the simulator raises the GMP interrupt and waits for it to be handled, and v3d ends up spinning in v3d_hw_tick(). Aborting right when violation happens gives us a chance to look at the backtrace of whatever thread triggered the violation.	2019-01-14 13:18:02 -08:00
Eric Anholt	3790ee07e6	v3d: Fix txf_ms 2D_ARRAY array index. We need to pass the array index through our coordinate transform unchanged. Fixes dEQP-GLES31.functional.texture.multisample.samples_1.*_2d_array	2019-01-14 13:18:02 -08:00
Eric Anholt	619a28b845	v3d: Add support for GL_ARB_framebuffer_no_attachments. Fixes dEQP-GLES31.functional.state_query.integer.max_framebuffer_height_getboolean when GLES3 is enabled.	2019-01-14 13:18:02 -08:00
Eric Anholt	051a41d3d5	v3d: Add support for the early_fragment_tests flag. If this flag hasn't been set by the shader and it has some visible side effects, then we need to disable EZ.	2019-01-14 13:18:02 -08:00
Eric Anholt	b417a9f7b2	v3d: Add support for flushing dirty TMU data at job end. This will be needed for SSBOs and image_load_store.	2019-01-14 13:18:02 -08:00
Samuel Pitoiset	ad6ceb2872	ac: add missing 16-bit types to glsl_base_to_llvm_type() Fix crashes with dEQP-VK.spirv_assembly.instruction.compute.workgroup_memory.*16 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-14 21:18:23 +01:00
Bas Nieuwenhuizen	76b12fa564	radv: Only use 32 KiB per threadgroup on Stoney. Causes hangs on some machines. What works for dEQP-VK.tessellation.shader_input_output.barrier: - running num_patches = 6 (which limits LDS to 32 KiB) - running num_patches = 8, and artificially cutting LDS size at 32 KiB. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-14 19:58:27 +00:00
Marek Olšák	76df5e8f52	st/dri: fix dri2_format_table for argb1555 and rgb565 The bug caused rgb565 framebuffers to use argb1555. Fixes: `433ca3127a` Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-01-14 14:54:19 -05:00
Jason Ekstrand	2d2737dcfe	nir: Add a bool to float32 lowering pass From @jekstrand's nir-1-bit-bool branch, with improved ior/inot lowering. ior: fmax instead of fadd allows removing the fsat. inot: seq(x, 0) can be better than fsub(1, x). On a2xx, it works better with the scalar instruction set. Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-01-14 19:27:06 +00:00
Caio Marcelo de Oliveira Filho	09c3ff01df	src/intel: use new hash table and set creation helpers Replace calls to create hash tables and sets that use _mesa_hash_pointer/_mesa_key_pointer_equal with the helpers _mesa_pointer_hash_table_create() and _mesa_pointer_set_create(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-01-14 10:49:33 -08:00
Caio Marcelo de Oliveira Filho	9fdded0cc3	src/compiler: use new hash table and set creation helpers Replace calls to create hash tables and sets that use _mesa_hash_pointer/_mesa_key_pointer_equal with the helpers _mesa_pointer_hash_table_create() and _mesa_pointer_set_create(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-01-14 10:49:28 -08:00
Caio Marcelo de Oliveira Filho	ee23e8b17c	util: Helper to create sets and hashes with pointer keys These combinations are common enough and deserve a shortcut. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Eric Engestrom <eric@engestrom.ch>	2019-01-14 10:49:21 -08:00
Samuel Pitoiset	929df7afaf	ac/nir: set cache policy when loading/storing buffer images This was missing. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-14 17:59:51 +01:00
Samuel Pitoiset	af2a85df74	ac/nir: add get_cache_policy() helper and use it Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-14 17:59:49 +01:00
Jason Ekstrand	5e4f9ea363	anv: Implement VK_KHR_depth_stencil_resolve	2019-01-14 10:16:52 -06:00
Jason Ekstrand	9f44088468	anv: Move resolve_subpass to genX_cmd_buffer.c We may have to do transitions around certain kinds of resolves so it helps to have it genX code.	2019-01-14 10:16:52 -06:00
Jason Ekstrand	930b17161f	anv/blorp: Refactor MSAA resolves into an exportable helper function This function is modeled after the aux_op functions except that it has a lot more parameters because it deals with two images as well as source and destination regions.	2019-01-14 10:16:52 -06:00
Jason Ekstrand	c92c449361	anv: Rename has_resolve to has_color_resolve	2019-01-14 10:16:52 -06:00
Jason Ekstrand	4bd976e3b8	intel/blorp: Add two more filter modes	2019-01-14 10:16:52 -06:00
Andres Gomez	3ec9ab80b8	bin/get-pick-list.sh: fix redirection in sh "&>" is bash specific. Fixes: `e0dbfc9953` ("bin/get-pick-list.sh: warn when commit lists invalid sha") Cc: Juan A. Suarez <jasuarez@igalia.com> Cc: Eric Engestrom <eric.engestrom@intel.com> Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-01-14 17:40:15 +02:00
Andres Gomez	716ed41a36	bin/get-pick-list.sh: fix the oneline printing "--summary" will also print extended header information such as creations, renames and mode changes. Let's just use "--no-patch", which suppresses the diff output. v2: Use "--no-patch" instead of the "-s" abbreviation (Eric). Fixes: `559c32d241` ("bin/get-pick-list.sh: simplify git oneline printing") Cc: Juan A. Suarez <jasuarez@igalia.com> Cc: Eric Engestrom <eric.engestrom@intel.com> Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2019-01-14 17:36:56 +02:00
Michel Dänzer	1a20b56798	amd/common: Restore v4i32 suffix for llvm.SI.load.const intrinsic It was accidentally dropped in commit `e4803ab7d2` "amd/common: use llvm.amdgcn.s.buffer.load for LLVM 8.0", breaking the universe with LLVM 7. Trivial.	2019-01-14 12:52:52 +01:00
Nicolai Hähnle	7fbd48fdc0	amd/common/vi+: enable SMEM loads with GLC=1 Only on LLVM 8.0+, which supports the new intrinsic. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-14 08:30:15 +01:00
Nicolai Hähnle	e4803ab7d2	amd/common: use llvm.amdgcn.s.buffer.load for LLVM 8.0 llvm.SI.load.const is deprecated. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-14 08:30:12 +01:00
Iago Toral Quiroga	1c1ae6376c	anv/pipeline_cache: free NIR shader cache Fixes: `f6aa9f7185` 'anv/pipeline_cache: Add support for caching NIR' Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-14 07:59:27 +01:00
Danylo Piliaiev	0862929bf6	glsl: Fix copying function's out to temp if dereferenced by array Function's out variable could be an array dereferenced by an array: func(v[w[i]]); or something more complicated. Copy index in any case. Fixes: `76c27e47b9` ("glsl: Copy function out to temp if we don't directly ref a variable") Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-14 12:04:07 +11:00
Kenneth Graunke	04c2f12ab2	i965: Drop mark_surface_used mechanism. The original idea was that the backend compiler could eliminate surfaces, so we would have it mark which ones are actually used, then shrink the binding table accordingly. Unfortunately, it's a pretty blunt mechanism - it can only prune things from the end, not the middle - since we decide the layout before we even start the backend compiler, and only limit the size. It also basically gives up if it sees indirect array access. Besides, we do the vast majority of our surface elimination in NIR anyway, not the backend - and I don't see that trend changing any time soon. Vulkan abandoned this plan a long time ago, and I don't use it in Iris, but it's still been kicking around in i965. I hacked shader-db to print the binding table size in bytes, and observed no changes with this patch. So, this code appears to do nothing useful. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-13 09:35:32 -08:00
Eric Engestrom	bdf6a5c1d2	egl: fix python lib deprecation warning DeprecationWarning: the imp module is deprecated in favour of importlib Instead of complicated logic, just import the file directly. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2019-01-13 13:59:08 +00:00
Jason Ekstrand	b938d5fbef	spirv: Emit switch conditions on-the-fly Instead of emitting all of the conditions for the cases of a switch statement up-front, emit them on-the-fly as we emit the code for each case. The original justification for this was that we were going to have to build a default case anyway which would need them all. However, we can just trust CSE to clean up the mess in that case. Emitting each condition right before the if statement that uses it reduces register pressure and, in one customer benchmark, reduces spilling and improves performance by about 2x. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-12 17:55:49 -06:00
Jason Ekstrand	821b6861ec	nir/gcm: Support deref instructions Even though no one's been brave enough to ever use this pass, I like to keep it functionally working. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-12 17:55:49 -06:00
Jason Ekstrand	24c8108ea6	intel/nir: Call nir_opt_deref in brw_nir_optimize It's an optimization so we should probably be calling it in the optimization loop. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-12 17:55:49 -06:00
Jason Ekstrand	e57e26121a	spirv: Contain the GLSLang issue #179 workaround to old GLSLang Instead of applying the workaround universally, detect semi-old GLSLang via the generator ID and only enable the workaround on old GLSLang. This isn't nearly as precise as one would like it to be because the first GLSLang generator id version bump was on October 7, 2017 which is about 1.5 years after the bug was fixed. However, it at least lets us disable it for non-GLSLang and for more modern versions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-12 17:55:49 -06:00
Jason Ekstrand	b57c1ec421	spirv: Whack sampler/image pointers to uniform A long time ago in a galaxy far, far away, there was a GLSLang bug with how it handled samplers passed in as function parameters. (The bug can be found here: https://github.com/KhronosGroup/glslang/issues/179.) Unfortunately, that version was shipped in several apps and has been causing heartburn for our SPIR-V parser ever since. Recent changes to NIR uncovered a moderately old bug in how we work around this issue. In particular, we ended up with a deref_cast from uniform to local which is not a no-op cast so nir_opt_deref wasn't getting rid of the cast. The only reason why it worked before was because someone just happened to call nir_fixup_deref_modes which "fixed" the cast (that shouldn't be happening) and then a later round of copy-prop would get rid of it. The fact that the deref_cast survived that long without causing trouble for other parts of NIR is a bit surprising. Just whacking the mode of the pointer seems to fix it fairly unobtrusively. Currently, only apps with this bug will have a local variable containing an image or sampler. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109304 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-12 17:55:49 -06:00
Kenneth Graunke	2b876bc922	st/nir: Lower TES gl_PatchVerticesIn to a constant if linked with a TCS. If the TCS and TES are linked together, we can simply replace the TES's gl_PatchVerticesIn system value with a constant, possibly allowing extra optimization or letting the driver avoid uploading a special value. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-01-11 13:07:54 -08:00
Jonathan Marek	3d182601bb	glsl/nir: keep bool types when native_integers=false With the new handling of bool types, the conversion to float in glsl_to_nir should not apply to bool types anymore. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-11 19:16:11 +00:00
Jonathan Marek	b27ad17115	glsl/nir: ftrunc for native_integers=false float to int cast out_type in the default cast case is always GLSL_TYPE_FLOAT, so we get a mov otherwise. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-11 19:16:11 +00:00
Jonathan Marek	d3b47e073e	glsl/nir: int constants as float for native_integers=false All alu instructions emitted with native_integers=false expect float (or bool in some cases) constants, so this change is necessary. This will cause changes with some intrinsics which had integer sources, such as nir_intrinsic_load_uniform. Apparently it might cause issues with some opt passes, but perhaps those don't apply in OpenGL ES 2.0 cases? Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-11 19:16:11 +00:00
Jason Ekstrand	1ede463b6e	intel/peephole_ffma: Fix swizzle propagation The num_components value passed into get_mul_for_src is used to only compose the parts of the swizzle that we know will be used so we don't compose invalid swizzle components. However, we had a bug where we passed the number of components of the add all the way through. For the given source, we need the number of components read from that source. In the case where we have a narrow add, say 2 components, that is sourced from a chain of wider instructions, we may not compose all the swizzles. All we really need to do is pass through the right number of components at each level. Fixes: `2231cf0ba3` "nir: Fix output swizzle in get_mul_for_src" Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-01-11 10:44:08 -06:00
Kenneth Graunke	ae683ed3bc	nir: Allow a non-existent sampler deref in nir_lower_samplers_as_deref GL_ARB_gl_spirv does not provide a sampler deref for e.g. texelFetch(), so we can't assume that both are present and identical. Simply lower each if it is present. Fixes regressions in GL_ARB_gl_spirv tests since I switched everyone to using this pass. Thanks to Alejandro Piñeiro for catching these. Fixes: `f003859f97` nir: Make gl_nir_lower_samplers use gl_nir_lower_samplers_as_deref Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-11 07:54:32 -08:00
Eric Engestrom	e12b0b5c6d	travis: avoid using unset llvm-config Fixes the following errors: usage: which [-as] program ... /Users/travis/.travis/job_stages: line 110: --version: command not found ... caused by the use of an undefined $LLVM_CONFIG Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-11 14:38:35 +00:00
Eric Engestrom	c8ae891035	egl: remove unused include Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-11 14:37:47 +00:00
Eric Engestrom	d75fbff667	egl: add missing includes Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2019-01-11 14:37:47 +00:00
Iago Toral Quiroga	4b1e436bc9	anv/pipeline_cache: fix incorrect guards for NIR cache Fixes: `f6aa9f7185` 'anv/pipeline_cache: Add support for caching NIR' Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-11 12:45:18 +01:00
Kenneth Graunke	ad9832d17b	blorp: Pass the batch to lookup/upload_shader instead of context This will allow drivers to pin shader buffers if necessary. i965 and anv do not need to do this today, but iris will. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-10 20:52:04 -08:00
Kenneth Graunke	084a1cdbb7	blorp: Add blorp_get_surface_address to the driver interface. Currently, BLORP expects drivers to provide two functions for dealing with buffers: blorp_emit_reloc and blorp_surface_reloc. Both record a relocation and combine the BO address and offset into a full 64-bit address. Traditionally, blorp_surface_reloc has written that combined address to an implicitly-known buffer where surface states are stored. (In contrast, blorp_emit_reloc returns the value.) The upcoming Iris driver stores surface states in multiple buffers, which makes it impossible for blorp_surface_reloc to write the combined address - it only takes an offset, not the actual buffer to write to. This commit adds a third function, blorp_get_surface_address, which combines and returns an address, which is then passed to ISL's surface state fill functions. Softpin-only drivers can return a real address here and skip writing it in blorp_surface_reloc. Relocation-based drivers have options. They can simply return 0 from the new function, and continue writing the address from blorp_surface_reloc. Or, they can return a presumed address from blorp_get_surface_address, and have other relocation processing write the real value later. For now, i965 and anv simply return 0. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-10 20:51:53 -08:00
Ilia Mirkin	2165636e9c	docs: fix gallium screen cap docs Make sure that the next line starts with spaces so that bullets are maintained throughout, add `` around a few more special tokens, and fix SAMPLE_COUNT_TEXTURE -> SAMPLE_COUNT. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2019-01-10 21:44:09 -05:00
Danylo Piliaiev	a2db6b4254	glsl: Make invariant outputs in ES fragment shader not to cause error In all GLSL ES versions output variables in fragment shader are allowed to be invariant. From Section 4.6.1 ("The Invariant Qualifier") GLSL ES 1.00 spec: "Only the following variables may be declared as invariant: ... - Built-in special variables output from the fragment shader." From Section 4.6.1 ("The Invariant Qualifier") GLSL ES 3.00 spec: "Only variables output from a shader can be candidates for invariance." Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107842	2019-01-11 13:01:11 +11:00
Jason Ekstrand	eb4b1477dc	anv/pipeline: Cache the pre-lowered NIR This adds a second level of caching for the pre-lowered NIR that's only based off of the shader module, entrypoint and specialization constants. This is enough for spirv_to_nir as well as our first round of lowering and optimization. Caching at this level should allow for faster shader recompiles due to state changes. The NIR caching does not get serialized to disk via either the VkPipelineCache serialization mechanism or the transparent on-disk cache. We could but it's usually not that expensive to fall back to SPIR-V for the odd cache miss especially if it only happens once for several misses and it simplifies the cache. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-10 19:15:27 -06:00
Jason Ekstrand	f6aa9f7185	anv/pipeline_cache: Add support for caching NIR Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-10 19:15:27 -06:00
Jason Ekstrand	8dfda5ebbe	anv/pipeline: Hash shader modules and spec constants separately The stuff hashed by anv_pipeline_hash_shader is exactly the inputs to anv_shader_compile_to_nir so it can be used for NIR caching. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-10 19:15:27 -06:00
Jason Ekstrand	b90e55a5d5	compiler/types: Serialize/deserialize subpass input types correctly They have glsl_sampler_dim enum values of 8 and 9 which don't work when you & them with 0x7. Fortunately, we have plenty of bits. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-10 19:15:27 -06:00
Jason Ekstrand	73ddfbeb85	anv/pipeline: Move wpos and input attachment lowering to lower_nir This lets us make anv_pipeline_compile_to_nir take a device instead of a pipeline. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-10 19:15:27 -06:00
Matt Turner	32e266a9a5	i965: Compile fp64 funcs only if we do not have 64-bit hardware support Brown bag fix...	2019-01-10 15:22:17 -08:00
Jason Ekstrand	8ea8727a87	anv/pipeline: Constant fold after apply_pipeline_layout Thanks to the new NIR load_descriptor intrinsic added by the UBO/SSBO lowering series, we weren't getting UBO pushing because the UBO range detection pass couldn't see the constants it needed. This fixes that problem with a quick round of constant folding. Because we're folding we no longer need to go out of our way to generate constants when we lower the vulkan_resource_index intrinsic and we can make it a bit simpler. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-01-10 20:34:00 +00:00
Rob Clark	031e94dc72	freedreno/a6xx: fix 3d+tiled layout The last round of fixing 3d layer+level layout skipped the tiled case, since tiled texture support was not in place yet. This finishes the job. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-10 14:21:39 -05:00
Rob Clark	c92c18c70c	freedreno/a6xx: move tile_mode to sampler-view CSO This is known when the CSO is created, so no need to patch it in later. Also, for smaller textures where the first level is small enough to be linear, it seems like we should set linear tile mode. See: dEQP-GLES3.functional.texture.format.unsized.rgb_unsigned_byte_3d_pot Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-10 14:21:39 -05:00
Rob Clark	eb625d30b7	freedreno/a6xx: separate stencil restore/resolve fixes Previously we'd use format/etc from the primary (z32) buffer for the stencil (s8), due to confusion about rsc vs psurf. Rework this to drop extra arg and push down handling of separate stencil case (and make sure we take the fmt from the right place). This doesn't completely fix separate-stencil, but at least it avoids the GPU scribbling over random other cmdstream buffers and causing a bunch of bogus fails in dEQP. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-10 14:21:39 -05:00
Rob Clark	04aff7e42b	freedreno: make cmdstream bo's read-only to GPU If nothing else, this will make problems with cmdstream getting blit over with pixels easier to track down (ie. faults when it first happens rather than strange failures later from corrupted cmdstream when a stateobj is later reused). (NOTE this somewhat depends on the kernel supporting the flag, and the iommu implementation. But the worst case is just that the cmdstream ends up writeable as before.) Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-10 14:21:39 -05:00
Guido Günther	286de96af8	etnaviv: fix typo in cflush_all description Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-01-10 18:46:10 +01:00
Eric Engestrom	53fbde4df3	radv: remove a few more unnecessary KHR suffixes Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)	2019-01-10 16:53:44 +00:00
Rhys Perry	0210243923	nir: fix copy-paste error in nir_lower_constant_initializers Fixes: `393b59e077` ('nir: Rework nir_lower_constant_initializers() to handle functions') Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-10 10:51:52 -06:00
Andres Gomez	6c3164cd08	docs: complete the calendar and release schedule documentation As suggested by Emil Velikov. Cc: Dylan Baker <dylan.c.baker@intel.com> Cc: Juan A. Suarez <jasuarez@igalia.com> Cc: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-01-10 15:53:02 +02:00
Andres Gomez	428164d87f	glsl/linker: specify proper direction in location aliasing error The check for location aliasing was always assuming output variables but this validation is also called for input variables. Fixes: `e2abb75b0e` ("glsl/linker: validate explicit locations for SSO programs") Cc: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-10 15:51:57 +02:00
Andres Gomez	e2e03f84f9	editorconfig: Add max_line_length property The property is supported by most editors, but not all: https://github.com/editorconfig/editorconfig/wiki/EditorConfig-Properties#max_line_length Cc: Eric Engestrom <eric@engestrom.ch> Cc: Eric Anholt <eric@anholt.net> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-01-10 15:50:34 +02:00
Tapani Pälli	864cc419eb	intel/isl: move tiled_memcpy static libs from i965 to isl Patch moves intel_tiled_memcpy[_sse41] libraries to isl, renames some functions and types and makes the required build system changes for meson, automake and Android. No functional changes are introduced. v2: code cleanups, move isl_get_memcpy_type to i965 (Jason) v3: move isl_mem_copy_fn to priv header, cleanups (Jason, Dylan) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-10 08:02:30 +02:00
Matt Turner	406f603b34	i965: Enable 64-bit GLSL extensions Now that we have software implementations of ARB_gpu_shader_int64 and ARB_gpu_shader_fp64 we can unconditionally enable these extensions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-09 16:42:41 -08:00
Matt Turner	613ac3aaa2	i965: Compile fp64 software routines and lower double-ops Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-09 16:42:41 -08:00
Matt Turner	18b4e87370	intel/compiler: Heap-allocate temporary storage Shaders containing software implementations of double-precision operations can be very large such that we cannot stack-allocate an array of grf_count*16. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-09 16:42:41 -08:00
Matt Turner	622d429128	intel/compiler: Expand size of the 'nr' field Shaders containing software implementations of double-precision operations can be very large such that we have more than 2^16 virtual registers during optimization. Move the 'nr' field to the union containing the immediate storage and expand it to 32-bits. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-09 16:42:41 -08:00
Matt Turner	7e4e9da90d	intel/compiler: Prevent warnings in the following patch The next patch replaces an unsigned bitfield with a plain unsigned, which triggers gcc to begin warning on signed/unsigned comparisons. Keeping this patch separate from the actual move allows bisectability and generates no additional warnings temporarily. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-09 16:42:41 -08:00
Matt Turner	2b801b6668	intel/compiler: Rearrange code to avoid future problems A follow on commit will move nr to the same union as the immediate data, so we should assert these invariants before we overwrite the nr field. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-09 16:42:41 -08:00
Matt Turner	3b967e1724	intel/compiler: Avoid false positive assertions A follow on patch will move the 'nr' field to the union containing the immediate field, so prepare by checking that we're only testing these assertions if the .file is correct. The assertions with != ARF were kind of silly to begin with because the <128 check is specifically only for things in the GRF. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-09 16:42:41 -08:00
Matt Turner	8534742404	intel/compiler: Split 64-bit MOV-indirects if needed Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-09 16:42:40 -08:00
Matt Turner	e76772af6c	intel/compiler: Lower 64-bit MOV/SEL operations	2019-01-09 16:42:40 -08:00
Matt Turner	2623653126	nir: Unset metadata debug bit if no progress made NIR metadata validation verifies that the debug bit was unset (by a call to nir_metadata_preserve) if a NIR optimization pass made progress on the shader. With the expectation that the NIR shader consists of only a single main function, it has been safe to call nir_metadata_preserve() iff progress was made. However, most optimization passes calculate progress per-function and then return the union of those calculations. In the case that an optimization pass makes progress only on a subset of the functions in the shader metadata validation will detect the debug bit is still set on any unchanged functions resulting in a failed assertion. This patch offers a quick solution (short of a larger scale refactoring which I do not wish to undertake as part of this series) that simply unsets the debug bit on unchanged functions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-09 16:42:40 -08:00
Matt Turner	e633fae5cb	nir: Add lowering support for 64-bit operations to software Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-09 16:42:40 -08:00
Matt Turner	fe2cbcf3ee	nir: Create nir_builder in nir_lower_doubles_impl() We're going to use it more in a future patch, and this avoids a lot of gross code. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-09 16:42:40 -08:00
Matt Turner	ecb115eb3f	nir: Add and set info::uses_64bit Will be used to communicate that a shader uses 64-bit operations to the concerned lowering passes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-09 16:42:40 -08:00
Matt Turner	41f3e9e5f5	nir: Implement lowering of 64-bit shift operations Reviewed-by: Elie Tournier <tournier.elie@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-09 16:42:40 -08:00
Matt Turner	62d55f1281	nir: Wire up int64 lowering functions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-09 16:42:40 -08:00
Jason Ekstrand	adab27e741	nir: Add some more int64 lowering helpers [mattst88]: Found in an old branch of Jason's. Jason implemented: inot, iand, ior, iadd, isub, ineg, iabs, compare, imin, imax, umin, umax Matt implemented: ixor, bcsel, b2i, i2b, i2i8, i2i16, i2i32, i2i64, u2u8, u2u16, u2u32, u2u64, and fixed ilt Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2019-01-09 16:42:40 -08:00
Matt Turner	dde73e646f	nir: Tag entrypoint for easy recognition by nir_shader_get_entrypoint() We're going to have multiple functions, so nir_shader_get_entrypoint() needs to do something a little smarter. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-09 16:42:40 -08:00
Matt Turner	393b59e077	nir: Rework nir_lower_constant_initializers() to handle functions Previously it assumed that only a single function (the entrypoint) existed and attempted to lower constant initializers of shader outputs for each function, for instance.	2019-01-09 16:42:40 -08:00
Sagar Ghuge	f998ce4111	glsl: Add "built-in" functions to do fp32_to_int64(fp32) Reviewed-by: Elie Tournier <tournier.elie@gmail.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-01-09 16:42:40 -08:00
Sagar Ghuge	2632c12477	glsl: Add "built-in" functions to do fp32_to_uint64(fp32) Reviewed-by: Elie Tournier <tournier.elie@gmail.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-01-09 16:42:40 -08:00
Sagar Ghuge	876a4b85fe	glsl: Add "built-in" functions to do fp64_to_int64(fp64) Reviewed-by: Elie Tournier <tournier.elie@gmail.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-01-09 16:42:40 -08:00
Sagar Ghuge	21e9bb2b3f	glsl: Add utility function to round and pack int64_t value Reviewed-by: Elie Tournier <tournier.elie@gmail.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-01-09 16:42:40 -08:00
Sagar Ghuge	5a674fd789	glsl: Add "built-in" functions to do fp64_to_uint64(fp64) Reviewed-by: Elie Tournier <tournier.elie@gmail.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-01-09 16:42:40 -08:00
Sagar Ghuge	5a87441807	glsl: Add utility function to round and pack uint64_t value Reviewed-by: Elie Tournier <tournier.elie@gmail.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-01-09 16:42:40 -08:00
Sagar Ghuge	c9d333a6b7	glsl: Add "built-in" functions to do int64_to_fp32(int64_t) Reviewed-by: Elie Tournier <tournier.elie@gmail.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-01-09 16:42:40 -08:00
Sagar Ghuge	d5cf6e92b4	glsl: Add "built-in" functions to do uint64_to_fp32(uint64_t) Reviewed-by: Elie Tournier <tournier.elie@gmail.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-01-09 16:42:40 -08:00
Sagar Ghuge	b830efb191	glsl: Add "built-in" functions to do int64_to_fp64(int64_t) Reviewed-by: Elie Tournier <tournier.elie@gmail.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-01-09 16:42:40 -08:00
Sagar Ghuge	7c5b982b89	glsl: Add "built-in" functions to do uint64_to_fp64(uint64_t) Reviewed-by: Elie Tournier <tournier.elie@gmail.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-01-09 16:42:40 -08:00
Matt Turner	15757bc80b	glsl: Add "built-in" functions to convert bool to double And vice versa. Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2019-01-09 16:42:40 -08:00
Matt Turner	e213f3871f	glsl: Add "built-in" functions to do ffract(fp64) Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2019-01-09 16:42:40 -08:00
Matt Turner	5c9a659f50	glsl: Add "built-in" function to do ffloor(fp64) Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2019-01-09 16:42:40 -08:00
Matt Turner	83762afa66	glsl: Add "built-in" functions to do fmin/fmax(fp64) Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2019-01-09 16:42:40 -08:00
Matt Turner	92ac2169fb	glsl: Add "built-in" functions to do ffma(fp64) Definitely not actually a fused-multiply add. Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2019-01-09 16:42:40 -08:00
Elie Tournier	3db81b5d9f	glsl: Add "built-in" functions to do round(fp64) Signed-off-by: Elie Tournier <elie.tournier@collabora.com>	2019-01-09 16:42:40 -08:00
Elie Tournier	48891ab441	glsl: Add "built-in" functions to do trunc(fp64) v2: use mix. Signed-off-by: Elie Tournier <elie.tournier@collabora.com>	2019-01-09 16:42:40 -08:00
Elie Tournier	2119094b1d	glsl: Add "built-in" functions to do sqrt(fp64) Signed-off-by: Elie Tournier <elie.tournier@collabora.com>	2019-01-09 16:42:40 -08:00
Elie Tournier	cad58fc5e7	glsl: Add "built-in" functions to do fp32_to_fp64(fp32) Signed-off-by: Elie Tournier <elie.tournier@collabora.com>	2019-01-09 16:42:40 -08:00
Elie Tournier	407bd1bbf9	glsl: Add "built-in" functions to do fp64_to_fp32(fp64) Signed-off-by: Elie Tournier <elie.tournier@collabora.com>	2019-01-09 16:42:40 -08:00
Elie Tournier	f499942b31	glsl: Add "built-in" functions to do int_to_fp64(int) v2: use mix Signed-off-by: Elie Tournier <elie.tournier@collabora.com>	2019-01-09 16:42:40 -08:00
Elie Tournier	773190f281	glsl: Add "built-in" functions to do fp64_to_int(fp64) v2: use mix Signed-off-by: Elie Tournier <elie.tournier@collabora.com>	2019-01-09 16:42:40 -08:00
Elie Tournier	cbf090b809	glsl: Add "built-in" functions to do uint_to_fp64(uint) Signed-off-by: Elie Tournier <elie.tournier@collabora.com>	2019-01-09 16:42:40 -08:00
Elie Tournier	a3551ee61f	glsl: Add "built-in" functions to do fp64_to_uint(fp64) Signed-off-by: Elie Tournier <elie.tournier@collabora.com>	2019-01-09 16:42:40 -08:00
Elie Tournier	4a93401546	glsl: Add "built-in" functions to do mul(fp64, fp64) v2: use mix Signed-off-by: Elie Tournier <elie.tournier@collabora.com>	2019-01-09 16:42:40 -08:00
Elie Tournier	f111d72596	glsl: Add "built-in" functions to do add(fp64, fp64) v2: use mix and findMSB to optimise. v3: [Sagar] Fix zFrac0 == 0u case in __normalizeRoundAndPackFloat64 Signed-off-by: Elie Tournier <elie.tournier@collabora.com>	2019-01-09 16:42:40 -08:00
Elie Tournier	c036fc97a2	glsl: Add "built-in" functions to do lt(fp64, fp64) Signed-off-by: Elie Tournier <elie.tournier@collabora.com>	2019-01-09 16:42:40 -08:00
Elie Tournier	3e4d5ea7b8	glsl: Add utility function to extract 64-bit sign Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-09 16:42:40 -08:00
Elie Tournier	ec6e823a99	glsl: Add "built-in" functions to do eq/ne(fp64, fp64)	2019-01-09 16:42:40 -08:00
Elie Tournier	c802cdde9d	glsl: Add "built-in" function to do sign(fp64) v2: use mix. Signed-off-by: Elie Tournier <elie.tournier@collabora.com>	2019-01-09 16:42:40 -08:00
Elie Tournier	eac66f0248	glsl: Add "built-in" functions to do neg(fp64) v2: use mix. Signed-off-by: Elie Tournier <elie.tournier@collabora.com>	2019-01-09 16:42:40 -08:00
Elie Tournier	0428951b9d	glsl: Add "built-in" function to do abs(fp64) Signed-off-by: Elie Tournier <elie.tournier@collabora.com>	2019-01-09 16:42:40 -08:00
Matt Turner	b63a1f8e40	glsl: Create file to contain software fp64 functions The following patches will add implementations of various double-precision operations to this file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-09 16:42:40 -08:00
Ian Romanick	412472da5c	glsl: Add utility to convert text files to C strings Will be used to convert the .glsl source file containing software fp64 routines to a .h file that can be included while building the compiler. This commit contains two squashed together: the first from Ian adding the utility (with the existing title), and the second from Dylan making the code both python2 and python3 compatible. This is somewhat modeled after the xxd utility that comes with Vim. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> xxd.py: Make python2 and 3 compatible This makes use of unicode_literals, so that undecorated strings are considered text (python2 unicode, python3 str) and not bytes in python2 and text in python3. It makes use of io.open, which provides python2 with python3's open behavior (it's an alias in python3), in particular support for the 't' and 'b' option. Finally, it decorates all of the string literals with the 'b' prefix, so that python interprets them as bytes. I've removed the stdin and stdout options, as python2 always requires these to be bytes, but python3 always treats them as text (there is a way to get at the underlying bytes buffer, but that's even more complexity), and makes the input files required arguments. In the meson we use the '@INPUT@' shorthand instead of listing each input, as meson will expand that to [prog_python, '@INPUT0@', @INPUT1@, ..., @OUTPUT@, ...]	2019-01-09 16:42:40 -08:00
Timothy Arceri	76c27e47b9	glsl: Copy function out to temp if we don't directly ref a variable Otherwise we can end up with IR that looks like this: ( (declare (temporary ) vec4 f@8) (assign (xyzw) (var_ref f@8) (var_ref f) ) (call f16 ((swiz y (var_ref f@8) ))) (assign (xyzw) (var_ref f) (var_ref f@8) ) )) When we really need: (declare (temporary ) float inout_tmp) (assign (x) (var_ref inout_tmp) (swiz y (var_ref f) )) (call f16 ((var_ref inout_tmp) )) (assign (y) (var_ref f) (swiz y (swiz xxxx (var_ref inout_tmp) ))) (declare (temporary ) void void_var) The GLSL IR function inlining code seemed to produce correct code even without this but we need the correct IR for GLSL IR -> NIR to be able to understand what's going on. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-09 16:42:40 -08:00
Matt Turner	63f6d7afd6	glsl: Add function support to glsl_to_nir Based on a patch from Tim Arceri, but I had to substantially rewrite it as a result of the NIR derefs rework. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-09 16:42:40 -08:00
Francisco Jerez	230a8a541d	intel/fs: Remove FS_OPCODE_UNPACK_HALF_2x16_SPLIT opcodes. These are broken on a future platform, but it turns out we don't need to fix them, since they're just type-converting moves with strided source. Kill them. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-09 12:03:09 -08:00
Francisco Jerez	cbea91eb57	intel/fs: Remove nasty open-coded CHV/BXT 64-bit workarounds. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-09 12:03:09 -08:00
Francisco Jerez	2c99c7a56c	intel/fs: Remove existing lower_conversions pass. It's redundant with the functionality provided by lower_regioning now. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-09 12:03:09 -08:00
Francisco Jerez	efa4e4bc5f	intel/fs: Introduce regioning lowering pass. This legalization pass is meant to handle situations where the source or destination regioning controls of an instruction are unsupported by the hardware and need to be lowered away into separate instructions. This should be more reliable and future-proof than the current approach of handling CHV/BXT restrictions manually all over the visitor. The same mechanism is leveraged to lower unsupported type conversions easily, which obsoletes the lower_conversions pass. v2: Give conditional modifiers the same treatment as predicates for SEL instructions in lower_dst_modifiers() (Iago). Special-case a couple of other instructions with inconsistent conditional mod semantics in lower_dst_modifiers() (Curro). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-09 12:03:09 -08:00
Francisco Jerez	b94519971a	intel/fs: Constify fs_inst::can_do_source_mods(). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-09 12:03:09 -08:00
Francisco Jerez	c301f447ea	intel/fs: Respect CHV/BXT regioning restrictions in copy propagation pass. Currently the visitor attempts to enforce the regioning restrictions that apply to double-precision instructions on CHV/BXT at NIR-to-i965 translation time. It is possible though for the copy propagation pass to violate this restriction if a strided move is propagated into one of the affected instructions. I've only reproduced this issue on a future platform but it could affect CHV/BXT too under the right conditions. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-09 12:03:08 -08:00
Francisco Jerez	464e79144f	intel/eu/gen7: Fix brw_MOV() with DF destination and strided source. I triggered this bug while prototyping code for a future platform on IVB. Could be a problem today though if a strided move is copy-propagated into a type-converting move with DF destination. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-09 12:03:08 -08:00
Francisco Jerez	bc781a0323	intel/fs: Fix bug in lower_simd_width while splitting an instruction which was already split. This seems to be a problem in combination with the lower_regioning pass introduced by a future commit, which can modify a SIMD-split instruction causing its execution size to become illegal again. A subsequent call to lower_simd_width() would hit this bug on a future platform. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-09 12:03:08 -08:00
Francisco Jerez	812ede088f	intel/fs: Implement quad swizzles on ICL+. Align16 is no longer a thing, so a new implementation is provided using Align1 instead. Not all possible swizzles can be represented as a single Align1 region, but some fast paths are provided for frequently used swizzles that can be represented efficiently in Align1 mode. Fixes ~90 subgroup quad swap Vulkan CTS tests. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-09 12:03:08 -08:00
Francisco Jerez	c5f9c0009d	intel/fs: Handle source modifiers in lower_integer_multiplication(). lower_integer_multiplication() implements 32x32-bit multiplication on some platforms by bit-casting one of the 32-bit sources into two 16-bit unsigned integer portions. This can give incorrect results if the original instruction specified a source modifier. Fix it by emitting an additional MOV instruction implementing the source modifiers where necessary. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-01-09 12:03:08 -08:00
Andrii Simiklit	0206ffc28d	anv/pipeline: remove unnecessary null-pointer check It looks impossible for the 'last' variable to be null, because at least get_vs_prog_data shouldn't return a null pointer. So this check has been unnecessary since commit `99d497c5b6` "anv/pipeline: Replace get_fs_input_map with ..." This small issue was found by cppcheck. Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-09 12:29:12 -06:00
Indrajit Das	d2c170eb35	st/va: Return correct status from vlVaQuerySurfaceStatus This ensures that during encoding, applications can get the correct status of the surface before submitting more operations on the same. Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com>	2019-01-09 11:34:22 -05:00
Roland Scheidegger	0c226d40ef	Revert "llvmpipe: Always return some fence in flush (v2)" This reverts commit `f6a6da8131`. With this commit we see massive amounts of asserts triggering in lp_fence_wait(), assert(f->issued), for instance with libgl_xlib state tracker and piglit. Not entirely sure if the assert could just be removed.	2019-01-09 17:28:53 +01:00
Marek Olšák	e986c1ca1d	st/mesa: don't leak pipe_surface if pipe_context is not current We have found some pipe_surface leaks internally. This is the same code as surface_destroy in radeonsi. Ideally, surface_destroy would be in pipe_screen. Cc: 18.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2019-01-09 11:08:44 -05:00
Marek Olšák	fd82a1d1d6	st/mesa: don't reference pipe_surface locally in PBO code Reviewed-by: Brian Paul <brianp@vmware.com>	2019-01-09 11:08:44 -05:00
Marek Olšák	5da442338b	st/mesa: unify window-system renderbuffer initialization Reviewed-by: Brian Paul <brianp@vmware.com>	2019-01-09 11:08:44 -05:00
Mario Kleiner	5e30e54e05	radeonsi: Fix use of 1- or 2- component GL_DOUBLE vbo's. With Mesa 18.1, commit `be973ed21f`, si_llvm_load_input_vs() changed the number of source 32-bit wide dword components used for fetching vertex attributes into the vertex shader from a constant 4 to a variable num_channels number, depending on input data format, with some special case handling for input data formats like 64-Bit doubles. In the case of a GL_DOUBLE input data format with one or two components though, e.g., submitted via ... a) glTexCoordPointer(1, GL_DOUBLE, 0, buffer); b) glTexCoordPointer(2, GL_DOUBLE, 0, buffer); ... the input format would be SI_FIX_FETCH_RG_64_FLOAT, but no special case handling was implemented for that case, so in the default path the number of 32-bit dwords would be set to the number of float input components derived from info->input_usage_mask. This ends with corrupted input to the vertex shader, because fetching a 64-bit double from the vbo requires fetching two 32-bit dwords instead of 1, and fetching a two double input requires 4 dword fetches instead of 2, so in these cases the vertex shader receives incomplete/truncated input data: a) float v = gl_MultiTexCoord0.x; -> v.x is corrupted. b) vec2 v = gl_MultiTexCoord0.xy; -> v.x is assigned correctly, but v.y is corrupted. This happens with the standard TGSI IR compiled shaders. Under NIR with R600_DEBUG=nir, we got correct behavior because the current radeonsi nir code always assigns info->input_usage_mask = TGSI_WRITEMASK_XYZW, thereby always fetches 4 dwords regardless of what the shader actually needs. Fix this by properly assigning 2 or 4 dword fetches for one or two component GL_DOUBLE input. Fixes: `be973ed21f` ("radeonsi: load the right number of components for VS inputs and TBOs") Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: mesa-stable@lists.freedesktop.org Cc: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-01-09 11:08:44 -05:00
Rhys Perry	ee8488ea3b	ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsics Fixes artifacts in World of Warcraft when Multi-sample Alpha-Test is enabled with DXVK. It also fixes artifacts with Fallout 4's god rays with DXVK. Various piglit interpolateAt*() tests under NIR are also fixed. v2: formatting fix update commit message to include Fallout 4 and the Fixes tag Fixes: `f4e499ec79` ('radv: add initial non-conformant radv vulkan driver') Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106595 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>	2019-01-09 14:57:07 +00:00
Samuel Pitoiset	b8c4f523b4	radv: skip draws with instance_count == 0 Loosely based on RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-09 14:22:38 +01:00
Samuel Pitoiset	a2b5cc3c39	radv: enable variable pointers The Vulkan spec 1.1.97 says: "variablePointers specifies whether the implementation supports the SPIR-V VariablePointers capability. When this feature is not enabled, shader modules must not declare the VariablePointers capability." As the SPIR-V feature is enabled, we should turn on the extension feature as well. All dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.* pass with the Khronos internal repo. Note that a bunch of them fail with the public repo, but it's expected as they violate the specification. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-09 12:32:18 +01:00
Samuel Pitoiset	d58b11e709	radv: get rid of bunch of KHR suffixes Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-09 12:26:48 +01:00
Maya Rashish	a2ddb710fd	radeon: fix printf format specifier. From glibc printf(3): Z A nonstandard synonym for z that predates the appearance of z. Do not use in new code. Z may not exist on non-glibc systems. Prefer the standard symbol. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-01-09 14:15:06 +11:00
Tomasz Figa	f6a6da8131	llvmpipe: Always return some fence in flush (v2) If there is no last fence, due to no rendering happening yet, just create a new signaled fence and return it, to match the expectations of the EGL sync fence API. Fixes random "Could not create sync fence 0x3003" assertion failures from Skia on Android, coming from the following code: https://android.googlesource.com/platform/frameworks/base/+/master/libs/hwui/pipeline/skia/SkiaOpenGLPipeline.cpp#427 Reproducible especially with thread count >= 4. One could make the driver always keep the reference to the last fence, but: - the driver seems to explicitly destroy the fence whenever a rendering pass completes and changing that would require a significant functional change to the code. (Specifically, in lp_scene_end_rasterization().) - it still wouldn't solve the problem of an EGL sync fence being created and waited on without any rendering happening at all, which is also likely to happen with Android code pointed to in the commit. Therefore, the simple approach of always creating a fence is taken, similarly to other drivers, such as radeonsi. Tested with piglit llvmpipe suite with no regressions and following tests fixed: egl_khr_fence_sync conformance eglclientwaitsynckhr_flag_sync_flush eglclientwaitsynckhr_nonzero_timeout eglclientwaitsynckhr_zero_timeout eglcreatesynckhr_default_attributes eglgetsyncattribkhr_invalid_attrib eglgetsyncattribkhr_sync_status v2: - remove the useless lp_fence_reference() dance (Nicolai), - explain why creating the dummy fence is the right approach. Signed-off-by: Tomasz Figa <tfiga@chromium.org>	2019-01-09 02:06:13 +01:00
Eric Anholt	700aeaf9c8	glsl: Fix buffer overflow with an atomic buffer binding out of range. The binding is checked against the limits later in the function, so we need to make sure we don't overflow before the check here. Fixes this valgrind warning (and sometimes segfault): ==1460== Invalid write of size 4 ==1460== at 0x74C98DD: ast_declarator_list::hir(exec_list, _mesa_glsl_parse_state) (ast_to_hir.cpp:4943) ==1460== by 0x74C054F: _mesa_ast_to_hir(exec_list, _mesa_glsl_parse_state) (ast_to_hir.cpp:159) ==1460== by 0x7435C12: _mesa_glsl_compile_shader (glsl_parser_extras.cpp:2130) in dEQP-GLES31.functional.debug.negative_coverage.get_error.compute. exceed_atomic_counters_limit Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-01-08 15:44:58 -08:00
Eric Anholt	211b826790	nir: Make nir_deref_instr_build/get_const_offset actually use size_align. I think this was a copy-and-paste mistake -- nir_opt_large_constants was passing in glsl_get_natural_size_align_bytes() given brw_nir.c's arguments to the opt pass. I wanted to reuse this function for handling constant offsets of arrays of images in V3D. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-01-08 15:40:53 -08:00
Danylo Piliaiev	9f29d90327	glsl/linker: Fix unmatched TCS outputs being reduced to local variable Always match TCS outputs since they are shared by all invocations within the patch and should not be converted to local variables. This is one of the issues found in Downward. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104297	2019-01-09 10:31:13 +11:00
Eric Anholt	db3b6b6bca	v3d: Enable GL_ARB_texture_gather on V3D 4.x. This is part of GLES 3.1, and with the NIR lowering we're now passing the GLES31 testcases.	2019-01-08 13:03:44 -08:00
Eric Anholt	6051c11d17	nir: Add nir_lower_tex support for Broadcom's swizzled TG4 results. V3D returns the texels in a different order in the resulting vec4 from what GLSL wants, so we need to put in a swizzle. Fixes dEQP-GLES31.functional.texture.gather.basic.2d.rgba8.base_level.level_1 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-08 13:03:41 -08:00
Bas Nieuwenhuizen	3fcec4a550	freedreno: Move register constant files to src/freedreno. This way they can be shared. Build tested with meson, but not too sure on the autotools stuff though. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Rob Clark <robdclark@gmail.com>	2019-01-08 21:46:14 +01:00
Caio Marcelo de Oliveira Filho	baabfb1959	nir: fix warning in nir_lower_io.c Initialize the variable with NULL. Fixes the following In file included from ../src/compiler/nir/nir_lower_io.c:34: ../src/compiler/nir/nir_lower_io.c: In function ‘nir_lower_explicit_io’: ../src/compiler/nir/nir.h:668:11: warning: ‘addr’ may be used uninitialized in this function [-Wmaybe-uninitialized] return src; ^~~ ../src/compiler/nir/nir_lower_io.c:735:17: note: ‘addr’ was declared here nir_ssa_def *addr; ^~~~ v2: Avoid using a 'default' case so we get help from the compiler when new deref types are added. (Lionel) Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-08 12:29:56 -08:00
Chia-I Wu	3cb65cf8aa	freedreno/drm: sync uapi again "pad" was missing in Mesa's msm_drm.h. sizeof(drm_msm_gem_info) remains the same, but now the compiler initializes the field to zero. Buffer allocation results in EINVAL without this for me. Cc: Rob Clark <robdclark@gmail.com> Cc: Kristian Høgsberg <hoegsberg@gmail.com> Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>	2019-01-08 19:55:28 +00:00
Chia-I Wu	6eeb1fe491	meson: fix EGL/X11 build without GLX dep_xcb and others were not set under this configuration. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-01-08 10:58:48 -08:00
Eric Engestrom	b38a48a569	wsi: drop unneeded KHR suffix Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-08 18:48:03 +00:00
Eric Engestrom	4f5a526789	anv: drop unneeded KHR suffix Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-08 18:47:56 +00:00
Karol Herbst	d0c6ef2793	nir: rename global/local to private/function memory The naming is a bit confusing no matter how you look at it. Within SPIR-V "global" memory is memory accessible from all threads. glsl "global" memory normally refers to shader thread private memory declared at global scope. As we already use "shared" for memory shared across all threads of a work group, the solution everybody could be happy with is to rename "global" to "private" and use "global" later for memory usually stored within system accessible memory (be it VRAM or system RAM if keeping SVM in mind). glsl "local" memory is memory only accessible within a function, while SPIR-V "local" memory is memory accessible within the same workgroup. v2: rename local to function as well v3: rename vtn_variable_mode_local as well Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-08 18:51:46 +01:00
Dylan Baker	401dca1c73	autotools: Remove tegra vdpau driver This has never functioned and probably won't ever function, due to the way gallium media state trackers are architected and the tegra video decoder is architected. Cc: Thierry Reding <thierry.reding@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Fixes: `1755f608f5` ("tegra: Initial support")	2019-01-08 09:42:56 -08:00
Pierre Moreau	ba55cb2bcd	clover/meson: Ignore 'svn' suffix when computing CLANG_RESOURCE_DIR The version exported by LLVM in its CMake configuration files can include the “svn” suffix when building a development version (for example “8.0.0svn”). However the exported clang headers are still found under “lib/clang/8.0.0/”, without the “svn” suffix. Meson takes care of removing the “svn” suffix from the version when using the dependency’s `version()` method. This processing is already performed in “configure.ac” when using autotools. Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-01-08 08:53:38 -08:00
Lionel Landwerlin	add5a2ec92	anv: flush fast clear colors into compressed surfaces In the following scenario : 1. Create image format R8G8B8A8_UNORM 2. Create image view format R8G8B8A8_SRGB 3. Clear the view through a sub pass to a particular color 4. Barrier on the image from color attachment to source transfer 5. Copy the image into a linear buffer to check the content The step 4 resolving the clear color is unaware of the SRGB format of the view, because the blorp resolve operations operate on images the color associated with the resolve will not operate on SRGB format but UNORM. Leading to the wrong color being written into surfaces. This change forces a clear color resolve at the end of the render pass so following resolves won't have to deal with the clear color with a format that doesn't match the image's format. On gfxbench vulkan_5_normal 1280x720, this appears to cost us ~0.5fps, from 49.316 down to 48.949. v2: Only fast clear resolve when image & view have different formats (Lionel) v3: Update warning (Jason) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108911 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org	2019-01-08 16:37:00 +00:00
Lionel Landwerlin	366eb656ac	anv: explicitly specify format for blorp ccs/mcs op Resolve operations can happen when dealing with view (begin/end subpasses) in which case the view's format needs to apply, not the image's format. v2: Relayout arguments of a ccs_op() call (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108911 Cc: mesa-stable@lists.freedesktop.org	2019-01-08 16:36:56 +00:00
Tapani Pälli	c292414765	dri3: initialize adaptive_sync as false before configQueryb Fixes following errors from valgrind output: ==23388== Conditional jump or move depends on uninitialised value(s) ==23388== at 0x48B4924: loader_dri3_drawable_init (loader_dri3_helper.c:381) ==23388== by 0x48A97D2: dri3_create_drawable (dri3_glx.c:386) ==23388== by 0x489E190: driFetchDrawable (dri_common.c:369) ==23388== by 0x48A9187: dri3_bind_context (dri3_glx.c:195) ==23388== by 0x488B75C: MakeContextCurrent (glxcurrent.c:220) ==23388== by 0x488B8DB: glXMakeCurrent (glxcurrent.c:267) ==23388== by 0x10A987: ??? (in /usr/bin/glxgears) ==23388== by 0x4BEB412: (below main) (in /usr/lib64/libc-2.28.so) ==23388== ==23388== Conditional jump or move depends on uninitialised value(s) ==23388== at 0x48B5A40: loader_dri3_swap_buffers_msc (loader_dri3_helper.c:923) ==23388== by 0x48A9B7E: dri3_swap_buffers (dri3_glx.c:587) ==23388== by 0x4887A81: glXSwapBuffers (glxcmds.c:857) ==23388== by 0x10ADED: ??? (in /usr/bin/glxgears) ==23388== by 0x4BEB412: (below main) (in /usr/lib64/libc-2.28.so) Fixes: `2e12fe425f` "loader/dri3: Enable adaptive_sync via _VARIABLE_REFRESH property" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>	2019-01-08 08:15:07 +02:00
Dave Airlie	4298a85ae8	virgl: use primconvert provoking vertex properly This stores the raster state and calls the correct primconvert interface using the currently bound raster state. Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-01-08 12:06:41 +10:00
Jason Ekstrand	754eff07d2	anv: Sort properties and features switch statements Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-07 18:41:15 -06:00
Jason Ekstrand	05d72d6d48	spirv: Sort supported capabilities Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-07 18:41:15 -06:00
Jason Ekstrand	34af63fa22	anv: Enable the new deref-based UBO/SSBO path Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	63b9aa2e25	spirv: Add support for using derefs for UBO/SSBO access For now, it's hidden behind a cap. Hopefully, we can eventually drop that along with all the manual offset code in spirv_to_nir. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	3a7c5667c8	spirv: Make better use of vtn_pointer_uses_ssa_offset The choice of whether or not we should use block_load/store isn't a choice between external and not so much as a choice between deref instructions and manually calculated offsets. In vtn_pointer_from_ssa, we guard the index+offset case behind vtn_pointer_uses_ssa_offset and then branch out from there. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	adc155a815	spirv: Add explicit pointer types Instead of baking in uvec2 for UBO and SSBO pointers and uint for push constant and shared memory pointers, make it configurable. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	be039cb467	spirv: Choose atomic deref type with pointer_uses_ssa_offset Previously, we hard-coded the rule about workgroup variables and the builder lower_workgroup_access_to_offsets flag. Instead base it on the handy helper we have for exactly this sort of thing. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	5c3cb9c3ce	spirv: Add error checking for Block and BufferBlock decorations Variable pointers being well-defined across the block boundary requires a couple of very specific SPIR-V validation rules. Normally, we'd trust the validator to catch these but since CTS tests have been found in the wild which violate them, we'll carry our own checks. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	e90b738f20	nir/vulkan: Add a descriptor type to vulkan resource intrinsics Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	f393b10b3f	nir/lower_io: Add "explicit" IO lowering This new pass is for lowering explicitly laid out memory coming in from SPIR-V or a similar source. It's quite a bit more complicated than the normal lower_io because we have to be able to handle matrices. The way the stride information is stored for matrices is awkward and dealing with row-major matrices is especially painful. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	52dd43c7ef	nir/validate: Allow array derefs on vectors in more modes Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	013ee5732b	nir/intrinsics: Add access flags to load/store_deref Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	7755171e4c	nir/intrinsics: Allow deref sources to consume anything This commit adds a new num_components value for intrinsic sources of -1 which means that it consumes everything and the number of components effectively isn't validated. This is useful for deref sources which just take the result of the deref and we leave it up to the driver to decide what that size should be. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	d0fe52a456	nir/validate: Allow derefs in phi nodes We added this assert when first moving derefs over to instructions to ensure that deref chains could go all the way back to the variables. Now that we're going to start using derefs for things that we can do variable pointers on such as UBOs and SSBOs, we need to be able to run derefs through phi nodes, selects, and basically anything else. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	7e85480a67	nir/remove_dead_variables: Properly handle deref casts We already detect any incomplete deref chains (where the deref is used for something other than another deref or a load/store) and flag the variable as used thanks to deref_used_for_not_store. All that's left to do is to properly skip casts when cleaning up. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	78d80f7db2	nir/deref: Skip over casts in fixup_deref_modes This pass is used when, for instance, we lazily change the mode of variables rather than replacing the variable with a new one. Since we only do this in cases where we know we have full deref chains, it's ok to just skip them in fixup_deref_modes. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	d8e3edb784	nir/deref: Support casts and ptr_as_array in comparisons The code which constructs deref paths already gives you the path starting at the nearest deref_cast or deref_var. All we need to do for casts is handle the case where the start of the path isn't a deref_var. For ptr_as_array derefs, we just bail if we have any after the divergence point between the two derefs. We may be able to do better in the future but this works for now. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	a1c688517d	nir/opt_deref: Properly optimize ptr_as_array derefs When handling casts, we can't blindly propagate the parent of a cast into a ptr_as_array deref because doing so might lose the stride information from the cast. Instead, before we can propagate into ptr_as_array derefs, we need to check that the cast is a cast of an array deref and that the stride matches. For other types of derefs, we can continue to propagate casts as normal because they don't need the stride. We also add an optimization which can combine a ptr_as_array deref with its parent if it is also an array deref of some form. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	427558a717	nir/validate: Don't allow derefs in if conditions Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	e94a027af8	nir: Add a ptr_as_array deref type These correspond directly to SPIR-V's OpPtrAccessChain. As such, they treat whatever their parent gives them as if it's the first element in some array and dereferences that array. If the parent is, itself, an array deref, then the two indices can just be added together to get the final array deref. However, it can also be used in cases where what you have is a dereference to some random vec2 value somewhere. In this case, we require a cast before the ptr_as_array and use the ptr_stride field in the cast to provide a stride for the ptr_as_array derefs. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	fc9c4f89b8	nir: Move propagation of cast derefs to a new nir_opt_deref pass We're going to want to do more deref optimizations going forward and this gives us a central place to do them. Also, cast propagation will get a bit more complicated with the addition of ptr_as_array derefs. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	bf1a1eed88	spirv: Propagate layout decorations to created glsl_types Instead of just storing the decorations in the vtn_type, propagate them all the way through to the glsl_type. For array strides, this means we need to handle them earlier so we break array stride handling into its own function and explicitly call it for both pointer and array types. Due to type deduplication in the SPIR-V, we may have explicit layout decorations on all sorts of types that don't actually want them. In order to prevent these leaking into unfortunate places in NIR, we explicitly strip them off before creating NIR variables and when casting pointers to non-external memory. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-08 00:38:30 +00:00
Jason Ekstrand	6cebeb4f71	glsl_type: Add support for explicitly laid out matrices and arrays SPIR-V allows for matrix and array types to be decorated with explicit byte stride decorations and matrix types to be decorated row- or column-major. This commit adds support to glsl_type to encode this information. Because this doesn't work nicely with std430 and std140 alignments, we add asserts to ensure that we don't use any of the std430 or std140 layout functions with explicitly laid out types. In SPIR-V, the layout information for matrices is applied to the parent struct member instead of to the matrix type itself. However, this gets rather clumsy when you're walking derefs trying to compute offsets because, the moment you hit a matrix, you have to crawl back the deref chain and find the struct. Instead, we take the same path here as we've taken in spirv_to_nir and put the decorations on the matrix type itself. This also subtly adds support for strided vector types. These don't come up in SPIR-V directly but you can get one as the result of taking a column from a row-major matrix or a row from a column-major matrix. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-08 00:38:29 +00:00
Jason Ekstrand	7f70b3e555	glsl_type: Simplify glsl_channel_type This is C++ so we can just poke at the fields of glsl_type if we wish and calling get_instance is way easier and more reliable than handling each instance separately. While we're at it, we re-arrange the base type labels to match the enum order and add 8-bit type support. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:29 +00:00
Jason Ekstrand	d8a11bfc08	glsl_type: Add a C wrapper to get struct field offsets Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:29 +00:00
Jason Ekstrand	d34f19feba	glsl_type: Drop the glsl_get_array_instance C helper It was added in `bce6f99875` even though it's completely redundant with glsl_array_type(). Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:29 +00:00
Jason Ekstrand	a700a82bda	nir: Distinguish between normal uniforms and UBOs Previously, NIR had a single nir_var_uniform mode used for atomic counters, UBOs, samplers, images, and normal uniforms. This commit splits this into nir_var_uniform and nir_var_ubo where nir_var_uniform is still a bit of a catch-all but the nir_var_ubo is specific to UBOs. While we're at it, we also rename shader_storage to ssbo to follow the convention. We need this so that we can distinguish between normal uniforms and UBO access at the deref level without going all the way back to the variable and seeing if it has an interface type. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:29 +00:00
Jason Ekstrand	c9a4135e14	nir: Allow storing to shader_storage I have no idea how shader_storage made it into the list of banned variable modes for stores but it clearly should be allowed. This only doesn't cause us a problem today because we never actually use derefs on shader_storage variables. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:29 +00:00
Jason Ekstrand	cd93b0a670	nir/validate: Require array indices to match the deref bit size This doesn't currently change anything because array indices are required to be 32 bits and all derefs are also 32 bits. However, we will one day have 64-bit derefs for OpenCL. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:29 +00:00
Jason Ekstrand	abfe674c54	spirv: Handle arbitrary bit sizes for deref array indices We already had code in link_as_ssa to handle bit sizes; we just need to use it. While we're at it we clean up link_as_ssa a bit and add an explicit bit_size parameter in preparation for a day when we have derefs that aren't 32 bit. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:29 +00:00
Jason Ekstrand	bfe31c5e46	nir/builder: Add nir_i2i and nir_u2u helpers which take a bit size Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:29 +00:00
Jason Ekstrand	639c236e74	spirv: Emit NIR deref instructions on-the-fly This simplifies our deref handling by emitting the actual NIR deref instructions on-the-fly instead of building up a deref chain and then emitting them at the last moment. In order for this to work with the parts of the compiler that assume they can chase deref chains, we have to run nir_rematerialize_derefs_in_use_blocks_impl to put the derefs back in the right places. Otherwise, in cases such as loop continues where the SPIR-V blocks are not in the same order as the NIR blocks, we may end up with a deref chain with a parent that does not dominate its child and nir_repair_ssa_impl will insert phis in the deref chain. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:29 +00:00
Jason Ekstrand	c59f07684c	spirv: Sign-extend array indices The SPIR-V spec was recently updated to clarify that array indices are treated as signed integers. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:29 +00:00
Jason Ekstrand	f8992eb5ba	anv/apply_pipeline_layout: Set the cursor in lower_res_reindex_intrinsic The loop through instructions doesn't set the cursor for us so unless we set it somewhere, we may end up emitting instructions in the wrong place. The only reason why we haven't been bitten by this in the past is that it only happens in a few variable pointers cases and the CTS tests for those don't use much control flow so things were getting emitted in the correct order by accident. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:29 +00:00
Jason Ekstrand	42b2f3e91f	spirv: Handle any bit size in vector_insert/extract This crops up both in the actual SPIR-V VectorInsert/Extract opcodes as well as various places where we deal with vector derefs. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:29 +00:00
Jason Ekstrand	a392ddb781	glsl_type: Support serializing 8 and 16-bit types Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-01-08 00:38:29 +00:00
Bas Nieuwenhuizen	70ed049cc6	spirv: Fix matrix parameters in function calls. They can be handled exactly the same as arrays, we just need to handle the base type correctly in the switches. Fixes: `a45b6fb452` "spirv: Pass SSA values through functions" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109204 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-08 01:30:03 +01:00
Bas Nieuwenhuizen	3cc940277a	radv: Fix rasterization precision bits. Note that these limits are exact, not a "precision is at least x", as texel coords also get snapped to a multiple of this step size before filtering. This fixes CTS tests dEQP-VK.texture.explicit_lod.2d.sizes.31x55_nearest_linear_mipmap_nearest_repeat dEQP-VK.texture.explicit_lod.2d.sizes.57x35_nearest_linear_mipmap_nearest_repeat Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109151 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-07 23:27:30 +01:00
Kenneth Graunke	f003859f97	nir: Make gl_nir_lower_samplers use gl_nir_lower_samplers_as_deref These days, we have two sampler lowering passes. The newer one, gl_nir_lower_samplers_as_deref, is used by radeonsi. It rewrites variables to drop structures out of sampler deref chains, to make life simpler. It then sets var->data.binding for non-bindless sampler and image variables based on the GL uniform storage's opaque index values. The older one converts sampler deref chains (nir_tex_src_texture_deref) to a numerical offset (nir_tex_src_texture_offset). It also stores the constant-valued portion of that number in tex->texture_index, making life really simple for drivers that don't support indirects. It too pokes at GL uniform storage's opaque index values. Logically, we can do the first pass (simplify derefs, set bindings) then the second (turn derefs to offsets, set texture_index). This patch does exactly that, eliminating some redundancy (only one pass has to poke at GL uniform storage), and gaining proper var->data.binding values for drivers using the full lowering. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-07 14:25:04 -08:00
Kenneth Graunke	c69f9297cf	nir: Fix gl_nir_lower_samplers_as_deref's structure type handling. We recurse to remove structures, and at each step, re-modify the resulting type for our link in the deref chain. For arrays, the result of recursion is the new underlying type - so we wrap it with the array dimensionality again. For structs, we want to simply use the new underlying type, skipping the struct altogether. The correct way to do this is to do nothing at all. Previously, we had reset type to next->type, which is the /old/ field type, not the new field type we obtained by recursing. This undid our recursive work. Fixes about 338 tests with nested structs, such as: dEQP-GLES2.functional.uniform_api.value.initial.get_uniform.nested_structs_arrays.sampler2D_samplerCube_fragment Note that currently only radeonsi uses this pass, and NIR support is disabled there by default, so the breakage was likely not seen by most people. The next commit uses this pass for more drivers, so this fix prevents regressions from that change. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-07 14:25:04 -08:00
Bas Nieuwenhuizen	be6cee51c0	amd/common: Add some parentheses to silence warning. [1/59] Compiling C object 'src/amd/common/src@amd@common@@amd_common@sta/ac_nir_to_llvm.c.o'. ../mesa/src/amd/common/ac_nir_to_llvm.c: In function ‘get_inst_tessfactor_writemask’: ../mesa/src/amd/common/ac_nir_to_llvm.c:4089:32: warning: suggest parentheses around ‘+’ inside ‘<<’ [-Wparentheses] writemask = ((1 << num_comps + 1) - 1) << first_component; ~~~~~~~~~~^~~ ../mesa/src/amd/common/ac_nir_to_llvm.c:4091:33: warning: suggest parentheses around ‘+’ inside ‘<<’ [-Wparentheses] writemask = (((1 << num_comps + 1) - 1) << first_component) << 4; Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-07 23:15:37 +01:00
Bas Nieuwenhuizen	64c83efaee	radv: Remove unused variable. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-07 23:15:33 +01:00
Bas Nieuwenhuizen	656c1c488c	radv: Remove device path. unused and gcc complains about strncpy. (from what I can see because strncpy does not leave a 0 byte on truncate. That said we don't use it so this does not fix a real bug). Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-07 23:15:14 +01:00
Marek Olšák	492ad9a402	ac: remove unused variable from ac_build_ddxy trivial	2019-01-07 14:51:25 -05:00
Andres Gomez	0cc01f45e7	glsl: correct typo in GLSL compilation error message v2: Add the "fix" tag (Erik). Fixes: `037f68d81e` ("glsl: apply align layout qualifier rules to block offsets") Cc: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-07 19:07:33 +02:00
Jason Ekstrand	027835b1da	vulkan: Update the XML and headers to 1.1.97 Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-07 10:00:01 -06:00
Andres Gomez	6decc6b1d9	docs: update 18.3 and add 19.x cycles for the release calendar v2: replace incorrect "<td/>" with "<td>" (Eric). Cc: Dylan Baker <dylan.c.baker@intel.com> Cc: Juan A. Suarez <jasuarez@igalia.com> Cc: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Juan A. Suarez <jasuarez@igalia.com>	2019-01-07 17:19:47 +02:00
Bas Nieuwenhuizen	110564fdec	anv/android: Do not reject storage images. We do the ImageFormatProperties check already, and rejecting a usage flag when both ImageFormatProperties and the WSI (which is Android) support it is not allowed. Intel does support storage for some of the supported WSI formats, such as R8G8B8A8_UNORM, and looking at the ISL_SURF_USAGE_DISABLE_AUX_BIT, the imported images do not have any form of compression that would prevent this fix. v2: Also consider STORAGE bit for Gralloc usage bits. (From Kevin Strasser <kevin.strasser@intel.com>) Fixes: `053d4c328f` "anv: Implement VK_ANDROID_native_buffer (v9)" Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-01-07 15:20:55 +01:00
Bas Nieuwenhuizen	9a45a190ad	radv: Implement buffer stores with less than 4 components. We started using it in the btoi paths for r32g32b32, and the LLVM IR checker will complain about it because we end up with intrinsics with the wrong type extension in the name. Fixes: `593996bc02` ("radv: implement buffer to image operations for R32G32B32") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-01-07 14:54:14 +01:00
Jon Turney	00ad77b9f6	appveyor: Add a Cygwin build script	2019-01-07 13:40:58 +00:00
Jon Turney	5334dafee2	appveyor: put build steps in a script, rather than inline in appveyor.yml	2019-01-07 13:40:57 +00:00
Lucas Stach	d015888efb	etnaviv: annotate variables only used in debug build Some of the status variables in the compiler are only used in asserts and thus may be unused in release builds. Annotate them accordingly to avoid 'unused but set' warnings from the compiler. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-01-07 11:51:02 +01:00
Lucas Stach	b56d903b5a	etnaviv: enable full overwrite in a few more cases Take into account the render target format when checking if the color mask affects all channels of the RT. This allows enabling full overwrite in a few cases where a non-alpha format is used. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2019-01-07 11:50:23 +01:00
Timothy Arceri	6dade5d534	nir: avoid uninitialized variable warning Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109231	2019-01-07 10:57:00 +11:00
Timothy Arceri	17fac39398	st/glsl: refactor st_link_nir() The functional change here is moving the nir_lower_io_to_scalar_early() calls inside st_nir_link_shaders() and moving the st_nir_opts() call after the call to nir_lower_io_arrays_to_elements(). This fixes a bug with the following piglit test due to the current code not cleaning up dead code after we lower arrays. This was causing an assert in the new duplicate varyings link time opt introduced in `70be9afccb`. tests/spec/glsl-1.10/execution/vsfs-unused-array-member.shader_test Moving the nir_lower_io_to_scalar_early() calls also allows us to tidy up the code a little and merge some loops. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-01-07 10:54:20 +11:00
Eric Anholt	8847370424	v3d: Use the core tex lowering. Even without any clever optimization on the unpack operations, this gives us a useful value for the channels read field, which we can use to avoid ldtmu instructions to the no-op register. instructions in affected programs: 890712 -> 881974 (-0.98%)	2019-01-04 15:59:59 -08:00
Eric Anholt	f217a94542	nir: Add nir_lower_tex options to lower sampler return formats. I've been doing this in the nir-to-vir and nir-to-qir backends of v3d and vc4, but nir could potentially do some useful stuff for us (like avoiding unpack/repacks) if we give it the information. v2: Skip lowering for txs/query_levels v3: Fix a crash on old-style shadow v4: Rename to tex_packing, use nir_format_unpack_sint/uint helpers, pack the enum. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-04 15:59:57 -08:00
Eric Anholt	a74f2aeb4f	nir: Allow nir_format_unpack_int/sint to unpack larger values. For V3D, I want to unpack 4-16-bit packed integers for 8 and 16-bit integer samplers. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-04 15:59:30 -08:00
Jason Ekstrand	19c608fe43	intel/blorp: Be more conservative about copying clear colors In `92eb5bbc68` we attempted to avoid copying clear colors whenever we weren't doing a resolve. However, this broke MSAA resolves because we need the clear color in the source. This patch makes blorp much more conservative such that it only avoids the clear color copy if either aux_usage == NONE or it's explicitly doing a fast-clear. Fixes: `92eb5bbc68` "intel/blorp: Only copy clear color when doing..." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107728 Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-01-04 17:57:43 -06:00
Eric Anholt	81b9361b68	v3d: Stop scalarizing our uniform loads. We can pull a whole vector in a single indirect load. This saves a bunch of round-trips to the TMU, instructions for setting up multiple loads, references to the UBO base in the uniforms, and apparently manages to reduce register pressure as well. instructions in affected programs: 3086665 -> 2454967 (-20.47%) uniforms in affected programs: 919581 -> 721039 (-21.59%) threads in affected programs: 1710 -> 3420 (100.00%) spills in affected programs: 596 -> 522 (-12.42%) fills in affected programs: 680 -> 562 (-17.35%) Improves 3dmmes performance by 2.29312% +/- 0.139825% (n=5)	2019-01-04 15:41:23 -08:00
Eric Anholt	f8a8de8b9a	v3d: Do UBO loads a vector at a time. In the process of adding support for SSBOs and CS shared vars, I ended up needing a helper function for doing TMU general ops. This helper can be that starting point, and saves us a bunch of round-trips to the TMU by loading a vector at a time.	2019-01-04 15:41:23 -08:00
Eric Anholt	b0e0086257	v3d: Remove dead switch cases and comments from v3d_nir_lower_io. Moving things to NIR left this mess around. All we lower now is uniforms.	2019-01-04 15:41:23 -08:00
Eric Anholt	f8e6b364b0	v3d: Fix up VS output setup during precompiles. I noticed that a VS I was debugging was missing all of its output stores -- outputs_written was for POS, VAR0, VAR3, while the shader's variables were POS, VAR9, and VAR12. I'm not sure what outputs_written is supposed to be doing here, but we can just walk the declared variables and avoid both this bug and the emission of extra stvpms for less-than-vec4 varyings.	2019-01-04 15:41:23 -08:00
Eric Anholt	e1385e879d	v3d: Reinstate the new shader-db output after v3d_compile() refactor. I misplaced it in the rebase conflicts.	2019-01-04 15:26:19 -08:00
Caio Marcelo de Oliveira Filho	bbf9ee9b18	nir: remove dead code from copy_prop_vars When copy_prop_vars also took care of dead write handling, intrin was used as part of store_to_entry. Now it isn't, so this assignment isn't really used. Add a comment clarifying what happens to intrin. Fixes: `4dfa7adc10` "nir: Remove handling of dead writes from copy_prop_vars" Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-04 15:18:41 -08:00
Lionel Landwerlin	31e4c9ce40	i965: add CS stall on VF invalidation workaround Even with the previous commit, hangs are still happening. The problem there is that the VF cache invalidate does happen immediately without waiting for previous rendering to complete. What happens is that we invalidate the cache the moment the PIPE_CONTROL is parsed but we still have old rendering in the pipe which continues to pull data into the cache with the old high address bits. The later rendering with the new high address bits then doesn't have the clean cache that it expects/needs. v2: Update commit message/explanation with Jason's Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `a363bb2cd0` ("i965: Allocate VMA in userspace for full-PPGTT systems.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109072	2019-01-04 11:18:54 +00:00
Lionel Landwerlin	92b7407090	i965: include draw_params/derived_draw_params for VF cache workaround These buffers are using VB slots and should be included in the workaround decision. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `a363bb2cd0` ("i965: Allocate VMA in userspace for full-PPGTT systems.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109072	2019-01-04 11:18:54 +00:00
Lionel Landwerlin	da634a4acb	intel/blorp: emit VF caching workaround before 3DSTATE_VERTEX_BUFFERS Probably no difference but it's nice to have i965 & blorp emit things in the same order. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-04 11:18:51 +00:00
Lionel Landwerlin	e5ed217545	i965: limit VF caching workaround to gen8/9/10 Documentation of the 3DSTATE_VERTEX_BUFFERS packet says this is only needed before ICL. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-01-04 11:18:48 +00:00
Andres Gomez	f0312cfa93	glsl/linker: complete documentation for assign_attribute_or_color_locations Commit `27f1298b9d` ("glsl/linker: validate attribute aliasing before optimizations") forgot to complete the documentation. Cc: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2019-01-04 09:04:31 +02:00
Gurchetan Singh	6b7aea9d85	virgl: remove empty file Fixes: 174f53 ("virgl: consolidate transfer code") Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2019-01-03 20:59:29 +01:00
Gurchetan Singh	ca66457b05	virgl: don't flush an empty range Otherwise, the gl-1.0-long-dlist Piglit test crashes. Fixes: db7757 ("virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT") Reported by airlied@ v2: Exit on any invalid range (Erik) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109190 Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Tested-by: Jakob Bornecrantz <jakob@collabora.com>	2019-01-03 20:59:29 +01:00
Eric Engestrom	393a756e6a	docs: advertise distro-provided meson cross-files Hopefully we can kick start the revolution and other distros will start providing them as well :) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-01-03 18:53:21 +00:00
Eric Engestrom	8b363bc42e	docs: fix the meson aarch64 cross-file `gcc-ar` is preferred over the generic `ar`, and the `arm` family is for 32-bit ARM [1]. [1] https://mesonbuild.com/Reference-tables.html#cpu-families Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2019-01-03 18:53:21 +00:00
Jakob Bornecrantz	6a9be6fc0c	virgl/vtest: Use default socket name from protocol header No functional change as the socket name is the same, just removing the double definition of the path. Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2019-01-03 15:50:38 +00:00
Rob Clark	e869481ef3	freedreno: fix staging resource size for arrays A 2d-array texture (for example), should get the # of array elements from box->depth, rather than depth0 which is minified. Fixes dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darray_bias_float_fragment with tiled textures. Reported-by: Kristian H. Kristensen <hoegsberg@chromium.org> Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-03 08:11:40 -05:00
Rob Clark	67a7f6f244	freedreno: remove blit_via_copy_region() If we hit the memcpy() path for copy_region(), that will try to do a transfer_map(), which goes badly for blits to/from staging triggered by transfer_map() or transfer_unmap(). We could possibly add fd_blit2() which has allow_transfer_map param, and call that for staging blits. But I'm not really sure if trying the blit via copy_region() is very useful. At least for newer gens that implement fd_context::blit(), it probably isn't. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-03 08:10:32 -05:00
Rob Clark	2fc17e16a3	freedreno/a6xx: rework blitter API Switch over to using fd_context::blit(), in the same way that a5xx does. The previous patch wires fd_resource_copy_region() up to the blitter so a6xx no longer needs to bypass the core layer to accelerate this. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-03 08:10:23 -05:00
Rob Clark	53b8eb78d5	freedreno: try blitter for fd_resource_copy_region() Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-03 08:10:16 -05:00
Rob Clark	228eddd7ee	freedreno: rework blit API First step to unify the way fd5 and fd6 blitter works. Currently a6xx bypasses the blit API in order to also accelerate resource_copy_region() But this approach can lead to infinite recursion: #0 fd_alloc_staging (ctx=0x5555936480, rsc=0x7fac485f90, level=0, box=0x7fbab29220) at ../src/gallium/drivers/freedreno/freedreno_resource.c:291 #1 0x0000007fbdebed04 in fd_resource_transfer_map (pctx=0x5555936480, prsc=0x7fac485f90, level=0, usage=258, box=0x7fbab29220, pptrans=0x7fbab29240) at ../src/gallium/drivers/freedreno/freedreno_resource.c:479 #2 0x0000007fbe5c5068 in u_transfer_helper_transfer_map (pctx=0x5555936480, prsc=0x7fac485f90, level=0, usage=258, box=0x7fbab29220, pptrans=0x7fbab29240) at ../src/gallium/auxiliary/util/u_transfer_helper.c:243 #3 0x0000007fbde2dcb8 in util_resource_copy_region (pipe=0x5555936480, dst=0x7fac485f90, dst_level=0, dst_x=0, dst_y=0, dst_z=0, src=0x7fac47c780, src_level=0, src_box_in=0x7fbab2945c) at ../src/gallium/auxiliary/util/u_surface.c:350 #4 0x0000007fbdf2282c in fd_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47c780, src_level=0, src_box=0x7fbab2945c) at ../src/gallium/drivers/freedreno/freedreno_blitter.c:173 #5 0x0000007fbdf085d4 in fd6_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47c780, src_level=0, src_box=0x7fbab2945c) at ../src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:587 #6 0x0000007fbde2f3d0 in util_try_blit_via_copy_region (ctx=0x5555936480, blit=0x7fbab29430) at ../src/gallium/auxiliary/util/u_surface.c:864 #7 0x0000007fbdec02c4 in fd_blit (pctx=0x5555936480, blit_info=0x7fbab29588) at ../src/gallium/drivers/freedreno/freedreno_resource.c:993 #8 0x0000007fbdf08408 in fd6_blit (pctx=0x5555936480, info=0x7fbab29588) at ../src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:546 #9 0x0000007fbdebdc74 in do_blit (ctx=0x5555936480, blit=0x7fbab29588, 
fallback=false) at ../src/gallium/drivers/freedreno/freedreno_resource.c:129 #10 0x0000007fbdebe58c in fd_blit_from_staging (ctx=0x5555936480, trans=0x7fac47b7e8) at ../src/gallium/drivers/freedreno/freedreno_resource.c:326 #11 0x0000007fbdebea38 in fd_resource_transfer_unmap (pctx=0x5555936480, ptrans=0x7fac47b7e8) at ../src/gallium/drivers/freedreno/freedreno_resource.c:416 #12 0x0000007fbe5c5c68 in u_transfer_helper_transfer_unmap (pctx=0x5555936480, ptrans=0x7fac47b7e8) at ../src/gallium/auxiliary/util/u_transfer_helper.c:516 #13 0x0000007fbde2de24 in util_resource_copy_region (pipe=0x5555936480, dst=0x7fac485f90, dst_level=0, dst_x=0, dst_y=0, dst_z=0, src=0x7fac47b8e0, src_level=0, src_box_in=0x7fbab2997c) at ../src/gallium/auxiliary/util/u_surface.c:376 #14 0x0000007fbdf2282c in fd_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47b8e0, src_level=0, src_box=0x7fbab2997c) at ../src/gallium/drivers/freedreno/freedreno_blitter.c:173 #15 0x0000007fbdf085d4 in fd6_resource_copy_region (pctx=0x5555936480, dst=0x7fac485f90, dst_level=0, dstx=0, dsty=0, dstz=0, src=0x7fac47b8e0, src_level=0, src_box=0x7fbab2997c) at ../src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:587 ... Instead rework the API to push the fallback back to core code, so that we can rework resource_copy_region() to have its own fallback path, and then finally convert fd6 over to work in the same way. This also makes ctx->blit() optional, and cleans up some unnecessary callers. Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-03 08:09:52 -05:00
Rob Clark	f1c88336e6	freedreno: skip depth resolve if not written For multi-pass rendering, it is common to keep the same depth buffer from previous pass, to discard geometry that would be hidden by later draws. In the later passes with depth-test enabled, but depth-write disabled, there is no reason to do gmem2mem resolve. TODO probably do something similar for stencil.. although stencil buffer isn't used as commonly these days Signed-off-by: Rob Clark <robdclark@gmail.com>	2019-01-03 08:09:24 -05:00
Timothy Arceri	4d3f6cb973	nir: merge some basic consecutive ifs After trying multiple times to merge if-statements with phis between them I've come to the conclusion that it cannot be done without regressions. The problem is for some shaders we end up with a whole bunch of phis for the merged ifs resulting in increased register pressure. So this patch just merges ifs that have no phis between them. This seems to be consistent with what LLVM does so for radeonsi we only see a change (although it's a large change) in a single shader. Shader-db results i965 (SKL): total instructions in shared programs: 13098176 -> 13098152 (<.01%) instructions in affected programs: 1326 -> 1302 (-1.81%) helped: 4 HURT: 0 total cycles in shared programs: 332032989 -> 332037583 (<.01%) cycles in affected programs: 60665 -> 65259 (7.57%) helped: 0 HURT: 4 The cycles estimates reported by shader-db for i965 seem inaccurate as the only difference in the final code is the removal of the redundant condition evaluations and jumps. Also the biggest code reduction (~7%) for radeonsi was in a tomb raider tressfx shader but for some reason this does not get merged for i965. Shader-db results radeonsi (VEGA): Totals from affected shaders: SGPRS: 232 -> 232 (0.00 %) VGPRS: 164 -> 164 (0.00 %) Spilled SGPRs: 59 -> 59 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 14584 -> 13520 (-7.30 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 13 -> 13 (0.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-01-03 15:17:16 +11:00
Timothy Arceri	19cafe8084	nir: add rewrite_phi_predecessor_blocks() helper This will also be used by the if merge pass in the following commit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-01-03 15:17:16 +11:00
Timothy Arceri	5122fbc4ba	nir: simplify does_varying_match() Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-03 11:47:56 +11:00
Timothy Arceri	8d05ee2005	nir: make use of does_varying_match() helper Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-03 11:47:56 +11:00
Timothy Arceri	0016166d19	nir: make nir_opt_remove_phis_impl() static Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2019-01-03 11:47:56 +11:00
Eric Anholt	d2b899c0ec	v3d: Refactor compiler entrypoints. Before, I had per-stage entrypoints with some helpers shared between them. As I extended for compute shaders and shader-db, it turned out that the other common code in the middle wanted to be shared too.	2019-01-02 14:12:29 -08:00
Eric Anholt	0805060573	v3d: Handle dynamically uniform IF statements with uniform control flow. Loops will be trickier, since we need some analysis to figure out if the breaks/continues inside are uniform. Until we get that in NIR, this gets us some quick wins. total instructions in shared programs: 6192844 -> 6174162 (-0.30%) instructions in affected programs: 487781 -> 469099 (-3.83%)	2019-01-02 14:12:29 -08:00
Eric Anholt	5e9ee6e841	v3d: Fold comparisons for IF conditions into the flags for the IF. total instructions in shared programs: 6193810 -> 6192844 (-0.02%) instructions in affected programs: 800373 -> 799407 (-0.12%)	2019-01-02 14:12:29 -08:00
Eric Anholt	078dc176bc	v3d: Don't try to fold non-SSA-src comparisons into bcsels. There could have been a write of a src in between the comparison and the bcsel that would invalidate the comparison.	2019-01-02 14:12:29 -08:00
Eric Anholt	2e0433b687	v3d: Move the "Find the ALU instruction generating our bool" out of bcsel. This will be reused for if statements.	2019-01-02 14:12:29 -08:00
Eric Anholt	c3ae0aa264	v3d: Simplify the emission of comparisons for the bcsel optimization. I wanted to reuse the comparison stuff for nir_ifs, but for that I just want the flags and no destination value. Splitting the conditions from the destinations ended up cleaning the existing code up, anyway.	2019-01-02 14:12:29 -08:00
Eric Anholt	49d8e2aff1	v3d: Don't forget to include RT writes in precompiles. Looking at some assembly dumps for an optimization, we were clearly missing important parts of the shader!	2019-01-02 14:12:29 -08:00
Eric Anholt	3a81c753a3	v3d: Fix segfault when failing to compile a program. We'll still fail at draw time, but this avoids a regression in shader-db execution once I enable TLB writes in precompiles. Fixes: `b38e4d313f` ("v3d: Create a state uploader for packing our shaders together.")	2019-01-02 14:12:29 -08:00
Marek Olšák	3ae57957be	radeonsi: always unmap texture CPU mappings on 32-bit CPU architectures Team Fortress 2 32-bit version runs out of the CPU address space. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-01-02 15:01:59 -05:00
Marek Olšák	edfca1f8dc	radeonsi: remove unused variables in si_insert_input_ptr Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-01-02 15:01:58 -05:00
Marek Olšák	cba475b3e7	radeonsi: use u_decomposed_prims_for_vertices instead of u_prims_for_vertices It seems to be the same, but this doesn't use integer division with a variable divisor. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-01-02 15:01:56 -05:00
Marek Olšák	54bc87469a	radeonsi: make si_cp_wait_mem more configurable Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-01-02 15:01:54 -05:00
Marek Olšák	9d2c3a1fe0	radeonsi: call si_fix_resource_usage for the GS copy shader as well Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-01-02 15:01:53 -05:00
Marek Olšák	d28e208213	radeonsi: don't emit redundant PKT3_NUM_INSTANCES packets Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-01-02 15:01:50 -05:00
Caio Marcelo de Oliveira Filho	7d6babf995	nir: add a way to print the deref chain Makes debugging easier when we care about the deref chain and not the deref instruction itself. To make it take a const pointer, constify some of the static functions in nir_print.c. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-01-02 10:09:04 -08:00
Dylan Baker	a2596450ac	meson: Error out if building nouveau and using LLVM without rtti Nouveau requires rtti. Often LLVM is configured without rtti, and code with and without cannot be linked safely. Let's just error out if nouveau is requested and llvm is built without rtti. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109202 Fixes: `c5a97d658e` ("meson: fix builds against LLVM built without rtti") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-02 09:30:12 -08:00
Alexander von Gluck IV	1b97a72328	egl/haiku: Fix reference to disp vs dpy Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Fixes: `00992700c9` "egl: set the EGLDevice when creating a display"	2019-01-02 13:45:09 +00:00
Iago Toral Quiroga	ec79069856	compiler/spirv: use 32-bit polynomial approximation for 16-bit asin() The 16-bit polynomial execution doesn't meet Khronos precision requirements. Also, the half-float denorm range starts at 2^(-14) and with asin taking input values in the range [0, 1], polynomial approximations can lead to flushing relatively easily. An alternative is to use the atan2 formula to compute asin, which is the reference taken by Khronos to determine precision requirements, but that ends up generating too many additional instructions when compared to the polynomial approximation. Specifically, for the Intel case, doing this adds +41 instructions to the program for each asin/acos call, which looks like an undesirable trade off. So for now we take the easy way out and fall back to using the 32-bit polynomial approximation, which is better (faster) than the 16-bit atan2 implementation and gives us better precision that matches Khronos requirements. v2: - Fall back to 32-bit using recursion (Jason). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-02 07:54:39 +01:00
Iago Toral Quiroga	fda3f6d424	compiler/spirv: implement 16-bit frexp Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-02 07:54:35 +01:00
Iago Toral Quiroga	7d3c34197a	compiler/spirv: implement 16-bit hyperbolic trigonometric functions v2: - use nir_fadd_imm and nir_fmul_imm helpers (Jason) v3: - since we need to define one for fsub use it for fdiv too (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-02 07:54:05 +01:00
Iago Toral Quiroga	88663ba67c	compiler/spirv: implement 16-bit exp and log v2 - use nir_fmul_imm helper (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-02 07:54:05 +01:00
Iago Toral Quiroga	f18554e2ce	compiler/spirv: implement 16-bit atan2 v2: - fix huge_val for 16-bit, it was meant to be 2^14 not 10^14. v3: - rebase on top of new bool sized opcodes - use nir_b2f helper - use nir_fmul_imm helper Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-02 07:54:05 +01:00
Iago Toral Quiroga	1c8de08ec9	compiler/spirv: implement 16-bit atan v2: - use nir_fadd_imm and nir_fmul_imm helpers (Jason) - rebased on top of new sized boolean opcodes - use nir_b2f helper Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-02 07:54:05 +01:00
Iago Toral Quiroga	df118535ca	compiler/spirv: implement 16-bit acos Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-02 07:54:05 +01:00
Iago Toral Quiroga	dbbbe24d76	compiler/spirv: implement 16-bit asin v2: - use nir_fmul_imm and nir_fadd_imm helpers (Jason) v3: - missed one case where we need to replace nir_imm_float with nir_imm_floatN_t (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-02 07:54:05 +01:00
Iago Toral Quiroga	95b7c29c2c	compiler/spirv: handle 16-bit float in radians() and degrees() v2: - use nir_fmul_imm helper (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-02 07:54:05 +01:00
Iago Toral Quiroga	aeee683780	compiler/nir: add nir_fadd_imm() and nir_fmul_imm() helpers Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-02 07:54:05 +01:00
Iago Toral Quiroga	5fc9ad1cb0	compiler/nir: add a nir_b2f() helper Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-01-02 07:54:05 +01:00
Timothy Arceri	70be9afccb	nir: link time opt duplicate varyings If we are outputting the same value to more than one output component rewrite the inputs to read from a single component. This will allow the duplicate varying components to be optimised away by the existing opts. shader-db results i965 (SKL): total instructions in shared programs: 12869230 -> 12860886 (-0.06%) instructions in affected programs: 322601 -> 314257 (-2.59%) helped: 3080 HURT: 8 total cycles in shared programs: 317792574 -> 317730593 (-0.02%) cycles in affected programs: 2584925 -> 2522944 (-2.40%) helped: 2975 HURT: 477 shader-db results radeonsi (VEGA): SGPRS: 31576 -> 31664 (0.28 %) VGPRS: 17484 -> 17064 (-2.40 %) Spilled SGPRs: 184 -> 167 (-9.24 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 583340 -> 569368 (-2.40 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 6162 -> 6270 (1.75 %) Wait states: 0 -> 0 (0.00 %) vkpipeline-db results RADV (VEGA): Totals from affected shaders: SGPRS: 14880 -> 15080 (1.34 %) VGPRS: 10872 -> 10888 (0.15 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 674016 -> 668396 (-0.83 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 2708 -> 2704 (-0.15 %) Wait states: 0 -> 0 (0.00 %) V2: bunch of tidy ups suggested by Jason Reviewed-by: Eric Anholt <eric@anholt.net>	2019-01-02 12:19:17 +11:00
Timothy Arceri	d828694b80	nir: rework nir_link_opt_varyings() This just cleans things up a little and makes things safer for derefs. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-01-02 12:19:17 +11:00
Timothy Arceri	c0aba8b0dc	nir: add can_replace_varying() helper This will be reused by the following patch. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-01-02 12:19:17 +11:00
Timothy Arceri	50de3f80a8	nir: rename nir_link_constant_varyings() nir_link_opt_varyings() The following patches will add support for an additional optimisation so this function will no longer just optimise varying constants. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-01-02 12:19:17 +11:00
Timothy Arceri	0a4378ce56	st/glsl_to_nir: call nir_lower_load_const_to_scalar() in the st This will help the new opt introduced in the following patches allowing us to remove extra duplicate varyings. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-01-02 12:19:17 +11:00
Timothy Arceri	2ef0f944f5	radeonsi: make use of ac_are_tessfactors_def_in_all_invocs() Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-02 10:01:31 +11:00
Timothy Arceri	2832bc972b	ac/nir_to_llvm: add ac_are_tessfactors_def_in_all_invocs() The following patch will use this with the radeonsi NIR backend but I've added it to ac so we can use it with RADV in future. This is a NIR implementation of the tgsi function tgsi_scan_tess_ctrl(). Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-02 10:01:24 +11:00
Timothy Arceri	2817a4ec0b	radeonsi: remove unrequired param in si_nir_scan_tess_ctrl() Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-02 10:01:15 +11:00
Timothy Arceri	4dda445750	tgsi/scan: correctly walk instructions in tgsi_scan_tess_ctrl() The previous code used a do while loop and continues after walking a nested loop/if-statement. This means we end up evaluating the last instruction from the nested block against the while condition and potentially exit early if it matches the exit condition of the outer block. Fixes: `386d165d8d` ("tgsi/scan: add a new pass that analyzes tess factor writes") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-02 09:53:01 +11:00
Timothy Arceri	dd061eb044	tgsi/scan: fix loop exit point in tgsi_scan_tess_ctrl() This just happened not to crash/assert because all loops have at least 1 if-statement and due to a second bug we end up matching the same ENDIF to exit both the iteration over the if-statement and the loop. The second bug is fixed in the following patch. Fixes: `386d165d8d` ("tgsi/scan: add a new pass that analyzes tess factor writes") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-01-02 09:53:01 +11:00
Ilia Mirkin	8f98ff362c	nv30: disable rendering to 3D textures There's no way to tell the 3D engine about swizzling on such textures. While rendering to NPOT ones may be possible, there's no great way to expose that in gallium, nor would there be any practical benefit. Fixes the non-compressed-format "copyteximage 3D" failures. Something odd going on with the compressed formats. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2019-01-01 15:11:14 -05:00
Bas Nieuwenhuizen	8c93ef5de9	radv: Do a cache flush if needed before reading predicates. This caused random failures for two conditional rendering tests: dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_discard dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_no_discard These wrote the predicate with the vertex shader, did a barrier and then started the conditional rendering. However the cache flushes for the barrier only happen on first draw, so after the predicate has been read. Fixes: `e45ba51ea4` "radv: add support for VK_EXT_conditional_rendering" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-12-31 20:52:08 +01:00
Erik Faye-Lund	86089a7316	anv/autotools: make sure tests link with -msse2 Without this, I get the following error when building the tests with autotools on i686: ---8<--- src/intel/common/gen_clflush.h: In function ‘gen_clflush_range’: src/intel/common/gen_clflush.h:37:7: warning: implicit declaration of function ‘__builtin_ia32_clflush’; did you mean ‘__builtin_ia32_pause’? [-Wimplicit-function-declaration] __builtin_ia32_clflush(p); ^~~~~~~~~~~~~~~~~~~~~~ __builtin_ia32_pause src/intel/common/gen_clflush.h: In function ‘gen_flush_range’: src/intel/common/gen_clflush.h:45:4: warning: implicit declaration of function ‘__builtin_ia32_mfence’; did you mean ‘__builtin_ia32_fnclex’? [-Wimplicit-function-declaration] __builtin_ia32_mfence(); ^~~~~~~~~~~~~~~~~~~~~ __builtin_ia32_fnclex ---8<--- The errors are generated for each of these files: - mesa/src/intel/vulkan/tests/state_pool_no_free.c - mesa/src/intel/vulkan/tests/state_pool.c - mesa/src/intel/vulkan/tests/block_pool_no_free.c - mesa/src/intel/vulkan/tests/state_pool_free_list_only.c This is obviously because gen_clflush.h contains code that uses intrinsics that are only available with SSE2. Since the driver already uses SSE2, it seems reasonable to add this to the tests as well. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Eric Engeström <eric@engestrom.ch>	2018-12-31 17:28:21 +01:00
Erik Faye-Lund	89679e18a9	anv/meson: make sure tests link with -msse2 Without this, I get the following error when building the tests using meson on i686: ---8<--- In file included from ../../../mesa/src/intel/vulkan/anv_private.h:46, from ../../../mesa/src/intel/vulkan/tests/state_pool_no_free.c:26: ../../../mesa/src/intel/common/gen_clflush.h: In function ‘gen_clflush_range’: ../../../mesa/src/intel/common/gen_clflush.h:37:7: error: implicit declaration of function ‘__builtin_ia32_clflush’; did you mean ‘__builtin_ia32_pause’? [-Werror=implicit-function-declaration] __builtin_ia32_clflush(p); ^~~~~~~~~~~~~~~~~~~~~~ __builtin_ia32_pause ../../../mesa/src/intel/common/gen_clflush.h: In function ‘gen_flush_range’: ../../../mesa/src/intel/common/gen_clflush.h:45:4: error: implicit declaration of function ‘__builtin_ia32_mfence’; did you mean ‘__builtin_ia32_fnclex’? [-Werror=implicit-function-declaration] __builtin_ia32_mfence(); ^~~~~~~~~~~~~~~~~~~~~ __builtin_ia32_fnclex ---8<--- The errors are generated for each of these files: - mesa/src/intel/vulkan/tests/state_pool_no_free.c - mesa/src/intel/vulkan/tests/state_pool.c - mesa/src/intel/vulkan/tests/block_pool_no_free.c - mesa/src/intel/vulkan/tests/state_pool_free_list_only.c This is obviously because gen_clflush.h contains code that uses intrinsics that are only available with SSE2. Since the driver already uses SSE2, it seems reasonable to add this to the tests as well. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Engeström <eric@engestrom.ch>	2018-12-31 17:27:33 +01:00
Ilia Mirkin	207fb558e4	nv30: fix some s3tc layout issues s3tc layouts are a bit finicky - they're packed, but not swizzled. Adjust logic to allow for that case: - Don't set a uniform pitch for POT-sized compressed textures - Adjust define_rect API to be less confused about block sizes - Only mark a texture as linear if it has a uniform pitch set This has been tested to fix xonotic (as well as the s3tc-* piglits) on nv3x and keeps it working on nv4x. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-30 23:32:21 -05:00
Ilia Mirkin	ad251330e8	nv30: use correct helper to get blocks in y direction This doesn't matter since all compressed formats supported by this hardware use square blocks, but best to use the correct helper. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-30 23:32:21 -05:00
Ilia Mirkin	b04c1907c8	nv30: add support for multi-layer transfers This logic mirrors what we do on nv50. The relatively new texture_subdata callback can cause this to happen with 3D textures, which is triggered at least by xonotic, and probably many piglits. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-30 23:32:21 -05:00
Ilia Mirkin	b34cfd4749	nv30: fix rare issue with fp unbinding not finding the bufctx If the last-active context gets deleted, the pushbuf doesn't have a bufctx to reference. Then there could be a sequence of binds which would trigger a reset on that bin before validation was done. Instead we just pass in the bufctx in question directly. All other instances of PUSH_RESET happen strictly after a validation is run. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102349 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-30 19:44:43 -05:00
Ilia Mirkin	ef3eac9545	nv30: avoid setting user_priv without setting cur_ctx The whole user_priv thing is a mess, but as long as it's there, it basically has to map 1:1 to the cur_ctx. Unfortunately we were setting user_priv to some context, then that context could get deleted without any draws/validations in it, leading user_priv to become NULL, with cur_ctx still pointing at some old context. Then we wouldn't run the switch logic, which in turn led to a NULL bufctx being dereferenced. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102349 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-30 19:44:43 -05:00
Eric Anholt	ad1e59cf8d	v3d: Add support for gl_HelperInvocation. We can just look at the MSF flags -- if they're unset, then we're definitely in a helper invocation. Fixes dEQP-GLES31.functional.shaders.helper_invocation.* with GLES3.1 enabled.	2018-12-30 08:05:11 -08:00
Eric Anholt	20021e3473	v3d: Add support for textureSize() on MSAA textures. Fixes failures in dEQP-GLES31.functional.shaders.builtin_functions.texture_size.samples_1_texture_2d in the GLES3.1 suite.	2018-12-30 08:05:11 -08:00
Eric Anholt	f695d62fe5	v3d: Add support for requesting the sample offsets.	2018-12-30 08:05:11 -08:00
Eric Anholt	906fca1b4b	v3d: Add support for non-constant texture offsets. Fixes dEQP-GLES31.functional.texture.gather.offset_dynamic.min_required_offset.2d.rgba8.size_pot.clamp_to_edge_repeat and others.	2018-12-30 08:05:11 -08:00
Eric Anholt	47caefc7b4	v3d: Force sampling from base level for tg4. This is what the GLSL ES 310 spec tells us to do, but apparently the "gather mode" flag doesn't imply it in the HW. Fixes dEQP-GLES31.functional.texture.gather.basic.2d.rgba8.filter_mode.min_nearest_mipmap_linear_mag_linear	2018-12-30 08:05:11 -08:00
Eric Anholt	f9bdce9966	v3d: Add a note for a potential performance win on multop/umul24. Noticed while debugging a testcase.	2018-12-30 08:05:11 -08:00
Eric Anholt	b36757448d	v3d: Dead-code eliminate unused flags updates. The greedy comparison folding in bcsel means that we may have left the original bool-generating NIR ALU instruction dead, but DCE wasn't eliminating the VIR code for it because of the flags updates. total instructions in shared programs: 5186024 -> 5100894 (-1.64%) instructions in affected programs: 1448695 -> 1363565 (-5.88%)	2018-12-30 08:05:11 -08:00
Eric Anholt	20e3526298	v3d: Don't generate temps for comparisons. This was just generated work for vir_opt_dead_code and cluttered up the dumps.	2018-12-30 08:04:54 -08:00
Eric Anholt	ebde5afb93	v3d: Move "does this instruction have flags" from sched to generic helpers. I wanted to reuse it for DCE of flags updates.	2018-12-30 08:03:51 -08:00
Eric Anholt	39b1112189	v3d: Drop incorrect dependency for flpop. It is just shifting probably-means-flags bits out of a value, it doesn't actually update the flags on its own.	2018-12-30 08:03:51 -08:00
Eric Anholt	a7c9fd7573	v3d: Drop unused count_nir_instrs() helper. This was for shader-db, but I haven't cared about NIR instruction counts in a long time.	2018-12-30 08:03:51 -08:00
Eric Anholt	696f63f1b4	v3d: Hook up some shader-db output to GL_ARB_debug_output. This allows the original shader-db project's run.c runner to parse things easily, and is probably a good thing to have for GL_ARB_debug_output in general. I formatted it more like Intel's so I can mostly reuse their report script.	2018-12-30 08:03:51 -08:00
Eric Anholt	87b251a940	v3d: Add a "precompile" debug flag for shader-db. I've been using my apitrace-based shader-db so far, but it's slow (apitrace decompression), intrusive (apitrace windows spamming the screen), and doesn't have much coverage. The original shader-db provides a lot more coverage and compiles faster, at the expense of not having the actual runtime variant key. As v3d has a lot less runtime variation than vc4 did, this tradeoff makes more sense.	2018-12-29 13:52:09 -08:00
Eric Anholt	9ec6a3d621	v3d: Fix uniform pretty printing assertion failure with branches. Fixes: `248a7fb392` ("v3d: Do uniform pretty-printing in the QPU dump.")	2018-12-29 13:52:09 -08:00
Dylan Baker	133a5b8383	meson: Override C++ standard to gnu++11 when building with altivec on ppc64 Otherwise there will be symbol collisions for the vector name. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108943 Distro Bug: https://bugs.gentoo.org/673622 Fixes: `42ea0631f1` ("meson: build clover") Acked-by: Matt Turner <mattst88@gmail.com>	2018-12-28 11:04:57 -08:00
Lionel Landwerlin	f7bccf6ab4	intel/aub_viewer: highlight true booleans Useful to spot PIPE_CONTROL flags. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-12-28 16:48:46 +00:00
Lionel Landwerlin	6ba61ea391	intel/aub_viewer: fold binding/sampler table items Makes things easier to read rather than a long block of text. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-12-28 16:48:43 +00:00
Lionel Landwerlin	7ab8c80625	intel/aub_viewer: fix shader view Not decoding the shader at the right offset. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-12-28 16:48:40 +00:00
Lionel Landwerlin	f3ed4a058d	intel/aub_viewer: print address of missing shader Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-12-28 16:48:21 +00:00
Lionel Landwerlin	0382e11989	intel/aub_viewer: fixup 0x address prefix Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-12-28 16:48:18 +00:00
Lionel Landwerlin	8e2fda411a	intel/aub_viewer: fix shader get_bo Instruction addresses are always in ppgtt space. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-12-28 16:48:08 +00:00
Nicholas Kazlauskas	e260493f2a	radeonsi: Enable adaptive_sync by default for radeon It's better to let most applications make use of adaptive sync by default. Problematic applications can be placed on the blacklist or the user can manually disable the feature. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>	2018-12-28 17:08:14 +01:00
Nicholas Kazlauskas	2e12fe425f	loader/dri3: Enable adaptive_sync via _VARIABLE_REFRESH property The DDX driver can be notified of adaptive sync suitability by flagging the application's window with the _VARIABLE_REFRESH property. This property is set on the first swap the application performs when adaptive_sync is set to true in the drirc. It's performed here instead of when the loader is initialized for two reasons: (1) The window's drawable can be missing during loader init. This can be observed during the Unigine Superposition benchmark. (2) Adaptive sync will only be enabled closer to when the application actually begins rendering. If adaptive_sync is false then the _VARIABLE_REFRESH property is deleted on loader init. The property is only managed on the glx DRI3 backend for now. This should cover most common applications and games on modern hardware. Vulkan support can be implemented in a similar manner but would likely require splitting the function out into a common helper function. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>	2018-12-28 16:44:47 +01:00
Nicholas Kazlauskas	a9c36dbf9c	drirc: Initial blacklist for adaptive sync Applications that don't present at a predictable rate (i.e. not games) shouldn't have adaptive sync enabled. This list covers some of the common desktop compositors, web browsers and video players. [ Michel Dänzer: Added entry for firefox-esr ] Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>	2018-12-28 16:44:27 +01:00
Nicholas Kazlauskas	7407670036	util: Add adaptive_sync driconf option This option lets the user decide whether mesa should notify the window manager / DDX driver that the current application is adaptive sync capable. It's off by default. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>	2018-12-28 16:38:06 +01:00
Nicholas Kazlauskas	759b940389	util: Get program name based on path when possible Some programs start with the path and command line arguments in argv[0] (program_invocation_name). Chromium is an example of an application using mesa that does this. This tries to query the real path for the symbolic link /proc/self/exe to find the program name instead. It only uses the realpath if it was a prefix of the invocation to avoid breaking wine programs. Cc: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-12-28 15:41:01 +01:00
Tomeu Vizoso	bf1dfcc3e8	etnaviv: Consolidate buffer references from framebuffers We were leaking surfaces because the references taken in etna_set_framebuffer_state weren't being released on context destroy. Instead of just directly releasing those references in etna_context_destroy, use the util_copy_framebuffer_state helper. Take the chance to remove the duplicated buffer references in compiled_framebuffer_state to avoid confusion. The leak can be reproduced with a client that continuously creates and destroys contexts. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reported-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-12-28 10:22:01 +01:00
Dave Airlie	d1ce7eba8b	virgl/vtest: fix front buffer flush with protocol version 0. Older versions of virglrenderer before 33da7361aec486290df0aec4ad8dfa8ff6adde2c in vtest mode, misrender gears. Fixes: `9d81cd8e7c` (virgl: Pass resource size and transfer offsets) Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2018-12-28 16:50:38 +10:00
Dylan Baker	6adbd9ac74	docs/autoconf: Mark autoconf as being replaced I know it's not what anyone wants, but how about we start with a message in the documentation that encourages people to try meson. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Engeström <eric@engestrom.ch>	2018-12-27 09:03:20 -08:00
Dylan Baker	4c32964f49	docs/install: Update python dependency section Note that meson requires python 3, scons requires python 2, and autotools works with either. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Engeström <eric@engestrom.ch>	2018-12-27 09:03:20 -08:00
Dylan Baker	a57dbe6971	docs/meson: Update LLVM section with information about native files Reviewed-by: Eric Engeström <eric@engestrom.ch>	2018-12-27 09:03:17 -08:00
Dylan Baker	40ec5fec0a	docs/install: Add meson to the main install page Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Engeström <eric@engestrom.ch>	2018-12-27 09:03:07 -08:00
Juan A. Suarez Romero	fe7919acad	docs: update calendar, add news item and link release notes for 18.2.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-12-27 17:37:33 +01:00
Juan A. Suarez Romero	0d53451890	docs: add sha256 checksums for 18.2.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `24c31bc0e2`)	2018-12-27 17:35:04 +01:00
Juan A. Suarez Romero	008478e340	docs: add release notes for 18.2.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `785e09e3b3`)	2018-12-27 17:35:02 +01:00
Ilia Mirkin	2269ab8588	nv50,nvc0: add missing CAPs for unsupported features Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-26 20:28:07 -05:00
Ilia Mirkin	1d10bb2025	nvc0: enable GL_NV_shader_atomic_float on pre-Maxwell Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-26 20:04:57 -05:00
Ilia Mirkin	0dd55db10f	nv50/ir: add support for converting ATOMFADD to proper ir Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-26 20:04:57 -05:00
Ilia Mirkin	9867f2a1f7	st/mesa: expose GL_NV_shader_atomic_float when ATOMFADD is supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-26 20:04:57 -05:00
Ilia Mirkin	4d5a6a1649	st/mesa: select ATOMFADD when source type is float Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-26 20:04:57 -05:00
Ilia Mirkin	d139231b32	gallium: add PIPE_CAP_TGSI_ATOMFADD to indicate support ATOMFADD is a little special -- make drivers have to specify it explicitly. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-26 20:04:57 -05:00
Ilia Mirkin	5574414edc	tgsi: add ATOMFADD operation This is supported by at least NVIDIA hardware, and exposable via GL extensions. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-26 20:04:57 -05:00
Ilia Mirkin	bac8534267	st/mesa: allow glDrawElements to work with GL_SELECT feedback Not sure if this ever worked, but the current logic for setting the min/max index is definitely wrong for indexed draws. While we're at it, bring in all the usual logic from the non-indirect drawing path. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109086 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-12-26 19:30:33 -05:00
Eric Anholt	7d7ecfbcbc	gallium/ttn: Fix setup of outputs_written. We need a 64-bit value, otherwise we only handle the low 32, and happen to sign-extend to claim to write all varying slots if VARYING_SLOT_VAR2 was used. Fixes: `4d0b2c7aaa` ("ttn: Update shader->info as we generate code.") Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-12-26 11:42:09 -08:00
Lionel Landwerlin	e2ae5f2f0a	anv: don't do partial resolve on layer > 0 We've made the choice not to use fast clears on layer > 0 with multilayer images. This is partly because we would need to store multiple clear colors for each layer, making the existing memory layout, already including aux surfaces, fast clear color, image state, etc... even more complex. Partial resolves are the operations transferring the clear colors into the auxiliary buffers. This operation is currently implemented in Blorp by loading the clear color from the image's BO, into a shader that then samples from the auxiliary buffer and writes the color only if it isn't there already. The problem here is that we store only one clear color for all layers and it is used for partial resolves. If you trigger a partial clear on a layer > 0, then you're likely to deal with a color that is not what you actually want. In the particular issues below, we have multiple layers, each cleared with a different color but the partial resolve just writes the wrong color into the auxiliary buffers for layers > 0. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108910 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108911 Cc: mesa-stable@lists.freedesktop.org	2018-12-24 09:42:46 +00:00
Axel Davy	c6b37e5412	st/nine: Increase the limit of cached ff shaders 100 is too small for some games, which triggers recompilations every frame. Increase to 1024. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-12-23 08:14:50 +01:00
Axel Davy	104681c5d5	st/nine: Add src reference to nine_context_range_upload Just like nine_context_box_upload, nine_context_range_upload should reference the src, which holds the ram source buffer. Fixes: https://github.com/iXit/Mesa-3D/issues/327 Signed-off-by: Axel Davy <davyaxel0@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Cc: mesa-stable@lists.freedesktop.org	2018-12-23 08:14:50 +01:00
Axel Davy	42d672fa6a	st/nine: Bind src not dst in nine_context_box_upload nine_context_box_upload uploads a ram buffer (from src) to a pipe_resource (dst). We already have a refcount on the pipe_resource, what needs to be protected from release is the ram buffer, thus a reference to src. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Cc: mesa-stable@lists.freedesktop.org	2018-12-23 08:14:50 +01:00
Axel Davy	f91f748fab	st/nine: Fix volumetexture dtor on ctor failure The dtor is called on allocation failure, thus we must check the volumes are allocated before trying to release them. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Cc: mesa-stable@lists.freedesktop.org	2018-12-23 08:14:50 +01:00
Axel Davy	1cc8192ad0	st/nine: Switch to presentation buffer if resize is detected This enables to match the window size on resize on all cases, as it only works currently with presentation buffers. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-12-23 08:14:50 +01:00
Axel Davy	c442dd7890	st/nine: Use helper to release swapchain buffers later This patch introduces a structure to release the present_handles only when they are fully released by the server, thus making "DestroyD3DWindowBuffer" actually release the buffer right away when called. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-12-23 08:14:50 +01:00
Rob Clark	51a44c3aac	freedreno/a6xx: fix 3d texture layout Maybe not 100% perfect, but seems to be a pretty good approximation of that. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-22 15:29:15 -05:00
Rob Clark	8f60f1381d	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-22 15:28:50 -05:00
Rob Clark	be9ec158d7	freedreno/a6xx: improve setup_slices() debug msgs Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-22 15:28:24 -05:00
Rob Clark	2b497fc507	freedreno/a6xx: simplify special case for 3d layout This logic can be re-written as the two cases for 3d (ie. before/after the miplevel sizes start reducing) vs everything else. I think it is easier to read this way. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-22 15:27:57 -05:00
Rob Clark	d71a50f831	freedreno: combine fd_resource_layer_offset()/fd_resource_offset() We really only need this logic in one place. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-22 15:27:37 -05:00
Rob Clark	6667dde098	freedreno/ir3: don't treat all inputs/outputs as vec4 This was a hold-over from the early TGSI days, and mostly not needed with NIR. This avoids burning an entire 4 consecutive scalar regs for vec3 outputs, for example. Which fixes a few places that we were doing worse than we should on register usage. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-22 15:27:21 -05:00
Rob Clark	3453814622	freedreno/ir3: fix fallout of extra assert Fixes the following crash that happened after `d6110d4d` The problem happens if we first compile a "vanilla" shader with nothing lowered in NIR, which perform the final lowering passes on so->shader->nir (including nir_lower_locals_to_regs()), and then later we compile a shader with some lowering. The second time through we would have already done nir_lower_locals_to_regs(). Arguably this was already a bug, just one we hadn't noticed yet. Fixes: `d6110d4d54` intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-21 19:04:22 -05:00
Kenneth Graunke	626f2477ab	st/nir: Drop unused gl_program parameter in VS input handling helper. Nobody uses this, so let's drop it. This makes the helper callable from places without a gl_program. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-21 15:29:32 -08:00
Kenneth Graunke	3a78b46e59	st/nir: Gather info after applying lowering FS variant features DrawPixels lowering, for example, adds new varyings that need to be accounted for in inputs_read. The earlier info gathering at link time cannot account for this. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-21 15:29:30 -08:00
Kenneth Graunke	bcb6f19947	st/mesa: Combine the DrawPixels and Bitmap passthrough VS programs. They're now identical, so we can just compile it once. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-21 15:29:29 -08:00
Kenneth Graunke	80dd9dfe33	st/mesa: Don't open code the drawpixels vertex shader. Now that we always copy color, we can just use the util function. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-21 15:29:28 -08:00
Kenneth Graunke	ed1a356c5e	st/mesa: Drop !passColor optimization in drawpixels shaders. The glDrawPixels passthrough vertex shader copies position and texcoord vertex attributes to varying outputs. It also optionally copies a third gl_Color attribute, which sometimes is unnecessary. Until now, we've compiled separate variants of the shader, one of which does this extra copy, and the other of which doesn't. We have done this since 2007. But, the vertex shader runs for a whopping four vertices, and so the cost of copying a single input to output is likely inconsequential. In theory, we could bind one fewer vertex element - but we always bind all three regardless. So, we don't even get that savings. This patch unifies the two, so we always copy the optional color, and save having to compile the variant. It also makes the VS input interface match up with the vertex element state without any dead (unused) input attributes. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-21 15:29:25 -08:00
Kenneth Graunke	42d31e0516	st/mesa: Drop dead 'passthrough_fs' field. Dead since 2015 (commit `5142564734`). Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-21 15:29:20 -08:00
Bas Nieuwenhuizen	bba5749484	radv: Fix wrongly positioned paren. Trivial. Fixes: `9f0bfbed11` "radv: Work around non-renderable 128bpp compressed 3d textures on GFX9."	2018-12-21 21:06:55 +01:00
Dylan Baker	1e872d1486	docs: add note about using backticks for rbs in gitlab So that gitlab will render the < and > correctly allowing the tag to be copy-n-pasted without additional formatting. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-12-21 17:43:56 +00:00
Alex Deucher	516160d717	pci_ids: add new VegaM pci id Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: mesa-stable@lists.freedesktop.org	2018-12-21 11:51:34 -05:00
Roland Scheidegger	171983dc89	gallivm: abort when trying to use non-existing intrinsic Whenever llvm removes an intrinsic (we're using), we're hitting segfaults due to llvm doing calls to address 0 in the jitted code instead. However, Jose figured out we can actually detect this with LLVMGetIntrinsicID(), so use this to abort, so we don't have to wonder what got broken. (Of course, someone still needs to fix the code to no longer use this intrinsic.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-12-21 17:37:00 +01:00
Roland Scheidegger	f3b1acff48	gallivm: don't use pavg.b intrinsic on llvm >= 6.0 This intrinsic disappeared with llvm 6.0, using it ends up in segfaults (due to llvm issuing call to NULL address in the jitted shaders). Add code doing the same thing as the autoupgrade code in llvm so it can be matched and replaced back with a pavgb. While here, also improve lp_test_format, so it tests both with and without cache (as it was, it tested the cache versions only, whereas cache is actually disabled in llvmpipe, and in any case even with it enabled vertex and geometry shaders wouldn't use it). (Although at least for the unorm8 uncached fetch, the code is still quite different to what llvmpipe is using, since that would use unorm8x16 type, whereas the test code is using unorm8x4 type, hence disabling some intrinsic paths.) Fixes: `6f4083143b` ("gallivm: use llvm jit code for decoding s3tc") Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2018-12-21 17:35:05 +01:00
Emil Velikov	a8d020c3dc	travis: meson: port gallium build combinations over This commit adds a number of build combinations: - Gallium Drivers {SWR, RadeonSI, Others} Each one has different LLVM requirements. Building SWR alone is twice as slow as all other drivers combined. - Gallium ST Clover LLVM {5,6,7} Because C++ API changes all the time. Analogous to above building Clover takes as much time as building all other ST combined. - Gallium ST Others Nouveau is used, instead of i915g since meson has explicit target tracking. Meaning that a configure error is thrown if we use i915g with say va, vdpau or others. Note: LLVM prior to 5.0 is intentionally dropped. If needed we can add that later. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-13 01:34:59 +00:00
Emil Velikov	39634f2f35	travis: meson: add explicit handling to gallium ST Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-12 13:52:20 +00:00
Emil Velikov	51318c32fe	travis: meson: explicitly control the DRI loaders Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-12 13:42:36 +00:00
Emil Velikov	e890aaabed	travis: meson: add unwind handling Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-12 13:33:14 +00:00
Emil Velikov	266ae2225e	travis: meson: use FOO_DRIVERS directly It makes for a shorter MESON_OPTIONS and cleaner handling. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-12 13:18:54 +00:00
Dylan Baker	31c162ad22	travis: meson: enable unit tests v2: [Emil] pass the argument directly to meson Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-11 10:34:51 -08:00
Dylan Baker	116f0fb216	travis: Don't try to read libdrm out of configure.ac Since we're going to delete it shortly Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-11 11:09:21 -08:00
Dylan Baker	ecf96413bb	travis: meson: use native files to override llvm-config This is the supported way to do this, and should be more robust and reliable. v2: [Emil] - enable backslash escapes - don't hardcode the path - pass the argument directly to meson Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-11 10:40:25 -08:00
Emil Velikov	81173fd69f	travis: printout llvm-config --version Provides quick and easy feedback. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-13 10:38:20 +00:00
Emil Velikov	de72c1fe6c	travis: meson: print the configured state Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-12 17:43:07 +00:00
Emil Velikov	7c38d7b7c8	travis: flip to distro xenial, drop sudo false The latter is the default these days and Travis will be removing sudo soonish. Flipping to xenial, allows us to remove a bunch of hacks we have. Plus it prevents us from adding new ones, to workaround what seems like a gcc/binutils bug. For example (from the upcoming meson build): FAILED: ccache c++ -o src/gallium/targets/pipe-loader/pipe_r600.so ... ... src/util/libmesa_util.a ... /usr/lib/x86_64-linux-gnu/libz.so ... src/util/libmesa_util.a(disk_cache.c.o): In function `deflate_and_write_to_disk': _build/../src/util/disk_cache.c:746: undefined reference to `deflateInit_' _build/../src/util/disk_cache.c:765: undefined reference to `deflate' ... As we can see, even though libz.so is explicitly passed after the object that requires it - the linker still fails to see the symbols. Avoid all those situations - flip the switch. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-13 11:20:41 +00:00
Emil Velikov	12187550f9	configure: add CXX11_CXXFLAGS to LLVM_CXXFLAGS Seemingly with LLVM7 and GCC 5.0, the former won't properly advertise -std=c++11 and the latter will choke. Add this temporary workaround, otherwise we'll get errors like: In file included from /usr/include/c++/5/type_traits:35:0, from /usr/lib/llvm-7/include/llvm/Support/type_traits.h:18, from /usr/lib/llvm-7/include/llvm/ADT/Optional.h:22, from /usr/lib/llvm-7/include/llvm/ADT/STLExtras.h:20, from /usr/lib/llvm-7/include/llvm/ADT/StringRef.h:13, from /usr/lib/llvm-7/include/llvm/Target/TargetMachine.h:17, from ../../../src/amd/common/ac_llvm_helper.cpp:36: /usr/include/c++/5/bits/c++0x_warning.h:32:2: error: #error This file requires compiler and library support for the ISO C++ 2011 standard. This support must be enabled with the -std=c++11 or -std=gnu++11 compiler options. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-13 11:56:40 +00:00
Emil Velikov	f331419f26	glx/test: meson: assorted include fixes Swap '..' with the symbolic inc_glx and add glproto as dependency. That will pull the correct include, effectively fixing the tests on macOS. Fixes: `a47c525f32` ("meson: build glx") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-12 19:24:14 +00:00
Emil Velikov	e139d7a8a3	glx: meson: wire up the dispatch-index-check test Accidentally dropped with earlier commit. Fixes: `4ccb981673` ("meson: Use consistent style for tests") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-12 19:07:52 +00:00
Emil Velikov	b44875e2dc	glx: meson: drop includes from a link-only library When producing the final libGL.so/libGLX_mesa.so we only link the local static helper lib (libglx). Thus there's no reason for the includes. Fixes: `a47c525f32` ("meson: build glx") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-12 17:55:08 +00:00
Emil Velikov	9527f9ea26	TODO: glx: meson: build dri based glx tests, only with -Dglx=dri The library itself (libGL) is only built when -Dglx=dri, yet its accompanying tests are built even with -Dglx=xlib. Adjust the guards, so we don't build the tests when they are not applicable v2: - Reword commit message (Dylan) - Drop build_by_default hunk (Dylan) Fixes: `a47c525f32` ("meson: build glx") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-12 17:47:36 +00:00
Emil Velikov	2eedb79e1a	pipe-loader: meson: reference correct library The library is called libgalliumvl_stub - note singular. Fixes: `42ea0631f1` ("meson: build clover") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-13 04:10:50 +00:00
Emil Velikov	9d10581897	meson: don't require glx/egl/gbm with gallium drivers The gallium drivers do not require a DRI loader. Drop the artificial and unnecessary restriction. Fixes: `af9d276134` ("meson: build libmesa_gallium") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-13 03:54:03 +00:00
Emil Velikov	e0dbfc9953	bin/get-pick-list.sh: warn when commit lists invalid sha We had cases where people would list old/invalid sha in the commit. Add a trivial checker to catch those and throw a warning. CC: Juan A. Suarez <jasuarez@igalia.com> CC: Dylan Baker <dylan@pnwbakers.com> CC: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-12-21 14:39:52 +00:00
Emil Velikov	6b296f64af	bin/get-pick-list.sh: rework handing of sha nominations Currently our is_sha_nomination does: - folds any whitespace, attempting to extract sha-like information - checks that at least one of the shas has landed Split it in two and do sha-like validation first. This way, commits with mesa-stable and sha nominations will feature the fixes/revert/etc instead of stable (a) or will be omitted if not applicable for the respective branch (b). Misc examples from 18.3 (a) -[ stable ] `5bc509363b` glx: make xf86vidmode mandatory for direct rendering +[ fixes ] `5bc509363b` glx: make xf86vidmode mandatory for direct rendering (b) -[ stable ] `9a7b319903` anv/query: flush render target before copying results CC: Juan A. Suarez <jasuarez@igalia.com> CC: Dylan Baker <dylan@pnwbakers.com> CC: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-12-21 14:39:34 +00:00
Eric Anholt	17218a0406	vc4: Hook up perf_debug() output to GL_ARB_debug_output as well. This is the right channel to report these things, so that end-users don't need to know each driver's custom debug options.	2018-12-20 11:31:25 -08:00
Rhys Kidd	acc481ad79	vc4: Wire up core pipe_debug_callback This lets the driver use pipe_debug_message() for GL_ARB_debug_output. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-12-20 11:31:19 -08:00
Eric Anholt	ba36312fbd	v3d: Hook up perf_debug() output to GL_ARB_debug output as well. This is the right channel to report these things, so that end-users don't need to know each driver's custom debug options.	2018-12-20 11:31:19 -08:00
Rhys Kidd	d3991d2472	v3d: Wire up core pipe_debug_callback This lets the driver use pipe_debug_message() for GL_ARB_debug_output. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-12-20 11:31:16 -08:00
Eric Anholt	d80761b8f3	v3d: Drop shadow comparison state from shader variant key. The shadow state is now in the sampler.	2018-12-20 11:29:30 -08:00
Eric Anholt	0e2758daad	v3d: Fix simulator mode on i915 render nodes. i915 render nodes refuse the dumb ioctls, so the simulator would crash on the original non-apitrace shader-db. Replace them with direct i915 calls if we detect that we're on one of their gem fds.	2018-12-20 11:29:30 -08:00
Dylan Baker	0ff7eed289	docs/meson: Recommend not using CFLAGS and friends Because of the many caveats involved, using -Dc_args instead of CFLAGS is recommended both by meson upstream and by us. v2: - Fix typo Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1) Reviewed-by: Eric Anholt <eric@anholt.net>	2018-12-20 11:16:40 -08:00
Samuel Pitoiset	9606310081	radv: enable shaderStorageImageMultisample feature on GFX8+ Untested on older chips. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:19 +01:00
Samuel Pitoiset	6b976024a8	radv: add support for FMASK expand Original patch by Dave Airlie. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:17 +01:00
Samuel Pitoiset	fa16da53d8	radv: initialize FMASK for images in fully expanded mode The value depends on the number of samples. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:15 +01:00
Samuel Pitoiset	65d82c84d2	ac/nir: restrict fmask lookup to image load intrinsics We don't ever want to do the fmask lookup on an atomic or store, the fmask should have been decompressed if the surface has been moved to IMAGE_LAYOUT. Original patch by Dave Airlie. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:11 +01:00
Samuel Pitoiset	f45e43e156	spirv: add support for SpvCapabilityStorageImageMultisample Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 18:01:09 +01:00
Samuel Pitoiset	5b1ec10e4c	radv: compute optimal VM alignment for imported buffers This fixes GPU hangs on GFX9 with dEQP-VK.memory.external_memory_host.bind_image_memory_and_render.with_zero_offset.* Copied from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 17:34:04 +01:00
Bas Nieuwenhuizen	9f0bfbed11	radv: Work around non-renderable 128bpp compressed 3d textures on GFX9. Exactly what title says, the new addrlib does not allow the above with certain dimensions that the CTS seems to hit. Work around it by not allowing the app to render to it via compat with other 128bpp formats and do not render to it ourselves during copies. Fixes: `776b911365` "amd/addrlib: update Mesa's copy of addrlib" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-20 15:07:20 +01:00
Samuel Pitoiset	5c7935f8fc	radv: fix subpass image transitions with multiviews The driver needs to decompress all image layers if a fast depth/color clear has been performed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 13:36:37 +01:00
Samuel Pitoiset	0a7e767e58	radv: drop the amdgpu-skip-threshold=1 workaround for LLVM 8 This workaround has been introduced by `135e4d434f` for fixing DXVK GPU hangs with many games. It is no longer needed since LLVM r345718. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-20 12:09:57 +01:00
Samuel Pitoiset	576040f2e5	ac/nir: remove the bitfield_extract workaround for LLVM 8 This workaround has been introduced by `3d41757788` and it is no longer needed since LLVM r346422. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-20 09:40:16 +01:00
Iago Toral Quiroga	d6110d4d54	intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs The former expects to see SSA-only things, but the latter injects registers. The assertions in the lowering were not seeing this because they asserted on the bit_size values only, not on the is_ssa field, so add that assertion too. Fixes: `11dc130779` "nir: Add a bool to int32 lowering pass" CC: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-20 08:02:44 +01:00
Ilia Mirkin	1250383e36	st/mesa: remove sampler associated with buffer texture in pbo logic A long time ago, when this was first implemented, not having a sampler bound would cause problems on Fermi. I didn't work out the reasons, but the solution was simple -- just put the samplers back in. Since then, regular texturing paths appear to have lost their associated samplers which required a fuller investigation and fix in nouveau. Now that this is done, this code should no longer need a sampler state for fetching texels from a buffer texture. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-20 00:27:16 -05:00
Roland Scheidegger	6f4083143b	gallivm: use llvm jit code for decoding s3tc This is (much) faster than using the util fallback. (Note that there's two methods here, one would use a cache, similar to the existing code (although the cache was disabled), except the block decode is done with jit code, the other directly decodes the required pixels. For now don't use the cache (being direct-mapped is suboptimal, but it's difficult to come up with something better which doesn't have too much overhead.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-12-20 06:03:20 +01:00
Jason Ekstrand	ec1d5841fa	radv/query: Use 1-bit booleans in query shaders Fixes: `44227453ec` "nir: Switch to using 1-bit Booleans for almost..." Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-19 16:36:40 -06:00
Jason Ekstrand	6896c91c10	radv/query: Add a nir_test_flag helper This is little more than an iadd_imm right now but it will help in the next commit where we refactor things further. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-19 16:36:26 -06:00
Eduardo Lima Mitev	c2ebc38052	freedreno/ir3: Handle GL_NONE in get_num_components_for_glformat() An earlier patch that introduced the function failed to handle the case where an image format layout qualifier is not specified, which is allowed on desktop GL profiles. In these cases, nir_variable's image format is GL_NONE, and we don't need to print a debug message for those. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-12-19 22:49:05 +01:00
Eric Anholt	90818558f0	docs: Add an encouraging note about providing reviews and acks. Across several projects I've seen new contributors say "I wasn't sure if I should provide a review tag since I'm not really an expert in this area." Everyone I know already applies some implicit weighting to reviews from different people, so encourage participation. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-19 12:49:17 -08:00
Eric Anholt	463df0ffe2	docs: Add a note that MRs should still include any r-b or a-b tags. v2: Mention "Tested-by" too Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v1) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-19 12:48:13 -08:00
Eric Anholt	fcfb7f573c	v3d: Load and store aligned utiles all at once. This calls the expensive uif offset function once per utile, but it still gets us a 212.218% +/- 2.41216% (n=10) win on 1024x1024 glTexImage over calling it on each pixel.	2018-12-19 10:27:26 -08:00
Eric Anholt	7c56b7a6ea	v3d: Add a fallthrough path for utile load/store of 32 byte lines. Now that V3D has 8 byte per pixel formats exposed, we've got stride==32 utiles to load and store. Just handle them through the non-NEON paths for now.	2018-12-19 10:27:26 -08:00
Eric Anholt	f6a0f4f41e	vc4: Move the utile load/store functions to a header for reuse by v3d. These implementations of whole-utile load/stores would be the same for v3d, though the layouts of blocks of utiles has changed.	2018-12-19 10:27:26 -08:00
Eric Anholt	8ee752194c	v3d: Implement texture_subdata to reduce teximage upload copies. This lets us store the non-PBO glTexImage data directly into the tiled image without making an extra untiled memcpy for the gallium transfer. Improves 1024x1024 TexImage perf by ~19%, mostly from not thrashing around in the kernel mapping and unmapping the transfer's temporary area.	2018-12-19 10:27:26 -08:00
Eric Anholt	e09d8aecb4	v3d: Remove dead prototypes for load/store utile functions.	2018-12-19 10:27:26 -08:00
Eric Anholt	fcf881adda	v3d: Don't try to create shadow tiled temporaries for 1D textures. They're raster order anyway, so we'd assertion fail along with wasting bandwidth. Fixes: `6ad9e8690d` ("v3d: Add support for texturing from linear.")	2018-12-19 10:27:21 -08:00
Eric Anholt	b5adc744ba	v3d: Fix check for TFU job completion in the simulator. We're waiting for the jobs-completed count to increment (with wrapping), not to reach its starting state. This mostly ended up working out because the next v3d_hw_tick() for a submit CL would end up doing the TFU operation first, but it did fail when a blit was used for glReadPixels() at the end of a test. Fixes: `ee0549ff9a` ("v3d: Add the V3D TFU submit interface to the simulator.")	2018-12-19 10:26:04 -08:00
Eric Anholt	365728dc5d	v3d: Put the dst bo first in the list of BOs for TFU calls. In the UAPI, the first BO is the destination, and the one the kernel should do an exclusive reservation on. Currently we only do exclusive reservations, anyway. However, in the simulator path I was only copying back the "destination" BO (actually src in this case), and this caused regressions once I fixed the simulator to actually complete TFU before returning (since otherwise, the TFU op would happen at the start of the next CL submit and the draw would get the right contents). Fixes: `976ea90bdc` ("v3d: Add support for using the TFU to do some blits.")	2018-12-19 10:26:04 -08:00
Caio Marcelo de Oliveira Filho	947f7b452a	nir: properly find the entry to keep in copy_prop_vars When copy propagation handles a store/copy, it iterates the current copy entries to remove aliases, but keeps the "equal" entry (if exists) to be updated. The removal step may swap the entries around (to ensure there are no holes), invalidating previous iteration pointers. The bug was saving such pointer to use later. Change the code to first perform the removals and then find the remaining right entry. This was causing updates to be lost since they were being made to an entry that was not part of the current copies. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108624 Fixes: `b3c6146925` "nir: Copy propagation between blocks" Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-19 09:33:36 -08:00
Michel Dänzer	9d8395bf0e	winsys/amdgpu: Pull in LLVM CFLAGS Fixes build failure if the LLVM headers aren't in a standard include directory. Fixes: `ec22dd34c8` "radeonsi: move SI_FORCE_FAMILY functionality to winsys" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-12-19 17:54:18 +01:00
Caio Marcelo de Oliveira Filho	0ddc911f4d	nir: properly clear the entry sources in copy_prop_vars When updating a copy entry source value from a "non-SSA" (the data come from a copy instruction) to a "SSA" (the data or parts of it come from SSA values), it was possible to hold invalid data in ssa[0] depending on the writemask. Because of the union, ssa[0] could contain a pointer to a nir_deref_instr left over from previous non-SSA usage. Change the code to clean up the array before use to avoid leaving invalid data around. Fixes: `62332d139c` "nir: Add a local variable-based copy propagation pass" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-19 08:35:48 -08:00
Eric Engestrom	0e4c7c3d5b	docs: format code blocks a bit nicely Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-12-19 16:32:30 +00:00
Eric Engestrom	b0319d0768	docs: add meson cross compilation instructions Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-19 16:31:51 +00:00
Gurchetan Singh	b45aa6290b	virgl: move resource creation / import / destruction to common code We can remove some duplicated code. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	1d3d311133	virgl: move resource metadata into base resource A resource is just a buffer with some metadata. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	db77573d7b	virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT Previously, we ignored the glUnmap(..) operation and flushed before we flush the cbuf. Now, let's just flush the data when we unmap. Neither method is optimal, for example: glMapBufferRange(.., 0, 100, GL_MAP_FLUSH_EXPLICIT_BIT) glFlushMappedBufferRange(.., 25, 30) glFlushMappedBufferRange(.., 65, 70) We'll end up flushing 25 --> 70. Maybe we can fix this later. v2: Add fixme comment in the code (Elie) Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	11939f6fa2	virgl: make virgl_buffers use resource helpers We can reuse the helpers we created. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	4e2c77cd51	virgl: make transfer code with PIPE_BUFFER targets util_format_get_blocksize returns 1 for R8 formats (all PIPE_BUFFERs are R8). Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	174f530008	virgl: consolidate transfer code We could allocate and destroy transfers in one place. v2: Keep l_stride around. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	13626b46f1	virgl: store layer_stride in metadata Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	2a44acc83b	virgl: move vrend_get_tex_image_offset to common code Will be reused. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	f749229a8e	virgl: move virgl_resource_layout to common code Will be reused. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	a63da9c062	virgl: move texture metadata to common code Will be reused. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	6e7d396ad3	virgl: remove unnecessary code With commit 89b479, we moved to tracking buffer cleanliness when binding. TEST=dEQP-GLES31.functional.image_load_store.buffer.load_store.r32ui Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Gurchetan Singh	6d13d1aadb	virgl: texture_transfer_pool --> transfer_pool It's used for all types of resources. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-12-19 13:29:16 +01:00
Nicolai Hähnle	d73a25f2c0	radeonsi: const-ify the si_query_ops Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:02:07 +01:00
Nicolai Hähnle	c85b0dea0a	radeonsi: split perfcounter queries from si_query_hw Remove a level of indirection to make the code more explicit -- should make it easier to follow what's going on. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:02:04 +01:00
Nicolai Hähnle	e0f0d3675d	radeonsi: factor si_query_buffer logic out of si_query_hw This is a move towards using composition instead of inheritance for different query types. This change weakens out-of-memory error reporting somewhat, though this should be acceptable since we didn't consistently report such errors in the first place. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:02:01 +01:00
Nicolai Hähnle	0fc6e573dd	radeonsi: move query suspend logic into the top-level si_query struct Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:59 +01:00
Nicolai Hähnle	e2b9329f17	radeonsi: move remaining perfcounter code into si_perfcounter.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:57 +01:00
Nicolai Hähnle	7dd289d9e4	radeonsi: track constant buffer bind history in si_pipe_set_constant_buffer Other callers of si_set_constant_buffer don't need it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:54 +01:00
Nicolai Hähnle	829d417914	radeonsi: use si_set_rw_shader_buffer for setting streamout buffers Reduce the number of places that encode buffer descriptors. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:52 +01:00
Nicolai Hähnle	ce785f5ffd	radeonsi: add an si_set_rw_shader_buffer convenience function Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:50 +01:00
Nicolai Hähnle	556c4c42b7	radeonsi: avoid using hard-coded SI_NUM_RW_BUFFERS Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:48 +01:00
Nicolai Hähnle	1e49d72317	radeonsi: show the fixed function TCS in debug dumps This is rather important for merged VS/TCS as LSHS shaders... Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:45 +01:00
Nicolai Hähnle	6e67e79de4	radeonsi: const-ify si_set_tesseval_regs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:42 +01:00
Nicolai Hähnle	5c841a1b1e	radeonsi: rename SI_RESOURCE_FLAG_FORCE_TILING to clarify its purpose Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:39 +01:00
Nicolai Hähnle	0d58dcc3cf	radeonsi: don't set RAW_WAIT for CP DMA clears There is never a read-after-write hazard because the command doesn't read. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:34 +01:00
Nicolai Hähnle	23af72af25	radeonsi/gfx9: use SET_UCONFIG_REG_INDEX packets when available Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:32 +01:00
Nicolai Hähnle	f18b2ac0db	radeonsi: add si_init_draw_functions and make some functions static Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:30 +01:00
Nicolai Hähnle	555cb668cc	radeonsi: extract declare_vs_blit_inputs Prepare for some later refactoring. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:27 +01:00
Nicolai Hähnle	ec22dd34c8	radeonsi: move SI_FORCE_FAMILY functionality to winsys This helps some debugging cases by initializing addrlib with slightly more appropriate settings. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:25 +01:00
Nicolai Hähnle	0ef263d62f	ac/surface: 3D and cube surfaces are never displayable Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:22 +01:00
Nicolai Hähnle	8efaffa893	amd/common: add i1 special case to ac_build_{inclusive,exclusive}_scan Allow for a unified but efficient treatment of adding a bitmask over a wave or an entire threadgroup. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:19 +01:00
Nicolai Hähnle	300876a9a7	amd/common: scan/reduce across waves of a workgroup Order-aware scan/reduce can trade-off LDS traffic for external atomics memory traffic in producer/consumer compute shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:17 +01:00
Nicolai Hähnle	3963402fd3	amd/common: add ac_build_ifcc Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:15 +01:00
Nicolai Hähnle	3c77f26ccc	amd/common: whitespace fixes Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:12 +01:00
Nicolai Hähnle	76c5ad1995	amd/sid_tables: add additional python3 compatibility imports This happened to bite me while doing some experiments. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:09 +01:00
Nicolai Hähnle	6f0322b16a	r600: remove redundant semicolon Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:00:49 +01:00
Nicolai Hähnle	7230cb8f2b	ddebug: always flush when requested, even when hang detection is disabled Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 11:59:18 +01:00
Nicolai Hähnle	539fdc49f1	ddebug: simplify watchdog loop and fix crash in the no-timeout case The following race condition could occur in the no-timeout case: API thread Gallium thread Watchdog ---------- -------------- -------- dd_before_draw u_threaded_context draw dd_after_draw add to dctx->records signal watchdog dump & destroy record execute draw dd_after_draw_async use-after-free! Alternatively, the same scenario would assert in a debug build when destroying the record because record->driver_finished has not signaled. Fix this and simplify the logic at the same time by - handing the record pointers off to the watchdog thread before each draw call and - waiting on the driver_finished fence in the watchdog thread Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 11:59:10 +01:00
Tapani Pälli	3627c9efff	anv/android: turn on VK_ANDROID_external_memory_android_hardware_buffer Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:42 +02:00
Tapani Pälli	3dc424a4f4	anv: ignore VkSamplerYcbcrConversion on non-yuv formats This fulfills a requirement for clients that want to utilize same code path for images with external formats (VK_FORMAT_UNDEFINED) and "regular" RGBA images where format is known. This is similar to how OES_EGL_image_external works. To support this, we allow color conversion samplers for non-YUV formats but skip setting up conversion when format does not have can_ycbcr flag set. v2: add comment and bundle can_ycbcr to the existing break condition (Lionel) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Tapani Pälli	a7b7772cfb	anv: support VkSamplerYcbcrConversionInfo in vkCreateImageView If a conversion struct was passed, then initialize view using format from the conversion structure. v2: use vk_format directly from the anv_format struct v3: added some assertions (Lionel) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Tapani Pälli	bb0721aea4	anv: add VkFormat field as part of anv_format Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Tapani Pälli	c070b0e25f	anv: support VkExternalFormatANDROID in vkCreateSamplerYcbcrConversion If external format is used, we store the external format identifier in conversion to be used later when creating VkImageView. v2: rebase to `b43f955037` changes v3: added assert, ignore components when creating external format conversion (Lionel) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Tapani Pälli	f1654fa7e3	anv/android: support creating images from external format Since we don't know the exact format at creation time, some initialization is done only when bound with memory in vkBindImageMemory. v2: demand dedicated allocation in vkGetImageMemoryRequirements2 if image has external format v3: refactor prepare_ahw_image, support vkBindImageMemory2, calculate stride correctly for rgb(x) surfaces, rename as 'resolve_ahw_image' v4: rebase to `b43f955037` changes v5: add some assertions to verify input correctness (Lionel) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Tapani Pälli	517103abf1	anv/android: add ahardwarebuffer external memory properties v2: have separate memory properties for android, set usage flags for buffers correctly v3: code cleanup (Jason) + limit maxArrayLayers to 1 for AHardwareBuffer based images v4: rebase to `b43f955037` changes Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Tapani Pälli	c79a528d2b	anv/android: support import/export of AHardwareBuffer objects v2: add support for non-image buffers (AHARDWAREBUFFER_FORMAT_BLOB) v3: properly handle usage bits when creating from image v4: refactor, code cleanup (Jason) v5: rebase to `b43f955037` changes, initialize bo flags as ANV_BO_EXTERNAL (Lionel) v6: add assert that anv_bo_cache_import succeeds, add comment about multi-bo support to clarify current implementation (Lionel) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Tapani Pälli	5c65c60d6c	anv: refactor, remove else block in AllocateMemory This makes it cleaner to introduce more cases where we import memory from different types of external memory buffers. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Tapani Pälli	884fc90fde	anv: add anv_ahw_usage_from_vk_usage helper function v2: rebase to `b43f955037` changes Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Tapani Pälli	1e6a44400a	anv/android: add GetAndroidHardwareBufferPropertiesANDROID Use the anv_format address in formats table as implementation-defined external format identifier for now. When adding YUV format support this might need to change. v2: code cleanup (Jason) v3: set anv_format address as identifier v4: setup suggestedYcbcrModel and suggested[X\|Y]ChromaOffset as expected for HAL_PIXEL_FORMAT_NV12_Y_TILED_INTEL v5: set linear tiling for GPU_DATA_BUFFER usage, add comment about multi-bo support to clarify current implementation (Lionel) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Tapani Pälli	aa94e01bfe	anv: add from/to helpers with android and vulkan formats v2: handle R8G8B8X8 as R8G8B8_UNORM (Jason) v3: add HAL_PIXEL_FORMAT_NV12_Y_TILED_INTEL, we make it define for now to avoid direct dependency to minigbm headers Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Tapani Pälli	c1f15a0a1a	anv: make anv_get_image_format_features public This will be utilized later by GetAndroidHardwareBufferPropertiesANDROID. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Tapani Pälli	8a469fd335	anv: refactor make_surface to use data from anv_image Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Tapani Pälli	2a98e5bbb9	anv: add create_flags as part of anv_image This will make it possible for next patch to rip anv_image_create_info out from make_surface function. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-19 09:38:41 +02:00
Ian Romanick	96c4b135e3	nir/algebraic: Don't put quotes around floating point literals The quotation marks around 1.0 cause it to be treated as a string instead of a floating point value. The generator then treats it as an arbitrary variable replacement, so any iand involving a ('ineg', ('b2i', a)) matches. v2: Remove misleading comment about sized literals (suggested by Timothy). Add assertion that the name of a variable is entirely alphabetic (suggested by Jason). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Timothy Arceri <tarceri@itsqueeze.com> [v1] Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> [v1] Fixes: `6bcd2af086` ("nir/algebraic: Add some optimizations for D3D-style Booleans") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109075	2018-12-18 23:28:31 -08:00
Vinson Lee	0f7ba5758b	meson: Fix libsensors detection. Fixes: `5e71efef44` ("meson: Add lmsensors support") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-18 19:24:01 -08:00
Vinson Lee	84f39e5971	meson: Fix typo. Fixes: `6b4c7047d5` ("meson: build gallium nine state_tracker") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-18 19:14:11 -08:00
Sagar Ghuge	933c44bcc4	nir: Add a new lowering option to lower 3D surfaces from txd to txl. Tested on gen9. v2: Rename lower_txd_3d_surafaces flag to lower_txd_3d (Jason Ekstrand) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-18 13:44:09 -08:00
Christian Gmeiner	7ea8e54dd6	meson: add etnaviv to the tools option Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-18 21:50:58 +01:00
Adam Jackson	e36d136102	specs: Bump GLX_MESA_query_renderer to version 9 Note that we have an official GL extension number, pick the appropriate section of the GLX spec to modify, and add changelog. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2018-12-18 15:46:10 -05:00
Adam Jackson	9e8332ebc2	specs: Remove GLX_RENDERER_ID_MESA from GLX_MESA_query_renderer This has not even had an attempt at implementation. If you asked for renderer 0 - which, the spec implies, should always work - then dri2_convert_glx_attribs would fail, we'd silently fall back to creating an indirect context, and xserver would also not recognize the attribute and would throw BadValue at you. The API would be difficult to use in any case, since there's no way to enumerate how many renderers the screen has. I'd be tempted to add that by defining: glXQueryRendererIntegerMESA(dpy, screen, /* renderer = */ -1, 0, &value); to return the number of renderers, but a new entrypoint might be cleaner. Still, better to not specify it at all than to lie about it. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2018-12-18 15:46:10 -05:00
Adam Jackson	c63c391756	specs: Remove GLES profile interaction text from GLX_MESA_query_renderer In one place we say, if GLES isn't supported then the profile version will be 0.0. Then later we say, if the GLES profile extension isn't supported then GLX_RENDERER_OPENGL_ES_PROFILE_VERSION_MESA is not mentioned in the spec. A strict reading of the latter would mean that GLX_RENDERER_OPENGL_ES_PROFILE_VERSION_MESA is not a recognized token, and the query should instead return False. The implementation does not check for the GLES profile extensions, and the additional complexity doesn't seem worth it. Removing the interaction text makes the spec match the implementation. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2018-12-18 15:46:10 -05:00
Eduardo Lima Mitev	5820e63418	freedreno/ir3: Make imageStore use num components from image format emit_intrinsic_store_image() is always using 4 components when collecting registers for the value. When the image has fewer than 4 components (e.g., r32f, rg32i, etc.) this results in extra mov instructions. This patch uses the actual number of components from the image format. For example, in a shader like: layout (r32f, binding=0) writeonly uniform imageBuffer u_image; ... void main(void) { ... imageStore (u_image, some_offset, vec4(1.0)); ... } instruction count is reduced by at least 3 instructions (note image format is r32f, 1 component only). This obviously reduces register pressure as well. v2: - Added support for image formats from NV_image_format extension (Ilia Mirkin). - Return 4 components by default instead of asserting. (Rob Clark). v3: Added more missing formats (Ilia Mirkin). v4: Added a debug message for unknown image formats (Rob Clark). Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-12-18 21:15:20 +01:00
Jason Ekstrand	5dad1abfdc	nir/dead_write_vars: Get modes directly from derefs Instead of going all the way back to the variable, just look at the deref. The modes are guaranteed to be the same by nir_validate whenever the variable can be found. This fixes clear_unused_for_modes for derefs that don't have an accessible variable. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-12-18 13:13:28 -06:00
Jason Ekstrand	fa40a58fd9	nir/copy_prop_vars: Get modes directly from derefs Instead of going all the way back to the variable, just look at the deref. The modes are guaranteed to be the same by nir_validate whenever the variable can be found. This fixes apply_barrier_for_modes for derefs that don't have an accessible variable. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-12-18 13:13:28 -06:00
Jason Ekstrand	cf7fb39805	nir/lower_wpos_center: Look at derefs for modes This is instead of looking all the way back to the variable which may not exist for all derefs. This makes this code properly ignore casts with modes other than the mode[s] we care about (where casts aren't allowed). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-12-18 13:13:28 -06:00
Jason Ekstrand	867fe35a16	nir/lower_io_to_scalar: Look at derefs for modes This is instead of looking all the way back to the variable which may not exist for all derefs. This makes this code properly ignore casts with modes other than the mode[s] we care about (where casts aren't allowed). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-12-18 13:13:28 -06:00
Jason Ekstrand	3fe0363dda	nir/lower_io_arrays_to_elements: Look at derefs for modes This is instead of looking all the way back to the variable which may not exist for all derefs. This makes this code properly ignore casts with modes other than the mode[s] we care about (where casts aren't allowed). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-12-18 13:13:28 -06:00
Jason Ekstrand	8cc0f92492	nir/linking_helpers: Look at derefs for modes This is instead of looking all the way back to the variable which may not exist for all derefs. This makes this code properly ignore casts with modes other than the mode[s] we care about (where casts aren't allowed). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-12-18 13:13:28 -06:00
Jason Ekstrand	8410cf66d7	nir/propagate_invariant: Skip unknown vars If we can't find the variable from the deref, just assume it isn't invariant and continue on. This can happen if, for instance, we're writing to a deref that points into an SSBO. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-12-18 13:13:28 -06:00
Ian Romanick	29e4b949b4	Revert "nir/lower_indirect: Bail early if modes == 0" "There's no point in walking the program if we're never going to actually lower anything." Except we might lower compacted local arrays. In that case, modes will be 0, but there is still lowering to be done. This reverts commit `7f75cf2a94`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109081 Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Clayton Craft <clayton.a.craft@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org>	2018-12-18 10:47:54 -08:00
Lucas Stach	433ca3127a	st/dri: replace format conversion functions with single mapping table Each time I have to touch the buffer import/export functions in the dri state tracker I get lost in the maze of functions converting between DRI_IMAGE_FOURCC, DRI_IMAGE_FORMAT, DRI_IMAGE_COMPONENTS and pipe format. Rip it out and replace by a single table, which defines the correspondence between the different representations. Also this now stores all the known representations in the __DRIimageRec, to avoid the loss of information we currently have when importing a buffer with a fourcc, which doesn't have a corresponding dri format. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-18 19:19:45 +01:00
Lucas Stach	67174d40f1	st/dri: allow both render and sampler compatible dma-buf formats Currently all the EGL APIs are missing a way to specify how an imported dma-buf is intended to be used. Demanding the format to be both usable for sampling and rendering artificially restricts the list of formats a driver is able to import. Looking at how the Intel driver implements those DRI2 image APIs it doesn't distinguish between render or sampler compatible formats. So this patch aligns behavior between Intel and Gallium based drivers. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-18 19:19:40 +01:00
Lucas Stach	a3e592e839	etnaviv: use surface format directly There is no need to do the detour over the resource behind the surface to get the format. Use the surface format directly. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>	2018-12-18 19:07:10 +01:00
Dylan Baker	7a90886921	meson: Add toggle for glx-direct GNU Hurd needs to turn off glx-direct, rather than special case it, we'll just add a toggle. CC: 18.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-12-18 09:20:53 -08:00
Dylan Baker	8c77f4c76d	meson: Add support for gnu hurd CC: 18.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-12-18 09:20:49 -08:00
Dylan Baker	6cf5f25bc5	meson: remove duplicate definition Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-12-18 09:18:12 -08:00
Dylan Baker	e430a034b9	meson: Fix ppc64 little endian detection Old versions of meson returned ppc64le as the cpu_family for little endian power8 cpus, versions >=0.48 don't do this, so the check wouldn't work in that case. This generalizes the check to work for both old and new versions of meson. Fixes: `34bbb24ce7` ("meson: Add support for ppc assembly/optimizations") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-12-18 09:17:54 -08:00
Jason Ekstrand	3feda3cf35	anv: Bump the patch version to 96 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-18 09:40:46 -06:00
Kenneth Graunke	3c71ba3baa	i965: Don't override subslice count to 4 on Gen11. Gen9-10 have fewer than 4 subslices per slice, so they need this to be rounded up. Gen11 isn't documented as needing this hack, and it can also have more than 4 subslices, so the hack actually can break things. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-12-17 14:03:45 -08:00
Ian Romanick	af07141b33	intel/compiler: More peephole_select for pre-Gen6 No shader-db changes on any Gen6+ platform. All of the shaders with cycles hurt by more than ~2% are from Master of Orion. All of the shaders have instructions helped. It looks like the pass enables some control flow to be converted to bcsels, then the scheduler does dumb things. These are new shaders (just added before doing this shader-db run), so there's probably some low-hanging fruit. Iron Lake total instructions in shared programs: 8214327 -> 8213684 (<.01%) instructions in affected programs: 84469 -> 83826 (-0.76%) helped: 114 HURT: 26 helped stats (abs) min: 2 max: 18 x̄: 7.75 x̃: 9 helped stats (rel) min: 0.17% max: 13.73% x̄: 2.52% x̃: 1.05% HURT stats (abs) min: 2 max: 20 x̄: 9.23 x̃: 8 HURT stats (rel) min: 0.70% max: 2.48% x̄: 1.66% x̃: 1.61% 95% mean confidence interval for instructions value: -5.87 -3.32 95% mean confidence interval for instructions %-change: -2.32% -1.17% Instructions are helped. total cycles in shared programs: 187736850 -> 187749314 (<.01%) cycles in affected programs: 506750 -> 519214 (2.46%) helped: 104 HURT: 36 helped stats (abs) min: 2 max: 72 x̄: 21.96 x̃: 16 helped stats (rel) min: 0.02% max: 6.16% x̄: 0.97% x̃: 0.63% HURT stats (abs) min: 4 max: 1402 x̄: 409.67 x̃: 40 HURT stats (rel) min: 0.33% max: 23.12% x̄: 5.79% x̃: 1.39% 95% mean confidence interval for cycles value: 28.32 149.74 95% mean confidence interval for cycles %-change: -0.07% 1.61% Inconclusive result (%-change mean confidence interval includes 0). 
GM45 total instructions in shared programs: 5044014 -> 5043652 (<.01%) instructions in affected programs: 46751 -> 46389 (-0.77%) helped: 63 HURT: 13 helped stats (abs) min: 2 max: 29 x̄: 7.65 x̃: 9 helped stats (rel) min: 0.17% max: 13.73% x̄: 2.93% x̃: 1.04% HURT stats (abs) min: 2 max: 20 x̄: 9.23 x̃: 8 HURT stats (rel) min: 0.66% max: 2.35% x̄: 1.58% x̃: 1.52% 95% mean confidence interval for instructions value: -6.54 -2.99 95% mean confidence interval for instructions %-change: -3.04% -1.28% Instructions are helped. total cycles in shared programs: 128143042 -> 128150188 (<.01%) cycles in affected programs: 324564 -> 331710 (2.20%) helped: 57 HURT: 19 helped stats (abs) min: 6 max: 74 x̄: 30.70 x̃: 32 helped stats (rel) min: 0.08% max: 4.74% x̄: 1.22% x̃: 0.81% HURT stats (abs) min: 10 max: 1400 x̄: 468.21 x̃: 60 HURT stats (rel) min: 0.56% max: 19.94% x̄: 5.80% x̃: 1.70% 95% mean confidence interval for cycles value: 6.90 181.15 95% mean confidence interval for cycles %-change: -0.52% 1.59% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-17 13:47:06 -08:00
Ian Romanick	378f996771	nir/opt_peephole_select: Don't peephole_select expensive math instructions On some GPUs, especially older Intel GPUs, some math instructions are very expensive. On those architectures, don't reduce flow control to a csel if one of the branches contains one of these expensive math instructions. This prevents a bunch of cycle count regressions on pre-Gen6 platforms with a later patch (intel/compiler: More peephole select for pre-Gen6). v2: Remove stray #if block. Noticed by Thomas. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-17 13:47:06 -08:00
Ian Romanick	8fb8ebfbb0	intel/compiler: More peephole select Shader-db results: The one shader hurt for instructions is a compute shader that had both spills and fills hurt. v2: Fix typo in comment noticed by Caio. v3: Fix inverted condition in brw_nir.c. Noticed by Lionel. Skylake, Broadwell, and Haswell had similar results. (Skylake shown) total instructions in shared programs: 15072761 -> 15047884 (-0.17%) instructions in affected programs: 895539 -> 870662 (-2.78%) helped: 3623 HURT: 1 helped stats (abs) min: 1 max: 181 x̄: 6.89 x̃: 4 helped stats (rel) min: 0.10% max: 25.00% x̄: 3.93% x̃: 3.20% HURT stats (abs) min: 92 max: 92 x̄: 92.00 x̃: 92 HURT stats (rel) min: 1.92% max: 1.92% x̄: 1.92% x̃: 1.92% 95% mean confidence interval for instructions value: -7.10 -6.63 95% mean confidence interval for instructions %-change: -4.03% -3.82% Instructions are helped. total cycles in shared programs: 369738930 -> 369535732 (-0.05%) cycles in affected programs: 68027851 -> 67824653 (-0.30%) helped: 2609 HURT: 1035 helped stats (abs) min: 1 max: 4508 x̄: 181.44 x̃: 77 helped stats (rel) min: <.01% max: 71.31% x̄: 9.14% x̃: 5.47% HURT stats (abs) min: 1 max: 33336 x̄: 261.04 x̃: 20 HURT stats (rel) min: <.01% max: 47.61% x̄: 2.93% x̃: 1.47% 95% mean confidence interval for cycles value: -96.43 -15.09 95% mean confidence interval for cycles %-change: -6.07% -5.36% Cycles are helped. 
total spills in shared programs: 10158 -> 10159 (<.01%) spills in affected programs: 166 -> 167 (0.60%) helped: 1 HURT: 1 total fills in shared programs: 22105 -> 22116 (0.05%) fills in affected programs: 837 -> 848 (1.31%) helped: 4 HURT: 1 Ivy Bridge total instructions in shared programs: 12021190 -> 11990256 (-0.26%) instructions in affected programs: 910561 -> 879627 (-3.40%) helped: 3344 HURT: 18 helped stats (abs) min: 1 max: 99 x̄: 9.29 x̃: 6 helped stats (rel) min: 0.11% max: 31.18% x̄: 5.19% x̃: 3.31% HURT stats (abs) min: 2 max: 20 x̄: 7.89 x̃: 6 HURT stats (rel) min: 0.70% max: 2.59% x̄: 1.63% x̃: 1.70% 95% mean confidence interval for instructions value: -9.49 -8.91 95% mean confidence interval for instructions %-change: -5.32% -4.98% Instructions are helped. total cycles in shared programs: 179077826 -> 178570196 (-0.28%) cycles in affected programs: 63205667 -> 62698037 (-0.80%) helped: 2767 HURT: 620 helped stats (abs) min: 1 max: 7531 x̄: 217.58 x̃: 88 helped stats (rel) min: <.01% max: 75.86% x̄: 9.59% x̃: 6.09% HURT stats (abs) min: 1 max: 31255 x̄: 152.27 x̃: 11 HURT stats (rel) min: <.01% max: 36.36% x̄: 2.77% x̃: 0.58% 95% mean confidence interval for cycles value: -173.94 -125.81 95% mean confidence interval for cycles %-change: -7.68% -6.97% Cycles are helped. Sandy Bridge total instructions in shared programs: 10852569 -> 10843758 (-0.08%) instructions in affected programs: 235803 -> 226992 (-3.74%) helped: 800 HURT: 0 helped stats (abs) min: 1 max: 88 x̄: 11.01 x̃: 8 helped stats (rel) min: 0.11% max: 23.08% x̄: 4.69% x̃: 3.36% 95% mean confidence interval for instructions value: -11.93 -10.10 95% mean confidence interval for instructions %-change: -4.99% -4.39% Instructions are helped. 
total cycles in shared programs: 154732047 -> 154608941 (-0.08%) cycles in affected programs: 4063110 -> 3940004 (-3.03%) helped: 606 HURT: 253 helped stats (abs) min: 1 max: 2524 x̄: 227.93 x̃: 62 helped stats (rel) min: 0.02% max: 39.24% x̄: 4.36% x̃: 1.81% HURT stats (abs) min: 1 max: 1966 x̄: 59.36 x̃: 11 HURT stats (rel) min: 0.02% max: 67.10% x̄: 3.22% x̃: 0.67% 95% mean confidence interval for cycles value: -170.49 -116.13 95% mean confidence interval for cycles %-change: -2.61% -1.65% Cycles are helped. No change on Iron Lake or GM45. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-17 13:47:06 -08:00
Ian Romanick	09b7e1d8e4	nir/opt_peephole_select: Don't try to remove flow control around indirect loads That flow control may be trying to avoid invalid loads. On at least some platforms, those loads can also be expensive. No shader-db changes on any Intel platform (even with the later patch "intel/compiler: More peephole select"). v2: Add a 'indirect_load_ok' flag to nir_opt_peephole_select. Suggested by Rob. See also the big comment in src/intel/compiler/brw_nir.c. v3: Use nir_deref_instr_has_indirect instead of deref_has_indirect (from nir_lower_io_arrays_to_elements.c). v4: Fix inverted condition in brw_nir.c. Noticed by Lionel. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-17 13:47:06 -08:00
Ian Romanick	4cd1a0be76	i965/vec4: Propagate conditional modifiers from more compares to other compares If there is a CMP.NZ that compares a single component (via a .zzzz swizzle, for example) with 0, it can propagate its conditional modifier back to a previous CMP that writes only that component. The specific case that I saw was: cmp.l.f0(8) g42<1>.xF g61<4>.xF (abs)g18<4>.zF ... cmp.nz.f0(8) null<1>D g42<4>.xD 0D In this case we can just delete the second CMP. No changes on Broadwell or Skylake because they do not use the vec4 backend. Also no changes on GM45 or Iron Lake. Sandy Bridge, Ivy Bridge, and Haswell had similar results. (Sandy Bridge shown) total instructions in shared programs: 10856676 -> 10852569 (-0.04%) instructions in affected programs: 228322 -> 224215 (-1.80%) helped: 1331 HURT: 0 helped stats (abs) min: 1 max: 7 x̄: 3.09 x̃: 4 helped stats (rel) min: 0.11% max: 6.67% x̄: 1.88% x̃: 1.83% 95% mean confidence interval for instructions value: -3.19 -2.99 95% mean confidence interval for instructions %-change: -1.93% -1.83% Instructions are helped. total cycles in shared programs: 154788865 -> 154732047 (-0.04%) cycles in affected programs: 2485892 -> 2429074 (-2.29%) helped: 1097 HURT: 59 helped stats (abs) min: 2 max: 168 x̄: 51.96 x̃: 64 helped stats (rel) min: 0.12% max: 12.70% x̄: 3.44% x̃: 2.22% HURT stats (abs) min: 2 max: 16 x̄: 3.02 x̃: 2 HURT stats (rel) min: 0.18% max: 0.83% x̄: 0.64% x̃: 0.71% 95% mean confidence interval for cycles value: -51.04 -47.26 95% mean confidence interval for cycles %-change: -3.40% -3.07% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-17 13:47:06 -08:00
Ian Romanick	9a83c3d3b3	i965/fs: Eliminate unary op on operand of compare-with-zero The (-abs(x) >= 0) => (x == 0) optimization is removed from the vec4 and scalar parts. In the VS part, adding the new pattern was not helpful. The pattern that is removed is really old, and it has been handled by NIR for ages. All Gen7+ platforms had similar results. (Broadwell shown) total instructions in shared programs: 14715715 -> 14715709 (<.01%) instructions in affected programs: 474 -> 468 (-1.27%) helped: 6 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.12% max: 1.35% x̄: 1.28% x̃: 1.35% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -1.40% -1.15% Instructions are helped. total cycles in shared programs: 559569911 -> 559569809 (<.01%) cycles in affected programs: 5963 -> 5861 (-1.71%) helped: 6 HURT: 0 helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17 helped stats (rel) min: 1.45% max: 1.88% x̄: 1.73% x̃: 1.85% 95% mean confidence interval for cycles value: -18.15 -15.85 95% mean confidence interval for cycles %-change: -1.95% -1.51% Cycles are helped. Iron Lake and Sandy Bridge had similar results. (Iron Lake shown) total instructions in shared programs: 7780915 -> 7780913 (<.01%) instructions in affected programs: 246 -> 244 (-0.81%) helped: 2 HURT: 0 total cycles in shared programs: 177876108 -> 177876106 (<.01%) cycles in affected programs: 3636 -> 3634 (-0.06%) helped: 1 HURT: 0 GM45 total instructions in shared programs: 4799152 -> 4799151 (<.01%) instructions in affected programs: 126 -> 125 (-0.79%) helped: 1 HURT: 0 total cycles in shared programs: 122052654 -> 122052652 (<.01%) cycles in affected programs: 3640 -> 3638 (-0.05%) helped: 1 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-17 13:47:06 -08:00
Ian Romanick	440c051340	i965/vec4/dce: Don't narrow the write mask if the flags are used In an instruction sequence like cmp(8).ge.f0.0 vgrf17:D, vgrf2.xxxx:D, vgrf9.xxxx:D (+f0.0) sel(8) vgrf1:UD, vgrf8.xyzw:UD, vgrf1.xyzw:UD The other fields of vgrf17 may be unused, but the CMP still needs to generate the other flag bits. To my surprise, nothing in shader-db or any test suite appears to hit this. However, I have a change to brw_vec4_cmod_propagation that creates cases where this can happen. This fix prevents a couple dozen regressions in that patch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `5df88c20` ("i965/vec4: Rewrite dead code elimination to use live in/out.")	2018-12-17 13:47:06 -08:00
Ian Romanick	111bcc8d02	i965/vec4: Silence unused parameter warnings in vec4 compiler tests src/intel/compiler/test_vec4_copy_propagation.cpp: In member function ‘virtual brw::dst_reg* copy_propagation_vec4_visitor::make_reg_for_system_value(int)’: src/intel/compiler/test_vec4_copy_propagation.cpp:57:51: warning: unused parameter ‘location’ [-Wunused-parameter] virtual dst_reg make_reg_for_system_value(int location) ^~~~~~~~ src/intel/compiler/test_vec4_copy_propagation.cpp: In member function ‘virtual void copy_propagation_vec4_visitor::emit_urb_write_header(int)’: src/intel/compiler/test_vec4_copy_propagation.cpp:77:43: warning: unused parameter ‘mrf’ [-Wunused-parameter] virtual void emit_urb_write_header(int mrf) ^~~ src/intel/compiler/test_vec4_copy_propagation.cpp: In member function ‘virtual brw::vec4_instruction copy_propagation_vec4_visitor::emit_urb_write_opcode(bool)’: src/intel/compiler/test_vec4_copy_propagation.cpp:82:57: warning: unused parameter ‘complete’ [-Wunused-parameter] virtual vec4_instruction emit_urb_write_opcode(bool complete) ^~~~~~~~ src/intel/compiler/test_vec4_register_coalesce.cpp: In member function ‘virtual brw::dst_reg register_coalesce_vec4_visitor::make_reg_for_system_value(int)’: src/intel/compiler/test_vec4_register_coalesce.cpp:60:51: warning: unused parameter ‘location’ [-Wunused-parameter] virtual dst_reg make_reg_for_system_value(int location) ^~~~~~~~ src/intel/compiler/test_vec4_register_coalesce.cpp: In member function ‘virtual void register_coalesce_vec4_visitor::emit_urb_write_header(int)’: src/intel/compiler/test_vec4_register_coalesce.cpp:80:43: warning: unused parameter ‘mrf’ [-Wunused-parameter] virtual void emit_urb_write_header(int mrf) ^~~ src/intel/compiler/test_vec4_register_coalesce.cpp: In member function ‘virtual brw::vec4_instruction register_coalesce_vec4_visitor::emit_urb_write_opcode(bool)’: src/intel/compiler/test_vec4_register_coalesce.cpp:85:57: warning: unused parameter ‘complete’ 
[-Wunused-parameter] virtual vec4_instruction emit_urb_write_opcode(bool complete) ^~~~~~~~ src/intel/compiler/test_vec4_cmod_propagation.cpp: In member function ‘virtual brw::dst_reg cmod_propagation_vec4_visitor::make_reg_for_system_value(int)’: src/intel/compiler/test_vec4_cmod_propagation.cpp:60:51: warning: unused parameter ‘location’ [-Wunused-parameter] virtual dst_reg make_reg_for_system_value(int location) ^~~~~~~~ src/intel/compiler/test_vec4_cmod_propagation.cpp: In member function ‘virtual void cmod_propagation_vec4_visitor::emit_urb_write_header(int)’: src/intel/compiler/test_vec4_cmod_propagation.cpp:85:43: warning: unused parameter ‘mrf’ [-Wunused-parameter] virtual void emit_urb_write_header(int mrf) ^~~ src/intel/compiler/test_vec4_cmod_propagation.cpp: In member function ‘virtual brw::vec4_instruction cmod_propagation_vec4_visitor::emit_urb_write_opcode(bool)’: src/intel/compiler/test_vec4_cmod_propagation.cpp:90:57: warning: unused parameter ‘complete’ [-Wunused-parameter] virtual vec4_instruction *emit_urb_write_opcode(bool complete) ^~~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-17 13:47:06 -08:00
Bas Nieuwenhuizen	f67dea5e19	radv: Fix multiview depth clears We were not using the view mask for depth clears, causing only the first view to be cleared. Fixes: `2e86f6b259` "radv: Add multiview clears." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-17 20:16:26 +00:00
Bas Nieuwenhuizen	9add63a3a5	radv: Remove redundant format check. The switch directly after the check has a default case that returns NULL too, so the effective return value is not changed. Also this check is wrong once we start dealing with formats introduced by an extension (e.g. YUV formats). Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-17 20:09:38 +00:00
Eric Anholt	708d8f4d0a	nir: Fix clamping of uints for image store lowering. I botched some copy-and-paste and clamped to signed int max instead of uint max. Fixes KHR-GL46.shader_image_load_store.multiple-uniforms on skl. Fixes: `d3e046e76c` ("nir: Pull some of intel's image load/store format conversion to nir_format.h") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-17 20:02:22 +00:00
Eric Anholt	00e2cbc049	v3d: Fix the argument type for vir_BRANCH(). Apparently this has been spewing warnings for Jason's clang, but not my gcc.	2018-12-17 09:52:23 -08:00
Eric Anholt	376054fff3	vc4: Reuse nir_format_convert.h in our blend lowering. These helpers came along after and have effectively the same implementation.	2018-12-17 09:52:23 -08:00
Samuel Pitoiset	445867c80d	radv: report Vulkan version 1.1.90 for real I thought the value was correctly propagated, but actually not. Fixes: `2ac6d55f38` ("radv: bump reported version to 1.1.90") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-17 17:51:48 +01:00
Jason Ekstrand	cae373117c	anv,radv: Re-enable VK_EXT_pci_bus_info Now at version 2 with the fixed header. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-17 10:42:35 -06:00
Jason Ekstrand	e5b59fe6f5	vulkan: Update the XML and headers to 1.1.96 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-17 10:41:56 -06:00
Rhys Perry	ef198e8c6a	radv: switch from nir_bcsel to nir_b32csel Fixes: `191a1dce92` ('nir: Add 1-bit Boolean opcodes') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-17 14:52:39 +00:00
Rhys Perry	bba94a3d85	radv: don't set surf_index for stencil-only images Fixes: `f8d5b377c8` ('radv: set cb base tile swizzles for MRT speedups (v4)') Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108116 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-17 14:52:10 +00:00
Ian Romanick	9dc135efa1	nir: Release per-block metadata in nir_sweep nir_sweep already marks all metadata invalid, so it is safe to release the memory here too. mean soft fp64 using uint64: 1,342,759,331 => 1,010,670,475 gfxbench5 aztec ruins high 11: 63,555,571 => 61,889,811 deus ex mankind divided 148: 62,845,304 => 62,829,640 deus ex mankind divided 2890: 71,922,686 => 71,922,686 dirt showdown 676: 69,238,607 => 69,238,607 dolphin ubershaders 210: 77,822,072 => 77,822,072 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-16 14:39:56 -08:00
Ian Romanick	7adafd6e1c	nir: Fix holes in nir_instr Found using pahole. Changes in peak memory usage according to Valgrind massif: mean soft fp64 using uint64: 1,343,991,403 => 1,342,759,331 gfxbench5 aztec ruins high 11: 63,619,971 => 63,555,571 deus ex mankind divided 148: 62,887,728 => 62,845,304 deus ex mankind divided 2890: 72,399,750 => 71,922,686 dirt showdown 676: 69,464,023 => 69,238,607 dolphin ubershaders 210: 78,359,728 => 77,822,072 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-16 14:39:56 -08:00
Ian Romanick	8161a87b24	nir/phi_builder: Use per-value hash table to store [block] -> def mapping Replace the old array in each value with a hash table in each value. Changes in peak memory usage according to Valgrind massif: mean soft fp64 using uint64: 5,499,875,082 => 1,343,991,403 gfxbench5 aztec ruins high 11: 63,619,971 => 63,619,971 deus ex mankind divided 148: 62,887,728 => 62,887,728 deus ex mankind divided 2890: 72,402,222 => 72,399,750 dirt showdown 676: 74,466,431 => 69,464,023 dolphin ubershaders 210: 109,630,376 => 78,359,728 Run-time change for a full run on shader-db on my Haswell desktop (with -march=native) is 1.22245% +/- 0.463879% (n=11). This is about +2.9 seconds on a 237 second run. The first time I sent this version of this patch out, the run-time data was quite different. I had misconfigured the script that ran the test, and none of the tests from higher GLSL versions were run. These are generally more complex shaders, and they are more affected by this change. The previous version of this patch used a single hash table for the whole phi builder. The mapping was from [value, block] -> def, so a separate allocation was needed for each [value, block] tuple. There was quite a bit of per-allocation overhead (due to ralloc), so the patch was followed by a patch that added the use of the slab allocator. The results of those two patches was not quite as good: mean soft fp64 using uint64: 5,499,875,082 => 1,343,991,403 gfxbench5 aztec ruins high 11: 63,619,971 => 63,619,971 deus ex mankind divided 148: 62,887,728 => 62,887,728 deus ex mankind divided 2890: 72,402,222 => 72,402,222 * dirt showdown 676: 74,466,431 => 72,443,591 * dolphin ubershaders 210: 109,630,376 => 81,034,320 * The * denote tests that are better now. In the tests that are the same in both patches, the "after" peak memory usage was at a different location. I did not check the local peaks. 
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-16 14:39:56 -08:00
Ian Romanick	e3043e1276	util/hash_table: Add _mesa_hash_table_init function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-16 14:39:56 -08:00
Jason Ekstrand	db197fdb6c	st/nir: Use nir_src_as_uint for tokens Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-16 15:07:28 -06:00
Jason Ekstrand	47e1e0692c	radv: Fix a stupid if in gather_intrinsic_info Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 15:06:07 -06:00
Jason Ekstrand	6bcd2af086	nir/algebraic: Add some optimizations for D3D-style Booleans D3D Booleans use a 32-bit 0/-1 representation. Because this previously matched NIR exactly, we didn't have to really optimize for it. Now that we have 1-bit Booleans, we need some specific optimizations to chew through the D3D12-style Booleans. Shader-db results on Kaby Lake: total instructions in shared programs: 15136811 -> 14967944 (-1.12%) instructions in affected programs: 2457021 -> 2288154 (-6.87%) helped: 8318 HURT: 10 total cycles in shared programs: 373544524 -> 359701825 (-3.71%) cycles in affected programs: 151029683 -> 137186984 (-9.17%) helped: 7749 HURT: 682 total loops in shared programs: 4431 -> 4399 (-0.72%) loops in affected programs: 32 -> 0 helped: 21 HURT: 0 total spills in shared programs: 10290 -> 10051 (-2.32%) spills in affected programs: 2532 -> 2293 (-9.44%) helped: 18 HURT: 18 total fills in shared programs: 22203 -> 21732 (-2.12%) fills in affected programs: 3319 -> 2848 (-14.19%) helped: 18 HURT: 18 Note that a large chunk of the improvement is fixing regressions caused by switching to 1-bit Booleans. Previously, our ability to optimize D3D booleans was improved by using the D3D representation directly in NIR. Now that NIR does 1-bit bools, we need a few more optimizations. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	3b30814791	nir/algebraic: Optimize 1-bit Booleans Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	44227453ec	nir: Switch to using 1-bit Booleans for almost everything This is a squash of a few distinct changes: glsl,spirv: Generate 1-bit Booleans Revert "Use 32-bit opcodes in the NIR producers and optimizations" Revert "nir/builder: Generate 32-bit bool opcodes transparently" nir/builder: Generate 1-bit Booleans in nir_build_imm_bool Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	11dc130779	nir: Add a bool to int32 lowering pass We also enable it in all of the NIR drivers. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	191a1dce92	nir: Add 1-bit Boolean opcodes We also have to add support for 1-bit integers while we're here so we get 1-bit variants of iand, ior, and inot. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	615cc26b97	nir/algebraic: Generalize an optimization This just makes it nicely scale across bit sizes. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	487514ae61	nir/large_constants: Properly handle 1-bit bools Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	3191a82372	nir: Add support for 1-bit data types This commit adds support for 1-bit Booleans and integers. Booleans obviously take a value of true or false. Because we have to define the semantics of 1-bit signed and unsigned integers, we define uint1_t to take values of 0 and 1 and int1_t to take values of 0 and -1. 1-bit arithmetic is then well-defined in the usual way, just with fewer bits. The definition of int1_t and uint1_t doesn't usually matter but we do need something for purposes of constant folding. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	2fe8708ffd	nir/constant_expressions: Rework Boolean handling This commit contains three related changes. First, we define boolN_t for N = 8, 16, and 64 and move the definition of boolN_vec to the loop with the other vec definitions. Second, there's no reason why we need the != 0 on the source because that happens implicitly when it's converted to bool. Third, for destinations, we use a signed integer type and just do -(int)bool_val which will give us the 0/-1 behavior we want and neatly scales to all bit widths. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	80e8dfe9de	nir: Rename Boolean-related opcodes to include 32 in the name This is a squash of a bunch of individual changes: nir/builder: Generate 32-bit bool opcodes transparently nir/algebraic: Remap Boolean opcodes to the 32-bit variant Use 32-bit opcodes in the NIR producers and optimizations Generated with a little hand-editing and the following sed commands: sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c Use 32-bit opcodes in the NIR back-ends Generated with a little hand-editing and the following sed commands: sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	b569093566	nir/algebraic: Make an optimization more specific Later in this series, bool is not going to imply 32-bit. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	517099809a	nir: Drop support for lower_b2f This was originally added for the out-of-tree Mali driver but I think we've all agreed it's easy enough for them to just do in their back-end. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	4bb1a34727	nir/algebraic: Optimize x2b(xneg(a)) -> a Shader-db results on Kaby Lake: total instructions in shared programs: 15072525 -> 15072525 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 This helps prevent regressions in later commits. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	3595a0abf4	nir/constant_folding: Fix source bit size logic Instead of looking at input_sizes[i] which contains the number of components for each source, we look at the bit size of input_types[i]. This fixes a regression in the 1-bit boolean series though I have no idea how we haven't seen it before now. Fixes: `35baee5dce` "nir/constant_folding: fix incorrect bit-size check" Fixes: `9076c4e289` "nir: update opcode definitions for different bit sizes" Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	9f7bd843af	nir/tgsi: Use nir_bany in ttn_kill_if Reviewed-by: Eric Anholt <eric@anholt.net>	2018-12-16 21:03:02 +00:00
Jason Ekstrand	e17426058c	nir/lower_idiv: Use ilt instead of bit twiddling The previous code was creating a boolean by doing an arithmetic right- shift by 31 which produces a boolean which is true if the argument is negative. This is the same as the expression r < 0 which is much simpler and doesn't depend on NIR's representation of booleans. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-12-16 21:03:02 +00:00
Eric Anholt	2977c77758	v3d: Use the original bit size when scalarizing uniform loads. Prevents a regression in jekstrand's 1-bit series. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-16 21:03:01 +00:00
Eric Anholt	91a0251dbc	vc4: Use the original bit size when scalarizing uniform loads. Prevents a regression in jekstrand's 1-bit series. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-16 21:03:01 +00:00
Rhys Perry	bde9f482de	ac: split 16-bit ssbo loads that may not be dword aligned Fixes: `7e7ee82698` ('ac: add support for 16bit buffer loads') Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108114 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-16 14:56:10 +00:00
Rhys Perry	12dc7cb202	ac: refactor visit_load_buffer This is so that we can split different types of loads more easily. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-16 14:56:10 +00:00
Rhys Perry	ed4020fabe	nir: fix constness in nir_intrinsic_align() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-16 14:56:10 +00:00
Jan Vesely	e4f9a37ace	clover: Fix build after clang r348827 CodeGenOptions were moved to Basic. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> Tested-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org> CC: mesa-stable@lists.freedesktop.org	2018-12-16 06:38:10 -05:00
Jon Turney	d512b35b62	glx: Fix compilation with GLX_USE_WINDOWSGL Sadly, the GLX_USE_APPLEGL and GLX_USE_WINDOWSGL cases are not identical (because GLX_USE_WINDOWSGL uses vtables rather than a maze of ifdefs) Include <sys/time.h> again, as functions prototyped by it are used in the GLX_USE_WINDOWSGL path. Make the include guard around the __glxGetMscRate() definition match the one at its declaration again, as it's referenced from dri_common.c which is built for GLX_USE_WINDOWSGL. Fixes: `a95ec138` ("glx: mandate xf86vidmode only for "drm" dri platforms") Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-15 13:49:24 +00:00
Eric Anholt	29927e7524	v3d: Drop in a bunch of notes about performance improvement opportunities. These have all been floating in my head, and while I've thought about encoding them in issues on gitlab once they're enabled, they also make sense to just have in the area of the code you'll need to work in.	2018-12-14 17:48:01 -08:00
Eric Anholt	248a7fb392	v3d: Do uniform pretty-printing in the QPU dump. If you're trying to trace what's going on in a QPU dump, this will definitely help you find your way.	2018-12-14 17:48:01 -08:00
Eric Anholt	a370ed76ab	v3d: Use the uniform pretty-printer in v3d_write_uniforms()'s debug code. This will be a lot easier than my usual "38400.000000? that looks like a viewport scale" decoding strategy.	2018-12-14 17:48:01 -08:00
Eric Anholt	532b6c5671	v3d: Move uniform pretty-printing to its own helper function. I want to reuse it in the QPU dump.	2018-12-14 17:48:01 -08:00
Eric Anholt	78ef05bde4	v3d: Move uinfo->data[] dereference to the top of v3d_write_uniforms(). Follows `3954331aff` ("vc4: Pull uinfo->data[i] dereference out to the top of the loop.") which showed a large performance win for vc4, but also cleans up the code a decent bit.	2018-12-14 17:48:01 -08:00
Eric Anholt	a7e15a5086	v3d: Avoid assertion failures when removing end-of-shader instructions. After generating VIR, we leave c->cursor pointing at the end of the shader. If the shader had dead code at the end (for example from preamble instructions in a shader with no side effects), we would assertion fail that we were leaving the cursor pointing at freed memory. Since anything following DCE should be setting up a new cursor anyway, just clear the cursor at the start.	2018-12-14 17:48:01 -08:00
Eric Anholt	5b2cc03852	v3d: Add support for draw indirect for GLES3.1. In trying to enable compute shaders, I found that a bunch of deqp-gles31's compute stuff wanted to interact with indirect dispatch. This was easy to do on its own.	2018-12-14 17:48:01 -08:00
Eric Anholt	ff80e58b38	v3d: Add missing flagging of SYNCB as a TSY op. Fixes: `f2e41daac5` ("broadcom/vc5: Update QPU instruction pack/unpack for v4.2.")	2018-12-14 17:48:01 -08:00
Eric Anholt	3f9bcf9136	v3d: Make sure that a thrsw doesn't split a multop from its umul24. The thrsw will invalidate rtop, just like accumulators and flags. Caught by simulator assertions in CS imulextended/umulextended tests. Fixes: `90269ba353` ("broadcom/vc5: Use THRSW to enable multi-threaded shaders.")	2018-12-14 17:48:01 -08:00
Eric Anholt	332a5cf6a5	v3d: Add safety checks for resource_create(). This should ease my debugging next time I screw it up.	2018-12-14 17:48:01 -08:00
Eric Anholt	6ad9e8690d	v3d: Add support for texturing from linear. Just like vc4, we have to support linear shared BOs for X11 on arbitrary displays. When we're faced with a request to texture from one of those, make a shadow image that we copy using the TFU at the start of the draw call.	2018-12-14 17:48:01 -08:00
Eric Anholt	976ea90bdc	v3d: Add support for using the TFU to do some blits. This will be useful in particular for blits from raster to UIF for X11.	2018-12-14 17:48:01 -08:00
Eric Anholt	e5b4d1f55f	v3d: Don't forget to bump the number of writes when doing TFU ops. generatemipmap is just filling out the rest of the mipmap that's already been written (by a mapping or a draw call), so it didn't matter. As I reuse the TFU code for linear-to-UIF conversions, it'll start mattering.	2018-12-14 17:48:01 -08:00
Eric Anholt	485df2574e	v3d: Set up the right stride for raster TFU. I didn't have any raster images in the generatemipmap path, so the pixels-vs-bytes mixup didn't matter here.	2018-12-14 17:48:01 -08:00
Eric Anholt	e731d53716	v3d: Don't forget to wait for our TFU job before rendering from it. Otherwise we may race to read old contents. This didn't show up in the CTS and piglit for me, but it did once I started using the TFU to do linear->UIF blits for X11. Fixes: `2ebca177dc` ("v3d: Use the TFU to do generatemipmap.")	2018-12-14 17:48:01 -08:00
Ilia Mirkin	153d3fc5f9	nvc0: always keep TSC slot 0 bound to fix TXF Same as on nv50, the TXF op always uses the TSC bound to slot 0, returning blank values if nothing is bound. An earlier change arranges for the TSC entries list to always have valid data at entry 0, so here we just make use of it. Fixes arb_texture_buffer_object-subdata-sync among others. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-14 20:01:31 -05:00
Ilia Mirkin	4aeaf89aa7	nvc0: replace use of explicit default_tsc with entry 0 This was used for implementing FBFETCH. However that uses TXF, which doesn't do much with a TSC. The only important bit is that sRGB-decoding works as expected, which we can achieve since all samplers we ever generate enable sRGB-decoding. Always point to entry 0 in the TSC table, and ensure that even before it ever gets initialized, the sRGB-decoding enable bit is set. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-14 20:01:31 -05:00
Rob Clark	5f9085638a	freedreno/a6xx: fix corrupted uniforms For older gen's fd_wfi() is used to conditionally insert a WFI if there hasn't already been one since last draw. But this doesn't work out well with stateobj since the order the stateobj is evaluated might not be what you expect. (Ie. stateobj might not be evaluated until a later draw if there is no geometry from the current draw in a given tile.) Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-14 15:01:30 -05:00
Alex Deucher	4db4b3447d	pci_ids: add new vega20 pci id Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: mesa-stable@lists.freedesktop.org	2018-12-14 14:48:39 -05:00
Alex Deucher	56cf25a114	pci_ids: add new vega10 pci ids Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: mesa-stable@lists.freedesktop.org	2018-12-14 14:48:18 -05:00
Rafael Antognolli	5c454661c6	i965/gen9: Add workarounds for object preemption. Gen9 hardware requires some workarounds to disable preemption depending on the type of primitive being emitted. We implement this by adding a function that checks the primitive type and number of instances right before the 3DPRIMITIVE. For now, we just ignore blorp. The only primitive it emits is 3DPRIM_RECTLIST, and since it's not listed in the workarounds, we can safely leave preemption enabled when it happens. Or it will be disabled by a previous 3DPRIMITIVE, which should be fine too. v3: - Apply missing workarounds for instanced rendering and line loop (Ken) - Move workaround code to brw_draw_single_prim() Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-12-14 09:40:27 -08:00
Rafael Antognolli	d8b50e152a	i965/gen10+: Enable object level preemption. Set bit when initializing context. v3: - Always toggle preemption bool to false before enabling it for the first time, so the state gets emitted (Chris Wilson). - Emit end of pipe sync with PIPE_CONTROL_RENDER_TARGET_FLUSH (Ken) Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-12-14 09:40:27 -08:00
Rafael Antognolli	019a92ffa4	intel/genxml: Add register for object preemption. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-12-14 09:40:27 -08:00
Ian Romanick	a6b7d1151c	util/slab: Rename slab_mempool typed parameters to mempool Now everything with type 'struct slab_child_pool *' is named pool, and everything with type 'struct slab_mempool *' is named mempool. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-14 07:36:05 -08:00
Ian Romanick	ba5402ec9a	nir/phi_builder: Internal users should use nir_phi_builder_value_set_block_def too Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-14 07:36:05 -08:00
Christian Gmeiner	489ffaf0c1	etnaviv: drop redundant ctx function parameter There is no need to have an extra ctx parameter as all the other parameters carry all the needed information. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2018-12-14 11:23:00 +01:00
Kenneth Graunke	0b44644ca6	genxml: Consistently use a numeric "MOCS" field When we first started using genxml, we decided to represent MOCS as an actual structure, and pack values. However, in many places, it was more convenient to use a numeric value rather than treating it as a struct, so we added secondary setters in a bunch of places as well. We were not entirely consistent, either. Some places only had one. Gen6 had both kinds of setters for STATE_BASE_ADDRESS, but newer gens only had the struct-based setters. The names were sometimes "Constant Buffer Object Control State" instead of "Memory", making it harder to find. Many had prefixes like "Vertex Buffer MOCS"...in a vertex buffer packet...which is a bit redundant. On modern hardware, MOCS is simply an index into a table, but we were still carrying around the structure with an "Index to MOCS Table" field, in addition to the direct numeric setters. This is clunky - we really just want a number on new hardware. This patch eliminates the struct-based setters, and makes the numeric setters be consistently called "MOCS". We leave the struct definition around on Gen7-8 for reference purposes, but it is unused. v2: Drop bonus "Depth Buffer MOCS" fields on Gen7.5 and Gen9 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2018-12-14 00:44:54 -08:00
Timothy Arceri	a2ec78883f	nir: fix opt_if_loop_last_continue() The pass did not correctly handle loops ending in: if ssa_7 { block block_8: /* preds: block_7 */ continue /* succs: block_1 */ } else { block block_9: /* preds: block_7 */ break /* succs: block_11 */ } The break will get eliminated by another opt but if this pass gets called first (as it does on RADV) we ended up inserting instructions after the break. Fixes: `5921a19d4b` ("nir: add if opt opt_if_loop_last_continue()") Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-12-14 17:21:35 +11:00
Rob Clark	0ac5acaeaa	freedreno/a6xx: fix resource_copy_region() pctx->resource_copy_region() needs to fall back to sw copy for non-renderable formats. But previously, anything we could not use the blitter for would fall back to 3d, which won't work if 3d can't render to the dst format either. Instead, rework things to fall back to fd_resource_copy_region(), which will try 3d core and then fall back to memcpy(). Fixes (for example) dEQP-GLES3.functional.texture.format.sized.2d.rgb9_e5_pot Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-13 15:51:01 -05:00
Rob Clark	4ec2f6129b	freedreno: move fd_resource_copy_region() Code-motion prep for next patch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-13 15:51:01 -05:00
Rob Clark	57b76ee2a8	freedreno/a6xx: more blitter fixes Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-13 15:51:01 -05:00
Rob Clark	d15fc787bc	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-13 15:51:01 -05:00
Rob Clark	532f8c0043	gallium/aux: add is_unorm() helper We already had one for is_snorm() but not unorm. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-13 15:51:01 -05:00
Rob Clark	85cd4df47f	freedreno/a6xx: fix blitter crash Fixes a crash with unsupported formats in dEQP-GLES3.functional.texture.format.sized.2d.rgb9_e5_pot Also fixes gpu hangs with some formats that are supported, but which we don't know what internal-format to use for the blitter, for ex dEQP-GLES3.functional.texture.format.sized.2d_array.rgb10_a2_pot Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-13 15:51:01 -05:00
Rob Clark	cca1e9606c	freedreno/ir3: don't remove unused input components Fixes: `0d240c2214` freedreno/ir3: don't fetch unused tex components Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-13 15:51:01 -05:00
Rob Clark	c19c4bf488	freedreno/ir3: fix crash Fixes a crash in dEQP-GLES3.functional.shaders.fragdepth.compare.fragcoord_z Fixes: `0d240c2214` freedreno/ir3: don't fetch unused tex components Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-13 15:51:01 -05:00
Rob Clark	3e8e033f4c	freedreno: also set DUMP flag on shaders If we emit shader as a pointer to a GEM object, also set the RELOC_DUMP flag as a hint to kernel that this is a useful buffer to snapshot for debug dumps. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-13 15:51:01 -05:00
Rob Clark	4cd016b5d6	freedreno: debug GEM obj names With a recent enough kernel, set debug names for GEM BOs, which will show up in $debugfs/gem Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-13 15:51:01 -05:00
Rob Clark	7ef722861b	freedreno/drm: sync uapi and enable softpin Pull in updated UAPI and use kernel API version to enable softpin. Since MSM_SUBMIT_BO_DUMP flag was added at same time, use that to signal to kernel that cmdstream buffers are useful to dump for debugging/cmdstream-traces. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-13 15:51:01 -05:00
Eric Anholt	4407e688cd	nir: Move intel's half-float image store lowering to nir_format.h. I needed the same function for v3d. This was originally in `d3e046e76c` ("nir: Pull some of intel's image load/store format conversion to nir_format.h") before we made a mistake about simplifying the function. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-13 12:24:26 -08:00
Eric Anholt	3a417a044e	Revert "intel: Simplify the half-float packing in image load/store lowering." This reverts commit `06fbcd2cd5`. nir_pack_half_2x16_split isn't vectorizable, it's 1-component only, thus why we had this split-scalar code in the first place. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-13 12:24:24 -08:00
Eric Anholt	c2c44dba7a	nir: Print the format of image variables. This helps a lot when debugging image load/store lowering on large testcases. Unfortunately the Mesa enum name stuff is under src/mesa and we can't get at it from the compiler. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-13 12:24:12 -08:00
Eric Anholt	19ffcba161	mesa/st: Expose compute shaders when NIR support is advertised. We have a NIR path, and V3D doesn't have TGSI input for compute (only what TTN can handle for the various gallium-internal shaders). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-12-13 11:44:47 -08:00
Dave Airlie	b3f2b03ece	radv/xfb: fix counter buffer bounds checks. If we gave this function 0 counter buffers, we'd still try and access pCounterBuffers[0] as this check was incorrect. Fixes crash with ext_transform_feedback-pipeline-basic-primgen on zink on radv. Fixes: `677b496b6` (radv: fix begin/end transform feedback with 0 counter buffers.) Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-13 19:27:05 +00:00
Jason Ekstrand	9ebc00f32e	i965: Enable nir_opt_idiv_const for 32 and 64-bit integers The pass should work for all bit sizes but it's less clear that the extra instructions are worth it on small integers. Also, the hardware doesn't do mul_high on anything other than 32-bit integers and, absent any decent mechanism for testing the pass on 8 and 16-bit types, it's probably best to just leave it disabled for now. Shader-db results on Sky Lake: total instructions in shared programs: 15105795 -> 15111403 (0.04%) instructions in affected programs: 72774 -> 78382 (7.71%) helped: 0 HURT: 265 Note that hurt here actually means helped because we're getting rid of integer quotient operations (which are a send on some platforms!) and replacing them with fairly cheap ALU ops. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-13 17:49:48 +00:00
Jason Ekstrand	455ec7327d	i965/vec4: Implement nir_op_uadd_sat Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-13 17:49:48 +00:00
Ian Romanick	e639d39faf	i965/fs: Implement nir_op_uadd_sat Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-13 17:49:48 +00:00
Jason Ekstrand	74492ebad9	nir: Add a pass for lowering integer division by constants It's a reasonably well-known fact in the world of compilers that integer divisions by constants can be replaced by a multiply, an add, and some shifts. This commit adds such an optimization to NIR for the easiest case of udiv. Other division operations will be added in following commits. In order to provide some additional driver control, the pass takes a minimum bit size to optimize. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-13 17:49:48 +00:00
Ian Romanick	090e282407	nir: Add a saturated unsigned integer add opcode Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-13 17:49:48 +00:00
Jason Ekstrand	39198a1238	nir/lower_int64: Add support for [iu]mul_high Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-13 17:49:48 +00:00
Jason Ekstrand	9525971e2b	nir: Allow [iu]mul_high on non-32-bit types Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-13 17:49:48 +00:00
Emil Velikov	a95ec13879	glx: mandate xf86vidmode only for "drm" dri platforms Currently we have the three dri "platforms" - drm, apple and windows. Since xf86vidmode is a thing only for the drm one, adjust the preprocessor guards and correctly check for the dependency. v2: terminate the GLX_USE_WINDOWSGL hunk Cc: Jon TURNEY <jon.turney@dronecode.org.uk> Fixes: `5bc509363b` ("glx: make xf86vidmode mandatory for direct rendering") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-12-13 17:38:19 +00:00
Alejandro Piñeiro	c7bdcd67aa	nir: remove unused variable To avoid the following warning: ./src/compiler/nir/nir_loop_analyze.c:807:16: warning: unused variable ‘ns’ [-Wunused-variable] nir_shader *ns = impl->function->shader; Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-13 16:35:21 +01:00
Erik Faye-Lund	e888f28d1f	virgl: work around bad assumptions in virglrenderer Virglrenderer does the wrong thing when given an instance divisor; it tries to use the element-index rather than the binding-index as the argument to glVertexBindingDivisor(). This worked fine as long as there was a 1:1 relationship between elements and bindings, which was the case until `19a91841c3` "st/mesa: Use Array._DrawVAO in st_atom_array.c.". So let's detect instance divisors, and restore a 1:1 relationship in that case. This will make old versions of virglrenderer behave correctly. For newer versions, we can consider making a better interface, where the instance divisor isn't specified per element, but rather per binding. But let's save that for another day. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `19a91841c3` "st/mesa: Use Array._DrawVAO in st_atom_array.c." Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-By: Gert Wollny <gert.wollny@collabora.com>	2018-12-13 16:12:10 +01:00
Erik Faye-Lund	8447b64238	virgl: wrap vertex element state in a struct This just has one member for now; the handle. But this is about to change. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-By: Gert Wollny <gert.wollny@collabora.com>	2018-12-13 16:12:10 +01:00
Erik Faye-Lund	b702ff5378	virgl: simplify virgl_hw_set_index_buffer Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-By: Gert Wollny <gert.wollny@collabora.com>	2018-12-13 16:12:10 +01:00
Erik Faye-Lund	00143a6241	virgl: simplify virgl_hw_set_vertex_buffers Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Tested-By: Gert Wollny <gert.wollny@collabora.com>	2018-12-13 16:12:10 +01:00
Juan A. Suarez Romero	0991085f66	docs: update calendar, add news item and link release notes for 18.2.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-12-13 15:45:20 +01:00
Juan A. Suarez Romero	e0b0995dcf	docs: add sha256 checksums for 18.2.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `e90429cc6d`)	2018-12-13 15:42:49 +01:00
Juan A. Suarez Romero	c8a17b45ea	docs: add release notes for 18.2.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `419ee20097`)	2018-12-13 15:42:46 +01:00
Samuel Pitoiset	5088ba2aeb	radv: don't check if format is depth in radv_image_can_enable_htile() This is always TRUE if htile_size is not 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-13 09:21:21 +01:00
Samuel Pitoiset	eb0034fe28	radv: check if addrlib enabled HTILE in radv_image_can_enable_htile() When htile_size is 0, we can't enable HTILE. This doesn't change anything, except not calling radv_image_alloc_htile(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-13 09:21:19 +01:00
Samuel Pitoiset	d8325f1f07	radv: switch on EOP when primitive restart is enabled with triangle strips Otherwise, Yakuza hangs the GPU with DXVK. We don't know if linestrip and pointlist are affected, so my point is to do that only for triangle strips. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-13 09:21:16 +01:00
Samuel Pitoiset	74cf3b627c	radv: allow to skip DCC decompressions with the new predicate Feral games aren't affected because they don't decompress DCC. F1 2018 has one DCC decompression per frame, but I don't see any performance improvements. This new predicate will be probably more useful for DCC/MSAA. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-13 09:21:14 +01:00
Samuel Pitoiset	3a5adc2879	radv: add a predicate for reflecting DCC decompression state It's somehow similar to the FCE predicate. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-13 09:21:10 +01:00
Jordan Justen	c506eae53d	i965/compute: Emit GPGPU_WALKER in genX_state_upload Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-12 22:28:06 -08:00
Jordan Justen	1b85c605a6	i965/genX_state: Add register access functions Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-12 22:28:02 -08:00
Eric Anholt	06fbcd2cd5	intel: Simplify the half-float packing in image load/store lowering. This was noted by Jason in review when I tried to make a helper for the old path. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-12 16:09:48 -08:00
Eric Anholt	d3e046e76c	nir: Pull some of intel's image load/store format conversion to nir_format.h I needed the same functions for v3d. Note that the color value in the Intel lowering has already been cut down to image.chans num_components. v2: Drop the half float one, since it was a 1-liner after cleanup. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-12 16:09:43 -08:00
Eric Anholt	19c7cba2ab	nir: Add some more consts to the nir_format_convert.h helpers. Most of the bits were constant, but a few were missed. Avoids warnings from v3d's upcoming static const bits declarations. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-12 16:09:37 -08:00
Timothy Arceri	9e6b39e1d5	nir: detect more induction variables This allows loop analysis to detect induction variables that are incremented in both branches of an if rather than in a main loop block. For example: loop { block block_1: /* preds: block_0 block_7 */ vec1 32 ssa_8 = phi block_0: ssa_4, block_7: ssa_20 vec1 32 ssa_9 = phi block_0: ssa_0, block_7: ssa_4 vec1 32 ssa_10 = phi block_0: ssa_1, block_7: ssa_4 vec1 32 ssa_11 = phi block_0: ssa_2, block_7: ssa_21 vec1 32 ssa_12 = phi block_0: ssa_3, block_7: ssa_22 vec4 32 ssa_13 = vec4 ssa_12, ssa_11, ssa_10, ssa_9 vec1 32 ssa_14 = ige ssa_8, ssa_5 /* succs: block_2 block_3 */ if ssa_14 { block block_2: /* preds: block_1 */ break /* succs: block_8 */ } else { block block_3: /* preds: block_1 */ /* succs: block_4 */ } block block_4: /* preds: block_3 */ vec1 32 ssa_15 = ilt ssa_6, ssa_8 /* succs: block_5 block_6 */ if ssa_15 { block block_5: /* preds: block_4 */ vec1 32 ssa_16 = iadd ssa_8, ssa_7 vec1 32 ssa_17 = load_const (0x3f800000 /* 1.000000 */) /* succs: block_7 */ } else { block block_6: /* preds: block_4 */ vec1 32 ssa_18 = iadd ssa_8, ssa_7 vec1 32 ssa_19 = load_const (0x3f800000 /* 1.000000 */) /* succs: block_7 */ } block block_7: /* preds: block_5 block_6 */ vec1 32 ssa_20 = phi block_5: ssa_16, block_6: ssa_18 vec1 32 ssa_21 = phi block_5: ssa_17, block_6: ssa_4 vec1 32 ssa_22 = phi block_5: ssa_4, block_6: ssa_19 /* succs: block_1 */ } Unfortunately GCM could move the addition out of the if for us (making this patch unnecessary) but we still cannot enable the GCM pass without regressions. This unrolls a loop in Rise of The Tomb Raider.
vkpipeline-db results (VEGA): Totals from affected shaders: SGPRS: 88 -> 96 (9.09 %) VGPRS: 56 -> 52 (-7.14 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 2168 -> 4560 (110.33 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 4 -> 4 (0.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211	2018-12-13 10:58:35 +11:00
Timothy Arceri	c03d6e61cc	nir: reword code comment Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-12-13 10:58:35 +11:00
Timothy Arceri	48b40380e3	nir: in loop analysis track actual control flow type This will allow us to improve analysis to find more induction variables. Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-12-13 10:58:35 +11:00
Danylo Piliaiev	5921a19d4b	nir: add if opt opt_if_loop_last_continue() Removing the last continue can allow more loops to unroll. Also inserting code into the if branch can allow the various if opts to progress further. The insertion of some loops into the if branch also reduces VGPR use in some shaders. vkpipeline-db results (VEGA): Totals from affected shaders: SGPRS: 6552 -> 6576 (0.37 %) VGPRS: 6544 -> 6532 (-0.18 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 481952 -> 478032 (-0.81 %) bytes LDS: 13 -> 13 (0.00 %) blocks Max Waves: 241 -> 242 (0.41 %) Wait states: 0 -> 0 (0.00 %) Shader-db results radeonsi (VEGA): Totals from affected shaders: SGPRS: 168 -> 168 (0.00 %) VGPRS: 144 -> 140 (-2.78 %) Spilled SGPRs: 157 -> 157 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 8524 -> 8488 (-0.42 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 7 -> 7 (0.00 %) Wait states: 0 -> 0 (0.00 %) v2: (Timothy Arceri): - allow for continues in either branch - move any trailing loops inside the if as well as blocks. - leave nir_opt_trivial_continues() to actually remove the continue. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211	2018-12-13 10:58:35 +11:00
Timothy Arceri	721566bddb	nir: rework force_unroll_array_access() Here we rework force_unroll_array_access() so that we can reuse the induction variable detection in a following patch. Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-12-13 10:39:51 +11:00
Timothy Arceri	48135f175c	nir: factor out some of the complex loop unroll code to a helper Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-12-13 10:34:48 +11:00
Jordan Justen	7fe4e0ad5d	docs: Document GitLab merge request process (email alternative) This documents a process for using GitLab Merge Requests as a second way to submit code changes for Mesa. Only one of the two methods is allowed for each patch series. We will not require all patches to be emailed. Some code changes may be reviewed and merged without any discussion on the mesa-dev email list. v2: * No longer require email. Allow submitter to choose email or a GitLab merge request. * Various feedback from Brian, Daniel, Dylan, Eric, Erik, Jason, Matt, Michel and Rob. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Rob Clark <robdclark@gmail.com>	2018-12-12 10:05:29 -08:00
Rhys Kidd	ff6f1dd0d3	meson: libfreedreno depends upon libdrm (for fence support) Error message building freedreno Gallium driver with meson: ../src/gallium/drivers/freedreno/freedreno_fence.c:27:21: fatal error: libsync.h: No such file or directory #include <libsync.h> Fixes: `4aa69cc425` ("meson: build freedreno") Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-12 09:01:06 -08:00
Jason Ekstrand	ca98902d09	nir: Document the function inlining process This has thrown a few people off recently and it's good to have the process and all the rationale for it documented somewhere. A comment at the top of nir_inline_functions seems as good a place as any. Acked-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-12-12 08:32:32 -06:00
Jason Ekstrand	5749c0ebc4	intel/blorp: Assert that we don't re-layout a compressed surface Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-12 08:32:32 -06:00
Jason Ekstrand	e4fdc650f1	anv/pipeline: Set the correct binding count for compute shaders Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-12-12 08:32:25 -06:00
Samuel Pitoiset	2ac6d55f38	radv: bump reported version to 1.1.90 After going through the spec changelog, it looks like RADV is up to date. Note that ANV also reports 1.1.90. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-12 13:51:16 +01:00
Erik Faye-Lund	f856f50194	virgl: force linear texturing support When I made sure that half-float texture-filtering was required for ES3, I didn't realize that virgl doesn't report support for this correctly. This regressed the GLES version available on top of several drivers, including i965 from 3.2 to 2.0. This is going to need protocol changes to fix properly, so let's just restore the previous behavior by enabling floating-point filtering unconditionally for now. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `fcf9fcee3c` "mesa/main: do not require float-texture filtering for es3" Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-12-12 11:44:47 +01:00
Iago Toral Quiroga	3918943211	intel/compiler: do not copy-propagate strided regions to ddx/ddy arguments The implementation of these opcodes in the generator assumes that their arguments are packed, and it generates register regions based on that assumption. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-12 08:09:45 +01:00
Jason Ekstrand	a10a450db2	anv: Advertise support for MinLod on Skylake+ These are usually used for dealing with sparse resources but there's no reason why we can't hook them up before we have sparse. We have the hardware; let's light it up. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-11 21:26:23 -06:00
Jason Ekstrand	cb98e0755f	intel/fs: Support min_lod parameters on texture instructions We have to lower some shadow instructions because they don't exist in hardware and we have to lower txb+offset+clamp because the message gets too big and we run into the sampler message length limit of 11 regs. Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-11 21:26:23 -06:00
Jason Ekstrand	4ef8f46fd1	nir/lower_tex: Add lowering for some min_lod cases Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-11 21:26:23 -06:00
Jason Ekstrand	4a691cfa7e	nir/lower_tex: Modify txd instructions instead of replacing them I don't know if one is better than the other or not but this approach has the advantage that we never forget to copy information over and we're not hard-coding quite as many assumptions. It's also a lot simpler and much less code. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-11 21:26:23 -06:00
Jason Ekstrand	5a968ae473	nir/lower_tex: Simplify lower_gradient logic Instead of having to call two different lower_gradient functions based on whether or not it's a cube, just make lower_gradient handle cubes. This significantly simplifies some of the logic. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-11 21:26:23 -06:00
Jason Ekstrand	caeffe7549	spirv: Add support for MinLod Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-11 21:26:23 -06:00
Jason Ekstrand	e1ef6c3c29	intel/ir: Don't allow allocating zero registers This simple check helps catch bugs early that can end up propagating into later stages of the compile and triggering strange asserts. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-12-11 21:26:23 -06:00
Roland Scheidegger	86c45fe960	gallivm: remove unused float coord wrapping for aos sampling AoS sampling tries to use integers for coord wrapping when possible, as it should be faster. However, for AVX, this was suboptimal, because only floats can use 8x32bit vectors, whereas integers have to be split into 4x32bit vectors. (I believe part of why it was slower was also that at least earlier llvm versions had trouble optimizing it properly, since you can still do simple bit ops with 8x32bit vectors, so a sequence of int add / and / int add / and with such vectors would actually end up doing 128bit inserts/extracts between the operations instead of just doing the cheap 128bit ands.) Hence, a special float coord wrapping path was added to AoS sampling. But this path was actually disabled for a long time already, since we found that just splitting everything before entering the AoS path was still slightly faster usually, so none of this float coord wrapping code was used anymore (AoS sampling code, when avx2 isn't supported, never sees vectors with length > 4). I thought it might be useful some day again, but I'm not interested anymore in optimizing for very weird instruction sets which have support for 256bit vectors for floats but not for ints, so just drop it. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-12-12 03:50:03 +01:00
Emil Velikov	721c296bdc	docs: update calendar, add news item and link release notes for 18.3.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-11 21:25:18 +00:00
Emil Velikov	5391b65ed1	docs: add sha256 checksums for 18.3.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-11 21:21:42 +00:00
Emil Velikov	512bd8d3dd	docs: add release notes for 18.3.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-11 21:21:41 +00:00
Neil Roberts	8600aa35bd	freedreno: Add .dir-locals to the common directory The commit `aa0fed10d3` moved a bunch of Freedreno code to a common directory. The previous directory had a .dir-locals file for Emacs. This patch copies it to the new directory as well. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-12-11 13:14:08 -08:00
Rob Clark	cfe8220904	mesa/st/nir: fix missing nir_compact_varyings LinkedTransformFeedback is normally populated, which had nerf'd varying packing since the check was introduced. Fixes: `dbd52585fa` st/nir: Disable varying packing when doing transform feedback. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-12-11 15:51:34 -05:00
Rob Clark	9e3fc0c1e0	nir: fix spelling typo Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-12-11 15:51:34 -05:00
Jason Ekstrand	8f401b0ce6	anv,radv: Disable VK_EXT_pci_bus_info The Vulkan working group recently discovered that we made a mistake in assuming that PCI domains are 16-bit even though they can potentially be 32-bit values. To fix this, the next spec update will change the types in the VK_EXT_pci_bus_info struct to be 32 bits which will be a backwards-incompatible change. Normally, Khronos tries very hard to never make backwards incompatible changes to specs. Hopefully, the extension is new enough (2 months) that there are no shipping apps which use the extension so this should be safe. This commit disables the extension for both anv and radv in mesa and should be back-ported to 18.3 ASAP so we avoid any potential issues with new apps running on old drivers. I'll send out a commit (which we can also back-port to 18.3 if we really care) to re-enable the extension in both drivers once this week's spec update ships. The one known use of this extension is internal to mesa and will continue working with the extension disabled and will naturally update when we get a new header. Cc: "18.3" <mesa-stable@lists.freedesktop.org> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-11 11:30:05 -06:00
Juan A. Suarez Romero	fb88dcf5ca	docs: extends 18.2 lifecycle As 18.3 was published with some delay, let's extend 18.2 life for another extra release. CC: Andres Gomez <agomez@igalia.com> CC: Dylan Baker <dylan@pnwbakers.com> CC: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-11 15:20:10 +01:00
Kristian H. Kristensen	c0de7c21a3	glapi: fixup EXT_multisampled_render_to_texture dispatch There's a few missing and convoluted bits: - FramebufferTexture2DMultisampleEXT Missing sanity check, should be desktop="false" - RenderbufferStorageMultisampleEXT Missing sanity check, is aliased to RenderbufferStorageMultisample. Thus it's set only when desktop GL or GLES2 v3.0+, while the extension is GLES2 2.0+. If we flip the aliasing we'll break indirect GLX, so loosen the version to 2.0. Not perfect, yet this is the most sane thing I could think of. v2: [Emil] Fixup RenderbufferStorageMultisampleEXT, commit message Cc: Kristian H. Kristensen <hoegsberg@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108974 Fixes: `1b331ae505` ("mesa: Add core support for EXT_multisampled_render_to_texture{,2}") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-10 15:09:07 -08:00
Kristian H. Kristensen	9578dde1c8	freedreno: Fix the Makefile.am fix Commit `b028ce29f0` fixed a typo in src/freedreno/Makefile.am, but ended up breaking the build for freedreno. The typo inadvertently made things work, as we were not supposed to link with libnir or libmesautil to begin with. Those come in through libmesagallium and the typo prevented the duplicated linkage. Fixes: `b028ce29f` ("freedreno: add the missing _la in libfreedreno_ir3_la") Cc: Emil Velikov <emil.velikov@collabora.com>	2018-12-10 14:28:09 -08:00
Matt Turner	f447a13032	i965/fs: Handle V/UV immediates in dump_instructions()	2018-12-10 10:46:56 -08:00
Sagar Ghuge	694eb342a2	intel/compiler: Always print flag subregister number While disassembling the predicate always print flag subregister number to keep grammar same across the generation for assembler tool. v2: Combine consecutive format calls (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-12-10 10:07:11 -08:00
Sagar Ghuge	e7598c5a62	intel/compiler: Set swizzle to BRW_SWIZZLE_XXXX for scalar region When RepCtrl is set, the swizzle field is ignored by the hardware. In order to ensure a 1-to-1 correspondence between the human-readable disassembly and the binary instruction encoding always set the swizzle to XXXX (all zeros) when it is unused due to RepCtrl Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-12-10 10:06:55 -08:00
Dylan Baker	6d3cbbbe15	meson: Add nir_algebraic_parser_test to suites Just to make it easier to run a nir tests together. Fixes: `a0ae12ca91` ("nir/algebraic: Add unit tests for bitsize validation") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-12-10 09:14:44 -08:00
Emil Velikov	27c4fdfdf8	amd/addrlib: drop si_ci_vi_merged_enum.h from the list Fixes: `776b911365` ("amd/addrlib: update Mesa's copy of addrlib") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-10 16:35:01 +00:00
Emil Velikov	b028ce29f0	freedreno: add the missing _la in libfreedreno_ir3_la Fixes: `aa0fed10d3` ("freedreno: move ir3 to common location") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-10 16:35:01 +00:00
Emil Velikov	b30e37ec64	freedreno: drop duplicate MKDIR_GEN declaration Fixes: `aa0fed10d3` ("freedreno: move ir3 to common location") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-10 16:35:01 +00:00
Rhys Kidd	05c7e726f7	travis: radeonsi and radv require LLVM 7.0 Fixes: `3fbdcd942f` ("amd: remove support for LLVM 6.0") Cc: Marek Olšák <marek.olsak@amd.com> Cc: Jan Vesely <jan.vesely@rutgers.edu> Cc: Andres Gomez <agomez@igalia.com> Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-10 16:20:12 +00:00
Kirill Burtsev	a539316485	loader: free error state, when checking the drawable type Currently we distinguish if the drawable is a window or pixmap by checking xcb_present_select_input throws an error or not. Yet, we don't always free the error state returned by xcb. Cc: Kirill Burtsev <kirill.burtsev@qt.io> Cc: Boyan Ding <boyan.j.ding@gmail.com> Fixes: `6bd9ba7d07` ("loader: Add dri3 helper") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> [Emil: add commit message, fixes tag] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-10 16:19:55 +00:00
Timothy Arceri	032f247921	nir: make use of new nir_cf_list_clone_and_reinsert() helper Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-10 13:59:50 +11:00
Timothy Arceri	6b961eb534	nir: add a new nir_cf_list_clone_and_reinsert() helper Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-10 13:59:50 +11:00
Timothy Arceri	03d7c65ad8	nir: clarify some nir_loop_info member names Following commits will introduce additional fields such as guessed_trip_count. Renaming these will help avoid confusion as our unrolling feature set grows. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-10 13:59:50 +11:00
Timothy Arceri	de0aee7638	nir: small tidy ups for nir_loop_analyze() Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-10 13:59:50 +11:00
Kenneth Graunke	41a4a6ba6f	i965: Flip arguments to load_register_reg helpers. load_register_imm and load_register_mem take the destination as the first argument, so I'd like load_register_reg to do the same for the sake of consistency. Otherwise, reading sequences of mixed LRI/LRM/LRR is needlessly confusing. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-09 18:39:16 -08:00
Kenneth Graunke	34c9dc2537	i965: Delete dead brw_meta_resolve_color prototype. Dead since commit `09e041d61d` (May 2016).	2018-12-09 18:39:16 -08:00
Karol Herbst	77944fb2b7	nv50/ir: fix use-after-free in ConstantFolding::visit opnd() might delete the passed in instruction, but it's used through i->srcExists() later in visit v2: use continue instead return v3: use brackets for the outer if/else chain Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-09 18:19:59 +01:00
Karol Herbst	d63a133082	nouveau: use atomic operations for driver statistics multiple threads can write to those at the same time Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-09 04:43:20 +01:00
Karol Herbst	a28ff22295	nv50/ir: initialize relDegree statically this race condition is pretty harmless, but also pretty trivial to fix Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-09 04:43:17 +01:00
Eric Anholt	cc6a5e937b	shader-packing	2018-12-07 16:51:12 -08:00
Eric Anholt	09ad0d870c	tfu	2018-12-07 16:49:41 -08:00
Eric Anholt	f1d98204c3	v3d: Fix a leak of the disassembled instruction string during debug dumps. Fixes: `ade416d023` ("broadcom: Add VC5 NIR compiler.")	2018-12-07 16:48:23 -08:00
Eric Anholt	7f8d8b7d27	vc4: Fix a leak of the transfer helper on screen destroy. Fixes: `d009463a65` ("vc4: Switch to using u_transfer_helper for MSAA maps.")	2018-12-07 16:48:23 -08:00
Eric Anholt	3bd73d31a8	v3d: Fix a leak of the transfer helper on screen destroy. Fixes: `7a30517cce` ("broadcom/vc5: Start adding support for rendering to Z32F_S8X24_UINT.")	2018-12-07 16:48:23 -08:00
Eric Anholt	bad95bb13c	v3d: Add VIR dumping of TMU config p0/p1. I had a bit of it for V3D 3.x, but didn't update it for 4.x.	2018-12-07 16:48:23 -08:00
Eric Anholt	1fc78ff3f1	v3d: Simplify VIR uniform dumping using a temporary.	2018-12-07 16:48:23 -08:00
Eric Anholt	5932575299	v3d: Garbage collect unused uniforms code.	2018-12-07 16:48:23 -08:00
Eric Anholt	62a3192112	v3d: Split most of TEXTURE_SHADER_STATE setup out of sampler views. For shader image load/store, we want most of this logic to be shared.	2018-12-07 16:48:23 -08:00
Eric Anholt	8cb1f3bab7	v3d: Avoid confusing auto-indenting in TEXTURE_SHADER_STATE packing Having "v3dx_pack() {" under each #if branch would confuse emacs's indenter.	2018-12-07 16:48:23 -08:00
Eric Anholt	ee9b758053	v3d: Fix handling of texture first_layer offsets for 3D textures. I think this bug predated adding v3d_layer_offset(). Noticed during an unrelated refactor.	2018-12-07 16:48:23 -08:00
Eric Anholt	acecee4c2d	v3d: Return the right gl_SampleMaskIn[] value. It's supposed to be the dispatched sample mask for this pixel, not the GL state's sample mask.	2018-12-07 16:48:23 -08:00
Eric Anholt	6870111051	v3d: Fix a comment typo	2018-12-07 16:48:23 -08:00
Eric Anholt	ca0e4ae4bc	v3d: Convert to using nir_src_as_uint() from const_value derefs. Follows `16870de8a0` ("nir: Use nir_src_is_const and nir_src_as_* in core code") to clean up v3d.	2018-12-07 16:48:23 -08:00
Eric Anholt	503b55c622	v3d: Don't forget to flush writes to UBOs. If someone did TF into a UBO, we might have left the TF job un-flushed at the point of reading.	2018-12-07 16:48:23 -08:00
Eric Anholt	504d06e4c1	v3d: Make an array for frag/vert texture state in the context. This simplifies a bunch of our texture handling, while introducing the slots necessary for adding new shader stages.	2018-12-07 16:48:23 -08:00
Eric Anholt	d1965344ac	v3d: Re-use the wrap mode uniform on V3D 3.3.	2018-12-07 16:48:23 -08:00
Eric Anholt	e94d034a38	v3d: Put default vertex attribute values into the state uploader as well. The default attributes are long-lived (the state struct is cached), and only 256 bytes each.	2018-12-07 16:48:23 -08:00
Eric Anholt	b38e4d313f	v3d: Create a state uploader for packing our shaders together. Shaders are usually quite short, and are private to the context. We can save memory and reduce the work the kernel needs to do at exec time by packing them together in a stream uploader for long-lived state.	2018-12-07 16:48:23 -08:00
Eric Anholt	1911888760	v3d: Update simulator cache flushing code to match the kernel better. We were missing the invalidate between bin and render (possibly relevant for SSBOs), and still trying to flush the nonexistent L2C on 3.3+.	2018-12-07 16:48:23 -08:00
Eric Anholt	2ebca177dc	v3d: Use the TFU to do generatemipmap. This is a separate, dedicated hardware unit for texture layout conversions and mipmap generation.	2018-12-07 16:48:23 -08:00
Eric Anholt	ee0549ff9a	v3d: Add the V3D TFU submit interface to the simulator. The TFU lets us format raster and SAND images into formats that can be read by the texture engine, and do mipmap generation. The UAPI comes from drm-next e69aa5f9b97f ("Merge tag 'drm-misc-next-2018-12-06' of git://anongit.freedesktop.org/drm/drm-misc into drm-next")	2018-12-07 16:48:23 -08:00
Eric Anholt	42652ea51e	v3d: Use combined input/output segments. The HW apparently has some issues (or at least a much more complicated VCM calculation) with non-combined segments, and the closed source driver also uses combined I/O. Until I get the last CTS failure resolved (which does look plausibly like some VPM stomping), let's use combined I/O too.	2018-12-07 16:48:23 -08:00
Eric Anholt	fb9bcf5602	v3d: Add missing OES_half_float_linear support. We were exposing ARB_texture_float, but apparently not the OES subset flag. Fixes regression from GLES3 support to GLES2. Fixes: `fcf9fcee3c` ("mesa/main: do not require float-texture filtering for es3")	2018-12-07 16:48:23 -08:00
Eric Anholt	90e98295a4	v3d: Add support for RGBA_SRGB along with BGRA_SRGB. This is the actual native format for the hardware, without swizzling. Noticed while debugging why GLES3 disappeared.	2018-12-07 16:48:23 -08:00
Kenneth Graunke	f0d51e81c9	intel/blorp: Expand blorp_address::offset to be 64 bits. In the softpin world, surface state base address may be a fixed 64-bit address (with no associated BO). It makes sense to store this in the offset field. But it needs to be the full size. We also update the clear color address to be consistently uint64_t everywhere so we can continue passing intel_miptree_get_clear_color a pointer to the blorp_address's offset field without type mismatches. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-12-07 16:35:51 -08:00
Rob Clark	d014af98b7	freedreno/drm: fix memory leak Fix an embarrassing memory leak with the non-softpin submit/rb implementation. Fixes: `f3cc0d2747` freedreno: import libdrm_freedreno + redesign submit Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-07 14:12:12 -05:00
Rob Clark	5c2c1f0a2d	freedreno/ir3: track max flow control depth for a5xx/a6xx Rather than just hard-coding BRANCHSTACK size. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-07 13:49:21 -05:00
Rob Clark	9517037bdc	freedreno/ir3: code-motion Split up ir3_compiler_nir.c a bit before starting to add new stuff for a6xx SSBO/image instructions. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-07 13:49:21 -05:00
Rob Clark	e37351fa57	freedreno/ir3: sync instr/disasm Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-07 13:49:21 -05:00
Rob Clark	0d240c2214	freedreno/ir3: don't fetch unused tex components Detect when a component of an (for example) texture fetch is unused and propagate the updated wrmask back to the parent instruction. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-07 13:49:21 -05:00
Rob Clark	b971afd19e	freedreno/a6xx: blitter fixes Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-07 13:49:21 -05:00
Rob Clark	237ae7daf2	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-07 13:49:21 -05:00
Rob Clark	e779725f0b	freedreno/drm: fix relocs in nested stateobjs If we have an reloc from stateobjA to stateobjB, we would previously leave stateobjB's bos out of the submit's bos table. Handle this case by copying into stateobjA's reloc_bos table. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-07 13:49:21 -05:00
Rob Clark	9f7c6c78bc	freedreno/a5xx+a6xx: remove unused fs/vs pvt mem copy/pasta from older gens Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-07 13:49:21 -05:00
Rob Clark	c500e7b747	gallium: fix typo Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-07 13:49:21 -05:00
Rob Clark	f6ad286c80	freedreno: remove unused fd_surface fields Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-07 13:49:21 -05:00
Nicolai Hähnle	4275cae95c	meson: link LLVM 'native' component when LLVM is available Linking against LLVM built with BUILD_SHARED_LIBS fails otherwise, as the component is required for the draw module. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-07 16:26:14 +01:00
Connor Abbott	2845c49218	nir: Fixup algebraic test for variable-sized conversions b2i can now take any size boolean in preparation for 1-bit booleans, so the error message printed is slightly different. Fixes: `dca6cd9ce6` ("nir: Make boolean conversions sized just like the others") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108961 Cc: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-07 16:07:51 +01:00
Samuel Pitoiset	e8a383ce67	gallium: add missing PIPE_CAP_SURFACE_SAMPLE_COUNT default value Fixes: `2710c40e3c` ("gallium: Add new PIPE_CAP_SURFACE_SAMPLE_COUNT") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2018-12-07 15:06:29 +01:00
Emil Velikov	96d4ecbb11	docs: update calendar, add news item and link release notes for 18.3.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-07 11:50:12 +00:00
Emil Velikov	0144bbdb98	docs: add sha256 checksums for 18.3.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `d81beab96a`)	2018-12-07 11:44:33 +00:00
Emil Velikov	b1e0336497	docs: update 18.3.0 release notes Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `d603cd9d84`)	2018-12-07 11:44:31 +00:00
Kristian H. Kristensen	3e55df4f83	freedreno: Add support for EXT_multisampled_render_to_texture There is not much to do in freedreno - tile layout and multisample state for gmem renderings is programmed based on the pfb sample count, while resolve blits take the destination sample count from the resource. Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-12-06 16:56:37 -08:00
Rob Clark	913eb7fa58	freedreno/a6xx: MSAA Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-12-06 16:55:59 -08:00
Kristian H. Kristensen	14ea811c67	st/mesa: Add support for EXT_multisampled_render_to_texture In gallium, we model the attachment sample count as a new nr_samples field in pipe_surface. A driver can indicate support for the extension using the new pipe cap, PIPE_CAP_MULTISAMPLED_RENDER_TO_TEXTURE. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-12-06 16:55:46 -08:00
Kristian H. Kristensen	2710c40e3c	gallium: Add new PIPE_CAP_SURFACE_SAMPLE_COUNT This new pipe cap and the new nr_samples field in pipe_surface lets a state tracker bind a render target with a different sample count than the resource. This allows for implementing EXT_multisampled_render_to_texture and EXT_multisampled_render_to_texture2. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-12-06 16:55:43 -08:00
Kristian H. Kristensen	1b331ae505	mesa: Add core support for EXT_multisampled_render_to_texture{,2} This also turns on EXT_multisampled_render_to_texture which is a subset of EXT_multisampled_render_to_texture2, allowing only COLOR_ATTACHMENT0. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-12-06 16:55:30 -08:00
Vinson Lee	b4fd59075b	nir/algebraic: Make algebraic_parser_test.sh executable. Fixes make check permission error. ../../bin/test-driver: line 107: ./nir/tests/algebraic_parser_test.sh: Permission denied FAIL nir/tests/algebraic_parser_test.sh (exit status: 126) Fixes: `a0ae12ca91` ("nir/algebraic: Add unit tests for bitsize validation") Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2018-12-06 11:48:20 -08:00
Samuel Pitoiset	3fbdcd942f	amd: remove support for LLVM 6.0 Users are encouraged to switch to LLVM 7.0 released in September 2018. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-06 14:02:56 +01:00
Kristian H. Kristensen	3b2ad8b290	gallium: Android build fixes A couple of simple fixes for building on Android with autotools. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-05 13:56:07 -08:00
Jason Ekstrand	dca6cd9ce6	nir: Make boolean conversions sized just like the others Instead of a single i2b and b2i, we now have i2b32 and b2iN where N is one of 8, 16, 32, or 64. This leads to having a few more opcodes but now everything is consistent and booleans aren't a weird special case anymore. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2018-12-05 15:03:07 -06:00
Jason Ekstrand	be98b1db38	nir/opt_algebraic: Add 32-bit specifiers to a bunch of booleans Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2018-12-05 15:03:03 -06:00
Jason Ekstrand	2715080d65	nir/opt_algebraic: Drop bit-size suffixes from conversions Suffixes are dropped from a bunch of conversion opcodes when it makes sense to do so. Others are kept if we really do want the bit-size restriction. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2018-12-05 15:03:01 -06:00
Jason Ekstrand	ff8e3d3b7b	nir/opt_algebraic: Simplify an optimization using the new search ops Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2018-12-05 15:02:58 -06:00
Jason Ekstrand	05af952a11	nir/algebraic: Add support for unsized conversion opcodes All conversion opcodes require a destination size but this makes constructing certain algebraic expressions rather cumbersome. This commit adds support to nir_search and nir_algebraic for writing conversion opcodes without a size. These meta-opcodes match any conversion of that type regardless of destination size and the size gets inferred from the sizes of the things being matched or from other opcodes in the expression. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2018-12-05 15:02:56 -06:00
Jason Ekstrand	4925290ab1	nir/algebraic: Refactor codegen a bit Instead of using an OrderedDict, just have a (necessarily sorted) array of transforms and a set of opcodes. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2018-12-05 15:02:54 -06:00
Jason Ekstrand	d6aac618fb	nir/algebraic: Clean up some __str__ cruft Both of these things are already handled in the Value base class so we don't need to handle them explicitly in Constant. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2018-12-05 15:02:52 -06:00
Jason Ekstrand	85f0ea9d8f	nir/opcodes: Rename tbool to tbool32 Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2018-12-05 15:02:49 -06:00
Jason Ekstrand	03571a7a6c	nir/opcodes: Pull in the type helpers from constant_expressions While we're at it, we rework them a bit to all use regular expressions and assert more. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2018-12-05 15:02:06 -06:00
Connor Abbott	a0ae12ca91	nir/algebraic: Add unit tests for bitsize validation The non-failure path can be tested by just compiling mesa and then testing it, but the failure paths won't be hit unless you make a mistake, so it's best to test them with some unit tests. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-05 17:57:40 +01:00
Connor Abbott	29a1450e28	nir/algebraic: Rewrite bit-size inference Before this commit, there were two copies of the algorithm: one in C, that we would use to figure out what bit-size to give the replacement expression, and one in Python, that emulated the C one and tried to prove that the C algorithm would never fail to correctly assign bit-sizes. That seemed pretty fragile, and likely to fall over if we make any changes. Furthermore, the C code was really just recomputing more-or-less the same thing as the Python code every time. Instead, we can just store the results of the Python algorithm in the C datastructure, and consult it to compute the bitsize of each value, moving the "brains" entirely into Python. Since the Python algorithm no longer has to match C, it's also a lot easier to change it to something more closely approximating an actual type-inference algorithm. The algorithm used is based on Hindley-Milner, although deliberately weakened a little. It's a few more lines than the old one, judging by the diffstat, but I think it's easier to verify that it's correct while being as general as possible. We could split this up into two changes, first making the C code use the results of the Python code and then rewriting the Python algorithm, but since the old algorithm never tracked which variable each equivalence class, it would mean we'd have to add some non-trivial code which would then get thrown away. I think it's better to see the final state all at once, although I could also try splitting it up. v2: - Replace instances of "== None" and "!= None" with "is None" and "is not None". - Rename first_src to first_unsized_src - Only merge the destination with the first unsized source, since the sources have already been merged. - Add a comment explaining what nir_search_value::bit_size now means. v3: - Fix one last instance to use "is not" instead of != - Don't try to be so clever when choosing which error message to print based on whether we're in the search or replace expression. - Fix trailing whitespace. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-05 17:57:40 +01:00
Samuel Pitoiset	49ef890733	radv: expose VK_EXT_scalar_block_layout Nothing to do, the compiler already handles that. All new dEQP.VK.ubo.* and dEQP.VK.ssbo.* pass, except some 16-bit tests that are quite related to fdo bug #108114. Only enable the extension on CIK+ because it might not work on SI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-05 17:38:20 +01:00
Samuel Pitoiset	c6465fec0c	spirv: add SpvCapabilityInt64Atomics Required for VK_KHR_shader_atomic_int64. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-12-05 14:39:55 +01:00
Michal Srb	63c0916ada	drisw: Use separate drisw_loader_funcs for shm The original code was modifying the global drisw_lf variable, which is bad when there are multiple contexts in single process, each initialized with different loader. One may support put_image_shm and the other not. Since there are currently only two possible combinations, let's create two global tables, one for each. Let's make them const, since we won't change them and they can be shared. This fixes crash in VLC. It used two GL contexts (each in different thread), one was initialized by its Qt GUI, the other by its video output plugin. The first one set the put_image_shm=drisw_put_image_shm, the second did not, but since the same structure was used, the drisw_put_image_shm was used too. Then it crashed because the second loader did not have putImageShm set. Downstream bug: https://bugzilla.opensuse.org/show_bug.cgi?id=1113533 v2: Added Fixes and described the VLC bug. Fixes: `63c427fa71` ("drisw: use putImageShm if available") Signed-off-by: Michal Srb <msrb@suse.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-05 13:16:09 +00:00
Michal Srb	c0ac038c97	gallium: Constify drisw_loader_funcs struct The content is not expected to change. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Michal Srb <msrb@suse.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-05 13:16:09 +00:00
Samuel Pitoiset	c7ada4901a	radv: wait on the high 32 bits of timestamp queries In case we are unlucky if the low part is 0xffffffff. Fixes: `5d6a560a29` ("radv: do not use the availability bit for timestamp queries") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-05 13:05:58 +01:00
Samuel Pitoiset	e899728769	radv: reset pending_reset_query when flushing caches If the driver used a compute shader for resetting a query pool, it should be completed when caches are flushed. This might reduce the number of stalls if operations are done between vkCmdResetQueryPool() and vkCmdBeginQuery() (or vkCmdWriteTimestamp()). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Alex Smith <asmith@feralinteractive.com>	2018-12-05 13:05:55 +01:00
Lionel Landwerlin	9a7b319903	anv/query: flush render target before copying results This change tracks render target writes in the pipeline and applies a render target flush before copying the query results to make sure the preceding operations have landed in memory before the command streamer initiates the copy. v2: Simplify logic in CopyQueryResults (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108909 Fixes: `37f9788e9a` ("anv: flush pipeline before query result copies") Cc: mesa-stable@lists.freedesktop.org	2018-12-05 11:43:34 +00:00
Alex Smith	c1b6cb068c	radv: Flush before vkCmdWriteTimestamp() if needed As done for vkCmdBeginQuery() already. Prevents timestamps from being overwritten by previous vkCmdResetQueryPool() calls if the shader path was used to do the reset. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108925 Fixes: `a41e2e9cf5` ("radv: allow to use a compute shader for resetting the query pool") Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-05 10:52:48 +00:00
Samuel Pitoiset	824cfc1ee5	radv: rework the TC-compat HTILE hardware bug with COND_EXEC After investigating this, it appears that COND_WRITE doesn't work correctly in some situations. I don't know exactly why it fails to update DB_Z_INFO.ZRANGE_PRECISION, but as AMDVLK also uses COND_EXEC I think there is a reason. Now the driver stores a new metadata value in order to reflect the last fast depth clear state. If a TC-compat HTILE is fast cleared with 0.0f, we have to update ZRANGE_PRECISION to 0 in order to work around that hardware bug. This fixes rendering issues with The Forest and DXVK and doesn't seem to introduce any regressions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108914 Fixes: `68dead112e` ("radv: update the ZRANGE_PRECISION value for the TC-compat bug") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-12-05 09:26:31 +01:00
Dieter Nützel	2669dbf881	docs/features: Delete double nv50 entry and wrong enumeration trivial Fix commit `d9b2234042` Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-12-04 18:51:18 -05:00
Marek Olšák	5907412d04	st/mesa: expose EXT_render_snorm on GLES Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-04 15:33:29 -05:00
Marek Olšák	1660f3aa05	mesa: expose AMD_texture_texture4 because the closed driver exposes it. Tested by piglit. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-04 15:33:29 -05:00
Marek Olšák	908f817918	mesa: expose EXT_texture_compression_bptc in GLES tested by piglit. v2: rebase Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1) Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2018-12-04 15:33:29 -05:00
Marek Olšák	34f07ddebb	mesa: expose EXT_texture_compression_rgtc on GLES The spec was modified to support GLES. Tested by piglit. v2: rebase Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1) Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2018-12-04 15:33:29 -05:00
Erik Faye-Lund	91af56e383	mesa/main: fix up _mesa_has_rg_textures for gles2 rg-textures are supported in GLES 2.0 if EXT_texture_rg, so let's make sure the enums are accepted. Fixes: `510b642460` "mesa/main: do not allow rg-textures enums before gles3" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108936 Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-04 21:14:26 +01:00
Erik Faye-Lund	5bf38bfb64	mesa/main: correct validation for GL_RGB565 Technically speaking, this validation was incorrect, because GL_RGB565 is only supported in OpenGL ES 1.x if OES_framebuffer_object is supported. This couldn't lead to any real incorrect behavior, because all drivers support OES_framebuffer_object. But let's keep the code self-documenting, by correcting the check as per the spec. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-04 21:14:16 +01:00
Marek Olšák	4b218984d8	mesa: expose GL_EXT_texture_view as an alias of GL_OES_texture_view There are no spec changes. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-04 12:50:36 -05:00
Marek Olšák	d9b2234042	st/mesa: expose GL_OES_texture_view For format fallbacks like ETC and ASTC, switching between sRGB and linear decoding is undefined, or at least is not bit-exact. Same as EXT_texture_sRGB_decode on GLES. There are no piglit or dEQP regressions. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-04 12:50:36 -05:00
Eric Engestrom	95d62baac5	loader: deduplicate logger function declaration Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-12-04 16:29:32 +00:00
Eric Engestrom	eade6ffeee	mesa: drop unused & deprecated lib DeprecationWarning: the imp module is deprecated in favour of importlib Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-04 16:26:21 +00:00
Eric Engestrom	919bec1c47	anv: add unreachable() for VK_EXT_fragment_density_map This silences the -Wswitch compiler warning. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-04 16:22:55 +00:00
Eric Engestrom	a0b14c1b02	meson: skip asm check when asm is disabled Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-04 16:22:51 +00:00
Andrii Simiklit	6ae873b97d	intel/tools: make sure the binary file is properly read 1. tools/i965_disasm.c:58:4: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result fread(assembly, *end, 1, fp); v2: Fixed incorrect return value check. ( Eric Engestrom <eric.engestrom@intel.com> ) v3: Zero size file check placed before fread with exit() ( Eric Engestrom <eric.engestrom@intel.com> ) v4: - Title is changed. - The 'size' variable was moved to top of a function scope. - The assertion was replaced by the proper error handling. - The error message on a caller side was fixed. ( Eric Engestrom <eric.engestrom@intel.com> ) Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-12-04 16:19:26 +00:00
Toni Lönnberg	d7b99ab947	intel/aubinator_error_decode: Get rid of warning for missing switch case ../src/intel/tools/aubinator_error_decode.c: In function ‘instdone_register_for_ring’: ../src/intel/tools/aubinator_error_decode.c:177:4: warning: enumeration value ‘I915_ENGINE_CLASS_INVALID’ not handled in switch [-Wswitch] switch (class) { ^~~~~~ Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-12-04 12:47:49 +00:00
Ilia Mirkin	bacf8471dc	nouveau: set texture upload budget It doesn't seem like the exact number has too much effect on the performance in "teximage". However setting it to just about anything prevents some OOMs from getting hit. These values are not well-tuned, but don't seem too bad. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-03 23:11:29 -05:00
Ilia Mirkin	08c64fe7a1	nv50,nvc0: add explicit handling of PIPE_CAP_MAX_VERTEX_ELEMENT_SRC_OFFSET Since the max attrib stride is 2048, the max src offset makes sense as 2047. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-03 23:11:29 -05:00
Ilia Mirkin	de49e06507	nv50: always keep TSC slot 0 bound All TXF operations implicitly use sampler 0, and fail if it's not bound to anything. This does not happen in LINKED_TSC mode, but we don't currently use this. We ensure that TSC entry at id 0 has the SRGB conversion bit enabled (and all samplers we normally generate will too). Then when the TSC at slot 0 (not to be confused with entry 0 in the global TSC table) is unbound, we bind it to entry 0. This way, TXF operations are not dependent on there being a regular sampler bound there. Fixes arb_texture_buffer_object-subdata-sync among others. (TBO's are particularly susceptible to this as they don't bind a sampler.) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-03 23:11:29 -05:00
Dave Airlie	1363a47c9c	radv: use 3d shader for gfx9 copies if dst is 3d This fixes some crucible 3d miptree tests I've been working on when executed using the compute shader path. Fixes: `d08f267814` (radv/gfx9: fix 3d image to image transfers on compute queues.) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-04 10:42:31 +10:00
Bas Nieuwenhuizen	12e35a64c0	radv: Check for shareable images in central place. One place to put the logic makes things easier to change. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-04 01:21:38 +01:00
Bas Nieuwenhuizen	3bf48741e1	radv/android: Use buffer metadata to determine scanout compat. These days we don't always allocate scanout compatible textures anymore. That does mean we have to fix the radv android WSI though. Fixes: `b1444c9ccb` "radv: Implement VK_ANDROID_native_buffer." Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-04 01:21:38 +01:00
Bas Nieuwenhuizen	51091b3e1f	radv/android: Mark android WSI image as shareable. Fixes: `b1444c9ccb` "radv: Implement VK_ANDROID_native_buffer." Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-04 01:21:38 +01:00
Matt Turner	dd53bb7e1f	Revert "st/mesa: silenced unhanded enum warning in st_glsl_to_tgsi.cpp" This reverts commit `198c50f487`. This needs to be reverted after commit `017199d2d2` ("mesa: Revert INTEL_fragment_shader_ordering support")	2018-12-03 16:20:43 -08:00
Matt Turner	017199d2d2	mesa: Revert INTEL_fragment_shader_ordering support This extension is not properly tested (testing for GL_ARB_fragment_shader_interlock is not sufficient), and since this was noted in review on August 28th no tests have been sent. Revert "i965: Add INTEL_fragment_shader_ordering support." Revert "mesa: Add GL/GLSL plumbing for INTEL_fragment_shader_ordering" This reverts commit `03ecec9ed2`. This reverts commit `119435c877`. Cc: mesa-stable@lists.freedesktop.org Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Eric Anholt <eric@anholt.net>	2018-12-03 15:37:37 -08:00
Dave Airlie	e3f075439c	virgl: fix const warning on debug flags. Fixes: `8d4bb6e5c` (virgl: Add command and flags to initiate debugging on the host (v2))	2018-12-04 08:11:13 +10:00
Jason Ekstrand	71271e167b	vulkan: Update the XML and headers to 1.1.95 Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-12-03 14:27:10 -06:00
Tobias Klausmann	9401a2f2e6	amd/vulkan: meson build - use radv_deps for libvulkan_radeon Without this the build breaks with: FAILED: src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o cc -Isrc/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha -Isrc/amd/vulkan -I../src/amd/vulkan -Isrc/../include -I../src/../include -Isrc -I../src -Isrc/mapi -I../src/mapi -Isrc/mesa -I../src/mesa -I../src/gallium/include -Isrc/gallium/auxiliary -I../src/gallium/auxiliary -Isrc/amd -I../src/amd -Isrc/amd/common -I../src/amd/common -Isrc/compiler -I../src/compiler -Isrc/vulkan/util -I../src/vulkan/util -Isrc/vulkan/wsi -I../src/vulkan/wsi -Isrc/compiler/nir -I../src/compiler/nir -I/usr/include -I/usr/include/libdrm -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -std=c99 -O2 -g '-DVERSION="18.3.0-rc5"' -DPACKAGE_VERSION=VERSION '-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"' -DGLX_USE_TLS -DHAVE_ST_VDPAU -DENABLE_ST_OMX_BELLAGIO=0 -DENABLE_ST_OMX_TIZONIA=0 -DHAVE_X11_PLATFORM -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING -DGLX_USE_DRM -DHAVE_DRM_PLATFORM -DENABLE_SHADER_CACHE -DHAVE___BUILTIN_BSWAP32 -DHAVE___BUILTIN_BSWAP64 -DHAVE___BUILTIN_CLZ -DHAVE___BUILTIN_CLZLL -DHAVE___BUILTIN_CTZ -DHAVE___BUILTIN_EXPECT -DHAVE___BUILTIN_FFS -DHAVE___BUILTIN_FFSLL -DHAVE___BUILTIN_POPCOUNT -DHAVE___BUILTIN_POPCOUNTLL -DHAVE___BUILTIN_UNREACHABLE -DHAVE_FUNC_ATTRIBUTE_CONST -DHAVE_FUNC_ATTRIBUTE_FLATTEN -DHAVE_FUNC_ATTRIBUTE_MALLOC -DHAVE_FUNC_ATTRIBUTE_PURE -DHAVE_FUNC_ATTRIBUTE_UNUSED -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT -DHAVE_FUNC_ATTRIBUTE_WEAK -DHAVE_FUNC_ATTRIBUTE_FORMAT -DHAVE_FUNC_ATTRIBUTE_PACKED -DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL -DHAVE_FUNC_ATTRIBUTE_VISIBILITY -DHAVE_FUNC_ATTRIBUTE_ALIAS -DHAVE_FUNC_ATTRIBUTE_NORETURN -DUSE_SSE41 -DUSE_GCC_ATOMIC_BUILTINS -DUSE_X86_64_ASM -DMAJOR_IN_SYSMACROS -DHAVE_SYS_SYSCTL_H -DHAVE_LINUX_FUTEX_H -DHAVE_ENDIAN_H -DHAVE_DLFCN_H -DHAVE_STRTOF -DHAVE_MKOSTEMP 
-DHAVE_POSIX_MEMALIGN -DHAVE_TIMESPEC_GET -DHAVE_MEMFD_CREATE -DHAVE_STRTOD_L -DHAVE_DLADDR -DHAVE_DL_ITERATE_PHDR -DHAVE_ZLIB -DHAVE_PTHREAD -DHAVE_PTHREAD_SETAFFINITY -DHAVE_LIBDRM -DHAVE_LLVM=0x0600 -DMESA_LLVM_VERSION_PATCH=1 -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED -DHAVE_DRI3 -DHAVE_DRI3_MODIFIERS -Werror=implicit-function-declaration -Werror=missing-prototypes -Werror=return-type -fno-math-errno -fno-trapping-math -Wno-missing-field-initializers -Wno-format-truncation -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -DNDEBUG -fPIC -pthread -D__STDC_FORMAT_MACROS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS -fvisibility=hidden -Wno-override-init -DVK_USE_PLATFORM_XCB_KHR -DVK_USE_PLATFORM_XLIB_KHR -DVK_USE_PLATFORM_WAYLAND_KHR -DVK_USE_PLATFORM_DISPLAY_KHR -DVK_USE_PLATFORM_XLIB_XRANDR_EXT -MD -MQ 'src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o' -MF 'src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o.d' -o 'src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o' -c ../src/amd/vulkan/radv_pipeline.c In file included from ../src/vulkan/util/vk_alloc.h:29, from ../src/amd/vulkan/radv_private.h:52, from ../src/amd/vulkan/radv_debug.h:27, from ../src/amd/vulkan/radv_pipeline.c:30: ../src/../include/vulkan/vulkan.h:54:10: fatal error: wayland-client.h: No such file or directory #include <wayland-client.h> ^~~~~~~~~~~~~~~~~~ compilation terminated. 
The above command misses the include directory for wayland: -I/usr/include/wayland The missing include is contained in the (until now) unused radv_deps: if with_platform_wayland radv_deps += dep_wayland_client radv_flags += '-DVK_USE_PLATFORM_WAYLAND_KHR' libradv_files += files('radv_wsi_wayland.c') endif Fixes: `673dda8330` "meson: build "radv" vulkan driver for radeon hardware" Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-12-03 09:18:48 -08:00
Erik Faye-Lund	fcf9fcee3c	mesa/main: do not require float-texture filtering for es3 The OpenGL ES 3.0 specification, table 3.13 lists half-float textures as filterable, but not float textures. So we shouldn't depend on ARB_float_texture, which requires full filtering support for both. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	43015b2a89	mesa/st: do not probe for the same texture-formats twice This should be equivalent to what we did before. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	212d270b4e	mesa/main: require EXT_texture_sRGB for gles3 sRGB textures are a requirement for OpenGL ES 3.0, so let's make sure we don't incorrectly enable a too high version. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	487010a099	mesa/main: require EXT_texture_type_2_10_10_10_REV for gles3 OpenGL ES 3.0 requires this functionality, so we should also test for it to avoid incorrectly exposing a too high GLES version. On desktop, this has been required since all the way back in OpenGL 1.2 anyway. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	74eab1c62f	mesa/main: split float-texture support checking in two On OpenGL ES 2.0, there's separate extensions adding support for half-float and float textures. So we need to validate the enums separately as well. This also prevents these enums from incorrectly being allowed on OpenGL ES 1.x, where there's no extension that enables this in the first place. While we're at it, remove the pointless default-case, and the seemingly stale fallthrough comment. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	c4136ed5cc	mesa/main: do not allow EXT_texture_sRGB_R8 enums before gles3 ctx->Extensions.EXT_texture_sRGB_R8 is set regardless of the API that's used, so checking for those directly will always allow the enums from this extension when they are supported by the driver. There's no extension adding support for this on OpenGL ES before version 3.0, so let's tighten the check. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	d972939986	mesa/main: do not allow sRGB texture enums before gles3 ctx->Extensions.EXT_texture_sRGB is set regardless of the API that's used, so checking for those directly will always allow the enums from this extension when they are supported by the driver. There's no extension adding support for this on OpenGL ES before version 3.0, so let's tighten the check. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	3629ee025c	mesa/main: do not allow snorm-texture enums before gles3 ctx->Extensions.EXT_texture_snorm is set regardless of the API that's used, so checking for those directly will always allow the enums from this extension when they are supported by the driver. There's no extension adding support for this on OpenGL ES before version 3.0, so let's tighten the check. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	52dc8b4f7b	mesa/main: do not allow floating-point texture enums on gles1 ctx->Extensions.OES_texture_float is set regardless of the API that's used, so checking for those directly will always allow the enums from this extension when they are supported by the driver. There's no extension enabling floating-point textures for OpenGL ES 1.x, so we shouldn't allow those enums there. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	167dcd59ae	mesa/main: do not allow type_2_10_10_10_REV enums before gles3 ctx->Extensions.EXT_texture_type_2_10_10_10_REV is set regardless of the API that's used, so checking for those directly will always enable extensions when they are supported by the driver. There's no corresponding extension for OpenGL ES 1.x/2.0, so we shouldn't allow these enums there. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	b112e62ba4	mesa/main: do not allow MESA_ycbcr_texture enums on gles This extension requires OpenGL, and shouldn't be available on OpenGL ES. So let's not allow the enums from it either. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	1b2e9aca77	mesa/main: do not allow EXT_texture_shared_exponent enums before gles3 ctx->Extensions.EXT_texture_shared_exponent is set regardless of the API that's used, so checking for those directly will always allow the enums from this extension when they are supported by the driver. We also need to make sure this is enabled on OpenGL ES 3. Because the check is repeated, let's introduce a helper. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	510b642460	mesa/main: do not allow rg-textures enums before gles3 EXT_packed_float isn't supported on OpenGL ES, we shouldn't allow these enums there, before OpenGL ES 3.0 which also introduce support for these enums. Since this check is repeated a lot, let's make a helper for this. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	59690bf0a3	mesa/main: do not allow EXT_packed_float enums before gles3 EXT_packed_float isn't supported on OpenGL ES, we shouldn't allow these enums there, before OpenGL ES 3.0 which also introduce support for these enums. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	83db9d3e3a	mesa/main: do not allow ARB_depth_buffer_float enums before gles3 Floating-point depth buffers are only supported on OpenGL 3.0, OpenGL ES 3.0, or if ARB_depth_buffer_float is supported. Because we checked a driver capability rather than using an extension-check helper, we ended up incorrectly allowing this on OpenGL ES 1.x and 2.x. Since this logic is repeated, let's make a helper for it. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	3bbd543b6e	mesa/main: do not allow integer-texture enums before gles3 Integer textures shouldn't be implicitly exposed on OpenGL ES 1.x and 2.x, but because the code checked against a driver-capability rather than using an extension-check helper, we ended up accidentally allowing these enums on older versions when the driver supports it. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	b5a370dc25	mesa/main: do not allow ARB_texture_rgb10_a2ui enums before gles3 ARB_texture_rgb10_a2ui isn't supported on OpenGL ES, we shouldn't expose it there even if the driver supports it. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	76b038bee7	mesa/main: do not allow stencil-texture enums on gles1 ctx->Extensions.ARB_texture_stencil8 is set regardless of the API that's used, so checking for those directly will always allow the enums from this extension when they are supported by the driver. So let's instead check for both ARB_texture_stencil8 and OES_texture_stencil8, so we support depth textures on OpenGL and OpenGL ES 2.0+. There's no extension enabling stencil-textures for OpenGL ES 1.x, so we shouldn't allow those enums there. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	19eb0bf28f	mesa/main: do not allow depth-texture enums on gles1 ctx->Extensions.ARB_depth_texture is set regardless of the API that's used, so checking for those directly will always allow the enums from this extension when they are supported by the driver. So let's instead check for both ARB_depth_texture and OES_depth_texture, so we support depth textures on OpenGL and OpenGL ES 2.0+. There's no extension enabling depth-textures for OpenGL ES 1.x, so we shouldn't allow those enums there. This fixes oes_packed_depth_stencil-depth-stencil-texture_gles1 on i965 Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	2dfcaf7554	mesa/main: do not allow astc enums on gles1 ctx->Extensions.KHR_texture_compression_astc_ldr is set regardless of the API that's used, so checking for those directly will always enable extensions when they are supported by the driver. But there's no extension enabling ASTC for OpenGL ES 1.x, so we shouldn't allow those enums there. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	1aa134038c	mesa/main: do not allow etc2 enums on gles1 ctx->Extensions.ARB_ES3_compatibility is set regardless of the API that's used, so checking for those directly will always enable extensions when they are supported by the driver. But there's no extension enabling ETC2 for OpenGL ES 1.x, so we shouldn't allow those enums there. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	27ca87ccca	mesa/main: do not allow s3tc enums on gles1 There's no extension enabling S3TC formats on OpenGL ES 1.x, so we shouldn't allow these even if the driver can support it. So let's check for EXT_texture_compression_s3tc instead of ANGLE_texture_compression_dxt, which is supported on all other OpenGL variations. We also need to use _mesa_has_EXT_texture_compression_s3tc() instead of checking the driver cap directly, otherwise we end up enabling this on OpenGL ES 1.x, as the API isn't checked. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	d70cfb322a	mesa/main: use _mesa_has_FOO_bar for compressed format checks _mesa_has_FOO_bar() knows about the APIs these extensions should be supported under, so let's use that to simplify these checks a bit. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	70bfd31287	mesa/main: clean up integer texture check This makes the logic a little bit easier to follow, and reduce a bit of repetition. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	5109742e7b	mesa/main: clean up ES2_compatibility check This makes the logic a little bit easier to follow; this is either about ES2 compatibility or about gles. GL_RGB565 was added already in OpenGL ES 1.0. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	2e753b77dd	mesa/main: clean up OES_texture_float_linear check Using the _mesa_has_FOO_bar helpers is generally safer and should generally be preferred over checking driver-caps like this code did, because the _mesa_has_FOO_bar helpers also verify the API type and version. This shouldn't have any practical effect here, as this function only gets called for OpenGL ES 3.x right now. But if this was to change in the future, this makes the function behave a lot more predictably. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	1373d117c2	mesa/main: clean up S3_s3tc check S3_s3tc is the extension that enables this functionality on desktop, so let's check for that one. The _mesa_has_S3_s3tc() helper already verifies the API according to the extension-table. As for the second hunk, we currently already only expose EXT_texture_compression_s3tc on desktop so by using the helper instead, we get rid of this detail here, and once we enable it for GLES we'll automatically get the interaction right. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	e8b331ae13	mesa/main: rename format-check function _mesa_es3_error_check_format_and_type isn't specific to OpenGL ES 3.x, it applies to all versions of OpenGL ES. So let's rename it to reflect this. While we're at it, let's also rename a helper function it uses similarly. As the helper is static, we can also remove the namespacing-prefix from the name. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:44 +01:00
Erik Faye-Lund	ca8e2a5277	mesa/main: make _mesa_has_tessellation return bool All other _mesa_has_foo functions return bool rather than GLboolean, so let's follow that style here as well. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-03 18:16:43 +01:00
Chad Versace	3ef0ca65c9	i965: Fix -Wswitch on INTEL_COPY_STREAMING_LOAD The warning is emitted when building without INLINE_SSE41. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-12-03 13:07:56 +02:00
Karol Herbst	fc0139d283	nv50,nvc0: Fix gallium nine regression regarding sampler bindings The new approach is that samplers don't get unbound even if they won't be used in a draw and we should just leave them be as well. Fixes a regression in multiple windows games using gallium nine and nouveau. v2: adjust num_samplers to keep track of the highest sampler bound v3: rework how to set the new value of num_samplers Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106577 Fixes: `4d6fab245e` "cso: don't track the number of sampler states bound" Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-12-02 00:05:04 +01:00
Andre Heider	b6f095f7ce	d3dadapter9: use snprintf(..., "%s", ...) instead of strncpy Fixes -Wstringop-truncation compiler warnings. See `f836d799f9` "intel/decoder: use snprintf(..., "%s", ...) instead of strncpy" Signed-off-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Axel Davy <davyaxel0@gmail.com>	2018-12-01 21:32:53 +01:00
Mauro Rossi	37a2072e97	android: st/mesa: fix building error due to sched_getcpu() Android has the cpufeatures library, but pinning of threads is not supported. The PIPE_OS_LINUX code path causes a build error because sched_getcpu() is unavailable, so we need to avoid setting HAVE_SCHED_GETCPU for Android. Fixes: `48f2160` ("st/mesa: regularly re-pin driver threads to the CCX where the app thread is") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-01 10:15:58 +01:00
Vinson Lee	4f74580d30	st/xvmc: Add X11 include path. This patch fixes this build error. CC tests/xvmc_bench.o In file included from tests/xvmc_bench.c:35: tests/testlib.h:38:10: fatal error: 'X11/Xlib.h' file not found ^~~~~~~~~~~~ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-30 22:09:43 -08:00
Mauro Rossi	eed3f1121c	android: amd/addrlib: update Mesa's copy of addrlib Needed to fix build error in addrlib in mesa for Android Fixes: `776b911` ("amd/addrlib: update Mesa's copy of addrlib") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-12-01 01:13:53 +01:00
Gurchetan Singh	89b4798c06	virgl: don't mark buffers as unclean after a write We can mark the buffer unclean if it's ever bound as a TBO, SSBO, ABO, or image. This improves dEQP-GLES3.performance.buffer.data_upload.function_call.map_buffer_range.new_specified_buffer.flag_write_full.stream_draw from 9.58 MB/s to 451.17 MB/s. v2: Track buffer cleanliness as a function of bindings (Ilia). v3: virgl_modify_clean --> virgl_dirty_res (Erik) Tested-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2018-11-30 12:21:01 +01:00
Gurchetan Singh	d18492c64f	virgl: avoid large inline transfers We flush every time the command buffer (16 kB) is full, which is quite costly. This improves dEQP-GLES3.performance.buffer.data_upload.function_call.buffer_data.new_buffer.usage_stream_draw from 111.16 MB/s to 1930.36 MB/s. In addition, I made the benchmark produce buffers from 0 --> VIRGL_MAX_CMDBUF_DWORDS * 4, and tried ((VIRGL_MAX_CMDBUF_DWORDS * 4) / 2), ((VIRGL_MAX_CMDBUF_DWORDS * 4) / 4), etc. I didn't notice any clear differences, so let's just go with the most obvious heuristic. Tested-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2018-11-30 12:20:41 +01:00
Gurchetan Singh	c0773315af	virgl: quadruple command buffer size Tested running WebGL aquarium on Nvidia host (10,000 fishes) This moves us from 7 fps to 9 fps. After quadrupling, performance gains diminish. v2: Remove change ID (Erik) Tested-By: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2018-11-30 12:20:06 +01:00
Lionel Landwerlin	37f9788e9a	anv: flush pipeline before query result copies Pipeline state pending bits should be taken into account when copying results. In the particular bug below, the results of the vkCmdCopyQueryPoolResults() command were being overwritten by the preceding vkCmdCopyBuffer() with the same destination buffer. This is because we copy the buffers using the 3D pipeline whereas we copy the query results using the command streamer. Those pieces of HW work in parallel and the results are somewhat undefined. v2: Unconditionally flush the pipeline before copying the results (Jason) v3: Wrap & expressions (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108894 Cc: mesa-stable@lists.freedesktop.org	2018-11-29 22:07:31 +00:00
Marek Olšák	39b20b7d4f	Revert "winsys/amdgpu: overallocate buffers for faster address translation on Gfx9" I didn't mean to push this. I don't think it makes any difference. This reverts commit `f737fe00a0`.	2018-11-29 14:46:06 -05:00
Roland Scheidegger	fbf95ce074	draw: fix infinite loop in line stippling The calculated length of a line may be infinite, if the coords we get are bogus. This leads to an infinite loop in line stippling. To prevent this, test for this explicitly (although technically on at least x86 sse it would actually work without the explicit test, as long as we use the int-converted length value). While here also get rid of some always-true condition. Note this does not actually solve the root cause, which is that the coords we receive are bogus after clipping. This seems a difficult problem to solve. One issue is that due to float arithmetic, clip w may become 0 after clipping if the incoming geometry is "sufficiently degenerate", hence x/y/z ndc (and window) coords will be all inf (or nan). Even with w not quite 0, I believe it's possible we produce values which are actually outside the view volume. (Also, x=y=z=w=0 coords in clipspace would be not considered subject to clipping, and similarly result in all NaN coords.) We just hope for now other draw stages (and rasterizers) can handle those relatively safely (llvmpipe itself should be sort of robust against this, certainly conversion to fixed point will produce garbage, it might fail a couple assertions but should neither hang nor crash otherwise). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-11-29 18:39:40 +01:00
Józef Kucia	94bfb8bf38	nir: Fix assert in print_intrinsic_instr(). Signed-off-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-29 16:29:37 +00:00
Nicolai Hähnle	776b911365	amd/addrlib: update Mesa's copy of addrlib Update to the internal master as of 2018-11-15. This has a lot of gratuitous whitespace change, but on the plus side it's built using the same tooling that's used for AMDVLK, which should help going forward.	2018-11-29 13:18:24 +01:00
Nicolai Hähnle	621c107760	ac/surface/gfx9: let addrlib choose the preferred swizzle kind Our choices here are simply redundant as long as sin.flags is set correctly. (v2: - remove unused function parameter) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-11-29 13:18:23 +01:00
Nicolai Hähnle	729ebdf07e	radv: remove dependency on addrlib gfx9_enum.h v2: - use SI_CONTEXT_REG_OFFSET Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-29 13:18:23 +01:00
Thomas Hellstrom	058f85d41c	winsys/svga: Fix a memory leak The ioctl.cap_3d member was never freed. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-29 10:42:06 +01:00
Thomas Hellstrom	7fce3ca375	st/xa: Fix a memory leak Free the context after destruction. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-29 10:42:06 +01:00
Samuel Pitoiset	cc7deb749c	radv: drop few useless state changes when doing color/depth decompressions Viewport/scissor don't need to be updated for array textures. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-29 10:18:55 +01:00
Samuel Pitoiset	6d4f65deea	radv: remove unused pending_clears param in the transition path Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-29 10:18:53 +01:00
Samuel Pitoiset	4b9df824f7	radv: optimize CmdClear{Color,DepthStencil}Image() for layered textures If all layers are bound we can perform a fast color or depth clear instead of iterating over all layers. This has the advantage of avoiding trashing the framebuffer for nothing if we end up doing a fast clear when calling radv_clear_image_layer(), and clearing all layers in one shot is obviously faster. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-29 10:18:42 +01:00
Samuel Pitoiset	7484bc894b	radv: refactor the fast clear path for better re-use Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-29 10:18:42 +01:00
Samuel Pitoiset	f78ee19702	radv: simplify a check in emit_fast_color_clear() Currently only true if RADV_PERFTEST=dccmsaa is set. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-29 10:18:42 +01:00
Samuel Pitoiset	eca931a726	radv: add radv_can_fast_clear_{color,depth}() helpers For further optimisations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-29 10:18:42 +01:00
Samuel Pitoiset	93f5ce8fa7	radv: add radv_image_view_can_fast_clear() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-29 10:18:42 +01:00
Samuel Pitoiset	aeaf8dbd09	radv: add radv_image_can_fast_clear() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-29 10:18:42 +01:00
Samuel Pitoiset	3e718db1ff	radv: remove useless check in emit_fast_color_clear() The driver doesn't support DCC/CMASK for mipmapped textures. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-29 10:18:42 +01:00
Vinson Lee	d0c7b079d0	freedreno: Fix autotools build. Fix build error. CXXLD pipe_msm.la ../../../../src/gallium/drivers/freedreno/.libs/libfreedreno.a(freedreno_batch.o): In function `batch_init': src/gallium/drivers/freedreno/freedreno_batch.c:54: undefined reference to `fd_device_version' src/gallium/drivers/freedreno/freedreno_batch.c:59: undefined reference to `fd_submit_new' src/gallium/drivers/freedreno/freedreno_batch.c:61: undefined reference to `fd_submit_new_ringbuffer' src/gallium/drivers/freedreno/freedreno_batch.c:64: undefined reference to `fd_submit_new_ringbuffer' src/gallium/drivers/freedreno/freedreno_batch.c:66: undefined reference to `fd_submit_new_ringbuffer' src/gallium/drivers/freedreno/freedreno_batch.c:70: undefined reference to `fd_submit_new_ringbuffer' Fixes: `b4476138d5` ("freedreno: move drm to common location") Fixes: `aa0fed10d3` ("freedreno: move ir3 to common location") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-11-28 22:23:52 -08:00
Marek Olšák	075fd5d8f2	radeonsi: add memory management stress tests for GDS Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-28 20:20:27 -05:00
Marek Olšák	c1d3c08699	winsys/amdgpu: add support for allocating GDS and OA resources Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-28 20:20:27 -05:00
Marek Olšák	d7a4fa91f0	radeonsi: allow si_cp_dma_clear_buffer to clear GDS from any IB Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-28 20:20:27 -05:00
Marek Olšák	72b2b61d8c	winsys/amdgpu: use optimal VM alignment for CPU allocations Acked-by: Christian König <christian.koenig@amd.com>	2018-11-28 20:20:27 -05:00
Marek Olšák	27f9935075	winsys/amdgpu: use optimal VM alignment for imported buffers Window system buffers didn't use the optimal alignment. Acked-by: Christian König <christian.koenig@amd.com>	2018-11-28 20:20:27 -05:00
Marek Olšák	6b554d863f	winsys/amdgpu,radeon: pass vm_alignment to buffer_from_handle Acked-by: Christian König <christian.koenig@amd.com>	2018-11-28 20:20:27 -05:00
Marek Olšák	f737fe00a0	winsys/amdgpu: overallocate buffers for faster address translation on Gfx9 Sadly, the 3 games I tested (DeusEx:MD, DiRT Rally, DOTA 2) are unaffected by the overallocation, because I guess their buffers don't fall into the small range below a power-of-two size. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-28 20:20:27 -05:00
Marek Olšák	8c00f778fc	winsys/amdgpu: increase the VM alignment to the MSB of the size for Gfx9 Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-28 20:20:27 -05:00
Marek Olšák	a2a6b06d48	winsys/amdgpu: use >= instead of > for VM address alignment Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-28 20:20:27 -05:00
Marek Olšák	98f2312b4f	winsys/amdgpu: clean up code around BO VM alignment Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-28 20:20:27 -05:00
Marek Olšák	5f9ccf827e	winsys/amdgpu: optimize slab allocation for 2 MB amdgpu page tables - the slab buffer size increased from 128 KB to 2 MB (PTE fragment size) - the max suballocated buffer size increased from 64 KB to 256 KB, this increases memory usage because it wastes memory - the number of suballocators increased from 1 to 3 and they are layered on top of each other to minimize unused space in slabs The final increase in memory usage is: DeusEx:MD: 1.8% DOTA 2: 1.75% DiRT Rally: 0.2% The kernel driver will also receive fewer buffers.	2018-11-28 20:20:27 -05:00
Marek Olšák	cf6835485c	radeonsi: generalize the slab allocator code to allow layered slab allocators There is no change in behavior. It just makes it easier to change the number of slab allocators.	2018-11-28 20:20:27 -05:00
Marek Olšák	9576266a37	winsys/amdgpu: always reclaim/release slabs if there is not enough memory	2018-11-28 20:20:27 -05:00
Marek Olšák	015061beb3	radeonsi: fix is_oneway_access_only for bindless images	2018-11-28 20:20:27 -05:00
Marek Olšák	8c25ab1a23	radeonsi/nir: parse more information about bindless usage fill more tgsi_shader_info fields.	2018-11-28 20:20:27 -05:00
Marek Olšák	2a936f8afa	tgsi/scan: add more information about bindless usage radeonsi will use this.	2018-11-28 20:20:27 -05:00
Marek Olšák	fba91b5173	radeonsi: small cleanup for memory opcodes	2018-11-28 20:20:27 -05:00
Marek Olšák	709905cbb6	radeonsi: fix is_oneway_access_only for image stores We need to look at the Dst for image stores.	2018-11-28 20:20:27 -05:00
Marek Olšák	648dc52367	radeonsi: use structured buffer intrinsics for image views to stop using the workaround in si_make_buffer_descriptor.	2018-11-28 20:20:27 -05:00
Marek Olšák	442dae2693	radeonsi: clean up primitive binning enablement no change in behavior. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-28 20:20:27 -05:00
Dave Airlie	8eb8be3f54	virgl: fix undefined shift to use unsigned. Ported from virglrenderer. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-11-29 09:09:31 +10:00
Dave Airlie	2ddd44d941	r600: make suballocator 256-bytes align Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108311 Cc: <mesa-stable@lists.freedesktop.org>	2018-11-29 09:09:02 +10:00
Kenneth Graunke	f11780779f	intel/compiler: Use nir's info when checking uses_streams. Vulkan and Gallium don't use Mesa's gl_program data structure, so they can't poke at 'prog'. But we can simply use the copy of the shader info stored with the NIR shader, which is guaranteed to exist. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-11-28 13:35:29 -08:00
Jason Ekstrand	199a0353d6	nir/derefs: Add a nir_derefs_do_not_alias enum value This makes some of the code more clear. Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-11-28 14:29:25 -06:00
Gurchetan Singh	eb44c36cf1	egl: add missing #include <stddef.h> in egldevice.h Otherwise, I get this error: main/egldevice.h:54:13: error: ‘NULL’ undeclared (first use in this function) dev = NULL; ^~~~ with this config: ./autogen.sh --enable-gles1 --enable-gles2 --with-platforms='surfaceless' --disable-glx --with-dri-drivers="i965" --with-gallium-drivers="" --enable-gbm v3: Use stddef.h (Matt) v4: Modify commit message (Eric) Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-28 11:22:47 -08:00
Matt Turner	2d48d5116b	gallivm: Use nextafterf(0.5, 0.0) as rounding constant The common truncf(x + 0.5) fails for the floating-point value just less than 0.5 (nextafterf(0.5, 0.0)). nextafterf(0.5, 0.0) + 0.5, after rounding is 1.0, thus truncf does not produce the desired value. The solution is to add nextafterf(0.5, 0.0) instead of 0.5 before truncating. This works for all values. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-11-28 11:22:47 -08:00
Juan A. Suarez Romero	e2ad94d928	docs: update calendar, add news item and link release notes for 18.2.6 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-11-28 19:20:09 +01:00
Juan A. Suarez Romero	a53a280479	docs: add sha256 checksums for 18.2.6 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `cfd1f8b92c`)	2018-11-28 19:20:09 +01:00
Juan A. Suarez Romero	f6ab6e2867	docs: add release notes for 18.2.6 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `3e741344d7`)	2018-11-28 19:20:09 +01:00
Nicolai Hähnle	c02390f8fc	egl/wayland: rather obvious build fix Fixes: `ce74a7bb8d` ("egl/wayland: plug memory leak in drm_handle_device()") Fixes: `c59d3aa4b9` ("egl/wayland: bail out when drmGetMagic fails")	2018-11-28 18:30:36 +01:00
Nicolai Hähnle	eb94b6bd5c	winsys/amdgpu: explicitly declare whether buffer_map is permanent or not Introduce a new driver-private transfer flag RADEON_TRANSFER_TEMPORARY that specifies whether the caller will use buffer_unmap or not. The default behavior is set to permanent maps, because that's what drivers do for Gallium buffer maps. This should eliminate the need for hacks in libdrm. Assertions are added to catch when the buffer_unmap calls don't match the (temporary) buffer_map calls. I did my best to update r600 for consistency (r300 needs no changes because it never calls buffer_unmap), even though the radeon winsys ignores the new flag. As an added bonus, this should actually improve the performance of the normal fast path, because we no longer call into libdrm at all after the first map, and there's one less atomic in the winsys itself (there are now no atomics left in the UNSYNCHRONIZED fast path). Cc: Leo Liu <leo.liu@amd.com> v2: - remove comment about visible VRAM (Marek) - don't rely on amdgpu_bo_cpu_map doing an atomic write Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-11-28 18:24:14 +01:00
Nicolai Hähnle	35eb81987c	winsys/amdgpu: add amdgpu_winsys_bo::lock We'll use it in the upcoming mapping change. Sparse buffers have always had one. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-11-28 18:23:29 +01:00
Eric Engestrom	e0f1f74eda	vulkan/wsi: fix s/,/;/ typo Fixes: `59e58c348e` "vulkan/wsi: Only wait on semaphores on the first swapchain" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-28 16:44:01 +00:00
Emil Velikov	ce74a7bb8d	egl/wayland: plug memory leak in drm_handle_device() As we fail to open the node, we leak the node/device name. v2: Log and then free() (Eric) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-28 16:12:12 +00:00
Emil Velikov	c59d3aa4b9	egl/wayland: bail out when drmGetMagic fails Currently as the function fails, we pass uninitialized data to the authentication function. Stop doing that and print a warning when the function fails. v2: Plug memory leak in error path (Eric) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-28 16:11:22 +00:00
Eric Engestrom	9575cd2893	wsi/display: fix mem leak when freeing swapchains Fixes: `da997ebec9` "vulkan: Add KHR_display extension using DRM [v10]" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Keith Packard <keithp@keithp.com>	2018-11-28 12:09:54 +00:00
Gert Wollny	f08d107054	i965: Set the FBO error state INCOMPLETE_ATTACHMENT only for SRGB_R8 Originally the driver reported GL_FRAMEBUFFER_UNSUPPORTED in all cases, adding more specific error messages was not correct and broke many tests. Mostly revert this and only report GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT for MESA_FORMAT_R_SRGB8. Fixes: `ebcde34545` i965: be more specific about FBO completeness errors Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108805 Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-28 10:12:47 +01:00
Gert Wollny	d8bb88d0b4	i965: Explicitely handle swizzles for MESA_FORMAT_R_SRGB8 The format is emulated by using ISL_FORMAT_L8_SRGB, therefore we need to force swizzles for the GBA channels. However, doing this only based on the data type GL_RED breaks other formats, therefore, test specifically for the format. Fixes: `c5363869d4` i965: Force zero swizzles for unused components in GL_RED and GL_RG Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-28 10:07:02 +01:00
Gert Wollny	091295d7cb	virgl: Don't try handling server fences when they are not supported vtest doesn't implement the corresponding API and would segfault: Program received signal SIGSEGV, Segmentation fault. #0 0x0000000000000000 in ?? () #1 in virgl_fence_server_sync at src/gallium/drivers/virgl/virgl_context.c:1049 #2 in st_server_wait_sync at src/mesa/state_tracker/st_cb_syncobj.c:155 so just don't do the call when the function pointers are not set. Fixes dEQP: dEQP-GLES3.functional.fence_sync.wait_sync_smalldraw dEQP-GLES3.functional.fence_sync.wait_sync_largedraw Fixes: `d1a1c21e76` virgl: native fence fd support Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Robert Foss <robert.foss@collabora.com>	2018-11-28 10:02:31 +01:00
Gert Wollny	073fdd7382	virgl,vtest: Initialize return value Avoids: Conditional jump or move depends on uninitialised value(s) at 0x9E2B39F: virgl_vtest_winsys_resource_cache_create (virgl_vtest_winsys.c:379) by 0x9E2725F: virgl_buffer_create (virgl_buffer.c:169) by 0x9E246D5: virgl_resource_create (virgl_resource.c:60) by 0xA0C1B9F: bufferobj_data (st_cb_bufferobjects.c:344) by 0xA0C1B9F: st_bufferobj_data (st_cb_bufferobjects.c:390) by 0x9F4ACE3: vbo_use_buffer_objects (vbo_exec_api.c:1136) by 0xA0C68C3: st_create_context_priv (st_context.c:416) by 0xA0C707A: st_create_context (st_context.c:598) by 0x9F81C6B: st_api_create_context (st_manager.c:918) by 0x9BBE591: dri_create_context (dri_context.c:161) by 0x9BB6931: driCreateContextAttribs (dri_util.c:473) by 0x4E97A44: drisw_create_context_attribs (drisw_glx.c:630) by 0x4E7C591: glXCreateContextAttribsARB (create_context.c:78) Uninitialised value was created by a stack allocation at 0x9E2B249: virgl_vtest_winsys_resource_cache_create (virgl_vtest_winsys.c:342) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Robert Foss <robert.foss@collabora.com>	2018-11-28 10:02:31 +01:00
Iago Toral Quiroga	e55cbf26ea	intel/compiler: fix register allocation in opt_peephole_sel This wasn't handling 64-bit cases properly. Found by inspection. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-11-28 08:28:27 +01:00
Matt Turner	6f737b9207	glsl: Remove unused member variable Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-27 22:29:53 -08:00
Matt Turner	1a210268b8	nir: Call fflush() at the end of nir_print_shader() We normally call with stderr which is unbuffered, so this won't affect that, but it does let me call nir_print_shader(nir, fopen("log", "w+")) from gdb and actually get the whole shader in my file. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-27 22:29:53 -08:00
Eric Anholt	e113b21cb7	v3d: Add renderonly support. I've been using this with the kmsro series to test v3d on VKMS without my old KMS hack in the v3d kernel driver. KMSRO still needs some cleanup, but v3d RO support seems reasonable.	2018-11-27 15:03:02 -08:00
Eric Anholt	55edafa73e	gallium: Remove unused variable in u_tests. Fixes: `0d17b685b1` ("gallium/u_tests: add a compute shader test that clears an image") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-11-27 15:02:57 -08:00
Bas Nieuwenhuizen	6569644bb6	radv: Align large buffers to the fragment size. Improves performance in Talos by about 15% (and significant improvements in RotR and possibly other but did not bench with final patch) on kernel 4.19 and earlier. On 4.20+ a similar effect comes from 433ca054949a "drm/amdgpu: try allocating VRAM as power of two" v2: Do not impact the alignment of the physical memory. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> CC: <mesa-stable@lists.freedesktop.org>	2018-11-27 22:17:42 +01:00
Hyunjun Ko	76945e4140	freedreno: implements get_sample_position Since `1285f71d3e` landed, it needs to provide apps with proper sample position for MSAA. There is currently no way to query this from the hw, so these are taken from the blob driver. Fixes: dEQP-GLES31.functional.texture.multisample.samples_#.sample_position Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:03 -05:00
Rob Clark	5973a4d0b7	freedreno/a3xx: also set FSSUPERTHREADENABLE We set equiv bit in SP_FS_CTRL_REG0. Somehow the hw doesn't hang with this mismatched config, but does run slower. It is faster with either neither bit set, or both bits set, but both is the fastest of the three configurations. Worth a bit over 10% gain in glmark2. Spotted-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:03 -05:00
Jonathan Marek	e68cd91251	freedreno: use MSM_BO_SCANOUT with scanout buffers Signed-off-by: Jonathan Marek <jonathan@marek.ca>	2018-11-27 15:44:03 -05:00
Jonathan Marek	3ed4aad524	freedreno: use GENERIC instead of TEXCOORD for blit program blip_fp uses GENERIC as input, so blit_vp should match for linking Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:03 -05:00
Jonathan Marek	3a273a4abc	freedreno: a2xx texture update Adds all missing texture related logic. For everything to work it also needs changes to ir2/fd2_program, which are part of the ir2 update patch. Note: it needs rnndb update Signed-off-by: Jonathan Marek <jonathan@marek.ca> [remove stray patch] Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:03 -05:00
Jonathan Marek	4887aba638	freedreno/a2xx: Compute depth base in gmem correctly Note: it needs rnndb update Signed-off-by: Marek Vasut <marex@denx.de> Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:03 -05:00
Jonathan Marek	e7114575f7	freedreno/a2xx: set VIZ_QUERY_ID on a20x Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:03 -05:00
Jonathan Marek	a50b8a0152	freedreno: add missing a20x ids 200: 256KiB GMEM A200 (imx53) 201: 128KiB GMEM A200 (imx51) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:03 -05:00
Jonathan Marek	4e6ee033ff	freedreno/a2xx: fix POINT_MINMAX_MAX overflow As it stands, it overflows to zero. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:03 -05:00
Jonathan Marek	78fede86d9	freedreno: a2xx: fd2_draw update Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Jonathan Marek	3e7186d472	nir: add fceil lowering lowers ceil(x) as -floor(-x) Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	11593f9041	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	d47d77d49d	freedreno/a6xx: set guardband clip On older gens, the CLIP_ADJ bitfields were actually 3.6 fixed point. Which might make more sense. Although this formula comes up with values pretty close to what blob does for various viewport sizes (for at least a5xx and a6xx), and seems to work. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	2773919f06	freedreno/a6xx: disable LRZ for z32 `f6131d4ec7` had the side effect of enabling LRZ w/ 32b depth buffers. But there are some bugs with this, which aren't fully understood yet, so for now just skip LRZ w/ z32.. Fixes: `f6131d4ec7` freedreno/a6xx: Clear z32 and separate stencil with blitter Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Kristian H. Kristensen	9595be67a9	freedreno/a6xx: Clear gmem buffers at flush time We generate an IB to clear the gmem at flush time and jump to it before rendering each tile. This lets us get rid of the command stream patching for gmem offsets. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Kristian H. Kristensen	b5a9bb28c6	freedreno/a6xx: Move resolve blits to an IB Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Kristian H. Kristensen	5f068cf3b0	freedreno/a6xx: Move restore blits to IB Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	09300bbe03	mesa/st: better colormask check for clear fallback For RGB surfaces (for example) we don't really care that the colormask is 0x7 instead of 0xf. This should not trigger clear_with_quad() slowpath. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-11-27 15:44:02 -05:00
Rob Clark	65cee01430	mesa/st: swap order of clear() and clear_with_quad() If we can't clear all the buffers with pctx->clear() (say, for example, because of ColorMask), push the buffers we can clear with pctx->clear() first. Tilers want to see clears coming before draws to enable fast- paths, and clearing one of the attachments with a quad-draw first confuses that logic. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-27 15:44:02 -05:00
Rob Clark	aa0fed10d3	freedreno: move ir3 to common location Move (most of) the ir3 compiler to src/freedreno/ir3 so that it can be re-used by some future vulkan driver. The parts that are gallium specific have been refactored out and remain in the gallium driver. Getting the move done now so that it can happen before further refactoring to support a6xx specific instructions. NOTE also removes ir3_cmdline compiler tool from autotools build since that was easier than fixing it and I normally use meson build. Waiting patiently for the day that we can remove everything from the autotools build. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	556eec249d	freedreno/ir3: remove u_inlines usage Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	312eae45a3	freedreno/ir3: split up ir3_shader Split the parts that are gallium specific into ir3_gallium so the rest can move to a common location outside of gallium. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	ea4cbf601d	freedreno/ir3: remove pipe_stream_output_info dependency A bit annoying to have to copy into our own struct. But this is something the compiler really needs to know, at least on earlier generations where streamout is implemented in shader. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	030e98630d	freedreno/ir3: some header file cleanup Clean up some of the low-hanging-fruit usages of freedreno_util.h Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	2482153d52	freedreno/ir3: use env_var_as_unsigned() Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	a321f939f6	util: env_var_as_unsigned() helper So I can drop env2u() helper from freedreno_util.h and get rid of one small ir3 dependency on gallium/freedreno Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	bfd8d26372	freedreno/ir3: move disasm and optmsgs debug flags Move them to IR3_SHADER_DEBUG so we can remove ir3's dependency on fd_mesa_debug. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	424d75656f	freedreno: FD_SHADER_DEBUG -> IR3_SHADER_DEBUG Only used by ir3, so move it into ir3 to be more self contained. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	8a654f092e	freedreno: remove shader_stage_name() Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	c635703c50	freedreno: shader_t -> gl_shader_stage Just massive search/replace for the most part. Step towards removing ir3 dependency on disasm.h which is shared by a2xx. One step closer to being able to move ir3 out of gallium. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	388aac32ed	freedreno/ir3: standalone compiler updates Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	b4476138d5	freedreno: move drm to common location So that we can re-use at least parts of it for vulkan driver, and so that we can move ir3 to a common location (which uses fd_bo to allocate storage for shaders) Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Rob Clark	6cb74eb4f1	freedreno/drm: remove dependency on gallium driver Prep work to move drm to a common location. Slightly hacky, but the softpin debug flag is only temporary. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Dylan Baker	88c4680b5a	util: promote u_memory to src/util as well as os_memory* Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-11-27 15:44:02 -05:00
Eric Anholt	bade179153	gallium: Fix uninitialized variable warning in compute test. The compiler doesn't know that ny != 0, so x might be uninitialized for the printf at the end. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-11-27 11:23:22 -08:00
Bas Nieuwenhuizen	08ea6b9d9b	radv: Clamp gfx9 image view extents to the allocated image extents. Mirrors AMDVLK. Looks like if we go over the alignment of height we actually start to change the addressing. Seems like the extra miplevels actually work with this. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108245 Fixes: `f6cc15dccd` "radv/gfx9: fix block compression texture views. (v2)" Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-11-27 10:19:52 +01:00
Iago Toral Quiroga	453570cd8c	intel/compiler: fix indentation style in opt_algebraic()	2018-11-27 09:53:09 +01:00
Anuj Phogat	16e4911972	anv/icl: Set use full ways in L3CNTLREG L3 allocation table in h/w specification recommends using 4 KB granularity for programming allocation fields in L3CNTLREG. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-11-26 15:11:36 -08:00
Anuj Phogat	3f55fd3814	intel/icl: Set way_size_per_bank to 4 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-11-26 15:11:36 -08:00
Anuj Phogat	3ce04da5b4	i965/icl: Set use full ways in L3CNTLREG L3 allocation table in h/w specification recommends using 4 KB granularity for programming allocation fields in L3CNTLREG. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-11-26 15:11:36 -08:00
Anuj Phogat	3282c7be89	i965/icl: Fix L3 configurations Use L3 configuration specified in h/w specification. V2: Drop configs which do under allocation of l3 cache. Bump up the comment above table. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-11-26 15:11:36 -08:00
Eric Engestrom	c0c533767e	build: stop defining unused VERSION Scons and autotools don't define it, and as of last commit nothing uses it. `VERSION` is also a generic enough name that something somewhere will eventually clash, and we don't want to repeat the LLVM `DEBUG` fiasco. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-11-26 22:05:02 +00:00
Eric Engestrom	bd12e02530	vulkan/utils: s/VERSION/PACKAGE_VERSION/ Everything else uses PACKAGE_VERSION, so let's be consistent, and VERSION and PACKAGE_VERSION are currently defined to be the same in meson and android, while VERSION is undefined in autotools and scons. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-11-26 22:05:02 +00:00
Eric Engestrom	56d126f8fd	anv: correctly use vulkan 1.0 by default Per chapter 3.2 "Instances": > Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing > an apiVersion of 0 is equivalent to providing an apiVersion of > VK_MAKE_VERSION(1,0,0). Reported-by: Niklas Haas <git@haasn.xyz> Fixes: `8c048af589` "anv: Copy the appliation info into the instance" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-26 22:05:02 +00:00
Erik Faye-Lund	d6d35d87f1	mesa/main: fixup requirements for GL_PRIMITIVES_GENERATED This enum is also allowed by EXT_tessellation_shader, which is supported on older i965 HW (as opposed to OES_geometry_shader). This was missed when narrowing this code-path, leading to dEQP regressions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108868 Fixes: `f09d94fbd1` "mesa/main: fix validation of transform-feedback queries" Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2018-11-26 22:12:07 +01:00
Erik Faye-Lund	c120dbfe4d	mesa/main: fix incorrect depth-error If glGetTexImage or glGetnTexImage is called with a level that doesn't exist, we get an error message of this form: Mesa: User error: GL_INVALID_VALUE in glGetTexImage(depth = 0) This is clearly nonsensical, because these APIs don't even have a depth-parameter. The reason is that get_texture_image_dims() returns all-zero dimensions for non-existent texture-images, and we go on to validate these dimensions as if they were user-input, because glGetTextureSubImage requires checking. So let's split this logic in two, so glGetTextureSubImage can have stricter input-validation. All arguments that are no longer validated are generated internally by mesa, so there's no use in validating them. Fixes: `42891dbaa1` "gettextsubimage: verify zoffset and depth are correct" Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-11-26 12:29:54 +01:00
Erik Faye-Lund	38af69adfa	mesa/main: check cube-completeness in common code This check is the only part of dimensions_error_check that isn't about error-checking the offset and size arguments of glGet[Compressed]TextureSubImage(), so it doesn't really belong in here. This doesn't make a difference right now, apart from changing the precedence of this error. But it will make a difference for the next patch, where we no longer call this method from the non-sub tex-image getters. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-11-26 12:29:54 +01:00
Erik Faye-Lund	42820c5727	mesa/main: factor out common error-checking This error checking is the same for teximage and texsubimage getters, so let's factor it out to its own function. This will be useful when getteximage and gettexsubimage get their own error checking routines a bit later. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-11-26 12:29:54 +01:00
Erik Faye-Lund	5e0a84f31c	mesa/main: factor out tex-image error-checking This will be useful when we split error-checking for getteximage and gettexsubimage later. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-11-26 12:29:54 +01:00
Erik Faye-Lund	38bbb61252	mesa/main: remove bogus error for zero-sized images The explanation quotes the spec on the following wording to justify the error: "An INVALID_VALUE error is generated if xoffset + width is greater than the texture’s width, yoffset + height is greater than the texture’s height, or zoffset + depth is greater than the texture’s depth." However, this shouldn't generate an error in the case where all three of width, xoffset and the texture's width are zero. In this case, we end up generating an unspecified error. So let's remove this check, and instead make sure that we consider this as an empty texture. So let's not generate an error; none is mandated by the spec in the xoffset/yoffset/zoffset = 0 case. We already avoid doing any work in this case, because of the final, non-error generating check in this function. Fixes: `b37b35a5d2` "getteximage: assume texture image is empty for non defined levels" Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-11-26 12:29:54 +01:00
Erik Faye-Lund	f1998e15ff	mesa/main: remove ARB suffix from glGetnTexImage This function has been core since OpenGL 4.3, so naming the implementation and reporting errors using an ARB-suffix can be confusing. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-11-26 12:29:54 +01:00
Gert Wollny	f5d053702f	glsl: free or reuse memory allocated for TF varying When a shader program is de-serialized the gl_shader_program passed in may actually still hold memory allocations for the transform feedback varyings. If that is the case, free the varying names and reallocate the new storage for the names array. This fixes a memory leak: Direct leak of 48 byte(s) in 6 object(s) allocated from: in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880) in transform_feedback_varyings ../../samba/mesa/src/mesa/main/transformfeedback.c:875 in _mesa_TransformFeedbackVaryings ../../samba/mesa/src/mesa/main/transformfeedback.c:985 ... Indirect leak of 42 byte(s) in 6 object(s) allocated from: in __interceptor_strdup (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0x761c8) in transform_feedback_varyings ../../samba/mesa/src/mesa/main/transformfeedback.c:887 in _mesa_TransformFeedbackVaryings ../../samba/mesa/src/mesa/main/transformfeedback.c:985 Fixes: `ab2643e4b0` glsl: serialize data from glTransformFeedbackVaryings Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-26 09:58:25 +01:00
Bas Nieuwenhuizen	3c96a1e3a9	radv: Fix opaque metadata descriptor last layer. We used the layer count which results in an off by one error. Not sure this really affects anything. Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-26 09:29:39 +01:00
Mathias Fröhlich	ff466c2d48	mesa/st: Make st_pipe_vertex_format static. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-26 07:57:10 +01:00
Mathias Fröhlich	2a3eae82a1	mesa/st: Use binding information from the VAO in feedback rendering. Use VAO binding information in feedback rendering. In theory it should reduce the amount of buffer objects scheduled for rendering. Feedback rendering is implemented in a crude way anyhow, so I do not expect much gain here. But for the sake of code reuse we should use the same code for the same task. And finally if feedback rendering gets improved, the array setup is already well done there. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-26 07:57:10 +01:00
Mathias Fröhlich	a00a8fb8d1	mesa/st: Avoid extra references in the feedback draw function scope. The change removes the reference that is held on the entries of the vbuffers[] array. The new code does not do that anymore, as following the code into draw_set_vertex_buffers() the draw context holds another reference until it is reset again further down the function. So by that argument it should already be safe to remove that additional reference count. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-26 07:57:10 +01:00
Mathias Fröhlich	6705188cc5	mesa/st: Factor out array and buffer setup from st_atom_array.c. Factor out vertex array setup routines from the array state atom. The factored functions will be used in feedback rendering in the next change. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-26 07:57:09 +01:00
Mathias Fröhlich	774d585d49	mesa/st: Only unmap the uploader that was actually used. In st_atom_array, we only need to unmap the upload buffer that was actually used. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-26 07:57:09 +01:00
Mathias Fröhlich	65332aff29	mesa/st: Only care about the uploader if it was used. In st_atom_array, we only need to care for unmapping the upload buffer if we actually used it. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-26 07:57:09 +01:00
Ilia Mirkin	927ce66b39	nv50/ir: remove dnz flag when converting MAD to ADD due to optimizations dnz flag only applies for multiplications (e.g. to make 0 * Infinity become 0 instead of NaN). Once we optimize a MAD into an ADD, the dnz flag no longer makes sense, and upsets the GM107 emitter (since it looks at the ftz and dnz flags together). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-11-24 22:15:53 -05:00
Marek Olšák	d4e7d8b7f0	winsys/amdgpu: fix a device handle leak in amdgpu_winsys_create Cc: 18.2 18.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-23 17:08:44 -05:00
Marek Olšák	82aa07f81f	winsys/amdgpu: fix a buffer leak in amdgpu_bo_from_handle Cc: 18.2 18.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-23 17:08:42 -05:00
Samuel Pitoiset	9fc1ce258c	radv: ignore subpass self-dependencies for CreateRenderPass() too We really need to refactor this... Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-23 11:59:11 +01:00
Samuel Pitoiset	2951a766bd	radv: remove useless sync before CmdClear{Color,DepthStencil}Image() We don't need to flush anything before these two commands as well. This is because they have to be externally synchronized, so the app should have called CmdPipelineBarrier() prior to that and the driver should have flushed the caches. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-23 11:59:08 +01:00
Erik Faye-Lund	a652842982	mesa/main: remove overly strict query-validation The rules encoded in this code also applies to OpenGL ES 3.0 and up, but the per-enum validation has already been taught about these rules. So let's get rid of this duplicate, narrow version of the validation. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-23 10:48:36 +01:00
Erik Faye-Lund	d52be6dd29	mesa/main: fix validation of GL_TIMESTAMP ctx->Extensions.ARB_timer_query is set based on the driver- capabilities, not based on the context type. We need to check against _mesa_has_ARB_timer_query(ctx) instead to figure out if the extension is really supported. We also need to check for EXT_disjoint_timer_query for GLES-support. This shouldn't have any functional effect, as this entry-point is only valid on desktop GL, or on GLES with EXT_disjoint_timer_query in the first place. But if this gets added to the core of a future version of ES, this should be a step in the right direction. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-23 10:48:36 +01:00
Erik Faye-Lund	7a4d74c35a	mesa/main: fix validation of ARB_query_buffer_object ctx->Extensions.ARB_query_buffer_object is set based on the driver- capabilities, not based on the context type. We need to check against _mesa_has_ARB_query_buffer_object(ctx) instead to figure out if the extension is really supported. This turns attempts to read queries into buffer objects on ES 3 into errors, as required by the spec. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-23 10:48:36 +01:00
Erik Faye-Lund	75e39b59dc	mesa/main: fix validation of transform-feedback overflow queries ctx->Extensions.ARB_transform_feedback_overflow_query is set based on the driver-capabilities, not based on the context type. We need to check against _mesa_has_ARB_transform_feedback_overflow_query(ctx) instead to figure out if the extension is really supported. This turns usage of GL_TRANSFORM_FEEDBACK_STREAM_OVERFLOW and GL_TRANSFORM_FEEDBACK_OVERFLOW into errors on ES 3, as required by the spec. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-23 10:48:36 +01:00
Erik Faye-Lund	f09d94fbd1	mesa/main: fix validation of transform-feedback queries ctx->Extensions.EXT_transform_feedback is set based on the driver- capabilities, not based on the context type. We need to check against _mesa_has_EXT_transform_feedback(ctx) instead to figure out if the extension is really supported. We also need to check for OES_geometry_shader. This turns usage of GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN into an error on ES 2, as well as usage of GL_PRIMITIVES_GENERATED on ES 3, both as required by the spec. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-23 10:48:36 +01:00
Erik Faye-Lund	b551fe5fa7	mesa/main: fix validation of GL_TIME_ELAPSED ctx->Extensions.EXT_timer_query is set based on the driver- capabilities, not based on the context type. We need to check against _mesa_has_EXT_timer_query(ctx) instead to figure out if the extension is really supported. We also need to check for EXT_disjoint_timer_query, which enables the same functionality for ES. This turns usage of GL_TIME_ELAPSED into an error on ES 3, as is required by the spec. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-23 10:48:36 +01:00
Erik Faye-Lund	059928e114	mesa/main: fix validation of GL_ANY_SAMPLES_PASSED_CONSERVATIVE ctx->Extensions.ARB_ES3_compatibility is set based on the driver- capabilities, not based on the context type. We need to check against _mesa_has_ARB_ES3_compatibility(ctx) instead to figure out if the extension is really supported. In addition, EXT_occlusion_query_boolean should also allow this behavior. This shouldn't cause any functional change, as all drivers that support ES3_compatibility should in practice enable either ES3_compatibility or EXT_occlusion_query_boolean under all APIs that export this symbol. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-23 10:48:35 +01:00
Erik Faye-Lund	8ea819dd60	mesa/main: fix validation of GL_ANY_SAMPLES_PASSED ctx->Extensions.ARB_occlusion_query2 is set based on the driver- capabilities, not based on the context type. We need to check against _mesa_has_ARB_occlusion_query2(ctx) instead to figure out if the extension is really supported. In addition, EXT_occlusion_query_boolean should also allow this behavior. This shouldn't cause any functional change, as all drivers that support ARB_occlusion_query2 should in practice enable either ARB_occlusion_query2 or EXT_occlusion_query_boolean under all APIs that export this symbol. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-23 10:48:35 +01:00
Erik Faye-Lund	fff1738d57	mesa/main: fix validation of GL_SAMPLES_PASSED ctx->Extensions.ARB_occlusion_query is set based on the driver- capabilities, not based on the context type. We need to check against _mesa_has_ARB_occlusion_query(ctx) instead to figure out if the extension is really supported. We also need to check for ARB_occlusion_query2, as ARB_occlusion_query isn't available in core contexts. This turns usage of GL_SAMPLES_PASSED into an error on ES 3, as is required by the spec. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-23 10:48:35 +01:00
Erik Faye-Lund	9c13ad0ea4	mesa/main: simplify pipeline-statistics query validation The _mesa_has_ARB_pipeline_statistics_query(ctx)-helper will already check the GLES-version according to the extension-table, so if this extension would ever be back-ported to ES, we only need to update the table to support this. This shouldn't have any functional effect. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-23 10:48:35 +01:00
Erik Faye-Lund	dd4241b34f	mesa/main: use non-prefixed enums for consistency These enums all have the same values as their non-prefixed versions, and there's several aliases for some of them. So let's switch to the non-prefixed versions for simplicity. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-23 10:48:35 +01:00
Erik Faye-Lund	ba4e8d3754	mesa/main: correct year for EXT_occlusion_query_boolean According to the extension spec, this was initially released in 2011, so let's set this to the correct value. The value of 2001 could be a copy-paste mistake, as ARB_occlusion_query which this is based on was released then. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-23 10:48:35 +01:00
Erik Faye-Lund	35555b08d7	mesa/main: correct requirement for EXT_occlusion_query_boolean EXT_occlusion_query_boolean requires support for GL_ANY_SAMPLES_PASSED, which ARB_occlusion_query doesn't supply. We need ARB_occlusion_query2 for this instead. This is still not 100% accurate, as we also require support for the GL_SAMPLES_PASSED_CONSERVATIVE target, which isn't guaranteed by either ARB_occlusion_query or ARB_occlusion_query2. But it should be trivial to implement for any driver supporting ARB_occlusion_query2, as it can simply be implemented as GL_ANY_SAMPLES_PASSED. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-23 10:48:35 +01:00
Tapani Pälli	09adaa4b89	anv: allow exporting an imported SYNC_FD semaphore type Fixes issues with following SkQP tests: unitTest_VulkanHardwareBuffer_Vulkan_EGL_Syncs unitTest_VulkanHardwareBuffer_Vulkan_Vulkan_Syncs Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-23 07:49:46 +02:00
Eric Engestrom	896c59d690	glapi: add missing visibility args Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108829 Fixes: `3218056e0e` "meson: Build i965 and dri stack" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-22 18:21:05 +00:00
Jason Ekstrand	a24654b49d	anv/nir: Rework arguments to apply_pipeline_layout Instead of taking a whole pipeline (which could be anything!), just take a physical device and robust_buffer_access boolean. This makes it easier to verify that only the things in the hash actually affect pipeline compilation. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-11-22 09:17:28 -06:00
Jason Ekstrand	617e402b3d	anv: Put robust buffer access in the pipeline hash It affects apply_pipeline_layout. Shaders compiled with the wrong value will work but they may not be robust as requested by the app. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-11-22 09:17:10 -06:00
Jason Ekstrand	a845c2bc10	anv: Expose VK_EXT_scalar_block_layout Our compiler already splits UBO loads into scalars and the untyped surface read messages we use for SSBO reads and writes only require dword alignment. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-11-22 08:16:47 -06:00
Jason Ekstrand	2ca9a4417d	vulkan: Update the XML and headers to 1.1.93 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-11-22 08:16:40 -06:00
Samuel Pitoiset	4ff4af3d91	radv: remove useless sync after CmdClear{Color,DepthStencil}Image() 'post_flush' is only set to NULL for the normal clear path (ie. only vkCmdClearColorImage() and vkCmdClearDepthStencilImage() are affected commands). Because these two operations have to be externally synchronized with VK_PIPELINE_STAGE_TRANSFER_BIT and VK_ACCESS_TRANSFER_WRITE_BIT, it's useless to set those flags internally. VK_PIPELINE_STAGE_TRANSFER_BIT will wait for compute to be idle, while VK_ACCESS_TRANSFER_WRITE_BIT will invalidate both L1 vector caches and L2. RADV_CMD_FLAG_WRITEBACK_GLOBAL_L2 will be superseded by RADV_CMD_FLAG_INV_GLOBAL_L2. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-22 08:56:36 +01:00
Bas Nieuwenhuizen	33b2f74e77	vulkan: Allow storage images in the WSI. Since apps also have to follow the ImageFormatProperties query, we can disallow formats that don't allow image stores (for AMD that would be SRGB formats). Note that this only affects anything if the app actually decides to use the flag. Had someone ask for this on IRC and at least on the AMD side we can support it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-21 21:36:55 +01:00
Axel Davy	1f1d4d571a	st/nine: Remove thread_submit warning thread_submit can be useful even without DRI_PRIME, as it can help avoid missed pageflips. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com>	2018-11-21 19:55:28 +01:00
Axel Davy	d304f0aa31	st/nine: Allow 'triple buffering' with thread_submit The path allowing triple buffering behaviour wasn't implemented yet for thread_submit Signed-off-by: Axel Davy <davyaxel0@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com>	2018-11-21 19:55:28 +01:00
Robert Foss	19af208c7d	virgl: add assert and missing function parameter Verify the pipe_fd_type to be of PIPE_FD_TYPE_NATIVE_SYNC. Fixes: `d1a1c21e76` "virgl: native fence fd support" Suggested-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-21 15:59:00 +01:00
Gert Wollny	61b535437e	r600: clean up the GS ring buffers when the context is destroyed This fixes two memory leaks reported by ASAN: Direct leak of 248 byte(s) in 1 object(s) allocated from: in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880) in r600_alloc_buffer_struct ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:578 in r600_buffer_create ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:600 in r600_resource_create_common ../../samba/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1265 in r600_resource_create ../../samba/mesa/src/gallium/drivers/r600/r600_pipe.c:725 in pipe_buffer_create ../../samba/mesa/src/gallium/auxiliary/util/u_inlines.h:291 in update_gs_block_state ../../samba/mesa/src/gallium/drivers/r600/r600_state_common.c:1482 Direct leak of 248 byte(s) in 1 object(s) allocated from: in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880) in r600_alloc_buffer_struct ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:578 in r600_buffer_create ../../samba/mesa/src/gallium/drivers/r600/r600_buffer_common.c:600 in r600_resource_create_common ../../samba/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1265 in r600_resource_create ../../samba/mesa/src/gallium/drivers/r600/r600_pipe.c:722 in pipe_buffer_create ../../samba/mesa/src/gallium/auxiliary/util/u_inlines.h:291 in update_gs_block_state ../../samba/mesa/src/gallium/drivers/r600/r600_state_common.c:1489 Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Fixes: `1371d65a7f` r600g: initial support for geometry shaders on evergreen (v2) Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-11-21 10:34:17 +01:00
Samuel Pitoiset	4b9bc4791b	radv: only sync CP DMA for transfer operations or bottom pipe CP DMA can only be busy when the driver copies buffers. The only affected Vulkan commands are vkCmdCopyBuffer() and vkCmdUpdateBuffer() (because we fall back to a copy depending on a threshold). Clear operations are currently not concerned because the driver always syncs after the last DMA operation. Per the spec, these two operations have to be externally synchronized with VK_PIPELINE_STAGE_TRANSFER_BIT. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-21 10:03:01 +01:00
Samuel Pitoiset	457ac6ce1e	radv: ignore subpass self-dependencies Unnecessary as they allow the app to call vkCmdPipelineBarrier() inside the render pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-21 10:02:59 +01:00
Iago Toral Quiroga	8e73b57634	Revert "nir/builder: Assert that intN_t immediates fit" This reverts commit `1f29f4db1e`. For this to work the compiler must ensure that it never puts the values that arrive to this helper into unsigned variables at any point in its processing, since that would not apply sign extension to the value and it would break the expectations here. Unfortunately, we use uint64_t extensively to pass and copy things around, so sometimes we get to this helper with values that are not properly sign extended to 64-bit. Here is an example for an 8-bit value that comes from a switch case: (gdb) p /x x $1 = 0xffffffd6 The value seems to have been sign extended to 32-bit at some point getting proper sign extension, but then copied into a uint64_t which won't apply sign extension, breaking the expectations of the assertion. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-21 08:12:50 +01:00
Iago Toral Quiroga	387888e3b7	nir/from_ssa: fix bit-size of temporary register Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-21 08:07:22 +01:00
Mathias Fröhlich	2d3c466add	mesa: Remove unneeded bitfield widths from the VAO. With the current VAO layout we do not need to make these fields a bitfield. We get a tight struct layout with this change for VAO attributes. v2: Change unsigned char -> GLubyte. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-21 06:27:19 +01:00
Mathias Fröhlich	0a7020b4e6	mesa: Factor out struct gl_vertex_format. Factor out struct gl_vertex_format from array attributes. The data type is supposed to describe the type of a vertex element. At this current stage the data type is only used with the VAO, but actually is useful in various other places. Due to the bitfields being used, special care needs to be taken for the glGet code paths. v2: Change unsigned char -> GLubyte. Use struct assignment for struct gl_vertex_format. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-21 06:27:19 +01:00
Mathias Fröhlich	2da7b0a2fb	tnl: Use gl_array_attribute::_ElementSize. Instead of open coding the size computation, use the already available gl_array_attribute::_ElementSize value. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-21 06:27:19 +01:00
Mathias Fröhlich	a4c01839c2	nouveau: Use gl_array_attribute::_ElementSize. Instead of open coding the size computation, use the already available gl_array_attribute::_ElementSize value. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-21 06:27:19 +01:00
Mathias Fröhlich	182ed6de8c	mesa: Unify glEdgeFlagPointer data type. Use GL_UNSIGNED_BYTE as initialization data type for the edge flag vertex attribute array. The same datatype is used in the glEdgeFlagPointer function when setting the array pointer. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-21 06:27:19 +01:00
Mathias Fröhlich	1b743e2966	mesa: Work with bitmasks when en/dis-abling VAO arrays. For enabling or disabling VAO arrays it is now possible to change a set of arrays with a single call without the need to iterate the attributes. Make use of this technique in the vao module. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-21 06:27:19 +01:00
Mathias Fröhlich	3c46fa5988	mesa: Remove gl_array_attributes::Enabled. Now that all users go via the VAO Enabled bitfield, get rid of the Enabled boolean. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-21 06:27:19 +01:00
Mathias Fröhlich	093aeb3565	mesa: Use gl_vertex_array_object::Enabled for glGet. Instead of using gl_array_attributes::Enabled use the much more compact representation stored in gl_vertex_array_object::Enabled using the corresponding bits. Keep the glGet changes in a separate patch at least for review. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-21 06:27:19 +01:00
Mathias Fröhlich	1217a8448c	mesa: Use the gl_vertex_array_object::Enabled bitfield. Instead of using gl_array_attributes::Enabled use the much more compact representation stored in gl_vertex_array_object::Enabled using the corresponding bits. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-21 06:27:19 +01:00
Mathias Fröhlich	73d2d313e9	mesa: Rename gl_vertex_array_object::_Enabled -> Enabled. Mark the up to now derived bitfield value now as primary value by removing the underscore. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-21 06:27:19 +01:00
Marek Olšák	ea9f95e2a6	radeonsi: go back to using bottom-of-pipe for beginning of TIME_ELAPSED Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102597 Cc: 18.3 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-20 21:18:48 -05:00
Marek Olšák	6c1a34d2e7	radeonsi: don't send data after write-confirm with BOTTOM_OF_PIPE_TS There are no writes. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-20 21:18:46 -05:00
Marek Olšák	bc5adc27b5	st/mesa: pin driver threads to a fixed CCX when glthread is enabled radeonsi has 3 driver threads (glthread, gallium, winsys), other drivers may have 2 (glthread, gallium), so it makes sense to pin them to a random CCX and keep that irrespective of the app thread. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-20 21:18:43 -05:00
Marek Olšák	48f2160936	st/mesa: regularly re-pin driver threads to the CCX where the app thread is This is used when glthread is disabled. Mesa pretty much chases the app thread on the CPU. The performance is the same as pinning the app thread. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-20 21:18:30 -05:00
Marek Olšák	ce7f84eb77	drirc: enable glthread for Talos Principle Ryzen 1700X, Vega 56, 1600x900, 4xAA: improvement +4.4% Immediate mode was needed. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-11-20 21:17:42 -05:00
Marek Olšák	7f1cac7ba6	mesa/glthread: enable immediate mode Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-11-20 21:17:41 -05:00
Marek Olšák	247d5a8e94	mesa/glthread: pass the function name to _mesa_glthread_restore_dispatch If you insert printf there, you'll know why glthread was disabled. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-11-20 21:17:38 -05:00
Marek Olšák	25d95ed535	gallium/u_tests: fix MSVC build by using old-style zero initializers	2018-11-20 19:06:40 -05:00
Kenneth Graunke	562448b75a	i965: Do NIR shader cloning in the caller. This moves nir_shader_clone() to the driver-specific compile function, rather than the shared src/intel/compiler code. This allows i965 to do key-specific passes before calling brw_compile_*. Vulkan should not need this cloning as it doesn't compile multiple variants. We do need to continue cloning in the compute shader code because we lower various things in NIR based on the SIMD width. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-11-20 15:53:46 -08:00
Kenneth Graunke	6a10dd08f4	i965: Use a 'nir' temporary rather than poking at brw_program It's shorter and will also be useful when I adjust cloning soon. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-11-20 15:53:46 -08:00
Marek Olšák	0d17b685b1	gallium/u_tests: add a compute shader test that clears an image	2018-11-20 18:50:48 -05:00
Dave Airlie	3486fe655a	ac: handle cast derefs Just give back the same value for now. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-21 08:54:46 +10:00
Dave Airlie	baa4bdd3a6	radv: handle loading from shared pointers We won't have a var to load from, so don't try to do the processing required if we don't need it. This avoids crashes in: dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.compute.workgroup_two_buffers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-21 08:54:42 +10:00
Dave Airlie	ec9fe8abc7	ac: avoid casting pointers on bcsel and stores For variable pointers we really don't want to cast the pointers to int without a good reason, just add a wrapper for bcsel loading and result storing. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-21 08:54:25 +10:00
Dylan Baker	a999798daa	meson: Add tests to suites Meson test has a concept of suites, which allow tests to be grouped together. This allows for only a subset of tests to be run (say only the tests for nir). A test can be added to more than one suite, but for the most part I've only added a test to a single suite, though I've added a compiler group that includes nir, glsl, and glcpp tests. To use this you'll need to invoke meson test directly, instead of ninja test (which always runs all targets). It can be invoked as: `meson test -C builddir --suite $suitename` (meson test has additional options that are pretty useful). Tested-By: Gert Wollny <gert.wollny@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-20 09:09:22 -08:00
Andrii Simiklit	b787dcf57b	i965/batch: avoid reverting batch buffer if saved state is an empty There's no point reverting to the last saved point if that save point is the empty batch, we will just repeat ourselves. v2: Merge with new commits, changes was minimized, added the 'fixes' tag v3: Added in to patch series v4: Fixed the regression which was introduced by this patch Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108630 Reported-by: Mark Janes <mark.a.janes@intel.com> The solution provided by: Jordan Justen <jordan.l.justen@intel.com> CC: Chris Wilson <chris@chris-wilson.co.uk> Fixes: `3faf56ffbd` "intel: Add an interface for saving/restoring the batchbuffer state." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107626 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108630 (fixed in v4) Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-11-20 06:33:43 -08:00
Emil Velikov	982e012b3a	travis: adding missing x11-xcb for meson+vulkan Required by the x11 WSI Fixes: `df82012b2c` ("travis: add meson build for vulkan drivers.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-11-20 11:16:46 +00:00
Emil Velikov	5bc509363b	glx: make xf86vidmode mandatory for direct rendering Currently we detect the module and if missing, the glXGetMsc* API is effectively a stub, always returning false. This is what effectively has been happening with our meson build :-( Thus users have no chance of using it - they cannot even distinguish if the failure is due to a misconfigured build. There's no reason for keeping xf86vidmode optional - it has been available in all distributions for years. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Fixes: `a47c525f32` "meson: build glx"	2018-11-20 11:13:20 +00:00
Emil Velikov	84445a86d1	travis: drop unneeded x11proto-xf86vidmode-dev The only place where the package is needed is for building the DRI based libGL library. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-20 11:13:20 +00:00
Samuel Pitoiset	f4563d8f5b	ac/nir: fix intrinsic name string size in visit_image_atomic() Fixes an assertion in SoTTR. Fixes: `dd0172e865` ("radv: Use structured intrinsics instead of indexing workaround for GFX9.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-20 10:23:45 +01:00
Bas Nieuwenhuizen	dd0172e865	radv: Use structured intrinsics instead of indexing workaround for GFX9. These force the index to be used in the instruction so we don't need the workaround. Totals: SGPRS: 1321642 -> 1321802 (0.01 %) VGPRS: 943664 -> 943788 (0.01 %) Spilled SGPRs: 28468 -> 28480 (0.04 %) Spilled VGPRs: 88 -> 89 (1.14 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 80 -> 80 (0.00 %) dwords per thread Code Size: 52415292 -> 52338932 (-0.15 %) bytes LDS: 400 -> 400 (0.00 %) blocks Max Waves: 233903 -> 233803 (-0.04 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 238344 -> 238504 (0.07 %) VGPRS: 232732 -> 232856 (0.05 %) Spilled SGPRs: 13125 -> 13137 (0.09 %) Spilled VGPRs: 88 -> 89 (1.14 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 80 -> 80 (0.00 %) dwords per thread Code Size: 15752712 -> 15676352 (-0.48 %) bytes LDS: 139 -> 139 (0.00 %) blocks Max Waves: 31680 -> 31580 (-0.32 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-11-19 23:36:00 +01:00
Kenneth Graunke	0990168642	i965: Allow only one slot of clip distances to be set on Gen4-5. The existing backend code assumed that if VARYING_SLOT_CLIP_DIST0 was written, then VARYING_SLOT_CLIP_DIST1 would be as well. That's true with the current lowering, but not necessary if there are 4 or fewer clip distances. Separate out the checks to allow this. The new NIR-based lowering will trigger this case, which would have caused backend validation errors (src is null) without this patch. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-19 14:33:16 -08:00
Kenneth Graunke	5b682143da	nir: Make nir_lower_clip_vs optionally work with variables. The way nir_lower_clip_vs() works with store_output intrinsics makes a ton of assumptions about the driver_location field. In i965 and iris, I'd rather do this lowering early and work with variables. v3d may want to switch to that as well, and ir3 could too, but I'm not sure exactly what would need updating. For now, handle both methods. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-19 14:33:16 -08:00
Kenneth Graunke	d0f746b645	nir: Save nir_variable pointers in nir_lower_clip_vs rather than locs. I'll want the variables in the next patch. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-19 14:33:16 -08:00
Kenneth Graunke	63c8696874	nir: Inline lower_clip_vs() into nir_lower_clip_vs(). It's now called exactly once, and there's not really any distinction. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-19 14:33:14 -08:00
Kenneth Graunke	bfa789aceb	nir: Use nir_shader_get_entrypoint in nir_lower_clip_vs(). Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-19 14:31:20 -08:00
Dave Airlie	c8a35285f0	nir: handle shared pointers in lowering indirect derefs. Check if the base ends up with no variable, and continue if we see that case outside the loop. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-20 05:36:52 +10:00
Dave Airlie	760859cac2	nir: move getting deref from var after we check deref type. I posted a load of hacks before to do this, Jason suggested this, just check the deref mode, not the variable mode and delay getting the variable until we know the type. avoids crashes when derefing shared memory pointers. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-20 05:36:38 +10:00
Dave Airlie	2f4f5a5055	spirv/vtn: handle variable pointers without offset lowering Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-20 05:36:16 +10:00
Jason Ekstrand	dca35c598d	intel/fs,vec4: Fix a compiler warning ../src/intel/compiler/brw_fs_nir.cpp:3534:46: warning: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Wsign-compare] assert(nir_intrinsic_write_mask(instr) == ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~ (1 << instr->num_components) - 1); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This was caused by `6339aba775` which added these completely valid checks. However clang likes to complain about signedness mismatches. Fixes: `6339aba775` "intel/compiler: Lower SSBO and shared..." Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-11-19 09:57:41 -06:00
Jason Ekstrand	060817b2fa	intel,nir: Move gl_LocalInvocationID lowering to nir_lower_system_values It's not at all intel-specific; the formula is dictated by OpenGL and Vulkan. The only intel-specific thing is that we need the lowering. As a nice side-effect, the new version is variable-group-size ready. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-11-19 09:57:41 -06:00
Eric Engestrom	486091bc00	gbm: add missing comma between strings Fixes: `d971a4230d` "loader: Factor out the common driver opening logic from each loader." Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-19 15:50:56 +00:00
Samuel Pitoiset	724107553c	radv: implement fast HTILE clears for depth or stencil only on GFX9 This allows to fast clear the depth part (or the stencil part) of a depth+stencil surface when HTILE is enabled. I didn't test on GFX8, so it's disabled currently. This gives a very nice boost, for example when clearing the depth aspect of a 4096x4096 D32_SFLOAT_S8_UINT image (18x faster). BEFORE: 235 us AFTER: 13 us Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-19 16:32:18 +01:00
Samuel Pitoiset	7dcddbe54d	radv: rewrite the condition that checks allowed depth/stencil values Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-19 16:32:16 +01:00
Samuel Pitoiset	9133bbf186	radv: check allowed fast HTILE clears a bit earlier Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-19 16:32:14 +01:00
Samuel Pitoiset	193ad4748b	radv: add radv_is_fast_clear_{depth,stencil}_allowed() helpers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-19 16:32:12 +01:00
Samuel Pitoiset	c7e142ed78	radv: add radv_get_htile_fast_clear_value() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-19 16:32:10 +01:00
Samuel Pitoiset	6f3fbcc041	radv: remove unnecessary goto in the fast clear paths Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-19 16:32:08 +01:00
Samuel Pitoiset	36006e3cec	radv/winsys: remove the max IBs per submit limit for the sysmem path This path will be eventually improved later but as it's only used on SI (or with RADV_DEBUG=noibs), I'm not sure if that matters much. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-19 16:32:06 +01:00
Samuel Pitoiset	4d30f2c6f4	radv/winsys: remove the max IBs per submit limit for the fallback path The chained submission is the fastest path and it should now be used more often than before. This removes some EOP events. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-19 16:32:04 +01:00
Lucas Stach	8ca8a6a7b1	etnaviv: use dummy RT buffer when rendering without color buffer At least GC2000 seems to push some dirt from the PE color cache into the last bound render target when drawing depth only. Newer cores seem to behave properly and don't do this, but I have found no way to fix it on GC2000. Flushes and stalls don't seem to make any difference. In order to stop the core from pushing the dirt into a precious real render target, plug in dummy buffer when rendering without a color buffer. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>	2018-11-19 15:48:10 +01:00
Dave Airlie	8706204074	virgl: fix vtest regression since fencing changes. The in_fence_fd needs to be initialised to -1. Fixes: `d1a1c21e7` (virgl: native fence fd support) Reviewed-by: Robert Foss <robert.foss@collabora.com>	2018-11-19 15:33:19 +01:00
Samuel Pitoiset	55c75d2b49	radv: always clear the FCE predicate after DCC/FMASK/CMASK decompressions DCC and FMASK also imply a fast-clear eliminate, so it should be safe to reset the predicate unconditionally. We still only skip FMASK or CMASK decompressions for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-19 14:05:35 +01:00
Samuel Pitoiset	483a28bfd4	radv: tidy up radv_set_dcc_need_cmask_elim_pred() This is just a small cleanup. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-19 14:05:33 +01:00
Nicolai Hähnle	46a59ce026	radeonsi: fix an out-of-bounds read reported by ASAN We read 4 values out of sample_locs_8x, so make sure the array is big enough. Fixes: `ac76aeef20` ("radeonsi: switch back to standard DX sample positions") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-11-19 11:16:35 +01:00
Gert Wollny	d174cbccfa	r600: Only set context streamout strides info from the shader that has outputs With 5d517a streamout info is only attached to the shader for which the transform feedback is actually recorded, but the driver set the context info with each state submitted, thereby always using the info data that was attached to the vertex shader. Pass the streamout stride info to the context only from the shader that actually has outputs. (Thanks to Marek Olšák for pointing me in the right direction) Fixes regression with: dEQP-GLES31.functional.tessellation.invariance.* Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108734 Fixes: `5d517a599b` st/mesa: Don't record garbage streamout information in the non-SSO case. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-19 11:06:56 +01:00
Gert Wollny	18a8e11aea	i965:use FRAMEBUFFER_UNSUPPORTED instead of FRAMEBUFFER_INCOMPLETE_DIMENSIONS FRAMEBUFFER_INCOMPLETE_DIMENSIONS is not supported for GLES 3.0 and later and not defined for Desktop OpenGL. Instead use FRAMEBUFFER_UNSUPPORTED like it was done before. Thanks to Iago Toral and Andrey Simiklit for pointing out the problem and the details. Fixes: `ebcde34545` i965: be more specific about FBO completeness errors Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-11-19 11:06:52 +01:00
Gert Wollny	40eca7d3e1	virgl: Use file descriptor instead of un-allocated object The structure qdws is not allocated at this point, nor is the file descriptor set to its member. Use the fd directly instead. Fixes: `d1a1c21e76` virgl: native fence fd support Signed-off-by: Gert Wollny <gert.wollny@collabora.com>	2018-11-19 11:03:56 +01:00
Gert Wollny	78fdc507a3	i965: Add support for and expose EXT_texture_sRGB_R8 Emulate MESA_FORMAT_R_SRGB8 by using L8_UNORM_SRGB. This is possible because component swizzling is handled based on the mesa format and, hence, the r001 swizzling can be used to correct the components. Enables and makes pass (tested on Kabylake) dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.* dEQP-GLES31.functional.texture.filtering.cube_array.formats.sr8* Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-19 08:05:44 +01:00
Gert Wollny	c5363869d4	i965: Force zero swizzles for unused components in GL_RED and GL_RG This makes it possible to use a hardware luminance format as RED format. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-19 08:05:44 +01:00
Gert Wollny	ebcde34545	i965: be more specific about FBO completeness errors The driver was returning GL_FRAMEBUFFER_UNSUPPORTED for all cases of an incomplete fbo, be a bit more specific about this following the description of glCheckFramebufferStatus. This helps to keep dEQP happy when adding EXT_texture_sRGB_R8 support. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-19 08:05:44 +01:00
Gert Wollny	24a02157dd	i965: Correct L8_UNORM_SRGB table entry As the name says, the format is an sRGB format. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-19 08:05:44 +01:00
Robert Foss	70692adf48	virgl: Clean up fences commit Remove a dead variable, a int->bool conversion and some whitespace changes. Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-18 12:14:55 +01:00
Kenneth Graunke	c2e3d0f163	i915: Delete swizzling detection logic. This is all leftover from the i965 split. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-17 10:26:31 -08:00
Ilia Mirkin	beb66d3747	nv50/ir/ra: enforce max register requirement, and change spill order On nv50, certain operations must happen on regs below 64, due to encoding requirements. First of all, we add infrastructure to enforce this. Secondly we change the spill order to first spill RIG nodes that are unconstrained, followed by ones that are. This makes the gamecube logo shadertoy compile properly. Curiously, if we adjust the spill order so that we first spill the constrained RIG nodes instead, the RA also succeeds. However it seems more logical to first spill the unconstrained ones. While we're at it, drop the nv50 max register to reserve r127 as the zero register of last resort (r63 is preferred). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Karol Herbst <kherbst@redhat.com>	2018-11-16 22:43:52 -05:00
Ilia Mirkin	799e021894	nv50/ir/ra: improve condition for short regs, unify with cond for 16-bit Instead of the size restriction existing in two places, and potentially being applied twice, we move this together. Ops with 16-bit register addresses can only take a short reg, and ops with immediates can only take a short reg. Of course we leave the immediate 0 in place since we know that it will be replaced by r63/r127 down the line, so don't treat zeroes as an immediate. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-11-16 20:53:33 -05:00
Ilia Mirkin	955d943c33	nv50/ir: delete MINMAX instruction that is no longer in the BB We removed the op from the BB, but it was still listed in its sources' uses. This could trip up some logic down the line which analyzes all the uses of an l-value, e.g. spilling. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-11-16 20:53:09 -05:00
Eric Anholt	7e9fc11ff8	egl: Print the actual message to the console from _eglError(). Previously we would print errors on the console like: libEGL debug: EGL user error 0x3001 (EGL_NOT_INITIALIZED) in eglInitialize When we had everything we needed for: libEGL debug: EGL user error 0x3001 (EGL_NOT_INITIALIZED) in eglInitialize: DRI2: failed to find EGLDevice (for a gbm error in my case) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-16 17:49:31 -08:00
Eric Anholt	d971a4230d	loader: Factor out the common driver opening logic from each loader. I copied the code from egl_dri2.c, but the functionality was equivalent between all the loaders other than their particular environment variables. v2: Drop the logging function equivalent to loader_default_logger() (requested by Eric, Emil). Move the SCons workaround across. Drop the now-unused driGetDriverExtensions() declaration that was lost in a rebase. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)	2018-11-16 17:49:17 -08:00
Eric Anholt	cc19815738	loader: Stop using a local definition for an in-tree header I need other types from the header now, and "gl.h is big" is not a good reason to duplicate definitions. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-16 15:38:18 -08:00
Eric Anholt	2bc1f5c2e7	egl: Move loader_set_logger() up to egl_dri2.c. Everyone needs to call it, and platform_x11 forgot to. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-16 15:38:18 -08:00
Eric Anholt	c2b515379b	glx: Move DRI extensions pointer loading to driOpenDriver(). The only thing you do with a dri driver handle is get the extensions pointer, so just fold it in to simplify the callers. v2: Add the declaration of driGetDriverExtensions() that got lost in a rebase. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)	2018-11-16 15:38:18 -08:00
Eric Anholt	7076e9f116	glx: Remove an old DEFAULT_DRIVER_DIR default. You can tell by "Mesa/configs/default" how old this is. Your build system really has to provide the DEFAULT_DRIVER_DIR, or other loaders will break. v2: Move the bad (non-prefix-dependent) define to the SConscript to avoid breaking it. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)	2018-11-16 15:37:47 -08:00
Samuel Pitoiset	d031d5c999	radv: enable primitive binning by default After doing a bunch of benchmarks, primitive binning helps some games like The Talos Principle (+5%) or Serious Sam 2017 (+3%). For other titles, either it doesn't change anything or it hurts very few (less than 1%). This only affects GFX9. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-16 17:51:15 +01:00
Samuel Pitoiset	afd834b62e	radv: add a debug option for disabling primitive binning Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-16 17:51:12 +01:00
Robert Foss	d1a1c21e76	virgl: native fence fd support Following the support for fences on the virtio driver add support for native fence on virgl. This was somewhat based on the freedreno one. Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com> Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-16 14:41:57 +01:00
Lionel Landwerlin	0db898cef2	intel/aub_viewer: Print blend states properly Identical fix to : commit `70de31d0c1` Author: Jason Ekstrand <jason.ekstrand@intel.com> Date: Fri Aug 24 16:05:08 2018 -0500 intel/batch_decoder: Print blend states properly Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>	2018-11-16 11:40:38 +00:00
Lionel Landwerlin	ac324a6809	intel/aub_viewer: fix dynamic state printing Identical fix to : commit `cbd4bc1346` Author: Jason Ekstrand <jason.ekstrand@intel.com> Date: Fri Aug 24 16:04:03 2018 -0500 intel/batch_decoder: Fix dynamic state printing Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>	2018-11-16 11:40:14 +00:00
Lionel Landwerlin	59c1059528	intel/aubinator: fix ring buffer pointer We can only start parsing commands from the head pointer. This was working fine up to now because we only dealt with a "made up" ring buffer (generated by aub_write) which always had its head at 0. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>	2018-11-16 11:39:54 +00:00
Lionel Landwerlin	25443cbb72	intel/decoders: read ring buffer length Use this value to limit reading the ring buffer. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Toni Lönnberg <toni.lonnberg@intel.com>	2018-11-16 11:37:08 +00:00
Lionel Landwerlin	1c56d21156	egl/dri: fix error value with unknown drm format According to the EGL_EXT_image_dma_buf_import spec, creating an EGL image with a DRM format not supported should yield the BAD_MATCH error : " * If <target> is EGL_LINUX_DMA_BUF_EXT, and the EGL_LINUX_DRM_FOURCC_EXT attribute is set to a format not supported by the EGL, EGL_BAD_MATCH is generated. " Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `20de7f9f22` ("egl/dri2: support for creating images out of dma buffers") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-11-16 10:28:06 +00:00
Daniel Stone	5e1fe240c4	gbm: Clarify acceptable formats for gbm_bo gbm_bo_create() was presumably meant to originally accept gbm_bo_format enums, but it's accepted GBM_FORMAT_* tokens since the dawn of time. This is good, since gbm_bo_format is rarely used and covers a lot less ground than GBM_FORMAT_*. Change the documentation to refer to both; this involves removing a 'see also' for gbm_bo_format, since we can't also use \sa to refer to a family of anonymous #defines. Signed-off-by: Daniel Stone <daniels@collabora.com> Reported-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-16 09:40:46 +00:00
Connor Abbott	ba94a00c7c	Revert "radv: disable VK_SUBGROUP_FEATURE_VOTE_BIT" This reverts commit `647c2b90e9`. There was one recently-introduced bug in ac for dvec3 loads, but the other test failures were actually bugs in the tests. See `9429e621c4` Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-16 10:32:03 +01:00
Eric Anholt	cc71bf529c	vc4: Don't return a vc4 BO handle on a renderonly screen. The handles exported need to be on the KMS device's fd, anything else is failure. Also, this code is assuming that the scanout resource has been created already, so assert it.	2018-11-15 21:11:44 -08:00
Eric Anholt	cc0bc76a38	vc4: Make sure we make ro scanout resources for create_with_modifiers. The DRI3 create_with_modifiers paths don't set tmpl.bind to SCANOUT or SHARED, with the theory that given that you've got modifiers, that's all you need. However, we were looking at the tmpl.bind for setting up the KMS handle in the renderonly case, so we'd end up trying to use vc4's handle on the hx8357d fd. Fixes: `84ed8b67c5` ("vc4: Set shareable BOs as T tiled if possible")	2018-11-15 21:11:44 -08:00
Danylo Piliaiev	f9fd0cf479	i965: Fix calculation of layers array length for isl_view Handle all cases in calculation of layers count for isl_view taking into account texture view and image unit. st_convert_image was taken as a reference. When u->Layered is true the whole level is taken with respect to image view. In other case only one layer is taken. v3: (Józef Kucia and Ilia Mirkin) - Rewrote patch by taking st_convert_image as a reference - Removed now unused get_image_num_layers function - Changed commit message v4: (Jason Ekstrand) - Added assert Fixes: `5a8c8903` Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107856 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-15 19:59:54 -06:00
Jason Ekstrand	6339aba775	intel/compiler: Lower SSBO and shared loads/stores in NIR We have a bunch of code to do this in the back-end compiler but it's fairly specific to typed surface messages and the way we emit them. This breaks it out into NIR where it's easier to do things a bit more generally. It also means we can easily share the code between the vec4 and FS back-ends if we wish. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-11-15 19:59:49 -06:00
Jason Ekstrand	d34fd81e76	nir: Add alignment parameters to SSBO, UBO, and shared access This also changes spirv_to_nir and glsl_to_nir to set them. The one place that doesn't set them is shared memory access lowering in nir_lower_io. That will have to be updated before any consumers of it can effectively use these new alignments. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Karol Herbst <kherbst@redhat.com>	2018-11-15 19:59:42 -06:00
Jason Ekstrand	fb127f7729	nir/lower_io: Add shared to get_io_offset_src Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-11-15 19:59:31 -06:00
Jason Ekstrand	b5c48271d4	nir/glsl: Force 32-bit for UBO and SSBO Booleans Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-11-15 19:59:30 -06:00
Jason Ekstrand	44b7005581	nir/spirv: Force 32-bit for UBO and SSBO Booleans Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-11-15 19:59:29 -06:00
Jason Ekstrand	f16bd8a9fe	nir/builder: Add a nir_pack/unpack/bitcast helpers The new helpers can generate any pack/unpack operation including those for which we do not have specific opcodes and they express a bitcast in terms of these pack/unpack operations. In particular, the new helpers properly handle 8-bit types. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-11-15 19:59:28 -06:00
Jason Ekstrand	b77d68b78e	nir/builder: Add iadd_imm and imul_imm helpers The pattern of adding or multiplying an integer by an immediate is fairly common especially in deref chain handling. This adds a helper for it and uses it a few places. The advantage to the helper is that it automatically handles bit sizes for you. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-11-15 19:59:27 -06:00
Jason Ekstrand	1f29f4db1e	nir/builder: Assert that intN_t immediates fit This assert won't catch all mistakes with this helper but it will at least ensure that the top bits are all zero or all one which should help catch bugs. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-11-15 19:59:26 -06:00
Jason Ekstrand	4266932c0b	nir/lower_alu_to_scalar: Don't try to lower unpack_32_2x16 It messes up when trying to lower. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-11-15 19:59:09 -06:00
Ian Romanick	425c133ab9	glsl: Refactor type checking for redeclarations Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-11-15 14:27:32 -08:00
Ian Romanick	61e003ce7e	glsl: Omit redundant qualifier checks on redeclarations Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-11-15 14:27:29 -08:00
Ian Romanick	9b9f3218db	glsl: prevent qualifiers modification of predeclared variables Section 3.7 (Identifiers) of the GLSL spec says: However, as noted in the specification, there are some cases where previously declared variables can be redeclared to change or add some property, and predeclared "gl_" names are allowed to be redeclared in a shader only for these specific purposes. More generally, it is an error to redeclare a variable, including those starting "gl_". This patch should fix piglit tests: clip-distance-redeclare-without-inout.frag clip-distance-redeclare-without-inout.vert However, this causes a regression in clip-distance-out-values.shader_test. A fix for that test has been sent to the piglit list for review: https://patchwork.freedesktop.org/patch/255201/ As far as I understood following mailing thread: https://lists.freedesktop.org/archives/piglit/2013-October/007935.html looks like we have accepted to remove an ability to change qualifiers but have not done it yet. Unless I missed something) v2 (idr): Move 'earlier->data.mode != var->data.mode' test much earlier in the function. Add special handling for gl_LastFragData. Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-11-15 14:27:26 -08:00
Eric Anholt	538bca78e2	v3d: Don't try to set PF flags on a LDTMU operation We need an ALU op in order to set PF. Fixes a recent assertion failure in dEQP-GLES3.functional.ubo.single_basic_type.shared.bool_vertex	2018-11-15 11:12:54 -08:00
Eric Anholt	03928dd682	v3d: Fix double-swapping of R/B on V3D 4.1 Fixes: `4018eb04e8` ("v3d: Use the TLB R/B swapping instead of recompiles when available.")	2018-11-15 11:12:54 -08:00
Eric Engestrom	2b2f790e59	egl: fix bad rebase I screwed up a rebase over a refactor and didn't notice locally because the uncommitted refactor hid the issue. Fixes: `c973364967` "egl: add missing glvnd entrypoint for EGL_ANDROID_blob_cache" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-15 17:51:40 +00:00
Sagar Ghuge	6e60ff1ea9	intel/compiler: Disassemble GEN6_SFID_DATAPORT_SAMPLER_CACHE as dp_sampler Both BRW_SFID_SAMPLER and GEN6_SFID_DATAPORT_SAMPLER_CACHE are getting disassembled as "sampler", which is misleading for assembler tool. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>	2018-11-15 09:36:55 -08:00
Eric Engestrom	c973364967	egl: add missing glvnd entrypoint for EGL_ANDROID_blob_cache Fixes dEQP-EGL.functional.get_proc_address.extension.egl_android_blob_cache on builds with glvnd enabled. Fixes: `6f5b57093b` "egl: add support for EGL_ANDROID_blob_cache" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-15 16:27:27 +00:00
Eric Engestrom	2640854399	gbm: add new entrypoint to symbols check Fixes: `6328536ff2` "gbm: Introduce a helper function for printing GBM format names." Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-15 16:25:42 +00:00
Emil Velikov	adbdfc6666	bin/get-pick-list.sh: handle reverts prior to the branchpoint Currently we detect when a breaking commit: - has landed in stable, and - is referenced by a untagged fix in master Yet we did not consider the case of breaking commit: - prior to the branchpoint, and - is referenced by a untagged fix in master Addressing the latter is extremely slow, due to the size of the lookup. That said, we can trivially use the existing is_sha_nomination() helper to catch reverts. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-15 16:15:15 +00:00
Emil Velikov	c0012a0708	bin/get-pick-list.sh: use test instead of [ ] Latter is rather picky wrt surrounding white space. The explicit `test` doesn't have that problem, plus the statements read a bit easier. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-15 15:55:51 +00:00
Emil Velikov	77ff0bfb5f	bin/get-pick-list.sh: handle unofficial "broken by" tag We have a number of cases where devs will use a tag "broken by". While it's not something officially documented or recommended, checking for it is trivial enough. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-15 15:55:47 +00:00
Emil Velikov	209525aafb	bin/get-pick-list.sh: handle fixes tag with missing colon Every so often, we forget to add the colon after "fixes". Trivially tweak the script to catch it. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-15 15:55:44 +00:00
Emil Velikov	b7418d1f3f	bin/get-pick-list.sh: flesh out is_sha_nomination Refactor is_fixes_nomination into a is_sha_nomination helper. This way we can reuse it for more than the usual "Fixes:" tag. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-15 15:55:40 +00:00
Emil Velikov	533fead423	bin/get-pick-list.sh: tweak the commit sha matching pattern Currently we match on: - any arbitrary length of, - any a-z A-Z and 0-9 characters At the same time, a commit sha consists of lowercase hexadecimal numbers. Any sha shorter than 8 characters is ambiguous - in some cases even 11+ are required. So change the pattern to a-f0-9 and adjust the length to 8-40. As we're here we could use a single grep, instead of the grep/sed combo. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-15 15:55:36 +00:00
Emil Velikov	181203f3c5	bin/get-pick-list.sh: handle the fixes tag Having a separate script to handle the fixes tag, brings a number of issues, so let's fold it in get-pick-list.sh. v2: - pass the sha as argument to the function - Keep original sed pattern Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-15 15:55:31 +00:00
Emil Velikov	e6b3a3b201	bin/get-pick-list.sh: handle "typod" usecase. As the comment in get-typod-pick-list.sh says, there's little point in having a duplicate file. Add the new pattern + tag to get-pick-list.sh and nuke this file. v2: - pass the sha as argument to the function - grep -q instead of using a variable (Eric) Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-15 15:55:24 +00:00
Emil Velikov	fac10169bb	bin/get-pick-list.sh: prefix output with "[stable] " With later commits we'll fold all the different scripts into one. Add the explicit prefix, so that we know the origin of the nomination v2: - pass the sha as argument to the function - swap $tag = none for an else statement (Juan) - grep -q instead of using a variable (Eric) - print the tag and commit oneline separately (Eric) v3: - drop unused "tag=none" assignment (Juan) - typo nomination Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-15 15:54:48 +00:00
Emil Velikov	559c32d241	bin/get-pick-list.sh: simplify git oneline printing Currently we force disable the pager via "|cat" where --no-pager exists. Additionally we could use git show instead of git log -n1. Use those for slightly more understandable code. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-15 15:51:24 +00:00
Emil Velikov	7d9556681d	docs: document the staging branch and add reference to it A while back we agreed that having a live/staging branch is beneficial. Sadly we forgot to document that, so here is my first attempt. Document the caveat that the branch history is not stable. CC: Andres Gomez <agomez@igalia.com> CC: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-11-15 15:48:15 +00:00
Emil Velikov	4ae749acf1	docs/submittingpatches.html: correctly handle the <p> tag As pointed out by the w3c validator. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-11-15 15:48:13 +00:00
Emil Velikov	19a081473f	docs/releasing.html: polish cherry-picking/testing text Reword slightly and highlight the important parts of the text. CC: Andres Gomez <agomez@igalia.com> CC: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-11-15 15:48:08 +00:00
Guido Günther	ab5653680e	etnaviv: Make sure rs alignment checks match etna_resource_alloc and etna_resource_from_handle currently use different checks. This leads to etna_resource_from_handle:492: target=2, format=PIPE_FORMAT_B8G8R8X8_UNORM, 1080x1920x1, array_size=1, last_level=0, nr_samples=0, usage=0, bind=8000a, flags=0 etna_resource_from_handle:541: BO stride 4320 is too small for RS engine width padding (4352, format PIPE_FORMAT_B8G8R8X8_UNORM) since etna_resource_from_handle wants to be aligned to a 16 byte boundary while the etna_resource_alloc does not. Adjust the two checks by using a common function. Broken by `baff59ebf0` Signed-off-by: Guido Günther <guido.gunther@puri.sm> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2018-11-15 16:38:35 +01:00
Juan A. Suarez Romero	52368ef83a	docs: update calendar, add news item and link release notes for 18.2.5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-11-15 13:08:58 +00:00
Juan A. Suarez Romero	aa7a419b8b	docs: add sha256 checksums for 18.2.5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit 79be754f9a74a43b5748dc0934241e7701cb9581)	2018-11-15 13:06:12 +00:00
Juan A. Suarez Romero	e53ec08931	docs: add release notes for 18.2.5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `f34bddc325`)	2018-11-15 13:06:10 +00:00
Marek Olšák	9367514524	radeonsi: fix video APIs on Raven2 This was missed when I added the new enum. Cc: 18.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-11-14 17:08:34 -05:00
Andrii Simiklit	e13dd70581	i965: avoid 'unused variable' warnings 1. brw_pipe_control.c:311:34: warning: unused variable ‘devinfo’ 2. brw_program_binary.c:209:19: warning: unused variable ‘gen_size’ 3. brw_program_binary.c:216:19: warning: unused variable ‘nir_size’ v2: Changes for unreproducible issues were removed Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-14 14:41:58 +00:00
Andrii Simiklit	7aca650122	compiler: avoid 'unused variable' warnings 1. nir/nir_lower_vars_to_ssa.c:691:21: warning: unused variable ‘var’ nir_variable *var = path->path[0]->var; v2: Changes for some part of 'may be used uninitialized' warnings were removed, seems like it is a compiler issue. ( Eric Engestrom <eric.engestrom@intel.com> ) Possible like this one: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46684 This issue is flagged as duplicate but an original one is not closed yet. Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-14 13:35:38 +00:00
Andrii Simiklit	69ee49ac46	intel/tools: avoid 'unused variable' warnings 1. tools/aub_read.c:271:31: warning: unused variable ‘end’ const uint32_t *p = data, *end = data + data_len, *next; 2. tools/aub_mem.c:292:13: warning: unused variable ‘res’ void *res = mmap((uint8_t *)bo.map + map_offset, 4096, PROT_READ, tools/aub_mem.c:357:13: warning: unused variable ‘res’ void *res = mmap((uint8_t *)bo.map + (page - bo.addr), 4096, PROT_READ, v2: The i965_disasm.c changes was moved into a separate patch The 'end' variable declared separately with MAYBE_UNUSED to avoid effect of it to other variables. ( Eric Engestrom <eric.engestrom@intel.com> ) Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-14 13:35:28 +00:00
Thomas Hellstrom	25b48e3df9	st/xa: Bump minor Bump minor to signal support for new formats and higher precision solid pictures. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-11-14 13:12:09 +01:00
Thomas Hellstrom	c9085f6d3b	st/xa: Support Component Alpha with trivial blending Support Component Alpha for those composite operations that do not require per-channel alpha blending. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2018-11-14 13:12:09 +01:00
Thomas Hellstrom	0477d17f51	st/xa: Minor renderer cleanups constify function arguments to clean up the code a bit. Reported-by: Brian Paul <brianp@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2018-11-14 13:12:09 +01:00
Thomas Hellstrom	56aa23b146	st/xa: Fix transformations when we have both source and mask samplers In the case when we had both source and mask samplers, transformations were typically not applied correctly. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2018-11-14 13:12:09 +01:00
Thomas Hellstrom	e1298def9f	st/xa: Support a couple of new formats Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-11-14 13:12:09 +01:00
Thomas Hellstrom	258d20152a	st/xa: Support higher color precision for solid pictures The only solid fill picture type we supported only had 8 bit color channels. Add a new solid picture type that supports float channels. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-11-14 13:11:51 +01:00
Thomas Hellstrom	d86ad38205	st/xa: Render update. Better support for solid pictures Remove unused and obsolete code for gradients and component-alpha Support solid source- and mask pictures using a variable number of samplers in the composite pipeline rather than the fixed number we used before. Tested using rendercheck for XA. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-11-14 13:07:00 +01:00
Gert Wollny	4bba280937	nir: Allow to skip integer ops in nir_lower_to_source_mods Some hardware supports source mods only for float operations. Make it possible to skip lowering to source mods in these cases. v2: use option flags instead of a boolean (Jason Ekstrand) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-14 08:59:26 +01:00
Karol Herbst	b4380cb070	nir/spirv: cast shift operand to u32 v2: fix for specialization constants as well Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-11-14 02:09:11 +01:00
Karol Herbst	099728b115	nir: replace nir_load_system_value calls with appropriate builder functions this helps reduce the overall code changes when a bit_size parameter is added to nir_load_system_value Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-11-14 02:09:11 +01:00
Karol Herbst	80db331c2d	nir: add const_index parameters to system value builder function this allows to replace some nir_load_system_value calls with the specific system value constructor Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-11-14 02:09:11 +01:00
Timothy Arceri	95b513c937	radv: make use of nir_move_out_const_to_consumer() vkpipeline-db results: Totals from affected shaders: SGPRS: 28400 -> 28576 (0.62 %) VGPRS: 27916 -> 27692 (-0.80 %) Spilled SGPRs: 140 -> 138 (-1.43 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 1534456 -> 1520560 (-0.91 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 3541 -> 3582 (1.16 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-11-14 09:41:50 +11:00
Lionel Landwerlin	ea53f76d7b	anv: move helper function internally It's only used in anv_image.c Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-13 18:56:31 +00:00
Lionel Landwerlin	8b00d3d6eb	anv: use image aspects rather than computed ones This shouldn't make any difference but I feel uneasy to use the expanded aspects that do not represent the image in its entirety. If we ever change the implementation of the anv_image_aspect_to_plane() helper, this is safer. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-13 18:56:27 +00:00
Lionel Landwerlin	465de47bad	anv: associate vulkan formats with aspects This will make it easier to associate an aspect with a plane number. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-13 18:56:24 +00:00
Lionel Landwerlin	fe3b7fe982	anv/lower_ycbcr: make sure to set 0s on all components To play around with debugging, we might want to disable one or the other component. Having 0s as default values makes this work. Otherwise we might have NULL components, leading to crashes. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-13 18:56:21 +00:00
Lionel Landwerlin	ee8d65c25a	anv/image: remove unused parameter Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-13 18:56:13 +00:00
Lionel Landwerlin	352e297091	anv: simplify internal address offset Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-13 18:56:10 +00:00
Eric Engestrom	4fa2fb3524	meson: fix wayland-less builds Those empty variables in the !wayland case are useless and running that meson.build with them breaks the build: [287/850] Generating wayland-drm-client-protocol.h with a custom command. FAILED: src/egl/wayland/wayland-drm/wayland-drm-client-protocol.h client-header ../src/egl/wayland/wayland-drm/wayland-drm.xml src/egl/wayland/wayland-drm/wayland-drm-client-protocol.h /bin/sh: client-header: command not found ninja: build stopped: subcommand failed. Fixes: `d1992255bb` "meson: Add build Intel "anv" vulkan driver" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-11-13 17:25:02 +00:00
Eric Engestrom	7df80de6e6	gbm: remove unnecessary meson include `inc_wayland_drm` is only used if wayland is built, and it's already added in that case a few lines below. Fixes: `a29869e872` "gbm: Don't traverse backwards for includes" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-11-13 17:25:02 +00:00
Eric Engestrom	3832db275e	meson: only run vulkan's meson.build when building vulkan Fixes: `d1992255bb` "meson: Add build Intel "anv" vulkan driver" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-11-13 17:25:02 +00:00
Eric Engestrom	4f1ae271e1	xmlpool: update translation po files These files are close to 4 years out of date; a lot's changed since. Let's just check in a recently-regenerated version. Changes generated by running `ninja xmlpool-{pot,update-po,gmo}`. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-13 17:25:02 +00:00
Eric Engestrom	1e918e5bef	REVIEWERS: add Vulkan reviewer group Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2018-11-13 17:25:02 +00:00
Eric Engestrom	59b3335496	REVIEWERS: add Emil as EGL reviewer Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2018-11-13 17:25:02 +00:00
Eric Engestrom	923aca84b2	REVIEWERS: add include path for EGL Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2018-11-13 17:25:02 +00:00
Toni Lönnberg	2af4e3345f	intel/genxml: Add engine definition to render engine instructions (gen11) Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. v4: Added missing engine definition to MI_TOPOLOGY_FILTER. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	1921982d3e	intel/genxml: Add engine definition to render engine instructions (gen10) Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. v4: Added missing engine definition to MI_TOPOLOGY_FILTER. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	030fe0f981	intel/genxml: Add engine definition to render engine instructions (gen9) Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. v4: Added more missing engine definitions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	12e34fc7ba	intel/genxml: Add engine definition to render engine instructions (gen8) Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. v4: Added missing engine tag for MI_TOPOLOGY_FILTER and MI_LOAD_URB_MEM. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	a883fd2277	intel/genxml: Add engine definition to render engine instructions (gen75) Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	27cf6252d3	intel/genxml: Add engine definition to render engine instructions (gen7) Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	ecf62a967e	intel/genxml: Add engine definition to render engine instructions (gen6) Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions v4: Added missing engine to MEDIA_GATEWAY_STATE Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	571d6447d8	intel/genxml: Add engine definition to render engine instructions (gen5) Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	6463ceca69	intel/genxml: Add engine definition to render engine instructions (gen45) Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	a4ca710c96	intel/genxml: Add engine definition to render engine instructions (gen4) Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	102dadec81	intel/decoder: tools: Use engine for decoding batch instructions The engine to which the batch was sent to is now set to the decoder context when decoding the batch. This is needed so that we can distinguish between instructions as the render and video pipe share some of the instruction opcodes. v2: The engine is now in the decoder context and the batch decoder uses a local function for finding the instruction for an engine. v3: Spec uses engine_mask now instead of engine, replaced engine class enums with the definitions from UAPI. v4: Fix up aubinator_viewer (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	a6aab7e436	intel/decoder: tools: gen_engine to drm_i915_gem_engine_class Removed the gen_engine enum and changed the involved functions to use the drm_i915_gem_engine_class enum from UAPI instead. v3: Wrong engine was being used for blocks in video ring v4: Fixed aubinator_viewer.cpp Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Toni Lönnberg	b00bccd012	intel/decoder: Engine parameter for instructions Preliminary work for adding handling of different pipes to gen_decoder. Each instruction needs to have a definition describing which engine it is meant for. If left undefined, by default, the instruction is defined for all engines. v2: Changed to use the engine class definitions from UAPI v3: Changed I915_ENGINE_CLASS_TO_MASK to use BITSET_BIT, change engine to engine_mask, added check for incorrect engine and added the possibility to define an instruction to multiple engines using the "|" as a delimiter in the engine attribute. v4: Fixed the memory leak. v5: Removed an unnecessary ralloc_free(). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-13 15:10:12 +00:00
Gert Wollny	8d4bb6e5cd	virgl: Add command and flags to initiate debugging on the host (v2) On the host VREND_DEBUG=guestallow must be set to let the guest override the debug flags. v2: Send flag string instead of flags, this avoids the need to keep the flags in sync. v3: Only request host logging if the host actually understands the command Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2018-11-13 14:42:22 +01:00
Gert Wollny	caa964b422	mesa: Reference count shaders that are used by transform feedback objects Transform feedback objects may hold a pointer to a shader program, and at least in Gallium, this must be a valid pointer until ctx->Driver.EndTransformFeedback in glEndTransformFeedback has been called - which conforms to the spec that any program that is part of a current rendering state should only be flagged for deletion by glDeleteProgram. This was not handled properly for the transform feedback objects so that a call sequence glUseProgram(x) glBeginTransformFeedback(...) glPauseTransformFeedback(...) glDeleteProgram(x) glEndTransformFeedback(...) would result in a use after free bug. With this patch the transform feedback object also updates the reference count to the used program thereby keeping the program valid as long as the transform feedback object links to it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108713 Fixes: `654587696b` mesa: add end_transform_feedback() helper Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-13 10:57:25 +01:00
Samuel Pitoiset	90d68858ed	radv: set optimal OVERWRITE_COMBINER_WATERMARK on GFX9 Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-13 10:24:36 +01:00
Samuel Pitoiset	f70c5d31cd	radv: set PA.SC_CONSERVATIVE_RASTERIZATION.NULL_SQUAD_AA_MASK_ENABLE Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-13 10:24:33 +01:00
Samuel Pitoiset	b5f213bb1d	radv: binding streamout buffers doesn't change context regs Cc: 18.3 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-13 10:24:31 +01:00
Plamena Manolova	c5f3013cba	nir: Don't lower the local work group size if it's variable. If the local work group size is variable it won't be available at compile time so we can't lower it in nir_lower_system_values(). Signed-off-by: Plamena Manolova <plamena.n.manolova@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-11-13 10:57:04 +02:00
Matt Turner	efb1ccadca	util/ralloc: Make sizeof(linear_header) a multiple of 8 Prior to this patch sizeof(linear_header) was 20 bytes in a non-debug build on 32-bit platforms. We do some pointer arithmetic to calculate the next available location with ptr = (linear_size_chunk *)((char *)&latest[1] + latest->offset); in linear_alloc_child(). The &latest[1] adds 20 bytes, so an allocation would only be 4-byte aligned. On 32-bit SPARC a 'sttw' instruction (which stores a consecutive pair of 4-byte registers to memory) requires an 8-byte aligned address. Such an instruction is used to store to an 8-byte integer type, like intmax_t which is used in glcpp's expression_value_t struct. As a result of the 4-byte alignment returned by linear_alloc_child() we would generate a SIGBUS (unaligned exception) on SPARC. According to the GNU libc manual malloc() always returns memory that has at least an alignment of 8-bytes [1]. I think our allocator should do the same. So, simple fix with two parts: (1) Increase SUBALLOC_ALIGNMENT to 8 unconditionally. (2) Mark linear_header with an aligned attribute, which will cause its sizeof to be rounded up to that alignment. (We already do this for ralloc_header) With this done, all Mesa's unit tests now pass on SPARC. [1] https://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html Fixes: `47e1758692` ("glcpp: use the linear allocator for most objects") Bug: https://bugs.gentoo.org/636326 Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-12 20:54:49 -08:00
Matt Turner	7e3748c268	util/ralloc: Switch from DEBUG to NDEBUG The debug code is all asserts, so protect it with the same thing that controls assert. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-12 20:54:49 -08:00
Timothy Arceri	34dffcf913	nir: add support for removing redundant stores to copy prop var For example the following type of thing is seen in TCS from a number of Vulkan and DXVK games: vec1 32 ssa_557 = deref_var &oPatch (shader_out float) vec1 32 ssa_558 = intrinsic load_deref (ssa_557) () vec1 32 ssa_559 = deref_var &oPatch@42 (shader_out float) vec1 32 ssa_560 = intrinsic load_deref (ssa_559) () vec1 32 ssa_561 = deref_var &oPatch@43 (shader_out float) vec1 32 ssa_562 = intrinsic load_deref (ssa_561) () intrinsic store_deref (ssa_557, ssa_558) (1) /* wrmask=x */ intrinsic store_deref (ssa_559, ssa_560) (1) /* wrmask=x */ intrinsic store_deref (ssa_561, ssa_562) (1) /* wrmask=x */ No shader-db changes on i965 (SKL). vkpipeline-db results RADV (VEGA): Totals from affected shaders: SGPRS: 7832 -> 7728 (-1.33 %) VGPRS: 6476 -> 6740 (4.08 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 469572 -> 456596 (-2.76 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 989 -> 960 (-2.93 %) Wait states: 0 -> 0 (0.00 %) The Max Waves and VGPRS changes here are misleading. What is happening is a bunch of TCS outputs are being optimised away as they are now recognised as unused. This results in more varyings being compacted via nir_compact_varyings() which can result in more register pressure when they are not packed in an optimal way. This is an existing problem independent of this patch. I've run some benchmarks and haven't noticed any performance regressions in affected games. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-13 15:19:36 +11:00
Timothy Arceri	3561108de0	anv/i965: make use of nir_link_constant_varyings() shader-db results for SKL: total instructions in shared programs: 13106498 -> 13091573 (-0.11%) instructions in affected programs: 1186244 -> 1171319 (-1.26%) helped: 6186 HURT: 0 total cycles in shared programs: 332062633 -> 331961653 (-0.03%) cycles in affected programs: 8537165 -> 8436185 (-1.18%) helped: 5371 HURT: 862 LOST: 6 GAINED: 14 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-13 14:06:32 +11:00
Eric Anholt	621b0fa892	egl: Improve the debugging of gbm format matching in DRI configs. Previously the debug would be: libEGL debug: No DRI config supports native format 0x20203852 libEGL debug: No DRI config supports native format 0x38385247 but libEGL debug: No DRI config supports native format R8 libEGL debug: No DRI config supports native format GR88 is a lot easier to understand. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-11-12 15:20:23 -08:00
Eric Anholt	6328536ff2	gbm: Introduce a helper function for printing GBM format names. This requires that the caller make a little (stack) allocation to store the string. v2: Use gbm_format_canonicalize (suggested by Daniel) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-11-12 15:20:23 -08:00
Eric Anholt	ee7f848c00	gbm: Move gbm_format_canonicalize() to the core. I want it for the format name debugging code. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-11-12 15:20:23 -08:00
Dylan Baker	4eab98b66e	meson: fix libatomic tests There are two problems: 1) the extra underscore in MISSING_64BIT_ATOMICS 2) we should link with libatomic if the previous test decided we needed it Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>	2018-11-12 13:29:00 -08:00
Marek Olšák	32a334777c	mesa: mark GL_SR8_EXT non-renderable on GLES Fixes: dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.sr8_ext Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-11-12 16:19:43 -05:00
Marek Olšák	e0c7114eb3	st/mesa: disable L3 thread pinning This implementation can have massive drawbacks. Cc: 18.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>	2018-11-12 16:18:15 -05:00
Christian Gmeiner	c6aaafa3a1	nir: add lowering for ffloor Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-12 21:57:25 +01:00
Alyssa Rosenzweig	41c8f99137	util: Fix warning in u_cpu_detect on non-x86 regs is only set and used on x86; on other platforms (like ARM), this code causes a trivial warning, solved by moving the regs declaration to the architecture-dependent usage. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2018-11-12 10:28:04 -08:00
Dylan Baker	9c2a95b298	meson: Don't set -Wall meson does this for you with its warn levels, so we don't need to set it ourselves. Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-12 08:55:55 -08:00
Rob Clark	4a0c2cfdd6	freedreno/drm: fix unused 'entry' warnings Looks like importing libdrm_freedreno into mesa crossed paths with `e27902a261`. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-12 10:45:48 -05:00
Lionel Landwerlin	89785e2d56	i965: add support for sampling from AYUV Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-12 13:22:54 +00:00
Lionel Landwerlin	252ca7b43f	dri: add AYUV format v2: Add a AYUV entry android in the android backend (Tapani) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-12 13:22:54 +00:00
Lionel Landwerlin	8a15f06d19	nir/lower_tex: Add AYUV lowering support Byte ordering is : 0: V 1: U 2: Y 3: A v2: Split refactoring of alpha channel (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1) Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v2)	2018-11-12 13:22:54 +00:00
Lionel Landwerlin	0a30c33e83	nir/lower_tex: add alpha channel parameter for yuv lowering We're about to introduce AYUV support which provides its own alpha channel. So give alpha as a parameter and set it to 1 on existing formats. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-12 13:22:54 +00:00
Samuel Pitoiset	97fb1a02fd	radv: make use of num_good_cu_per_sh in si_emit_graphics() too Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-12 09:35:46 +01:00
Samuel Pitoiset	d9d14346c2	radv: clean up setting partial_es_wave for distributed tess on VI Only needed when the pipeline actually uses tessellation. I don't think that changes anything, except improving readability. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-12 09:35:44 +01:00
Samuel Pitoiset	cc4569b733	radv: cleanup and document a Hawaii bug with offchip buffers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-12 09:35:42 +01:00
Hanno Böck	8dc2085baf	glsl/test: Fix use after free in test_optpass. The variable state is free'd and afterwards state->error is used as the return value, resulting in a use after free bug detected by memory safety tools like address sanitizer. Signed-off-by: Hanno Böck <hanno@hboeck.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108636 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-12 07:42:58 +02:00
Timothy Arceri	a068958692	nir: don't pack varyings ints with floats unless flat Fixes: `1c9c42d16b` ("nir: add varying component packing helpers") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-12 15:38:56 +11:00
Timothy Arceri	9dd737bb02	nir: add glsl_type_is_integer() helper Fixes: `1c9c42d16b` ("nir: add varying component packing helpers") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-12 15:38:56 +11:00
Francisco Jerez	552642066f	intel/fs: Prevent emission of IR instructions not aligned to their own execution size. This can occur during payload setup of SIMD-split send message instructions, which can lead to the emission of header setup instructions with a non-zero channel group and fixed SIMD width. Such instructions could end up using undefined channel enable signals except they don't care since they're always marked force_writemask_all. Not known to affect correctness of any workload at this point, but it would be trivial to back-port to stable if something comes up. Reported-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Sagar Ghuge <sagar.ghuge@intel.com>	2018-11-09 19:39:22 -08:00
Timothy Arceri	590fcb50e7	st/mesa: make use of nir_link_constant_varyings() Shader-db results radeonsi (VEGA): Totals from affected shaders: SGPRS: 161464 -> 161368 (-0.06 %) VGPRS: 86904 -> 86292 (-0.70 %) Spilled SGPRs: 296 -> 314 (6.08 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 3618596 -> 3573852 (-1.24 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 26189 -> 26276 (0.33 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-10 11:41:00 +11:00
Timothy Arceri	d40dd05553	nir: add new linking opt nir_link_constant_varyings() This pass moves constant outputs to the consuming shader stage where possible. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-11-10 11:41:00 +11:00
Andre Heider	414470854d	st/nine: clean up thread shutdown sequence a bit Just break out of the loop instead, it does the same thing. Signed-off-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Axel Davy <davyaxel0@gmail.com>	2018-11-09 22:37:27 +01:00
Andre Heider	123bf9cbe7	st/nine: plug thread related leaks Signed-off-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Axel Davy <davyaxel0@gmail.com>	2018-11-09 22:37:27 +01:00
Andre Heider	10598c9667	st/nine: fix stack corruption due to ABI mismatch This fixes various crashes and hangs when using nine's 'thread_submit' feature. On 64bit, the thread function's data argument would just be NULL. On 32bit, the data argument would be garbage depending on the compiler flags (in my case -march>=core2). Fixes: `f3fa7e3068` ("st/nine: Use WINE thread for threadpool") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Axel Davy <davyaxel0@gmail.com>	2018-11-09 22:37:26 +01:00
Marek Olšák	d2b2364313	radeonsi: stop command submission with PIPE_CONTEXT_LOSE_CONTEXT_ON_RESET only Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-11-09 14:55:04 -05:00
Marek Olšák	4bec5025ac	gallium: add PIPE_CONTEXT_LOSE_CONTEXT_ON_RESET Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-11-09 14:55:04 -05:00
Marek Olšák	9dc776f3f2	radeonsi: don't set the CB clear color registers for 0/1 clear colors on Raven2 and add has_dcc_constant_encode.	2018-11-09 14:55:04 -05:00
Marek Olšák	832ab883e2	radeonsi: use better DCC clear codes Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-11-09 14:55:04 -05:00
Marek Olšák	d059eae269	ac/surface: remove the overallocation workaround for Vega12 not needed anymore (probably since the tile_swizzle fix) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-11-09 14:55:04 -05:00
Lionel Landwerlin	959e2a5aeb	intel/aub_read: remove useless breaks Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-09 18:17:30 +00:00
Erik Faye-Lund	b55af392d9	Revert "mesa: expose NV_conditional_render on GLES" This reverts commit `5213be9fab`.	2018-11-09 17:39:25 +01:00
Erik Faye-Lund	cf8b271cbe	Revert "mesa/main: fixup make check after NV_conditional_render for gles" This reverts commit `cccd7a253f`.	2018-11-09 17:39:22 +01:00
Erik Faye-Lund	cccd7a253f	mesa/main: fixup make check after NV_conditional_render for gles It seems I missed some details when exposing NV_conditional_render on GLES; this fixes up "make check". Fixes: `5213be9fab` ("mesa: expose NV_conditional_render on GLES") Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-09 16:47:34 +01:00
Nicolai Hähnle	8c97abc066	radv: include LLVM IR in the VK_AMD_shader_info "disassembly" Helpful for debugging compiler backend problems: this allows us to easily retrieve the LLVM IR from RenderDoc. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-11-09 14:54:37 +01:00
Erik Faye-Lund	5213be9fab	mesa: expose NV_conditional_render on GLES The extension spec has been updated to include GLES 2 support, so let's enable it there. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-11-09 13:03:00 +01:00
Iago Toral Quiroga	35baee5dce	nir/constant_folding: fix incorrect bit-size check nir_alu_type_get_type_size takes a type as parameter and we were passing a bit-size instead, which did what we wanted by accident, since a bit-size of zero matches nir_type_invalid, which has a size of 0 too. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-11-09 08:22:15 +01:00
Iago Toral Quiroga	6c418dfa42	intel/compiler: fix node interference of simd16 instructions SIMD16 instructions need to have additional interferences to prevent source / destination hazards when the source and destination registers are off by one register. While we already have code to handle this, it was only running for SIMD16 dispatches, however, we can have SIMD16 instructions in a SIMD8 dispatch. An example of this are pull constant loads since commit `b56fa830c6`, but there are more cases. This fixes a number of CTS test failures found in work-in-progress tests that were hitting this situation for 16-wide pull constants in a SIMD8 program. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-11-09 08:22:08 +01:00
Roland Scheidegger	a3c898dc97	gallivm: fix improper clamping of vertex index when fetching gs inputs Because we only have one file_max for the (2d) gs input file, the value actually represents the max of attrib and vertex index (although I'm not entirely sure if we really want the max, since the max valid value of the vertex dimension can be easily deduced from the input primitive). Thus in cases where the number of inputs is higher than the number of vertices per prim, we did not properly clamp the vertex index, which would result in out-of-bound fetches, potentially causing segfaults (the segfaults seemed actually difficult to trigger, but valgrind certainly wasn't happy). This might have happened even if the shader did not actually try to fetch bogus vertices, if the fetching happened in non-active conditional clauses. To fix simply use the correct max vertex index value (derived from the input prim type) instead when clamping for this case. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-11-09 00:53:03 +01:00
Aditya Swarup	a5c39ed974	i965: Lift restriction in external textures for EGLImage support Fixes Skqp's unitTest_EGLImageTest test. For Intel platforms, we support external textures only for EGLImages created with EGL_EXT_image_dma_buf_import. This restriction seems to be Intel specific and not present for other platforms. While running SKQP test - unitTest_EGLImageTest, GL_INVALID is sent to the test because of this restriction. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105301 Signed-off-by: Aditya Swarup <aditya.swarup@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-11-08 12:33:06 -08:00
Ian Romanick	c5a4c26450	glsl: Add pragma to disable all warnings Use #pragma warning(off) and #pragma warning(on) to disable or enable all warnings. This is a big hammer. If we ever need a smaller hammer, we can enhance this functionality. There is one lame thing about this. Because we parse everything, create an AST, then convert the AST to GLSL IR, we have to treat the #pragma like a statement. This means that you can't do something like ' void ' #pragma warning(off) ' __foo ' #pragma warning(on) ' (float param0); Fixing that would, as far as I can tell, require a huge amount of work. I did try just handling the #pragma during parsing (like we do for state for the whole shader. v2: Fix the #pragma lines in the commit message that git-commit ate. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-11-08 11:00:00 -08:00
Ian Romanick	011abfc963	glsl: Add warning tests for identifiers with __ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-11-08 10:59:53 -08:00
Jason Ekstrand	d28bc35ece	intel/fs: Add an assert to optimize_frontfacing_ternary Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-11-08 10:09:25 -06:00
Jason Ekstrand	bcc6aab065	anv: Use nir_src_is_const and friends in lowering code Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-11-08 10:09:25 -06:00
Jason Ekstrand	52145070c0	intel/analyze_ubo_ranges: Use nir_src_is_const and friends Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-11-08 10:09:25 -06:00
Jason Ekstrand	1413512b4c	intel/vec4: Use the new nir_src_is_const and friends As of this commit, all uses of const sources either go through a nir_src_as_<type> helper which handles bit sizes correctly or else are accompanied by a nir_src_bit_size() == 32 assertion to assert that we have the size we think we have. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-11-08 10:09:25 -06:00
Jason Ekstrand	61e15348c4	nir: Add a read_mask helper for ALU instructions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-11-08 10:09:22 -06:00
Jason Ekstrand	344cfe6980	intel/fs: Use the new nir_src_is_const and friends As of this commit, all uses of const sources either go through a nir_src_as_<type> helper which handles bit sizes correctly or else are accompanied by a nir_src_bit_size() == 32 assertion to assert that we have the size we think we have. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-11-08 10:09:20 -06:00
Jason Ekstrand	6b2918709a	intel/fs,vec4: Clean up a repeated pattern with SSBOs Everywhere we handle SSBO intrinsics, we have exactly the same pattern for computing the index so we may as well make a helper for it. We also add a get_nir_src_imm to vec4 and use it for SSBO offsets. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-11-08 10:09:06 -06:00
Samuel Pitoiset	c472ad82e4	radv: fix GPU hangs when loading depth/stencil clear values on SI/CIK HTILE is supported on these chips, not sure how I missed that. This restores using PFP_SYNC_ME when LOAD_CONTEXT_REG is not used. Fixes: `f425d9ee74` ("radv: use LOAD_CONTEXT_REG when loading fast clear values") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-11-08 11:20:03 +01:00
Samuel Pitoiset	f425d9ee74	radv: use LOAD_CONTEXT_REG when loading fast clear values This avoids syncing the Micro Engine. This is only supported for VI+ currently. There is probably a way for using LOAD_CONTEXT_REG on previous chips but that could be done later. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-08 10:41:45 +01:00
Samuel Pitoiset	0dcd99c687	radv: only expose VK_SUBGROUP_FEATURE_ARITHMETIC_BIT for VI+ Inclusive and exclusives scan are missing because older chips don't have llvm.amdgcn.update.dpp. This fixes crashes with dEQP-VK.subgroups.arithmetic.*. CC: mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-08 10:41:41 +01:00
Adam Jackson	16f1023037	glx: Demand success from CreateContext requests (v2) GLXCreate{,New}Context, like most X resource creation requests, does not emit a reply and therefore is emitted into the X stream asynchronously. However, unlike most resource creation requests, the GLXContext we return is a handle to library state instead of an XID. So if context creation fails for any reason - say, the server doesn't support indirect contexts - then we will fail in strange places for strange reasons. We could make every GLX entrypoint robust against half-created contexts, or we could just verify that context creation worked. Reuse the __glXIsDirect code to do this, as a cheap way of verifying that the XID is real. glXCreateContextAttribsARB solves this by using the _checked version of the xcb command, so effectively this change makes the classic context creation paths as robust as CreateContextAttribs. v2: Better use of Bool, check that error != NULL first (Olivier Fourdan) Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-11-07 12:38:05 -05:00
Karol Herbst	f7fae7f64e	gm107/ir: fix compile time warning in getTEXSMask In function 'uint8_t nv50_ir::getTEXSMask(uint8_t)': warning: control reaches end of non-void function [-Wreturn-type] Reported-by: Moiman@freenode Fixes: `f821e80213` "gm107/ir: use scalar tex instructions where possible" Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-11-07 17:48:58 +01:00
Michel Dänzer	32b0eb51a3	winsys/amdgpu: Stop using amdgpu_bo_handle_type_kms_noimport It only behaves any different from amdgpu_bo_handle_type_kms with libdrm 2.4.93, and it breaks if an older version is picked up. Bugzilla: https://bugs.freedesktop.org/108096 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-11-07 17:37:47 +01:00
Lionel Landwerlin	792dde66f2	intel/dump_gpu: add platform option Got tired of remembering the PCI ids. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-07 11:27:41 +00:00
Lionel Landwerlin	e262cc0353	intel/dump_gpu: move output option together Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-07 11:27:38 +00:00
Samuel Pitoiset	0a0aa2ba6c	radv: disable conditional rendering for vkCmdCopyQueryPoolResults() VK_EXT_conditional_rendering says that copy commands should not be affected by conditional rendering. Cc: 18.2 18.3 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-07 11:31:36 +01:00
Samuel Pitoiset	1e7c3379e1	radv: allocate enough space in CS when copying query results with compute Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-07 11:31:34 +01:00
Timothy Arceri	9aa3c1915e	ac/nir_to_llvm: fix b2f for f64 Fixes: `d7e0d47b9d` ("nir: Add a bunch of b2[if] optimizations") Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-07 16:35:07 +11:00
Karol Herbst	f821e80213	gm107/ir: use scalar tex instructions where possible TEXS, TLD4 and TLD4S are variants of tex instructions which are more scalar, which gives RA more freedom and is less likely to insert silly MOVs to satisfy quad registers. shader-db changes: total instructions in shared programs : 7687265 -> 7614782 (-0.94%) total gprs used in shared programs : 803620 -> 798045 (-0.69%) total shared used in shared programs : 639636 -> 639636 (0.00%) total local used in shared programs : 24648 -> 24648 (0.00%) total bytes used in shared programs : 82103400 -> 81330696 (-0.94%) local shared gpr inst bytes helped 0 0 3648 10647 10647 hurt 0 0 464 205 205 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-11-06 19:57:05 +01:00
Karol Herbst	edd6c41751	nv50/ir: add scalar field to TexInstructions Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-11-06 19:57:05 +01:00
Karol Herbst	8d825f78fc	nv50/ra: add condenseDef overloads for partial condenses Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-11-06 19:57:05 +01:00
Karol Herbst	a4550de434	nv50/ir: print color masks of tex instructions v2: print the mask for TXG as well make the mask to be printed more mask like Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-11-06 19:57:05 +01:00
Jason Ekstrand	610061838a	vulkan: Update the XML and headers to 1.1.91 The biggest change here is the rename of VK_NVX_ray_tracing to VK_NV_ray_tracing and the total removal of VK_KHR_mir_surface. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-06 12:21:19 -06:00
Gert Wollny	c171d76b94	r600: Add support for EXT_texture_sRGB_R8 Enables on R600 and makes pass: dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.* dEQP-GLES31.functional.texture.filtering.cube_array.formats.sr8* v2: remove chunk for dri/radeon (Emil) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-11-06 18:49:02 +01:00
Lionel Landwerlin	421fa01d64	anv/android: mark gralloc allocated BOs as external Allocating through Gralloc implies buffers are going to be used outside the driver. We have special MOCS settings for external BOs and we probably want to use them here too. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `a1220e7311` ("anv/android: Set the BO flags in bo_cache_import (v2)") Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-06 15:28:07 +00:00
Lionel Landwerlin	b43f955037	anv: stub internal android code This reduces the amount of #ifdef ANDROID we'll have to have inside the driver. Potentially offering better coverage of the android extensions. v2: Move anv_android.h include before anv_entrypoints.h (Tapani) Fix autotools android build (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-06 15:28:07 +00:00
Kristian H. Kristensen	f6131d4ec7	freedreno/a6xx: Clear z32 and separate stencil with blitter Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-11-06 08:56:38 -05:00
Rob Clark	3bbad81c80	freedreno/a6xx: fix VSC bug with larger # of tiles At higher resolutions with the addition of MSAA, the number of tiles can increase to the point where we use more than one VSC pipe per tile. Which would cause us to calculate an out-of-bounds offset for VSC_SIZE_ADDRESS. So don't try to be clever, just always put it at a fixed offset assuming the max 32 VSC pipes in use. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-06 08:56:21 -05:00
Rob Clark	2d9c3a5db2	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-11-06 08:43:27 -05:00
Olivier Fourdan	55af17ffed	wayland/egl: Resize EGL surface on update buffer for swrast After commit `a9fb331ea` ("wayland/egl: update surface size on window resize"), the surface size is updated as soon as the resize is done, and `update_buffers()` would resize only if the surface size differs from the attached size. However, in the case of swrast, there is no resize callback and the attached size is updated in `dri2_wl_swrast_commit_backbuffer()` prior to the `swrast_update_buffers()` so the attached size is always up to date when it reaches `swrast_update_buffers()` and the surface is never resized. This can be observed with "totem" using the GDK backend on Wayland (the default) when running on software rendering: $ LIBGL_ALWAYS_SOFTWARE=true CLUTTER_BACKEND=gdk totem Resizing the window would leave the EGL surface size unchanged. To avoid the issue, partially revert the part of commit `a9fb331ea` for `swrast_update_buffers()` and resize on the win size and not the attached size. Fixes: `a9fb331ea` - wayland/egl: update surface size on window resize Signed-off-by: Olivier Fourdan <ofourdan@redhat.com> CC: Daniel Stone <daniel@fooishbar.org> CC: Juan A. Suarez Romero <jasuarez@igalia.com> CC: mesa-stable@lists.freedesktop.org Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-11-06 13:59:38 +01:00
Lionel Landwerlin	b47a69ed4c	intel/decoders: fix instruction base address parsing Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `00103db04a` ("intel: Fix decoding for partial STATE_BASE_ADDRESS updates.") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-11-05 13:22:35 -08:00
Emil Velikov	b3ade65387	egl/glvnd: correctly report errors when vendor cannot be found If the user provides an invalid display or device the ToVendor lookup will fail. In this case, the local [Mesa vendor] error code will be set. Thus on sequential eglGetError(), the error will be EGL_SUCCESS. To be more specific, GLVND remembers the last vendor and calls back into its eglGetError, although there's no guarantee to ever have had one. v2: - Add _eglError call, so the debug callback is executed (Kyle) - Drop XXX comment. Piglit: tests/egl/spec/egl_ext_device_query Fixes: `ce562f9e3f` ("EGL: Implement the libglvnd interface for EGL (v3)") Cc: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kyle Brenneman <kbrenneman@nvidia.com>	2018-11-05 20:53:05 +00:00
Emil Velikov	2a8fefdeb0	egl: add EGL_EXT_device_base entrypoints eglQueryDevicesEXT (unlike the other three functions) does not depend on the display. It is implemented in GLVND, which calls into each driver collecting the list of devices and presenting it to the user. For the other entrypoints, GLVND acts as pass through stub calling into the vendor library. The vendor implementation calls back into GLVND to get the vendor dispatch. Then the driver proceeds to call itself via the said dispatch. This design makes it possible to keep using "old" GLVND with newer vendor drivers. Since effectively all the extension code is within the latter itself. Without said entrypoints, any user will outright crash - as reported in the bug report. Note: there's a follow-up fix needed to our GLVND code, to make piglit happy. v2: add some beefy documentation in the commit message. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108635 Fixes: `7552fcb7b9` ("egl: add base EGL_EXT_device_base implementation") Reported-by: kyle.devir@mykolab.com Cc: kyle.devir@mykolab.com Acked-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-05 20:53:05 +00:00
Emil Velikov	7e169cf2a0	docs: mention EXT_shader_implicit_conversions Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-05 20:53:05 +00:00
Marek Olšák	04298a2f24	st/va: fix incorrect use of resource_destroy Fixes: `4373dd3215` ("st/va: Support YUV formats in vaCreateSurfaces") Cc: Drew Davenport <ddavenport@chromium.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-11-05 15:47:50 -05:00
Sergii Romantsov	5aeee1ab15	i965/batch/debug: Allow log be dumped before assert Message that may show the culprit of assert now will be dumped before that for debug purposes. Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Lionel G Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-11-05 09:24:55 -08:00
Lionel Landwerlin	4fd0ff75f3	intel/sanitize_gpu: add debug message on mmap fail Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-05 15:45:08 +00:00
Lionel Landwerlin	e400ac52e4	intel/sanitize_gpu: deal with non page multiple buffer sizes We can only map at page aligned offsets. We got that wrong with buffer size where (size % 4096) != 0 (anv has a WA buffer of 1024). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-05 15:45:07 +00:00
Lionel Landwerlin	c5fca35af1	intel/sanitize_gpu: add help/gdb options to wrapper Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-05 15:45:07 +00:00
Lionel Landwerlin	9ab5089150	intel/dump_gpu: add missing gdb option Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-05 15:43:34 +00:00
Eric Engestrom	d515ded4d9	wsi/wayland: only finish() a successfully init()ed display Fixes: `4369102498` "vulkan/wsi/wayland: Stop caching Wayland displays" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>	2018-11-05 15:29:21 +00:00
Eric Engestrom	dcee22afed	wsi/wayland: use proper VkResult type Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-05 14:55:05 +00:00
Sergii Romantsov	ce837a5372	autotools: library-dependency when no sse and 32-bit Building of 32bit Mesa may fail if __SSE__ is not specified. Added missed dependency from libm. v2: avoided dependency on any flag, just link v3: meson doesn't fail, but have added dependency on libm CC: Dylan Baker <dylan@pnwbakers.com> CC: Lionel G Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108560 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-11-05 13:21:49 +01:00
Samuel Pitoiset	f7fd0d86a9	radv: more use of radv_cp_wait_mem() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-05 09:48:50 +01:00
Samuel Pitoiset	c571ca7a08	radv: replace si_emit_wait_fence() with radv_cp_wait_mem() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-05 09:48:50 +01:00
Samuel Pitoiset	b1b2dd06a7	radv: add missing TFB queries support to CmdCopyQueryPoolsResults() Cc: 18.3 <mesa-stable@lists.freedesktop.org> Fixes: `b4eb029062` ("radv: implement VK_EXT_transform_feedback") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-05 09:48:43 +01:00
Samuel Pitoiset	dc3419195c	radv: remove useless sync after copying query results with compute The spec says: "vkCmdCopyQueryPoolResults is considered to be a transfer operation, and its writes to buffer memory must be synchronized using VK_PIPELINE_STAGE_TRANSFER_BIT and VK_ACCESS_TRANSFER_WRITE_BIT before using the results." VK_PIPELINE_STAGE_TRANSFER_BIT will wait for compute to be idle, while VK_ACCESS_TRANSFER_WRITE_BIT will invalidate both L1 vector caches and L2. So, it's useless to set those flags internally. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-05 09:47:55 +01:00
Vinson Lee	64a9ed8848	r600/sb: Fix constant logical operand in assert. Fixes: `da977ad907` ("r600/sb: start adding GDS support") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2018-11-04 21:09:55 -08:00
Kenneth Graunke	5d517a599b	st/mesa: Don't record garbage streamout information in the non-SSO case. In the non-SSO case, where multiple shader stages are linked together, we were recording garbage pipe_stream_output_info structures for all but the last enabled geometry-processing stage. Specifically, we were using the gl_transform_feedback_info from shader_program->last_vert_prog (the stage whose outputs will be recorded)...but were pairing it with the output varying mappings from the current shader stage. For example, a program with a VS and GS, the VS's pipe_shader_state would have a pipe_stream_output_info based on the GS transform feedback info, but the VS output mapping. This generally worked out okay because only the pipe_stream_output_info for the last stage really matters - the others can be ignored. However, we'd like to avoid confusing the pipe driver. In particular, my new driver translates the stream out information to hardware packets at bind_{vs,tes,gs}_state() time...and was hitting asserts about garbage varyings that didn't exist. This patch changes st/mesa to record a blank pipe_stream_output_info with num_outputs = 0 for all stages prior to last_vert_prog. The last one is captured as normal. (In the fully-SSO case, nothing should change - each program contains a single shader stage, so last_vert_prog is the current shader.) Tested with llvmpipe (piglit's gpu profile), and freedreno (a3xx, gpu profile with -t transform.feedback). Fixes several hundred CTS tests on my new driver. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-11-03 23:34:36 -07:00
Kenneth Graunke	b6410a2d22	st/nir: Drop unused parameter from st_nir_assign_uniform_locations(). ARB programs won't have one of these, and we don't use it anyway. Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-11-03 23:34:36 -07:00
Kenneth Graunke	5294d65011	st/mesa: Pull nir_lower_wpos_ytransform work into a helper function. This will let me use it in the ARB program code as well. Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-11-03 23:34:34 -07:00
Kenneth Graunke	424a6052df	intel: Use a URB start offset of 0 for disabled stages. There are some cases where the VS is the only stage enabled, it uses the entire URB, and the URB is large enough that placing later stages after the VS exceeds the number of bits for "URB Starting Address". For example, on Icelake GT2, "varying-packing-simple mat2x4 array" from Piglit is getting a starting offset of 128 for the GS/HS/DS. But the field is only large enough to hold an offset of 127. i965 doesn't hit any genxml assertions because it's still using the old OUT_BATCH mechanism. 128 << GEN7_URB_STARTING_ADDRESS_SHIFT (57) == 0, with the extra bit falling off the end. So we place the disabled stage at the beginning of the URB (overlapping with push constants). This is likely okay since it's a zero size region (0 entries). It seems like the Vulkan driver might hit this assertion, however, and the situation seems harmless. To work around this, always place disabled stages at the start of the URB, so the last enabled stage can fill the remaining space without overflowing the field. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-11-03 23:25:57 -07:00
Mauro Rossi	5c0cff868a	android: radv: add libmesa_git_sha1 static dependency libmesa_git_sha1 whole static dependency is added to get git_sha1.h header and avoid following building error: external/mesa/src/amd/vulkan/radv_device.c:46:10: fatal error: 'git_sha1.h' file not found ^ 1 error generated. Fixes: `9d40ec2cf6` ("radv: Add support for VK_KHR_driver_properties.") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-11-03 10:48:45 +01:00
Eric Anholt	0d78c6af0d	vc4: Use the normal simulator ioctl path for CL submit as well. The simulator no longer needs to look back into the gallium structs.	2018-11-02 14:26:38 -07:00
Eric Anholt	c80e267a0a	vc4: Maintain a separate GEM mapping of BOs in the simulator. This will let us avoid looking back into the gallium driver's vc4_bo.	2018-11-02 14:26:38 -07:00
Eric Anholt	645ca269d2	vc4: Take advantage of _mesa_hash_table_remove_key() in the simulator.	2018-11-02 14:26:38 -07:00
Eric Anholt	f32ba7abd7	v3d: Remove the special path for simulation of the submit ioctl. Now that it doesn't need to find the struct v3d_bos, it can just take the normal v3d_ioctl() path.	2018-11-02 14:26:38 -07:00
Eric Anholt	df9f574c13	v3d: Maintain a mapping of the GEM buffer in the simulator. This way we don't need to reach back into the gallium driver code to get the mapping.	2018-11-02 14:26:38 -07:00
Dylan Baker	7652931d33	meson: link gallium nine with pthreads In some cases (not building with llvm, which automatically pulls in pthreads) nine needs to be directly linked with pthreads. Fixes building on x86 (32 bit) without llvm. Distro bug: https://bugs.gentoo.org/670094 Fixes: `6b4c7047d5` ("meson: build gallium nine state_tracker") Tested-by: Rafal Lalik <rafallalik@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-11-02 13:10:33 -07:00
Anuj Phogat	1c140470ef	anv/icl: Disable prefetching of sampler state entries WA_1606682166: Incorrect TDL's SSP address shift in SARB for 16:6 & 18:8 modes. Disable the Sampler state prefetch functionality in the SARB by programming 0xB000[30] to '1'. This is to be done at boot time and the feature must remain disabled permanently. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-11-02 08:34:33 -07:00
Topi Pohjolainen	9a41a10f8a	i965/icl: Disable prefetching of sampler state entries In the same spirit as commit `a5889d70f2` "i965/icl: Disable binding table prefetching". Fixes some 110+ intermittent piglit failures with tex-miplevel-selection variants. WA_1606682166: Incorrect TDL's SSP address shift in SARB for 16:6 & 18:8 modes. Disable the Sampler state prefetch functionality in the SARB by programming 0xB000[30] to '1'. This is to be done at boot time and the feature must remain disabled permanently. Anuj: Set SamplerCount = 0 for vs, gs, hs, ds and wm units as well. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-11-02 08:34:33 -07:00
Jan Vesely	9cab8ccd6c	amd: Make vgpr-spilling depend on llvm version The option was removed in LLVM r345763 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-11-02 10:32:47 -04:00
Timothy Arceri	769ae9fb7f	nir: fix condition propagation when src has a swizzle We cannot use nir_build_alu() to create the new alu as it has no way to know how many components of the src we will use. This results in it guessing the max number of components from one of its inputs. Fixes the following CTS tests: dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_frag dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_geom dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_tessc dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order_vert Fixes: `2975422ceb` ("nir: propagates if condition evaluation down some alu chains") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-03 00:44:01 +11:00
Mauro Rossi	b9dec214f5	android: gallium/auxiliary: add include to get u_debug.h header To avoid build error in u_debug_stack_android.cpp due to now missing u_debug.h header: external/mesa/src/gallium/auxiliary/util/u_debug_stack_android.cpp:26:10: fatal error: 'u_debug.h' file not found #include "u_debug.h" ^ 1 error generated. Fixes: `37db383abb` ("util: Move u_debug to utils") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-11-02 13:31:37 +01:00
Gert Wollny	b710680093	virgl/vtest-winsys: Use virgl version of bind flags The bind flags defined by mesa/gallium might not always be in sync with the ones copied to virglrenderer/gallium. Therefore, use the flags defined in virgl like it is done for all the other calls to create resources. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-02 11:53:09 +01:00
Gert Wollny	acd2968005	mesa/st: Add support for EXT_texture_sRGB_R8 This only adds support on the Gallium core level, for the drivers it is likely that additional changes are needed to support the new texture format and thereby enabling the extension. Enables on softpipe and makes pass: dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.* v2: - add include for getting GL_SR8_EXT v4: - since the extension is not required don't bother providing a fallback (Ilia Mirkin) - split patch (2/2) to separate Gallium and mesa/st parts (Roland Scheidegger) - trim commit message to only contain the history of the patch relevant to this part v5: - don't include GLES headers (required enum has been added to glheader.h) (Ilia Mirkin) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-11-02 11:52:44 +01:00
Gert Wollny	29f0ab2c30	Gallium: Add format PIPE_FORMAT_R8_SRGB This format is needed to support EXT_texture_sRGB_R8. The patch adds a new format enum, the format entries in Gallium and svga, the mapping between sRGB and linear formats, and tests. v2: - add mapping to linear format for PIPE_FORMATR_R8_SRGB v3: - Add texture format to svga format table since otherwise building mesa will fail when this driver is enabled. It was not tested whether the extension actually works. v4: - svga: remove the SVGA specific format definitions and table entries and only add correct the location of PIPE_FORMAT_R8_SRGB in the format_conversion_table (Ilia Mirkin) - Split patch (1/2) to separate Gallium part and mesa/st part. (Roland Scheidegger) - Trim the commit message to only contain the relevant parts from the split. v5: - svga: correct location of PIPE_FORMAT_SRGB_R8 (Ilia Mirkin) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-11-02 11:52:44 +01:00
Gert Wollny	b8e9c6522d	mesa/core: Add definitions and translations for EXT_texture_sRGB_R8 v2: - fix format definition line - disable for desktop GL - don't add GL_R8_EXT to glext.h since it is already in GLES2/gl2ext.h in glext.h and include this header where needed (all Emil) v3: - swrast: Fill the function table for sRGB_R8 The size of the function table is checked at compile time and must correspond to the number of mesa texture formats. dri/swrast being gles-2.0 doesn't support the extension though v4: - correct format layout comment (Ilia Mirkin) - correct logic for accepting GL_RED only textures (in part Ilia Mirkin) EXT_texture_sRGB_R8 requires OpenGL ES 3.0 which includes ARB_texture_rg/EXT_texture_rg, so one only must check for the first when SR8_EXT is really requested. v5: - add define for GL_ES8_XT to glheader.h and don't include GLES headers (Ilia Mirkin) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-11-02 11:52:44 +01:00
Erik Faye-Lund	742dace825	glsl: do not allow implicit casts of unsized array initializers The GLSL 4.6 specification (section 4.1.14. "Implicit Conversions") says: "There are no implicit array or structure conversions. For example, an array of int cannot be implicitly converted to an array of float." So let's add a check in place when assigning array initializers to implicitly sized arrays, to avoid incorrectly allowing code on the form: int[] foo = float[](1.0, 2.0, 3.0) This fixes the following dEQP test-cases: - dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_float_vertex - dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_float_fragment - dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_uint_vertex - dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.int_to_uint_fragment - dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.uint_to_float_vertex - dEQP-GLES31.functional.shaders.implicit_conversions.es31.invalid.arrays.uint_to_float_fragment - dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_float_vertex - dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_float_fragment - dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_uint_vertex - dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.int_to_uint_fragment - dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.uint_to_float_vertex - dEQP-GLES31.functional.shaders.implicit_conversions.es32.invalid.arrays.uint_to_float_fragment Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-02 11:10:36 +01:00
Erik Faye-Lund	6df922f438	mesa/glsl: add support for EXT_shader_implicit_conversions EXT_shader_implicit_conversions adds support for implicit conversions for GLES 3.1 and above. This is essentially a subset of ARB_gpu_shader5, and augments OES_gpu_shader5. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-02 11:10:36 +01:00
Erik Faye-Lund	ecab2d6f14	glsl: fall back to inexact function-match In GLES, we currently either need an exact match with a local function, or an exact match with a builtin. However, if we add support for implicit conversions for GLES shaders, we also need to fall back to a non-exact match in the case where there were no builtin match either. Luckily, we already have a variable ready with this, so let's just return it if the builtin-search failed. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-02 11:10:36 +01:00
Erik Faye-Lund	e975c5b785	glsl: add has_implicit_uint_to_int_conversion()-helper This makes the code a bit easier to read, as well as reduces repetition, especially when we add support for EXT_shader_implicit_conversions. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-02 11:10:36 +01:00
Erik Faye-Lund	12f001f013	glsl: add has_implicit_conversions()-helper This makes the code a bit easier to read, as well as will reduce repetition when we add support for EXT_shader_implicit_conversions. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-11-02 11:10:36 +01:00
Mathias Fröhlich	9f009c1a8f	mesa: Remove needless indirection in some draw functions. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-02 08:42:03 +01:00
Timothy Arceri	c7bdda8aa5	nir: allow propagation of if evaluation for bcsel Shader-db results Skylake: total instructions in shared programs: 13109035 -> 13109024 (<.01%) instructions in affected programs: 4777 -> 4766 (-0.23%) helped: 11 HURT: 0 total cycles in shared programs: 332090418 -> 332090443 (<.01%) cycles in affected programs: 19474 -> 19499 (0.13%) helped: 6 HURT: 4 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-02 15:56:34 +11:00
Dave Airlie	677b496b6b	radv: fix begin/end transform feedback with 0 counter buffers. If the user gives 0 counterBuffers then the driver should still enable transform feedback on all targets. This changes the driver to always enable xfb, and use counter buffers where one is defined for the target in question. Fixes: `b4eb029062` (radv: implement VK_EXT_transform_feedback) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-11-02 04:15:07 +00:00
Dave Airlie	7f37a52a21	radv: apply xfb buffer offset at buffer binding time not later. (v2) In order to handle pause/resume properly, the offset should be added to the buffer binding not to the begin/end paths. v2: don't add offset to size Fixes ext_transform_feedback-alignment* under zink Fixes: `b4eb029062` (radv: implement VK_EXT_transform_feedback) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-11-02 04:13:31 +00:00
Mark Janes	5f312e95f8	Revert "i965/batch: avoid reverting batch buffer if saved state is an empty" This reverts commit `a9031bf9b5`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108630	2018-11-01 16:28:05 -07:00
Eric Anholt	43a397c580	vc4: Drop the winsys_stride relayout in the simulator Since `0c1dd9dee0` ("broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride."), we have the vc4-side BO properly laid out (assuming it's linear) in the winsys BO so that we can skip this extra copy.	2018-11-01 14:34:02 -07:00
Eric Anholt	4e1b163eed	v3d: Update the TLB config for depth writes on V3D 4.2. Fixes 311 piglit cases on the simulator.	2018-11-01 13:56:30 -07:00
Eric Anholt	4018eb04e8	v3d: Use the TLB R/B swapping instead of recompiles when available. The recompile reduction is nice, but this also makes it so that a straight texture copy could get optimized some day to not unpack/repack the f16 values.	2018-11-01 13:56:30 -07:00
Eric Anholt	3923cf626d	v3d: Take advantage of _mesa_hash_table_remove_key() in the simulator.	2018-11-01 13:54:36 -07:00
Eric Anholt	47586ab569	v3d: Respect user-passed strides for BO imports. If the caller has passed in a stride for (linear) BO import, we should use that stride when rendering to the BO (or, if we some day support texturing from linear-imported BOs, when doing the linear-to-UIF shadow copy). This lets us remove the extra stride-changing relayout in the simulator.	2018-11-01 13:54:36 -07:00
Eric Anholt	5313fb8abd	v3d: Drop #if 0-ed out v3d_dump_to_file(). This came from vc4, where we had a file format for GPU hangs. I don't have one of those for V3D, and I probably won't ever have the simulator side produce dumps even if I do.	2018-11-01 13:54:36 -07:00
Eric Anholt	d3f66c385b	v3d: Fix a typo in a comment in job handling.	2018-11-01 13:54:36 -07:00
Eric Anholt	b93fc160f4	v3d: Fix a copy-and-paste comment in the simulator code.	2018-11-01 13:54:36 -07:00
Anuj Phogat	13c955182f	anv/icl: Set Error Detection Behavior Control Bit in L3CNTLREG The default setting of this bit is not the desirable behavior. WA_1406697149 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-01 12:00:23 -07:00
Anuj Phogat	b3d6937fb0	i965/icl: Set Error Detection Behavior Control Bit in L3CNTLREG The default setting of this bit is not the desirable behavior. WA_1406697149 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-01 12:00:23 -07:00
Emil Velikov	ac95a0e024	docs: add 19.0.0-devel release notes template Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-01 18:56:54 +00:00
Emil Velikov	97c73c9174	mesa: bump version to 19.1.0-devel Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-01 18:54:02 +00:00
Dylan Baker	1f41104b9b	meson: don't install translation files Tested-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Fixes: `7834926a4f` ("meson: add support for generating translation mo files")	2018-11-01 10:49:16 -07:00
Eric Engestrom	4da169d368	egl: use the LC_ALL hammer instead of LANG Some environment (like Travis apparently) set LC_* vars, messing up the sort ordering, so let's use envvar with the highest priority to make sure this is actually sorted in ASCII order. Suggested-by: Michel Dänzer <michel@daenzer.net> Fixes: `b42dc50a5f` "egl: fix entrypoint sorting test" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-11-01 17:25:08 +00:00
Eric Engestrom	b42dc50a5f	egl: fix entrypoint sorting test Fixes: `68dc591af1` "egl: Fix eglentrypoint.h sort order." Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 15:45:26 +00:00
Andrii Simiklit	fc3cecda8c	intel/tools: fix resource leak Some memory and file descriptors are not freed/closed. v2: fixed case where we skipped the 'aub' variable initialization Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-01 13:21:07 +00:00
Jonathan Gray	ae8e81b0e3	intel/tools: include stdarg.h in error2aub Include stdarg.h in error2aub.c otherwise it fails to build on OpenBSD due to not finding definitions for va_list va_start va_end. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-11-01 10:27:26 +00:00
Mathias Fröhlich	68dc591af1	egl: Fix eglentrypoint.h sort order. Fixes a make check failure. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108617 Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 10:56:21 +01:00
Samuel Pitoiset	9cbdcc86b7	radv: set PA_SU_PRIM_FILTER_CNTL optimally Ported from RadeonSI. It's always TRUE for CIK+ because RADV doesn't support 16 samples. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-01 08:49:15 +01:00
Samuel Pitoiset	85010585cd	radv: only enable gl_SampleMask if MSAA is enabled too Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-01 08:49:11 +01:00
Samuel Pitoiset	0c08074cef	radv: use radeon_info::num_good_cu_per_sh Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-01 08:49:08 +01:00
Samuel Pitoiset	9278089d05	ac/nir: make use of i1false in few more places Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-01 08:49:05 +01:00
Samuel Pitoiset	79410b1e87	radv: add support for Raven2 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-11-01 08:48:52 +01:00
Mathias Fröhlich	ad52e19408	mesa: Collect all the draw functions in draw.{h,c}. Some of these functions were distributed across different implementation and header files. Put them at a central place. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 06:08:49 +01:00
Mathias Fröhlich	3d64f3c795	mesa/vbo: Move _vbo_draw_indirect -> _mesa_draw_indirect Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 06:08:49 +01:00
Mathias Fröhlich	f726c61cc1	mesa/vbo: Move src/mesa/vbo/vbo_exec_array.c -> src/mesa/main/draw.c The array type draw is no longer directly dependent on the vbo module. Thus move array type draws into mesa/main/draw.c. Rename symbols starting with vbo_* to _mesa_* and apply some reindenting to make it consistent. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 06:08:49 +01:00
Mathias Fröhlich	952a5da584	vbo: Pull the _mesa_set_draw_vao calls out of the if clauses. These calls are just the same in each if branch. So pull that before the if. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 06:08:49 +01:00
Mathias Fröhlich	b00cb994ef	vbo: Preserve vbo_save::no_current_update on primitive restart. With this change we preserve the no_current_update property when we observe a glPrimitiveRestart call. That means that we now also get the no_current_update optimization for display lists that are made out of indexed draws using primitive restart. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 06:08:49 +01:00
Mathias Fröhlich	f2a52b3c25	vbo: Make no_current_update an argument to vbo_save_NotifyBegin. Instead of coding additional information into the primitive mode, make the only remaining flag there a direct argument to vbo_save_NotifyBegin. v2: Fix incorrect no_current_update in glRectf. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 06:08:49 +01:00
Mathias Fröhlich	b899f5e59c	vbo: Move no_current_update out of _mesa_prim. The _mesa_prim::no_current_update flag should tell the compiled display list if the current attributes that are placed in the dlists vbo shall take a defined state past replay of a display list. Immediate mode draws compiled into display lists should set the current values. Array draws may leave the current values in undefined state. So finally this flag is not a property of every primitive but it is a property of the compiled display list and there it is a property of the last primitive compiled into the list. So move the flag out of _mesa_prim into vbo_save. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 06:08:49 +01:00
Mathias Fröhlich	eae4ee9419	vbo: Remove the now unused VBO_SAVE_PRIM_WEAK define. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 06:08:49 +01:00
Mathias Fröhlich	873adb06fa	vbo: Remove the always false branch dlist replay. The previous patch left a constant if (0) in the code. Clean that up now. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 06:08:49 +01:00
Mathias Fröhlich	1387b4d533	vbo: Test for VBO_SAVE_PRIM_WEAK in _mesa_prim::mode is false. When setting the _mesa_prim::mode field we always filter out all non OpenGL primitive mode bits. So this tested bit cannot be there anymore and the test evaluates to zero. The zero is removed with the next patch to ease review. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 06:08:49 +01:00
Mathias Fröhlich	cee0dd8d5a	vbo: Remove VBO_SAVE_PRIM_WEAK from vbo_save_NotifyBegin calls. Now looking at the implementation of vbo_save_NotifyBegin. The VBO_SAVE_PRIM_WEAK flag, delivered in the primitive mode argument to vbo_save_NotifyBegin, is not evaluated anymore. The two users of the mode argument are the primitive mode itself, where the VBO_SAVE_PRIM_WEAK bit is masked out to retrieve the underlying OpenGL primitive mode. The other user is to check for the VBO_SAVE_PRIM_NO_CURRENT_UPDATE bit which is different from VBO_SAVE_PRIM_WEAK. So, since vbo_save_NotifyBegin does not care about VBO_SAVE_PRIM_WEAK, we can safely remove it from the call arguments of vbo_save_NotifyBegin. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 06:08:49 +01:00
Mathias Fröhlich	b632c072b2	vbo: Remove set but not used weak field from _mesa_prim. The only reader of the weak field in _mesa_prim is pretty console printing. By that, remove the weak field from _mesa_prim. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 06:08:49 +01:00
Mathias Fröhlich	2dc951b7c3	vbo: Remove the VBO_SAVE_FALLBACK flag. On finishing a display list playback the VBO_SAVE_FALLBACK bit is still kept in vbo_save_context::replay_flags. But examining replay_flags and the display list flags that feed this value the corresponding bit is never set these days anymore. So, since it is nowhere set or checked, we can safely remove it. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 06:08:49 +01:00
Mathias Fröhlich	5b41504f66	vbo: Remove unused vbo_save_fallback function. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 06:08:49 +01:00
Emil Velikov	075f92b2b7	docs/relnotes: add the EGL Device extensions Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-01 00:05:43 +00:00
Emil Velikov	83c7fbb4e4	meson: egl: group dri2 bits separately from haiku One cannot have haiku and dri2 - surfaceless,x11,etc. Group things up, which will make the addition of platform_device a bit easier. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-11-01 00:05:43 +00:00
Emil Velikov	c7cc135e23	egl: enable EGL_EXT_device_{base,enumeration,query} Now that we fully support the extensions, enable them. The specs mandate that we always have at least one device and each dpy has a device associated with it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 00:05:43 +00:00
Emil Velikov	00992700c9	egl: set the EGLDevice when creating a display This is the final requirement from the base EGLDevice spec. v2: - split from another patch - move wayland hunk after we have the fd Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 00:05:43 +00:00
Emil Velikov	dbb4457d98	egl: add EGL_EXT_device_drm support Add implementation based around the drmDevice API. As such it's only available when building with libdrm. With the latter already a requirement when using !SW code paths in the platform code. Note: the current code will work if a device is hot-plugged. Yet hot-unplug is not implemented, since I have no way of testing it. v2: - add some _eglDeviceSupports checks - require DRM_NODE_RENDER - add _eglGetDRMDeviceRenderNode helper v3: - flip inverted asserts (Mathias) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 00:05:43 +00:00
Emil Velikov	f73c5d27c1	egl: add EGL_MESA_device_software support Add a plain software device, which is always available. We can safely assign it as the first/initial device in _eglGlobals, although we ensure that's the case with a handful of _eglDeviceSupports checks throughout the code. v2: - s/_eglFindDevice/_eglAddDevice/ (Eric) - s/_eglLookupAllDevices/_eglRefreshDeviceList/ (Eric) - move ^^ helpers into a earlier patch (Eric, Mathias) - set the SW device on _eglGlobal init. (Eric) - add a number of _eglDeviceSupports checks (Mathias) - split Device/Display attach to a separate patch v3: - flip inverted asserts (Mathias) - s/on-stack/static/ (Mathias) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 00:05:43 +00:00
Adam Jackson	3f08e500c4	specs: Add EGL_MESA_device_software The device extension string is expected to contain the name of the extension defining what kind of device it is, so the caller can know what kinds of operations it can perform with it. So that string had better be non-empty, hence this trivial extension. v2: - drop "fallback", update history and update contributor list Signed-off-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 00:05:43 +00:00
Emil Velikov	7552fcb7b9	egl: add base EGL_EXT_device_base implementation Introduce the API for device query and enumeration. Those at the moment produce nothing useful since zero devices are actually available. That contradicts with the spec, so the extension isn't advertised just yet. With later commits we'll add support for software (always) and hardware devices. Each one exposing the respective extension string. v2: - fold API boilerplate into this patch - move _eglAddDevice, _eglDeviceSupports, _eglRefreshDeviceList to this patch (Eric, Mathias) - make _eglFiniDevice the one called last v3: - comment on the dummy _egl_device_extension enum entry (Eric) - annotate dev as MAYBE_UNUSED (Mathias) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-11-01 00:05:43 +00:00
Emil Velikov	e55c1bcb08	glx: be explicit about when mapping X <> GLX visuals Write down both X and GLX visual types when mapping from one to the other. Makes grepping through the code a tiny bit easier. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-11-01 00:05:43 +00:00
Emil Velikov	833e3cad19	glx: remove unused __glXPreferEGL() declaration The function definition is no longer around, drop the useless declaration. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-11-01 00:05:43 +00:00
Emil Velikov	4428eed896	travis: use mako for python2 Earlier commit flipped the default to python2 but forgot to update the travis file. Thanks to pip caching, things "worked" for a little while. Fixes: `f22ad5ef18` ("travis: use python3 for the autoconf builds") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-11-01 00:05:43 +00:00
Dave Airlie	fcf15a007d	radv/xfb: don't increase offset by component mask start. This is incorrect, the offset is into the buffer, and it's legal to write loc 0,0 -> buffer0, offset 0 loc 0,1 -> buffer1, offset 0 This fixes a bunch of piglits running on my zink xfb code on radv. Fixes: `6c21645046` (radv: emit stream outputs for vertex and tessellation stages) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-10-31 23:48:10 +00:00
Dylan Baker	d25179469b	util/gen_xmlpool: Make use of python's foreach loop Instead of using a while loop with indexing. This is much cleaner. This requires some other small changes. Acked-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-31 16:37:46 -07:00
Dylan Baker	465cfcb266	util/gen_xmlpool: Don't use len to test for container emptiness This is a very common python anti-pattern. Not using length allows us to go through faster C paths, but has the same meaning. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-31 16:37:46 -07:00
Dylan Baker	b9cd81ea31	util/gen_xmlpool: Don't write via shell redirection Using shell redirection to write to a file is more complicated than necessary, and has the potential to run into unicode encoding problems. It's also less code. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108530 v2: - update commit message to say less about LANG=C - use flags instead of positional arguments for the script (Emil) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-31 16:37:46 -07:00
Dylan Baker	1df086662a	util/gen_xmlpool: use with statement to open file Which ensures it is closed at the end of the scope. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-31 16:37:12 -07:00
Dylan Baker	bc4a7645e4	util/gen_xmlpool: use a main function Again, just good style Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-31 16:37:12 -07:00
Dylan Baker	187fad5c0b	util/gen_xmlpool: Use print function instead of sys.stderr.write This ensures that stderr is flushed, unlike writing Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-31 16:37:12 -07:00
Dylan Baker	2c2aa98ee7	util/gen_xmlpool: Use more standard style gen_xmlpool uses a style unlike the rest of mesa, spaces between function/method calls and the parens, strange whitespace to force lining up method calls, and some other whitespace stuff. Since I'm going to be doing some work in the file, I'm going to start cleaning those up. Acked-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-31 16:37:12 -07:00
Dylan Baker	a8004ef03e	docs/meson: Add note about update translations Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-31 16:37:12 -07:00
Dylan Baker	0621e91a8c	util/xmlpool: Update for meson generation Meson won't put the .gmo files in the layout that python's gettext.translation() expects, it puts them in the build directory in a flat layout. This modifies android and autotools to do the same (scons doesn't work with translations at all) v3: - Squash 4 patches into this patch Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-31 16:37:12 -07:00
Dylan Baker	7834926a4f	meson: add support for generating translation mo files Meson has a handy built-in module for handling gettext called i18n. This module works a bit differently than our autotools build does: it doesn't automatically generate translations; instead it creates 3 new top level targets to run. These are: xmlpool-pot xmlpool-update-po xmlpool-gmo v2: - Add new files to autotools dist tarball Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-31 16:37:12 -07:00
Dylan Baker	2857b18991	util/gen_xmlpool: use argparse for argument handling This is a little cleaner than just looking at sys.argv, but it's also going to allow us to handle the differences in the way meson and autotools handle translations more cleanly. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-31 16:37:12 -07:00
Timothy Arceri	5b757b4097	nir: fix if condition propagation for alu use We need to update the cursor before we check if the alu use is dominated by the if condition. Previously we were checking if the current location of the alu instruction was dominated by the if condition which would miss some optimisation opportunities. Fixes: `a3b4cb3458` ("nir/opt_if: Rework condition propagation") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-11-01 09:22:55 +11:00
Vinson Lee	802ae533ab	freedreno: Do not link ir3_compiler with valgrind libraries. This patch fixes this freedreno autotools build error. CXXLD ir3_compiler /usr/lib/valgrind/libcoregrind-amd64-linux.a(libcoregrind_amd64_linux_a-m_main.o): In function `_start': (.text+0x0): multiple definition of `_start' /usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/Scrt1.o:(.text+0x0): first defined here /usr/bin/ld: /usr/lib/valgrind/libcoregrind-amd64-linux.a(libcoregrind_amd64_linux_a-m_main.o): relocation R_X86_64_32S against undefined symbol `vgPlain_interim_stack' can not be used when making a PIE object; recompile with -fPIC /usr/bin/ld: /usr/lib/valgrind/libcoregrind-amd64-linux.a(libcoregrind_amd64_linux_a-m_trampoline.o): relocation R_X86_64_32 against `.text' can not be used when making a PIE object; recompile with -fPIC /usr/bin/ld: /usr/lib/valgrind/libcoregrind-amd64-linux.a(libcoregrind_amd64_linux_a-dispatch-amd64-linux.o): relocation R_X86_64_32S against symbol `vgPlain_stats__n_xindirs_32' can not be used when making a PIE object; recompile with -fPIC /usr/bin/ld: final link failed: Nonrepresentable section on output collect2: error: ld returned 1 exit status Fixes: `f3cc0d2747` ("freedreno: import libdrm_freedreno + redesign submit") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108595 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-10-31 15:05:28 -07:00
Emil Velikov	f22ad5ef18	travis: use python3 for the autoconf builds Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-31 19:16:00 +00:00
Emil Velikov	986033a275	configure: allow building with python3 Pretty much all of the scripts are python2+3 compatible. Check and allow using python3, while adjusting the PYTHON2 refs. Note: - python3.4 is used as it's the earliest supported version - python2 chosen prior to python3 v2: use python2 by default Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-31 19:15:50 +00:00
Juan A. Suarez Romero	6d7d3dbda5	docs: update calendar, add new item and link release notes for 18.2.4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-10-31 19:58:00 +01:00
Juan A. Suarez Romero	5b074c756e	docs: add sha256 checksums for 18.2.4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `624e384ea8`)	2018-10-31 19:55:28 +01:00
Juan A. Suarez Romero	7c2239aa55	docs: add release notes for 18.2.4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `1cdef5e70c`)	2018-10-31 19:55:25 +01:00
Eric Engestrom	091da79bb0	meson: hide warnings from external project `gtest` gtest is an external project that is copied in this tree for technical reasons, but isn't maintained by us, so its warnings are irrelevant. Cc: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-10-31 18:20:25 +00:00
Eric Engestrom	455a3cd515	tools/imgui: disable all warnings This is an external project we have no control over, and will not be fixing (other than by sometimes pulling the latest sources), so warnings serve no purpose here. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-31 16:28:33 +00:00
Alejandro Piñeiro	95b8da22cf	glspirv: no need to force entrypoint name to "main" Since commit "intel/compiler: Stop assuming the entrypoint is called "main"" there is no need to force the entrypoint name to be "main". Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-31 15:57:23 +01:00
Tapani Pälli	27f1298b9d	glsl/linker: validate attribute aliasing before optimizations Patch does a 'dry run' of assign_attribute_or_color_locations before optimizations to catch cases where we have aliasing of unused attributes which is forbidden by the GLSL ES 3.x specifications. We need to run this pass before unused attributes may be removed and with attribute binding information from program, therefore we re-use existing pass in linker rather than attempt to write another one. This fixes WebGL2 test 'gl-bindAttribLocation-aliasing-inactive' and Piglit test 'gles-3.0-attribute-aliasing'. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106833 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-10-31 14:53:47 +02:00
Eric Engestrom	a96749b13c	egl: drop EGL driver `name` This is a revert of Marek's `2cb9ab53dd` revert. It was needed to revert the previous commit, and didn't have any issue itself. -- The "DRI2" name was reported as confusing when printing EGL infos (one user reported thinking DRI3 was not working on his X server), and the only alternative is Haiku, which can only be used on a Haiku machine. The name therefore doesn't add any information that the user wouldn't know already, so let's just drop it. Suggested-by: Emil Velikov <emil.l.velikov@gmail.com> Related-to: `b174a1ae72` ("egl: Simplify the "driver" interface") Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-31 11:01:54 +00:00
Eric Engestrom	cb0980e69a	egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku} This is a revert of Marek's `84f3afc2e1` revert, with a missing line added back. I failed a rebase and dropped that crucial line, and didn't do a runtime test after my rebase, and as a result broke EGL for everyone. This commit has been tested by Intel's CI and I re-read it once more, so it should be good this time. -- Note: dropping the EGL_BAD_ALLOC in egl_haiku because it's overwritten by the EGL_NOT_INITIALIZED in eglInitialize(). Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-31 11:01:54 +00:00
Christian Gmeiner	21d9b78289	Revert "imx: make use of loader_open_render_node(..) helper" This reverts commit `773d6ea6e7`. Since kernel 4.17 (drm/etnaviv: remove the need for a gpu-subsystem DT node) the etnaviv DRM driver doesn't have an associated DT node anymore. This is technically correct, as the etnaviv device is a virtual device driving multiple hardware devices. Before 4.17 the userspace had access to the following information: DRIVER=etnaviv OF_NAME=gpu-subsystem OF_FULLNAME=/gpu-subsystem OF_COMPATIBLE_0=fsl,imx-gpu-subsystem OF_COMPATIBLE_N=1 MODALIAS=of:Ngpu-subsystemT<NULL>Cfsl,imx-gpu-subsystem DRIVER=imx-drm OF_NAME=display-subsystem OF_FULLNAME=/display-subsystem OF_COMPATIBLE_0=fsl,imx-display-subsystem OF_COMPATIBLE_N=1 After 4.17: DRIVER=etnaviv MODALIAS=platform:etnaviv The OF node has never been part of the etnaviv UABI, simply due to the fact that it's still possible to instantiate the etnaviv driver from a platform file, instead of a devicetree node. A patch set to fix this problem was sent out [1] but it looks like a proper solution needs more time to bake. [1] https://lists.freedesktop.org/archives/dri-devel/2018-October/194651.html Suggested-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-10-31 09:41:26 +01:00
Samuel Pitoiset	9ef8ea1451	radv: use WAIT_REG_MEM_GREATER_OR_EQUAL instead of a magic value Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-31 09:21:28 +01:00
Samuel Pitoiset	a9a56f47f8	radv: use pool->stride when calling radv_query_shader() Not needed to recompute the stride. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-31 09:21:28 +01:00
Samuel Pitoiset	e60ab66e33	radv: rename some parameters in Cmd{Begin,End}TransformFeedbackEXT() To match latest spec. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-31 09:21:28 +01:00
Samuel Pitoiset	57982b683b	radv/winsys: do not assign last submission when chained path failed I don't think we want to wait for something that hasn't been correctly submitted. This is similar to the fallback path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-31 09:21:28 +01:00
Samuel Pitoiset	ae3aecd07f	radv/winsys: fix buffer deletion in the sysmem path In case we failed to submit the CS correctly. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-31 09:21:28 +01:00
Samuel Pitoiset	72877865d9	radv/winsys: cleanup the chained submission path Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-31 09:21:28 +01:00
Samuel Pitoiset	d12dd16a97	radv/winsys: remove unused surface_best() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-31 09:21:28 +01:00
Jason Ekstrand	d3a0d8b750	intel/compiler: Stop assuming the entrypoint is called "main" This isn't true for Vulkan so we have to whack it to "main" in anv which is silly. Instead of walking the list of functions and asserting that everything is named "main" and hoping there's only one function named "main", just use the nir_shader_get_entrypoint() helper which has better assertions anyway. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-10-30 20:14:52 -05:00
Timothy Arceri	31596836fc	st/glsl_to_nir: fix next_stage gathering ffs() just returns the bit that is set, we need to know what stage that bit represents so use u_bit_scan() instead. Fixes: `2ca5d9548f` ("st/glsl_to_nir: gather next_stage in shader_info") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-31 09:33:17 +11:00
Timothy Arceri	9ec4a5ef29	st/mesa: calculate buffer size correctly for packed uniforms Fixes: `edded12376` ("mesa: rework ParameterList to allow packing") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-31 09:32:41 +11:00
Dylan Baker	fb02bd3d1c	util: move u_cpu_detect to util CC: vlee@freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107870 Fixes: `80825abb5d` ("move u_math to src/util") Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-30 14:32:52 -07:00
Dylan Baker	37db383abb	util: Move u_debug to utils Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-30 14:32:52 -07:00
Dylan Baker	2fd5dff7e7	util: Move os_misc to util this is needed by u_debug Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-30 14:32:52 -07:00
Dylan Baker	f1f104e548	gallium/util: remove u_inlines.h from u_debug.c It's not used, and I'm not pulling u_inlines into src/util. Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-30 14:32:52 -07:00
Dylan Baker	59d494c1cc	gallium/util: remove p_format.h from u_debug.h Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-30 14:32:52 -07:00
Dylan Baker	314777e86a	gallium/util: move memory debug declarations into u_debug_gallium Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-30 14:32:52 -07:00
Dylan Baker	68074dfa0e	gallium/util: move debug_print_tranfer_flags to u_debug_gallium This also appears to be unused. Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-30 14:32:52 -07:00
Dylan Baker	fc39dc9841	gallium/util: move debug_print_bind_flags to u_debug_gallium This also appears to be unused. Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-30 14:32:52 -07:00
Dylan Baker	e4f1fea821	gallium/util: move debug_print_usage_enum to the u_debug_gallium This isn't used in mesa, maybe vmware uses this in a closed source state tracker? Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-30 14:32:52 -07:00
Dylan Baker	078b3cdb34	gallium/util: start splitting u_debug into generic and gallium specific components In order to pull u_debug into src/util we need to break the generically useful bits from the bits that are tightly coupled to gallium. Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-30 14:32:52 -07:00
Dylan Baker	389d59c72a	gallium: split u_prim_name out of u_debug.h This allows us to pull u_prim.h out of u_debug.h Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-30 14:32:52 -07:00
Andre Heider	25a3ce97d5	gallium/hud: fix power sensor readings for amdgpu users amdgpu doesn't use the INPUT but the AVERAGE subfeature: $ sensors -u amdgpu-pci-0100 Adapter: PCI adapter power1: power1_average: 17.233 power1_cap: 180.000 Signed-off-by: Andre Heider <a.heider@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-30 16:30:32 -04:00
Rhys Perry	5172eb231d	glsl_to_tgsi: don't create 64-bit integer MAD/FMA TGSI has no I64MAD/U64MAD opcode. Fixes: `278580729a` ('st/glsl_to_tgsi: add support for 64-bit integers') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-30 20:27:12 +00:00
Marek Olšák	26cb93e229	radeonsi: add support for Raven2 (v2) v2: fix enabling primitive binning Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-10-30 16:03:02 -04:00
Marek Olšák	0dea85928e	radeonsi: clean up decompress flags in fast color clear	2018-10-30 16:03:02 -04:00
Marek Olšák	99835fff08	radeonsi/gfx9: set optimal OVERWRITE_COMBINER_WATERMARK	2018-10-30 16:03:02 -04:00
Marek Olšák	8ad12c8bec	gallium: rework PIPE_HANDLE_USAGE_* flags Only radeonsi uses them, so adjust them to match its needs.	2018-10-30 16:03:02 -04:00
Danylo Piliaiev	00fc56a68d	anv: Disable dual source blending when shader doesn't support it on gen8+ Dual source blending behaviour is undefined when shader doesn't have second color output. "If SRC1 is included in a src/dst blend factor and a DualSource RT Write message is not used, results are UNDEFINED. (This reflects the same restriction in DX APIs, where undefined results are produced if “o1” is not written by a PS – there are no default values defined)." Dismissing fragment in such situation leads to a hang on gen8+ if depth test is enabled. Since blending cannot be gracefully fixed in such case and the result is undefined - blending is simply disabled. v2 (Jason Ekstrand): - Apply the workaround to each individual entry - Emit a warning through debug_report Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-10-30 12:59:53 -07:00
Danylo Piliaiev	eca4a6548d	i965: Disable dual source blending when shader doesn't support it on gen8+ Dual source blending behaviour is undefined when shader doesn't have second color output, dismissing fragment in such situation leads to a hang on gen8+ if depth test is enabled. Since blending cannot be gracefully fixed in such case and the result is undefined - blending is simply disabled. v2 (Kenneth Graunke): - Listen to BRW_NEW_FS_PROG_DATA in 3DSTATE_PS_BLEND - Also whack BLEND_STATE[] to keep the two in sync, since we're not sure exactly which copy of the redundant info the hardware will use. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107088 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-10-30 12:59:53 -07:00
Kenneth Graunke	337a808062	i965: Respect GL_TEXTURE_SRGB_DECODE_EXT in GenerateMipmaps() Apparently, we're supposed to look at the texture object's built-in sampler object's sRGB decode setting in order to decide whether to decode/downsample/re-encode, or simply downsample as-is. Previously, I had always done the decoding/encoding. Fixes SKQP's Skia_Unit_Tests.SRGBMipMaps test. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-10-30 12:59:53 -07:00
Andrii Simiklit	e4e0fd5ffe	i965/batch: don't ignore the 'brw_new_batch' call for a 'new batch' If we restore the 'new batch' using 'intel_batchbuffer_reset_to_saved' function we must restore the default state of the batch using 'brw_new_batch' function because the 'intel_batchbuffer_flush' function will not do it for the 'new batch' again. At least the following fields of the batch 'state_base_address_emitted','aperture_space', 'state_used' should be restored to default values to avoid: 1. the aperture_space overflow 2. the missed STATE_BASE_ADDRESS command in the batch 3. the memory overconsumption of the 'statebuffer' due to uncleared 'state_used' field. etc. v2: merge with new commits, changes were minimized, added the 'fixes' tag v3: added to the patch series Fixes: `3faf56ffbd` "intel: Add an interface for saving/restoring the batchbuffer state." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107626 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-10-30 12:09:17 -07:00
Andrii Simiklit	a9031bf9b5	i965/batch: avoid reverting batch buffer if saved state is empty There's no point reverting to the last saved point if that save point is the empty batch, we will just repeat ourselves. CC: Chris Wilson <chris@chris-wilson.co.uk> Fixes: `3faf56ffbd` "intel: Add an interface for saving/restoring the batchbuffer state." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107626 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-10-30 12:09:09 -07:00
Eric Engestrom	ea738a91de	egl: add messages to a few assert() and turn a couple into unreachable() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-30 18:10:59 +00:00
Eric Engestrom	d0d6ec549d	util: s/0/NULL/ for pointer Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-30 18:10:59 +00:00
Eric Engestrom	5c64847322	i965: add missing case to fix -Wswitch While at it, turn "unreachable" assert() into unreachable(). Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-30 18:10:59 +00:00
Eric Engestrom	2894e278cf	mesa: fix struct/class mismatch Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-30 18:10:59 +00:00
Eric Engestrom	6000895e2d	mesa: fix memcpy() and memset(0) of non-trivial structs Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-30 18:10:59 +00:00
Eric Engestrom	69eb6d58e8	nouveau: remove unused class member Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-10-30 18:10:59 +00:00
Eric Engestrom	6f9309d5d4	scons: drop unused HAVE_STDINT_H macro This was required back when MSVC didn't support C99 and was missing this header, but since MSVC 2013 (or maybe earlier?) it does, and this code isn't doing anything anymore. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-30 18:10:59 +00:00
Eric Engestrom	a18d726621	aub_viewer: show vertex buffer pitch Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-30 18:10:59 +00:00
Eric Engestrom	0bbee28a3b	meson: add note about intel tools build options Fixes: `ea83a1d304` "intel: tools: import ImGui" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-10-30 18:10:59 +00:00
Eric Engestrom	4a266d01a7	vl: drop left-over variable Fixes: `6ccc435e7a` "pipe-loader: move dup(fd) within pipe_loader_drm_probe_fd" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-30 18:10:59 +00:00
Eric Anholt	68657d76b9	vc4: Fix unused variable warning. Fixes: `bb84fa146f` ("util: use C99 declaration in the for-loop hash_table_foreach() macro")	2018-10-30 10:46:52 -07:00
Eric Anholt	cc54e1acf9	v3d: Use nir_remove_unused_io_vars to handle binner shader output DCE We were doing this late after nir_lower_io, but we can just reuse the core code. By doing it at this stage, we won't even set up the VS attributes as inputs, reducing our VPM size.	2018-10-30 10:46:52 -07:00
Eric Anholt	c152c79d5e	v3d: Only add output slot tracking for the current varying slot. We always emit 4 slots per slot because things like color output and position processing in the epilogue will potentially look up more values than the variable declaration had. However, when we get a .location_frac != 0, we don't want to overwrite components of the following .driver_location.	2018-10-30 10:46:52 -07:00
Eric Anholt	17c8198952	v3d: Use nir_lower_io_to_scalar_early to DCE unused VS input components. This lets us trim unused trailing components in the vertex attributes, reducing the size of our VPM allocations.	2018-10-30 10:46:52 -07:00
Eric Anholt	fc85f7cfdc	v3d: Don't rely on sorting input vars for VPM read setup. For supporting scalar VPM i/o at the NIR level, we need to do a pass over the vars to figure out how big each attribute is after DCE. Once we've done that, we can just walk over c->vattr_sizes[] instead of bothering with vars.	2018-10-30 10:46:52 -07:00
Eric Anholt	cc78676030	v3d: Split out NIR input setup between FS and VPM. They don't share much code, and I'm about to rewrite the remaining shared code for the VPM case.	2018-10-30 10:46:52 -07:00
Eric Anholt	8265dfaa87	nir: Allow using nir_lower_io_to_scalar_early on VS input vars. This will be used on V3D to cut down the size of the VS inputs in the VPM (memory area for sharing data between shader stages). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-10-30 10:46:52 -07:00
Jason Ekstrand	f48b742289	anv: Bump the advertised patch version to 90 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-30 11:43:43 -05:00
Emil Velikov	29283921b7	m4: add Werror when checking for compiler flags Seemingly, at some point clang started accepting _any_ flags, whereas previously it would error out. These days, you can give it -Whamsandwich and it will succeed, while at the same time throwing an annoying warning. Add -Werror so that everything gets flagged and set accordingly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108082 Cc: Vinson Lee <vlee@freedesktop.org> Reported-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-30 16:41:05 +00:00
Dylan Baker	a8bed38b54	docs/calendar: Add 18.3 plan and expand 18.2 Emil will be helping out with 18.3, while Juan finalises 18.2 v2: [Emil] add Emil for 18.3, fix typos CC: Emil Velikov <emil.velikov@collabora.com> CC: Juan A. Romero Suarez <jasuarez@igalia.com> Cc: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-10-30 16:35:58 +00:00
Emil Velikov	c210d0c3b7	vulkan/wsi: use the drmGetDevice2() API On older kernels, the drmGetDevice() call will wake up all the GPUs on the system, while fetching the PCI revision. Use the 2 version of the API and pass flags == 0, so we don't fetch the device PCI revision, since we don't need that information. Fixes: `baa38c144f` ("vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-30 16:35:50 +00:00
Jason Ekstrand	a45b6fb452	spirv: Pass SSA values through functions Previously, we would create temporary variables and fill them out. Instead, we create as many function parameters as we need and pass them through as SSA defs. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-10-30 11:22:44 -05:00
Mauro Rossi	bfe0e32913	android: i965/tiled_memcpy: fix build for x86 generic target The x86 32-bit generic target does not enable ARCH_X86_HAVE_SSE4_1; for this reason, all Android library modules using SSE4_1 in mesa are built conditionally on ARCH_X86_HAVE_SSE4_1. The same approach is now applied to libmesa_intel_tiled_memcpy_sse41 in order to avoid the following build errors: external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:574:15: error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int' __m128i val = _mm_stream_load_si128((__m128i *)src); ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:578:15: error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int' __m128i val0 = _mm_stream_load_si128(((__m128i *)src) + 0); ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:579:15: error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int' __m128i val1 = _mm_stream_load_si128(((__m128i *)src) + 1); ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:580:15: error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int' __m128i val2 = _mm_stream_load_si128(((__m128i *)src) + 2); ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ external/mesa/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c:581:15: error: initializing '__m128i' (vector of 2 'long long' values) with an expression of incompatible type 'int' __m128i val3 = _mm_stream_load_si128(((__m128i *)src) + 3); ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 5 errors generated. 
Fixes: `11b1afdc92` ("i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-10-30 14:45:16 +02:00
Toni Lönnberg	50e952840f	intel: tools: Add handling for video pipe Preliminary work for adding handling of different pipes to gen_decoder. We need to be able to distinguish between different pipes in order to decode the packets correctly due to opcode re-use. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-30 12:43:00 +00:00
Toni Lönnberg	d5a938c58d	intel/decoder: Use 'DWord Length' and 'bias' fields for packet length. Use the 'DWord Length' and 'bias' fields from the instruction definition to parse the packet length from the command stream when possible. The hardcoded mechanism is used whenever an instruction doesn't have this field. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-30 12:43:00 +00:00
Marek Olšák	a09cbaffbf	mesa: expose EXT_texture_compression_s3tc on GLES The spec was modified to support GLES. Tested-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-10-30 13:31:00 +01:00
Michał Janiszewski	2734baa9e2	mesa: Add missing include guards Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-10-30 06:19:10 -06:00
Michał Janiszewski	ec994ca0fc	glx: Add missing include guards Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-10-30 06:19:10 -06:00
Michał Janiszewski	8ebd7039c4	svga: Add missing include guards Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-10-30 06:19:09 -06:00
Michał Janiszewski	0654450911	glsl: Add missing include guards Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-10-30 06:19:09 -06:00
Eric Engestrom	fddf384d1d	intel/batch-decoder: remove never-used function This function was there when the file was introduced in commit `38f10d5a03` "intel: tools: add aubinator viewer", but was never actually used. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-30 10:59:43 +00:00
Eric Engestrom	e9fb81375a	st/dri: remove leftover local variable Left over from the cleanup in `6ccc435e7a` "pipe-loader: move dup(fd) within pipe_loader_drm_probe_fd" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-30 10:20:58 +00:00
Vadym Shovkoplias	7d66eddbbd	glsl/linker: Fix out variables linking during single stage Since out variables are copied from the shader objects' instruction streams to the linked shader's instruction stream, they should be cloned first to keep the source instruction stream unaltered. Fixes: `966a797e43` ("glsl/linker: Link all out vars from a shader objects on a single stage") Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105731	2018-10-30 10:19:17 +11:00
Marek Olšák	8676af12c8	ac: fix ac_build_fdiv for f64 trivial Fixes: `a5f35aa742`	2018-10-29 17:24:21 -04:00
Brian Paul	9007c0ed26	nir: fix yet another MSVC build break Trivial.	2018-10-29 11:15:12 -06:00
Eric Engestrom	f3a5757eba	vulkan/wsi: simplify meson file tracking Meson already automatically tracks included headers, so there's no need to add them everywhere; cleans up the code a bit. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-10-29 16:39:47 +00:00
Eric Engestrom	1df0c1e8fb	clover: add missing meson build dependency Fixes: `42ea0631f1` "meson: build clover" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-10-29 16:39:42 +00:00
Eric Engestrom	98e7c3e7a7	svga: add missing meson build dependency Fixes: `a537231b22` "meson: build svga driver on linux" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-10-29 16:39:38 +00:00
Eric Engestrom	912cd0ce3b	radv: add missing meson build dependency Fixes: `9d40ec2cf6` "radv: Add support for VK_KHR_driver_properties." Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-10-29 16:39:34 +00:00
Eric Engestrom	2be1f9ceba	anv: add missing meson build dependency Fixes: `e4538b93f5` "anv: Implement VK_KHR_driver_properties" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-10-29 16:39:07 +00:00
Samuel Pitoiset	b4eb029062	radv: implement VK_EXT_transform_feedback This implementation should work and potential bugs can be fixed during the release candidates window anyway. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-29 17:10:58 +01:00
Samuel Pitoiset	f8d0337299	radv: add multiple streams support for the GS copy shader Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-29 17:09:08 +01:00
Samuel Pitoiset	6c21645046	radv: emit stream outputs for vertex and tessellation stages Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-29 17:09:08 +01:00
Samuel Pitoiset	19f1b49236	radv: declare streamout SGPRs Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-29 17:09:08 +01:00
Samuel Pitoiset	f4fa8de794	radv: gather stream output info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-29 17:09:08 +01:00
Samuel Pitoiset	fe551ec122	radv: allow to emit a vertex to a specified stream This is required for GS multiple streams support. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-29 17:09:08 +01:00
Samuel Pitoiset	a59f1b06ef	radv: allow to use up to 4 GSVS ring buffers For all streams. We basically just need to update the base address and compute a stride for every stream. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-29 17:09:08 +01:00
Samuel Pitoiset	98c09c3fcd	radv: adjust the number of output components per stream Same as the previous patch, except that it is only the number of components. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-29 17:09:08 +01:00
Samuel Pitoiset	4649471a9e	radv: adjust the GSVS ring sizes based on the number of components For multiple streams support we have to set the different ring buffer sizes correctly. This relies on the number of output components per stream. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-29 17:09:08 +01:00
Samuel Pitoiset	8e428e24a8	radv: gather which GS stream is used for every output To only emit outputs for the given stream. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-29 17:09:08 +01:00
Samuel Pitoiset	dd996d1885	radv: gather the number of output components per stream This will be also used for splitting the GS->VS ring buffer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-29 17:09:08 +01:00
Samuel Pitoiset	87e6866b04	radv: gather the number of streams used by geometry shaders This will be used for splitting the GS->VS ring buffer. The stream ID is always 0 for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-29 17:09:08 +01:00
Jason Ekstrand	19064b8c3a	nir: Add a pass for gathering transform feedback info This is different from the GL_ARB_spirv pass because it generates a much simpler data structure that isn't tied to OpenGL and mtypes.h. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-10-29 17:09:08 +01:00
Jason Ekstrand	e8a5fa054d	vulkan: Update the XML and headers to 1.1.90 This doesn't include any new features but it does include an XML and header typo fix for modifiers. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-10-29 10:17:19 -05:00
Samuel Pitoiset	9e56ffb0b4	radv: remove wrong comment in calculate_gs_ring_sizes() about streams The computation seems correct compared to RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-29 12:33:58 +01:00
Rob Clark	a61952e737	freedreno: don't flush when new and old pfb is identical In the 'inorder' case (ie. FD_MESA_DEBUG=inorder, or old kernel), if the u_blitter clear path is used (a3xx, a4xx, and some fallback cases on newer gens), util_blitter_restore_fb_state() will set_framebuffer_state() to something that is identical to the current fb state, which triggers an unnecessary flush, and then eventually an assert: (gdb) bt #0 0x0000007fbf24a078 in kill () from /lib64/libc.so.6 #1 0x0000007fbe061278 in _debug_assert_fail (expr=0x7fbe93a820 "!batch->flushed", file=0x7fbe93a628 "../src/gallium/drivers/freedreno/freedreno_batch.c", line=491, function=0x7fbe93a990 <__func__.17380> "fd_batch_check_size") at ../src/gallium/auxiliary/util/u_debug.c:322 #2 0x0000007fbe1ccb8c in fd_batch_check_size (batch=0x55556d5a70) at ../src/gallium/drivers/freedreno/freedreno_batch.c:491 #3 0x0000007fbe1d0e08 in fd_clear (pctx=0x55555c61e0, buffers=5, color=0x55556e388c, depth=1, stencil=0) at ../src/gallium/drivers/freedreno/freedreno_draw.c:463 #4 0x0000007fbe57afa4 in st_Clear (ctx=0x55556e17b0, mask=18) at ../src/mesa/state_tracker/st_cb_clear.c:452 The assert was introduced in `4b847b38ae`, so from a functionality standpoint this patch fixes that commit. But it should also avoid an unnecessary flush in the 'inorder' case, fixing a performance bug. Fixes: `4b847b38ae` freedreno: make fd_batch a one-shot thing Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-28 14:03:38 -04:00
Rob Clark	32dd75b927	freedreno: dependency tracking for z/s depends on ZSA state ZSA state can change whether depth or stencil is enabled. This plus the previous patch fixes stk, and various things w/ FD_MESA_DEBUG=inorder Fixes: `ec717fc629` freedreno: reduce resource dependency tracking overhead Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-28 14:03:38 -04:00
Rob Clark	05e868925c	freedreno: mark all state dirty after switching batch The problem isn't directly with `ec717fc629` but rather that commit exposes the problem. When we switch batch we cannot assume previous state is clean so we should mark all state dirty. Fixes: `ec717fc629` freedreno: reduce resource dependency tracking overhead Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-28 14:03:38 -04:00
Jason Ekstrand	1bd4f8fefc	anv: Use absolute timeouts in wait_for_bo_fences We were previously using relative timeouts and decrementing the user-provided timeout as we waited. Instead, this commit refactors things to use absolute timeouts throughout. This should fix a subtle bug in the waitAll case where we aren't decrementing the timeout after a successful GPU wait. Since pthread_cond_timedwait already takes an absolute timeout, it's also significantly simpler. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-27 16:18:33 -05:00
Jason Ekstrand	cbd4468695	anv: Flag semaphore BOs as external It probably doesn't actually break anything but it does cause some assertions in debug builds. Fixes: `7a89a0d9ed` "anv: Use separate MOCS settings for external BOs" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-27 00:02:32 -05:00
Jason Ekstrand	663a113700	anv: Improve the asserts in anv_buffer_get_range Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-27 00:02:32 -05:00
Rob Clark	c41772d17a	freedreno/a6xx: inline draw_impl() Now that it is just called once per draw (instead of once for binning and once for draw), let's just inline it. If nothing else, it makes perf-annotate easier to look at. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-26 18:10:00 -04:00
Rob Clark	604b5f1dca	freedreno/a6xx: small cleanup Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-26 18:10:00 -04:00
Rob Clark	2a74d9ae8d	freedreno/a6xx: move where we handle dirty vbo state Historically this wasn't in fdN_emit_state(), because prior to addition of blitter in a5xx, fdN_emit_state() was also used in the clear path. These days that is only true for a2xx (a3xx and a4xx use u_blitter). So the reason for it not to be in fd6_emit_state() no longer exists. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-26 18:10:00 -04:00
Rob Clark	ddb7fadaf8	freedreno: avoid no-op flushes by re-using last-fence Noticed that with webgl (in chromium, at least) we end up generating a lot of no-op submits just to get a fence. Tracking the last fence and returning that if there is no rendering since last flush avoids this. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-26 18:10:00 -04:00
Kristian H. Kristensen	01194cd582	freedreno/a6xx: Move stencil/depth/alpha state to IB Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-10-26 18:10:00 -04:00
Kristian H. Kristensen	a664dc2d59	freedreno/a6xx: Move stencil mask emit to FD_DIRTY_ZSA group Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-10-26 18:10:00 -04:00
Kristian H. Kristensen	3073926512	freedreno/a6xx: Rename FD6_GROUP_ZSA to FD6_GROUP_LRZ Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-10-26 18:10:00 -04:00
Kristian H. Kristensen	edc0f1b10f	freedreno/a6xx: Move rasterizer state to state object Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-10-26 18:10:00 -04:00
Kristian H. Kristensen	3264eb691a	freedreno/a6xx: Fix set_blit_scissor helper The scissor maxx/maxy are non-inclusive, so don't subtract one from framebuffer width and height. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-10-26 18:10:00 -04:00
Kristian H. Kristensen	4222fe8af2	freedreno/a2xx: Squash a compiler warning We get a warning here for assigning a const char * pointer to char *swizzle in struct ir2_src_register. The constructor strdups a 4 byte string here, so just memcpy to that instead. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-10-26 18:10:00 -04:00
Kristian H. Kristensen	4fd6265f42	freedreno/a6xx: Use fd6_emit_ib from a6xx Move it to a header and use it where possible to avoid vfunc call. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-10-26 18:10:00 -04:00
Rob Clark	f3cc0d2747	freedreno: import libdrm_freedreno + redesign submit In the pursuit of lowering driver overhead, it became clear that some amount of redesign of how libdrm_freedreno constructs the submit ioctl would be needed. In particular, as the gallium driver is starting to make heavier use of CP_SET_DRAW_STATE state groups/objects, the overhead of tracking cmd buffers and relocs becomes too much. And for "streaming" state, which isn't ever reused (like uniform uploads) the overhead of allocating/freeing ringbuffer[1] objects is too high. This redesign makes two main changes: 1) Introduces a fd_submit object for tracking bos and cmds table for the submit ioctl, making ringbuffer objects more lightweight. This was previously done in the ringbuffer. But we have many ringbuffer instances involved in a submit (gmem + draw + potentially 1000's of state-group rbs), and only need a single bos and cmds table. (Reloc table is still per-rb) The submit is also a convenient place for a slab allocator for ringbuffer objects. Other options would have required locking because, while we can guarantee allocations will only happen on a single thread, free's could happen either on the application thread or the flush_queue thread. With the slab allocator in the submit object, any frees that happen on the flush_queue thread happen after we know that the application thread is done with the submit. 2) Introduce a new "softpin" msm_ringbuffer_sp implementation that does not use relocs and only has cmds table entries for IB1 (ie. the cmdstream buffers that kernel needs to CP_INDIRECT_BUFFER to from the RB). To do this properly will require some updates on the kernel side, so whether you get the softpin or legacy submit/ringbuffer implementation at runtime depends on your kernel version. To make all these changes in libdrm would basically require adding a libdrm_freedreno2, so this is a good point to just pull the libdrm code into mesa. 
Plus it allows for using mesa's hashtable, slab allocator, etc. And it lets us have asserts enabled for debug mesa builds but omitted for release builds. And it makes life easier if further API changes become necessary. At this point I haven't tried to pull in the kgsl backend. Although I left the level of vfunc indirection which would make it possible to have other backends. (And this was convenient to keep to allow for the "softpin" ringbuffer to coexist.) NOTE: if bisecting a build error takes you here, try a clean build. There are a bunch of ways things can go wrong if you still have libdrm_freedreno cflags. [1] "ringbuffer" is probably a bad name, the only level of cmdstream buffer that is actually a ring is RB managed by kernel. Userspace cmdstream is all IB1/IB2 and state-groups. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-26 18:10:00 -04:00
Jason Ekstrand	aa02d7e878	Revert "anv/skylake: disable ForceThreadDispatchEnable" This reverts commit `0fa9e6d7b3`. The real issue appears to have been that HiZ ops don't like having WM thread dispatch force-enabled. The previous commit fixes that problem so we can go back to using the ForceThreadDispatchEnable bit even on SKL+. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-10-26 16:39:47 -05:00
Jason Ekstrand	b6b2b27809	blorp: Emit a dummy 3DSTATE_WM prior to 3DSTATE_WM_HZ_OP Cc: mesa-stable@lists.freedesktop.org Suggested-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-10-26 16:39:35 -05:00
Axel Davy	2318ca68bb	st/nine: Handle window resize when a presentation buffer is used Usually when a window is resized, the app calls d3d to resize the back buffer to the window size. In some cases, this is not done, and the app expects the output to be resized to the window size even if the back buffer size is unchanged. This patch introduces that behaviour when a presentation buffer is used. ID3DPresent_GetWindowInfo is a function available with D3DPresent v1.0, and thus we don't need to check if the function is available. The function had been introduced to implement this very feature. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-10-26 22:16:16 +02:00
Axel Davy	e50d374b61	d3dadapter: Fix wrong naming in header file GetWindowInfo used to be GetWindowSize before gallium nine was merged. A left-over remained... Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-10-26 22:16:16 +02:00
Axel Davy	3d975e98e4	st/nine: Reduce MaxSimultaneousTextures to 8 Windows drivers don't set this flag (which affects ff) to more than 8. Do the same in case some games check for 8. v2: Remove any dependence on MaxSimultaneousTextures. For non-ff the number of textures is 16 when the device is capable of vs/ps3. Add this requirement of 16 textures to the driver requirements. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-10-26 22:16:16 +02:00
Axel Davy	739c700950	st/nine: Enable shadow mapping for ps 1.X We didn't implement shadow textures for ps 1.X, assuming the case couldn't happen... Well it does. Fixes: https://github.com/iXit/Mesa-3D/issues/261 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-10-26 22:16:16 +02:00
Axel Davy	847861aab4	st/nine: Do not set unused states for stateblocks A lot of these states are used only for the context, and are unused for stateblocks (which just uses the changed.* fields instead for a lot of them). Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-10-26 22:16:16 +02:00
Axel Davy	6f373b9b74	st/nine: Fix aliasing states for stateblocks If NINE_STATE_FF_MATERIAL is set, the stateblock will upload its recorded materials matrix. If NINE_STATE_FF_LIGHTING is set, the lighting set is uploaded. These flags could be set by a NineDevice9_SetTransform call or by setting some states related to ff, but that shouldn't trigger these stateblock behaviours. We don't need to follow the context states dirtied by render states. NINE_STATE_FF_VSTRANSF is exactly the state controlling stateblock updates of transformation matrices, NINE_STATE_FF is too broad. These two changes avoid setting the two mentioned states when we shouldn't. Fixes: https://github.com/iXit/Mesa-3D/issues/320 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-10-26 22:16:16 +02:00
Axel Davy	454201b452	st/nine: Never update device changed.* fields The device state changed.* field are never used. These fields are used only for stateblocks. Avoid setting them at all for clarity. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-10-26 22:16:16 +02:00
Axel Davy	2594b2efdc	st/nine: Capture also default matrices for D3DSBT_ALL We avoid allocating space for matrices that are never used. However, we must behave as if we had captured them. Thus, when a D3DSBT_ALL stateblock being applied has fewer matrices than the device state, allocate the default matrices for the stateblock before applying. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-10-26 22:16:16 +02:00
Axel Davy	bbeddb801e	st/nine: Mark transform matrices dirty for D3DSBT_ALL D3DSBT_ALL stateblocks capture the transform matrices. Fixes some d3d test programs not displaying properly. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-10-26 22:16:16 +02:00
Axel Davy	a4e9bbb8f8	st/nine: Don't update unused world matrices While we have to accurately track all 256 world matrices for the application (including in stateblocks), hw vertex processing allows setting a limit, in the advertised caps, on the number of world matrices the hardware can access, which is 8 for nine. Thus don't bother in the stateblock code to send the updated values for the unreachable matrices. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-10-26 22:16:16 +02:00
Axel Davy	2e51c4c7cc	st/nine: Remove two unused states. NINE_STATE_MATERIAL was used incorrectly at one location. Replace it with the correct state. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-10-26 22:16:16 +02:00
Axel Davy	cb8ea21e1c	st/nine: Remove commented nine_context_apply_stateblock At some point the plan was to adapt the commented version to csmt. The csmt rework made it possible to fix some state aliasing issues between stateblocks and internal state updates. The commented version needs a lot of work to work with that. Just drop it. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-10-26 22:16:16 +02:00
Brian Paul	7e64e39f8b	nir: Fix array initializer Empty initializer is not standard C. This fixes MSVC build. Trivial.	2018-10-26 12:35:48 -06:00
Jason Ekstrand	07eb8e7466	anv: Return VK_ERROR_DEVICE_LOST from anv_device_set_lost This lets us get rid of a bunch of duplicated error messages. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-26 13:27:21 -05:00
Jason Ekstrand	ade22ae1ac	anv/util: Split a vk_errorv helper out of vk_errorf Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-26 13:27:21 -05:00
Brian Paul	d6be0b5556	scons/svga: remove opt from the list of valid build types This reverts commit `a5fd54f8bf`. The whole point was to add a way to pass -DVMX86_STATS to the build, but we can do that with a command line argument when we invoke scons. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2018-10-26 12:09:00 -06:00
Nanley Chery	5bcf479524	intel/blorp: Define the clear value bounds for HiZ clears Follow the restriction of making sure the clear value is between the min and max values defined in CC_VIEWPORT. Avoids a simulator warning for some piglit tests, one of them being: ./bin/depthstencil-render-miplevels 146 d=z32f_s8 Jason found this to fix incorrect clearing on SKL. Fixes: `09948151ab` ("intel/blorp: Add the BDW+ optimized HZ_OP sequence to BLORP") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-26 10:34:07 -07:00
Eric Engestrom	285ebc84c7	radv: remove duplicate brackets in version string MESA_GIT_SHA1 resolves to either an empty "" string if not built from git, or " (git-DEADBEEF)" if it is. No need to wrap it in additional "()". Fixes: `9d40ec2cf6` "radv: Add support for VK_KHR_driver_properties." Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-26 18:33:11 +01:00
Eric Engestrom	738f0f789b	vulkan: drop always-true param Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-26 18:33:11 +01:00
Boyuan Zhang	f4126cfaab	radeon/vcn: use util function to get h264 profile idc Use utility function for converting h264 pipe video profile to profile idc, instead of using array. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig at amd.com>	2018-10-26 13:23:06 -04:00
Boyuan Zhang	55cf565698	radeon/vce: use util function to get h264 profile idc Use utility function for converting h264 pipe video profile to profile idc, instead of using array. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig at amd.com>	2018-10-26 13:23:06 -04:00
Boyuan Zhang	b15d0200a9	vl: get h264 profile idc Adding a function for converting h264 pipe video profile to profile idc Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig at amd.com>	2018-10-26 13:23:06 -04:00
Jason Ekstrand	5cdeefe057	intel/nir: Use the OPT macro for more passes Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-10-26 11:45:29 -05:00
Jason Ekstrand	18fb2c5d92	spirv: Initialize subgroup destinations with the destination type Instead of initializing them manually, just use the type that we already have sitting there. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-10-26 11:45:29 -05:00
Jason Ekstrand	8fa70cfcfd	spirv: Use the right bit-size for spec constant ops Previously, we would always pull the bit size from the destination which is wrong for opcodes like nir_ilt where the sources are variable-sized but the destination is a fixed size. We were getting lucky before because nir_op_ilt returns a 32-bit value and basically everyone who uses spec constants uses 32-bit ones. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-10-26 11:45:29 -05:00
Jason Ekstrand	1d2ed694c1	nir/prog: Use nir_bany in kill handling We have a helper that does exactly what the bany_inequal was doing. It emits the same code but is a bit higher level and is designed to operate on a bvec4. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-10-26 11:45:29 -05:00
Jason Ekstrand	2fe3031440	glsl/nir: Use i2b instead of ine for fixing UBO/SSBO Booleans They do the same thing in the end but i2b is a bit simpler. Also, let's clean up the mess of code for SSBO handling with one line of builder. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-10-26 11:45:29 -05:00
Jason Ekstrand	5bfce5fcc2	nir/system_values: Use the bit size from the load_deref This isn't a great solution for bit-sizes but we don't have a particularly convenient way to get a bit size from the system value enum and this keeps the lowering pass from changing it. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-10-26 11:45:29 -05:00
Jason Ekstrand	a3b4cb3458	nir/opt_if: Rework condition propagation Instead of doing our own constant folding, we just emit instructions and let constant folding happen. This is substantially simpler and lets us use the nir_imm_bool helper instead of dealing with the const_value's ourselves. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-10-26 11:45:29 -05:00
Jason Ekstrand	4cd8a58595	nir/search: Use the nir_imm_* helpers from nir_builder This requires that we rework the interface a bit to use nir_builder but that's a nice little modernization anyway. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-10-26 11:45:29 -05:00
Jason Ekstrand	6e32115bd6	nir/builder: Handle 16-bit floats in nir_imm_floatN_t Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-10-26 11:45:29 -05:00
Jason Ekstrand	ff45649bc2	nir/builder: Add a nir_imm_true/false helpers Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-10-26 11:45:29 -05:00
Jason Ekstrand	249e32ab17	nir/constant_folding: Use nir_src_as_bool for discard_if Missed one while converting to the nir_src_as_* helpers. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-10-26 11:45:29 -05:00
Jason Ekstrand	6de1869e86	nir/constant_folding: Add an unreachable to a switch Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-10-26 11:45:29 -05:00
Jason Ekstrand	28bb6abd1d	nir/validate: Print when the validation failed Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-10-26 11:45:29 -05:00
Jason Ekstrand	292ebdbf98	anv: Handle the device loss abort in anv_device_set_lost Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-26 08:40:23 -05:00
Jason Ekstrand	cd0960b430	anv: Add helpers for setting/checking device lost Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-26 08:40:21 -05:00
Jason Ekstrand	319ff6f1ad	anv: Provide an error message with a DEVICE_LOST Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-26 08:40:10 -05:00
Alex Smith	3bd239f71d	anv: Fix sanitization of stencil state when the depth test is disabled When depth testing is disabled, we shouldn't pay attention to the specified depthCompareOp, and just treat it as always passing. Before, if the depth test is disabled, but depthCompareOp is VK_COMPARE_OP_NEVER (e.g. from the app having zero-initialized the structure), then sanitize_stencil_face() would have incorrectly changed passOp to VK_STENCIL_OP_KEEP. v2: Roll the depthTestEnable check into the ds_aspect check below since they now both do the same thing. Fixes: `028e1137e6` "anv/pipeline: Be smarter about depth/stencil state" Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-26 10:25:40 +01:00
Samuel Pitoiset	79bbdf8e45	radv: implement image to image operations for R32G32B32 This should address the remaining failures in Batman Arkham City. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107765 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-26 10:50:08 +02:00
Samuel Pitoiset	6198245775	radv: fix a comment in radv_meta_buffer_to_image_cs_r32g32b32() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-26 10:50:05 +02:00
Samuel Pitoiset	02ccef7874	radv: add get_image_stride_for_r32g32b32() helper For the special R32G32B32 paths. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-26 10:50:03 +02:00
Samuel Pitoiset	468c33e2f7	radv: add create_bview_for_r32g32b32() helper For the special R32G32B32 paths. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-26 10:50:00 +02:00
Samuel Pitoiset	e60e3e1b3f	radv: add create_buffer_from_image() helper For the special R32G32B32 paths. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-26 10:49:58 +02:00
Sagar Ghuge	416abe809a	intel/compiler: Print message descriptor as immediate source While disassembling send(c) instruction print message descriptor as immediate source operand along with message descriptor. This allows assembler to read immediate source operand and set bits accordingly. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-10-26 06:42:14 +02:00
Sagar Ghuge	d15fa24860	intel/compiler: Print hex representation along with floating point value While encoding immediate floating point values in an instruction we use values up to precision 9, but while disassembling we print only 6 places of precision, which rounds the value and gives a wrong interpretation of the encoded immediate constant. To avoid misinterpreting encoded immediate values between the instruction and the disassembled output, print the hex representation along with the floating point value, which can be used by the assembler in future. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-10-26 06:41:08 +02:00
David McFarland	07a00a8729	util: Change remaining uint32 cache ids to sha1 After discussion with Timothy Arceri. disk_cache_get_function_identifier was using only the first byte of the sha1 build-id. Replace disk_cache_get_function_identifier with implementation from radv_get_build_id. Instead of writing a uint32_t it now writes to a mesa_sha1. All drivers using disk_cache_get_function_identifier are updated accordingly. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Fixes: `83ea8dd99b` ("util: add disk_cache_get_function_identifier()")	2018-10-26 14:49:22 +11:00
Hyunjun Ko	3d198926a4	freedreno: use fd_bc_alloc_batch instead of fd_batch_create. Following the commit `2385d7b066` and `8e798e28f7`, for resource dependency tracking. Fixes: dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo with FD_MESA_DEBUG=inorder Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-25 18:46:19 -04:00
Hyunjun Ko	703271c22a	freedreno/ir3: take reg->num out of union in ir3_register To avoid wrong result when identifying the type of register. Ie. If the reg is an array, it might be identified as address or predicate register. Fixes: dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.6 Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-25 18:45:45 -04:00
Rob Clark	3c402d0dc2	freedreno/a6xx: disable unused groups Don't leave vsconst/fsconst group enabled if we switch to shader with no uniforms. Fixes: `abcdf5627a` freedreno/a6xx: move const emit to state group Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-25 18:38:53 -04:00
Rob Clark	d53074d3f1	freedreno: add useful assert Would have been useful to catch the problem fixed in `8e798e28f7` Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-25 18:38:53 -04:00
Alok Hota	edf38019a0	swr/rast: ignore CreateElementUnorderedAtomicMemCpy This function's API changed between LLVM 5 and 6. Compile errors occur when building with LLVM 6+ if LLVM 5 was used for a dist tarball CC: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107865 Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-25 11:05:59 -05:00
Alok Hota	8c872ac2e3	swr/rast: fix intrinsic/function for LLVM 7 compatibility Converted from x86 VFMADDPS intrinsic to generic LLVM intrinsic, and removed createInstructionSimplifierPass, which were both removed in LLVM 7.0.0 These changes combine patches we received from the community and our own internal patches Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> Tested-by: Chuck Atkins <chuck.atkins@kitware.com>	2018-10-25 10:32:27 -05:00
Rhys Perry	26ed0f0234	nvc0: increase NOUVEAU_TRANSFER_PUSHBUF_THRESHOLD to 1024 on Kepler+ Gives a +3.89% to +5.27% FPS improvement with Hitman and +2.73% to +2.82% FPS improvement with Dirt Rally on my GTX 1060. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-10-25 15:25:10 +01:00
Bas Nieuwenhuizen	d41c3cc013	radv: Emit enqueued pipeline barriers on event write. Since the CPU can read them we need to execute any GPU->CPU flushes before the event is written. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108524 Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-10-25 16:17:54 +02:00
Bas Nieuwenhuizen	9d40ec2cf6	radv: Add support for VK_KHR_driver_properties. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-25 16:14:43 +02:00
Eric Engestrom	e27902a261	util: use C99 declaration in the for-loop set_foreach() macro Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-10-25 12:43:18 +01:00
Eric Engestrom	bb84fa146f	util: use C99 declaration in the for-loop hash_table_foreach() macro Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-10-25 12:43:18 +01:00
Dylan Baker	3d261cf77b	gen: Add AMD_gpu_shader_int64.xml to tarball CC: Ian Romanick <ian.d.romanick@intel.com> CC: Marek Olšák <marek.olsak@amd.com> Fixes: `b3c17330e6` ("mesa: expose AMD_gpu_shader_int64") Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-10-24 11:29:30 -07:00
Dylan Baker	6d5fa65c74	gen: Add EXT_vertex_attrib_64bit.xml to dependency lists Which is also required to put it in the tarball, a requirement for building with meson from the tarball. CC: Ian Romanick <ian.d.romanick@intel.com> CC: Marek Olšák <marek.olsak@amd.com> Fixes: `263c962cfd` ("mesa: expose EXT_vertex_attrib_64bit") Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-10-24 11:29:29 -07:00
Eric Engestrom	edc06dd533	anv: move variable to proper scope and mark as MAYBE_UNUSED Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-24 18:16:20 +01:00
Eric Engestrom	ed5d65a6a1	anv: use snprintf() instead of memset()+strcpy() snprintf() guarantees that it will not write more chars than allowed, and that the string will be null-terminated, without the need to fill the whole thing with zeroes to begin with. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-24 18:15:56 +01:00
Eric Engestrom	33d757096d	anv: drop unused includes Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-24 18:15:05 +01:00
Dylan Baker	c4de8ba036	autotools: include intel_tiled_memcpy.c There are two problems with the fixed patch. First, it fails to create a dependency on the sourced .c file, so changes to intel_tiled_memcpy.c won't trigger a rebuild. It also doesn't get included in the dist tarball. Fixes: `11b1afdc92` ("i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-10-24 09:22:15 -07:00
Dylan Baker	43b0d5fa04	meson: fix formatting and add extra_files to i965 extra_files is just a nice way to tell certain IDEs (and those reading the file) that this file is also a dependency. Meson will use the .d file generated by the compiler to figure out what the target actually depends on. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-10-24 09:22:13 -07:00
Eduardo Lima Mitev	b0c427043b	ir3_compiler/nir: fix imageSize() for buffer-backed images GL_EXT_texture_buffer introduced texture buffers, which can be used in shaders through a new type imageBuffer. Because of how image access is implemented in freedreno, calling imageSize on an imageBuffer returns the size in bytes instead of texels, which is incorrect. This patch adds a division of the imageSize result by the bytes-per-pixel of the image format when the image is buffer-backed. Fixes all tests under dEQP-GLES31.functional.image_load_store.buffer.image_size.* v2: Pre-compute and submit the log2 of the image format's bpp as shader constant instead of emitting the LOG2 instruction in code. (Rob Clark) v3: Use ffs (find-first-bit) helper for computing log2 (Ilia Mirkin) Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-10-24 18:18:35 +02:00
Jose Fonseca	d9a04196d9	nir: Fix array initializer. Empty initializer is not standard C. This fixes MSVC build. Trivial.	2018-10-24 11:37:09 +01:00
Liviu Prodea	d99fda17c8	scons: Put to rest zombie texture_float build option. I found a remnant of texture_float build option that wasn't removed in commit `66673bef94` This patch removes it. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-10-24 11:10:17 +01:00
Alex Smith	6c56c1fbd4	anv: Allow presenting via a different GPU anv_GetPhysicalDeviceSurfaceSupportKHR will already return success for this, but anv_GetPhysicalDevice{Xcb,Xlib}PresentationSupportKHR do not. Apps which check for presentation support via the latter (all Feral Vulkan games at least) will therefore fail. This allows me to render on an Intel GPU and present to a display connected to an AMD card (tested HD 530 + Vega 64). v2: Rebase on current master. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-24 09:40:02 +01:00
Juan A. Suarez Romero	3112da346b	nir: fix nir_copy_propagation test Use nir_src_comp_as_uint() to read the proper second component, as nir_src_as_uint() returns the first one. v2: Use nir_src_comp_as_uint() [Jason] Fixes: `16870de8a0` ("nir: Use nir_src_is_const and nir_src_as_* in core code") Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108532 Tested-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-24 09:13:24 +02:00
Timothy Arceri	0ff1ccca25	radv: call nir_link_xfb_varyings() Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-10-24 08:21:29 +11:00
Timothy Arceri	c769ed10de	radv: move nir_lower_io_to_scalar_early() to radv_link_shaders() nir_lower_io_to_scalar_early() is really part of the link time optimisations. Moving it here allows the code to be simplified and also keeps the code easy to follow in the next patch. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-10-24 08:21:29 +11:00
Samuel Pitoiset	7c694cbfa4	nir: add linking helper nir_link_xfb_varyings() The linking opts shouldn't try removing or compacting XFB varyings in the consumer. To avoid this we copy the always_active_io flag from the producer. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-10-24 08:21:29 +11:00
Sagar Ghuge	0a7664fe8c	intel/compiler: Change src1 reg type to unsigned doubleword To have uniform behavior while disassembling send(c) instruction use register type of unsigned doubleword for src1 when message descriptor is immediate value. Bspec does not specify anything for src1 immediate default type. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>	2018-10-23 12:44:24 -07:00
Eduardo Lima Mitev	22ddd4988e	mesa/glformats: Remove redundant helper _mesa_base_format_component_count There exists _mesa_components_in_format() which already includes all cases handled in _mesa_base_format_component_count(). Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-10-23 21:29:15 +02:00
Jason Ekstrand	ecb7775e1c	nir/algebraic: Fix a typo in the bit size validation code The canon_bit_class and canon_var_class variables got switched. Fixes: `932c650e0b` "nir/algebraic: Loosen a restriction on variables" Reported-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-10-23 12:22:29 -05:00
Leo Liu	b75fb8ee36	amd/common: check DRM version 3.27 for JPEG decode JPEG was added after DRM version 3.26 Signed-off-by: Leo Liu <leo.liu@amd.com> Fixes: 4558758c51749(amd/common: add vcn jpeg ip info query) Cc: Boyuan Zhang <boyuan.zhang@amd.com> Cc: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-10-23 13:12:05 -04:00
Juan A. Suarez Romero	a8c2a6b0ac	docs: update calendar I'll take care of the 18.2 release series on Andres' behalf. CC: Andres Gomez <agomez@igalia.com> CC: Dylan Baker <dylan@pnwbakers.com> CC: Emil Velikov <emil.l.velikov@gmail.com>	2018-10-23 18:40:09 +02:00
Lionel Landwerlin	a8594887bc	intel/decoders: fix end of batch limit Pointer arithmetic... v2: s/4/sizeof(uint32_t)/ (Eric) v3: Give bytes to print_batch() in error_decode (Lionel) Make clear what values we're dealing with in error_decode (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-10-23 14:49:33 +01:00
Boyuan Zhang	55e7de7b19	radeonsi: enable vcn jpeg decode for raven Enable vcn jpeg decode for raven. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-10-23 08:50:02 -04:00
Boyuan Zhang	97c473bb29	winsys/amdgpu: add vcn jpeg cs support Add vcn jpeg cs support, align cs by no-op. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-10-23 08:50:02 -04:00
Boyuan Zhang	4558758c51	amd/common: add vcn jpeg ip info query Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-10-23 08:50:02 -04:00
Boyuan Zhang	6d2d910653	radeon/vcn: implement jpeg target buffer cmd Implement jpeg target buffer cmd by programming registers directly, since there is no firmware for VCN Jpeg decode. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2018-10-23 08:50:02 -04:00
Boyuan Zhang	0ee5630cfc	radeon/vcn: implement jpeg bitstream buffer cmd Implement jpeg bitstream buffer cmd by programming registers directly, since there is no firmware for VCN Jpeg decode. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2018-10-23 08:50:02 -04:00
Boyuan Zhang	9b478b0c7a	radeon/uvd: remove get mjpeg slice header Move the previous get_mjpeg_slice_header function and eoi from "radeon/vcn" to "st/va". Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-10-23 08:50:02 -04:00
Boyuan Zhang	4fc2368e3b	st/va: get mjpeg slice header Move the previous get_mjpeg_slice_header function and eoi from "radeon/vcn" to "st/va". Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-10-23 08:50:02 -04:00
Boyuan Zhang	c7a5ef26ad	radeon/vcn: add jpeg decode implementation Add a new file to handle VCN Jpeg decode specific functions. Use Jpeg specific cmd sending function in end_frame call. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-10-23 08:50:02 -04:00
Boyuan Zhang	40fceb55f3	radeon/vcn: separate send cmd call from end frame Use function pointer for sending cmd in end_frame call. By doing this, we can assign different cmd sending logics for Jpeg decode later. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-10-23 08:50:02 -04:00
Boyuan Zhang	4f1f128f8e	radeon/vcn: create cs based on ring type Add RING_VCN_JPEG for VCN Jpeg decode, and keep RING_VCN_DEC for other codecs. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-10-23 08:50:02 -04:00
Boyuan Zhang	f7116e4ff8	radeon/winsys: add vcn jpeg ring type Add a new ring type for vcn jpeg. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-10-23 08:50:02 -04:00
Boyuan Zhang	e7e68d15b5	radeon/vcn: add vcn jpeg decode interface Add VCN Jpeg decode interfaces and register defines. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-10-23 08:50:02 -04:00
Boyuan Zhang	6bc0a3a834	radeon/vcn: move radeon decoder define to header file Move radeon_decoder definition from "radeon_vcn_dec.c" to "radeon_vcn_dec.h", so that it can be included by other files later. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-10-23 08:50:02 -04:00
Boyuan Zhang	0f59e3f088	meson: update required amdgpu version to 2.4.95 VCN jpeg requires new hw ip Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-10-23 08:50:02 -04:00
Boyuan Zhang	2e768ade61	configure.ac: update libdrm amdgpu version to 2.4.95 VCN jpeg requires new hw ip Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-10-23 08:50:02 -04:00
Samuel Pitoiset	69c44de798	radv: fix btoi for R32G32B32 when the dest offset is not 0 Fixes: `593996bc02` ("radv: implement buffer to image operations for R32G32B32") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-23 14:29:26 +02:00
Scott D Phillips	54c823ec79	i965/miptree: Use cpu tiling/detiling when mapping Rename the (un)map_gtt functions to (un)map_map (map by returning a map) and add new functions (un)map_tiled_memcpy that return a shadow buffer populated with the intel_tiled_memcpy functions. Tiling/detiling with the cpu will be the only way to handle Yf/Ys tiling, when support is added for those formats. v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson) v3: Add units to parameter names of tile_extents (Nanley Chery) Use _mesa_align_malloc for the shadow copy (Nanley) Continue using gtt maps on gen4 (Nanley) v4: Use streaming_load_memcpy when detiling v5: (edited by Ken) Move map_tiled_memcpy above map_movntdqa, so it takes precedence. Add intel_miptree_access_raw, needed after rebasing on commit `b499b85b0f`. v6: refactor to changes done for sse41 separation (Tapani) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v5) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com>	2018-10-23 14:08:05 +03:00
Scott D Phillips	11b1afdc92	i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear The reference for MOVNTDQA says: For WC memory type, the nontemporal hint may be implemented by loading a temporary internal buffer with the equivalent of an aligned cache line without filling this data to the cache. [...] Subsequent MOVNTDQA reads to unread portions of the WC cache line will receive data from the temporary internal buffer if data is available. This hidden cache line sized temporary buffer can improve the read performance from wc maps. v2: Add mfence at start of tiled_to_linear for streaming loads (Chris) v3: add Android build support (Tapani) v4: squash 'fix i915: Fix streaming loads for intel_tiled_memcpy' separate sse41 to own static library (Tapani) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2) Reviewed-by: Matt Turner <mattst88@gmail.com> (v2) Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com>	2018-10-23 14:08:05 +03:00
Tapani Pälli	91d3a5d1a8	i965: expose type of memcpy instead of memcpy function itself There is currently no use of returned memcpy functions outside intel_tiled_memcpy. Patch changes intel_get_memcpy to return memcpy type instead of actual function. This makes it easier later to separate streaming load copy in to own static library. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-10-23 14:08:05 +03:00
Eric Engestrom	bc021be78d	util: use unsigned ints for bit operations Fixes errors thrown by GCC's Undefined Behaviour sanitizer (ubsan) every time this macro is used. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-10-23 11:44:02 +01:00
Eric Engestrom	17b03b5320	radv: s/abs/fabsf/ for floats Fixes: `a4c4efad89` "radv: Rework guard band calculation" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-23 11:43:51 +01:00
Eric Engestrom	8629d807aa	meson: drop option description relic `platforms` is no longer a comma-separated string, and some of our option descriptions are way too long already. Just drop the incorrect bit. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-10-23 11:43:51 +01:00
Jason Ekstrand	8b626a22b2	st/mesa: Record shader access qualifiers for images They're not required to be the same as the access flag on the image unit. For hardware that does shader image lowering based on the qualifier (Intel), it may be required for state setup. v2: (by Kenneth Graunke, incorporating feedback from Marek Olšák) - Reduce both access and shader_access to uint16_t to avoid making the pipe_image_view structure larger. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-23 02:36:24 -07:00
Jason Ekstrand	bf441d22a7	nir/algebraic: Provide descriptive asserts for bit size checks This will hopefully make debugging opt_algebraic bit-size compile failures easier. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-10-22 16:00:18 -05:00
Jason Ekstrand	932c650e0b	nir/algebraic: Loosen a restriction on variables Previously, we would fail if a variable had an assigned but unknown bit size X and we tried to assign it an actual bit size. However, this is ok because, at the time we do the search, the variable does have an actual bit size and it will match X because of the NIR rules. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-10-22 16:00:18 -05:00
Jason Ekstrand	ea9e651423	nir/algebraic: A bit of validation refactoring We rename some local variables in validate() to be more readable and plumb the var through to get/set_var_bit_class instead of the var index. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-10-22 16:00:18 -05:00
Jason Ekstrand	641f4be8e8	nir/algebraic: Make internal classes str-able Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-10-22 16:00:18 -05:00
Jason Ekstrand	6068be543b	nir/algebraic: Generalize an optimization There's nothing boolean about (a | ~a) ~> -1 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-10-22 16:00:18 -05:00
Jason Ekstrand	69618a8678	nir/algebraic: Use bool internally instead of bool32 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-10-22 16:00:18 -05:00
Kenneth Graunke	00103db04a	intel: Fix decoding for partial STATE_BASE_ADDRESS updates. STATE_BASE_ADDRESS only modifies various bases if the "modify" bit is set. Otherwise, we want to keep the existing base address. Iris uses this for updating Surface State Base Address while leaving the others as-is. v2: Also update aubinator_viewer_decoder (caught by Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-22 13:38:44 -07:00
Jason Ekstrand	16870de8a0	nir: Use nir_src_is_const and nir_src_as_* in core code Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-22 14:24:15 -05:00
Jason Ekstrand	ce36f412c9	nir/search_helpers: Use nir_src_is_const and friends This not only makes them safe for more bit sizes but it also fixes a bug in is_zero_to_one where it would return true for constant NaN. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-22 14:24:15 -05:00
Jason Ekstrand	7bae7828aa	nir/search: Use nir_src_is_const and friends Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-22 14:24:15 -05:00
Jason Ekstrand	bca5c2c688	nir: Add some new helpers for working with const sources Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-22 14:24:15 -05:00
Alyssa Rosenzweig	e0c267c752	mesa/st: Only call nir_lower_io_to_scalar_early on scalar ISAs On scalar ISAs, nir_lower_io_to_scalar_early enables significant optimizations. However, on vector ISAs, it is counterproductive and impedes optimal codegen. This patch only calls nir_lower_io_to_scalar_early for scalar ISAs. It appears that at present there are no upstreamed drivers using Gallium, NIR, and a vector ISA, so for existing code, this should be a no-op. However, this patch is necessary for the upcoming Panfrost (Midgard) and Lima (Utgard) compilers, which are vector. With this patch, Panfrost is able to consume NIR directly, rather than TGSI with the TGSI->NIR conversion. For how this affects Lima, see https://www.mail-archive.com/mesa-dev@lists.freedesktop.org/msg189216.html Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-10-22 20:37:07 +02:00
Dylan Baker	4e785fb383	meson: don't require libelf for r600 without LLVM r600 doesn't have a hard requirement on LLVM, and therefore doesn't have a hard requirement on libelf. Currently the logic doesn't allow that however. Distro-bug: https://bugs.gentoo.org/669058 Fixes: `5060c51b6f` ("meson: build r600 driver") Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-10-22 11:29:55 -07:00
Jason Ekstrand	ca4e465f7d	anv,radv: Trivially expose two new VK_GOOGLE extensions This patch exposes support for the following two extensions: * VK_GOOGLE_decorate_string * VK_GOOGLE_hlsl_functionality1 There's nothing for the driver to do; it's all handled in spirv_to_nir. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107971 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-22 10:50:20 -05:00
Jason Ekstrand	891886da2f	spirv: Add no-op support for VK_GOOGLE_hlsl_functionality1 This extension adds two new decorations which carry meaning only for HLSL shaders. They are expected to be handled by higher level layers and can be ignored by implementations. However, it does save the client a bit of work if the implementation safely ignores them instead of the client having to strip them out of the SPIR-V in order for it to be valid. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-22 10:49:53 -05:00
Jason Ekstrand	5f0322d5c3	spirv: Add support for SPV_GOOGLE_decorate_string Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-22 10:49:53 -05:00
Rob Herring	2bb05d70af	android: Build kms_swrast for the Android platform Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-22 13:08:17 +01:00
Connor Abbott	27fe3f5b5a	ac: Fix loading a dvec3 from an SSBO The comment was wrong, since the loop above casts to a type with the correct bitsize already. Fixes: `7e7ee82698` ("ac: add support for 16bit buffer loads") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-22 09:44:51 +02:00
Connor Abbott	59535b05cf	ac: Introduce ac_build_expand() And implement ac_build_expand_to_vec4() on top of it. Fixes: `7e7ee82698` ("ac: add support for 16bit buffer loads") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-22 09:44:51 +02:00
Eduardo Lima Mitev	fdd926d5b2	ir3/nir: Set up image_dims consts for image_deref_size intrinsic too `nir_intrinsic_image_deref_size` is not being considered during scan for driver constants, so image constants are not emitted if a shader only ever queries the size of an image (no load, store, atomic op, etc). This is unlikely, but possible. Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-10-21 21:29:18 +02:00
Karol Herbst	2d235d69c8	nv50/ir: fix ConstantFolding::createMul for 64 bit muls Fixes: `2f52925f5c` "nv50/ir: move a * b -> a << log2(b) code into createMul()" Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-10-20 03:00:04 +02:00
Sonny Jiang	bfb2b90246	radeonsi: Disable clear_state with radeon kernel driver Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2018-10-19 16:16:57 -04:00
Kenneth Graunke	f91f9bab83	meson: Add -Werror=return-type when supported. This warning detects non-void functions with a missing return statement, return statements with a value in void functions, and functions with a bogus return type that ends up defaulting to int. It's already enabled by default with -Wall. Generally, these are fairly serious bugs in the code, which developers would like to notice and fix immediately. This patch promotes it from a warning to an error, to help developers catch such mistakes early. I would not expect this warning to change much based on the compiler version, so hopefully it won't become a problem for packagers/builders. See the GCC documentation or 'man gcc' for more details: https://gcc.gnu.org/onlinedocs/gcc-7.3.0/gcc/Warning-Options.html#index-Wreturn-type Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-19 10:16:57 -07:00
Jason Ekstrand	0d380af809	anv: Define trampolines as the weak functions Instead of having weak references to the anv functions and separate trampoline functions with their own dispatch table, just make the trampoline functions weak. This gets rid of a dispatch table and potentially lets the compiler delete the unused weak function. The end result is a reduction in the .text section of 5.7K and a reduction in the .data section of 1.4K. Before: text data bss dec hex filename 3190329 282232 8960 3481521 351fb1 _install/lib64/libvulkan_intel.so After: text data bss dec hex filename 3184548 280792 8960 3474300 35037c _install/lib64/libvulkan_intel.so Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-19 11:52:00 -05:00
Juan A. Suarez Romero	f8e789d2ac	docs: fix typo in 18.2.3 release notes link Fixes: `86b4bd52dc` ("docs: update calendar, add news item and link release notes for 18.2.3") Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-10-19 18:48:12 +02:00
Juan A. Suarez Romero	86b4bd52dc	docs: update calendar, add news item and link release notes for 18.2.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-10-19 18:45:41 +02:00
Juan A. Suarez Romero	01f5d37d3e	docs: add sha256 checksums for 18.2.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `27fd12857b`)	2018-10-19 18:43:49 +02:00
Juan A. Suarez Romero	e30970e2cd	docs: add release notes for 18.2.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `d219361b42`)	2018-10-19 18:43:48 +02:00
Jose Fonseca	45bacc4b63	scons: Remove gles option. It's broken, and WGL state tracker is always built with GLES support nowadays. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-10-19 16:50:26 +01:00
Bas Nieuwenhuizen	68c7833540	radv: Fix WSI & PCI bus info initialization order. Trying to access the bus info before it is initialized is not going to work. Fixes: `baa38c144f` "vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108491 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Andre Heider <a.heider@gmail.com>	2018-10-19 13:24:19 +02:00
Marek Olšák	69a87b5d47	radeonsi: fix a typo in a comment in emit_guardband	2018-10-18 18:01:22 -04:00
Marek Olšák	2a26b1c045	radeonsi: fix gnome-shell crash I wasn't expecting to get viewports with the center having negative coordinates. Broken by: `6cc79e4411`	2018-10-18 17:55:44 -04:00
Jason Ekstrand	8c0b9fdfa1	Revert "anv: Stop generating weak references for instance entrypoints" This reverts commit `00bb42105d`. It was not as well thought out as I had intended and broke the build when VK_KHR_display is disabled in the build.	2018-10-18 15:36:26 -05:00
Marek Olšák	77bcbe712e	radeonsi: clamp point size to the limit This fixes dEQP-GLES2.functional.rasterization.limits.points. Broken by: `ea039f789d` Tested-by: Jakob Bornecrantz <jakob@collabora.com>	2018-10-18 16:08:56 -04:00
Marek Olšák	eae8f49fc6	radeonsi: fix a VGT hang with primitive restart on Polaris10 and later Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org> Tested-by: Jakob Bornecrantz <jakob@collabora.com>	2018-10-18 16:08:56 -04:00
Marek Olšák	165817d47f	radeonsi: fix a deadlock due to partially-initialized context on CI	2018-10-18 16:08:56 -04:00
Jan Vesely	06bf56725d	radeonsi: Bump number of allowed global buffers to 32 Fixes assertion failure/crash when running luxmark/luxball on clover. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108272 CC: mesa-stable@lists.freedesktop.org Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-18 16:02:42 -04:00
Andres Rodriguez	e71a87775e	radv: fix check for perftest options size It was using the debug options array size. CC: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-18 15:42:20 -04:00
Marek Olšák	6cc79e4411	radeonsi: fix incorrect hw screen offset and guardband computation It resulted in assertion failures or incorrect rendering. Broken by: `9e182b8313`	2018-10-18 14:42:42 -04:00
Jason Ekstrand	baa38c144f	vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching This lets us avoid passing the DRM fd around all over the place and gets us closer to layer utopia. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-18 11:29:00 -05:00
Michel Dänzer	c20ba1be18	loader/dri3: Also wait for front buffer fence if we triggered it In that case, we have to wait for the fence to synchronize with the corresponding drawing we triggered in the X server. Fixes incorrect display with the i965 driver and some applications, e.g. solvespace. Bugzilla: https://bugs.freedesktop.org/108097 Fixes: `aefac10fec` "loader/dri3: Only wait for back buffer fences in dri3_get_buffer" Tested-by: Sergii Romantsov <sergii.romantsov@globallogic.com>	2018-10-18 16:52:06 +02:00
Jason Ekstrand	00bb42105d	anv: Stop generating weak references for instance entrypoints We don't need weak references to instance entrypoints because we never have more than one of each so we don't need the NULL fall-back. This also helps us avoid forgetting things because we now get link errors for missing instance entrypoints. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-18 09:17:39 -05:00
Jason Ekstrand	7c65cf9844	vulkan/wsi: Implement GetPhysicalDevicePresentRectanglesKHR This got missed during 1.1 enabling because it was defined as an interaction between device groups and WSI and it wasn't obvious it was in the delta. The idea behind it is that it's supposed to provide a hint to the application in a multi-GPU setup to indicate which regions of the screen are being scanned out by which GPU so a multi-device split-screen rendering application can render each part of the screen on the GPU that will be presenting it and avoid extra bus traffic between GPUs. On a single-GPU setup or one which doesn't support this present mode, we need to do something. We choose to return the window size (or a max-size rect) if the compositor, X server, or crtc is associated with the given physical device and zero rectangles otherwise. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-18 09:17:39 -05:00
Jason Ekstrand	7629c00557	vulkan/wsi: Store the instance allocator in wsi_device We already have wsi_device and we know the instance allocator at wsi_device_init time so there's no need to pass it into the physical device queries. This also fixes a memory allocation domain bug that can occur if CreateSwapchain gets called prior to any queries (not likely) in which case the cached connection gets allocated off the device instead of the instance. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-18 09:17:39 -05:00
Michał Janiszewski	0ef50ecc69	st/xlib: Use more appropriate include guard Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-18 11:03:04 +01:00
Michał Janiszewski	bcc613acc1	gallium: Fix mismatched ifdef-guards Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-18 11:03:03 +01:00
Gert Wollny	74adc624b6	softpipe: dynamically allocate space for immediate constants The number of immediate constants was fixed and the size check was only done by means of an assertion. Given this, a shader that emits more immediate constants would result in a memory corruption when mesa is built in release mode. Instead of using this fixed limit allocate the space dynamically, let it grow as needed, and also remove the unused ImmArray. Fixes: dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.1 Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-10-18 10:59:51 +02:00
Timothy Arceri	3a95396f3c	radv: use nir_shrink_vec_array_vars() Totals from affected shaders: SGPRS: 1096 -> 1096 (0.00 %) VGPRS: 1192 -> 1056 (-11.41 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 100940 -> 94384 (-6.49 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 100 -> 112 (12.00 %) Wait states: 0 -> 0 (0.00 %) All affected shaders are from Batman Arkham City. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-18 15:04:09 +11:00
Timothy Arceri	8086fa1bcd	radv: use nir_split_array_vars() We call in the opt loop in case another pass results in an array with indirect access being turned into direct access. Totals from affected shaders: SGPRS: 512 -> 496 (-3.12 %) VGPRS: 456 -> 452 (-0.88 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 40040 -> 39664 (-0.94 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 41 -> 43 (4.88 %) Wait states: 0 -> 0 (0.00 %) All affected shaders are from Batman Arkham City. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-18 15:04:09 +11:00
Timothy Arceri	06675711e7	radv: use nir_opt_find_array_copies() Totals from affected shaders: SGPRS: 1112 -> 1112 (0.00 %) VGPRS: 1492 -> 1196 (-19.84 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 112172 -> 101316 (-9.68 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 93 -> 98 (5.38 %) Wait states: 0 -> 0 (0.00 %) All affected shaders are from "Batman: Arkham City" over DXVK. The pass detects that the temporary array created by DXVK for storing TCS inputs is a copy of the input arrays and allows us to avoid copying all of the input data and then indirecting on it with if-ladders, instead we just do indirect indexing. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-18 15:04:09 +11:00
Timothy Arceri	9d5b106b2e	radv: use nir_opt_copy_prop_vars and nir_opt_dead_write_vars Totals from affected shaders: SGPRS: 2856 -> 2856 (0.00 %) VGPRS: 3236 -> 3248 (0.37 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 236560 -> 233548 (-1.27 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 277 -> 283 (2.17 %) Wait states: 0 -> 0 (0.00 %) Even in the cases where we have increased VGPR use it appears the NIR is improved significantly. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-18 15:04:09 +11:00
Keith Packard	67a2c1493c	vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v5] Offers three clocks, device, clock monotonic and clock monotonic raw. Could use some kernel support to reduce the deviation between clock values. v2: Ensure deviation is at least as big as the GPU time interval. v3: Set device->lost when returning DEVICE_LOST. Use MAX2 and DIV_ROUND_UP instead of open coding these. Delete spurious TIMESTAMP in radv version. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> v4: Add anv_gem_reg_read to anv_gem_stubs.c Suggested-by: Jason Ekstrand <jason@jlekstrand.net> v5: Adjust maxDeviation computation to max(sampled_clock_period) + sample_interval. Suggested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-17 20:10:15 -07:00
Topi Pohjolainen	a11cafbd7a	intel/compiler/icl: Use invocation id bits 22:16 instead of 23:17 Identifier bits in the dispatch header have changed. See Bspec: SINGLE_PATCH Payload: 3D Pipeline Stages - 3D Pipeline Geometry - Hull Shader (HS) Stage IVB+ - Payloads IVB+ Fixes: KHR-GL46.tessellation_shader.tessellation_shader_tc_barriers.barrier_guarded_read_write_calls Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-10-17 21:19:57 +03:00
Neil Roberts	a9475d9337	Fix setting indent-tabs-mode in the Emacs .dir-locals.el files Some of the .dir-locals.el had the wrong name for the truthy value so it wasn’t setting indent-tabs-mode. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-10-17 19:03:08 +02:00
Rob Clark	d27b1c83b9	freedreno/a6xx: don't allocate binning rb Now that a single cmdstream is used for both binning and draw passes, we can skip allocation of cmdstream buffer for binning. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:49 -04:00
Rob Clark	24d57a6d8f	freedreno/a6xx: single cmdstream for draw+binning Now that state which is different for draw vs binning pass is split out into different state-groups with appropriate enable_mask (so the appropriate one is chosen for draw vs binning), switch over to using a single cmdstream for both passes. This should significantly lower draw overhead for CPU bound benchmarks. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:49 -04:00
Rob Clark	72f6164fef	freedreno/a6xx: split binning vs draw program stateobj's Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:49 -04:00
Rob Clark	3313d693af	freedreno/a6xx: split VBO state into binning/draw variants Blob seems to manage to use same input registers for BS (binning pass) vs VS (draw pass) shaders, so it can use the same VBO state for both. We can't quite do that yet, so split them. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:49 -04:00
Rob Clark	b23fc4cacb	freedreno/a6xx: move VBO state to stateobj Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:49 -04:00
Rob Clark	e194056832	freedreno/a6xx: move ZSA state to stateobj Step towards single cmdstream, where we need different state-group-id's for binning vs draw ZSA state. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	a50a9a44e8	freedreno/a6xx: remove vismode param We don't need to keep this IGNORE_VISIBILITY in binning pass. Prep work for using single cmdstream for both draw and binning passes. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	d9dbc9c21f	freedreno/ir3: move binning-pass fixup for a6xx+ Move this to after ir3_cp (which can add lowered immediates to the const state) for a6xx+, to ensure the uniform state matches between binning and vertex shaders. This way we can emit just a single VS_CONST state- group when we re-use single cmdstream for both binning and draw passes. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	1a51c4a87e	freedreno/a6xx: a bit more state emit cleanup Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	2ffc79c7d1	freedreno/a6xx: move framebuffer state emit to emit_mrt() No point in checking this per-draw, since framebuffer change means new batch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	5894f37b85	freedreno/a6xx: small emit_mrt() cleanup On a6xx, this is only used for pfb->cbufs so we can just directly pass the pfb state. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	b4e94af37d	freedreno/a6xx: use program cache Use the in-memory cache to construct shader program state and re-use it on subsequent draws, to lower driver overhead. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	1d7fbe2cd1	freedreno/ir3: shader variant cache Cache that maps gallium hwcso (in this case, 'struct ir3_shader') plus shader variant key to a generation specific state object. This could eventually replace the linked list of shader variants, but for now it lets us re-use the work currently done in fdN_program_emit() Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	2e9c08c0bc	freedreno/ir3: move binning_pass out of shader variant key Prep work for a following patch, that introduces a cache to map from program state (all shader stages) plus variant key to pre-baked hw state (which could be emit'd via CP_SET_DRAW_STATE, for example). To do that, we really want the variant key to be immutable, and to treat the binning pass shader as an extra shader stage, rather than as a VS variant. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	8b1a3b5dde	freedreno/ir3: track # of samplers used by shader This is useful for a6xx to avoid program state from depending on bound tex/samp state. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	1b9d69410c	freedreno/a6xx: texture state obj Unfortunately gallium doesn't match what the hw wants perfectly here, in using a separate CSO for each texture/sampler. So we have to use a hash table to map the collection of texture/samplers to hw state object. We probably could use separate hw state objects for texture and sampler state, but mesa/st tends to update the tex and samp state together. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	e8606b11dd	freedreno: add resource seqno Intended to be something more compact than a 64b pointer, which could be used as a key into hashtables. Prep work for texture state objects. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	abcdf5627a	freedreno/a6xx: move const emit to state group Eventually we want to move nearly everything, but no other state depends on const state, so this is the easiest one to move first. For webgl aquarium, this reduces GPU load by about 10%, since for each fish it does a uniform upload plus draw. Fish frequently are visible in only a single tile, so this skips the uniform uploads for other tiles. The additional step of avoiding WFI's when using CP_SET_DRAW_STATE seems to be worth an additional 10% gain for aquarium. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	a398d26fd2	freedreno/a6xx: add infrastructure for CP_DRAW_STATE Add helper to add state-groups to emit, and code to emit CP_DRAW_STATE packet if we have any state-groups. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	ec717fc629	freedreno: reduce resource dependency tracking overhead Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Neil Roberts	ee61790daf	freedreno: Remove the Emacs mode lines These are not necessary because the corresponding settings are set via the .dir-locals.el file anyway. Most of them were missing a ‘:’ after “tab-width” which was making Emacs display an annoying warning whenever you open the file. This patch was made with: sed -ri '/-\*- mode:/,/^$/d' \ $(find src/gallium/{drivers,winsys} -name \*.\[ch\] \ -exec grep -l -- '-\*- mode:' {} \+) Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Neil Roberts	afe640b360	freedreno: Fix the Emacs indentation configuration file The .dir-locals.el had the wrong name for the truthy value so it wasn’t setting indent-tabs-mode. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Hyunjun Ko	8e798e28f7	freedreno: allocate batches from the cache in launch_grid Needs to allocate batches from the cache so that it could get a valid index and make resource dependency tracking right. In addition this fixes assertion on debug build since the commit `1a40faa8` landed. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Hyunjun Ko	2385d7b066	freedreno: adds nondraw param to fd_bc_alloc_batch Needs to specify nondraw when creating a batch through fd_bc_alloc_batch since it'd better create a batch through it rather than fd_batch_create. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	9e6019bd46	freedreno/a6xx: remove fd6_emit_render_cntl() It was dead code carried over from a5xx Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	835cb06965	freedreno/ir3: fix broken texcoord inputs TODO not sure if this is best solution, but current logic is broken for texcoord inputs. It is definitely the simplest solution. Fixes: `1a24f51966` freedreno/ir3: ignore unused inputs Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Rob Clark	cbf9fe50b5	freedreno: fix off-by-one error in BEGIN_RING() Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-17 12:44:48 -04:00
Marek Olšák	669dd22983	util: document a limitation of util_fast_udiv32 trivial	2018-10-17 12:27:58 -04:00
Matt Turner	58a51d0a67	i965/fs: Add 64-bit int immediate support to dump_instructions() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-10-16 17:48:17 -07:00
Marek Olšák	fcc70e4855	radeonsi: track context rolls better for the Vega scissor bug workaround We should get fewer context rolls with the SET_CONTEXT_REG optimization, but it would have been for nothing if the scissor state rolled the context anyway. Don't emit the scissor state if there is no context roll.	2018-10-16 17:23:25 -04:00
Marek Olšák	25ddb15cfe	radeonsi: emit sample locations for 1xAA only when the hw bug is present	2018-10-16 17:23:25 -04:00
Marek Olšák	9b331e462e	radeonsi: use compute shaders for clear_buffer & copy_buffer Fast color clears should be much faster. Also, fast color clears on evicted buffers should be 200x faster on GFX8 and older.	2018-10-16 17:23:25 -04:00
Marek Olšák	5030adcbe0	radeonsi: use copy_buffer in buffer_do_flush_region directly	2018-10-16 17:23:25 -04:00
Marek Olšák	0b40fbc879	radeonsi: use faster integer division for instance divisors We know the divisors when we upload them, so instead we can precompute and upload division factors derived from each divisor. This fast division consists of add, mul_hi, and two shifts, and we have to load 4 dwords instead of 1. This probably won't affect any apps.	2018-10-16 17:23:25 -04:00
Marek Olšák	bfc795670e	ac: add helpers for fast integer division by a constant	2018-10-16 17:23:25 -04:00
Marek Olšák	ea039f789d	radeonsi: use higher subpixel precision (QUANT_MODE) for smaller viewports	2018-10-16 15:28:22 -04:00
Marek Olšák	4fd8d2df9c	radeonsi: move emission of PA_SU_VTX_CNTL into emit_guardband We'll modify the quant mode there, which also affects the guardband computation.	2018-10-16 15:28:22 -04:00
Marek Olšák	41a6c3de1f	radeonsi: don't re-upload the sample position constant buffer repeatedly	2018-10-16 15:28:22 -04:00
Marek Olšák	b94824c787	radeonsi: set PA_SU_PRIM_FILTER_CNTL optimally	2018-10-16 15:28:22 -04:00
Marek Olšák	9e182b8313	radeonsi: center viewport to improve guardband clipping for high resolutions This will be more useful when we change the quant mode to increase subpixel precision and decrease the viewport range (which might not be possible if the viewport is not centered in the viewport range).	2018-10-16 15:28:22 -04:00
Marek Olšák	fedc1fda30	radeonsi: save raster config in screen, add se_tile_repeat	2018-10-16 15:28:22 -04:00
Marek Olšák	ac76aeef20	radeonsi: switch back to standard DX sample positions Apps may rely on them.	2018-10-16 15:28:22 -04:00
Marek Olšák	67f02cf810	radeonsi: add GDS support to CP DMA	2018-10-16 15:28:22 -04:00
Marek Olšák	0d05581578	radeonsi: rename si_gfx_* functions to si_cp_* and write_event_eop -> release_mem	2018-10-16 15:28:22 -04:00
Marek Olšák	6e1cf6532d	radeonsi: make si_gfx_write_event_eop more configurable	2018-10-16 15:28:22 -04:00
Sergii Romantsov	0fa9e6d7b3	anv/skylake: disable ForceThreadDispatchEnable On Skylake enabling of ForceThreadDispatchEnable causes gpu-hang. -v2: enabling of ForceThreadDispatchEnable is only for gen8, for gen9 and higher reverted enabling of PixelShaderHasUAV. -v3 (Jason Ekstrand): Rework the comments a bit. CC: Jason Ekstrand <jason.ekstrand@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107941 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107760 Fixes: `79270d2140` (anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAV) Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-16 13:20:51 -05:00
Lionel Landwerlin	322a919a41	anv: Implement VK_EXT_pci_bus_info Even though Intel GPUs are always at the same PCI location, all the info we need is already provided by libdrm. Let's be future proof. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-16 12:47:55 +01:00
Jose Fonseca	8550be7a2f	appveyor: Cache pip's cache files. It should speed up the Python packages installation. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-10-16 09:41:14 +01:00
Jose Fonseca	bfb8afb14d	appveyor: Update to newer Mako/winflexbison versions. As that's what most people are bound to use. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-10-16 09:41:12 +01:00
Jose Fonseca	b94f9cd8f9	appveyor: Update to MSVC 2017. That's what we (and I suppose most people out there) are using now. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-10-16 09:41:07 +01:00
Samuel Pitoiset	647c2b90e9	radv: disable VK_SUBGROUP_FEATURE_VOTE_BIT This feature isn't used for now, so disable it until wwm is fixed in LLVM. Fixes dEQP-VK.subgroups.vote.graphics.subgroupallequal* https://bugs.freedesktop.org/show_bug.cgi?id=108115 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-16 10:24:19 +02:00
Samuel Pitoiset	593996bc02	radv: implement buffer to image operations for R32G32B32 This should fix rendering issues with Batman Arkham City. We will probably need to implement itob and itoi at some point, but currently nothing hits these paths. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107765 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-16 09:22:38 +02:00
Alex Smith	ca83d51cfb	ac/nir: Use context-specific LLVM types LLVMInt*Type() return types from the global context and therefore are not safe for use in other contexts. Use types from our own context instead. Fixes frequent crashes seen when doing multithreaded pipeline creation. Fixes: `4d0b02bb5a` "ac: add support for 16bit load_push_constant" Fixes: `7e7ee82698` "ac: add support for 16bit buffer loads" Cc: "18.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-10-16 08:18:24 +01:00
Vadym Shovkoplias	ad558408ff	glsl: Check the subroutine associated functions names Adding compile time check for subroutine functions with the same names. Similar check for intrastage linking was already landed in commit `5f0567a4f6`. From Section 6.1.2 (Subroutines) of the GLSL 4.00 specification "A program will fail to compile or link if any shader or stage contains two or more functions with the same name if the name is associated with a subroutine type." Fixes: * no-overloads.vert Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108109 Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-10-16 08:15:21 +03:00
Vadym Shovkoplias	d2ea3d4a76	glsl/linker: Change the format of spec quotation Also there is no "OpenGL ES Shading Language 4.00" spec, so change it to GLSL 4.00 spec. Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-10-16 08:15:21 +03:00
Dave Airlie	ff281e6204	nir: fix clip cull lowering to not assert if GLSL already lowered. If GLSL has already done the lowering, we'd rather not crash in this pass. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-10-15 18:53:48 -07:00
Kenneth Graunke	5bd8369681	i965: Add PCI IDs for new Amberlake parts that are Coffeelake based See commit c0c46ca461f136a0ae1ed69da6c874e850aeeb53 in the Linux kernel, where José Roberto de Souza added this new PCI ID there. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2018-10-15 18:10:27 -07:00
Kenneth Graunke	8f8111646c	intel: disable FS IR validation in release mode. We probably don't need to iterate, fprintf, and abort in release mode. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-10-15 18:10:27 -07:00
Caio Marcelo de Oliveira Filho	b3c6146925	nir: Copy propagation between blocks Extend the pass to propagate the copies information along the control flow graph. It performs two walks, first it collects the vars that were written inside each node. Then it walks applying the copy propagation using a list of copies previously available. At each node the list is invalidated according to results from the first walk. This approach is simpler than a full data-flow analysis, but covers various cases. If derefs are used for operating on more memory resources (e.g. SSBOs), the difference from a regular pass is expected to be more visible -- as the SSA copy propagation pass won't apply to those. A full data-flow analysis would handle more scenarios: conditional breaks in the control flow and merge equivalent effects from multiple branches (e.g. using a phi node to merge the source for writes to the same deref). However, as previous commentary in the code stated, its complexity 'rapidly get out of hand'. The current patch is a good intermediate step towards more complex analysis. The 'copies' linked list was modified to use util_dynarray to make it more convenient to clone it (to handle ifs/loops). Annotated shader-db results for Skylake: total instructions in shared programs: 15105796 -> 15105451 (<.01%) instructions in affected programs: 152293 -> 151948 (-0.23%) helped: 96 HURT: 17 All the HURTs and many HELPs are one instruction. Looking at pass by pass outputs, the copy prop kicks in removing a bunch of loads correctly, which ends up altering what other other optimizations kick. In those cases the copies would be propagated after lowering to SSA. In few HELPs we are actually helping doing more than was possible previously, e.g. consolidating load_uniforms from different blocks. Most of those are from shaders/dolphin/ubershaders/. 
total cycles in shared programs: 566048861 -> 565954876 (-0.02%) cycles in affected programs: 151461830 -> 151367845 (-0.06%) helped: 2933 HURT: 2950 A lot of noise on both sides. total loops in shared programs: 4603 -> 4603 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 11085 -> 11073 (-0.11%) spills in affected programs: 23 -> 11 (-52.17%) helped: 1 HURT: 0 The shaders/dolphin/ubershaders/12.shader_test was able to pull a couple of loads from inside if statements and reuse them. total fills in shared programs: 23143 -> 23089 (-0.23%) fills in affected programs: 2718 -> 2664 (-1.99%) helped: 27 HURT: 0 All from shaders/dolphin/ubershaders/. LOST: 0 GAINED: 0 The other generations follow the same overall shape. The spills and fills HURTs are all from the same game. shader-db results for Broadwell. total instructions in shared programs: 15402037 -> 15401841 (<.01%) instructions in affected programs: 144386 -> 144190 (-0.14%) helped: 86 HURT: 9 total cycles in shared programs: 600912755 -> 600902486 (<.01%) cycles in affected programs: 185662820 -> 185652551 (<.01%) helped: 2598 HURT: 3053 total loops in shared programs: 4579 -> 4579 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 80929 -> 80924 (<.01%) spills in affected programs: 720 -> 715 (-0.69%) helped: 1 HURT: 5 total fills in shared programs: 93057 -> 93013 (-0.05%) fills in affected programs: 3398 -> 3354 (-1.29%) helped: 27 HURT: 5 LOST: 0 GAINED: 2 shader-db results for Haswell: total instructions in shared programs: 9231975 -> 9230357 (-0.02%) instructions in affected programs: 44992 -> 43374 (-3.60%) helped: 27 HURT: 69 total cycles in shared programs: 87760587 -> 87727502 (-0.04%) cycles in affected programs: 7720673 -> 7687588 (-0.43%) helped: 1609 HURT: 1416 total loops in shared programs: 1830 -> 1830 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 1988 -> 1692 (-14.89%) spills in affected programs: 296 -> 0 helped: 1 HURT: 0 total fills in shared programs: 2103 -> 1668 (-20.68%) fills in affected programs: 438 -> 3 (-99.32%) helped: 4 HURT: 0 LOST: 0 GAINED: 1 v2: Remove the DISABLE prefix from tests we now pass. v3: Add comments about missing write_mask handling. (Caio) Add unreachable when switching on cf_node type. (Jason) Properly merge the component information in written map instead of replacing. (Jason) Explain how removal from written arrays works. (Jason) Use mode directly from deref instead of getting the var. (Jason) v4: Register the local written mode for calls. (Jason) Prefer cf_node instead of node. (Jason) Clarify that remove inside iteration only works in backward iterations. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho	dc349f07b5	nir: Take call instruction into account in copy_prop_vars Calls are not used yet (functions are inlined), but since new code is already taking them into account, do it here too. The convention here and in other places is that no writable memory is assumed to remain unchanged, as well as global variables. Also, explicitly state the modes affected (instead of using the reverse logic) in one of the apply_for_barrier_modes calls. Suggested by Jason. v2: Consider local vars used by a call to be conservative, SPIR-V has such cases. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho	797f01c220	nir: Add tests for copy propagation of derefs Also tests for removal of redundant loads, that we currently handle as part of the copy propagation. Note some tests involve multiple blocks and are currently DISABLED because they (expectedly) fail. v2: Add missing DISABLED prefix to "multi block" tests. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho	4dfa7adc10	nir: Remove handling of dead writes from copy_prop_vars These are covered by another pass now. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho	c20dd1f77c	intel/nir, freedreno/ir3: Use the separated dead write vars pass No changes to shader-db for intel. No changes to shader-db expected for freedreno. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho	cb126cf67a	nir: Separate dead write removal into its own pass Instead of doing this as part of the existing copy_prop_vars pass. Separation makes it easier to expand the scope of both passes to be more than per-block. For copy propagation, the information about valid copies comes from previous instructions; while the dead write removal depends on information from later instructions ("has any instruction used this deref before it is overwritten?"). Also change the tests to use this pass (instead of copy prop vars). Note that the disabled tests continue to fail, since the standalone pass is still per-block. v2: Remove entries from dynarray instead of marking items as deleted. Use foreach_reverse. (Caio) (all from Jason) Do not cache nir_deref_path. Not worth it for this patch. Clear unused writes when hitting a call instruction. Clean up enumeration of modes for barriers. Move metadata calls to the inner function. v3: For copies, use the vector length to calculate the mask. (all from Jason) Use nir_component_mask_t when applicable. Rename functions for clarity. Consider local vars used by a call to be conservative (SPIR-V has such cases). Comment and assert the assumption that stores and copies are always to a deref that ends with a vector or scalar. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho	a02fd7000d	nir: Add tests for dead write elimination Note at the moment the pass called is nir_opt_copy_prop_vars, because dead write elimination is implemented there. Also added tests that involve identifying dead writes in multiple blocks (e.g. the overwrite happens in another block). Those currently fail as expected, so are marked to be skipped. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho	bbda2a17f7	nir: Add test file for vars related passes Add basic helpers for doing tests on the vars related optimization passes. The main goal is to lower the barrier to create tests during development and debugging of the passes. Full coverage is not a requirement. v2: Make find_next_intrinsic() skip blocks before 'after'. (Jason) Move nir_imm_ivec2() to nir_builder.h. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho	c869646b7d	nir: Add nir_imm_ivec2 helper Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho	3966f053a1	util: Add foreach_reverse for dynarray Useful to walk the array removing elements by swapping them with the last element. v2: Change iteration to make sure we never underflow. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-15 17:29:46 -07:00
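The swap-with-last removal that `3966f053a1` describes can be sketched on a plain C array. This is only an illustration of the idea, not Mesa's `util_dynarray` API: `remove_negatives` is a hypothetical helper, and the array of `int` stands in for whatever element type the dynarray holds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Mesa): drop every negative value from
 * arr[0..count) and return the new element count.  Element order is not
 * preserved. */
static size_t
remove_negatives(int *arr, size_t count)
{
   assert(arr != NULL || count == 0);

   /* Walk from the last element to the first.  The "i-- > 0" form tests
    * before decrementing, so the unsigned index never underflows. */
   for (size_t i = count; i-- > 0;) {
      if (arr[i] < 0) {
         /* Overwrite the removed slot with the current last element.  That
          * element lives at a higher index, so it has already been visited
          * and kept; nothing is skipped. */
         arr[i] = arr[count - 1];
         count--;
      }
   }
   return count;
}
```

Iterating backwards is what makes the removal safe: a forward walk would move an unvisited element into a slot the loop has already passed.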
Eric Anholt	8ec83dc51e	v3d: Add support for hardware pack/unpack of half floats. Cuts the formerly 7-minute simulation time of fs-packHalf2x16.shader_test in half.	2018-10-15 17:16:44 -07:00
Eric Anholt	7d77fe1bcc	nir: Expose nir_remove_unused_io_vars(). For gallium drivers where you want to do some linking at variant compile time, you don't have the other producer/consumer shader on hand to modify. By exposing the inner function, the driver can have the used varyings in the compiled shader cache key and still do linking. This is also useful for V3D, where the binning shader wants to only output position and TF varyings. We've been removing those after nir_lower_io, but this will be less driver-specific code and let more of the shader get DCEed early in NIR. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-10-15 17:16:44 -07:00
Eric Anholt	b788ab6d5c	nir: Be sure to fix deref modes after demoting shader i/o vars to global. Fixes assertion failures when calling nir_remove_unused_varyings() or nir_remove_unused_io_vars(). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-10-15 17:16:44 -07:00
Eric Anholt	dda1ae9b3c	gallium/ttn: Convert inputs and outputs to derefs of variables. This means that TTN shaders more closely resemble GTN shaders: they have inputs and outputs as variable derefs, with the variables having their .driver_location already set up for you. This will be useful for v3d to do input variable DCE in NIR, which we can't do when the TTN shaders never have a pre-nir_lower_io stage. Acked-by: Rob Clark <robdclark@gmail.com>	2018-10-15 17:16:43 -07:00
Eric Anholt	da15a0d88e	gallium/ttn: Fix the type of gl_FragDepth. In TGSI we have a vec4 of which only .z is used, but for NIR we should be using a float the same as other NIR IR. We were already moving TGSI's .z to the .x channel. Acked-by: Rob Clark <robdclark@gmail.com>	2018-10-15 17:16:43 -07:00
Kristian H. Kristensen	f93e431272	freedreno/a6xx: Enable blitter Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-10-15 15:22:38 -07:00
Kristian H. Kristensen	47bc9fad3e	freedreno/a6xx: Update headers Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-10-15 15:22:35 -07:00
Kristian H. Kristensen	421863412c	freedreno/a6xx: Remove unnecessary GRAS_2D_BLIT_INFO write Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-10-15 15:20:28 -07:00
Jason Ekstrand	e4c9bcd037	anv: Don't advertise ASTC support on BSW Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-10-15 16:55:25 -05:00
Samuel Pitoiset	26a2ce35ab	radv: do not force the flat qualifier for clip/cull distances This fixes some new CTS that reads clip/cull distances from the fragment shader stage: dEQP-VK.clipping.user_defined.clip_* Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-15 21:55:28 +02:00
Samuel Pitoiset	80c84bdba9	radv: bump discreteQueuePriorities to 2 It's the minimum value required by the spec. This fixes dEQP-VK.api.info.device.properties. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-15 21:55:25 +02:00
Jason Ekstrand	ae18c53ba6	anv: Split dispatch tables into device and instance There's no reason why we need to generate trampoline functions for instance functions or carry N copies of the instance dispatch table around for every hardware generation. Splitting the tables and being more conservative shaves about 34K off .text and about 4K off .data when built with clang. Before splitting dispatch tables: text data bss dec hex filename 3224305 286216 8960 3519481 35b3f9 _install/lib64/libvulkan_intel.so After splitting dispatch tables: text data bss dec hex filename 3190325 282232 8960 3481517 351fad _install/lib64/libvulkan_intel.so Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-15 13:30:24 -05:00
Kenneth Graunke	18cc65edf8	i965: Drop assert about number of uniforms in ARB handling. My recent prog_to_nir patch started making new sampler uniforms, which apparently increased the number of parameters. We used to poke at the one parameter directly, making it important that there was only one, but we haven't done that in a while. It should be safe to just delete the assertion. Fixes: `1c0f92d8a8` "nir: Create sampler variables in prog_to_nir." Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-15 10:56:12 -07:00
Jason Ekstrand	2241be1d1b	vulkan: Add the fuchsia headers These were missing in the last couple of spec updates. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-15 10:20:31 -05:00
Bas Nieuwenhuizen	6ed0fd24d4	radv: Implement VK_EXT_pci_bus_info. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-10-15 12:27:49 +02:00
Kenneth Graunke	38a23517fd	gallium/u_transfer_helper: Add support for separate Z24/S8 as well. u_transfer_helper already had code to handle treating packed Z32_S8 as separate Z32_FLOAT and S8_UINT resources, since some drivers can't handle that interleaved format natively. Other hardware needs depth and stencil as separate resources for all formats. For example, V3D3 needs this for 24-bit depth as well. This patch adds a new flag to lower all depth/stencil formats, and implements support for Z24_UNORM_S8_UINT. (S8_UINT_Z24_UNORM is left as an exercise to the reader, preferably someone who has access to a machine that uses that format.) Reviewed-by: Eric Anholt <eric@anholt.net>	2018-10-14 23:36:28 -07:00
Kenneth Graunke	c3d219837a	gallium/format: Add a helper to combine separate Z24 and S8 stencil. This new function takes separate Z24 depth and S8 stencil sources, and packs them into a single combined Z24S8 buffer. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-10-14 23:36:28 -07:00
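A minimal sketch of what combining separate Z24 and S8 sources into one packed buffer involves. The function name `pack_z24_s8` and the bit layout (depth in bits 0..23, stencil in bits 24..31) are assumptions made for illustration; the helper actually added in `c3d219837a` may differ in name, signature, and stride handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: interleave `count` depth and stencil values into
 * packed 32-bit words.  Assumed layout: depth in bits 0..23, stencil in
 * bits 24..31. */
static void
pack_z24_s8(uint32_t *dst, const uint32_t *z24, const uint8_t *s8,
            size_t count)
{
   assert(dst != NULL && z24 != NULL && s8 != NULL);

   for (size_t i = 0; i < count; i++) {
      /* Mask the depth source so stray upper bits cannot leak into the
       * stencil byte. */
      dst[i] = (z24[i] & 0x00ffffffu) | ((uint32_t)s8[i] << 24);
   }
}
```

A real implementation would also walk rows using the source and destination strides instead of assuming tightly packed arrays.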
Kenneth Graunke	5849e0612c	gallium/auxiliary: Add util_format_get_depth_only() helper. This will be used by u_transfer_helper.c shortly, in order to split packed depth-stencil into separate resources. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-10-14 23:36:28 -07:00
Kenneth Graunke	1c0f92d8a8	nir: Create sampler variables in prog_to_nir. This is needed for nir_gather_info to actually count the textures, since it operates solely on variables. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-14 23:35:47 -07:00
Kenneth Graunke	ed169c9ad2	nir: Create sampler2D variables in nir_lower_{bitmap,drawpixels}. This is needed for nir_gather_info to actually count the new textures, since it operates solely on variables. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-14 23:35:35 -07:00
Jason Ekstrand	b7397b09d5	spirv: Update SPIR-V json and headers to Khronos master This corresponds to commit 801cca8104245c07e8cc532 on GitHub. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-13 09:56:18 -05:00
Samuel Pitoiset	13fd4e601c	vulkan: Update the XML and headers to 1.1.88 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-13 09:56:18 -05:00
Vinson Lee	cc33621e3b	r600/sb: Fix constant-logical-operand warning. sb/sb_bc_parser.cpp:620:27: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand] if (cf->bc.op_ptr->flags && FF_GDS) ^ ~~~~~~ sb/sb_bc_parser.cpp:620:27: note: use '&' for a bitwise operation if (cf->bc.op_ptr->flags && FF_GDS) ^~ & sb/sb_bc_parser.cpp:620:27: note: remove constant to silence this warning if (cf->bc.op_ptr->flags && FF_GDS) ~^~~~~~~~~ Fixes: `da977ad907` ("r600/sb: start adding GDS support") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-12 10:58:58 -07:00
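The warning fixed in `cc33621e3b` points at a real logic difference, which a small sketch makes visible. The flag values and function names below are hypothetical; the flag is passed as a parameter here so the sketch itself compiles without triggering the same warning.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, for illustration only. */
#define FLAG_ALU 0x1u
#define FLAG_GDS 0x4u

/* Logical AND: true whenever both operands are non-zero, regardless of
 * which bits are set.  This is the behaviour the warning flags. */
static bool
has_flag_logical(unsigned flags, unsigned bit)
{
   assert(bit != 0);
   return flags && bit;
}

/* Bitwise AND: true only when the requested bit is set in flags. */
static bool
has_flag_bitwise(unsigned flags, unsigned bit)
{
   assert(bit != 0);
   return (flags & bit) != 0;
}
```

With `flags == FLAG_ALU`, the logical version reports the GDS flag as present while the bitwise version correctly reports it as absent.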
Rafael Antognolli	ca168ec008	i965/miptree: Use enum instead of boolean. ISL_AUX_USAGE_NONE happens to be the same as "false", but let's do the right thing and use the enum. v2: fix intel_miptree_finish_depth too (Caio) Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-12 10:14:20 -07:00
Samuel Pitoiset	2c139e2cdf	radv: do not support blitting surfaces for R32G32B32 formats Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108113 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-12 15:28:21 +02:00
Jose Fonseca	7c5aececda	scons: Allow building with custom MSVC_USE_SCRIPT script. SCons MSVC support relies on vcvarsall.bat to extract the PATH, CPP includes, library paths, etc. And SCons also has a build env var named MSVC_USE_SCRIPT which one can use to point to an alternative vcvarsall.bat script. This change exposes this MSVC_USE_SCRIPT build env variable as a SCons command line variable. This will enable using MSVC outside Program Files (e.g., network shares, etc.) This change also links advapi32 library, necessary for the Windows Registry API used by WGL state tracker, avoiding missing symbols. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-10-12 07:45:53 +01:00
Samuel Pitoiset	416013b4f5	radv: emit the GLC bit for SSBO loads/stores when needed This fixes some new memory model tests: dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.* Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108112 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-12 08:42:08 +02:00
Samuel Pitoiset	4b74f05f6b	spirv/nir: handle memory access qualifiers for SSBO loads/stores v2: - change how the access qualifiers are accumulated v3: - duplicate members in struct_member_decoration_cb() - handle access qualifiers on variables - remove access qualifiers handling in _vtn_variable_load_store() - fix setting access qualifiers on type->array_element Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-12 08:42:08 +02:00
Tapani Pälli	26a10e3844	anv/android: we need git_sha1.h in include paths Fixes: `e4538b9` "anv: Implement VK_KHR_driver_properties" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-12 07:29:03 +03:00
Nanley Chery	0ee0e0b6b9	anv: Clear WM_HZ_OP overrides in init_device_state This is basically a port of commit, `3ade766684` ("i965: Disable 3DSTATE_WM_HZ_OP fields.") The BDW+ docs describe how to use the 3DSTATE_WM_HZ_OP instruction in the section titled, "Optimized Depth Buffer Clear and/or Stencil Buffer Clear." It mentions that the packet overrides GPU state for the clear operation and needs to be reset to 0s to clear the overrides. Depending on the kernel, we may not get a context with the GPU state for this packet zeroed. Do it ourselves just in case. Prevents a number of GPU hangs when running crucible on ICL. I tried to get the exact number of hangs that occurs without this patch, but was unsuccessful. The test machine became unresponsive before completing the full run. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-10-11 16:31:08 -07:00
Jordan Justen	494d2ec277	i965/gen10+: Initialize new fields in STATE_BASE_ADDRESS Ref: `263b584d5e` "i965/skl: Emit extra zeros in STATE_BASE_ADDRESS on Skylake." Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-10-11 15:16:04 -07:00
Jordan Justen	d18a0d955e	anv/gen9+: Initialize new fields in STATE_BASE_ADDRESS Ref: `263b584d5e` "i965/skl: Emit extra zeros in STATE_BASE_ADDRESS on Skylake." Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-10-11 15:16:00 -07:00
Jason Ekstrand	d7e0d47b9d	nir: Add a bunch of b2[if] optimizations The b2f and b2i conversions always produce zero or one which are both representable in every type and size. Since b2i and b2f support all bit sizes, we can just get rid of the conversion opcode. total instructions in shared programs: 15089335 -> 15084368 (-0.03%) instructions in affected programs: 212564 -> 207597 (-2.34%) helped: 896 HURT: 0 total cycles in shared programs: 369831123 -> 369826267 (<.01%) cycles in affected programs: 2008647 -> 2003791 (-0.24%) helped: 693 HURT: 216 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-10-11 15:21:19 -05:00
Jason Ekstrand	0e0dc596a2	intel/vec4: Fix nir_op_b2[fi] with 64-bit result This is valid NIR but you can't actually hit this case today. GLSL IR doesn't have a bool to double opcode; it does f2d(b2f(x)). In SPIR-V we don't have any to/from bool conversion opcodes at all. However, the next commit will make us start generating it so we should be ready. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-10-11 15:21:19 -05:00
Jason Ekstrand	497675c21e	intel/fs: Fix nir_op_b2[fi] with 64-bit result on Gen8 LP and Gen9 LP Several of the Atom GPUs have additional restrictions on alignment when moving < 64-bit source to a 64-bit destination. All of the `nir_op_*2*64` code generation paths respected this, but nir_op_b2[fi] did not. Previous to commit `a68dd47b91` it was not possible to generate such an instruction from the GLSL path. It may have been possible from SPIR-V, but it's not clear. The aforementioned patch converts a 64-bit nir_op_fsign into a sequence of operations including a nir_op_b2f with a 64-bit result. This "just works" everywhere except these Atom parts. This problem was not detected during normal CI testing because the Atom parts are not included in developer builds. v2 (idr): Make the patch compile, and make some cosmetic changes. Add a commit message. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108319 Fixes: `a68dd47b91` "nir/algebraic: Simplify fsat of fsign" Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-10-11 15:21:19 -05:00
Vinson Lee	4ece6aa552	egl: Use correct shared libraries suffix on macOS. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-11 11:30:00 -07:00
Illia Iorin	b18f8e63ef	mesa: Fix pack_uint_Z_FLOAT32() Fixed pack_uint_Z_FLOAT32 by casting row data to float instead of uint. Remove duplicate function pack_uint_Z_FLOAT32_X24S8. Edited case in "_mesa_get_pack_uint_z_func". Now it looks like "_mesa_get_pack_float_z_func". Remove _mesa_problem call, which was added for debugging this issue. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91433 Signed-off-by: Illia Iorin <illia.iorin@globallogic.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-10-11 10:15:09 -07:00
Rodrigo Vivi	24db1c7fcc	intel: Introducing Whiskey Lake platform Whiskey Lake uses the same gen graphics as Coffee Lake, including some ids that were previously marked as reserved on Coffee Lake, but that now are moved to WHL page. This follows the ids and approach used on kernel's commit b9be78531d27 ("drm/i915/whl: Introducing Whiskey Lake platform") and commit c1c8f6fa731b ("drm/i915: Redefine some Whiskey Lake SKUs") v2: Lionel noticed that GT{1,2,3} on kernel wasn't following spec when looking to number of EUs, so kernel has been updated. Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-11 10:02:40 -07:00
Boyuan Zhang	d76c277421	st/va: use provided sizes and coords for vlVaGetImage vlVaGetImage should respect the width, height, and coordinates x and y that are passed in. Therefore, pipe_box should be created with the passed in values instead of surface width/height. v2: add input size check, return error when size out of bounds v3: fix the size check for vaimage v4: add size adjustment for x and y coordinates Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Cc: "18.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Christian König <christian.koenig@amd.com>	2018-10-11 09:00:18 -04:00
Samuel Pitoiset	229803b66a	radv: implement clear operations for R32G32B32 This fixes crashes for some CTS: dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.color..linear__* dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.color.._linear_* Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108113 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-11 14:49:16 +02:00
Samuel Pitoiset	c3ba3c2611	radv: disallow 3D images and mipmaps/layers for R32G32B32 linear formats R32G32B32 are weird formats and we are only going to support some basic operations for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-11 14:49:14 +02:00
Samuel Pitoiset	d179312b53	radv: add a workaround for a VGT hang with prim restart and strips Otherwise, Yakuza and The Evil Within hang the GPU with DXVK. This apparently only works on Polaris. Suggested by Marek. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-11 10:16:11 +02:00
Timothy Arceri	3bc012a34e	glsl: remove redundant es_shader checks The es check is already covered by the is_version() check. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-10-11 14:45:43 +11:00
Dave Airlie	cc2fe57922	st/glsl_to_tgsi: initialise need_uarl in constructor Found by coverity Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-11 10:20:37 +10:00
Dave Airlie	c5c3da6c90	glspirv: drop pointless assert (size_t is unsigned) Found by coverity Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-10-11 10:19:48 +10:00
Dave Airlie	600d8ecb57	radv: remove unsigned comparison against 0 The value is always >= 0 here. Found by coverity Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-10-11 10:19:20 +10:00
Dave Airlie	6e1d294804	radv: remove dead code for master_fd close We have never opened master_fd at this point, so remove code to close it. Found by coverity. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-10-11 10:19:16 +10:00
Dave Airlie	7c04b96f03	radv: don't pass shader key by copy Coverity pointed out we were copying 168 bytes here unnecessarily. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-10-11 10:18:43 +10:00
Dave Airlie	29a7631986	anv: add missing unlock in error path. Not going to matter, but be consistent. Found by coverity Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: `caf41c78c` (anv/allocator: Support softpin in the BO cache)	2018-10-11 09:50:27 +10:00
Jason Ekstrand	4ba445e011	intel: Don't propagate conditional modifiers if a UD source is negated This fixes a bug uncovered by my NIR integer division by constant optimization series. Fixes: `19f9cb72c8` "i965/fs: Add pass to propagate conditional..." Fixes: `627f94b72e` "i965/vec4: adding vec4_cmod_propagation..." Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-10-10 13:13:12 -05:00
Jason Ekstrand	328d4d080b	util: Add tests for fast integer division by constants While I generally trust ridiculousfish to have done his homework, we've made some adjustments to suit the needs of mesa and it'd be good to test those. Also, there's no better place than unit tests to clearly document the different edge cases of the different methods. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-10 13:13:12 -05:00
Marek Olšák	a9be8dddfe	util: Add power-of-two divisor support to compute_fast_udiv_info Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-10 13:13:12 -05:00
Jason Ekstrand	7cde4dbcd7	util: Generalize fast integer division to be variable bit-width There's nothing inherently fixed-width in the code. All that's required to generalize it is to make everything internally 64-bit and pass UINT_BITS in as a parameter to util_compute_fast_[us]div_info. With that, it can now handle 8, 16, 32, and 64-bit integer division by a constant. We also add support for division by 1 and by other powers of 2. This is useful if you want to divide by a uniform value in a shader where you have the opportunity to adjust the uniform on the CPU before passing it in. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-10 13:13:12 -05:00
Marek Olšák	64eb0738d4	util: Add fast division helpers Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-10 13:13:12 -05:00
Marek Olšák	2940c257a6	util: import public domain code for integer division by a constant Compilers can use this to generate optimal code for integer division by a constant. Additionally, an unsigned division by a uniform that is constant but not known at compile time can still be optimized by passing 2-4 division factors to the shader as uniforms and executing one of the fast_udiv* variants. The signed division algorithm doesn't have this capability. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-10 13:13:12 -05:00
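The idea behind the fast-division commits above (`2940c257a6`, `64eb0738d4`, `7cde4dbcd7`) is to precompute a multiplier for a fixed divisor once, then replace each division with a multiply and a shift. The sketch below uses a simplified round-up multiplier for 16-bit operands held in 64-bit arithmetic; it is not the algorithm or the API of the imported code, and `udiv16_info`, `compute_udiv16_info`, and `fast_udiv16` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical precomputed state for dividing 16-bit values by a fixed d. */
struct udiv16_info {
   uint64_t multiplier; /* floor(2^32 / d) + 1 */
};

/* Done once per divisor, e.g. on the CPU before uploading a uniform. */
static struct udiv16_info
compute_udiv16_info(uint16_t d)
{
   assert(d != 0);
   struct udiv16_info info = { (UINT64_C(1) << 32) / d + 1 };
   return info;
}

/* Done per value: one multiply and one shift instead of a divide.  For
 * n < 2^16 the rounding error in the multiplier contributes less than
 * 1/d to the product, so the truncated result equals n / d exactly. */
static uint16_t
fast_udiv16(uint16_t n, struct udiv16_info info)
{
   return (uint16_t)((info.multiplier * n) >> 32);
}
```

Using twice as many multiplier bits as operand bits keeps the sketch to a single code path, including powers of two and division by 1. Multipliers that fit in the operand width need extra pre-shift or increment variants, which the sketch avoids.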
Jason Ekstrand	0dca6730b4	util: Add a simple big math library Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-10-10 13:13:12 -05:00
Dylan Baker	b8521704ed	meson: Don't allow building EGL on Windows or MacOS Currently mesa only supports EGL on Unix like systems, cygwin, and haiku. Meson should actually enforce this. This fixes the default build on MacOS. v2: - invert the condition, mark darwin and windows as not supported instead of trying to mark what is supported. v3: - add missing ) v3: - Update comment to reflect condition change in v2 CC: 18.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-10 11:02:36 -07:00
Timothy Arceri	0346ad3774	glsl: ignore trailing whitespace when define redefined The Nvidia/AMD binary drivers allow this, as does GCC. This fixes shader compilation issues in the latest update of No Man's Sky. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-10-10 15:08:32 +11:00
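A rough sketch of the comparison `0346ad3774` relaxes: two definitions of the same macro count as identical if their bodies differ only in trailing whitespace. The real GLSL preprocessor compares token lists, not raw strings, so the string-based helpers below (`length_without_trailing_space`, `same_replacement_list`) are hypothetical and only illustrate the rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Length of s with trailing spaces and tabs ignored. */
static size_t
length_without_trailing_space(const char *s)
{
   size_t len = strlen(s);
   while (len > 0 && (s[len - 1] == ' ' || s[len - 1] == '\t'))
      len--;
   return len;
}

/* Hypothetical check: treat a redefinition as benign when the two macro
 * bodies match once trailing whitespace is stripped.  Leading and interior
 * whitespace are still compared exactly in this sketch. */
static bool
same_replacement_list(const char *old_body, const char *new_body)
{
   assert(old_body != NULL && new_body != NULL);

   size_t old_len = length_without_trailing_space(old_body);
   size_t new_len = length_without_trailing_space(new_body);
   return old_len == new_len && memcmp(old_body, new_body, old_len) == 0;
}
```

Under this rule, redefining a macro with body `1.0` as `1.0` followed by spaces is accepted, while redefining it as `2.0` is still an error.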
Ian Romanick	b44c9292b7	intel/compiler: Don't handle fsign.sat No shader-db or CI changes on any Intel platform. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-10-09 13:56:42 -07:00
Ian Romanick	a68dd47b91	nir/algebraic: Simplify fsat of fsign This allows us to not support fsign.sat in the Intel compiler backend, and that will simplify some later changes. No shader-db changes on any Intel platform. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-10-09 13:56:42 -07:00
Ian Romanick	1546204cdd	nir/algebraic: `sign(x)*x*x` is `abs(x)*x` shader-db results: All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 15106023 -> 15105981 (<.01%) instructions in affected programs: 300 -> 258 (-14.00%) helped: 6 HURT: 0 helped stats (abs) min: 7 max: 7 x̄: 7.00 x̃: 7 helped stats (rel) min: 14.00% max: 14.00% x̄: 14.00% x̃: 14.00% 95% mean confidence interval for instructions value: -7.00 -7.00 95% mean confidence interval for instructions %-change: -14.00% -14.00% Instructions are helped. total cycles in shared programs: 566050327 -> 566050075 (<.01%) cycles in affected programs: 2826 -> 2574 (-8.92%) helped: 6 HURT: 0 helped stats (abs) min: 40 max: 44 x̄: 42.00 x̃: 42 helped stats (rel) min: 8.89% max: 8.94% x̄: 8.92% x̃: 8.92% 95% mean confidence interval for cycles value: -44.30 -39.70 95% mean confidence interval for cycles %-change: -8.95% -8.88% Cycles are helped. No changes on Gen6 or earlier. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-10-09 13:56:42 -07:00
Ian Romanick	10f4a8871e	nir: Add helper functions to get the instruction that generated a nir_src Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-10-09 13:56:42 -07:00
Brian Paul	797e34f658	svga: change svga_destroy_shader_variant() to return void svga_destroy_shader_variant() itself flushes and retries the command if there's a failure. So no need for the callers to do it. Other callers of the function were already ignoring the return value. This also fixes a corner-case double-free reported by Coverity (and reported by Dave Airlie). Tested with various OpenGL apps. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-10-09 11:17:14 -06:00
Dylan Baker	b781688636	meson: Don't build glsl compiler tests unless OpenGL is enabled Since there are no other users of the glsl compiler. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-09 08:56:00 -07:00
Dylan Baker	d84f003b95	meson: Only build gallium state tracker tests with shared_glapi This has always been a requirement, it's just somehow been missed in the meson build. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-09 08:55:56 -07:00
Dylan Baker	0fa6a8271a	meson: only build glapi tests when OpenGL is being built Otherwise building just vulkan (among other things) will build these tests, pull in a bunch of stuff they shouldn't, and potentially fail to compile. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-09 08:55:48 -07:00
Ilia Mirkin	92f56fbd89	nvc0: fix blitting red to srgb8_alpha For some reason the 2d engine can't handle this. Red formats get special treatment there, so perhaps related. Fixes dEQP-GLES3 tests of the form: dEQP-GLES3.functional.fbo.blit.conversion.r{8,16f,32f}_to_srgb8_alpha8 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Cc: mesa-stable@lists.freedesktop.org	2018-10-09 10:33:11 -04:00
Ilia Mirkin	9bf0614116	nv50,nvc0: guard against zero-size blits The current state tracker can generate these sometimes. Fixing this is more involved, and due to some integer math we can generate divisions-by-zero. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Cc: mesa-stable@lists.freedesktop.org	2018-10-09 10:33:11 -04:00
Ilia Mirkin	78d3640e49	nv50,nvc0: mark RGBX_UINT formats as renderable This helps st/mesa avoid some (apparently) buggy fallbacks. Specifically the CopyTexSubImage fallback tries to read texture A as RGBA_FLOAT and write back that data into the target format, which fails for integer formats which have no appropriate logic to do the conversion. Since integer formats don't blend, there's no harm in the fact that the "A" component gets written anyways. Fixes, among others: https://www.khronos.org/registry/webgl/sdk/tests/conformance2/textures/canvas/tex-2d-rgb8ui-rgb_integer-unsigned_byte.html Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2018-10-09 10:33:11 -04:00
Eric Engestrom	976188737d	radv: add missing meson c++ visibility arguments Fixes: `6f3aee40f9` "radv: using tls to store llvm related info and speed up compiles (v10)" Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-10-09 14:22:24 +01:00
Michel Dänzer	9d3fefdc41	gbm: Add GBM_FORMAT_ARGB1555 support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-09 10:32:51 +02:00
Michel Dänzer	e7e033ed8a	st/dri: Handle BGRA5551 format Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-09 10:32:50 +02:00
Rob Clark	fa52ff856d	freedreno/a5xx+a6xx: fix LRZ pitch alignment Both RB_2D_DST_SIZE.PITCH (a6xx) and RB_MRT[n].PITCH (a5xx) need alignment to 64. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-08 19:05:14 -04:00
Rob Clark	82c3b6fe49	freedreno/a6xx: add LRZ support As with a5xx, hidden behind FD_MESA_DEBUG=lrz due to being paranoid about z-fighting issues with some games (in particular, this was observed with 0ad on a5xx.. but I think the proper solution to enable this by default is to figure out how to do driver specific driconf options). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-08 19:05:14 -04:00
Rob Clark	a877451a41	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-08 18:03:35 -04:00
Rob Clark	bf79a7cc25	freedreno/a6xx: add helper for various CP_EVENT_WRITE Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-08 17:50:26 -04:00
Rob Clark	60af89815e	freedreno/a6xx: remove unused fxns Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-08 17:50:26 -04:00
Rob Clark	d5bd3ce89c	freedreno/a6xx: remove fd6_shader_stateobj Earlier gen's already got this cleanup, but a6xx was still off on a branch then. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-08 17:50:26 -04:00
Ilia Mirkin	1bb1c03d61	glsl: fix array assignments of a swizzled vector This happens in situations where we might do vec.wzyx[i] = ... The swizzle would get effectively ignored because of the interaction between how ir_assignment->set_lhs works and overwriting the write_mask. There are two cases, one where i is a constant, and another where i is variable. We have to be extra-careful in both cases. Fixes the following WebGL test: https://www.khronos.org/registry/webgl/sdk/tests/conformance2/glsl3/vector-dynamic-indexing-swizzled-lvalue.html And the new piglit tests: swizzled-writemask-indexing-nonconst.shader_test swizzled-writemask-indexing.shader_test Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org	2018-10-08 14:29:14 -04:00
Samuel Pitoiset	d3682766f6	radv: tidy up radv_pipeline_init_multisample_state() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-08 14:17:43 +02:00
Samuel Pitoiset	b38228ccb0	radv: always set PA_SC_MODE_CNTL_1.OUT_OF_ORDER_WATER_MARK It has probably no effect without out of order rasterization anyway. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-08 14:17:40 +02:00
Samuel Pitoiset	937986ca1d	radv: set DB_EQAA.INCOHERENT_EQAA_READS My attempt was to set this field instead of duplicating one. Fixes: `6cfa321c39` ("radv: add potential missing fields for DB_EQAA") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-08 14:17:33 +02:00
Chystiakov, Dmytro	47e3338b04	i965: fallback RGBX to RGBA in glEGLImageTargetRenderbufferStorageOES In the same fashion as is done for glEGLImageTextureTarget2D. v2: share the fallback which sets baseformat and internalformat correctly which makes both of the tests pass (Tapani) Fixes android.hardware.nativehardware.cts.AHardwareBufferNativeTests: #SingleLayer_ColorTest_GpuColorOutputCpuRead_R8G8B8X8_UNORM #SingleLayer_ColorTest_GpuColorOutputIsRenderable_R8G8B8X8_UNORM Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-10-08 08:03:45 +03:00
Tapani Pälli	d1fa69ed61	glsl: do not attempt assignment if operand type not parsed correctly v2: check types of both operands (Ian) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108012	2018-10-08 08:02:50 +03:00
Marek Olšák	d877451b48	util/u_queue: add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY Initial version discussed with Rob Clark under a different patch name. This approach leaves his driver unaffected.	2018-10-06 22:05:58 -04:00
Marek Olšák	066aa44fc5	radeonsi: fix a typo at CS_PARTIAL_FLUSH harmless	2018-10-06 21:50:52 -04:00
Marek Olšák	77903c8cfb	ac: add ac_build_round	2018-10-06 21:50:09 -04:00
Marek Olšák	fa023f293e	ac: correct PKT3_COPY_DATA definitions	2018-10-06 21:50:09 -04:00
Marek Olšák	82f5f89bf6	ac: simplify LLVM alloca helpers	2018-10-06 21:50:09 -04:00
Marek Olšák	a668c8d6ba	ac: define all address spaces properly	2018-10-06 21:50:09 -04:00
Gert Wollny	8f77156c26	gallivm: Make it possible to disable some optimization shortcuts in release builds For testing it is of interest that all tests of dEQP pass, e.g. to test virglrenderer on a host only providing software rendering like in a CI. Hence make it possible to disable certain optimizations that make tests fail. While we are there also add some documentation to the flags to make it clear that this is opt-out. Setting the environment variable "GALLIVM_PERF=no_filter_hacks" can be used to make the following tests pass in release mode: dEQP-GLES2.functional.texture.mipmap.2d.affine.*_linear_* dEQP-GLES2.functional.texture.mipmap.cube.generate.* dEQP-GLES2.functional.texture.vertex.2d.filtering.*_mipmap_linear_* dEQP-GLES2.functional.texture.vertex.2d.wrap.* Related: https://bugs.freedesktop.org/show_bug.cgi?id=94957 v2: rename optimization disabling flag to 'safemath' and also move the nopt flag to the perf flags. v3: rename flag "safemath" to "no_filter_hacks" since safemath is usually associated with floating point operations (Roland) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-10-06 13:12:48 +02:00
Tomeu Vizoso	9d81cd8e7c	virgl: Pass resource size and transfer offsets Pass the size of a resource when creating it so a backing can be kept on the other side. Also pass the required offset to transfer commands. This moves vtest closer to how virtio-gpu works, making it more useful for testing. v2: - Use new messages for creation and transfers, as changing the behavior of the existing messages would be messy given that we don't want to break compatibility with older servers. v3: - Use correct strides: The resource corresponding to the output display might have a different line stride than the IOVs, so when reading back to this resource take the resource stride and the IOV stride into account. v4: Fix transfer size calculation (Andrey Simiklit) v5: Add comment about transfer size value in the PUT command (Gurchetan). Add a comment about the size correction for transfers for reading and writing the resource. Fixing this by correctly evaluating the size upfront will need some work also on the virglrenderer side. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> (v2) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-10-06 13:12:44 +02:00
Gert Wollny	5d7858f151	virgl, vtest: Correct the transfer size calculation The transfer size used in virglrenderer refers to uint32_t, so one must add 3 and then divide by 4 instead of adding 3/4 which is a no-op with integers. Fixes: `b3b82fe8ea` virgl/vtest: add vtest driver Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-10-06 13:12:44 +02:00
Alan Coopersmith	066850edad	util: Make xmlconfig.c build on Solaris without d_type in dirent (v2) v2: check for lstat() failing Fixes: `04bdbbcab3` "xmlconfig: read more config files from drirc.d/" Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-10-05 17:30:45 -07:00
Sonny Jiang	084cf3b966	radeonsi:optimizing SET_CONTEXT_REG for shaders vgt_vertex_reuse Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-05 19:04:13 -04:00
Sonny Jiang	ce1d72609d	radeonsi:optimizing SET_CONTEXT_REG for shaders Tessellation Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-05 19:04:13 -04:00
Sonny Jiang	4de328da07	radeonsi:optimizing SET_CONTEXT_REG for shaders PS Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-05 19:04:13 -04:00
Sonny Jiang	f243980f2c	radeonsi:optimizing SET_CONTEXT_REG for shaders VS Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-05 19:04:13 -04:00
Sonny Jiang	4052624398	radeonsi:optimizing SET_CONTEXT_REG for shaders GS Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-05 19:04:13 -04:00
Marek Olšák	86f004bdfc	radeonsi: optimize and allow reg > 31 in radeon_opt_set_context_reg functions reg_saved will have 64 bits, and (1 << reg) where reg > 31 has undefined behavior. (1ull << reg) would be correct for 64 bits. This commit shifts the other way in order to merge the conditions.	2018-10-05 19:04:13 -04:00
Sonny Jiang	eeb9170599	radeonsi: optimizing SET_CONTEXT_REG for shaders ES Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-05 17:53:52 -04:00
Samuel Pitoiset	a1bc152340	spirv: mark variables decorated with XfbBuffer as always active Otherwise, they are removed during NIR linking or in some lowering passes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-10-05 18:13:25 +02:00
Juan A. Suarez Romero	5bd03d02c1	docs: update calendar, add news and link release notes to 18.2.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-10-05 12:51:34 +02:00
Juan A. Suarez Romero	c565eeee0b	docs: add sha256 checksums for 18.2.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `cb63a4e114`)	2018-10-05 12:46:33 +02:00
Juan A. Suarez Romero	3537465059	docs: add release notes for 18.2.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `abaeb79eb2`)	2018-10-05 12:46:31 +02:00
Jason Ekstrand	dd553bc67f	nir/alu_to_scalar: Use ssa_for_alu_src in hand-rolled expansions The ssa_for_alu_src helper will correctly handle swizzles and other source modifiers for you. The expansions for unpack_half_2x16, pack_uvec2_to_uint, and pack_uvec4_to_uint were all broken with regards to swizzles. The brokenness of unpack_half_2x16 was causing rendering errors in Rise of the Tomb Raider on Intel ever since `c11833ab24` which added an extra copy propagation to the optimization pipeline and caused us to start seeing swizzles where we hadn't seen any before. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107926 Fixes: `9ce901058f` "nir: Add lowering of nir_op_unpack_half_2x16." Fixes: `9b8786eba9` "nir: Add lowering support for packing opcodes." Tested-by: Alex Smith <asmith@feralinteractive.com> Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-10-04 12:43:59 -05:00
Vadym Shovkoplias	5f0567a4f6	glsl/linker: Check the subroutine associated functions names From Section 6.1.2 (Subroutines) of the GLSL 4.00 specification "A program will fail to compile or link if any shader or stage contains two or more functions with the same name if the name is associated with a subroutine type." v2: - error out earlier (Tapani) - style fixes (Iago) Fixes: * no-overloads.vert Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108109 Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-10-04 17:41:19 +02:00
Tomeu Vizoso	ed53a79cf8	virgl: Negotiate version with vtest server Check if server supports version negotiation by sending a PING_PROTOCOL_VERSION message right before a dummy RESOURCE_BUSY_WAIT. If we don't get a reply for the first, we know the server doesn't support it. If it does support it, we can query the max protocol version supported by the server and fall back if needed. v2: - Send a new message to negotiate the protocol version, checking if the server supports this message by immediately sending a busy wait message. (Dave Airlie) v3: - Send a zero-arg command PING_PROTOCOL_VERSION so we actually keep compatibility with older servers. (Code by Dave Airlie) Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-10-04 16:18:36 +02:00
Sagar Ghuge	0c70e11206	intel: aubinator: Fix memory leaks Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-04 10:01:56 +01:00
Sagar Ghuge	29a2eaf3db	intel/decoder: construct correct xml filename construct correct gen xml filename when we try to load hardware xml description from a given path v2: remove temporary variable (Francesco Ansanelli) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-04 10:01:56 +01:00
Sagar Ghuge	f9c8468c82	intel/decoder: Avoid freeing invalid pointer v2: Free ctx.spec if error while reading genxml (Lionel Landwerlin) v3: Handle case where genxml is empty (Lionel Landwerlin) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-04 10:01:56 +01:00
Sagar Ghuge	ba3304e764	intel/decoder: add gen_spec_init method Initialize gen_spec instance properly when loading hardware xml description from specific directory to avoid segmentation fault. v2: correct function definition (Lionel Landwerlin) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-04 10:01:56 +01:00
Samuel Pitoiset	2b34985d93	radv: fix resetting the pool for timestamp queries Since the driver no longer uses the availability bit for timestamp queries it shouldn't reset it. Instead, it should reset the query values to UINT32_MAX. This fixes VM faults. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108164 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-10-04 10:56:25 +02:00
Guido Günther	b2a876a42b	etnaviv: Use write combine instead of uncached mappings for shader bo The latter are sensitive to unaligned accesses on arm64[1] and we don't need an uncached mapping here. [1]: https://lists.freedesktop.org/archives/etnaviv/2018-September/001956.html Signed-off-by: Guido Günther <guido.gunther@puri.sm> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2018-10-04 10:33:25 +02:00
Marek Olšák	8e0b4cb8a1	drirc: add a workaround for ARMA 3 Cc: 18.2 <mesa-stable@lists.freedesktop.org>	2018-10-04 01:01:54 -04:00
Jason Ekstrand	f5bab06428	anv/batch_chain: Don't start a new BO just for BATCH_BUFFER_START Previously, we just went ahead and emitted MI_BATCH_BUFFER_START as normal. If we are near enough to the end, this can cause us to start a new BO just for the MI_BATCH_BUFFER_START which messes up chaining. We always reserve enough space at the end for an MI_BATCH_BUFFER_START so we can just increment cmd_buffer->batch.end prior to emitting the command. Fixes: `a0b133286a` "anv/batch_chain: Simplify secondary batch return..." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107926 Tested-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-03 09:03:12 -05:00
Jason Ekstrand	7a89a0d9ed	anv: Use separate MOCS settings for external BOs On Broadwell and above, we have to use different MOCS settings to allow the kernel to take over and disable caching when needed for external buffers. On Broadwell, this is especially important because the kernel can't disable eLLC so we have to do it in userspace. We very badly don't want to do that on everything so we need separate MOCS for external and internal BOs. In order to do this, we add an anv-specific BO flag for "external" and use that to distinguish between buffers which may be shared with other processes and/or display and those which are entirely internal. That, together with an anv_mocs_for_bo helper lets us choose the right MOCS settings for each BO use. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99507 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-03 09:03:03 -05:00
Emil Velikov	08bff097e1	meson: remove invalid "opencl" llvm component Seeming copy/paste mistake from configure.ac which uses $2 for the component and $3 for the fancy name printing. Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-10-03 13:38:06 +01:00
Emil Velikov	fe8be81b4a	Revert "mesa: remove unnecessary 'sort by year' for the GL extensions" This reverts commit `3d81e11b49`. As reported by Federico, some games require the 'sort by year' since they truncate the extensions which do not fit the fixed size string array. Seemingly I did not consider that, as the documentation (both Mesa and Nvidia) mentions about program crashes ... which are worked around by setting the env. variable. This commit reinstates the workaround and enhances the documentation. Cc: Marek Olšák <maraeo@gmail.com> Cc: Ian Romanick <idr@freedesktop.org> Reported-by: Federico Dossena <info@fdossena.com> Fixes: `3d81e11b49` ("mesa: remove unnecessary 'sort by year' for the GL extensions") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Federico Dossena <info@fdossena.com>	2018-10-03 13:38:06 +01:00
Emil Velikov	91ff8b1dd9	mesa: reorder and document the tokens in glheader.h Split into different sections, document each one as well as strange cases like GL_ATI_texture_compression_3dc. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-10-03 13:38:06 +01:00
Emil Velikov	5f70964b1d	mesa: remove duplicate declarations from glheader.h Remove all the desktop GL and GLX entries from the list. Former are pulled by the gl.h and glext.h includes at the top while the latter are no longer needed. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-10-03 13:38:06 +01:00
Emil Velikov	01b92916af	i965: reference __DRI_ATTRIB_SWAP_COPY token over the GLX one Earlier commit updated the code to use the DRI tokens, yet forgot to update the comment. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-10-03 13:38:06 +01:00
Emil Velikov	e04b2c0376	i915: reference __DRI_ATTRIB_SWAP_COPY token over the GLX one Earlier commit updated the code to use the DRI tokens, yet forgot to update the comment. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-10-03 13:38:06 +01:00
Emil Velikov	d26b122ee8	dri/common: move the required GLX_* token definitions locally Will allow us to remove even bigger hack elsewhere. But more importantly, we should not be using _any_ GLX tokens in DRI. Document the gory details about the current side-effects. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-10-03 13:38:06 +01:00
Emil Velikov	4ef53669af	dri/common: use __DRI_ATTRIB_SWAP* instances when describing db_modes Somewhat recently Thomas Hellstrom added the respective DRI tokens and updated the drivers. Update the documentation to match reality. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-10-03 13:38:06 +01:00
Emil Velikov	d6a6760139	egl/x11: remove eglSwap* surface check Already handled further up in eglapi.c. To make things a tiny bit strange, X11+DRI3 was doing the wrong thing by returning EGL_FALSE (+ no error), while X11+DRI2 was returning EGL_TRUE. Cc: samiuddi <sami.uddin.mohammad@intel.com> Cc: Eric Engestrom <eric.engestrom@intel.com> Cc: Erik Faye-Lund <kusmabite@gmail.com> Cc: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2018-10-03 13:38:06 +01:00
Emil Velikov	8030741996	egl/surfaceless: remove eglSwap* stubs The API validation in eglapi.c already returns if the surface type is !window. Cc: samiuddi <sami.uddin.mohammad@intel.com> Cc: Erik Faye-Lund <kusmabite@gmail.com> Cc: Tomasz Figa <tfiga@chromium.org> Cc: Gurchetan Singh <gurchetansingh@chromium.org> Cc: Chad Versace <chadversary@chromium.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-03 13:38:06 +01:00
Emil Velikov	a370e278d3	egl/drm: remove eglSwap* surface check Already handled further up in eglapi.c Cc: samiuddi <sami.uddin.mohammad@intel.com> Cc: Erik Faye-Lund <kusmabite@gmail.com> Cc: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-03 13:38:06 +01:00
Emil Velikov	91ccb59ff4	egl/android: remove eglSwap* surface check Already handled further up in eglapi.c Cc: samiuddi <sami.uddin.mohammad@intel.com> Cc: Erik Faye-Lund <kusmabite@gmail.com> Cc: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-03 13:38:06 +01:00
Emil Velikov	8f66743ca2	egl: make eglSwapBuffers* a no-op for !window surfaces Analogous to the previous commit - the spec says the function is a no-op when a pbuffer or pixmap surface is used. Cc: samiuddi <sami.uddin.mohammad@intel.com> Cc: Erik Faye-Lund <kusmabite@gmail.com> Cc: Tomasz Figa <tfiga@chromium.org> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-03 13:38:05 +01:00
Emil Velikov	64b4ccde0c	egl: make eglSwapInterval a no-op for !window surfaces As the spec says, the function is a no-op when the surface is not a window one. That spec implies that EGL_TRUE should be returned in that case, yet the ARM driver seems to return EGL_FALSE + EGL_BAD_SURFACE. The Nvidia driver returns EGL_TRUE. We follow that behaviour until a decision is made. https://gitlab.khronos.org/egl/API/merge_requests/17 Cc: samiuddi <sami.uddin.mohammad@intel.com> Cc: Erik Faye-Lund <kusmabite@gmail.com> Cc: Tomasz Figa <tfiga@chromium.org> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-03 13:38:05 +01:00
Emil Velikov	c231b49c53	freedreno: add the a6xx sources to the Android build Add the files otherwise things just won't build. Haven't actually tested it, but it's a small step in the right direction. Fixes: `de3b34df97` ("freedreno: Add a6xx backend") Cc: Kristian H. Kristensen <hoegsberg@chromium.org> Cc: Rob Clark <robdclark@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-10-03 13:38:05 +01:00
Emil Velikov	7419b22413	pipe-loader: add a dup() in pipe_loader_sw_probe_kms The pipe_loader_release API closes the fd given, even if the pipe-loader should _not_ take ownership of it. With earlier commit we fixed pipe_loader_drm_probe_fd, and now we cover the final piece. Note that unlike the DRM case, here the caller _did_ forget to dup before using it ... most likely leading to all sorts of fun. Don't forget the close in the error path. Seems like the things are a bit leaky/asymmetrical with the semi-recent config work. But we can shave that yak another day ;-) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-03 13:38:05 +01:00
Emil Velikov	6ccc435e7a	pipe-loader: move dup(fd) within pipe_loader_drm_probe_fd Currently pipe_loader_drm_probe_fd takes ownership of the fd given. To match that, pipe_loader_release closes it. Yet we have many instances which do not want the change of ownership, and thus duplicate the fd before passing it to the pipe-loader. Move the dup() within pipe-loader, explicitly document that and document all the cases through the codebase. A trivial git grep -2 pipe_loader_release makes things as obvious as it gets ;-) Cc: Leo Liu <leo.liu@amd.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: Axel Davy <davyaxel0@gmail.com> Cc: Patrick Rudolph <siro@das-labor.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Axel Davy <davyaxel0@gmail.com> (for nine)	2018-10-03 13:38:05 +01:00
Emil Velikov	7b8d1b313c	st/nine: do not double-close the fd on teardown As the newly introduced comment says: The pipe loader takes ownership of the fd Thus, there's no need to close it again. Cc: Patrick Rudolph <siro@das-labor.org> Cc: Axel Davy <davyaxel0@gmail.com> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Axel Davy <davyaxel0@gmail.com>	2018-10-03 13:38:05 +01:00
Emil Velikov	fa9df82f67	mesa: fold _glapi_check_multithread() back into _mesa_make_current With commit `c6c0f94714`, back in 2006 Brian removed the _glapi_check_multithread() call from core mesa - _mesa_make_current. It was done to remove fairly awkward #ifdef guard which caused subtle differences in core mesa. Since that guard is long gone, we can drop the duplication and reintroduce the call in core. Note that the function was missing when using EGL + classic dri HW drivers. Yet on TLS builds it's a no-op, so we're safe. Any non TLS users - more or less anything !Linux (or even musl on Linux up-to semi-recently) may have experienced problems. v2: don't remove the call from swrast - move it to core (Eric) Cc: Eric Anholt <eric@anholt.net> Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-10-03 13:38:05 +01:00
Emil Velikov	d081ad2aa2	vl/dri3: do full teardown on screen_destroy Earlier commit added support for 'front_buffers', erroneously adding a return in vl_dri3_screen_destroy. Effectively leaking a lot of state. Fixes: `8d7ac0a4e4` ("vl/dri3: implement DRI3 BufferFromPixmap") Cc: Leo Liu <leo.liu@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-10-03 13:38:05 +01:00
Emil Velikov	1301674c39	st/dri: make swrast_no_present member of dri_screen Just like the dri2 options, this is better suited in the dri_screen struct. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-10-03 13:38:05 +01:00
Emil Velikov	80b62e2d6d	st/dri: inline dri2_buffer.h within dri2.c The header was used only by dri2.c, containing a two-member struct and cast wrapper. Just inline it where it's used/needed. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-03 13:38:05 +01:00
Emil Velikov	89c2c386c0	st/xa: remove unused xa_screen::d[s]_depth_bits_last Unused since the initial import. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-03 13:38:05 +01:00
Emil Velikov	5ade4b10e2	mesa: use C99 initializer in get_gl_override() The overrides array contains entries indexed on the gl_api enum. Use a C99 initializer to make it a bit more obvious. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-10-03 13:38:05 +01:00
Gabriel Majeri	f0b987646a	anv: Ensure discreteQueuePriorities is at least 2 This is the minimum value according to the spec. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-10-03 07:57:37 +02:00
Timothy Arceri	2b5f42068d	r600: use build-id when available for disk cache Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-03 09:49:21 +10:00
Timothy Arceri	397f2603eb	nouveau: use build-id when available for disk cache Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-03 09:49:21 +10:00
Timothy Arceri	2169acbf34	radeonsi: use build-id when available for disk cache Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-03 09:49:21 +10:00
Timothy Arceri	83ea8dd99b	util: add disk_cache_get_function_identifier() This can be used as a drop in replacement for disk_cache_get_function_timestamp(). Here we use build-id to generate a driver-id rather than build timestamp if available. This should resolve issues such as distros using reproducible builds and flatpak not having real build timestamps. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-03 09:49:21 +10:00
Timothy Arceri	6a884014e4	util: rename timestamp param in disk_cache_create() Only some drivers use a timestamp here. Others use things such as build-id, or even a combination of build-ids from Mesa and LLVM. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-10-03 09:49:21 +10:00
Józef Kucia	e24a4e05c7	radeonsi: avoid sending GS_EMIT in shaders without outputs Fixes GPU hangs. Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107857 Signed-off-by: Józef Kucia <joseph.kucia@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-02 17:13:52 -04:00
Fritz Koenig	08f97407fb	i965: Replace checks for rb->Name with FlipY (v2) In the GL_MESA_framebuffer_flip_y implementation _mesa_is_winsys_fbo checks were replaced with FlipY checks. rb->Name is also used to determine if a buffer is winsys. v2: Fixes annotation [for emil] Fixes: `ab05dd183c` ("i965: implement GL_MESA_framebuffer_flip_y [v3]") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-10-02 11:28:46 -07:00
Marek Olšák	2fd58d8eb2	radeonsi: initialize ac_gpu_info::name when using SI_FORCE_FAMILY so that it's not NULL when loading radeonsi and a GCN GPU is not present in the system.	2018-10-02 12:21:49 -04:00
Marek Olšák	0b062f0419	radeonsi: don't set the VS prolog key for the blit VS	2018-10-02 12:21:49 -04:00
Jason Ekstrand	58360ca09d	spirv: Move function call handling to vtn_cfg It makes way more sense for it to live there with the rest of function handling. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-10-02 10:24:56 -05:00
Jason Ekstrand	00f385e6d4	nir/from_ssa: Don't rewrite derefs destinations to registers We already call nir_rematerialize_derefs_in_use_blocks_impl prior to calling nir_lower_ssa_defs_to_regs_block so the assertion that all deref uses in the block should hold. This fixes the following CTS test when dEQP is run with SPIR-V optimization recipe 1: dEQP-VK.glsl.struct.local.loop_nested_struct_array_vertex Fixes: `606eb56ab9` "intel/nir: Only lower load/store derefs" Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-10-02 10:24:56 -05:00
Jason Ekstrand	bfc89c668e	nir/cf: Remove phi sources if needed in nir_handle_add_jump If the block in which the jump is inserted is the predecessor of a phi then we need to remove phi sources otherwise the phi may end up with things improperly connected. This fixes the following CTS test when dEQP is run with SPIR-V optimization recipe 1: dEQP-VK.glsl.functions.control_flow.return_in_nested_loop_vertex Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-10-02 10:24:56 -05:00
Eric Engestrom	7b0752fb10	anv: suppress warning about unhandled image layout Let's just be explicit that VK_NV_shading_rate_image is not supported. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Fixes: `6ee1709170` "vulkan: Update the XML and headers to 1.1.86" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2018-10-02 15:09:29 +01:00
Rob Clark	ae78489d3e	freedreno/a6xx: hwbinning Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-02 10:08:18 -04:00
Rob Clark	8ff349e564	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-10-02 10:08:18 -04:00
Jason Ekstrand	7e7959fcb7	intel/fs: Fix a typo in need_matching_subreg_offset This fixes a bunch of Vulkan subgroup tests on little core platforms. Fixes: `4150920b95` "intel/fs: Add a helper for emitting scan operations" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-10-02 07:44:25 -05:00
Timothy Arceri	ea66bfda88	util: disable cache if we have no build-id and timestamp is zero Timestamp can be zero for example when Flatpak is used. In this case just disable the cache rather than segfaulting when incompatible cache items are loaded. V2: actually return false when mtime is 0. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-02 22:07:55 +10:00
Eric Engestrom	0bdf7b1d0f	include: sync eglext.h from Khronos Signed-off-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Tapani Pälli <tapani.palli@intel.com>	2018-10-02 12:10:46 +01:00
Timothy Arceri	0e6cdfd561	radeonsi: add a workaround for bitfield_extract when count is 0 This ports the fix from `3d41757788`. Both LLVM 7 & 8 continue to have this problem. It fixes rendering issues in some menu and loading screens of Civ VI which can be seen in the trace from bug 104602. Note: This does not fix the black triangles on Vega for bug 104602. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104602 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107276	2018-10-02 08:39:51 +10:00
Jason Ekstrand	e4538b93f5	anv: Implement VK_KHR_driver_properties Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-01 13:21:12 -05:00
Jason Ekstrand	6ee1709170	vulkan: Update the XML and headers to 1.1.86 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-01 11:43:20 -05:00
Samuel Pitoiset	c2867e4c2a	radv: do not try to set DCC_CONTROL when image doesn't use DCC Unnecessary. While we are at it, remove the check for pre-VI because it's already checked earlier. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-01 12:13:12 +02:00
Samuel Pitoiset	f622ab889a	radv: add a sanity check for mutable formats and TC-compat HTILE If apps use the MUTABLE bit and the same formats as the image one in the list, we can still enable TC-compat HTILE. I don't think this happens often but given the fact that TC-compat HTILE allows a nice boost in some situations, it's worth checking. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-01 12:13:09 +02:00
Samuel Pitoiset	dc91c4d40a	radv: disable HTILE for very small depth surfaces Like we disable DCC/CMASK for small color surfaces as well. Serious Sam 2017 creates a 1x1 depth surface and I think it should be faster to do slow clears on the graphics queue instead of fast clears on compute, and eventually a depth expand if the surface isn't TC-compatible HTILE. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-01 10:16:33 +02:00
Samuel Pitoiset	6cfa321c39	radv: add potential missing fields for DB_EQAA Other drivers set these two as well, just apply the same rule. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-01 10:16:30 +02:00
Samuel Pitoiset	bd6df2f923	radv: disable complicated point clipping against user clip planes I don't think this is required by Vulkan too. Ported from RadeonSI (AMDVLK doesn't set it either). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-10-01 10:16:25 +02:00
Michel Dänzer	cb863de626	gallium/util: Clarify comment in util_init_thread_pinning As discussed in the review of the patch which added the comment: Nothing happens when a thread is created, because pthread_atfork doesn't affect creating threads. However, spawning a child process will likely crash. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-28 17:52:11 +02:00
Samuel Pitoiset	3fb4adae83	radv: do not sync CP DMA when copying buffers We already track if the DMA engine is busy/idle with a flag, and we emit a packet that waits for all CP DMA operations to be complete. This is done at end of command buffer because the kernel doesn't wait for them, and also when emitting barriers, so it should be safe. This improves small copies for both aligned and unaligned sizes. Aligned sizes: BEFORE: 1 KB: 59.840000 ms 2 KB: 71.200000 ms AFTER: 1 KB: 31.200000 ms 2 KB: 31.040000 ms Unaligned sizes: BEFORE: 2 KB: 68.3200 ms 3 KB: 79.3600 ms 5 KB: 76.6400 ms 9 KB: 90.8800 ms 17 KB: 116.0000 ms AFTER: 2 KB: 31.0400 ms 3 KB: 32.0000 ms 5 KB: 30.8800 ms 9 KB: 30.5600 ms 17 KB: 29.6000 ms Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-28 09:08:52 +02:00
Samuel Pitoiset	621e70dd40	radv: adjust the CmdUpdateBuffer threshold for optimal performance According to my benchmark results, it appears that we should reduce the threshold to 1024. BEFORE: 1 KB: 68.656000 ms 2 KB: 118.368000 ms AFTER: 1 KB: 31.760000 ms 2 KB: 29.840000 ms Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-28 09:08:44 +02:00
Samuel Pitoiset	5d6a560a29	radv: do not use the availability bit for timestamp queries It's unnecessary because we can just check if the timestamp is different from the default value when a pool is created or reset. Instead of waiting for the availability bit to be 1, we have to emit a not equal WAIT_REG_MEM for checking if the timestamp is ready. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-09-28 09:08:03 +02:00
Kristian H. Kristensen	3e90505224	freedreno/a6xx: Build up draw dword0 outside visibility if statement Pulling this logic out means we can share the logic and avoid a couple of temporary variables that helped make things clearer before. Note that in either vismode case, we always program vismode 0. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-09-27 16:08:52 -04:00
Kristian H. Kristensen	74a87cdaa6	freedreno/a6xx: Simplify draw_emit() branches a bit Now that we've copied the emit logic into each branch of the if (info->index_size) statement, we can simplify the logic a bit according to which case we're in. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-09-27 16:08:52 -04:00
Kristian H. Kristensen	2516073cb6	freedreno/a6xx: Copy OUT_RING() part into each branch of the index if Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-09-27 16:08:52 -04:00
Kristian H. Kristensen	c3d58d9ffc	freedreno/a6xx: Split fd6_draw_emit into direct and indirect paths This splits the two code paths into separate functions and moves the "if (info->indirect)" test into draw_impl(). Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-09-27 16:08:52 -04:00
Kristian H. Kristensen	adcd83fb22	freedreno/a6xx: Inline fd6_draw() Simplify the code a bit by inlining this helper. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-09-27 16:08:52 -04:00
Kristian H. Kristensen	fb1c6b89a2	freedreno/a6xx: Move emit_marker and wfi to draw_impl() This way the markers clearly bracket the draw call and isn't duplicated for both direct and indirect draw code. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-09-27 16:08:52 -04:00
Kristian H. Kristensen	0559050557	freedreno/a6xx: Move inline functions out of fd6_draw.h Only used in fd6_draw.c so put them there. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-09-27 16:08:52 -04:00
Hyunjun Ko	1a40faa864	freedreno: fix a typo in launch_grid	2018-09-27 16:06:19 -04:00
Hyunjun Ko	aef410f31e	freedreno/ir3: fix the param order of cmpxchg According to the following definition, int AtomicCompSwap(inout int mem, uint compare, uint data); the preceding one in atomic_comp_swap of NIR is compare and data is followed, while src0 for cmpxchg needs vec2(data, compare) So for ssbo/image deref comp_swap, that should be reversed. Fixes: dEQP-GLES31.functional.image_load_store..atomic.comp_swap	2018-09-27 16:05:49 -04:00
Rob Clark	49d22c2dfc	freedreno/a6xx: fix shaders w/ >= 24 regs Possibly these bits mean something else now. Blob always seems to use FOUR_QUADS, and changing to TWO_QUADS seems to cause different threads to overlap registers. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-27 15:49:14 -04:00
Rob Clark	6530fcc4a7	freedreno/a6xx: fix gl_FragCoord.w Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-27 15:45:44 -04:00
Rob Clark	919741b8d5	freedreno: handle invalidated buffers harder Do a better job of skipping mem2gmem/gmem2mem.. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-27 15:41:46 -04:00
Rob Clark	19e9d28646	freedreno/a6xx: fix constlen Fix a few bits of confusion, as with previous gen's constlen is aligned to 4, and value in bitfield is left-shifted by 2 (ie. divided by 4). But this is done by the CONSTLEN() accessor/builder fxn, so don't do it twice. Also HLSQ_FS_CNTL.CONSTLEN is not special. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-27 15:33:10 -04:00
Rob Clark	12de415ad1	freedreno: fix inorder rendering case Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-27 15:32:39 -04:00
Rob Clark	b65b6f7606	freedreno/a6xx: backface stencil state Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-27 15:31:56 -04:00
Rob Clark	93db15d300	freedreno/a6xx: fix gpu crash with separate-stencil Fixes a crash in (of all things) dEQP-GLES2.info.vendor with --deqp-surface-type=fbo.. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-27 15:31:34 -04:00
Rob Clark	a52ef80d24	freedreno/a6xx: fix MRT config Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-27 15:30:36 -04:00
Rob Clark	8930e83642	freedreno: fix potential hang when destroying batch batch_flush_reset_dependencies() expects to be called unlocked, and can call fd_batch_reference() which can try to acquire the screen lock again. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-27 15:29:45 -04:00
Rob Clark	ef6d15f8a8	freedreno: fix corrupted fb state In `c3d9f29b` we allowed ctx->batch to be null, and started tracking the current framebuffer state in fd_context. But the existing logic in fd_blitter_pipe_begin() would, if !ctx->batch, set null fb state to be restored after blit. Which broke the world of deqp (and probably other things) Fixes: `c3d9f29b78` freedreno: allocate ctx's batch on demand Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-27 15:27:38 -04:00
Rob Clark	5bb96bf73a	freedreno: simplify pctx->clear() This is defined to always clear the entire surface(s) specified, regardless of scissor state.. mesa/st will turn scissored clears into a draw. So rip out a bunch of unnecessary machinery. Also remove a comment that was obsolete since using u_blitter to turn clear into draw (for the cases where there isn't a hw blitter fast-path). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-27 15:26:32 -04:00
Rob Clark	a7fa44cd33	freedreno: fix FD_MESA_DEBUG=flush The logic to force a flush every draw was short-circuited with newer kernels. Also it should apply to clears as well. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-27 15:25:49 -04:00
Rob Clark	83c5c026ee	freedreno: fix scissor state emit The effective scissor changes based on rasterizer->scissor flag, so we need to re-emit scissor state when rasterizer state changes. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-27 15:25:24 -04:00
Rob Clark	106f18258a	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-27 15:25:01 -04:00
Erik Faye-Lund	c3486cd8c9	st/mesa: do not call update_framebuffer_size with NULL pointer In st_renderbuffer_alloc_storage, we avoid allocating storage for zero-sized buffers, leading to this pointer being NULL. We already take care to avoid dereferencing these pointers for color-buffers, but not for depth/stencil-buffers. So let's tread a bit more carefully here. This avoids a crash while running Piglit's glx/glx-visuals-stencil test, both on virgl and r600g. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Guillaume Charifi <guillaume.charifi@sfr.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-27 10:33:44 +02:00
Maxime	dd333c66bd	vulkan: Disable randr lease for libxcb < 1.13 Since the Randr lease code was added, compiling against libxcb 1.12 no longer works. CC: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108024 Fixes: `7ab1fffcd2` Tested-By: Maxime <berillions@gmail.com> Fixes: `7ab1fffcd2` "vulkan: Add EXT_acquire_xlib_display [v5]"	2018-09-27 16:31:42 +10:00
Bas Nieuwenhuizen	40585ddb48	radv: Remove garbage comment. Trivial.	2018-09-27 02:04:06 +02:00
Bas Nieuwenhuizen	0207ebcbf1	radv: Do not use multiple draws for multisample copies. Use sample rate shading instead, should give better locality. Makes Nier with 8x msaa on a Raven go 5 fps -> 7 fps in the menu. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-09-27 02:04:00 +02:00
Jordan Justen	ca1d3fc538	anv: If softpin is supported, use it with the hiz clear value bo Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-09-26 10:21:23 -07:00
Jordan Justen	2a97390552	anv: s/batch/value_bo/ on anv_device_init_hiz_clear_batch Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-09-26 10:21:23 -07:00
Dylan Baker	e9bd071f49	docs: update calendar, add news and link release notes for 18.1.9	2018-09-26 09:44:40 -07:00
Dylan Baker	d4bdcf5d22	docs: Add sha256 sums to 18.1.9	2018-09-26 09:41:53 -07:00
Dylan Baker	4769f49455	docs: Add 18.1.9 release notes	2018-09-26 09:40:56 -07:00
Jason Ekstrand	b3f477ef7a	intel/isl: Add a unit suffixes to some struct fields and variables I was about to make the claim to someone that every field in isl_surf is either an enum or has explicit units. Then I looked at isl_surf and discovered this claim was wrong. We should fix that. This commit does a few refactors: * Add _B suffixes to some struct fields * Add _B to some variables and parameters * Rename row_pitch_tiles -> row_pitch_tl Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-09-26 08:52:26 -05:00
Axel Davy	0d495bec25	radeonsi: NaN should pass kill_if Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=105333 Fixes: https://github.com/iXit/Mesa-3D/issues/314 For this application, NaN is passed to KILL_IF and is expected to pass. v2: Explain in the code why UGE is used. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> CC: <mesa-stable@lists.freedesktop.org>	2018-09-25 22:05:24 +02:00
Axel Davy	46814e771a	st/nine: Do not mark both ff vs and ps updated Previously if only ff vs or only ff ps was used, the constants for both were marked as updated, while only the constants of the used ff shader were updated. Now that NINE_STATE_FF_VS and NINE_STATE_FF_PS do not intersect anymore, we can correctly mark the correct set of constant as updated. Fixes: https://github.com/iXit/Mesa-3D/issues/319 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	8e0526555d	st/nine: Split NINE_STATE_FF_OTHER NINE_STATE_FF_OTHER was mostly ff vs states. Rename it to NINE_STATE_FF_VS_OTHER and move common states with ps to NINE_STATE_FF_PS_CONSTS (renamed from NINE_STATE_FF_PSSTAGES). Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	5f7a41c33b	st/nine: Add dummy ff shader state Some states only affect the ff shader, not its constants. Currently we don't check anything and always recompute the ff shader key. However we do check for NINE_STATE_FF_OTHER and if set we reupload some constants. Thus for those states which had NINE_STATE_FF_OTHER set but didn't need it, replace by a dummy ff shader state (which is easier to understand for an external reader than just setting 0 and more future proof). Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	f6bf1d2db0	st/nine: Mark pointsize states as ff states The pointsize states were missing the ff NINE_STATE_FF_OTHER flag, and thus might miss state updates when using ff. Fixes some wine tests. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	89beea100f	st/nine: Minor refactor of a few NINE_STATE_* flags Rename NINE_STATE_FOG_SHADER, NINE_STATE_POINTSIZE_SHADER and NINE_STATE_PS1X_SHADER into NINE_STATE_VS_PARAMS_MISC and NINE_STATE_PS_PARAMS_MISC. The behaviour is unchanged, except one minor change: D3DRS_FOGTABLEMODE doesn't need to affect VS. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	7ae2509ce0	st/nine: Increase maximum number of temp registers With some test app I hit the limit. As we allocate on demand (up to the maximum), it is free to increase the limit. Signed-off-by: Axel Davy <davyaxel0@gmail.com> CC: <mesa-stable@lists.freedesktop.org>	2018-09-25 22:05:24 +02:00
Axel Davy	dc4b53e129	st/nine: Lock the entire buffer in some cases. Previously we had already found that for MANAGED buffers the buffer started dirty (which meant all writes out of bound before the first draw call using the buffer have to be taken into account). Possibly it is the same for the other types of buffers. For now always lock the entire buffer (starting from the offset) for these (except for DYNAMIC buffers, which might hurt performance too much). Fixes: https://github.com/iXit/Mesa-3D/issues/301 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	0eeb583650	st/nine: Don't call SetCursor until a cursor is set The previous code was ignoring the input until a cursor is set inside d3d (with SetCursorProperties), as expected by wine tests. However it did still make a call to ID3DPresent_SetCursor, which would result in a SetCursor(NULL) call, thus hiding any cursor set outside d3d, which we shouldn't do. Add comment about not avoiding redundant ID3DPresent_SetCursor calls once a cursor has been set in d3d, as it has been tested to cause regressions. Fixes: https://github.com/iXit/Mesa-3D/issues/197 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	dcfde02bb0	st/nine: Avoid redundant SetCursorPos calls For some applications SetCursorPosition is called when a cursor event is received. Our SetCursorPosition was always calling wine SetCursorPos which would trigger a cursor event. The infinite loop is avoided by not calling SetCursorPos when the position hasn't changed. Found thanks to wine tests. Fixes unresponsive GUI for some applications. Fixes: https://github.com/iXit/Mesa-3D/issues/173 Signed-off-by: Axel Davy <davyaxel0@gmail.com> CC: <mesa-stable@lists.freedesktop.org>	2018-09-25 22:05:24 +02:00
Axel Davy	112c770597	st/nine: Init cursor position at device creation This is only useful for software cursor, but at least now we won't start it at (0, 0). Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	62ea55ec8b	st/nine: Initialize manually cursor structure Initialize manually the cursor structure fields for more clarity on its content. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	110950318c	st/nine: Check if format is DS before retrieving flags d3d9_get_pipe_depth_format_bindings assumes the input format is a depth stencil format. Previously the user could hit this function with an invalid format. Protect the last non protected call with a depth_stencil_format check. Another solution is to have d3d9_get_pipe_depth_format_bindings support non depth stencil format, but we don't want the user to create depth buffers with d3d formats that can't be one, it's better to check if the format can be depth buffer with d3d. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	af60fbc0a4	st/nine: Remove clamping when mul_zero_wins Tests show the clamping can be removed when mul_zero_wins is supported. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	a0afa80889	st/nine: Implement predicated instructions Most of the work was already there, just not implemented. Fixes: https://github.com/iXit/Mesa-3D/issues/318 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	e7e82bcdc9	st/nine: Fix aliased read in ff Fix aliasing of colorarg_b4 with colorarg_b5. Fixes: https://github.com/iXit/Mesa-3D/issues/302 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	9fc6aa1bbe	st/nine: Fix ff assignment with aliasing "tex_stage[s][D3DTSS_COLORARG0] >> 4" could be a two bit number, thus colorarg_b4 was incorrectly set. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	8c35fb0280	st/nine: Clarify some ff assignments colorarg0, etc are 3 bits wide. Make the code more readable by adding an & 0x7 to further indicate we only remember the first 3 bits only. The 4th bit is always 0, and colorarg_b4, colorarg_b5, etc are used to store the 5th and 6th bits. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	59aaeeb730	st/nine: Print transform matrices in debug This is useful to see the matrices content in the log to debug. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	d9da0a1f6d	st/nine: Add ff key hash to help debug This is very useful to find in the log the ff shader source of a given call. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	fcbb00a502	st/nine: Avoid RefToBind calls in ff When using csmt, ff shader creation happens on the csmt thread. Creating the shaders, then calling RefToBind causes the device ref to be increased then decreased. However the device dtor assumes that no work pending on the csmt thread could increase the device ref, leading to hang. The issue is avoided by creating the shaders with a bind count directly. Fixes: https://github.com/iXit/Mesa-3D/issues/295 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	e83b15cba0	st/nine: Add new helper for object creation with bind Add a new helper to create objects starting with a bind count instead of a ref count. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	fd86ce7c14	st/nine: Add parameter to start with bind Add a parameter to start new object with a bind instead of a refcount. Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	a9bf82ecf4	st/nine: Use perspective correction for ps depth fog Emulate perspective interpolation of depth for programmable ps fog ff ps fog uses position z, or 1/w depending on the ff projection matrix set. This is according to public documents found describing the algorithm and tests we made. In the case of programmable ps, we used position's z, which was sufficient to pass wine tests (which test shaders don't set w). Issue https://github.com/iXit/Mesa-3D/issues/315 showed that this calculation was wrong. Using perspective interpolation on z, that is using z * 1/w seems to satisfy both this application and wine tests. Fixes: https://github.com/iXit/Mesa-3D/issues/315 Signed-off-by: Axel Davy <davyaxel0@gmail.com>	2018-09-25 22:05:24 +02:00
Axel Davy	7ee5e5e239	st/nine: Clamp RCP when 0inf!=0 Tests done on several devices of all 3 vendors and of different generations showed that there are several ways of handling infs and NaN for d3d9. Tests showed Intel on windows does always clamp RCP, RSQ and LOG (thus preventing inf/nan generation), for all shader versions (some vendor behaviours vary with shader versions). Doing this in nine avoids 0inf issues for drivers that can't generate 0*inf=0 (which is controlled by TGSI's MUL_ZERO_WINS). For now clamp for all drivers. A later optimization would be to avoid clamping for drivers with MUL_ZERO_WINS for the specific shader versions where NV or AMD don't clamp. LOG and RSQ being already clamped, this patch only clamps RCP. Fixes: https://github.com/iXit/Mesa-3D/issues/316 Signed-off-by: Axel Davy <davyaxel0@gmail.com> CC: <mesa-stable@lists.freedesktop.org>	2018-09-25 22:05:23 +02:00
Jan Vesely	1f3fe4aaeb	.travis: Drop note about Clover builds being slow SWR takes 17+ minutes to build. Clover builds take ~6-7 minutes. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-09-25 14:08:06 -04:00
Jan Vesely	cb1b109733	.travis: Add LLVM-7 Clover build Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-09-25 14:08:06 -04:00
Caio Marcelo de Oliveira Filho	3cf07361ac	intel/compiler: Export TCS passthrough creation Move create_passthrough_tcs() from i965 so can be used in other contexts. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-09-25 09:16:31 -07:00
Gert Wollny	47a6f98e15	mesa/st: In the presence of integer buffers enable per buffer blending Since blending will be disabled later for integer formats we have to consider that in the case of a mixed set of integer/non-integer format buffers blending must be handled on a per buffer basis. Fixes on r600: dEQP-GLES31.functional.draw_buffers_indexed.random. max_required_draw_buffers.13 Fixes: `8fb966688b` st/mesa: Disable blending for integer formats. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-25 15:54:38 +02:00
Eric Engestrom	97ae5a858d	meson+autotools: get rid of spammy GCC warning -Wformat-truncation That warning fires every time a string function takes an argument that could possibly be longer than its max output, which triggers all over the place, especially when working with file paths ("what if every file path is MAX_PATH long?" is what GCC is saying, which is really annoying when we know that "/dev/dri/cardN" is not gonna be 4096 char long and it's safe to store it in a 32-char array). Anyway, we either add a ton of dead code all over the place to make GCC happy, or we get rid of its spam. I chose the latter. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2018-09-25 11:40:08 +01:00
Eric Engestrom	1a37a80bf6	meson: make it trivial to add other -Wno-foo CFLAGS Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-09-25 11:39:56 +01:00
Eric Engestrom	f5b41f9121	gallivm: ensure string is null-terminated instead of assert()ing Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-09-25 11:39:30 +01:00
Topi Pohjolainen	1cc17fb731	intel/compiler/icl: Use barrier id bits 24:30 instead of 24:27,31 Fixes gpu hangs with Carchase and Manhattan. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-09-25 09:59:59 +03:00
Andres Rodriguez	ec1fcf92ae	radv: only emit ZPASS_DONE for timestamp queries on gfx queues A ZPASS_DONE packet doesn't make sense for the compute queue. It will result in a gpu hang. This change resolves a gpu hang for SteamVR+Vega. Cc: mesa-stable@lists.freedesktop.org Fixes: `1f616a840e` "radv: emit a dummy ..." Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-09-25 02:30:34 -04:00
Timothy Arceri	72e4287e8f	radv: make use of nir_lower_load_const_to_scalar() This allows NIR to CSE more operations. LLVM does this also so the impact is limited, however doing this in NIR allows other opts to make progress. For example in radeonsi more loops are unrolled in Civilization Beyond Earth. The actual pipeline-db stats are not overwhelming but even in the negatively affected shaders the NIR is clearly better. It just happens that the code shuffling and in some cases calls to max rather than a flt result in the final output from LLVM not giving as good numbers. However this is an incremental opt that further passes build off so the change should be made IMO. Totals from affected shaders: SGPRS: 20192 -> 20184 (-0.04 %) VGPRS: 19516 -> 19524 (0.04 %) Spilled SGPRs: 437 -> 444 (1.60 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 1527444 -> 1522276 (-0.34 %) bytes LDS: 6 -> 6 (0.00 %) blocks Max Waves: 1018 -> 1016 (-0.20 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-25 09:31:22 +10:00
Dylan Baker	f03a160592	meson: de-duplicate LLVM check By adding `_llvm == 'true'` to the required argument we can check the 'auto' and 'true' case in one path. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-09-24 13:02:07 -07:00
Eric Engestrom	f2519e3493	vulkan/wsi/display: wsi_display_select_crtc() doesn't need to modify the connector Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-09-24 17:38:11 +01:00
Eric Engestrom	bde3102c0d	vulkan/wsi/display: check if wsi_swapchain_init() succeeded Fixes: `da997ebec9` "vulkan: Add KHR_display extension using DRM [v10]" Cc: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-09-24 17:37:43 +01:00
Leo Liu	3e7b5e5db2	radeon/uvd: use bitstream coded number for symbols of Huffman tables Signed-off-by: Leo Liu <leo.liu@amd.com> Fixes: 130d1f456(radeon/uvd: reconstruct MJPEG bitstream) Cc: "18.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-09-24 09:12:49 -04:00
Rhys Perry	6ca1402c11	nv50/ir: fix link-time build failure Seems this fixes linking problems that occur in some situations. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-09-23 18:20:08 +01:00
Rhys Perry	b473fcc9a3	nvc0: fix bindless multisampled images on Maxwell+ NVC0_CB_AUX_BINDLESS_INFO isn't written to on Maxwell+ and it's too small anyway. With these changes, TXQ is used to determine the number of samples and the coordinate adjustment information looked up in a small array in the driver constant buffer. v2: rework to use TXQ and a small array instead of a larger array with an entry for each texture v3: get rid of the small array and calculate the adjustments in the shader Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `c2ae9b4052` ('nvc0: implement multisampled images on Maxwell+') Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-09-22 20:13:17 +01:00
Eric Engestrom	ed797f6597	docs: fix couple typos/outdated info `git-branch` doesn't exist, and mesa3d-dev hasn't been used in a great many years :) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-09-22 17:23:18 +01:00
Eric Engestrom	ae2694efe0	docs: update repo URLs after GitLab move I also updated the developer instructions; presumably someone who's been given commit rights already knows how to clone a repository :) A more useful thing is to show how to update the pushurl, and how to use access tokens to push over HTTPS (especially for us at Intel, where non-http traffic is a pain). Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-09-22 17:23:18 +01:00
Stuart Young	c95dd966c4	docs: Update FAQ with respect to s3tc support It's just over 10 months since 17.3.0 was released with s3tc support enabled. Probably a good idea to update the FAQ page. v2: Incorporate feedback from Adam Jackson <ajax@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Fixes: `04396a134f` ("mesa: Import libtxc_dxtn sources") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-09-22 17:23:18 +01:00
Rhys Perry	f580a895b1	nvc0: warn about changing NVC0_CB_AUX_MP_INFO and NVC0_CB_AUX_DRAW_INFO Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-09-22 16:50:39 +01:00
Rhys Perry	01fa76b707	nvc0: Update counter reading shaders to new NVC0_CB_AUX_MP_INFO Fixes: `66ca7e400b` ('nvc0: add support for programmable sample locations') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-09-22 16:50:22 +01:00
Eric Anholt	cd667edecc	vc4: Remove dead i == 0 code from the cos() implementation. The loop starts at 1.	2018-09-21 17:16:43 -07:00
Eric Anholt	10d5d2d527	vc4: Fix sin(0.0) and cos(0.0) accuracy to fix SDL rendering rotation. SDL has some shaders that compute sin(angle) and cos(angle) for a rotation matrix in the VS, and angle is usually 0.0. Our previous implementation had quite a bit of error around 0.0, causing single-pixel rotations at typical window sizes. SDL2 has changed as of August 28th (commit 12156:e5a666405750) to not need sin/cos in the VS, but we should still fix this for existing implementations or similar patterns that other programs may have. glsl-cos goes from 32 instructions to 36, but 9 uniforms to 7. glsl-sin goes from 32 instructions to 34, but 8 uniforms to 7. This seems like a fine impact to have for the bugfix. Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org> Fixes: https://github.com/anholt/mesa/issues/110	2018-09-21 17:16:43 -07:00
Anuj Phogat	a0baedb638	intel/icl: Fix URB size for different SKUs Different ICL SKUs have different URB sizes. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-09-21 14:40:04 -07:00
Anuj Phogat	fa1ff71a0f	i965/icl: Set Enabled Texel Offset Precision Fix bit h/w specification requires this bit to be always set. V2: Fix bit mask (Chris Wilson) Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-09-21 14:40:04 -07:00
Anuj Phogat	5eb173304b	anv/icl: Set Enabled Texel Offset Precision Fix bit h/w specification requires this bit to be always set. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-09-21 14:40:04 -07:00
Alex Deucher	afb7c6b301	pci_ids: add new polaris pci id Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: mesa-stable@lists.freedesktop.org	2018-09-21 14:33:13 -05:00
Marek Olšák	f0cd7dbcd7	glsl_to_tgsi: invert gl_SamplePosition.y for the default framebuffer Fixes dEQP-GLES31.functional.shaders.sample_variables.sample_pos.correctness.default_framebuffer with --deqp-gl-config-name=rgba8888d24s8ms4 Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>	2018-09-21 13:39:00 -04:00
Caio Marcelo de Oliveira Filho	b29ec31854	util: Add macro to get number of elements in dynarray Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-09-21 10:12:51 -07:00
Dylan Baker	be56f8a788	docs/meson: Add note about llvm-config$version and llvm-config-$version v2: - fix typo These are how FreeBSD and Debian handle multiple versions of LLVM installed at the same time, respectively. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-09-21 10:03:15 -07:00
Dylan Baker	e0829f9c1a	docs/meson: Update notes on using CFLAGS and -Dc_args v2: - Use ${} to denote variables instead of just $ - fix spelling error bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107313 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-09-21 10:03:15 -07:00
Dylan Baker	1da60667b5	docs: update meson docs to reflect the current status v2: - minor grammar changes - fix typo Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-09-21 10:03:15 -07:00
Dylan Baker	509ea4649a	meson: Don't force libva to required from auto We already correctly handle va being auto, but we force it to being true, which is bad. Fixes `94cf397092` ("meson: Fix auto option for va") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-09-21 10:03:15 -07:00
Dylan Baker	5dcb77e491	meson: Don't compile pipe loader with dri support when not using dri Corrects building glx as gallium-xlib without any dri targets. v2: - fix ugly formatting Fixes: `66c94b9313` ("meson: build gallium winsys for dri, null, and wrapper") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-09-21 10:03:15 -07:00
Samuel Pitoiset	fe3f13cc5a	radv: use the resolve compute path if dest uses multiple layers The hardware path doesn't support resolving layers, for both source and destination images. This fixes a reflection issue when MSAA is enabled which affects GTA V and probably DIRT3. CC: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107786 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Gregor Münch <gr.muench_at_gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-21 16:35:59 +02:00
Jason Ekstrand	ab80889e92	anv,radv: Implement vkAcquireNextImage2 This was added as part of 1.1 but it's very hard to track exactly what extension added it. In any case, we should implement it. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Dave Airlie <Airlied@redhat.com>	2018-09-21 07:02:35 -05:00
Juan A. Suarez Romero	24bacaddef	docs: update calendar, add news and link release notes to 18.2.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-09-21 13:09:21 +02:00
Juan A. Suarez Romero	eefc77e691	docs: add sha256 checksums for 18.2.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `686eab6642`)	2018-09-21 13:06:14 +02:00
Juan A. Suarez Romero	17fbb1ef74	docs: add release notes for 18.2.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `3c8c851fe4`)	2018-09-21 13:06:12 +02:00
Samuel Pitoiset	674fcfaecc	radv: only enable shaderInt16 on GFX9+ and LLVM7+ The throughput is similar to 32-bit integers on GFX8 and AMDVLK does not expose 16-bit integers on pre Vega as well. On GFX9+, only LLVM 7+ has support. This fixes a bunch of CTS crashes on GFX9/LLVM 6. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-21 10:56:17 +02:00
Marek Olšák	945e9cdb2b	docs/features: add EXT_direct_state_access features Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-09-21 03:01:58 -04:00
Bas Nieuwenhuizen	0a77e70d10	radv: Fix driver UUID SHA1 init. Was missing the init, found by Emil. Fixes: `d17443a459` "radv: Use build ID if available for cache UUID." CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-09-20 23:38:38 +02:00
Charmaine Lee	64731e7c5e	svga: fix uninitialized fields in DefineDepthStencilView/DefineStreamOutput This patch fixes uninitialized fields in DefineDepthStencilView and DefineStreamOutput commands that are not relevant in SM4 device. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-20 13:20:10 -06:00
Brian Paul	7f4e6f4c97	r300g: add PIPE_SHADER_CAP_SCALAR_ISA switch case to silence warning Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-09-20 13:20:10 -06:00
Brian Paul	198c50f487	st/mesa: silenced unhandled enum warning in st_glsl_to_tgsi.cpp Add ir_intrinsic_begin_fragment_shader_ordering switch case to silence warning Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-09-20 13:20:10 -06:00
Brian Paul	35ea66a68e	mesa: use GLsizeiptrARB, GLintptrARB in bufferobj.c The function pointer declarations in dd.h for the BufferData() and BufferSubData() use the ARB-suffixed datatypes. This patch changes the buffer_data_fallback() and buffer_sub_data_fallback() functions to use those datatypes too. This fixes a build warning when building 32-bit libraries. Evidently, GLsizeiptrARB and GLsizeiptr are defined differently in that situation. All implementations of these driver hooks use the ARB-suffixed types. Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-09-20 13:20:10 -06:00
Neha Bhende	708d34d41a	svga: Enable Opengl 3.3 compatibility profile With this patch, svga driver will start advertising OpenGL 3.3 compatibility profile. Tested with some mesa demos, piglit and glretrace. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-20 13:20:10 -06:00
Neha Bhende	ede805dd19	svga: Apply texcoord scale factors only if there is sampler view We need to convert unnormalized texcoords to normalized texcoords when we are sampling from texture. We don't need this conversion if there is no sampler view. Tested with piglit, glretrace Fixes vmware bug 2101970 Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-20 13:20:10 -06:00
Charmaine Lee	1dcf377a76	svga: fix texture array layer index in transfer map In gallium, the layer index of a texture array to be mapped is specified in the z component, whereas in svga device, the index is specified in a separate argument. Currently in svga_texture_transfer_map(), we explicitly modify the z value in the base transfer map to 0 so the layer offset will not be applied twice, but this causes problem when state tracker later refers to the base transfer map and expects the slice index to be specified in z (commit `463b0ea1f6`). To fix the problem, this patch makes a local copy of the box in svga_transfer and modifies the z value in this copy instead. Fixes spec@khr_texture_compression-astc piglit test crashes. Fixes regression in the dma path with commit 1fdd3dd94a. Tested with mtt glretrace, piglit on Windows VM and Linux VM. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-20 13:20:10 -06:00
Dylan Baker	18a6e426f3	Revert "utils/u_math: break dependency on gallium/utils" This reverts commit `0abce6d770`. Which broke the windows build.	2018-09-20 10:36:33 -07:00
Caio Marcelo de Oliveira Filho	2567ad28bb	i965: remove outdated comment about TCS passthrough Since commit `75881bed9e` "i965: Rework the TCS passthrough shader to use NIR." the created nir_shader is not dummy, and it is compiled by the backend like the others. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-09-20 09:58:55 -07:00
Christoph Haag	b01834b56c	meson: add option to statically link llvm Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-09-20 06:08:50 -07:00
Dylan Baker	0abce6d770	utils/u_math: break dependency on gallium/utils Currently u_math needs gallium utils for cpu detection. Most of what u_math uses out of u_cpu_detection is duplicated in src/mesa/x86 (surprise!), so I've just reworked it as much as possible to use the x86/common_x86_features.h macros instead of the gallium ones. The mesa implementation is a header only approach, with no external dependencies. There is one small function that was copied over, as promoting u_cpu_detection is itself a fairly hefty undertaking, as it depends on u_debug, and this fixes the bug for now. bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107870 Tested-by: Vinson Lee <vlee@freedesktop.org>	2018-09-20 05:52:23 -07:00
Emil Velikov	b8b3517a49	egl/android: rework device probing Unlike the other platforms, here we aim to guess if the device that we somewhat arbitrarily picked, is supported or not. In particular: when a vendor is _not_ requested we loop through all devices, picking the first one which can create a DRI screen. When a vendor is requested - we use that and do _not_ fall-back to any other device. The former seems a bit fiddly, but considering EGL_EXT_explicit_device and EGL_MESA_query_renderer are MIA, this is the best we can do for the moment. With those (proposed) extensions userspace will be able to create a separate EGL display for each device, query device details and make the conscious decision which one to use. v2: - update droid_open_device_drm_gralloc() - set the dri2_dpy->fd before using it - return a EGLBoolean for droid_{probe,open}_device* - do not warn on droid_load_driver failure (Tomasz) - plug mem leak on dri2_create_screen failure (Tomasz) - fixup function name typo (Tomasz, Rob) v3: - add forward declaration for droid_load_driver() Fixes the HAVE_DRM_GRALLOC build (Mauro) - split dup() assignment and check in separate lines (Tomasz, Eric) - make droid_load_driver() static (Tomasz) - drop unused prop_set variable (Tomasz) v4: - rebase - fwd declaration should be for droid_probe_device() Cc: Robert Foss <robert.foss@collabora.com> Cc: Tomasz Figa <tfiga@chromium.org> Cc: Mauro Rossi <issor.oruam@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Tested-by: Tomasz Figa <tfiga@chromium.org> Tested-by: Tapani Pälli <tapani.palli@intel.com>	2018-09-20 10:15:38 +01:00
Danylo Piliaiev	18be7403a1	glsl: Add an assert when cloning ir_dereference_record with invalid field Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-09-20 08:30:11 +10:00
Danylo Piliaiev	6f3c7374b1	glsl: Avoid propagating incompatible type of initializer do_assignment validated assignment but when rhs type was not compatible it proceeded without issues and returned error_emitted = false. On the other hand process_initializer expected do_assignment to always return compatible type and never fail. As a result when variable was initialized with incompatible type the type of variable changed to the incompatible one. This manifested in unnecessary error messages and in one case in crash. Example GLSL: vec4 tmp = vec2(0.0); tmp.z -= 1.0; Past error messages: initializer of type vec2 cannot be assigned to variable of type vec4 invalid swizzle / mask `z' type mismatch operands to arithmetic operators must be numeric After this patch: initializer of type vec2 cannot be assigned to variable of type vec4 In the other case when we initialize variable with incompatible struct, accessing variable's field led to a crash. Example: uniform struct {float field;} data; ... vec4 tmp = data; tmp.x -= 1.0; After the patch there is only error line without a crash: initializer of type #anon_struct cannot be assigned to variable of type vec4 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107547	2018-09-20 08:30:11 +10:00
Michal Srb	194bf0a2e0	st/dri: don't set queryDmaBufFormats/queryDmaBufModifiers if the driver does not implement it This is equivalent to commit `a65db0ad1c`, but for dri_kms_init_screen. Without this gbm_dri_is_format_supported always returns false. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104926 Fixes: `e14fe41e0b` ("st/dri: implement createImageFromRenderbuffer(2)") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Tested-by: Adam Williamson <adamwill@fedoraproject.org>	2018-09-19 15:20:04 -04:00
Jason Ekstrand	c811af767e	anv/so_memcpy: Don't consider src/dst_offset when computing block size The only thing that matters is the size since we never specify any offsets in terms of blocks. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-09-19 09:38:04 -05:00
Jakob Bornecrantz	09171705d5	Revert "mesa: only update framebuffer-state for clears" This reverts commit `fb86365148`.	2018-09-19 15:21:26 +01:00
Samuel Pitoiset	121f226471	radv: use a 64-bit unsigned integer when allocating a descriptor pool pool->size is a 64-bit unsigned integer too. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-19 13:36:12 +02:00
Samuel Pitoiset	35656823b9	radv: enable VK_SUBGROUP_FEATURE_ARITHMETIC_BIT All CTS pass on Polaris/Vega with LLVM 6, 7 and master, so I think it's safe to enable the feature. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-19 13:36:10 +02:00
Samuel Pitoiset	febdc13a6c	radv: do not support blitting surfaces with depth and stencil Fixes: dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_optimal_nearest And all friends that try to blit a surface with different depth and stencil formats. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-19 13:36:07 +02:00
Erik Faye-Lund	fb86365148	mesa: only update framebuffer-state for clears If we update the program-state etc, we risk compiling needless shaders, which can cost quite a bit of performance. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-19 11:52:53 +02:00
Juan A. Suarez Romero	0c82e3603e	nir: add initializer data to fix MSVC compile error CC: Jason Ekstrand <jason@jlekstrand.net> Fixes: 82799a5d1b8 ("nir: Add a small pass to rematerialize derefs per-block") Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-09-19 11:46:44 +02:00
Jason Ekstrand	976046a8d8	nir: Add some asserts that we don't put derefs in phis The lcssa and phis_to_regs passes are used by various NIR optimizations that modify the CFG. Putting a couple of asserts will help ensure that we don't accidentally put derefs in phis as part of an optimization pass. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-09-19 02:00:49 -05:00
Jason Ekstrand	864c780566	nir/opt_if: Re-materialize derefs in use blocks before peeling loops Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107879 Cc: "18.2" <mesa-stable@lists.freedesktop.org>	2018-09-19 02:00:49 -05:00
Jason Ekstrand	0796c3934e	nir/loop_unroll: Re-materialize derefs in use blocks before unrolling When we're about to re-arrange a bunch of blocks, it's a good idea to make sure that we don't have deref uses crossing block boundaries. Otherwise we may end up with a deref going through a phi and that would be bad. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "18.2" <mesa-stable@lists.freedesktop.org>	2018-09-19 01:59:40 -05:00
Jason Ekstrand	7d1d1208c2	nir: Add a small pass to rematerialize derefs per-block This pass re-materializes deref instructions on a per-block basis to ensure that every use of a deref occurs in the same block as the instruction which uses it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "18.2" <mesa-stable@lists.freedesktop.org>	2018-09-19 01:59:40 -05:00
Kenneth Feng	4490fce166	amd: Add Picasso device id No changes here compared to Raven. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>	2018-09-18 18:05:17 -04:00
Bas Nieuwenhuizen	95bb7d82ca	Revert "radv: fix descriptor pool allocation size" This reverts commit `90819abb56`. This logic was wrong, the original code is correct. The direct impact is that we allocate up to approximately a squared amount of memory compared to what we should allocate. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-09-18 22:51:42 +02:00
Samuel Pitoiset	c9dbe52f84	radv: implement VK_EXT_conservative_rasterization Only supported by GFX9+. The conservativeraster Sascha demo seems to work as expected. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-18 13:28:01 +02:00
Samuel Pitoiset	450a325858	radv: do not re-create the sampler for every blit in CmdBlitImage() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-18 13:27:59 +02:00
Samuel Pitoiset	3871dd7a92	radv: allow to force anisotropy via RADV_TEX_ANISO Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-18 13:27:58 +02:00
Timothy Arceri	b54a2311a9	mesa: enable EXT_framebuffer_object in core profile Since user defined names are not allowed in core profile we remove the allow_user_names bool and just check if we have a core profile like all other buffer/texture object handling code does. This extension is required by "Wolfenstein: The Old Blood" and is exposed in core in the Nvidia binary driver. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-18 19:58:24 +10:00
Timothy Arceri	02843ed768	mesa: move legacy dri config option texture_depth Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-18 19:43:05 +10:00
Timothy Arceri	f958ea6eff	mesa: move legacy dri config option fthrottle_mode Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-18 19:43:05 +10:00
Timothy Arceri	4b1a81ef9d	mesa: move legacy dri config option def_max_anisotropy Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-18 19:43:05 +10:00
Timothy Arceri	6164d59bcc	mesa: move legacy dri config option no_neg_lod_bias Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-18 19:43:05 +10:00
Timothy Arceri	6d1890fa07	mesa: move legacy dri config option round_mode Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-18 19:43:05 +10:00
Timothy Arceri	3a1d09fd55	mesa: remove unused dri option float_depth This seems to have only been used by DRI1 drivers which were removed with `e4344161bd`. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-18 19:43:05 +10:00
Timothy Arceri	91e76ce493	mesa: move legacy dri config option dither_mode Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-18 19:43:05 +10:00
Timothy Arceri	2d7dc9591d	mesa: move legacy dri config option color_reduction Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-18 19:43:05 +10:00
Timothy Arceri	408d41a413	mesa: move legacy TCL dri config options Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-18 19:43:05 +10:00
Timothy Arceri	024abd3534	util: use force_compat_profile for Wolfenstein The Old Blood This game is looking for some odd extensions after creating a core context such as ARB_vertex_program and EXT_framebuffer_object. Rather than enabling these in core this forces the game to use compat. This allows the game to run and seems to work without issues. All other id tech games/engines use a compat profile. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-18 19:34:54 +10:00
Timothy Arceri	64ec50d52f	mesa/st: add force_compat_profile option to driconfig Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-18 19:34:54 +10:00
Timothy Arceri	7a992fcfa0	Revert "radeonsi: avoid syncing the driver thread in si_fence_finish" This reverts commit `bc65dcab3b`. This was manually reverted. Reverting stops the menu hanging in some id tech games such as RAGE and Wolfenstein The New Order. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107891	2018-09-18 19:21:32 +10:00
Eric Anholt	4e1af6808c	v3d: Switch from FLUSH_ALL_STATE to FLUSH for ending our bin CLs. The HW for FLUSH_ALL_STATE isn't validated, since the closed driver only uses FLUSH. Now that we don't have any new state at the end of our bin CLs, follow their lead.	2018-09-17 16:35:45 -07:00
Eric Anholt	0b8007b523	v3d: Stop clearing the OQ state at the end of the job. Ever since we added OQ support, we've been clearing OQ state at the start of the job anyway. We're intentionally breaking old-and-new-driver-mix systems, because we need to stop using the unvalidated FLUSH_ALL_STATE.	2018-09-17 16:35:45 -07:00
Eric Anholt	350cb79045	v3d: Always emit a TF disable at the start of drawing on V3D 4.x. The HW's FLUSH_ALL_STATE is not validated, so we probably shouldn't use it, meaning that we need to reset state at the start. By doing this, we also make ourselves more resilient to another client leaving the TF state enabled at the end of their batch (as we now do, ourselves). However, we still need to emit a single TF disable at the end of the frame, for SWVC5-718.	2018-09-17 16:35:45 -07:00
Dylan Baker	7f08bcb73f	build: Don't overlink gallium xlib target Currently gallium's xlib target will fail to link due to multiple definitions of all the symbols in libmesautil, this only shows up in autotools, and not in meson due to differences in the way that meson and autotools handle linking static archives into static archives. Autotools uses -Wl,--whole-archive implicitly, meson requires this behavior to be opted-into. The solution is just to remove libmesautils from the libgl-xlib target, since it will get all of those symbols from libmesagallium. I've dropped the link from meson as well, it doesn't seem to hurt anything and should make linking just a little faster. Fixes: `8396043f30` ("Replace uses of _mesa_bitcount with util_bitcount") bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107923 Tested-by: Brian Paul <brianp@vmware.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Cc: Sergii Romantsov <sergii.romantsov@globallogic.com>	2018-09-17 13:21:01 -07:00
Dylan Baker	3acc18fcf7	move pthread_setaffinity_np check to the build system Rather than trying to encode all of the rules in a header, let's just put them in the build system where they belong. This fixes the build on FreeBSD, which does have pthread_setaffinity_np, but it's in a pthread_np.h, not behind _GNU_SOURCE. FreeBSD also implements cpu_set slightly differently, so additional changes would be required to get it working right there anyway. v2: - fix #define in autotools Fixes: `9f1bbbdbbd` ("util: try to fix the Android and MacOS build") Cc: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-09-17 13:16:46 -07:00
Fritz Koenig	60d0c0d062	mesa: FramebufferParameteri parameter checking Missing break; causes parameter checking to never pass GL_FRAMEBUFFER_FLIP_Y_MESA parameters. Fixes: `318c265160` ("mesa: GL_MESA_framebuffer_flip_y extension [v4]") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-17 11:48:00 -07:00
Fritz Koenig	ba6cc32cf9	mesa: Additional FlipY applications Instances where direction was determined based on winsys or user fbo and should be determined based on FlipY. Key STATE_FB_WPOS_Y_TRANSFORM for of FlipY instead of _mesa_is_user_fbo. This corrects gl_FragCoord usage when applying GL_MESA_framebuffer_flip_y. Fixes: `ab05dd183c` ("i965: implement GL_MESA_framebuffer_flip_y [v3]") Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-17 11:48:00 -07:00
Bas Nieuwenhuizen	d17443a459	radv: Use build ID if available for cache UUID. To get a useful UUID for systems that have a non-useful mtime for the binaries. I started using SHA1 to ensure we get reasonable mixing in the various possibilities and the various build id lengths. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-09-17 20:19:52 +02:00
Samuel Pitoiset	08103c5f65	radv: enable shaderInt16 capability Not sure if this is all wired up. CTS does pass and the Tangrams demo works fine on Vega. There are corruption issues on Polaris but not sure if that is related to 16-bit support. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-17 15:18:39 +02:00
Samuel Pitoiset	cd76ce0078	ac: add 16-bit support to ac_build_bitfield_reverse() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-17 15:18:37 +02:00
Samuel Pitoiset	fc398f4d67	ac: add 16-bit support to ac_build_bit_count() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-17 15:18:34 +02:00
Samuel Pitoiset	94dd08eb7c	ac: add 16-bit support to ac_find_lsb() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-17 15:18:32 +02:00
Samuel Pitoiset	5a6c8ca3e8	ac: add 16-bit support to ac_build_umsb() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-17 15:18:30 +02:00
Samuel Pitoiset	3e7f3e2cd1	ac: add 16-bit support to ac_build_isign() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-17 15:18:28 +02:00
Samuel Pitoiset	cfd6314cfe	ac: add 16-bit constant values for zero and one Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-17 15:18:26 +02:00
Samuel Pitoiset	074e29183c	ac: add ac_build_bitfield_reverse() helper Are we missing 64-bit support? Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-17 15:18:23 +02:00
Samuel Pitoiset	371c35e5bb	ac: add ac_build_bit_count() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-17 15:18:20 +02:00
Samuel Pitoiset	aec9151464	radv: fix use of unreachable() in the meta blit path Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-09-17 11:29:25 +02:00
Samuel Pitoiset	6521d4a659	Revert "radv: Optimize rebinding the same descriptor set." This introduces random GPU hangs on Vega, at least. This reverts commit `02a43edf18`.	2018-09-17 11:20:57 +02:00
Samuel Pitoiset	90819abb56	radv: fix descriptor pool allocation size The size has to be multiplied by the number of sets. This gets rid of the OUT_OF_POOL_KHR error and fixes a crash with the Tangrams demo. CC: 18.1 18.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-17 10:18:01 +02:00
Jason Ekstrand	67094e11e9	anv/query: Add an emit_srm helper Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-09-17 02:57:21 -05:00
Jason Ekstrand	40149441b8	anv: Add a mi_memset and use it for zeroing queries Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-09-17 02:57:21 -05:00
Jason Ekstrand	b11e9b5ffe	anv/query: Use anv_address everywhere Instead of passing around BOs and offsets, use addresses which are anv's GPU equivalent of pointers. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-09-17 02:57:21 -05:00
Jason Ekstrand	07e214f1ce	anv/query: Write both dwords in emit_zero_queries Each query slot is a uint64_t and we were only zeroing half of it. Fixes: `7ec6e4e689` "anv/query: implement multiview interactions" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-09-17 02:57:21 -05:00
Jason Ekstrand	c0420a62c9	anv/query: Increment an index while writing results Instead of computing an index at the end which we hope maps to the number of things written, just count the number of things as we go. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-09-17 02:57:21 -05:00
Ian Romanick	df9dbc03d3	i965/fs: Don't propagate conditional modifiers from integer compares to adds No shader-db changes on any Intel platform... which probably explains why no bugs have been bisected to this problem since it landed in Mesa 18.1. :( The commit mentioned below is in 18.2, so 18.1 would need a slightly different fix (due to code refactoring). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `77f269bb56` "i965/fs: Refactor propagation of conditional modifiers from compares to adds" Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (reviewed the original patch) Cc: Matt Turner <mattst88@gmail.com> (reviewed the original patch)	2018-09-17 00:38:22 -07:00
Bas Nieuwenhuizen	0dd8189f15	radv: Only allow 16 user SGPRs for compute on GFX9+. Apparently for compute there are only 16 instead of the 32 for the graphics path. Fixes dEQP-VK.binding_model.descriptorset_random.sets16.noarray.ubolimitlow.sbolimitlow.imglimitlow.noiub.comp.0 CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-09-16 12:50:58 +02:00
Bas Nieuwenhuizen	d97c892584	radv: Set the user SGPR MSB for Vega. Otherwise using 32 user SGPRs would be broken. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-09-16 12:50:58 +02:00
Bas Nieuwenhuizen	02a43edf18	radv: Optimize rebinding the same descriptor set. This makes it cheaper to just change the dynamic offsets with the same descriptor sets. Suggested-by: Philip Rebohle <philip.rebohle@tu-dortmund.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-09-16 12:50:19 +02:00
Gert Wollny	14976817f4	r600/sb: use safe math optimizations when TGSI contains precise operations Fixes: dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_3 dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_3 dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_3 Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-09-15 20:44:53 +02:00
Mauro Rossi	cc3b99bb48	android: broadcom/cle: export the broadcom top level path headers Fixes the following building error in vc4 build: In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_render_cl.c:34: In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_drv.h:27: In file included from external/mesa/src/gallium/drivers/vc4/vc4_simulator_validate.h:34: In file included from external/mesa/src/gallium/drivers/vc4/vc4_context.h:39: In file included from external/mesa/src/gallium/drivers/vc4/vc4_cl.h:56: gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h:12:10: fatal error: 'cle/v3d_packet_helpers.h' file not found ^~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `5b102160ae` ("broadcom/genxml: Introduce a V3D packet/struct decoder.") Cc: "18.2" <mesa-stable@lists.freedesktop.org> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2018-09-15 09:14:46 +02:00
Mauro Rossi	9158e0bd82	android: broadcom/cle: add gallium include path Fixes the following building error: In file included from external/mesa/src/broadcom/cle/v3d_decoder.c:38: In file included from external/mesa/src/broadcom/cle/v3d_packet_helpers.h:29: external/mesa/src/gallium/auxiliary/util/u_math.h:42:10: fatal error: 'pipe/p_compiler.h' file not found ^~~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `5b102160ae` ("broadcom/genxml: Introduce a V3D packet/struct decoder.") Cc: "18.2" <mesa-stable@lists.freedesktop.org> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2018-09-15 09:14:42 +02:00
Mauro Rossi	3341429d74	android: broadcom/genxml: fix collision with intel/genxml header-gen macro Fixes the following building error, happening when building both intel and broadcom: Gen Header: libmesa_broadcom_genxml_32 <= v3d_packet_v21_pack.h FAILED: gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h /bin/bash -c "python external/mesa/src/broadcom/cle/gen_pack_header.py \ external/mesa/src/broadcom/cle/v3d_packet_v21.xml \ > gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h" Traceback (most recent call last): File "external/mesa/src/broadcom/cle/gen_pack_header.py", line 626, in <module> p = Parser(sys.argv[2]) IndexError: list index out of range header-gen macro is already defined by Intel genxml building rules and the existing header-gen does not have the $(PRIVATE_VER) argument, in fact the bash command line logged in the building error is missing exactly $(PRIVATE_VER) argument Renaming the macro as pack-header-gen in src/broadcom/Android.genxml.mk solves the building error, another possible way is to keep the gen rules commands expanded and not use the macros. Fixes: `7f80a9ff13` ("vc4: Introduce XML-based packet header generation like Intel's.") Cc: "18.2" <mesa-stable@lists.freedesktop.org> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2018-09-15 09:14:33 +02:00
Caio Marcelo de Oliveira Filho	f9d25f630c	anv/memcpy: fix build after starting to use addresses The offsets now come from the anv_address, these references were not updated and using the old variable. Fixes: `e1ab834557` "anv/memcpy: Use addresses instead of bo+offset" Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2018-09-14 21:45:50 -07:00
Jason Ekstrand	d6a73824bd	anv/cmd_buffer: Take an address in emit_lrm Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-09-14 22:12:11 -05:00
Jason Ekstrand	e1ab834557	anv/memcpy: Use addresses instead of bo+offset Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-09-14 22:12:11 -05:00
Jason Ekstrand	90b46f6c17	anv/so_memcpy: Use the correct SO_BUFFER size on gen8+ This shouldn't matter as we'll never write OOB anyway but we may as well get it right. It's supposed to be in dwords - 1. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-09-14 22:12:11 -05:00
Timothy Arceri	e29f0ede75	ac: fix get_image_coords() for radeonsi Because this was setting image to true we would end up calling si_load_image_desc() when we should be calling si_load_sampler_desc(). This fixes an assert() in Deus Ex: MD Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-15 12:23:32 +10:00
Marek Olšák	914bd3014f	gallium/util: don't let child processes inherit our thread affinity v2: corrected the comment	2018-09-14 21:15:39 -04:00
Marek Olšák	7d41a7593a	gallium/util: start with a random L3 cache index for AMD Zen	2018-09-14 21:05:37 -04:00
Josh Pieper	936e0dcd61	st/mesa: Validate the result of pipe_transfer_map in make_texture (v2) When using Freecad, I was getting intermittent segfaults inside of mesa. I traced it down to this path in st_cb_drawpixels.c where the result of pipe_transfer_map wasn't being checked. In my case, it was returning NULL because nouveau_bo_new returned ENOENT. I'm by no means a mesa developer, but this patch solves the problem for me and seems reasonable enough. v2: Marek - also unmap the PBO and release the texture, and call the make_texture function sooner for less cleanup Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>	2018-09-14 21:05:37 -04:00
Samuel Pitoiset	c79aad30ae	radv: emit the initial config only once in the preambles It shouldn't be needed to emit the initial graphics or compute state when beginning a new command buffer. Emitting them in the preamble should be enough and this will reduce IB sizes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-14 10:59:52 +02:00
Samuel Pitoiset	9de062ef20	radv: fix setting global locations for indirect descriptors Indirect descriptors only need one entry, we don't have to emit a location for every descriptors. Fixes GPU hangs with new CTS: dEQP-VK.binding_model.descriptorset_random.* CC: 18.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-14 10:59:52 +02:00
Samuel Pitoiset	748f4cce18	radv: fix flushing indirect descriptors Let's say we first bind a graphics pipeline that needs indirect descriptor sets. The userdata pointers will be emitted at draw time. Then if we bind a compute pipeline that doesn't need any indirect descriptors, the driver will re-emit them for all graphics stages. To avoid this, just check the bind point type. CC: 18.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-14 10:59:52 +02:00
Samuel Pitoiset	063264db5b	radv: fix GPU hangs with 32-bit indirect descriptors LLVM 6 isn't affected. Fixes GPU hangs with new CTS: dEQP-VK.binding_model.descriptorset_random.* CC: 18.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-14 10:59:52 +02:00
Samuel Pitoiset	aa30205929	radv: handle loc->indirect correctly for the first descriptor This was wrong for descriptor #0 when all of them are indirect. This is because indirect_offset was 0 and we emitted a "normal" descriptor pointer for nothing. While we are at it remove radv_userdata_info::indirect_offset which is useless. CC: 18.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-14 10:59:52 +02:00
Samuel Pitoiset	b9f6521157	radv: bump the maximum number of arguments to 64 Bumping to 64 should be safe enough. Fixes some crashes with new CTS: dEQP-VK.binding_model.descriptorset_random.* CC: 18.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-14 10:59:52 +02:00
Samuel Pitoiset	c28ea92947	radv: tidy up ac_setup_rings() for the GSVS rings Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-14 10:59:52 +02:00
Samuel Pitoiset	40fb8c7fca	radv: fix setting the number of entries for GSVS on VI+ According to RadeonSI, it's unnecessary to multiply by the stride. That field seems to always be 64. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-14 10:59:52 +02:00
Samuel Pitoiset	a006c24237	radv: always compute the number of components from the output mask That removes two special cases for clip/cull distances. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-14 10:59:52 +02:00
Samuel Pitoiset	9447e91329	radv: emit data contiguously in the GS->VS ring buffer Instead of having holes. The other ring parameters like offset and stride can be updated later. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-14 10:59:52 +02:00
Samuel Pitoiset	fbc064a5b4	radv: make use of the output usage mask in GS copy shader This is just for consistency because LLVM can detect and remove unused loads. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-14 10:59:52 +02:00
Samuel Pitoiset	f398595dca	radv: improve a comment in si_emit_set_predication_state() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-14 10:59:52 +02:00
Samuel Pitoiset	abdf396cbe	radv: fix VK_EXT_conditional_rendering visibility It's actually just the opposite. This fixes the new Sascha conditionalrender demo. CC: 18.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-14 10:59:52 +02:00
Samuel Pitoiset	18464d298b	radv: make use of ac_unpack_param() instead of ac_build_bfe() Same code is generated because LLVM ends up by using bfe, but that seems cleaner to me. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-14 10:59:52 +02:00
Timothy Arceri	21e34bab09	nir: add loop unroll support for complex wrapper loops In GLSL IR we cheat with switch statements and simply convert them into loops with a single iteration. This allowed us to make use of the existing jump instruction handling provided by the loop handling code, it also allows dead code to be cleaned up once we have wrapped the code in a loop. However using loops in this way created previously unrollable loops which limits further optimisations. Here we provide a way to unroll loops that end in a break and have multiple other exits. All shader-db changes are from the dolphin uber shaders. There are a small number of HURT shaders but in general the improvements far exceed the HURT. shader-db results IVB: total instructions in shared programs: 10018187 -> 10016468 (-0.02%) instructions in affected programs: 104080 -> 102361 (-1.65%) helped: 36 HURT: 15 total cycles in shared programs: 220065064 -> 154529655 (-29.78%) cycles in affected programs: 126063017 -> 60527608 (-51.99%) helped: 51 HURT: 0 total loops in shared programs: 2515 -> 2308 (-8.23%) loops in affected programs: 903 -> 696 (-22.92%) helped: 51 HURT: 0 total spills in shared programs: 4370 -> 4124 (-5.63%) spills in affected programs: 1397 -> 1151 (-17.61%) helped: 9 HURT: 12 total fills in shared programs: 4581 -> 4419 (-3.54%) fills in affected programs: 2201 -> 2039 (-7.36%) helped: 9 HURT: 15 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-09-14 16:07:36 +10:00
Timothy Arceri	2975422ceb	nir: propagates if condition evaluation down some alu chains v2: - only allow nir_op_inot or nir_op_b2i when alu input is 1. - use some helpers as suggested by Jason. v3: - evaluate alu op for single input alu ops - add helper function to decide if to propagate through alu - make use of nir_before_src in another spot shader-db IVB results: total instructions in shared programs: 9993483 -> 9993472 (-0.00%) instructions in affected programs: 1300 -> 1289 (-0.85%) helped: 11 HURT: 0 total cycles in shared programs: 219476091 -> 219476059 (-0.00%) cycles in affected programs: 7675 -> 7643 (-0.42%) helped: 10 HURT: 1 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-09-14 16:07:36 +10:00
Timothy Arceri	ef4ad7baf1	nir: evaluate if condition uses inside the if branches Since we know what side of the branch we ended up on we can just replace the use with a constant. All the spill changes in shader-db are from Dolphin uber shaders, despite some small regressions the change is clearly positive. V2: insert new constant after any phis in the use->parent_instr->type == nir_instr_type_phi path. v3: - use nir_after_block_before_jump() for inserting const - check dominance of phi uses correctly v4: - create some helpers as suggested by Jason. v5 (Jason Ekstrand): - Use LIST_ENTRY to get the phi src shader-db results IVB: total instructions in shared programs: 9999201 -> 9993483 (-0.06%) instructions in affected programs: 163235 -> 157517 (-3.50%) helped: 132 HURT: 2 total cycles in shared programs: 231670754 -> 219476091 (-5.26%) cycles in affected programs: 143424120 -> 131229457 (-8.50%) helped: 115 HURT: 24 total spills in shared programs: 4383 -> 4370 (-0.30%) spills in affected programs: 1656 -> 1643 (-0.79%) helped: 9 HURT: 18 total fills in shared programs: 4610 -> 4581 (-0.63%) fills in affected programs: 374 -> 345 (-7.75%) helped: 6 HURT: 0 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-09-14 16:07:36 +10:00
Erik Faye-Lund	fa5e9f1f73	virgl: adjust strides when mapping temp-resources When we're mapping temp-resources, we clip the resource to the transfer-box, which means the stride might not be correct any more. So let's update the stride from the temp-resource, and recompute the layer-stride. This fixes crashes when running dEQP with --deqp-gl-config-name=rgba8888d24s8ms4 Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Fixes: `a8987b88ff` "virgl: add driver for virtio-gpu 3D (v2)" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-09-14 10:59:02 +10:00
Pierre Moreau	21b92b3464	nvir: Always split 64-bit IMAD/IMUL operations Those operations do not map to actual hardware instructions, therefore those should always be lowered to 32-bit instructions. Fixes: `009c54aa7a` "nv50/ir: Split 64-bit integer MAD/MUL operations" Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-09-13 20:49:38 +02:00
Leo Liu	cb63e5d1eb	st/vdpau: Use output buffer as back buffer with 24-bit color only Using output buffer with 8 bits video RGB as back buffer certainly is not working for 30 bits color depth visual. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-09-13 14:28:32 -04:00
Leo Liu	4d8ec12f03	vl/dri: add color depth to vl winsys For VDPAU use later Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-09-13 14:28:32 -04:00
Leo Liu	cd77d49ecf	vl/dri3: add support for 10 bits format Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-09-13 14:28:32 -04:00
Leo Liu	902358de4b	vl/dri: add 10 bits format supports v2: Tell B10G10R10X2 and R10G10B10X2 formats for different HW. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-09-13 14:28:32 -04:00
Kristian H. Kristensen	aaafae4f55	egl/android: Declare droid_load_driver() static Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2018-09-13 11:12:35 -07:00
Samuel Pitoiset	d4bf954fe6	radv: fix function names for VK_EXT_conditional_rendering Otherwise they are not exported. CC: 18.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-09-13 16:03:18 +02:00
Jason Ekstrand	1a263b377c	anv: Silence a couple compiler warnings [63/93] Compiling C object 'src/intel/vulkan/...intel@vulkan@@anv_common@sta/anv_device.c.o'. ../src/intel/vulkan/anv_device.c:685:30: warning: passing 'const char *' to parameter of type 'void *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers] vk_free(&instance->alloc, instance->app_info.app_name); ^~~~~~~~~~~~~~~~~~~~~~~~~~~ ../src/vulkan/util/vk_alloc.h:62:51: note: passing argument to parameter 'data' here vk_free(const VkAllocationCallbacks *alloc, void *data) ^ ../src/intel/vulkan/anv_device.c:686:30: warning: passing 'const char *' to parameter of type 'void *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers] vk_free(&instance->alloc, instance->app_info.engine_name); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../src/vulkan/util/vk_alloc.h:62:51: note: passing argument to parameter 'data' here vk_free(const VkAllocationCallbacks *alloc, void *data) ^ [65/93] Compiling C object 'src/intel/vulkan/...ommon@sta/anv_nir_apply_pipeline_layout.c.o'. ../src/intel/vulkan/anv_nir_apply_pipeline_layout.c:519:13: warning: unused variable 'image_uniform' [-Wunused-variable] unsigned image_uniform; Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-09-12 21:20:27 -05:00
Michel Dänzer	e34dd4f508	loader/dri3: Don't wait for fence of old buffer when re-allocating it We only need to wait for the fence before drawing to a buffer, not before reading from it. This might avoid hangs when re-allocating the fake front buffer, similar to the previous change. But I haven't seen any evidence that this was actually happening in practice. Tested-by: Olivier Fourdan <ofourdan@redhat.com>	2018-09-12 16:55:09 +02:00
Michel Dänzer	aefac10fec	loader/dri3: Only wait for back buffer fences in dri3_get_buffer We don't need to wait before drawing to the fake front buffer, as front buffer rendering by definition is allowed to produce artifacts. Fixes hangs in some cases when re-using the fake front buffer, due to it still being busy (i.e. in use for presentation). Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/106404 Bugzilla: https://bugs.freedesktop.org/107757 Tested-by: Olivier Fourdan <ofourdan@redhat.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2018-09-12 16:53:58 +02:00
Vadym Shovkoplias	9b5c0c520f	glsl/linker: Check the invariance of built-in special variables From Section 4.6.4 (Invariance and Linkage) of the GLSL ES 1.0 specification "The invariance of varyings that are declared in both the vertex and fragment shaders must match. For the built-in special variables, gl_FragCoord can only be declared invariant if and only if gl_Position is declared invariant. Similarly gl_PointCoord can only be declared invariant if and only if gl_PointSize is declared invariant. It is an error to declare gl_FrontFacing as invariant. The invariance of gl_FrontFacing is the same as the invariance of gl_Position." Fixes: * glsl-pcoord-invariant.shader_test * glsl-fcoord-invariant.shader_test * glsl-fface-invariant.shader_test Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107734 Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-09-12 11:43:21 +03:00
Tapani Pälli	30580640f2	intel/tools: fix initial position of window in aubinator viewer Currently position is set before widgets are sized by gtk and calculation can get wrong results where window is positioned offscreen. Patch fixes this by setting aubfile window position as 0,0 only when size_allocate has been called to the widget. Now window is always positioned to 0,0 if imgui.ini is missing. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-09-12 11:43:21 +03:00
Erik Faye-Lund	eaa718588e	winsys/virgl: avoid unintended behavior If we end up never taking the loop that writes ret, we can end up with an uninitialized value, and if we're really unlucky, that value can be -1, causing us to go down an error-path instead of a success path. This was obviously not intended, so let's just initialize this to zero. Noticed by Valgrind: Conditional jump or move depends on uninitialised value(s) at 0xBA640A0: virgl_drm_winsys_resource_cache_create (virgl_drm_winsys.c:348) by 0xBA62FCF: virgl_buffer_create (virgl_buffer.c:170) by 0xBA605AC: virgl_resource_create (virgl_resource.c:60) by 0xBCF816F: bufferobj_data (st_cb_bufferobjects.c:344) by 0xBCF816F: st_bufferobj_data (st_cb_bufferobjects.c:390) by 0xBB7E836: vbo_use_buffer_objects (vbo_exec_api.c:1136) by 0xBCFCC6E: st_create_context_priv (st_context.c:414) by 0xBCFD3CD: st_create_context (st_context.c:590) by 0xBBB30CA: st_api_create_context (st_manager.c:896) by 0xB981E76: dri_create_context (dri_context.c:155) by 0xB97BDCE: driCreateContextAttribs (dri_util.c:473) by 0x5288331: dri3_create_context_attribs (dri3_glx.c:309) by 0x5264D64: glXCreateContextAttribsARB (create_context.c:78) Fixes: `a8987b88ff` ("virgl: add driver for virtio-gpu 3D (v2)") Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2018-09-12 10:14:43 +02:00
Juan A. Suarez Romero	d631916f29	travis: use python3.5 for meson Newer Meson versions require python >=3.5. But in Trusty default python3 version is 3.4.x. Install python3.5 and make it the default version for Meson using the update-alternatives method. CC: Jan Vesely <jano.vesely@gmail.com> CC: Andres Gomez <agomez@igalia.com> CC: Emil Velikov <emil.l.velikov@gmail.com> CC: Jon Turney <jon.turney@dronecode.org.uk> CC: Eric Engestrom <eric.engestrom@intel.com> CC: Dylan Baker <dylan@pnwbakers.com> Fixes: `3824c8e7cd` "meson: disable asserts by default on release builds" Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-09-11 14:27:58 +01:00
Samuel Pitoiset	3d08631fe5	radv: adjust ESGS ring buffer size computation on VI+ Noticed while working in this area. Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-09-11 11:30:19 +02:00
Gert Wollny	47e01e77d8	mesa/texture: Also check for LA texture when querying intensity component size Gallium may pick L16A16_FLOAT to represent GL_INTENSITY16F if no intensity format is provided by the driver. However, when calling glGetTexLevelParameteriv(..., GL_TEXTURE_INTENSITY_SIZE, ...) mesa will return a zero size because the actually used format has no intensity channel and as a fallback only the sizes of the red/green channels are checked. Also checking for LA sizes in the allocated texture resolves this problem. v2: Only check alpha channel size and return it (Marek) L and A size are always the same in this case. Fixes (on virgl): ext_framebuffer_multisample-fast-clear GL_ARB_texture_float * Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107832 Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-11 09:07:05 +02:00
Ilia Mirkin	133e12fb69	nv50,nvc0: warn on not-explicitly-handled caps Not handling caps explicitly means that we're likely getting incorrect values -- these need to be reviewed and set appropriately. While we're at it, add in some missing caps, and set all the subpixel stuff to 8 as that seems to be what the blob reports. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-09-11 01:25:19 -04:00
Timothy Arceri	e66c2158f8	mesa: remove duplicate dispatch sanity tests This removes duplicate tests from gl_core_functions_possible that are already covered by common_desktop_functions_possible. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-09-11 10:13:31 +10:00
Timothy Arceri	355a5ef761	mesa: tidy up init_matrix_stack() Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-09-11 09:26:04 +10:00
Christopher Egert	51995f6920	radeon: fix ColorMask Since commit `af3685d149` various OpenGL applications regressed on the classic mesa radeon driver. Signed-off-by: Christopher Egert <cme3000@gmail.com> CC: 18.1 18.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-09-10 16:57:20 -04:00
Elie Tournier	9179c745f6	gallium: Correctly handle no config context creation This patch fixes the following Piglit test: spec@egl_mesa_configless_context@basic It also fixes a few tests in a virgl guest. v2: Evaluate the value of no_config (Ilia) Suggested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Elie Tournier <elie.tournier@collabora.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-09-10 15:30:17 -04:00
Bas Nieuwenhuizen	f6e09db2e6	radv: Support v3 of VK_EXT_vertex_attribute_divisor. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> CC: 18.2 <mesa-stable@lists.freedesktop.org>	2018-09-10 21:26:17 +02:00
Marek Olšák	867f7aaed2	radeonsi/nir: port some bindless and sampler code from TGSI These might be all missing changes for bindless textures. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:23:21 -04:00
Marek Olšák	b00deed66f	radeonsi: adjust and simplify max_alloc_size determination Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:19:56 -04:00
Marek Olšák	203ef19f48	radeonsi: split si_copy_buffer compute and SDMA will be added into it. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:19:56 -04:00
Marek Olšák	986d6f12fb	radeonsi: don't call VBO prefetch with size=0 for the next commit. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:19:56 -04:00
Marek Olšák	1119fe5c25	radeonsi: merge SI and CI dma_clear_buffer and remove the callback also use assertions for the requirements that offset and size are a multiple of 4. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:19:56 -04:00
Marek Olšák	be0bd95abf	radeonsi: fix GPU hangs with bindless textures and LLVM 7.0 Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:19:56 -04:00
Marek Olšák	fa595e3d0c	ac: remove deprecated use of LLVMInt1Type() Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:19:56 -04:00
Marek Olšák	cc36ebbdc3	ac: use iN_0/1 constants Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:19:56 -04:00
Marek Olšák	bc09c3d59e	ac: add radeon_info::num_good_cu_per_sh Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:19:56 -04:00
Marek Olšák	a5f35aa742	ac: revert new LLVM 7.0 behavior for fdiv Cc: 18.2 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:19:56 -04:00
Marek Olšák	662db03577	radeonsi: fix printing a BO list into ddebug reports important for debugging Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:19:56 -04:00
Marek Olšák	da72b6296c	r600: fix HTILE for NPOT textures with mipmapping Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:19:56 -04:00
Marek Olšák	d4e52281aa	winsys/radeon: fix CMASK fast clear for NPOT textures with mipmapping on SI/CI Cc: 18.2 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:19:56 -04:00
Marek Olšák	a1b9a00f82	radeonsi: fix HTILE for NPOT textures with mipmapping on SI/CI VI uses addrlib so it's unaffected. Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:19:56 -04:00
Brian Paul	5162735957	docs: document new features/extensions in driver for WS 15 / Fusion 11 Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	7baf45dfc7	svga: assorted fixes/changes in svga_pipe_blit.c To align the code with VMware's in-house copy. Signed-off-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	25fceccf72	svga: set buffer bind_flags in svga_buffer_add_host_surface() To match the in-house VMware code. Signed-off-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	337a74aa40	svga: add format conversion for legacy formats This patch extends the format_conversion table to support different view formats on texture buffer. For legacy image formats such as INTENSITY, LUMINANCE, LUMINANCE_ALPHA, special swizzle masks will be used on the red or RG channels. This fixes piglit test arb_texture_buffer_object-formats fs\|vs arb Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	389450a271	svga: remove obsolete code to reemit gs binding The svga_reemit_gs_bindings function is no longer needed. Remove it. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	c174ee9f9d	svga: move variant->fs_shadow_compare_units assignment Fixes a crash since the variant object isn't allocated until later in the function. Not sure how this got through. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	cb70474b20	svga: fix resource checking in is_blending_enabled() This patch makes sure a valid color buffer is bound before checking its resource. This fixes Unigine Valley running in SM41 device. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Neha Bhende	c6103328ab	svga: Use texture_copy_region instead of texture_copy_handle for multisampling This fixes some of tests cases in arb_copy_image-formats and also fixes SurfaceCopy related errors in vmware.log when multi sampled surfaces are used. Tested with piglit, glretrace on windows and linux VM. v2: As per Brian's comment Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	fdf5885183	svga: add missing devcap check for texture array support The patch checks DXFMT_ARRAY devcap for texture array support. Tested with MTT-piglit. No regressions. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	3069581260	svga: no need to check MULTISAMPLE devcap for view format According to the current SVGA contract, any view format can be used on the underlying resource that is multisample. So there is no need to check the MULTISAMPLE devcap for the view format. Fixes black rendering issue with Tropics running with 4xMSAA. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	6f254ad9b4	svga: sync devcap name changes in svga3d_devcaps.h Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	49428c8d61	svga: explicit set DXFMT_SHADER_SAMPLE for DS format for pre-SM41 device Explicit set the DXFMT_SHADER_SAMPLE bit for depth stencil formats for pre-SM41 device only. This bit is now set by the SM41 device. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	379a2f265f	svga: remove unused variable Trivial.	2018-09-10 13:07:30 -06:00
Brian Paul	cbcc416a58	svga: draw round points when msaa is enabled See comments for details. This allows the piglit ext_framebuffer_multisample-point-smooth test to pass. Also, test the pipe_rasterizer_state::point_quad_rasterization field to see if sprite point rasterization is needed because it's possible for no sprite_coord_enable bits to be set when drawing sprites. Finally, remove old, stale comments. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	6b039c7d7c	svga: check number of samples before emitting MSAA decls/opcodes If real MSAA is not available, we only support 1 sample/pixel. In that case, we must not declare MSAA resources or emit MSAA opcodes. Do that by checking the sample count. Fixes several piglit MSAA tests, such as arb_texture_multisample-sample-depth (when the hard-coded sample count of 4 is fixed in that test). Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	cf2fb6813c	svga: remove obsolete comment on format_cap_table[] We removed the special cases referred to in this comment in the commit "svga: add a separate function to get dx format capabilities from vgpu10 device". Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	0fc6c17bf2	svga: allow TGSI_TEXTURE_CUBE_ARRAY in emit_tg4() Technically, SM4.1 doesn't support cube map arrays, but our backend renderers actually do. This allows the Piglit textureGather cube map array tests to pass. Tested with GLrenderer, DX11renderer and SWrenderer. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	3467a274e0	svga: no dma on multisample surface Force direct map on multisample surface. Fixes SVGA Driver Errors running multisample piglit tests on Linux VM v2: use texture for the check. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	5f14444184	svga: src surface for IntraSurfaceCopy cannot be multisample Fixes SVGA Driver Errors with piglit test arb_copy_image-targets Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	026e1ad7bb	svga: fix missing format multisample devcap check In commit e4048f6cd1, svga_is_dx_format_supported() is supposed to also check the SVGA3D_DXFMT_MULTISAMPLE bit for multisample support of a format. Somehow that code is not included in that commit. This patch fixes it. Fixes piglit test spec@ext_framebuffer_multisample@formats all_samples. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	285d8b47b1	svga: fix incorrect multisample support in VGPU9 device Commit e4048f6cd1 unintentionally allows multisample support for VGPU9 device. This patch fixes this regression. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	59a56ca1c8	svga: fix the missing devcap for SVGA3D_BC3_UNORM_SRGB Set the devcap to SVGA3D_DEVCAP_DXFMT_BC3_UNORM_SRGB Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	16666eb470	svga: add a separate function to get dx format capabilities from vgpu10 device Currently we have one function to get format capabilities and we convert DX10 devcaps back to DX9. This can be confusing. Going forward we will have a separate function for dealing with dx formats. This patch also fixes the depth stencil devcap. Instead of hardcoding the capabilities for the depth stencil formats, we will inquire the device for the capabilities. Note: we will still need to explicitly set the SVGA3D_DXFMT_SHADER_SAMPLE bit for SVGA3D_R32_FLOAT_X8X24 and SVGA3D_R24_UNORM_X8 since this bit is not advertised but supported by the device. v2: reapply the patch after svga_is_format_supported is moved to svga_format.c Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	b1aee7ff05	svga: assign a separate function for is_format_supported() for vgpu10 device This patch adds a new function svga_is_dx_format_supported() to check for format support in a VGPU10 device. v2: reapply the patch after svga_is_format_supported is moved to svga_format.c Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	1ea9c80d6d	svga: add some devcap debugging code Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	96ef81e39e	svga: fix depth and coverage mask output declaration Set the component mask to zero for both registers. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	7187a2f7ff	svga: add sample positions for 2 samples Fixes piglit tests spec@arb_sample_shading@builtin-gl-sample-position 2 spec@arb_texture_multisample@fb-completeness@2 Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	73c850fb9a	svga: check sample count devcaps Check sample count devcaps from the svga device to determine the supported sample counts. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	afacde3553	svga: fix 1-element cube map array issue As with 1D and 2D array textures, if there's only one array element (one cubemap in this case) we have to issue different shader code. This fixes a number of Piglit cubemap array tests. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	767c1eb436	svga: simplify array test in svga_init_shader_key_common() And squash commit a patch to silence a compiler warning (add default case to the switch statement). Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	45517f492b	winsys/drm: check for CAPS2/SM41 support if VGPU10 is enabled No need to check for HW_CAPS2 or SM4_1 support if VGPU10 is not enabled or is explicitly disabled via the environment variable SVGA_VGPU10. Reviewed-by: Deepak Rawat <drawat@vmware.com>	2018-09-10 13:07:30 -06:00
Deepak Rawat	159e706c4c	winsys/drm: Add support for quality level in surface ioctl A new argument "quality level" is added in surface define v3 which represents precision settings for surface. This commit adds support for quality level in DRM_VMW_GB_SURFACE_CREATE_EXT and DRM_VMW_GB_SURFACE_REF_EXT. Signed-off-by: Deepak Rawat <drawat@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	b343c6915c	svga: sync svga3d_types.h with upstream changes Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	b5827db2ea	winsys/drm: enable intra_surface_copy if HW_CAP2 is supported With drm version 2_15, we can inquire for support of HW_CAP2. If it is supported, we can enable intra_surface_copy support. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Deepak Rawat <drawat@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	7448bb0089	svga: add git version logging at init time Before we can log the git version in the host log, we'll add the git version in the init debug message. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	4669ffd29b	svga: fix a typo in svga_texture_copy_region() Trivial.	2018-09-10 13:07:30 -06:00
Charmaine Lee	3233d05390	svga: use helper function to do copy region Use the common helper function svga_texture_copy_region for copy region command. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	74791b80b9	svga: fix cubemap array rendering with backed surface view This patch fixes the layer index when rendering to a backed surface view of a cubemap array. Fixes piglit test fbo-generatemipmap-cubemap array. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	2d39e6d0c8	svga: add a helper function to send ResolveCopy command Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	9a24b08a49	svga: sync svga3d header files This is a squash of what was originally three commits. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	f3eda3e5e1	svga: add SM4_1 enable debug print Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	ccd895db76	svga: fix swizzling for texture gather Texture swizzling for texture gather needs to be done to the selected texels rather than to the returned vector. This patch has special cases for the different swizzles in emit_tg4(). Fixes a lot of piglit texture gather tests. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	be1993d6ed	svga: fix starting index for system values Currently, the starting index for system values is assigned to the next index after the highest index of the tgsi declared input registers. But the tgsi index might be different from the actual assigned index, hence this might cause overlap of indices. With this patch, the shader linker keeps track of the highest index of the translated input registers, and the next index will be used for the starting index for system values. Fixes SHIM errors running arb_copy_image-formats on SM4_1 device. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Deepak Rawat	569f838987	winsys/svga: Add support for new surface ioctl, multisample pattern Kernel driver version 2.15 added new surface ioctl named: DRM_VMW_GB_SURFACE_CREATE_EXT DRM_VMW_GB_SURFACE_REF_EXT The new ioctl has support for 64-bit svga3d_flags if DRM_VMW_PARAM_SM4_1 is available. Multisampling surface mob size calculation is added. Also synced the relevant header update. svga device modified the surface define command V3 with new parameter multisampling pattern. Adding support for that in winsys. Signed-off-by: Deepak Rawat <drawat@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	3f55425ee6	svga: enable MSAA for SM4_1 device The SVGA device is deprecating the DX9 MSAA support. This patch enables MSAA for SM4_1 device by explicitly setting the SVGA3D_SURFACE_MULTISAMPLE bit. For SM4_1 device, only 4 samples is supported. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	8088cb6f53	svga: add sample count to the surface_can_create interface With this patch, sample count is also taken into account when determining if a resource can be created. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	4a1976bfcf	svga: implement support for GL_ARB_texture_query_lod Just translate the TGSI LODQ instruction to VGPU10 LOD instruction. All (4) Piglit GL_ARB_texture_query_lod tests pass. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-09-10 13:07:30 -06:00
Neha Bhende	252e97ecdf	svga: Add support for arb_texture_gather With sm4_1, we can support single channel 2D or CubeMap textures. This patch exercises this feature. Tested with piglit v2: As per Brian's comment Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	36c84bcd77	svga: add support for interpolation at sample position Vs. sampling at the centroid or the fragment center. Note that this does not fix failures with the Piglit arb_sample_shading-interpolate-at-sample-position or arb_sample_shading-ignore-centroid-qualifier.exe tests at this time. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	bcf7aaa9f7	svga: clarify sys value -> input register mapping We translate TGSI system value registers to VGPU10 input registers. Add a comment and set file = TGSI_FILE_INPUT. That's not strictly necessary since we map both TGSI_FILE_INPUT and TGSI_FILE_SYSTEM_VALUE to VGPU10_OPERAND_TYPE_INPUT, but this makes the code a bit more understandable. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	9de5bdb341	svga: add support for FS sample mask output This, with the previous work for sample position/id query, allows us to enable per-sample shading for VGPU 10.1. Note that quite a few Piglit arb_sample_shading tests still do not pass, but many do. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	0a219dd918	svga: add support for sample id, sample position Sample ID is just a system value. Sample position must be implemented with the VGPU10_OPCODE_SAMPLE_POS instruction. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	ac4a0c0e82	svga: implement no-op svga_set_min_samples() This is part of the per-sample shading feature (PIPE_CAP_SAMPLE_SHADING). Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	3c3fc7154e	svga: add support for independent blend function per render target This patch adds support for GL_ARB_draw_buffers_blend extension for SM4_1 device. Fixes piglit test fbo-draw-buffers-blend. This patch is squashed with a subsequent patch which fixed a regression. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	5512f943b8	svga: emit shader version as 4.0 or 4.1 depending on device support Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	1d806b6f13	svga: restructure nested if's in emit_src_register() To make it cleaner for subsequent changes. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	16439085f5	svga: sync VGPU10ShaderTokens.h with upstream changes This includes new DX 10.1 opcodes and tokens. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	22e8099711	svga: add support for shadow cubemap array Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	f929247d24	svga: add support for rendering to cubemap array Fixes piglit test arb_texture_cube_map_array-fbo-cubemap-array Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	1df17fc697	svga: add support for TXL2 opcode This patch adds support for cubemap array texture lookup with explicit LOD. Fixes piglit test arb_texture_cube_map_array-cubemap-lod Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Charmaine Lee	62402be407	svga: add support for cubemap array This patch adds support for cubemap array for SM4_1. Fixes piglit test arb_texture_cube_map_array-cubemap Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Brian Paul	018ff0112f	svga: add have_sm4_1 flag, helper function Signed-off-by: Brian Paul <brianp@vmware.com>	2018-09-10 13:07:30 -06:00
Marek Olšák	d211679017	gallium/u_inlines: remove the destroy variable in pipe_reference_described Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 14:53:01 -04:00
Marek Olšák	ed880fe192	gallium/u_inlines: improve pipe_reference_described perf for debug builds Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-09-10 14:53:01 -04:00
Marek Olšák	c042a34b14	gallium/auxiliary: don't dereference counters twice needlessly Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 14:52:32 -04:00
Marek Olšák	61767c059e	gallium/u_inlines: normalize naming, use dst & src, style fixes (v2) v2: update comments Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 14:52:32 -04:00
Marek Olšák	9f1bbbdbbd	util: try to fix the Android and MacOS build Bionic does not have pthread_setaffinity_np. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107869 Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-09-10 14:49:07 -04:00
Jason Ekstrand	6f00785765	anv: Support v3 of VK_EXT_vertex_attribute_divisor Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-10 13:45:32 -05:00
Jason Ekstrand	34a17a48d4	vulkan: Update the XML and headers to 1.1.84 Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-10 13:30:21 -05:00
Sergii Romantsov	bbe551f3ea	mesa/meson: 32bit xmlconfig linkage Building of 32bit mesa with meson causes linkage issue: "undefined reference to `util_get_process_name'" Fixed by adding link-with mesa_util for xmlconfig primary. v2: Removed '[]', commit message corrected. v3: Reverted changes in gbm and glx libraries. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107843 Fixes: `2e1e6511f7` "util: extract get_process_name from xmlconfig.c" Cc: Marek Olšák <marek.olsak@amd.com> Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-09-10 08:57:42 -07:00
Jose Fonseca	52ca32121b	Require Visual Studio 2015. We no longer need or use Visual Studio 2013. https://ci.appveyor.com/project/jrfonseca/mesa/build/52 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-09-10 10:10:16 +01:00
Jose Fonseca	d5f934522d	util: Make util_context_thread_changed a no-op on Windows. Despite using thrd_t types, these functions are wed to pthreads, and break Windows builds, because thrd_current() is not implemented there, as it's impossible to have an efficient thrd_current() implementation on Windows. Trivial.	2018-09-10 10:10:16 +01:00
Erik Faye-Lund	c4017106bb	virgl: do not map zero-sized resource When creating textures, we avoid creating backing-store for all multisampled textures, not just depth buffers. So we can't try to map them later. That's just going to fail. So let's take the blit-based code-path that seems to avoid this problem. This make this piglit test-case no longer crash (although it still fails): bin/copyteximage 2D -samples=2 -auto Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-09-10 10:35:42 +02:00
Erik Faye-Lund	8083464013	virgl: remove dead code We don't use the size we calculate in this function, so let's just drop the calculation Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-09-10 10:35:32 +02:00
Erik Faye-Lund	b9c40e492d	virgl: drop needless return-code We always return TRUE, and we never check the return-value. Let's just drop the return value instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-09-10 10:35:20 +02:00
Erik Faye-Lund	9635869d73	virgl: free trans on map-error When we fail to map memory, we should also free trans to avoid leaking memory. Noticed while reading code. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-09-10 10:35:02 +02:00
Chris Wilson	44e3e6a9b4	i965: Bump aperture tracking to u64 As a prelude to handling large address spaces, first allow ourselves the luxury of handling the full 4G. Reported-by: Andrey Simiklit <asimiklit.work@gmail.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-09-10 09:14:46 +01:00
Mathias Fröhlich	2fece204c0	etnaviv: Reduce max offset to available hardware bits. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-09-10 07:59:31 +02:00
Mathias Fröhlich	4569bc6ad0	gallium: New cap PIPE_CAP_MAX_VERTEX_ELEMENT_SRC_OFFSET. Introduce a new capability for the maximum value of pipe_vertex_element::src_offset. Initially just every driver backend returns the value previously set from _mesa_init_constants. So this shall end up in no functional change. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-09-10 07:59:31 +02:00
Dave Airlie	240af61494	virgl: don't send a shader create with no data. (v2) This fixes the situation where we'd send a shader with just the header and no data. piglit/glsl-max-varyings test was causing this to happen, and the renderer fix was breaking it. v2: drop fprintf Fixes: `a8987b88ff` "virgl: add driver for virtio-gpu 3D (v2)" Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2018-09-10 12:23:30 +10:00
Timothy Arceri	14fe9fa11b	mesa: enable ARB_vertex_buffer_object in core profile This extension is required by "Wolfenstein: The Old Blood" and is exposed in core in the Nvidia binary driver. All the functions are just alias of the core functions so there should be nothing more to do. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-09-08 14:35:09 +10:00
Marek Olšák	21ca322e63	st/mesa: throttle texture uploads if their memory usage goes beyond a limit This prevents radeonsi from running out of memory. It also increases texture upload performance by being nice to the kernel memory manager.	2018-09-07 17:59:02 -04:00
Marek Olšák	9ce2cef68f	gallium: add PIPE_CAP_MAX_TEXTURE_UPLOAD_MEMORY_BUDGET	2018-09-07 17:59:02 -04:00
Andres Gomez	ecfe41e690	docs: update calendar, add news item and link release notes for 18.2.0 Signed-off-by: Andres Gomez <agomez@igalia.com>	2018-09-08 00:40:43 +03:00
Andres Gomez	5382a90cb2	docs: add sha256 checksums for 18.2.0 Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit `cb1ddf48e2`)	2018-09-08 00:28:23 +03:00
Andres Gomez	65f3327db6	docs: update 18.2.0 release notes Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit `7378180e7a`)	2018-09-08 00:28:21 +03:00
Marek Olšák	7ac52c2e38	Revert "gallium/os_thread: simplify helper pipe_current_thread_get_time_nano" This reverts commit `6d477bc546`. It fixes the Windows build hopefully.	2018-09-07 16:52:36 -04:00
Jason Ekstrand	465e5a868c	anv: Clamp scissors to the framebuffer boundary The Vulkan 1.1.81 spec says: "It is legal for offset.x + extent.width or offset.y + extent.height to exceed the dimensions of the framebuffer - the scissor test still applies as defined above. Rasterization does not produce fragments outside of the framebuffer, so such fragments never have the scissor test performed on them." Elsewhere, the Vulkan 1.1.81 spec says: "The application must ensure (using scissor if necessary) that all rendering is contained within the render area, otherwise the pixels outside of the render area become undefined and shader side effects may occur for fragments outside the render area. The render area must be contained within the framebuffer dimensions." Unfortunately, there's some room for interpretation here as to what the consequences are of having the render area set to exactly the framebuffer dimensions and having a scissor that is larger than the framebuffer. Given that GL and other APIs provide automatic clipping to the framebuffer, it makes sense that applications would assume that Vulkan does this as well. It costs us very little to play it safe and just clamp client-provided scissors to the framebuffer dimensions. Fortunately, the user is required to provide us with at least one scissor so we don't need to handle the case where they don't. Fixes: `fb2a5ceb32` "anv: Emit DRAWING_RECTANGLE once at driver..." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-09-07 15:19:02 -05:00
Jason Ekstrand	b08b4b2b25	anv: Disable the vertex cache when tessellating on SKL GT4 I have no idea if I'm correct about what's going wrong or if this is the correct fix. However, in my multiple weeks of banging my head on this hang, a VUE reference counting bug seems to match all the symptoms and it definitely fixes the hang. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107280 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-09-07 15:19:02 -05:00
Jason Ekstrand	5dee89438a	anv: Implement a VF cache invalidate workaround Known to fix nothing whatsoever but it's in the docs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-09-07 15:19:02 -05:00
Jason Ekstrand	c643c5e18d	anv: Re-emit vertex buffers when the pipeline changes Some of the bits of VERTEX_BUFFER_STATE such as access type, instance data step rate, and pitch come from the pipeline. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-09-07 15:19:02 -05:00
Marek Olšák	25ffb84016	radeonsi: pin the winsys thread to the requested L3 cache (v2) v2: rebase Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-07 16:03:36 -04:00
Marek Olšák	8016639f63	gallium/u_threaded: implement set_context_param for thread pinning (v2) v2: - use set_context_param - set set_context_param even if the driver doesn't implement it Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-07 16:03:36 -04:00
Marek Olšák	8d473f555a	st/mesa: pin driver threads to a specific L3 cache on AMD Zen (v2) v2: use set_context_param Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-07 16:03:30 -04:00
Marek Olšák	e5e3b5cdcc	gallium: add pipe_context::set_context_param for tuning perf on AMD Zen (v2) State trackers will not use the new param directly, but will instead use a helper in MakeCurrent that does the right thing. v2: rework the interface Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-07 15:48:31 -04:00
Marek Olšák	6d477bc546	gallium/os_thread: simplify helper pipe_current_thread_get_time_nano Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-07 15:48:31 -04:00
Marek Olšák	15fa2c5e35	gallium/u_cpu_detect: get the number of cores per L3 cache for AMD Zen Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-07 15:48:31 -04:00
Marek Olšák	ce432e259d	gallium/u_cpu_detect: fix parsing the CPU family According to: https://support.amd.com/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf Also Intel: https://www.microbe.cz/docs/CPUID.pdf Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-07 15:48:31 -04:00
Marek Olšák	a84fd58f48	gallium/u_cpu_detect: fix a race condition on initialization Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-07 15:48:31 -04:00
Dylan Baker	8396043f30	Replace uses of _mesa_bitcount with util_bitcount and _mesa_bitcount_64 with util_bitcount_64. This fixes a build problem in nir for platforms that don't have popcount or popcountll, such as 32bit msvc. v2: - Fix additional uses of _mesa_bitcount added after this was originally written Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v1) Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-09-07 10:21:26 -07:00
Dylan Baker	80825abb5d	move u_math to src/util Currently we have two sets of functions for bit counts, one in gallium and one in core mesa. The ones in core mesa are header only in many cases, since they reduce to "#define _mesa_bitcount popcount", but they provide a fallback implementation. This is important because 32bit msvc doesn't have popcountll, just popcount; so when nir (for example) includes the core mesa header it doesn't (and shouldn't) link with core mesa. To fix this we'll promote the version out of gallium util, then replace the core mesa uses with the util version, since nir (and other non-core mesa users) can and do link with mesautils. Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-09-07 10:21:26 -07:00
Dylan Baker	aa4386ebfe	docs: update calendar, add news item and link release notes for X.Y.Z Signed-off-by: Dylan Baker <dylan@pnwbakers.com>	2018-09-07 10:19:33 -07:00
Dylan Baker	d514f55611	docs/relnotes: Add sha256 sums for mesa 18.1.8	2018-09-07 10:17:38 -07:00
Dylan Baker	f6a9f44529	docs: Add release notes for 18.1.8	2018-09-07 10:17:36 -07:00
Jason Ekstrand	f9e630e23d	i965: Workaround the gen9 hw astc5x5 sampler bug gen9 hardware has a bug in the sampler cache that can cause GPU hangs whenever a texture with aux compression enabled is in the sampler cache together with an ASTC5x5 texture. Because we can't control what the client binds at any given time, we have two options: resolve the CCS or decompress the ASTC. Doing a CCS or HiZ resolve is far less drastic and will likely have a smaller performance impact. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-09-07 10:42:40 -05:00
Eric Anholt	a91b158bd9	v3d: Fix setup of the VCM cache size. There were two bugs working together to make things mostly work: I wasn't dividing the VPM output size available by the size of a batch (vertex), but I also had the size of the VPM reduced by a factor of 8. Fixes dEQP-GLES3.functional.vertex_array_objects.all_attributes and it seems also my intermittent varying failures. Fixes: `1561e4984e` ("v3d: Emit the VCM_CACHE_SIZE packet.")	2018-09-07 08:11:38 -07:00
Eric Anholt	f73f748323	v3d: Fix SRC_ALPHA_SATURATE blending for RTs without alpha. Fixes dEQP-GLES3.functional.fragment_ops.blend.default_framebuffer.rgb_func_alpha_func.dst.src_alpha_saturate_src_alpha_saturate and friends with --deqp-egl-config-name=rgb565d0s0 Cc: "18.2" <mesa-stable@lists.freedesktop.org>	2018-09-07 08:11:05 -07:00
Lionel Landwerlin	69874e9a6a	intel/genxml: turn SLM Enable bit into boolean Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-09-07 14:46:20 +01:00
Sergii Romantsov	97fcccb25e	i965/tools: 32bit compilation with meson Building of 32bit mesa with meson causes issue: "implicit declaration of function ‘__builtin_ia32_clflush’". Fixed by adding msse2 compilation flag. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107843 Fixes: `314879f7fe` (i965: Fix asynchronous mappings on !LLC platforms.) Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-09-07 13:46:48 +01:00
Sergii Romantsov	d709f12792	intel: compiler option msse2 and mstackrealign Seems in case of 32-bit library, usage of msse2 makes some stack corruption or incorrect instructions. Usage with mstackrealign fixes that case. v2: Fixed meson. v3: Definition of c_sse2_args moved on the top (L.Landwerlin). Added mstackrealign for Android's mks where msse4.1 is used. v4: Added for Vulkan also. v5: Commit message correction. CC: <mesa-stable@lists.freedesktop.org> Fixes: `6b05c080f2` (i965: Compile with -msse3) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107779 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-09-07 13:45:46 +01:00
Rob Clark	5404e0637f	freedreno: fix rast->depth_clip_near/far Fixes: `daa19363de` gallium: split depth_clip into depth_clip_near & depth_clip_far Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-07 07:41:43 -04:00
Marek Olšák	fda7683726	gallium: enable GL_AMD_depth_clamp_separate on r600, radeonsi	2018-09-06 21:53:00 -04:00
Marek Olšák	daa19363de	gallium: split depth_clip into depth_clip_near & depth_clip_far for AMD_depth_clamp_separate.	2018-09-06 21:53:00 -04:00
Jason Ekstrand	7b26741806	anv/pipeline: Only consider double elements which actually exist The brw_vs_prog_data::double_inputs_read field comes directly from shader_info::double_inputs which may contain inputs which are not actually read. Instead of using it directly, AND it with inputs_read which is only things which are read. Otherwise, we may end up subtracting too many elements when computing elem_count. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103241 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-09-06 16:07:50 -05:00
Jason Ekstrand	44ec31cd75	nir: Drop the vs_inputs_dual_locations option It was very inconsistently handled; the only things that made use of it were glsl_to_nir, glspirv, and nir_gather_info. In particular, nir_lower_io completely ignored it so anyone using nir_lower_io on 64-bit vertex attributes was going to be in for a shock. Also, as of the previous commit, it's set by every driver that supports 64-bit vertex attributes. There's no longer any reason to have it be an option so let's just delete it. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-09-06 16:07:50 -05:00
Jason Ekstrand	0909a57b63	radeonsi/nir: Set vs_inputs_dual_locations and let NIR do the remap We were going out of our way to disable dual-location re-mapping in NIR only to then do the remapping in st_glsl_to_nir.cpp. Presumably, this was so that double_inputs would be correct for the core state tracker. However, now that we've moved it to gl_program::DualSlotInputs which is unaffected by NIR lowering, we can let NIR lower things for us. The one tricky bit here is that we have to remap the inputs_read bitfield back to the single-slot convention for the gallium state tracker to use. Since radeonsi is the only NIR-capable gallium driver that also supports GL_ARB_vertex_attrib_64bit, we only have to worry about radeonsi when making core gallium state tracker changes. Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-09-06 16:07:50 -05:00
Jason Ekstrand	25efd787cf	compiler: Move double_inputs to gl_program::DualSlotInputs Previously, we had two fields in shader_info: double_inputs_read and double_inputs. Presumably, the one was for all double inputs that are read and the other is all that exist. However, because nir_gather_info regenerates these two values, there is a possibility, if a variable gets deleted, that the value of double_inputs could change over time. This is a problem because double_inputs is used to remap the input locations to a two-slot-per-dvec3/4 scheme for i965. If that mapping were to change between glsl_to_nir and back-end state setup, we would fall over when trying to map the NIR outputs back onto the GL location space. This commit changes the way slot re-mapping works. Instead of the double_inputs field in shader_info, it adds a DualSlotInputs bitfield to gl_program. By having it in gl_program, we more easily guarantee that NIR passes won't touch it after it's been set. It also makes more sense to put it in a GL data structure since it's really a mapping from GL slots to back-end and/or NIR slots and not really a NIR shader thing. Tested-by: Alejandro Piñeiro <apinheiro@igalia.com> (ARB_gl_spirv tests) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-09-06 16:07:50 -05:00
Marek Olšák	1285f71d3e	gallium: add PIPE_CAP_RASTERIZER_SUBPIXEL_BITS Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-09-06 16:07:40 -04:00
Eric Engestrom	3824c8e7cd	meson: disable asserts by default on release builds By the time Mesa 18.3 comes out (probably December '18), Meson 0.45 will be 9 months old (March '18), so I think this is reasonable. (btw, the currently-required Meson 0.44.1 was released less than 12 days before 0.45, so we're really not bumping by much.) Currently, the Meson versions in the major distributions are: Arch: ships 0.47.2 CentOS: 7 ships 0.47.1 Debian: stable ships 0.37.1, so it hasn't been usable in a long time. everything more recent ships 0.47.2 Fedora: 28 ships 0.45.1 FreeBSD: ships 0.46.1 (ports) Gentoo: ships 0.46.1 OpenSUSE: 15 ships 0.46 Ubuntu: 18.04 ships 0.45.1 Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-09-06 18:16:31 +01:00
Andrii Simiklit	2930b76cfe	mesa/util: add missing va_end() after va_copy() MSDN: "va_end must be called on each argument list that's initialized with va_start or va_copy before the function returns." Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107810 Fixes: `c6267ebd6c` "gallium/util: Stop bundling our snprintf implementation." Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2018-09-06 17:33:27 +01:00
Andrii Simiklit	65cfe698b0	mesa/util: don't ignore NULL returned from 'malloc' We should exit from the function 'util_vasprintf' with error code -1 for case where 'malloc' returns NULL Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Fixes: `864148d69e` "util: add util_vasprintf() for Windows (v2)" Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2018-09-06 17:33:27 +01:00
Andrii Simiklit	570cacba7a	mesa/util: don't use the same 'va_list' instance twice The first usage of the 'va_list' instance could change it. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Fixes: `864148d69e` "util: add util_vasprintf() for Windows (v2)" Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2018-09-06 17:33:27 +01:00
Andrii Simiklit	267ed29288	apple/glx/log: added missing va_end() after va_copy() Each invocation of va_copy() must be matched by a corresponding invocation of va_end() Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Fixes: `51691f0767` "darwin: Use ASL for logging" Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2018-09-06 17:33:27 +01:00
Eric Engestrom	6daba55aa1	meson: drop unnecessary llvm version hacks The current minimum meson version supported is 0.44.1, so we have met both the 0.43 and 0.44 requirement to not need these hacks anymore :) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-09-06 17:16:58 +01:00
Danylo Piliaiev	2b98a023d9	mesa: add missing return statement for GL_RG_SNORM case Fixes: `0d356cf478` "mesa: enable EXT_render_snorm extension" Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-09-06 17:24:53 +03:00
Eric Engestrom	e67dadd3a9	meson: consolidate langs lists Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-09-06 15:22:24 +01:00
Eric Engestrom	07ff56791d	intel/compiler: remove unused get_image_base_type() Unused since `09f1de97a7` "anv,i965: Lower away image derefs in the driver". Cc: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-09-06 15:22:24 +01:00
Mathias Fröhlich	a6232b6932	tnl: Fix green gun regression in xonotic. Fix another regression of mesa: Make gl_vertex_array contain pointers to first order VAO members. The regression showed up with drivers using the tnl module and was reproducible using xonotic-glx -benchmark demos/the-big-keybench.dem. Fixes: `64d2a20480` mesa: Make gl_vertex_array contain pointers to first order VAO members. Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-09-06 14:35:12 +02:00
Lionel Landwerlin	2dce1175c1	Revert "i965/tools: 32bit compilation with meson" This reverts commit `4aec44c0d9`. Unfortunately this patch needed another one to be committed first.	2018-09-06 12:25:07 +01:00
Sergii Romantsov	4aec44c0d9	i965/tools: 32bit compilation with meson Building of 32bit mesa with meson causes issue: "implicit declaration of function ‘__builtin_ia32_clflush’". Fixed by adding msse2 compilation flag. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107843 Fixes: `314879f7fe` (i965: Fix asynchronous mappings on !LLC platforms.) Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-09-06 11:55:57 +01:00
Timothy Arceri	b9fe8ff23d	glsl: fixer lexer for unreachable defines If we have something like: #ifdef NOT_DEFINED #define A_MACRO(x) \ if (x) #endif The # on the #define is not skipped but the define itself is so this then gets recognised as #if. Until `28a3731e3f` this didn't happen because we ended up in <HASH>{NONSPACE} where BEGIN INITIAL was called stopping the problem from happening. This change makes sure we never call RETURN_TOKEN_NEVER_SKIP for if/else/endif when processing a define. Cc: Ian Romanick <idr@freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107772 Tested-By: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-09-06 10:13:21 +10:00
Hyunjun Ko	2454742a84	freedreno/ir3: insert mov if same instruction in the outputs. For example, result0 = texture(sampler[indexBase + 5], coords); result1 = texture(sampler[indexBase + 0], coords); result2 = texture(sampler[indexBase + 0], coords); out_result0 = result0; out_result1 = result1; out_result2 = result2; In this kind of case we need to insert an extra mov to the outputs so that the result could be assigned to each register respectively. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-05 13:38:43 -04:00
Hyunjun Ko	b4da2f6667	freedreno/ir3: make immediates array dynamic Since most shaders wouldn't need that large array of immediates, making the array dynamic could save unnecessary spaces. In addition, sometimes we can potentially have a much larger array of immediates to be lowered, which might be more than 64. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-05 13:38:43 -04:00
Rob Clark	c3d9f29b78	freedreno: allocate ctx's batch on demand Don't fall over when app wants more than 32 contexts. Instead allocate contexts on demand. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-05 13:38:43 -04:00
Rob Clark	a122118c14	freedreno: add fd_context_batch() accessor For cases in which (after the following commit) ctx->batch may be null. Prep work for following commit. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-05 13:38:43 -04:00
Rob Clark	a45e1802db	freedreno/a6xx: fix mem2gmem for zsbuf Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-05 13:38:43 -04:00
Rob Clark	c77e0948c7	freedreno/batch: fix crash in !reorder case We aren't using the batch-cache if reorder==false. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-05 13:38:43 -04:00
Rob Clark	2c623e7071	freedreno/ir3: better compile_error() printing Try to show the error at the appropriate line of nir Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-05 13:38:43 -04:00
Rob Clark	ca758251ba	freedreno/a6xx: bordercolor fixes Port fixes from a5xx (`f0715442`) TODO maybe this should move to shared code, since it seems to be the same. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-05 13:38:43 -04:00
Rob Clark	73378013d7	freedreno: fix context teardown harder The border_color_uploaders need to be torn down before the transfer_pool is destroyed. Fixes: `e11e9d6394` freedreno: fix context teardown race Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-05 13:38:43 -04:00
Rob Clark	1a24f51966	freedreno/ir3: ignore unused inputs We could end up w/ inputs larger than vec4, simply because unused inputs are not split. Fixes things like dEQP-GLES31.functional.separate_shader.random.77 (and probably a handful of others) Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-05 13:38:43 -04:00
Rob Clark	6b4397feab	freedreno/a6xx: fix debug build crash Porting `0c8d9e923a` to a6xx. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-09-05 13:38:43 -04:00
Dylan Baker	d25a27ec56	meson: Print a message about why a libdrm version was selected We require a single version of libdrm for all of our libdrm dependencies (core and driver), but the way this is structured can make the error message less than helpful, as one driver might be the one setting the libdrm requirement, while another might be the one that generates the version failure. This adds a simple message to the output announcing which libdrm module set the version, which might be more helpful. v2: - Use message suggested by Eric Engstrom Fixes: `c445b1d56f` ("meson: Use the same version for all libdrm checks") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-09-05 10:32:51 -07:00
Charmaine Lee	af104ad799	svga: rename face to layer_face Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-05 11:22:42 -06:00
Brian Paul	e334e104d0	svga: encode sample count in resource declarations No regressions before the corresponding host-side change. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-09-05 11:22:42 -06:00
Charmaine Lee	49678e9e49	svga: sync with upstream changes to surface flags SVGA device now supports 64 bits surface flags. This patch updates the winsys interface to allow 64 bits surface flags. The linux winsys layer will for now only honor the lower 32 bits of the surface flags. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-05 11:22:42 -06:00
Neha Bhende	4310649ccb	svga: avoid try_blit() for some depth formats on non vgpu10. On non vgpu10, driver doesn't support util_blitter_blit for SVGA3D_Z_D16, SVGA3D_Z_D24x8, SVGA3D_Z_D24S8. Patch fixes following piglit tests regression on hwv8 caused by commit 27bf35caea5e: spec@arb_depth_texture@fbo-depth-gl-depth-component16-blit spec@arb_depth_texture@fbo-depth-gl-depth-component24-blit spec@arb_depth_texture@fbo-depth-gl-depth-component32-blit Tested with mtt-piglit on hw 8,9,10,11,13 and mtt-glretrace on windows and linux. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-05 11:22:42 -06:00
Neha Bhende	53091a0312	svga: convert dst format to linear when blending is enabled. When blending is enabled, framebuffer colorspace has to be linear. Previously, we never hit this case because we were not supporting sRGB drawable. Previous patch added that support. Tested with mtt glretrace, viewperf, piglit, conform. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-05 11:22:42 -06:00
Neha Bhende	dfab1289e8	winsys/svga: Avoid cap2 code path for now CAP2 functionality is not yet part of vmwgfx. This is causing unnecessary dmesg error messages. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-09-05 11:22:42 -06:00
Neha Bhende	8449c33a27	svga: start using SVGA3dCmdIntraSurfaceCopy command for svga_blit. Basically, SVGA3dCmdIntraSurfaceCopy command allow copying when source and destination are same. Tested with MTT piglit, glretrace, viewperf, conform v2: changes as per Charmaine's comment v3: changes as per Charmaine's comment Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-05 11:22:42 -06:00
Neha Bhende	4639ef3763	svga/winsys: Add cap2 support in winsys Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-05 11:22:42 -06:00
Neha Bhende	6b3627da08	svga: Add SVGA3dCmdIntraSurfaceCopy command support in OpenGL driver v2: changes as per Charmaine's comment Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-09-05 11:22:42 -06:00
Brian Paul	bac94dfefa	svga: update device header files from upstream This is a squash commit of several earlier patches. Signed-off-by: Brian Paul <brianp@vmware.com>	2018-09-05 11:22:42 -06:00
Charmaine Lee	f4f39fa5d9	winsys/drm: Fix assert when try to accumulate an invalid fd This patch makes sure there is a valid fd before merging it to the context's fd in vmw_svga_winsys_fence_server_sync(). This fixes the assert running webot. No regression running kmscube. Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2018-09-05 11:22:42 -06:00
Eric Anholt	16f17e3a3c	loader: Drop unused argument from dri3_update_drawable(). The argument has never been used since the function was added. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-09-05 10:11:27 -07:00
Alejandro Piñeiro	4e1f8d82c2	i965/fs: include multisamplers on image_intrinsic_coord_components This is the second patch needed to fix the following piglit tests: tests/spec/arb_gl_spirv/linker/uniform/multisampler.shader_test tests/spec/arb_gl_spirv/linker/uniform/multisampler-array.shader_test Although in this case it doesn't affect so many borrowed tests, as there aren't too many tests using multisamplers on Intel. It is worth to note that this patch is also needed when those tests are run on GLSL mode (using the --glsl option). Although most Intel drivers would not be able to run/execute tests using multisamplers, as GL_MAX_IMAGE_SAMPLES is zero, technically those tests are expected to link correctly, so linking tests should pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-09-05 17:02:28 +02:00
Alejandro Piñeiro	8969777686	i965: move brw_nir_lower_gl_images call At this moment that lowering is using info coming from the UniformStorage, so for the ARB_gl_spirv codepath, it needs to be done after calling gl_nir_link_uniforms. As for the GLSL codepath it can also be called later, we just move the call on both cases, to avoid adding several shader->spirv_data checks, and keep the patch as small as possible. This is the first patch needed to fix the following piglit tests: tests/spec/arb_gl_spirv/linker/uniform/multisampler.shader_test tests/spec/arb_gl_spirv/linker/uniform/multisampler-array.shader_test but fixes thousands of tests when borrowing the tests from other specs (that needs to be done manually right now). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-09-05 17:02:28 +02:00
Alejandro Piñeiro	2a6182fe06	intel/compiler: rename brw_nir_lower_glsl_images To brw_nir_lower_gl_images, as it will be also used on the ARB_gl_spirv codepath, that doesn't involve GLSL at all. So the lowering is about images following the OpenGL semantics. In any case "brw_nir_lower_opengl_images" seemed too long to me, so I just used gl. That shortening is already used on other parts of the code. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-09-05 17:02:28 +02:00
Alejandro Piñeiro	960f6459be	intel/compiler: remove unused variable num_images Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-09-05 17:02:28 +02:00
Gert Wollny	218ff0d510	winsys/virgl/vtest: Correct off-by-one error in resource allocation The resource bo array must already be extended when the target index is equal to the current size of the array. Signed-off-by: Gert Wollny <gert.wollny@collabora.com>	2018-09-05 13:54:01 +02:00
Gert Wollny	5341260f62	winsys/virgl: Initialize value to silence valgrind Silences: Conditional jump or move depends on uninitialised value(s) at 0xB72F2C0: virgl_drm_winsys_create (virgl_drm_winsys.c:854) by 0xB72F2C0: virgl_drm_screen_create (virgl_drm_winsys.c:926) by 0xB21C885: pipe_virgl_create_screen (drm_helper.h:275) by 0xB7201F0: pipe_loader_create_screen (pipe_loader.c:137) by 0xB639C91: dri2_init_screen (dri2.c:2112) by 0xB634F68: driCreateNewScreen2 (dri_util.c:153) by 0x63023E6: dri3_create_screen (dri3_glx.c:893) by 0x62D35BD: AllocAndFetchScreenConfigs (glxext.c:820) by 0x62D35BD: __glXInitialize (glxext.c:946) by 0x62CECB3: GetGLXPrivScreenConfig (glxcmds.c:174) by 0x62CF69C: glXQueryExtensionsString (glxcmds.c:1304) by 0x60AA7D9: ??? (in /usr/lib/x86_64-linux-gnu/libwaffle-1.so.0.5.2) by 0x4F81450: wfl_checked_display_connect (piglit-util-waffle.h:74) by 0x4F829E0: piglit_wfl_framework_init (piglit_wfl_framework.c:627) Signed-off-by: Gert Wollny <gert.wollny@collabora.com>	2018-09-05 13:54:01 +02:00
Gert Wollny	9b0e8d8723	winsys/virgl: correct resource and handle allocation (v2) Fixes crash with piglit/bin/map_buffer_range-invalidate CopyBufferSubData \ increment-offset -auto -fbo * Resize the resource storage already when the count is equal to the allocated size, fixes: Invalid write of size 8 at 0xB72E4CF: virgl_drm_add_res (virgl_drm_winsys.c:629) by 0xB72E4CF: virgl_drm_emit_res (virgl_drm_winsys.c:663) by 0xB72A44A: virgl_encode_resource_copy_region (virgl_encode.c:776) by 0xB40CD12: st_copy_buffer_subdata (st_cb_bufferobjects.c:585) by 0xB244A3B: _mesa_CopyBufferSubData (bufferobj.c:2940) by 0x109A1E: upload (invalidate.c:169) by 0x109C2F: piglit_display (invalidate.c:215) by 0x4F80FBE: run_test (piglit_fbo_framework.c:52) by 0x4F66E5F: piglit_gl_test_run (piglit-framework-gl.c:229) by 0x10949D: main (invalidate.c:47) Address 0xbe07d30 is 0 bytes after a block of size 4,096 alloc'd at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) by 0xB72DAAF: virgl_drm_cmd_buf_create (virgl_drm_winsys.c:567) * Also resize the space allocated for the handles, fixes: Invalid write of size 4 at 0xB72E4F0: virgl_drm_add_res (virgl_drm_winsys.c:631) by 0xB72E4F0: virgl_drm_emit_res (virgl_drm_winsys.c:663) by 0xB72A44A: virgl_encode_resource_copy_region (virgl_encode.c:776) by 0xB40CD12: st_copy_buffer_subdata (st_cb_bufferobjects.c:585) by 0xB244A3B: _mesa_CopyBufferSubData (bufferobj.c:2940) by 0x109A1E: upload (invalidate.c:169) by 0x109C2F: piglit_display (invalidate.c:215) by 0x4F80FBE: run_test (piglit_fbo_framework.c:52) by 0x4F66E5F: piglit_gl_test_run (piglit-framework-gl.c:229) by 0x10949D: main (invalidate.c:47) Address 0xbe08570 is 0 bytes after a block of size 2,048 alloc'd at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) by 0xB72DAC8: virgl_drm_cmd_buf_create (virgl_drm_winsys.c:572) Fixes: `4b15b5e803` ("virgl: resize resource bo allocation if we need to.") v2: - Use REALLOC macro and avoid memory leak when re-allocation fails - add Fixes tag (both Emil Velikov) - reorder commit message Signed-off-by: Gert Wollny <gert.wollny@collabora.com>	2018-09-05 13:54:01 +02:00
Tomeu Vizoso	f13de57edb	virgl: use hw-atomics instead of in-ssbo ones Emulating atomics on top of ssbos can lead to too small max SSBO count, so let's use the hw-atomics mechanism to expose atomic buffers instead. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-09-05 05:46:58 +01:00
Erik Faye-Lund	1bd927d997	virgl: update minor differences to upstream header virgl_protocol.h is considered to have its upstream in the virglrenderer repository, and somehow these minor differences have crept in. Let's sync with the upstream to avoid this. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-09-05 05:46:52 +01:00
Erik Faye-Lund	5a587d18d5	gallium: add PIPE_CAP_MAX_COMBINED_HW_ATOMIC_COUNTER{S,_BUFFERS} This moves the evergreen-specific max-sizes out as a driver-cap, so other drivers with less strict requirements also can use hw-atomics. Remove ssbo_atomic as it's no longer needed. We should now be able to use hw-atomics for some stages and not for other, if needed. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-09-05 05:46:46 +01:00
Erik Faye-Lund	d641d3f48b	gallium: add PIPE_CAP_MAX_COMBINED_SHADER_BUFFERS This gets rid of a r600 specific hack in the state-tracker, and prepares for other drivers to be able to use hw-atomics. While we're at it, clean up some indentation in the various drivers. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-09-05 05:46:37 +01:00
Erik Faye-Lund	84795f8c64	st/mesa: simplify MaxAtomicBufferSize-logic MaxAtomicCounters has already been assigned in the loop above in the ssbo_atomic = true case, so this will calculate the same value as the default. While we're at it, fixup indentation on the MaxAtomicBufferBindings assign. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-09-05 05:46:33 +01:00
Erik Faye-Lund	38f0c078de	st/mesa: clean up atomic vs ssbo code This makes the code a bit easier to follow; we first set up MaxShaderStorageBlocks, then we either set up a dedicated MaxAtomicBuffers, or we split MaxShaderStorageBlocks in two. While we're at it, also make the SSBO-splitting code tolerate the hypothetical case of having an odd number of SSBOs without incorrectly dropping the last SSBO. This has the nice result that the SSBOs and atomic buffers are dealt with almost completely orthogonally, easing some upcoming patches. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-09-05 05:46:27 +01:00
Erik Faye-Lund	a805e4e9de	st/mesa: use real bool for can_ubo We're doing full c99 now, so there's no point in using the old boolean type. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-09-05 05:46:09 +01:00
Marek Olšák	28e542dcdb	gallium/u_threaded: increase batch size to increase performance This reduces mutex overhead. radeonsi: +4.4% performance with piglit/drawoverhead, DrawElements, Ryzen X1700 iris_dri.so: +14% with piglit/drawoverhead, DrawArrays, i7 7700HQ. Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-09-04 14:31:56 -04:00
Marek Olšák	ebd5806e0f	st/vdpau: silence an uninitialized-variable warning	2018-09-04 14:01:43 -04:00
Marek Olšák	725e8ad559	st/mesa: help fix stencil border color for GL_DEPTH_STENCIL textures GL_STENCIL_INDEX uses GL_INTENSITY for the border color, which is nicer to hardware that doesn't read the stencil border value from the X channel. This fixes a bunch of dEQP tests on Vega & Raven. Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>	2018-09-04 14:01:43 -04:00
Ernestas Kulik	d49904085a	glsl_to_tgsi: Fix potential leak Reported by Coverity: arr_live_ranges is freed in a different branch than the one in which it was allocated. Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-09-04 14:01:43 -04:00
Ernestas Kulik	ea1e50cc16	u_vbuf: Fix leak Reported by Coverity: data is heap-allocated, but only freed in the info->index_size != 0 branch. Signed-off-by: Ernestas Kulik <ernestas.kulik@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Cc: 18.2 <mesa-stable@lists.freedesktop.org>	2018-09-04 14:01:43 -04:00
Eric Anholt	2e59b88903	freedreno: Drop a bunch of duplicated gallium PIPE_CAP default code. Now that we have the util function for the default values, we can get rid of the boilerplate. v2: Rebase on new gallium caps Reviewed-by: Rob Clark <robdclark@gmail.com> (v1)	2018-09-04 08:08:22 -07:00
Eric Anholt	492b74b445	v3d: Drop a bunch of duplicated gallium PIPE_CAP default code. Now that we have the util function for the default values, we can get rid of the boilerplate. v2: Rebase on new gallium caps	2018-09-04 08:08:18 -07:00
Eric Anholt	c311e00000	vc4: Drop a bunch of duplicated gallium PIPE_CAP default code. Now that we have the util function for the default values, we can get rid of the boilerplate. v2: drop GLSL level in favor of defaults. v3: Rebase on new gallium caps	2018-09-04 08:08:10 -07:00
Eric Anholt	ad782a7020	gallium: Add a helper for implementing PIPE_CAP_* default values. One of the pains of implementing a gallium driver is filling in a million pipe caps you don't know about yet when you're just starting out. One of the pains of working on gallium is copy-and-pasting your new PIPE_CAP into each driver. We can fix both of these by having each driver call into the default helper from their default case, so that both sides can ignore each other until they need to. v2: fix i915g build, revert swr change to avoid breaking scons build (https://travis-ci.org/anholt/mesa/jobs/419739857) v3: Rebase on 3 new gallium caps. Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Cc: Bruce Cherniak <bruce.cherniak@intel.com> Cc: George Kyriazis <george.kyriazis@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org>	2018-09-04 08:07:52 -07:00
Jason Ekstrand	67571ae796	intel/compiler: Remove redundant nir_remove_dead_variables call As of `07a2098a70`, brw_nir_optimize calls nir_remove_dead_variables as the last optimization. Doing it again is just pointless. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-09-04 09:03:16 -05:00
Lionel Landwerlin	07a2098a70	intel: compiler: remove dead local variables at optimization pass We're hitting an assert in gfxbench because one of the local variable is a sampler (according to Jason this isn't valid) : testfw_app: ../src/compiler/nir_types.cpp:551: void glsl_get_natural_size_align_bytes(const glsl_type, unsigned int, unsigned int*): Assertion `!"type does not have a natural size"' failed. Since this particular variable isn't used, it can be eliminated by removing unused local variables at the end of the optimization loop. This makes sense also for valid local variables. v2: Move additional local variable removal out of optimization loop, but before large constant removal (Jason/Lionel) v3: Move the removal at the end of brw_nir_optimize() Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107806 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-09-03 17:24:19 +01:00
Andrii Simiklit	095600dad6	intel/decoder: fix the possible out of bounds group_iter The "gen_group_get_length" function can return a negative value and it can lead to the out of bounds group_iter. v2: printing of "unknown command type" was added v3: just the asserts are added Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-09-03 11:14:30 +01:00
Bas Nieuwenhuizen	233718a199	radv: Fix CMASK dimensions. Mirrors `1e40f69483` "ac/surface: fix CMASK fast clear for NPOT textures with mipmapping on SI/CI/VI" CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-09-03 09:24:30 +02:00
Bas Nieuwenhuizen	ab64891f4c	radv: Use a lower max offchip buffer count. No clue what gets fixed by this but both radeonsi and amdvlk do it. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-09-03 09:24:30 +02:00
Bas Nieuwenhuizen	4dc244eb44	radv: Add VEGA20 support. Just mirror the radeonsi bits. Since this is just adding the extra switch entries for new HW I think this should be fine for stable. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-09-03 09:24:30 +02:00
Dave Airlie	c1ba33c34b	radv: don't expose linear depth surfaces on SI/CIK/VI either. ac_surface.c: gfx6_compute_surface says /* DB doesn't support linear layouts. */ Now if we expose linear depth and create a linear depth image and use CmdCopyImage to copy into it, we can't map the underlying memory and read it linearly which I think should work. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-09-03 11:38:00 +10:00
Mauro Rossi	ac0856ae41	egl/android: do not indent HAVE_DRM_GRALLOC preprocessor directive Fixes: `3f7bca44d9` ("egl/android: #ifdef out flink name support") Fixes: `c7bb82136b` ("egl/android: Add DRM node probing and filtering") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2018-09-02 11:27:08 +02:00
Jason Ekstrand	2ad9917e18	anv/blorp: Fix a comment as per Nanley's review feedback This accidentally didn't make it into `62378c5e9e`	2018-09-01 09:12:08 -05:00
Jason Ekstrand	62378c5e9e	anv/blorp: Do more flushing around HiZ clears We make the flush after a HiZ clear unconditional and add a flush/stall before the clear as well. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107760 Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-09-01 09:08:36 -05:00
Ian Romanick	82530ce1b5	i965/vec4: Clamp indirect tes input array reads with 0x0fffffff Page 190 of "Volume 7: 3D Media GPGPU Engine (Haswell)" says the valid range of the offset is [0, 0FFFFFFFh]. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2018-09-01 00:23:45 -07:00
Ian Romanick	75666605c9	i965/vec4: Correctly handle uniform sources in generate_tes_add_indirect_urb_offset Fixes failure in the new piglit test tes-patch-input-array-vec2-index-invalid-rd.shader_test. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2018-09-01 00:23:43 -07:00
Andres Gomez	adad7e3aa8	docs: update calendar to extend the 18.1 cycle by one more release Due to having 2 additional RCs for 18.2. Cc: Dylan Baker <dylan.c.baker@intel.com> Cc: Juan A. Suarez <jasuarez@igalia.com> Cc: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Acked-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Juan A. Suarez <jasuarez@igalia.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2018-09-01 02:23:14 +03:00
Rodrigo Vivi	e8c42ed4ab	intel: Introducing Amber Lake platform Amber Lake uses the same gen graphics as Kaby Lake, including an id that was previously marked as reserved on Kaby Lake, but that is now moved to the AML page. This follows the ids and approach used on kernel's commit e364672477a1 ("drm/i915/aml: Introducing Amber Lake platform") Reported-by: Timo Aaltonen <timo.aaltonen@canonical.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-31 13:57:52 -07:00
Rodrigo Vivi	886a048feb	intel: aubinator: Adding missed platforms to the error message. Many new platforms got added to gen_device_name_to_pci_device_id() but the error message inside aubinator didn't reflect those changes. So syncing on the same order to be sure that we are not missing any now. Cc: Anuj Phogat <anuj.phogat@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-31 13:57:41 -07:00
Nanley Chery	904c2a617d	i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on some gen9 According to internal docs, some gen9 platforms have a pixel shader push constant synchronization issue. Although not listed among said platforms, this issue seems to be present on the GeminiLake 2x6's we've tested. We consider the available workarounds to be too detrimental on performance. Instead, we mitigate the issue by applying part of one of the workarounds. Re-emit PUSH_CONSTANT_ALLOC at the top of every batch (as suggested by Ken). Fixes ext_framebuffer_multisample-accuracy piglit test failures with the following options: * 6 depth_draw small depthstencil * 8 stencil_draw small depthstencil * 6 stencil_draw small depthstencil * 8 depth_resolve small * 6 stencil_resolve small depthstencil * 4 stencil_draw small depthstencil * 16 stencil_draw small depthstencil * 16 depth_draw small depthstencil * 2 stencil_resolve small depthstencil * 6 stencil_draw small * all_samples stencil_draw small * 2 depth_draw small depthstencil * all_samples depth_draw small depthstencil * all_samples stencil_resolve small * 4 depth_draw small depthstencil * all_samples depth_draw small * all_samples stencil_draw small depthstencil * 4 stencil_resolve small depthstencil * 4 depth_resolve small depthstencil * all_samples stencil_resolve small depthstencil v2: Include more platforms in WA (Ken). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106865 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93355 Cc: <mesa-stable@lists.freedesktop.org> Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-31 13:19:17 -07:00
Christian Gmeiner	773d6ea6e7	imx: make use of loader_open_render_node(..) helper Gets rid of hard-coded gpu device path. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-31 21:47:13 +02:00
Christian Gmeiner	b05a8f4f41	tegra: make use of loader_open_render_node(..) helper Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-31 21:46:32 +02:00
Christian Gmeiner	ab348885eb	loader: add loader_open_render_node(..) This helper is almost a 1:1 copy of tegra_open_render_node(). Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-31 21:46:03 +02:00
Christian Gmeiner	d0b09e2dfe	tegra: fix memory leak Fixes: `1755f608f5` ("tegra: Initial support") Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-31 21:45:16 +02:00
Daniel Stone	01c0aa9f05	st/dri: Don't expose sRGB formats to clients Though the SARGB8888 format is used internally through its FourCC value, it is not a real format as defined by drm_fourcc.h; it cannot be used with KMS or other interfaces expecting drm_fourcc.h format codes. Ensure we don't advertise it through the dmabuf format/modifier query interfaces, preventing us from tripping over an assert. Signed-off-by: Daniel Stone <daniels@collabora.com> Reported-by: Michel Dänzer <michel.daenzer@amd.com> Fixes: `8c1b9882b2` ("egl/dri2: Guard against invalid fourcc formats") Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2018-08-31 18:02:42 +01:00
Samuel Pitoiset	686ec97cfb	radv: add missing support for protected memory properties Fixes Vulkan CTS CL#2849. Similar to the ANV driver. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-31 17:35:13 +02:00
Samuel Pitoiset	7355e9326b	radv: remove dead code in scan_shader_output_decl() Never used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-31 17:34:41 +02:00
Samuel Pitoiset	e9acf069b2	radv: remove radv_shader_context::num_output_{clips,culls} Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-31 17:34:41 +02:00
Samuel Pitoiset	a6a6441c75	radv: adjust the cull dist mask in scan_shader_output_decl() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-31 17:34:41 +02:00
Samuel Pitoiset	ea778e760c	radv: get length of the clip/cull distances array from usage mask Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-31 17:34:41 +02:00
Samuel Pitoiset	732679c25e	radv: do not recompute the output usage mask for clipdist twice The shader info pass takes care of this now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-31 17:34:41 +02:00
Samuel Pitoiset	730c704f86	radv: gather the output usage mask for clip/cull distances correctly It's a special case because both are combined into a single array. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-31 17:34:41 +02:00
Samuel Pitoiset	ffe3a2a298	radv: add set_output_usage_mask() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-31 17:34:41 +02:00
Samuel Pitoiset	6f47df3129	radv: fix passing clip/cull distances from VS to PS CTS doesn't test input clip/cull distances for the fragment shader stage, which explains why this was totally broken. I wrote a simple test locally that works now. This fixes a crash with GTA V and DXVK. Note that we are exporting unused parameters from the vertex shader now, but this can't be optimized easily because we don't keep the fragment shader info... Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107477 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-31 17:34:36 +02:00
Juan A. Suarez Romero	54a9622dd5	egl/wayland: do not leak wl_buffer when it is locked If color buffer is locked, do not set its wayland buffer to NULL; otherwise it can not be freed later. Rather, flag it in order to destroy it later on the release event. v2: instruct release event to unlock only or free wl_buffer too (Daniel) This also fixes dEQP-EGL.functional.swap_buffers_with_damage.* tests. CC: Daniel Stone <daniel@fooishbar.org> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-08-31 16:29:36 +02:00
Dave Airlie	2c1f249f2b	ac/radeonsi: fix CIK copy max size While adding transfer queues to radv, I started writing some tests, the first test I wrote fell over copying a buffer larger than this limit. Checked AMDVLK and found the correct limit. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-31 15:11:49 +10:00
Dave Airlie	c9f5448695	radeonsi: fix regression in indirect input swizzles. This fixes: tests/spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-dvec3.shader_test since I reworked the 64-bit swizzles. Fixes: `bb17ae49ee` (gallivm: allow to pass two swizzles into fetches.) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-31 06:08:24 +01:00
Dave Airlie	750b829daf	radeonsi: fix tess/gs fetchs for new swizzle. I have piglit results from my machine, but I must have messed up, and not built mesa in between properly. Fixes: `bb17ae49ee` (gallivm: allow to pass two swizzles into fetches.) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-31 06:08:21 +01:00
Marek Olšák	355ed029b0	mesa: ignore VAO IDs equal to 0 in glDeleteVertexArrays This fixes a firefox crash. Fixes: `781a78914c` Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-30 22:30:28 -04:00
Kenneth Graunke	b147254d36	Revert "intel/tools/aubwrite: Always use physical addresses for traces." This reverts commit `f8cfc77660`. This appears to break intel_dump_gpu for Gen9 systems - I can load them in the simulator, but nothing happens. Reverting the patch makes the simulator properly execute our commands and shaders again.	2018-08-30 14:36:28 -07:00
Jason Ekstrand	a0f18f2142	intel/nir: Lowering image loads and stores trashes all metadata This fixes the GL_ARB_fragment_shader_interlock piglit test on gen8 platforms where the lack of metadata dirtying was causing another pass to accidentally delete a much needed loop. https://bugs.freedesktop.org/show_bug.cgi?id=107745 Fixes: `37f7983bcc` "intel/compiler: Do image load/store lowering..." Jason Ekstrand <jason@jlekstrand.net> writes: Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-30 14:06:31 -05:00
Jason Ekstrand	d9cf4308ce	i965/screen: Allow modifiers on sRGB formats This effectively reverts `a266934935` which was a misguided attempt at protecting intel_query_dma_buf_modifiers from invalid formats. Unfortunately, in some internal EGL cases, we can get an SRGB format validly in this function. Rejecting such formats caused us to not allow CCS in some cases where we should have been allowing it. This regressed the performance of some SynMark tests as well as GfxBench ALU2, Tessellation and Manhattan 3.0 tests There's some question of whether or not we really should be using SRGB "fourcc" formats that aren't actually in drm_foucc.h but there's not much harm in allowing them through here. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107223 Fixes: `a266934935` "i965/screen: Return false for unsupported..." Tested-By: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-30 11:41:50 -05:00
Jason Ekstrand	8c1b9882b2	egl/dri2: Guard against invalid fourcc formats We already reject attempts to import images with invalid fourcc formats but don't really guard the queries all that well. This makes us error out in any calls to eglQueryDmaBufModifiersEXT if the given format is not a valid fourcc format. We also add an assert to ensure that drivers don't advertise any non-fourcc formats. Cc: mesa-stable@lists.freedesktop.org Tested-By: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-30 11:41:50 -05:00
Jason Ekstrand	b95896f492	egl/dri2: Add a helper for the number of planes for a FOURCC format This also serves as a convenient "is this a fourcc format" check as well which we'll take advantage of in the next commit. Cc: mesa-stable@lists.freedesktop.org Tested-By: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-30 11:41:50 -05:00
Jason Ekstrand	19bdc7dd0f	radv/meta: Set num_components on image_store intrinsics Now that image load/store intrinsics are variable-width, we need to set num_components accordingly. In `15d39f474b`, both glsl_to_nir and spirv_to_nir were updated to properly set num_components but radv meta was left behind. Fixes: `15d39f474b` "nir: Make image load/store intrinsics..." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-30 08:26:14 -05:00
Vicki Pfau	8c0e3f3822	gallivm: Detect VSX separately from Altivec Previously gallivm would attempt to use VSX instructions on all systems where it detected that Altivec is supported; however, VSX was added to POWER long after Altivec, causing lots of crashes on older POWER/PPC hardware, e.g. PPC Macs. By detecting VSX separately from Altivec we can automatically disable it on hardware that supports Altivec but not VSX Signed-off-by: Vicki Pfau <vi@endrift.com>	2018-08-30 06:09:49 +02:00
Ilia Mirkin	3e04c67950	nv50: bump compat glsl level to same as core Passes the compat piglits. I'm sure that there will be odd issues that aren't caught by them, but at least it should basically work. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-08-29 20:51:40 -04:00
Ilia Mirkin	a608e5cc9f	nvc0: bump compat GLSL version to match core This passes the handful of tests in piglit. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-08-29 20:51:40 -04:00
Ilia Mirkin	52a7297dc6	glsl: avoid lowering texcoord array except in simple cases With compat creeping up to geometry and tess shaders, lowering texcoord accesses/writes becomes more complicated. Since it's an optimization anyways, just avoid the complication for now. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-29 20:51:23 -04:00
Andres Gomez	3731233cba	docs: update calendar 18.2.0-rc5 is out, extend to 18.2.0-rc6 Signed-off-by: Andres Gomez <agomez@igalia.com>	2018-08-30 03:33:08 +03:00
Timothy Arceri	9c47c39687	st/mesa, gallium: add a workaround for No Mans Sky The spec seems clear this is not allowed but the Nvidia binary forces apps to add layout qualifiers so this works around the issue for No Mans Sky until the CTS can be sorted out. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-30 09:54:40 +10:00
Timothy Arceri	9ce7d79cdc	glsl: add a mechanism to allow layout qualifiers on function params The spec is quite clear this is not allowed: From Section 4.4. (Layout Qualifiers) of the GLSL 4.60 spec: "Layout qualifiers can appear in several forms of declaration. They can appear as part of an interface block definition or block member, as shown in the grammar in the previous section. They can also appear with just an interface-qualifier to establish layouts of other declarations made with that qualifier: layout-qualifier interface-qualifier ; Or, they can appear with an individual variable declared with an interface qualifier: layout-qualifier interface-qualifier declaration ;" From Section 4.10 (Memory Qualifiers) of the GLSL 4.60 spec: "Layout qualifiers cannot be used on formal function parameters, and layout qualification is not included in parameter matching." However on the Nvidia binary driver they actually fail to compile if image function params don't have a layout qualifier. This results in applications such as No Mans Sky using layout qualifiers on params. I've submitted a CTS test to expose this problem in the Nvidia driver but until that is resolved this patch will help Mesa drivers work around the issue. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-30 09:54:40 +10:00
Timothy Arceri	28a3731e3f	glsl: skip stringification in preprocessor if in unreachable branch This fixes compilation of some "No Mans Sky" shaders where the stringification happens in branches intended for DX12. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-30 09:51:57 +10:00
Bas Nieuwenhuizen	4738b6ac81	radv: Add missing checks in radv_get_image_format_properties. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-30 01:21:20 +02:00
Dave Airlie	bb17ae49ee	gallivm: allow to pass two swizzles into fetches. This hijacks the top 16-bits of swizzle, to pass in the swizzle for the second channel. This fixes handling .yx swizzles of 64-bit values. This should fixup radeonsi and llvmpipe. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107524 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-30 00:15:40 +01:00
Timothy Arceri	3bcec6cf1c	radeonsi: enable radeonsi_zerovram for No Mans Sky Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-30 07:57:38 +10:00
Timothy Arceri	5566dd8a61	radeonsi: add radeonsi_zerovram driconfig option More and more games seem to require this so lets make it a config option. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-30 07:57:38 +10:00
Timothy Arceri	406c3d748d	radeonsi: enable GL 4.5 in compat profile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-30 07:57:38 +10:00
Timothy Arceri	781a78914c	mesa: enable ARB_direct_state_access in compat for GL3.1+ We could enable it for lower versions of GL but this allows us to just use the existing version/extension checks that are already used by the core profile. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-30 07:57:38 +10:00
Marek Olšák	93b8b987d0	radeonsi: add a thorough clear/copy_buffer benchmark	2018-08-29 15:31:42 -04:00
Marek Olšák	5914f5bd4a	radeonsi: let internal compute dispatches tune WAVES_PER_SH	2018-08-29 15:31:42 -04:00
Marek Olšák	c5442c1165	radeonsi: add TGSI_SEMANTIC_CS_USER_DATA for reading up to 4 SGPRs with TGSI	2018-08-29 15:31:42 -04:00
Marek Olšák	d7250e4304	radeonsi: add SI_QUERY_TIME_ELAPSED_SDMA_SI for measuring DMA on SI DMA on SI doesn't support the timestamp packet, so it's emulated.	2018-08-29 15:31:42 -04:00
Marek Olšák	c359880d8b	radeonsi: add SI_QUERY_TIME_ELAPSED_SDMA for measuring SDMA performance	2018-08-29 15:31:42 -04:00
Marek Olšák	0c5429cc73	radeonsi: add flag L2_STREAM for minimal cache usage	2018-08-29 15:31:41 -04:00
Marek Olšák	8f6e06d160	gallium: add TGSI_MEMORY_STREAM_CACHE_POLICY For internal radeonsi shaders.	2018-08-29 15:31:41 -04:00
Jason Ekstrand	d8033d4083	intel/compiler: Remove surface_idx from brw_image_param Now that the drivers are lowering to surface indices themselves, we no longer need to push the surface index into the shader. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:03 -05:00
Jason Ekstrand	3cbc02e469	intel: Use TXS for image_size when we have a typed surface Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:03 -05:00
Jason Ekstrand	09f1de97a7	anv,i965: Lower away image derefs in the driver Previously, the back-end compiler turn image access into magic uniform reads and there was a complex contract between back-end compiler and driver about setting up and filling out those params. As of this commit, both drivers now lower image_deref_load_param_intel intrinsics to load_uniform intrinsics controlled by the driver and lower the other image_deref_* intrinsics to image_* intrinsics which take an actual binding table index. There are still "magic" uniforms but they are now added and controlled entirely by the driver and that contract no longer spans components. This also has the side-effect of making most image use compile-time binding table indices. Previously, all image access pulled the binding table index from a uniform. Part of the reason for this was that the magic uniforms made it difficult to decouple binding table indices from the uniforms and, since they are indexed completely differently (especially in Vulkan), it was hard to pull them apart. Now that the driver is handling both, it's trivial to decouple the two and provide actual binding table indices. Shader-db results on Kaby Lake: total instructions in shared programs: 15166872 -> 15164293 (-0.02%) instructions in affected programs: 115834 -> 113255 (-2.23%) helped: 191 HURT: 0 total cycles in shared programs: 571311495 -> 571196465 (-0.02%) cycles in affected programs: 4757115 -> 4642085 (-2.42%) helped: 73 HURT: 67 total spills in shared programs: 10951 -> 10926 (-0.23%) spills in affected programs: 742 -> 717 (-3.37%) helped: 7 HURT: 0 total fills in shared programs: 22226 -> 22201 (-0.11%) fills in affected programs: 1146 -> 1121 (-2.18%) helped: 7 HURT: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:03 -05:00
Jason Ekstrand	0de003be03	nir: Add handle/index-based image intrinsics Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	3942943819	nir: Use a bitfield for image access qualifiers This commit expands the current memory access enum to contain the extra two bits provided for images. We choose to follow the SPIR-V convention of NonReadable and NonWriteable because readonly implies that you can read so readonly + writeonly doesn't make as much sense as NonReadable + NonWriteable. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	48e4fa7dd8	glsl/link,i965: Make ImageAccess four-state The GLSL spec allows you to set both the "readonly" and "writeonly" qualifiers on images to indicate that it can only be used with imageSize. However, we had no way of representing this in the linked shader and flagged it as GL_READ_ONLY. This is good from a "does it use this buffer?" perspective but not from a format and access lowering perspective. By using GL_NONE when "readonly" and "writeonly" are both set, we can detect this case in the driver and handle it correctly. Nothing currently relies on the type of surface in the "readonly" + "writeonly" case but that's about to change. i965 is the only driver which uses the ImageAccess field and gl_bindless_image::access is currently unused. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	4289143899	intel/compiler: Use two components for 1D array image sizes Having the array length component stored in .z was a small convenience for the ISL image param filling code and an annoyance in the NIR lowering code. The only convenience of treating 1D arrays like 2D arrays in the lowering code is in the address calculation code so let's put all the complexity there as well. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	b1c414ef28	isl: Use the view array length for the image size Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	37f7983bcc	intel/compiler: Do image load/store lowering to NIR This commit moves our storage image format conversion codegen into NIR instead of doing it in the back-end. This has the advantage of letting us run it through NIR's optimizer which is pretty effective at shrinking things down. In the common case of rgba8, the number of instructions emitted after NIR is done with it is half of what it was with the lowering happening in the back-end. On the downside, the back-end's lowering is able to directly use predicates and the NIR lowering has to use IFs. Shader-db results on Kaby Lake: total instructions in shared programs: 15166910 -> 15166872 (<.01%) instructions in affected programs: 5895 -> 5857 (-0.64%) helped: 15 HURT: 0 Clearly, we don't have that much image_load_store happening in the shaders in shader-db.... Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	b217705dec	nir/types: Add a wrapper for coordinate_components Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	f2d0a2b110	anv/pipeline: Remove dead image loads in lower_input_attacnments Dead code will get rid of them eventually but it's better if they're just gone so we guarantee they won't trip up later passes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	15d39f474b	nir: Make image load/store intrinsics variable-width Instead of requiring 4 components, this allows them to potentially use fewer. Both the SPIR-V and GLSL paths still generate vec4 intrinsics so drivers which assume 4 components should be safe. However, we want to be able to shrink them for i965. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	7cdf8f9339	nir/format_convert: Fix a bitmask in unpack_11f11f10f Fixes: `4e337b42f9` "nir/format_convert: Add pack/unpack for R11F_G11F_B10F" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	1f7be4968f	nir/format_convert: Rename pack_r11g11b10f to pack_11f11f10f This matches the unpack function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	7bd0363d6f	nir/format_convert: Add [us]norm conversion helpers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	152fdeddbb	nir/format_convert: Rename nir_format_bitcast_uint_vec We have a name for that, it's called a uvec. This just makes the function name a bit shorter. While we're here, we also add an assert for one of the assumptions this function makes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	7c5df52bdc	nir/format_convert: Add vec mask and sign-extend helpers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	ea4f200864	nir/format_convert: Add support for unpacking signed integers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	80c424148b	nir/opcodes: Make unpack_half_2x16_split_* variable-width There is nothing inherent about these opcodes that requires them to only take scalars. It's very convenient if we let them take vectors as well. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	d448fa3ae3	nir/algebraic: Add some max/min optimizations Found by inspection. This doesn't help much now but we'll see this pattern with images if you load UNORM and then store UNORM. Shader-db results on Kaby Lake: total instructions in shared programs: 15166916 -> 15166910 (<.01%) instructions in affected programs: 761 -> 755 (-0.79%) helped: 6 HURT: 0 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	4dd5263663	nir/algebraic: Add more extract_[iu](8\|16) optimizations This adds the "(a << N) >> M" family of mask or sign-extensions. Not a huge win right now but this pattern will soon be generated by NIR format lowering code. Shader-db results on Kaby Lake: total instructions in shared programs: 15166918 -> 15166916 (<.01%) instructions in affected programs: 36 -> 34 (-5.56%) helped: 2 HURT: 0 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Jason Ekstrand	116b47fe3c	nir/algebraic: Be more careful converting ushr to extract_u8/16 If it's not the right bit-size, it may not actually be the correct extraction. For now, we'll only worry about 32-bit versions. Fixes: `905ff86198` "nir: Recognize open-coded extract_u16" Fixes: `76289fbfa8` "nir: Recognize open-coded extract_u8" Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-29 14:04:02 -05:00
Sagar Ghuge	40fc4b5acd	intel/tools: new i965_disasm tool Adds a new i965 instruction disassemble tool v2: 1) fix a few nits (Matt Turner) 2) Remove i965_disasm header (Matt Turner) v3: 1) Redirect output to correct file descriptors (Matt Turner) 2) Refactor code (Matt Turner) 3) Use better formatting style (Matt Turner) Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>	2018-08-29 11:19:55 -07:00
Kenneth Graunke	8fb966688b	st/mesa: Disable blending for integer formats. Blending isn't valid for integer formats. Rather than having drivers worry about this, just disable blending in this case. This hopefully will increase hits in the CSO cache as well, by eliminating most of the meaningless fields in this case. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-29 10:51:11 -07:00
Brian Paul	18e9b4791b	svga: add missing switch cases for shadow textures This doesn't seem to make any difference in testing, but it fixes a failed assertion when dumping sm3 shaders. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-08-29 11:29:07 -06:00
Brian Paul	fb7e462c97	svga: fix vgpu9 sprite coordinate bug Setting GL_POINT_SPRITE_COORD_ORIGIN to GL_LOWER_LEFT did not work for vgpu9. We can use the rasterizer sprite_coord_enable bitfield as-is. We need to index into it using the TGSI semantic index, not the register index. This fixes the Piglit fbo-gl_pointcoord and glsl-fs-pointcoord tests. Testing done: Piglit, Mesa sprite demos Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-08-29 11:29:07 -06:00
Brian Paul	8331d69a87	svga: fix PIPE_TEXTURE_RECT/BUFFER const buffer issue The flag_rect and flag_buffer fields didn't sufficiently capture the state changes needed for those resource types. For example, if a texture binding was changed from a 500x500 rect texture to a 400x400 rect texture we didn't set SVGA_NEW_TEXTURE_CONSTS. But we need to do that to emit the new texcoord scale factors to the constant buffers. Rather than track the sizes of all bound resources, just set the flag if the resource is a rect. Same story with texture buffers. Also, since rect/buffer textures are usable with VS/GS shaders, add SVGA_NEW_TEXTURE_CONSTS to the flags we check for emitting VS/GS constants. This seems to help with XFCE / xfwm4 desktop scaling. VMware issue 2156696. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-08-29 11:29:07 -06:00
Brian Paul	46c7433da8	svga: minor improvements in svga_state_constants.c Add const qualifiers. Add 'f' suffix on floats to avoid double promotion. Remove unneeded shader type assertion since the switch statement handled it already. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-08-29 11:29:07 -06:00
Jason Ekstrand	cdea5d996e	anv: Free the app and engine name Fixes: `8c048af589` "anv: Copy the appliation info into the instance" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-29 11:24:57 -05:00
Rhys Kidd	f7d0c112cb	nv50/ir: silence partitionLoadStore() unused function warning Move this now-unused function into the existing comment block, which was its only prior use. ../../../../../src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp:2645:1: warning: unused function 'partitionLoadStore' [-Wunused-function] partitionLoadStore(uint8_t comp[2], uint8_t size[2], uint8_t mask) Fixes: ("86e4440361 nouveau: codegen: Disable more old resource handling code") Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-08-29 08:59:27 -04:00
vadym.shovkoplias	966a797e43	glsl/linker: Link all out vars from a shader objects on a single stage During intra stage linking some out variables can be dropped because they are not used in a shader with the main function. But these out vars can be referenced in later stages, which can lead to further linking errors. Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105731	2018-08-29 20:03:56 +10:00
Lionel Landwerlin	5a1c23d150	anv: blorp: support multiple aspect blits Newer blit tests are enabling depth & stencil blits. We currently don't support them, but can by iterating over the aspect masks (copying some logic from the CopyImage function). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `9f44745eca` ("anv: Use blorp to implement VkBlitImage") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-29 10:31:06 +01:00
Tapani Pälli	a72dbc461b	mesa: allow GL_UNSIGNED_BYTE type for SNORM reads OpenGL ES spec states: "For normalized fixed-point rendering surfaces, the combination format RGBA and type UNSIGNED_BYTE is accepted." This fixes the following failing VK-GL-CTS tests: KHR-GLES3.packed_pixels.pbo_rectangle.rgba8_snorm KHR-GLES3.packed_pixels.rectangle.rgba8_snorm KHR-GLES3.packed_pixels.varied_rectangle.rgba8_snorm Signed-off-by: Tapani Pälli <tapani.palli@intel.com> https://bugs.freedesktop.org/show_bug.cgi?id=107658 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Andres Gomez <agomez@igalia.com>	2018-08-29 09:26:23 +03:00
Timothy Arceri	5db981952a	nir: add loop unroll support for wrapper loops This adds support for unrolling the classic do { // ... } while (false) that is used to wrap multi-line macros. GLSL IR also wraps switch statements in a loop like this. shader-db results IVB: total loops in shared programs: 2515 -> 2512 (-0.12%) loops in affected programs: 33 -> 30 (-9.09%) helped: 3 HURT: 0 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-29 16:02:05 +10:00
Timothy Arceri	0f450b57a1	nir/opt_loop_unroll: Remove unneeded phis if we make progress Now that SSA values can be derefs and they have special rules, we have to be a bit more careful about our LCSSA phis. In particular, we need to clean up in case LCSSA ended up creating a phi node for a deref. This avoids validation issues with some CTS tests with the following patch, but it's possible we could also see the same problem with the existing unrolling passes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-29 16:02:05 +10:00
Timothy Arceri	5a6b04d94b	nir: add complex_loop bool to loop info In order to be sure loop_terminator_list is an accurate representation of all the jumps in the loop we need to be sure we didn't encounter any other complex behaviour such as continues, nested breaks, etc during analysis. This will be used in the following patch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-29 16:02:05 +10:00
Timothy Arceri	fef6325e58	nir: always attempt to find loop terminators This will help later patches with unrolling loops that end with a break, i.e. loops that always exit on their first iteration. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-29 16:02:05 +10:00
Marek Olšák	1e40f69483	ac/surface: fix CMASK fast clear for NPOT textures with mipmapping on SI/CI/VI This fixes VM faults and corruption. Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-28 19:51:51 -04:00
Ian Romanick	c836326a29	i965/vec4: Emit BRW_AOP_INC or BRW_AOP_DEC for atomicAdd of +1 or -1 No shader-db changes on any Intel platform. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-28 15:35:50 -07:00
Ian Romanick	c856403868	i965/fs: Emit BRW_AOP_INC or BRW_AOP_DEC for imageAtomicAdd of +1 or -1 v2: Refactor selection of atomic opcode to a separate function. Suggested by Jason. No changes on any other Intel platforms. Skylake total instructions in shared programs: 14304261 -> 14304241 (<.01%) instructions in affected programs: 1625 -> 1605 (-1.23%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 5.00 x̃: 5 helped stats (rel) min: 1.01% max: 14.29% x̄: 5.86% x̃: 4.07% 95% mean confidence interval for instructions value: -10.66 0.66 95% mean confidence interval for instructions %-change: -15.91% 4.19% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 527531226 -> 527531194 (<.01%) cycles in affected programs: 92204 -> 92172 (-0.03%) helped: 2 HURT: 0 Haswell and Broadwell had similar results. (Broadwell shown) total instructions in shared programs: 14615730 -> 14615710 (<.01%) instructions in affected programs: 1838 -> 1818 (-1.09%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 5.00 x̃: 5 helped stats (rel) min: 0.89% max: 13.04% x̄: 5.37% x̃: 3.78% 95% mean confidence interval for instructions value: -10.66 0.66 95% mean confidence interval for instructions %-change: -14.59% 3.85% Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-28 15:35:46 -07:00
Ian Romanick	b6e247cf0e	i965/fs: Refactor image atomics to be a bit more like other atomics This greatly simplifies the next patch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-28 15:35:46 -07:00
Ian Romanick	fabe3ead57	i965/fs: Emit BRW_AOP_INC or BRW_AOP_DEC for atomicAdd of +1 or -1 Funny story... a single shader was hurt for instructions, spills, fills. That same shader was also the most helped for cycles. #GPUsAreWeird No changes on any other Intel platform. v2: Refactor selection of atomic opcode to a separate function. Suggested by Jason. Haswell, Broadwell, and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14304116 -> 14304261 (<.01%) instructions in affected programs: 12776 -> 12921 (1.13%) helped: 19 HURT: 1 helped stats (abs) min: 1 max: 16 x̄: 2.32 x̃: 1 helped stats (rel) min: 0.05% max: 7.27% x̄: 0.92% x̃: 0.55% HURT stats (abs) min: 189 max: 189 x̄: 189.00 x̃: 189 HURT stats (rel) min: 4.87% max: 4.87% x̄: 4.87% x̃: 4.87% 95% mean confidence interval for instructions value: -12.83 27.33 95% mean confidence interval for instructions %-change: -1.57% 0.31% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 527552861 -> 527531226 (<.01%) cycles in affected programs: 1459195 -> 1437560 (-1.48%) helped: 16 HURT: 2 helped stats (abs) min: 2 max: 21328 x̄: 1353.69 x̃: 6 helped stats (rel) min: 0.01% max: 5.29% x̄: 0.36% x̃: 0.03% HURT stats (abs) min: 12 max: 12 x̄: 12.00 x̃: 12 HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -3699.81 1295.92 95% mean confidence interval for cycles %-change: -0.94% 0.30% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 8025 -> 8033 (0.10%) spills in affected programs: 208 -> 216 (3.85%) helped: 1 HURT: 1 total fills in shared programs: 10989 -> 11040 (0.46%) fills in affected programs: 444 -> 495 (11.49%) helped: 1 HURT: 1 Ivy Bridge total instructions in shared programs: 11709181 -> 11709153 (<.01%) instructions in affected programs: 3505 -> 3477 (-0.80%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 23 x̄: 9.33 x̃: 4 helped stats (rel) min: 0.11% max: 1.16% x̄: 0.63% x̃: 0.61% total cycles in shared programs: 254741126 -> 254738801 (<.01%) cycles in affected programs: 919067 -> 916742 (-0.25%) helped: 3 HURT: 0 helped stats (abs) min: 21 max: 2144 x̄: 775.00 x̃: 160 helped stats (rel) min: 0.03% max: 0.90% x̄: 0.32% x̃: 0.03% total spills in shared programs: 4536 -> 4533 (-0.07%) spills in affected programs: 40 -> 37 (-7.50%) helped: 1 HURT: 0 total fills in shared programs: 4819 -> 4813 (-0.12%) fills in affected programs: 94 -> 88 (-6.38%) helped: 1 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v1] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-28 15:35:38 -07:00
Ian Romanick	41399f4bc7	intel/compiler: Silence unused parameter warnings in brw_eu.h All of the other brw_*_desc functions take a devinfo parameter, and all of the others at least have an assert that uses it. Keep the parameter, but mark it as unused. Silences 37 warnings like: In file included from src/intel/common/gen_disasm.c:27:0: src/intel/compiler/brw_eu.h: In function ‘brw_pixel_interp_desc’: src/intel/compiler/brw_eu.h:377:53: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_pixel_interp_desc(const struct gen_device_info *devinfo, ^~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-28 15:35:38 -07:00
Sagar Ghuge	56574f4df3	i965: enable AMD_depth_clamp_separate Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-28 12:57:27 -07:00
Sagar Ghuge	e6adea0dc0	i965: add functional changes for AMD_depth_clamp_separate Gen >= 9 have the ability to control clamping of depth values separately at the near and far planes. z_w is clamped to the range [min(n,f), 0] if clamping at the near plane is enabled, [0, max(n,f)] if clamping at the far plane is enabled and [min(n,f), max(n,f)] if clamping at both planes is enabled. v2: 1) Use better coding style (Ian Romanick) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-28 12:57:27 -07:00
Sagar Ghuge	2765749e0f	mesa: add EXTRA_EXT for AMD_depth_clamp_separate Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-28 12:57:27 -07:00
Sagar Ghuge	2770446740	mesa: add support for GL_AMD_depth_clamp_separate tokens _mesa_set_enable() and _mesa_IsEnabled() extended to accept two new tokens, GL_DEPTH_CLAMP_NEAR_AMD and GL_DEPTH_CLAMP_FAR_AMD. v2: Remove unnecessary parentheses (Marek Olsak) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-28 12:57:27 -07:00
Sagar Ghuge	5650d39978	mesa: Add support for AMD_depth_clamp_separate Enable _mesa_PushAttrib() and _mesa_PopAttrib() to handle GL_DEPTH_CLAMP_NEAR_AMD and GL_DEPTH_CLAMP_FAR_AMD tokens. Remove DepthClamp, because DepthClampNear + DepthClampFar replaces it, as suggested by Marek Olsak. Driver that enables AMD_depth_clamp_separate will only ever look at DepthClampNear and DepthClampFar, as suggested by Ian Romanick. v2: 1) Remove unnecessary parentheses (Marek Olsak) 2) if AMD_depth_clamp_separate is unsupported, TEST_AND_UPDATE GL_DEPTH_CLAMP only (Marek Olsak) 3) Clamp against near and far plane separately (Marek Olsak) 4) Clip point separately for near and far Z clipping plane (Marek Olsak) v3: Clamp raster position zw to the range [min(n,f), 0] for near plane and [0, max(n,f)] for far plane (Marek Olsak) v4: Use MIN2 and MAX2 instead of CLAMP (Marek Olsak) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-28 12:57:27 -07:00
Sagar Ghuge	379949b967	mesa: Add types for AMD_depth_clamp_separate. Add some basic types and storage for the AMD_depth_clamp_separate extension. v2: 1) Drop unnecessary definition (Marek Olsak) 2) Expose extension in compatibility profile (Marek Olsak) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-28 12:57:27 -07:00
Sagar Ghuge	f663fb5487	glapi: define AMD_depth_clamp_separate Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-28 12:57:27 -07:00
Jason Ekstrand	c92a463d23	anv: Claim to support depthBounds for ID games Cc: "18.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-28 13:05:54 -05:00
Jason Ekstrand	8c048af589	anv: Copy the appliation info into the instance Cc: "18.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-28 13:05:54 -05:00
Jason Ekstrand	4ffb575da5	vulkan/alloc: Add a vk_strdup helper Cc: "18.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-28 13:05:54 -05:00
Dylan Baker	7c00db9527	meson: Actually load translation files Currently we run the script but don't actually load any files, even in a tarball where they exist. Fixes: `3218056e0e` ("meson: Build i965 and dri stack") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-28 08:51:05 -07:00
Caio Marcelo de Oliveira Filho	f172a77dd8	nir: Remove outdated comment Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-28 08:11:03 -07:00
Kevin Rogovin	03ecec9ed2	i965: Add INTEL_fragment_shader_ordering support. Adds support for INTEL_fragment_shader_ordering. We achieve the fragment ordering by using the same instruction as for beginInvocationInterlockARB(), which is by issuing a memory fence via sendc. Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-08-28 17:15:10 +03:00
Kevin Rogovin	119435c877	mesa: Add GL/GLSL plumbing for INTEL_fragment_shader_ordering This extension provides a new GLSL built-in function, beginFragmentShaderOrderingINTEL(), that guarantees (taking the wording of the GL_INTEL_fragment_shader_ordering extension) that any memory transactions issued by shader invocations from previous primitives mapped to the same xy window coordinates (and same sample when per-sample shading is active) complete and are visible to the shader invocation that called beginFragmentShaderOrderingINTEL(). One advantage of INTEL_fragment_shader_ordering over ARB_fragment_shader_interlock is that it provides a function that operates as a memory barrier (instead of defining a critical section) that can be called under arbitrary control flow from any function (in contrast, the begin/end of ARB_fragment_shader_interlock may only be called once, from main(), under no control flow). Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-08-28 17:15:10 +03:00
Andrii Simiklit	1b0df8a460	i965/gen6/xfb: handle case where transform feedback is not active When the SVBI Payload Enable is false I guess the register R1.4, which contains the Maximum Streamed Vertex Buffer Index, is filled with zero and the GS stops writing transform feedback when the transform feedback is not active. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107579 Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-08-28 15:32:45 +02:00
Rhys Perry	743e11c10b	docs: add forgotten features to 18.2.0 release notes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 18.2: <mesa-stable@lists.freedesktop.org>	2018-08-28 13:50:51 +01:00
Erik Faye-Lund	a4e60ccb56	virgl: add debug-switch to output TGSI This is quite useful for debugging shader-transpiling issues in virglrenderer. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2018-08-28 14:13:43 +02:00
Erik Faye-Lund	4ab06cc56e	virgl: introduce $VIRGL_DEBUG=verbose This adds an environment variable that can be used for driver-specific flags, as well as a flag for it to enable verbose output. While we're at it, quiet some overly chatty debug output by default. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2018-08-28 14:13:43 +02:00
Erik Faye-Lund	1b2444dffc	virgl: replace fprintf-call with debug_printf This is the only direct call-site for fprintf in virgl; all other call-sites call debug_printf instead. So let's follow in style here. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2018-08-28 14:13:43 +02:00
Erik Faye-Lund	2ebfa90abe	virgl: delete commented out fprintf-call This is just debug-cruft left over. Let's just get rid of it. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2018-08-28 14:13:43 +02:00
Guido Günther	9de34b4dde	meson: Don't enable any vulkan drivers on arm, aarch64 There's no Vulkan support for arm atm. Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-27 11:32:04 -07:00
Guido Günther	05e2fc6860	meson: Be a bit more helpful when arch or OS is unknown V2: Add one missing @0@ Signed-off-by: Guido Günther <guido.gunther@puri.sm> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-27 11:31:52 -07:00
Sagar Ghuge	a1e3305f75	intel/eu: print bytes instead of 32 bit hex value INTEL_DEBUG=hex prints a 32 bit hex value and, due to the endianness of the CPU, the byte order is reversed. In order to disassemble binary files, print each byte instead of a 32 bit hex value. v2: Print blank spaces in order to vertically align output of compacted instructions hex value with uncompacted instructions hex value. (Matt Turner) v3: Fix line wrap at correct length Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-08-27 11:07:39 -07:00
Lionel Landwerlin	440a988bd1	intel: decoder: handle 0 sized structs Gen7.5 has a BLEND_STATE of size 0 which includes a variable length group. We did not deal with that very well, leading to an endless loop. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107544 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-27 18:33:18 +01:00
Rhys Perry	e56e600bd3	nv50/ir,nvc0: use constant buffers for compute when possible on Kepler+ Gives a +7.79% increase in FPS with Hitman on lowest quality settings on my GTX 1060. total instructions in shared programs : 5787979 -> 5748677 (-0.68%) total gprs used in shared programs : 669901 -> 669373 (-0.08%) total shared used in shared programs : 548832 -> 548832 (0.00%) total local used in shared programs : 21068 -> 21064 (-0.02%) local shared gpr inst bytes helped 1 0 152 274 274 hurt 0 0 0 0 0 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-08-27 14:23:42 +01:00
Rhys Perry	d27c791891	nv50/ir: optimize multiplication by 16-bit immediates into two xmads Rather than the usual three that would be created. total instructions in shared programs : 5796385 -> 5786560 (-0.17%) total gprs used in shared programs : 670103 -> 669968 (-0.02%) total shared used in shared programs : 548832 -> 548832 (0.00%) total local used in shared programs : 21164 -> 21068 (-0.45%) local shared gpr inst bytes helped 1 0 64 1040 1040 hurt 0 0 27 0 0 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-08-27 13:57:11 +01:00
Rhys Perry	400a4eb964	nv50/ir: optimize near power-of-twos into shladd total instructions in shared programs : 5819319 -> 5796385 (-0.39%) total gprs used in shared programs : 670571 -> 670103 (-0.07%) total shared used in shared programs : 548832 -> 548832 (0.00%) total local used in shared programs : 21164 -> 21164 (0.00%) local shared gpr inst bytes helped 0 0 318 1758 1758 hurt 0 0 63 0 0 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-08-27 13:57:01 +01:00
Rhys Perry	2f52925f5c	nv50/ir: move a * b -> a << log2(b) code into createMul() With this commit, OP_MAD is handled on nv50 too. This commit is also useful for later commits. Also, instead of creating a shladd, it relies on LateAlgebraicOpt to create one. This simplifies the code and helps shader-db slightly overall. total instructions in shared programs : 5820882 -> 5819319 (-0.03%) total gprs used in shared programs : 670595 -> 670571 (-0.00%) total shared used in shared programs : 548832 -> 548832 (0.00%) total local used in shared programs : 21164 -> 21164 (0.00%) local shared gpr inst bytes helped 0 0 18 230 230 hurt 0 0 8 263 263 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-08-27 13:56:47 +01:00
Rhys Perry	b60bc7a4ab	nv50/ir: optimize imul/imad to xmads This hits the shader-db numbers a good bit, though a few xmads is way faster than an imul or imad and the cost is mitigated by the next commit, which optimizes many multiplications by immediates into shorter and less register heavy instructions than the xmads. total instructions in shared programs : 5768871 -> 5820882 (0.90%) total gprs used in shared programs : 669919 -> 670595 (0.10%) total shared used in shared programs : 548832 -> 548832 (0.00%) total local used in shared programs : 21068 -> 21164 (0.46%) local shared gpr inst bytes helped 0 0 38 0 0 hurt 1 0 365 3076 3076 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-08-27 13:56:44 +01:00
Rhys Perry	bcbcdf8448	gm107/ir: add support for OP_XMAD on GM107+ v4: make the immediate field 16 bits v5: don't ever emit h1 flags for immediates Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-08-27 13:56:41 +01:00
Rhys Perry	5d6952d2de	nv50/ir: add preliminary support for OP_XMAD v4: remove uint16_t(...) v4: don't allow immediates outside [0,65535] in insnCanLoad() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-08-27 13:56:36 +01:00
vadym.shovkoplias	4a8444d5bc	glsl/linker: Allow unused in blocks which are not declated on previous stage From Section 4.3.4 (Inputs) of the GLSL 1.50 spec: "Only the input variables that are actually read need to be written by the previous stage; it is allowed to have superfluous declarations of input variables." Fixes: * interstage-multiple-shader-objects.shader_test v2: Update comment in ir.h since the usage of "used" field has been extended. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101247 Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-27 12:13:53 +02:00
Jason Ekstrand	07a227f543	nir: Pull block_ends_in_jump into nir.h We had two different implementations in different files. May as well have one and put it in nir.h. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-27 02:15:38 -05:00
Samuel Iglesias Gonsálvez	59a8e0dbf8	anv: Add support for protected memory properties on anv_GetPhysicalDeviceProperties2() VkPhysicalDeviceProtectedMemoryProperties structure is new on Vulkan 1.1. Fixes Vulkan CTS CL#2849. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-27 09:07:52 +02:00
Jason Ekstrand	aad501f15e	intel/tools: Add 0x in front of a couple of hex values Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-25 18:47:08 -05:00
Jason Ekstrand	76b0e4d8c9	anv: Fill holes in the VF VUE to zero This fixes a GPU hang in DOOM 2016 running under wine. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104809 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-25 18:47:08 -05:00
Kai Wasserbäch	b2313ef4a8	intel: tools: Fix aubinator_error's fprintf call (format-security) The recent commit `4616639b49` introduced the new function aubinator_error() which is a trivial wrapper around fprintf() to STDERR. The call to fprintf() however is passed the message msg directly: fprintf(stderr, msg); This is a format-security violation and leads to an FTBFS with -Werror=format-security (GCC 8): ../../../src/intel/tools/aubinator.c: In function 'aubinator_error': ../../../src/intel/tools/aubinator.c:74:4: error: format not a string literal and no format arguments [-Werror=format-security] fprintf(stderr, msg); ^~~~~~~ This patch fixes this trivially by introducing a catch-all "%s" format argument. Fixes: `4616639b49` ("intel: tools: split aub parsing from aubinator") Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-25 16:52:12 +01:00
Jason Ekstrand	70de31d0c1	intel/batch_decoder: Print blend states properly Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-25 07:50:45 -05:00
Jason Ekstrand	cbd4bc1346	intel/batch_decoder: Fix dynamic state printing Instead of printing addresses like everyone else, we were accidentally printing the offset from state base address. Also, state_map is a void pointer so we were incrementing in bytes instead of dwords and every state other than the first was wrong. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-25 07:50:43 -05:00
Jason Ekstrand	d1971be6ea	intel/decoder: Print ISL formats for vertex elements Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-25 07:50:40 -05:00
Jason Ekstrand	2abd7ae189	intel/decoder: Clean up field iteration and fix sub-dword fields First of all, setting iter->name in advance_field is unnecessary because it gets set by gen_decode_field which gets called immediately after advance_field in the one call-site. Second, we weren't properly initializing start_bit and end_bit in the initial condition of gen_field_iterator_next so the first field of a struct would get printed wrong if it doesn't start on the first bit. This is fixed by adding an iter_start_field helper which sets the field and also sets up the other bits we need. This fixes decoding of 3DSTATE_SBE_SWIZ. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-25 07:50:36 -05:00
Kenneth Graunke	1281608849	gallium: Split out PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE. Some hardware can do PIPE_TEX_WRAP_MIRROR_REPEAT but not PIPE_TEX_WRAP_MIRROR_CLAMP and PIPE_TEX_WRAP_MIRROR_CLAMP_TO_BORDER. Drivers for such hardware would like to advertise support for ARB_texture_mirror_clamp_to_edge but not EXT_texture_mirror_clamp. This commit adds a new PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE bit, changes the extension enable to be based on that, and enables it in all upstream drivers which supported PIPE_CAP_TEXTURE_MIRROR_CLAMP (so they continue supporting this mode).	2018-08-24 17:25:36 -07:00
Lionel Landwerlin	f430a37fa7	intel: decoder: unify MI_BB_START field naming The batch decoder looks for a field with a particular name to decide whether an MI_BB_START leads into a second batch buffer level. Because the names are different between Gen7.5/8 and the newer generation we fail that test and keep on reading (invalid) instructions. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107544 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-24 23:10:08 +01:00
Dylan Baker	7f745c19c1	docs: Update calendar, news, relnotes for 18.1.7	2018-08-24 09:35:24 -07:00
Dylan Baker	82c2e7bf9e	docs: Add mesa 18.1.7 notes	2018-08-24 09:34:03 -07:00
Dylan Baker	2d8569073e	docs: Add mesa 18.1.7 docs	2018-08-24 09:33:59 -07:00
Andres Gomez	0d3bb146a8	docs: update calendar 18.2.0-rc4 is out, extend to 18.2.0-rc5 Signed-off-by: Andres Gomez <agomez@igalia.com>	2018-08-24 18:58:00 +03:00
Kevin Rogovin	e345247092	docs/relnotes: Mark NV_fragment_shader_interlock support in i965 Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-24 08:59:54 -05:00
Emil Velikov	081395e99d	egl/drm: use gbm_dri_bo() wrapper Remove the explicit cast, using the appropriate wrapper instead. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Daniel Stone <daniels@collabora.com>	2018-08-24 11:53:24 +01:00
Emil Velikov	7b4269a5e0	egl/drm: use gbm_dri_surface() wrapper Remove the explicit cast, using the appropriate wrapper instead. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Daniel Stone <daniels@collabora.com>	2018-08-24 11:53:20 +01:00
Emil Velikov	7eb4a28d41	egl/drm: use gbm_dri_device() wrapper Remove the explicit cast, using the appropriate wrapper instead. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Daniel Stone <daniels@collabora.com>	2018-08-24 11:52:48 +01:00
Emil Velikov	2c049384b1	egl/android: simplify device open/probe Currently droid_probe_device does not do any 'probing'; it only filters out a device if it doesn't match the given vendor string. Rename the function, straighten the return type and call it only when needed, that is, when an actual vendor string is provided. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2018-08-24 11:52:44 +01:00
Emil Velikov	2f8403a4ca	egl/android: remove drmVersion::name NULL check The name string is guaranteed to be non-NULL. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2018-08-24 11:52:41 +01:00
Emil Velikov	d1211f3112	egl/android: remove droid_probe_driver() The function name is misleading - it effectively checks if loader_get_driver_for_fd fails, which can happen only on strdup error - a close to impossible scenario. Drop the function - we call the loader API at a later stage. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2018-08-24 11:52:39 +01:00
Emil Velikov	9b5bf7afce	egl/android: use strcmp with drmVersion::name The name string is guaranteed to be NULL terminated. Drop the explicit length check that comes with strncmp(). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2018-08-24 11:52:37 +01:00
Emil Velikov	3827966643	egl/android: use drmDevice instead of the manual /dev/dri iteration Replace the manual handling of /dev/dri in favor of the drmDevice API. The latter provides a consistent way of enumerating the devices, providing device details as needed. v2: - Use ARRAY_SIZE (Frank) - s/famour/favor/ typo (Frank) - Make MAX_DRM_DEVICES a macro - fix vla errors (RobF) - Remove left-over dev_path instance (RobF) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Robert Foss <robert.foss@collabora.com> (v1) Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2018-08-24 11:50:36 +01:00
Emil Velikov	cff80b6c15	Revert "configure: allow building with python3" This reverts commit `ae7898dfdb`. Turns out the python scripts are _not_ fully python 3 compatible. As Ilia reported using get_xmlpool.py with LANG=C produces some weird output - see the link for details. Even though the issue was spotted with the autoconf build, it exposes a genuine problem with the script (and lack of lang handling of the meson build.) https://lists.freedesktop.org/archives/mesa-dev/2018-August/203508.html	2018-08-24 11:14:15 +01:00
Emil Velikov	7a4d2d1fdf	Revert "travis: use python3 for the autoconf builds" This reverts commit `855af9a5a2`. Turns out the python scripts are _not_ fully python 3 compatible. As Ilia reported using get_xmlpool.py with LANG=C produces some weird output - see the link for details. Even though the issue was spotted with the autoconf build, it exposes a genuine problem with the script (and lack of lang handling of the meson build.) https://lists.freedesktop.org/archives/mesa-dev/2018-August/203508.html	2018-08-24 11:10:24 +01:00
Kenneth Graunke	93e8e17fa4	Revert "mesa: bump GL_MAX_ELEMENTS_INDICES and GL_MAX_ELEMENTS_VERTICES" This reverts commit `095515e16c`. This breaks KHR-GL46.map_buffer_alignment.functional on i965. This code was apparently not reviewed and I don't know why we would move from a driver configurable constant to a hardcoded value for all drivers. This really looks like an accidental hack push.	2018-08-24 00:36:01 -07:00
Kenneth Graunke	9d670fd86c	Revert recent changes about not including compute in combined limits. As far as I can tell, no one reviewed these changes, they made i965 assert fail on driver load, and I am not certain they are correct. (Hopefully reverting these does not break radeonsi too badly...) The uniform related changes seem fine and reasonable, but the texture image units change is possibly incorrect. According to the OES_tessellation_shader spec issue 5: (5) How are aggregate shader limits computed? RESOLVED: Following the GL 4.4 model, but we restrict uniform buffer bindings to 12/stage instead of 14, this results in MAX_UNIFORM_BUFFER_BINDINGS = 72 This is 12 bindings/stage * 6 shader stages, allowing a static partitioning of the bindings even though at most 5 stages can appear in a program object). MAX_COMBINED_UNIFORM_BLOCKS = 60 This is 12 blocks/stage * 5 stages, since compute shaders can't be mixed with other stages. MAX_COMBINED_TEXTURE_IMAGE_UNITS = 96 This is 16 textures/stage * 6 stages. which definitely is including compute shaders in that last limit. Not including compute shaders breaks the following test: dEQP-GLES31.functional.state_query.integer.max_combined_texture_image_units_getinteger There was enough breakage that I figured we should just send this back to the drawing board. Revert "i965: don't include compute resources in "Combined" limits" Revert "st/mesa: don't include compute resources in "Combined" limits" Revert "mesa: don't include compute resources in MAX_COMBINED_* limits" This reverts commit `b03dcb1e5f`. This reverts commit `cff290df4c`. This reverts commit `45f87a48f9`.	2018-08-24 00:36:01 -07:00
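The aggregate limits quoted from the OES_tessellation_shader spec issue above are plain per-stage products. A minimal sketch of that arithmetic follows; the enum and function names are hypothetical and are not Mesa identifiers.

```c
#include <assert.h>

/* Stage counts as described in the quoted spec issue. */
enum {
    ALL_STAGES = 6,      /* every shader stage, compute included */
    GRAPHICS_STAGES = 5  /* stages that can appear together in one program */
};

/* Hypothetical helper: aggregate limit = per-stage limit * number of stages. */
static unsigned combined_limit(unsigned per_stage, unsigned stages)
{
    assert(stages == ALL_STAGES || stages == GRAPHICS_STAGES);
    return per_stage * stages;
}
```

With these inputs, `combined_limit(12, ALL_STAGES)` gives 72 (MAX_UNIFORM_BUFFER_BINDINGS), `combined_limit(12, GRAPHICS_STAGES)` gives 60 (MAX_COMBINED_UNIFORM_BLOCKS), and `combined_limit(16, ALL_STAGES)` gives 96 (MAX_COMBINED_TEXTURE_IMAGE_UNITS), which is why the last limit must count compute.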
Roland Scheidegger	8e1be9a34a	gallivm: don't use saturated unsigned add/sub intrinsics for llvm 8.0 These have been removed. Unfortunately auto-upgrade doesn't work for jit. (Worse, it seems we don't get a compilation error anymore when compiling the shader, rather llvm will just do a call to a null function in the jitted shaders making it difficult to detect when intrinsics vanish.) Luckily the signed ones are still there, I helped convincing llvm removing them is a bad idea for now, since while the unsigned ones have sort of agreed-upon simplest patterns to replace them with, this is not the case for the signed ones, and they require _significantly_ more complex patterns - to the point that the recognition is IMHO probably unlikely to ever work reliably in practice (due to other optimizations interfering). (Even for the relatively trivial unsigned patterns, llvm already added test cases where recognition doesn't work, unsaturated add followed by saturated add may produce atrocious code.) Nevertheless, it seems there's a serious quest to squash all cpu-specific intrinsics going on, so I'd expect patches to nuke them as well to resurface. Adapt the existing fallback code to match the simple patterns llvm uses and hope for the best. I've verified with lp_test_blend that it does produce the expected saturated assembly instructions. Though our cmp/select build helpers don't use boolean masks, but it doesn't seem to interfere with llvm's ability to recognize the pattern. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106231 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-08-24 07:50:13 +02:00
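For readers unfamiliar with the "simple patterns" mentioned in the commit above, here is a scalar sketch of the usual compare-and-select formulation of unsigned saturated add and subtract. It is illustrative only: the function names are hypothetical, and whether these exact shapes are the ones LLVM pattern-matches is an assumption, not something the commit message states.

```c
#include <stdint.h>

/* Unsigned saturating add: if the wrapped sum is smaller than an operand,
 * the addition overflowed, so clamp to the maximum value. */
static uint8_t uadd_sat_u8(uint8_t a, uint8_t b)
{
    uint8_t sum = (uint8_t)(a + b);   /* wraps modulo 256 */
    return sum < a ? UINT8_MAX : sum;
}

/* Unsigned saturating subtract: clamp to zero instead of wrapping. */
static uint8_t usub_sat_u8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}
```

The vector versions apply the same compare and select per lane.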
Marek Olšák	45b5f5fa25	st/mesa: expose KHR_texture_compression_astc_sliced_3d This is ASTC 2D LDR allowing texture arrays and 3D, compressing each slice as a separate 2D image. Tested by piglit. Trivial.	2018-08-24 00:36:18 -04:00
Marek Olšák	dae4cf397d	st/mesa: expose EXT_disjoint_timer_query same cap as ARB_timer_query, no changes needed, tested by piglit	2018-08-24 00:36:18 -04:00
Marek Olšák	263c962cfd	mesa: expose EXT_vertex_attrib_64bit because the closed driver exposes it. It's the same as the ARB extension. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-24 00:36:18 -04:00
Marek Olšák	5c90091036	mesa: expose AMD_query_buffer_object it's a subset of the ARB extension. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-24 00:36:18 -04:00
Marek Olšák	056b9a5a36	mesa: expose AMD_multi_draw_indirect because the closed driver exposes it. This is equivalent to the ARB extension. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-24 00:36:18 -04:00
Marek Olšák	b3c17330e6	mesa: expose AMD_gpu_shader_int64 because the closed driver exposes it. It's equivalent to ARB_gpu_shader_int64. In this patch, I did everything the same as we do for ARB_gpu_shader_int64. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-24 00:36:18 -04:00
Marek Olšák	1cf3631b9c	mesa: expose ARB_post_depth_coverage in the Compatibility profile It only contains GLSL changes. v2: allow the layout qualifier on GLSL <= 1.30	2018-08-24 00:36:18 -04:00
Jason Ekstrand	8d8222461f	intel/nir: Enable nir_opt_find_array_copies We have to be a bit careful with this one because we want it to run in the optimization loop but only in the first brw_nir_optimize call. Later calls assume that we've lowered away copy_deref instructions and we don't want to introduce any more. Shader-db results on Kaby Lake: total instructions in shared programs: 15176942 -> 15176942 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 In spite of the lack of any shader-db improvement, this patch completely eliminates spilling in the Batman: Arkham City tessellation shaders. This is because we are now able to detect that the temporary array created by DXVK for storing TCS inputs is a copy of the input arrays and use indirect URB reads instead of making a copy of 4.5 KiB of input data and then indirecting on it with if-ladders. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-23 21:47:51 -05:00
Jason Ekstrand	53072582dc	nir: Add an array copy optimization This peephole optimization looks for a series of load/store_deref or copy_deref instructions that copy an array from one variable to another and turns it into a copy_deref that copies the entire array. The pattern it looks for is extremely specific but it's good enough to pick up on the input array copies in DXVK and should also be able to pick up the sequence generated by spirv_to_nir for a OpLoad of a large composite followed by OpStore. It can always be improved later if needed. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-23 21:47:47 -05:00
Jason Ekstrand	a4a9c07549	intel/nir: Use nir_shrink_vec_array_vars Shader-db results on Kaby Lake: total instructions in shared programs: 15177605 -> 15176765 (<.01%) instructions in affected programs: 4259 -> 3419 (-19.72%) helped: 1 HURT: 0 total spills in shared programs: 10954 -> 10855 (-0.90%) spills in affected programs: 295 -> 196 (-33.56%) helped: 1 HURT: 0 total fills in shared programs: 22222 -> 22117 (-0.47%) fills in affected programs: 417 -> 312 (-25.18%) helped: 1 HURT: 0 The helped shader is from the OglCSDof synmark test. On my Kaby Lake laptop, the actual framerate of the benchmark didn't appear to improve beyond the noise. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-23 21:46:56 -05:00
Jason Ekstrand	be8d009908	nir: Add an array-of-vector variable shrinking pass This pass looks for variables with vector or array-of-vector types and narrows the type to only the components used. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-23 21:46:56 -05:00
Jason Ekstrand	02a5442dd7	intel/nir: Use the new structure and array splitting passes We call structure splitting once because it is guaranteed to split all the structures in the entire shader in one go. We call array splitting in the loop in case future optimizations turn indirects into direct dereferences and we can split more arrays. Shader-db results on Kaby Lake: total instructions in shared programs: 15177605 -> 15177605 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 This is unsurprising because nir_lower_vars_to_ssa already effectively does structure and array splitting internally. It doesn't actually split the variables but its ability to reason about aliasing in the presence of arrays and structures and pick out scalars or vectors to be lowered to SSA values is fairly advanced. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-23 21:44:14 -05:00
Jason Ekstrand	fa6417495c	nir: Add an array splitting pass This pass looks for array variables where at least one level of the array is never indirected and splits it into multiple smaller variables. This pass doesn't really do much now because nir_lower_vars_to_ssa can already see through arrays of arrays and can detect indirects on just one level or even see that arr[i][0][5] does not alias arr[i][1][j]. This pass exists to help other passes more easily see through arrays of arrays. If a back-end does implement arrays using scratch or indirects on registers, having more smaller arrays is likely to have better memory efficiency. v2 (Jason Ekstrand): - Better comments and naming (some from Caio) - Rework to use one hash map instead of two v2.1 (Jason Ekstrand): - Fix a couple of bugs that were added in the rework including one which basically prevented it from running Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-23 21:44:14 -05:00
Jason Ekstrand	26eb077ec4	nir: Add a structure splitting pass This pass doesn't really do much now because nir_lower_vars_to_ssa can already see through structures and considers them to be "split". This pass exists to help other passes more easily see through structure variables. If a back-end does implement arrays using scratch or indirects on registers, having more smaller arrays is likely to have better memory efficiency. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-23 21:44:14 -05:00
Jason Ekstrand	b489998e63	nir/types: Add array_or_matrix helpers Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-08-23 21:44:14 -05:00
Kenneth Graunke	b03dcb1e5f	i965: don't include compute resources in "Combined" limits The combined limits should only include shader stages that can be active at the same time. We don't need to include compute. See also `cff290df4c` for st/mesa. Unbreaks i965 from assert failing on driver load since Marek's `45f87a48f9`, which dropped the core Mesa capabilities before adjusting driver limits down to match.	2018-08-23 17:27:27 -07:00
Marek Olšák	9176703788	radeonsi: increase the maximum UBO size to 2 GB Same as the closed driver. This causes a failure in GL45-CTS.compute_shader.max, which has a trivial bug. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-08-23 16:56:17 -04:00
Marek Olšák	5693ca865d	radeonsi: bump MAX_GS_INVOCATIONS same as the closed driver Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-08-23 16:56:17 -04:00
Marek Olšák	d3c1b212bc	gallium: add PIPE_CAP_MAX_SHADER_BUFFER_SIZE Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-08-23 16:56:17 -04:00
Marek Olšák	f6ccd594e7	gallium: add PIPE_CAP_MAX_GS_INVOCATIONS Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-08-23 16:56:17 -04:00
Marek Olšák	8c71b70f07	tgsi/ureg: don't call tgsi_sanity when it's too slow Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-08-23 16:56:17 -04:00
Marek Olšák	80aecad0ca	st/mesa: fix up uniform limits to be able to expose large UBOs Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-08-23 16:56:17 -04:00
Marek Olšák	cff290df4c	st/mesa: don't include compute resources in "Combined" limits The combined limits should only include shader stages that can be active at the same time. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-08-23 16:56:17 -04:00
Marek Olšák	d36af3a9d9	st/mesa: set ctx->Const.SubPixelBits Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-08-23 16:56:17 -04:00
Marek Olšák	3867af39f9	glsl: fix error checking against MAX_UNIFORM_LOCATIONS Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-08-23 16:56:17 -04:00
Marek Olšák	f01338118c	mesa: make MaxCombinedUniformComponents 64-bit to allow large UBOs Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-08-23 16:56:17 -04:00
Marek Olšák	a8b71f2db8	mesa: add ctx->Const.MaxGeometryShaderInvocations radeonsi wants to report a different value Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-08-23 16:56:17 -04:00
Marek Olšák	45f87a48f9	mesa: don't include compute resources in MAX_COMBINED_* limits 5 is the maximum number of shader stages that can be used by 1 execution call at the same time (e.g. a draw call). The limit ensures that each stage can use all of its binding points. Compute is separate and doesn't need the 5x multiplier. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-08-23 16:56:17 -04:00
Marek Olšák	095515e16c	mesa: bump GL_MAX_ELEMENTS_INDICES and GL_MAX_ELEMENTS_VERTICES same number as our closed GL driver v2: don't use MaxArrayLockSize Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-08-23 16:56:17 -04:00
Marek Olšák	356ff963ec	mesa: remove incorrect change for EXT_disjoint_timer_query Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-08-23 16:56:17 -04:00
Marek Olšák	37eee90df7	glapi: actually implement GL_EXT_robustness for GLES The extension was exposed but not the functions. This fixes: dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.readn_pixels dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_nuniformfv dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_nuniformiv Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-08-23 16:54:30 -04:00
Kenneth Graunke	578e45ab7b	intel/decoder: Decode SFIXED values. This lets us examine SAMPLER_STATE's LOD Bias field, among other things. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-23 13:04:53 -07:00
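Decoding a signed fixed-point field generally means extracting the bit range, sign-extending it, and scaling by the number of fractional bits. The sketch below shows that under stated assumptions: `decode_sfixed` and its parameters are hypothetical, and it is not the decoder's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decode a two's-complement fixed-point field that
 * occupies bits [start_bit, end_bit] (inclusive) of a dword and carries
 * frac_bits fractional bits. */
static double decode_sfixed(uint32_t dword, unsigned start_bit,
                            unsigned end_bit, unsigned frac_bits)
{
    assert(start_bit <= end_bit && end_bit < 32);
    unsigned width = end_bit - start_bit + 1;
    assert(frac_bits < width);

    uint32_t mask = (width == 32) ? UINT32_MAX : (UINT32_C(1) << width) - 1;
    uint32_t raw = (dword >> start_bit) & mask;

    /* Sign-extend: if the top bit of the field is set, subtract 2^width. */
    int64_t value = raw;
    if (raw & (UINT32_C(1) << (width - 1)))
        value -= (int64_t)1 << width;

    return (double)value / (double)(UINT32_C(1) << frac_bits);
}
```

For example, with an assumed 13-bit field holding 8 fractional bits (a layout picked only for illustration), the raw value `0x1E80` decodes to -1.5 and `0x0180` decodes to 1.5.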
Emil Velikov	855af9a5a2	travis: use python3 for the autoconf builds Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-23 17:00:28 +01:00
Emil Velikov	ae7898dfdb	configure: allow building with python3 Pretty much all of the scripts are python2+3 compatible. Check and allow using python3, while adjusting the PYTHON2 refs. Note: - python3.4 is used as it's the earliest supported version - python3 chosen prior to python2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-23 17:00:13 +01:00
Emil Velikov	c51e7486d9	bin/git_sha1_gen.py: remove execute bit/shebang The script is executed explicitly via the build system, that uses PYTHON/prog_python and equivalent. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-23 17:00:04 +01:00
Eric Engestrom	993a456360	vk/wsi: avoid reading uninitialised memory It will be ignored by x11_swapchain_result() anyway (because reaching the `fail` label without setting `result` means the swapchain status was already a hard error), but the compiler still complains about reading uninitialised memory. While at it, drop the unused assignment right before returning. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-23 14:47:59 +01:00
Eric Engestrom	a0f6a11944	egl: drop unused _EGL_BUILT_IN_DRIVER_DRI2 Unused since `b174a1ae72` "egl: Simplify the "driver" interface". Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-08-23 14:47:59 +01:00
Samuel Pitoiset	87fbc16e34	radv/gfx9: implement coherent shaders for VK_ACCESS_SHADER_READ_BIT Single-sample color and single-sample depth (not stencil) are coherent with shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-23 15:42:56 +02:00
Mathieu Bridon	6027d354d1	bin/install_megadrivers.py: Remove shebang and executable bit Since the script is never executed directly, but launched by Meson as an argument to the Python interpreter, those are not needed any more. In addition, they are the reason this script was missed when I moved the Meson buildsystem to Python 3, so removing them helps avoiding future confusion. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-23 12:12:06 +01:00
Mathieu Bridon	8c8fd0bb8e	meson: Run the install script with Python 3 The script was being run directly as an executable, and it has a Python 2 shebang. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-23 12:12:06 +01:00
Emil Velikov	48820ed8da	glsl: remove execute bit and shebang from python tests Just like the rest of the tree - these should be run either as part of the build system check target, or at the very least with an explicitly versioned python executable. Fixes: `db8cd8e367` ("glcpp/tests: Convert shell scripts to a python script") Fixes: `97c28cb082` ("glsl/tests: Convert optimization-test.sh to pure python") Fixes: `3b52d29227` ("glsl/tests: reimplement warnings-test in python") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-23 12:02:45 +01:00
Emil Velikov	e39b916d0c	docs: update required mako version The requirement was bumped a while back, but we forgot to update the docs. Fixes: `ed871af91c` ("configure.ac: raise Mako required version to 0.8.0") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-23 12:02:45 +01:00
Emil Velikov	e7149369bd	configure: use distutils in ax_check_python_mako_module Handling the version comparison by hand is a bad idea. Python has a handy module distutils for that - use it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-23 11:59:48 +01:00
Emil Velikov	df2042d99a	configure: enforce python 2.7 with AM_PATH_PYTHON Currently we use AC_CHECK_PROGS looking for python2.7, python2 and finally python. That is due to the varying names used across the different OS. Use the handy AM_PATH_PYTHON which finds the correct name and checks for the version. Note: python2.7 has been an unofficial requirement for quite some time. Update the docs to reflect that. Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-23 11:55:55 +01:00
Ian Romanick	c7c0b391ef	i965: Enable INTEL_shader_atomic_float_minmax on Gen9+ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Ian Romanick	59c17dbc6c	i965: Sort Gen9+ extension enables This is a strictly alphabetic sort, as is done in extensions_table.h There are other options. We should pick one and document it. Right now, this file is chaos. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Ian Romanick	d515c75463	intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages v2: Split changes to the message type field to another patch. Suggested by Caio. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Ian Romanick	f347348f8a	intel/compiler: Expand untyped atomic message type field by a bit This is necessary for a new Gen9 message type that will be added in the next patch. There are also Gen8 message types that need the extra bit (mostly for bindless). v2: Split off from the next patch. Suggested by Caio. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Ian Romanick	d628642a34	intel/compiler: Silence unused parameter warnings src/intel/compiler/brw_disasm_info.c: In function ‘nir_print_instr’: src/intel/compiler/brw_disasm_info.c:30:61: warning: unused parameter ‘instr’ [-Wunused-parameter] __attribute__((weak)) void nir_print_instr(const nir_instr *instr, FILE *fp) {} ^~~~~ src/intel/compiler/brw_disasm_info.c:30:74: warning: unused parameter ‘fp’ [-Wunused-parameter] __attribute__((weak)) void nir_print_instr(const nir_instr *instr, FILE *fp) {} ^~ src/intel/compiler/brw_disasm.c: In function ‘src_ia1’: src/intel/compiler/brw_disasm.c:850:18: warning: unused parameter ‘_reg_file’ [-Wunused-parameter] unsigned _reg_file, ^~~~~~~~~ src/intel/compiler/brw_fs_surface_builder.cpp: In function ‘void brw::surface_access::emit_byte_scattered_write(const brw::fs_builder&, const fs_reg&, const fs_reg&, const fs_reg&, unsigned int, unsigned int, unsigned int, brw_predicate)’: src/intel/compiler/brw_fs_surface_builder.cpp:193:57: warning: unused parameter ‘size’ [-Wunused-parameter] unsigned dims, unsigned size, ^~~~ v2: Update commit message. brw_fs_generator.cpp warnings were already fixed by another patch. Noticed by Caio. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Ian Romanick	0842655ac6	nir: Add floating point atomic min, max, and compare-swap intrinsics Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Ian Romanick	69ce7baa9e	nir: Add floating point atomic add intrinsics Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Ian Romanick	a390158d10	glsl: Add support for lowering shared-variable float atomics Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Ian Romanick	39bf3100ac	glsl: Add support for lowering SSBO float atomics Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Ian Romanick	280ab4afa8	glsl: Add built-in functions for INTEL_shader_atomic_float_minmax Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Ian Romanick	c9d52c83a4	mesa: Extension boilerplate for INTEL_shader_atomic_float_minmax Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Ian Romanick	346321a836	docs: Initial version of INTEL_shader_atomic_float_minmax spec v2: Describe interactions with the capabilities added by SPV_INTEL_shader_atomic_float_minmax v3: Remove 64-bit float support. v4: Explain NaN issues. Explain issues with atomicMin(-0, +0) and atomicMax(-0, +0). v5: Fix whitespace issues noticed by Caio. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Ian Romanick	88b6c7bc14	glsl: Add built-in functions for NV_shader_atomic_float Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Ian Romanick	9527bb4e70	mesa: Extension boilerplate for NV_shader_atomic_float Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 20:31:32 -07:00
Gurchetan Singh	c731508b98	meson: fix egl build for android Haven't tested this, but we do include loader.h in platform_android.c Fixes: `c5ec155685` ("meson: wire up egl/android") Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-22 16:47:19 -07:00
Gurchetan Singh	ec6cb01e21	meson: fix egl build for surfaceless Without this, I get: > platform_surfaceless.c:38:10: fatal error: 'loader.h' file not found > #include "loader.h" > ^~~~~~~~~~ > 1 error generated. Fixes: `108d257a16` ("meson: build libEGL") Reviewed-by: Dylan Baker <dylan@pnwbakers.com> v2: Split up patches, modify commit message (Dylan)	2018-08-22 16:47:09 -07:00
Caio Marcelo de Oliveira Filho	410de0e3f1	nir: Give end_block its own index Since there's no particular reason for the index to be 0, choose an index that is not used by other block. This is convenient when we store "per-block" data in an array AND look for the successors data (e.g. any kind of backwards data-flow analysis). v2: Add a note about end_block's index. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-22 14:41:26 -07:00
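To illustrate why a unique end-block index is convenient for backwards data-flow, here is a toy sketch. It assumes the end block takes the index right after the last regular block, which the commit message does not state; all names are hypothetical and unrelated to the NIR API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical straight-line CFG: 0 -> 1 -> 2 -> end. */
enum { NUM_BLOCKS = 3, END_BLOCK_INDEX = NUM_BLOCKS };

/* Successor index of each regular block. */
static const unsigned successor[NUM_BLOCKS] = { 1, 2, END_BLOCK_INDEX };

/* Backwards propagation over per-block data stored in one array.
 * Because the end block has its own slot, looking up a successor's
 * data needs no special case. */
static void propagate_backwards(bool reaches_end[NUM_BLOCKS + 1])
{
    reaches_end[END_BLOCK_INDEX] = true;
    for (int b = NUM_BLOCKS - 1; b >= 0; b--) {
        assert(successor[b] <= END_BLOCK_INDEX);
        reaches_end[b] = reaches_end[successor[b]];
    }
}
```

If the end block shared index 0 with the start block, the lookup `reaches_end[successor[2]]` would read the start block's slot instead.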
Caio Marcelo de Oliveira Filho	8364ec3fce	nir: Skip common instructions when comparing deref paths Deref paths may share the same deref instructions in their chains, e.g. ssa_100 = deref_var A ssa_101 = deref_struct "array_field" of ssa_100 ssa_102 = deref_array "[1]" of ssa_101 ssa_103 = deref_struct "field_a" of ssa_102 ssa_104 = deref_struct "field_a" of ssa_103 when comparing the two last deref instructions, their paths will share a common sequence ssa_100, ssa_101, ssa_102. This patch skips to next iteration if the deref instructions are the same. Path[0] (the var) is still handled specially, so in the case above, only ssa_101 and ssa_102 will be skipped. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-22 14:41:26 -07:00
Caio Marcelo de Oliveira Filho	5196041e93	nir: Export deref comparison functions Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-22 14:41:26 -07:00
Caio Marcelo de Oliveira Filho	7f8ecedced	util/dynarray: add a clone function v2: Fix mem_ctx parameter type. (Thomas) Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-22 14:41:26 -07:00
Mariusz Ceier	61b84b8c14	amd/addrlib: Fix include path for c99_compat.h Without this patch mesa doesn't compile: In file included from ../mesa-9999/src/amd/addrlib/addrinterface.cpp:39: ../mesa-9999/src/util/macros.h:29:10: fatal error: c99_compat.h: No such file or directory #include "c99_compat.h" ^~~~~~~~~~~~~~ compilation terminated. Fixes: `15ca5ce99a` ("amd/addrlib: mark returnCode as MAYBE_UNUSED in") Signed-off-by: Mariusz Ceier <mceier+mesa-dev@gmail.com> Acked-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-22 14:39:02 -07:00
Grazvydas Ignotas	0076ea92a9	vulkan/wsi: fix pointer-integer conversion warnings For 32bit build. Trivial. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-23 00:34:32 +03:00
Grazvydas Ignotas	9177074524	radv: use different builtin shader cache for 32bit Currently if 64bit and 32bit programs are used interchangeably, radv will keep overwriting the cache. Use separate cache files to avoid that. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-23 00:34:32 +03:00
Grazvydas Ignotas	356f6673d6	radv: place pointer length into cache uuid Thanks to reproducible builds, binary file timestamps may be identical for both 32bit and 64bit packages when built from the same source. This means radv will use the same cache for both 32 and 64 bit processes, which leads to crashes. Conveniently there is a spare byte in cache_uuid, let's place the pointer size there. Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" CC: 18.1 18.2 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107601 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105904 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-23 00:34:32 +03:00
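The idea in the commit above can be sketched as follows: if the cache identifier is derived only from a build timestamp, 32-bit and 64-bit builds of the same source can collide, so one byte is used for the pointer size. The layout and names below are hypothetical and do not reflect radv's real cache_uuid format.

```c
#include <stdint.h>
#include <string.h>

#define UUID_SIZE 16

/* Hypothetical layout: bytes 0..3 hold a build timestamp, byte 4 holds the
 * pointer size, and the remaining bytes are zero. */
static void build_cache_uuid(uint8_t uuid[UUID_SIZE],
                             uint32_t build_timestamp,
                             uint8_t pointer_size)
{
    memset(uuid, 0, UUID_SIZE);
    memcpy(uuid, &build_timestamp, sizeof(build_timestamp));
    uuid[4] = pointer_size;
}
```

A caller would typically pass `(uint8_t)sizeof(void *)`, so that builds with identical timestamps but different pointer widths end up with different identifiers.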
Grazvydas Ignotas	2edf47edf0	llvmpipe: add cc clobber to inline asm The bsr instruction modifies flags, so that needs to be indicated to the compiler. No effect on generated code, but still needed for correctness. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-08-23 00:34:32 +03:00
Nanley Chery	6d80b0b4ba	intel/isl: Avoid tiling some 16K-wide render targets Fix rendering issues on BDW and SKL. Fixes: `0288fe8d04` ("i965/miptree: Use the correct BLT pitch") Fixes the following regressions seen exclusively on SKL: * KHR-GL46.texture_barrier_ARB.disjoint-texels * KHR-GL46.texture_barrier_ARB.overlapping-texels * KHR-GL46.texture_barrier.disjoint-texels * KHR-GL46.texture_barrier.overlapping-texels and both on BDW and SKL: * GTF-GL46.gtf21.GL2FixedTests.buffer_corners.buffer_corners * GTF-GL46.gtf21.GL2FixedTests.stencil_plane_corners.stencil_plane_corners v2: Note the fixed tests (Andres). Don't cause failures with multisampled buffers (Andres). Don't hamper SKL GT4 (Ken). v3: Fix the Fixes tag (Dylan). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107359 Cc: <mesa-stable@lists.freedesktop.org> Tested-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-22 13:53:19 -07:00
Nanley Chery	b041fc0649	i965/miptree: Fix can_blit_slice() Check the destination's row pitch against the BLT engine's row pitch limitation as well. Fixes: `0288fe8d04` ("i965/miptree: Use the correct BLT pitch") v2: Fix the Fixes tag (Dylan). Check the destination row pitch (Chris). Reported-by: Dylan Baker <dylan@pnwbakers.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-22 13:53:02 -07:00
Nanley Chery	030b6efcfd	i965/miptree: Use miptree_map in map_blit functions This struct contains all the data of interest. can_blit_slice() will use it in the next patch to calculate the correct pitch. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-22 13:23:17 -07:00
Rafael Antognolli	f8cfc77660	intel/tools/aubwrite: Always use physical addresses for traces. It looks like we can't rely on the simulator to always translate virtual addresses to physical ones correctly. So let's use physical everywhere. Since our current GGTT maps virtual to physical addresses in a 1:1 way, no further changes are required. Additionally, we have other address spaces not in use right now. So let's make it easier to switch which one we are using by putting the default one into the aub_file struct. Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-22 12:52:41 -07:00
Rafael Antognolli	e82d8fa964	intel/tools/aubwrite: Rename "legacy" to "Trace Block". Hopefully it's a little more descriptive, and more accurate. Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-22 12:52:41 -07:00
Jason Ekstrand	68ae66542a	nir/vars_to_ssa: Don't build deref nodes for non-local variables Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-08-22 14:17:38 -05:00
Marek Olšák	e80e8d7adc	ac: fix WAITCNT flags for GFX9 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-22 14:34:43 -04:00
Kai Wasserbäch	c836a751bc	amd/addrlib: mark physicalSliceSize as MAYBE_UNUSED in Addr::V1::EgBasedLib::HwlGetSizeAdjustmentMicroTiled Only used, when asserts are enabled. Fixes an unused-but-set-variable warning with GCC 8: ../../../src/amd/addrlib/r800/egbaddrlib.cpp: In member function 'virtual long long unsigned int Addr::V1::EgBasedLib::HwlGetSizeAdjustmentMicroTiled(unsigned int, unsigned int, ADDR_SURFACE_FLAGS, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int) const': ../../../src/amd/addrlib/r800/egbaddrlib.cpp:4111:13: warning: variable 'physicalSliceSize' set but not used [-Wunused-but-set-variable] UINT_64 physicalSliceSize; ^~~~~~~~~~~~~~~~~ Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-22 14:33:21 -04:00
Kai Wasserbäch	2e0586e379	amd/addrlib: mark numPipes as MAYBE_UNUSED in Addr::V1::EgBasedLib::SanityCheckMacroTiled (v2) Only used, when asserts are enabled. Fixes an unused-variable warning with GCC 8: ../../../src/amd/addrlib/r800/egbaddrlib.cpp: In member function 'int Addr::V1::EgBasedLib::SanityCheckMacroTiled(ADDR_TILEINFO*) const': ../../../src/amd/addrlib/r800/egbaddrlib.cpp:982:13: warning: unused variable 'numPipes' [-Wunused-variable] UINT_32 numPipes = HwlGetPipes(pTileInfo); ^~~~~~~~ v2: Don't realign other variable definitions, to keep in line with file style (Marek) Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-22 14:33:21 -04:00
Kai Wasserbäch	6a7ef7c7dc	amd/addrlib: mark pEqToCheck as MAYBE_UNUSED in Addr::V2::Gfx9Lib::ComputeStereoInfo (v2) Only used, when asserts are enabled. Fixes an unused-variable warning with GCC 8: ../../../src/amd/addrlib/gfx9/gfx9addrlib.cpp: In member function 'ADDR_E_RETURNCODE Addr::V2::Gfx9Lib::ComputeStereoInfo(const ADDR2_COMPUTE_SURFACE_INFO_INPUT, ADDR2_COMPUTE_SURFACE_INFO_OUTPUT, unsigned int) const': ../../../src/amd/addrlib/gfx9/gfx9addrlib.cpp:3879:34: warning: unused variable 'pEqToCheck' [-Wunused-variable] const ADDR_EQUATION *pEqToCheck = &m_equationTable[eqIndex]; ^~~~~~~~~~ v2: Don't realign other variable definitions, to keep in line with file style (Marek) Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-22 14:33:21 -04:00
Kai Wasserbäch	556f89a715	amd/addrlib: mark microBlockDim as MAYBE_UNUSED in Addr::V2::Gfx9Lib::HwlComputeBlock256Equation Only used, when asserts are enabled. Fixes an unused-but-set-variable warning with GCC 8: ../../../src/amd/addrlib/gfx9/gfx9addrlib.cpp: In member function 'virtual ADDR_E_RETURNCODE Addr::V2::Gfx9Lib::HwlComputeBlock256Equation(AddrResourceType, AddrSwizzleMode, unsigned int, ADDR_EQUATION*) const': ../../../src/amd/addrlib/gfx9/gfx9addrlib.cpp:2473:15: warning: variable 'microBlockDim' set but not used [-Wunused-but-set-variable] Dim2d microBlockDim = Block256_2d[elementBytesLog2]; ^~~~~~~~~~~~~ Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-22 14:33:21 -04:00
Kai Wasserbäch	15ca5ce99a	amd/addrlib: mark returnCode as MAYBE_UNUSED in ElemGetExportNorm Only used, when asserts are enabled. Fixes an unused-but-set-variable warning with GCC 8: ../../../src/amd/addrlib/addrinterface.cpp: In function 'int ElemGetExportNorm(ADDR_HANDLE, const ELEM_GETEXPORTNORM_INPUT*)': ../../../src/amd/addrlib/addrinterface.cpp:835:23: warning: variable 'returnCode' set but not used [-Wunused-but-set-variable] ADDR_E_RETURNCODE returnCode = ADDR_OK; ^~~~~~~~~~ Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-22 14:33:21 -04:00
Lionel Landwerlin	8b0e48887f	intel: aubinator_viewer: add urb view This is available through a "Show URB" button on the 3DPRIMITIVE instructions. v2: Fix urb allocation end value in tooltip (Rafael) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-08-22 18:02:11 +01:00
Lionel Landwerlin	d1c4a62bf8	intel: aubinator_viewer: store urb state during decoding Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-08-22 18:02:11 +01:00
Lionel Landwerlin	38f10d5a03	intel: tools: add aubinator viewer A graphical user interface version of aubinator. Allows you to : - simultaneously look at multiple points in the aub file (using all the goodness of the existing decoding in aubinator) - edit an aub file v2: Switch from GLFW to GTK+3 v3: Fix warning when exiting Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1)	2018-08-22 18:02:11 +01:00
Lionel Landwerlin	ea83a1d304	intel: tools: import ImGui We want to add a new UI tool to decode aub files. This will use the Dear ImGui library to render its interface. The build of this UI toolkit is conditional on -Dwith_tools=intel-ui which supersedes -Dwith_tools=intel. The main way to use ImGui is to embed its source code at a particular revision. Most embedding projects have to do a bit of integration which is really specific to one's project. In our case the only modification is to include libepoxy. We also choose to use Gtk+3 for the window system integration. As opposed to the previous version of this patch using GLFW, Gtk+ is able to handle X11/Wayland session as well as proper DPI scaling on retina monitors. The import was done at this commit (https://github.com/ocornut/imgui) : commit 6211f40f3d903dd9df961256e044029c49793aa3 Author: omar <omarcornut@gmail.com> Date: Fri Jul 27 12:29:33 2018 +0200 Internals: Drag and Drop: default drop preview use a narrower clipping rectangle (no effect here, but other branches uses a narrow clipping rectangle that was too small so this is a fix for it) + Comments v2: Switch from GLFW to GTK+ (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-08-22 18:02:11 +01:00
Lionel Landwerlin	4ba12e8c54	intel: tools: aub_mem: reuse already mapped ppgtt buffers When we map a PPGTT buffer into a continuous address space of aubinator to be able to inspect it, we currently add it to the list of BOs to unmap once we're finished. An optimization we can apply is to look up that list before trying to remap PPGTT buffers again (we already do this for GGTT buffers). We need to take some care before doing this because the list also contains GGTT BOs. As GGTT & PPGTT are 2 different address spaces, we can have matching addresses in both that point to different physical locations. This change adds a flag on the elements of the list of mapped BOs to differentiate between GGTT & PPGTT, which allows us to reuse that list when looking up both address spaces. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-08-22 18:02:11 +01:00
Lionel Landwerlin	8fd78b4eea	intel: tools: aubmem: map gtt data to aub file This will allow the aubinator viewer tool to modify the aub data that was loaded at a particular gtt address. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-08-22 18:02:11 +01:00
Lionel Landwerlin	ebb145ee12	intel: tools: create libaub Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-22 18:02:11 +01:00
Lionel Landwerlin	475d670ef7	intel: tools: aubwrite: wrap function declarations for c++ Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-08-22 18:02:11 +01:00
Lionel Landwerlin	ed21007a6a	intel: tools: split memory management out of aubinator Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-08-22 18:02:11 +01:00
Lionel Landwerlin	14a1cb37eb	util: rb_tree: add safe iterators v2: Add helper to make iterators more readable (Rafael) Fix rev iterator bug (Rafael) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-08-22 17:49:36 +01:00
Lionel Landwerlin	4616639b49	intel: tools: split aub parsing from aubinator v2: add parsing error callback (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1)	2018-08-22 17:49:36 +01:00
Mathieu Bridon	e15686567c	meson: Run the test with Python 3 This is a patch from me and a patch from Mathieu Bridon squashed together. Signed-off-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Mathieu Bridon <bochecha@daitauha.fr>	2018-08-22 08:41:01 -07:00
Mathieu Bridon	ff0ce31e2a	python: Disable universal newlines We are testing the behaviour of a tool, for different input files, each one using a different newline sequence. ('\n' on UNIX, '\r\n' on Windows, …) Unfortunately, when opening a file in text mode, Python 3 will by default enable the "universal newlines" mode, which means it replaces all the known newline sequences by '\n'. This (usually useful) behaviour breaks the tests, which are specifically trying to handle files with newline sequences different from '\n'. Disabling the universal newlines mode fixes the tests. However, to keep the script compatible with both Python 2 and 3, we must use the io.open() function instead of the open() builtin, as the latter only knows about the `newline` argument on Python 3. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-22 08:41:01 -07:00
Mathieu Bridon	fc708069f7	python: difflib prefers unicode strings Python 3 does not automatically convert from bytes to unicode strings like Python 2 used to do. This commit makes sure we pass unicode strings to difflib.unified_diff, so that the script works on both Python 2 and 3. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-22 08:41:01 -07:00
Dylan Baker	477d4b9960	compiler/glsl/tests: Make tests python3 safe v2: - explicitly decode the output of subprocesses - handle bytes and string types consistently rather than relying on python 2's coercion for bytes and ignoring them in python 3 v3: - explicitly set encode as well as decode - python 2.7 and 3.x `bytes` instead of defining an alias Reviewed-by: Mathieu Bridon <bochecha@daitauha.fr>	2018-08-22 08:41:01 -07:00
Juan A. Suarez Romero	6ea5718318	travis: SWR requires LLVM 6.0 v2: update clarification why ubuntu-toolchain-r-test is required (Emil) Fixes: `0cef0cccf5` ("swr: bump minimum supported LLVM version to 6.0") Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-22 17:29:20 +02:00
Samuel Pitoiset	4c43ec461d	ac/nir: fix getting GLSL type of array of samplers for TG4 This fixes a crash in build_tex_intrinsic() when trying to launch the Basemark GPU benchmark on GFX8. It looks like there is still something wrong because some frames are black. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106980 CC: 18.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-22 15:23:11 +02:00
Samuel Pitoiset	24ee53231d	radv: remove dead variables after splitting per member structs Otherwise, nir_lower_clip_cull_distance_arrays might report wrong number of output clips/culls because it relies on shader output variables and some of them might be dead. This fixes a rendering issue with Dolphin and Super Mario Sunshine. Fixes: `b0c643d8f5` ("spirv: Use NIR per-member splitting") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107610 CC: 18.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-22 13:57:18 +02:00
Yunchao He	bea4d4c78c	anv: add VK_EXT_sampler_filter_minmax support This extension can be supported on SKL+. With this patch, all corresponding tests (6K+) in CTS can pass. No test fails. I verified CTS with the command below: deqp-vk --deqp-case=dEQP-VK.pipeline.sampler.view_type.reduce v2: 1) support all depth formats, not depth-only formats, 2) fix a wrong indentation (Jason). v3: fix a few nits (Lionel). v4: fix failures in CI: disable sampler reduction when sampler reduction mode is not specified via this extension (Lionel). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-22 11:56:19 +01:00
Samuel Pitoiset	0608349232	radv: use ac_build_imad() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-22 09:17:40 +02:00
Marek Olšák	d87fe1f0fd	ac,radeonsi: use ac_build_gather_values more Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-21 20:50:37 -04:00
Marek Olšák	60beac9efc	ac,radeonsi: use ac_build_fmad Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-21 20:50:37 -04:00
Marek Olšák	c401ead68a	radeonsi: use ac_build_imad Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-21 20:50:37 -04:00
Marek Olšák	659f2e0fcb	ac: add imad & fmad helpers Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-21 20:50:37 -04:00
Marek Olšák	2276f8f064	ac: add ac_build_s_barrier Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-21 20:50:37 -04:00
Marek Olšák	6224144b6d	radeonsi: print the shader stage name when printing LLVM IR Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-21 20:50:37 -04:00
Marek Olšák	5d20b9be90	radeonsi: use is_merged shader in si_prolog_get_rw_buffers needed to change the input type to si_shader_context Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-21 20:50:37 -04:00
Marek Olšák	a4a104fc81	ac: completely remove +auto-waitcnt-before-barrier it causes corruption on several different GPU generations. Cc: 18.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-21 20:50:37 -04:00
Anuj Phogat	2383ddace1	anv/icl: Allow headerless sampler messages for pre-emptable contexts It fixes simulator warnings in vulkancts tests complaining about missing support for headerless sampler messages for pre-emptable contexts. Bit 5 in SAMPLER MODE register is newly introduced for ICLLP. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-21 12:50:05 -07:00
Anuj Phogat	81b74b5d96	anv/icl: Disable binding table prefetching Gen 11 workarounds table #2056 WABTPPrefetchDisable suggests to disable prefetching of binding tables for ICLLP A0 and B0 steppings. We have a similar patch for i965 driver in Mesa commit `a5889d70`. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-21 12:50:05 -07:00
Anuj Phogat	482f328f3b	i965/icl: Allow headerless sampler messages for pre-emptable contexts It fixes simulator warnings in piglit tests complaining about missing support for headerless sampler messages for pre-emptable contexts. Bit 5 in SAMPLER MODE register is newly introduced for ICLLP. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-21 12:50:05 -07:00
Dave Airlie	32529e6084	r600/eg: rework atomic counter emission with flushes With the current code, we didn't do the space checks prior to atomic counter setup emission, but we also didn't add atomic counters to the space check so we could get a flush later as well. These flushes would be bad, and lead to problems with parallel tests. We have to ensure the atomic counter copy in, draw emits and counter copy out are kept in the same command submission unit. This reworks the code to drop some useless masks, make the counting separate to the emits, and make the space checker handle atomic counter space. [airlied: want this in 18.2] Fixes: `06993e4ee` (r600: add support for hw atomic counters. (v3))	2018-08-21 20:45:38 +01:00
Dave Airlie	41d58e2098	virgl: ARB_enhanced_layouts support We need to handle the gaps in the streamout bindings on the guest side and enable it if the host has the rest enabled. Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>	2018-08-22 05:05:21 +10:00
Chad Versace	aa79cc2bc8	i965: Implement EGL_KHR_mutable_render_buffer Testing: - Manually tested a low-latency handwriting demo that toggles EGL_RENDER_BUFFER. Toggling changed the display latency as expected. Used Android on Chrome OS, Kabylake GT2. - No change in dEQP-EGL.functional.* on Fedora 27, Wayland, Skylake GT2. Used deqp at tag android-p-preview-5. - No regressions in dEQP-EGL.functional.*, ran on Android on Chrome OS, Kabylake GT2. Some dEQP-EGL.functional.mutable_render_buffer.* tests change from NotSupported to Pass. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-08-21 09:56:20 -07:00
Chad Versace	ed7c694688	egl/android: Implement EGL_KHR_mutable_render_buffer Specifically, implement the extension DRI_MutableRenderBufferLoader. However, the loader enables EGL_KHR_mutable_render_buffer only if the DRI driver implements its half of the extension, DRI_MutableRenderBufferDriver. Testing: - No change in dEQP-EGL.functional.* on Fedora 27, Wayland, Skylake GT2. Used deqp at tag android-p-preview-5. - No change in dEQP-EGL.functional.*, ran on Android on Chrome OS, Kabylake GT2. - Manually inspected Android apps on same Chrome OS device. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-08-21 09:56:20 -07:00
Eric Engestrom	317c460a4d	util/xmlpool: make indentation coherent Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-21 17:36:13 +01:00
Eric Engestrom	2de9e841e7	egl: add helper to combine two u32 into one u64 Use a helper to avoid the common issues of upcasting after the right shift (losing the upper bits) and shifting signed values (sign gets shifted too). Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-21 15:50:02 +01:00
Eric Engestrom	1ca23420c1	docs: trivial s/>/&gt;/ html fix Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-21 15:41:41 +01:00
Eric Engestrom	6ff1c47996	autotools: don't ship the git_sha1.h generated in git in the tarballs This file is regenerated at build time anyway, so this would just get overwritten anyway. No reason to ship it in the tarball. Fixes: `44df06211c` "autotools: include git_sha1.h in dist tarball" Fixes: `471f708ed6` "git_sha1: simplify logic" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-21 15:30:56 +01:00
Eric Engestrom	81fe9bdf6d	intel/genxml: minor python style fix Suggested-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-21 15:30:55 +01:00
Jose Fonseca	9e5e3a8ead	appveyor: Set git core.autocrlf setting to true. The git core.autocrlf setting defaults to true (ie, all text files get checked out as CRLF on Windows), except on Appveyor where it's set to "input" (ie, all text files get checked out with the upstream repository's line endings, which for us typically means LF.) And this was masking on Appveyor a regression in gen_xmlpool.py processing t_options.h with CRLF line endings. This change sets core.autocrlf to true, which would have caught the issue immediately, as seen in https://ci.appveyor.com/project/jrfonseca/mesa/build/51 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-08-21 09:46:19 +01:00
Timothy Arceri	797cd198ae	mesa: move legacy hyperz option from dri config Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-21 09:19:02 +10:00
Timothy Arceri	02062ab1e1	mesa: remove unused dri config option disable_shader_bit_encoding This was added as a workaround for Heaven 3.0 but was later removed by `5ead448719` to allow Heaven 4.0 to work correctly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-21 09:19:02 +10:00
Timothy Arceri	c5f863f2fd	mesa: drop legacy no_rast dri option Add environment var overrides to legacy drivers instead. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-21 09:19:01 +10:00
Timothy Arceri	02e32c92a2	i965: remove unused no_rast bool Forcing software fallbacks for i965 hasn't been an option since `5e3c093ff8`. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-21 09:19:01 +10:00
Timothy Arceri	7867c1078a	i915: remove early_z dri option This driver is in maintenance mode so let's remove this hidden unsafe option. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-21 09:19:01 +10:00
Kevin Rogovin	7ec308d978	Add NV_fragment_shader_interlock support. The main purpose for having NV_fragment_shader_interlock extension is because that extension is also for GLES31 while the ARB extension is for GL only. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-08-20 13:32:43 -07:00
Juan A. Suarez Romero	44df06211c	autotools: include git_sha1.h in dist tarball This fixes `make distcheck`. Fixes: `471f708ed6` ("git_sha1: simplify logic") CC: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-20 18:43:50 +02:00
Juan A. Suarez Romero	0cef0cccf5	swr: bump minimum supported LLVM version to 6.0 RADV now requires LLVM 6.0 or greater, and thus we can't build dist tarball because swr requires LLVM 5.0. Let's bump required LLVM to 6.0 in swr too. v2: bump also in meson.build (Eric) Fixes: `fd1121e839` ("amd: remove support for LLVM 5.0") Cc: Tim Rowley <timothy.o.rowley@intel.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-08-20 16:13:37 +02:00
Danylo Piliaiev	25ec806eb2	i965: Advertise 8 bits subpixel precision for viewport bounds on gen6+ We use floating-points for viewport bounds so VIEWPORT_SUBPIXEL_BITS should reflect this. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105975 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-20 15:11:57 +01:00
Rob Clark	e11e9d6394	freedreno: fix context teardown race We could still have batches queued up to flush, so fd_context_destroy() (which will kill and sync on the flush_queue) before deleting buffers that might be referenced from fdN_gmem() from context of flush_queue. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-20 10:03:05 -04:00
Kai Wasserbäch	5fab32ddad	intel/decoder: mark total_length as MAYBE_UNUSED in gen_spec_load Only used, when asserts are enabled. Fixes an unused-variable warning with GCC 8: ../../../src/intel/common/gen_decoder.c: In function 'gen_spec_load': ../../../src/intel/common/gen_decoder.c:535:47: warning: variable 'total_length' set but not used [-Wunused-but-set-variable] uint32_t text_offset = 0, text_length = 0, total_length; ^~~~~~~~~~~~ Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-20 11:08:52 +01:00
Kai Wasserbäch	4228e052b3	intel/tools: initialise bo_addr to 0 in main Suppresses a maybe-uninitialized warning with GCC 8. Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-20 11:08:52 +01:00
Kai Wasserbäch	ccdefbb559	intel: aubinator: mark ftruncate_res as MAYBE_UNUSED in ensure_phys_mem Only used, when asserts are enabled. Fixes an unused-variable warning with GCC 8: ../../../src/intel/tools/aubinator.c: In function 'ensure_phys_mem': ../../../src/intel/tools/aubinator.c:209:11: warning: unused variable 'ftruncate_res' [-Wunused-variable] int ftruncate_res = ftruncate(mem_fd, mem_fd_len += 4096); ^~~~~~~~~~~~~ Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-20 11:08:52 +01:00
Kai Wasserbäch	64c2bca59f	intel/aubinator_error_decode: mark ret as MAYBE_UNUSED in main Only used, when asserts are enabled. Fixes an unused-but-set-variable warning with GCC 8: ../../../src/intel/tools/aubinator_error_decode.c: In function 'main': ../../../src/intel/tools/aubinator_error_decode.c:759:11: warning: variable 'ret' set but not used [-Wunused-but-set-variable] int ret; ^~~ Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-20 11:08:52 +01:00
Samuel Pitoiset	0aacb5eab6	radv: do not use CP predication for DCC decompressions This fixes a regression with some Unity demos. Not sure what the root cause of the problem is, especially because the driver doesn't perform any fast color clears. So, it shouldn't be needed to decompress DCC. RadeonSI says that the decompression is relatively cheap if the surface has been decompressed already. One possible improvement is to use two predicates, one for DCC and one for FCE that could be cleared when DCC, FMASK or CMASK are performed by the driver. That might skip some unnecessary decompression passes (not DCC though). Fixes: `ff7daadca1` ("radv: enable/disable predication for the DCC decompression pass") CC: 18.2 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107563 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-20 11:54:37 +02:00
Tapani Pälli	799b3d16d4	egl: implement EXT_surface_SMPTE2086_metadata and EXT_surface_CTA861_3_metadata Patch implements common bits for EXT_surface_SMPTE2086_metadata and EXT_surface_CTA861_3_metadata extensions by adding new required attributes and eglQuerySurface + eglSurfaceAttrib changes. Currently none of the drivers are utilizing this data but this patch is enabler in getting there. v2: don't enable extension globally, should be only enabled by EGL drivers that can transfer metadata to the window system (Jason) use EGLint instead of uint16_t (Eric) Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-20 09:44:53 +03:00
Timothy Arceri	5a0684d665	mesa: move legacy dri config option texture_units Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-20 13:53:59 +10:00
Timothy Arceri	8b4157d578	mesa: remove unused dri config option texture_heaps This seems to have only been used by DRI1 drivers which were removed with `e4344161bd`. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-20 13:53:59 +10:00
Timothy Arceri	fb277f504e	mesa: move legacy dri config option texture_blend_quality Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-20 13:53:59 +10:00
Timothy Arceri	c470db706a	util: remove unused S3TC translation for dri config Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-20 13:53:59 +10:00
Timothy Arceri	7d2474afb5	mesa: remove dri configs unused software-fallback options These seem to have only been used by DRI1 drivers which were removed with `e4344161bd`. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-20 13:53:58 +10:00
Timothy Arceri	24da2d162d	mesa: remove unused dri config option excess_mipmap This seems to have only been used by DRI1 drivers which were removed with `e4344161bd`. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-20 13:53:58 +10:00
Timothy Arceri	498831c7e6	mesa: remove unused dri config option performance_boxes This seems to have only been used by DRI1 drivers which were removed with `e4344161bd`. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-20 13:53:58 +10:00
Timothy Arceri	4a91d4ef0f	docs: update the default mesa shader cache dir We renamed the dir in commit `28b326238b`, this just updates the website to reflect the change.	2018-08-20 08:08:58 +10:00
Kai Wasserbäch	2c020dbf06	vulkan/wsi: initialise image_index to 0 in x11_manage_fifo_queues Suppresses a maybe-uninitialized warning with GCC 8. Note: image_index should always be initialised due to the result check, but the compiler doesn't see that. Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-18 10:34:19 +10:00
Kai Wasserbäch	6f0647c0b2	nir: mark prev_block as MAYBE_UNUSED in opt_peel_loop_initial_if Only used when asserts are enabled. Fixes an unused-variable warning with gcc-8: ../../../src/compiler/nir/nir_opt_if.c: In function 'opt_peel_loop_initial_if': ../../../src/compiler/nir/nir_opt_if.c:109:15: warning: unused variable 'prev_block' [-Wunused-variable] nir_block *prev_block = ^~~~~~~~~~ Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-18 10:34:15 +10:00
Kai Wasserbäch	9387ca29ae	util: mark s as MAYBE_UNUSED in _mesa_half_to_unorm8 Only used, when asserts are enabled. Fixes an unused-variable warning with gcc-8: ../../../src/util/half_float.c: In function '_mesa_half_to_unorm8': ../../../src/util/half_float.c:189:14: warning: unused variable 's' [-Wunused-variable] const int s = (val >> 15) & 0x1; ^ Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-18 10:34:12 +10:00
Timothy Arceri	0da93de9c8	util: add drirc workarounds for RAGE This allows the game to run on wine (tested on radeonsi where we have compat profile support).	2018-08-18 09:26:51 +10:00
Timothy Arceri	3f9d8e9c88	util: better handle program names from wine For some reason wine will sometimes give us a windows style path for an application. For example when running the 64bit version of Rage wine gives a Unix style path, but when running the 32bit version it gives a windows style path. If we detect no '/' in the path at all it should be safe to assume we have a wine application and instead look for a '\'. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-18 09:20:39 +10:00
Timothy Arceri	d0803dea11	nir: allow more nested loops to be unrolled The innermost check was added to stop us from unrolling multiple loops in a single pass, and to stop outer loops from unrolling. When we successfully unroll a loop we need to run the analysis pass again before deciding if we want to go ahead and unroll a second loop. However the logic was flawed because it never tried to unroll any nested loops other than the first innermost loop it found. If this innermost loop is not unrolled we end up skipping all other nested loops. This unrolls a loop in a Deus Ex: MD shader on ultra settings and also unrolls a loop in a shader from the game Prey when running on DXVK. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-18 09:03:13 +10:00
Ray Strode	9baff597ce	gallium/winsys/kms: don't unmap what wasn't mapped At the moment, depending on pipe transfer flags, the dumb buffer map address can end up at either kms_sw_dt->ro_mapped or kms_sw_dt->mapped. When it's time to unmap the dumb buffer, both locations get unmapped, even though one is probably initialized to 0. That leads to the code segment getting unmapped at runtime and crashes when trying to call into unrelated code. This commit addresses the problem by using MAP_FAILED instead of NULL for ro_mapped and mapped when the dumb buffer is unmapped, and only unmapping mapped addresses at unmap time. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107098 Signed-off-by: Ray Strode <rstrode@redhat.com> Fixes: `d891f28df9` ("gallium/winsys/kms: Fix possible leak in map/unmap.") Cc: Lepton Wu <lepton@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-17 17:16:32 +01:00
Qiang Yu	0aa80abf25	loader: add dri_driver option to override dri driver to load drirc implementation of MESA_LOADER_DRIVER_OVERRIDE which can be used to override dri driver to load. Usage: override dri driver for device with spec kernel driver name: <device kernel_driver="kernel_driver_name"> <option name="dri_driver" value="new_dri_driver" /> </device> or <device driver="loader" kernel_driver="kernel_driver_name"> <option name="dri_driver" value="new_dri_driver" /> </device> v2: add kernel_driver device attribute to specify kernel driver name instead of reuse driver attribute v3: separate loader_get_kernel_driver_name into another patch separate add kernel_driver attribute into another patch Suggested-by: Michel Dänzer <michel@daenzer.net> Signed-off-by: Qiang Yu <Qiang.Yu@amd.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> [v4 Emil: add HAVE_LIBDRM guard around __driConfigOptionsLoader and loader_get_dri_config_driver] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-17 17:16:32 +01:00
Qiang Yu	3bbe180b98	xmlconfig: add kernel_driver device attribute This attribute can be used by loader to apply different option to device use specific kernel driver. Signed-off-by: Qiang Yu <Qiang.Yu@amd.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-17 17:16:32 +01:00
Qiang Yu	e8b91e99e9	loader: abstract loader_get_kernel_driver_name for reuse This function can be shared by the following kernel_driver drirc patch. Signed-off-by: Qiang Yu <Qiang.Yu@amd.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-17 17:16:32 +01:00
Qiang Yu	30b10dbb7c	driconf: move ${sysconfdir}/drirc to ${datadir}/drirc.d/00-mesa-defaults.conf ${sysconfdir} is for store admin config files, so move this mesa default config file to ${datadir}/drirc.d. Signed-off-by: Qiang Yu <Qiang.Yu@amd.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-17 17:16:32 +01:00
Qiang Yu	04bdbbcab3	xmlconfig: read more config files from drirc.d/ Driver and application can put their drirc files in ${datadir}/drirc.d/ with name xxx.conf. Config files will be read and applied in file name alphabetic order. So there are three places for drirc listed in order: 1. /usr/share/drirc.d/ 2. /etc/drirc 3. ~/.drirc v4: fix meson build v3: 1. separate driParseConfigFiles refine into another patch 2. fix entries[i] mem leak v2: drop /etc/drirc.d Signed-off-by: Qiang Yu <Qiang.Yu@amd.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-17 17:16:32 +01:00
Emil Velikov	0da417129e	xmlconfig: refine driParseConfigFiles to use parseOneConfigFile Also prepare for the usage of following parseConfigDir patch. Signed-off-by: Qiang Yu <Qiang.Yu@amd.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> [Emil: add #include <limits.h>] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-17 17:16:32 +01:00
Jason Ekstrand	d9ea015ced	anv/pipeline: Lower pipeline layouts etc. after linking This allows us to use the link-optimized shader for determining binding table layouts and, more importantly, URB layouts. For apps running on DXVK, this is extremely important as DXVK likes to declare max-size inputs and outputs and this lets us massively shrink our URB space requirements. VkPipeline-db results (Batman pipelines only) on KBL: total instructions in shared programs: 820403 -> 790008 (-3.70%) instructions in affected programs: 273759 -> 243364 (-11.10%) helped: 622 HURT: 42 total spills in shared programs: 8449 -> 5212 (-38.31%) spills in affected programs: 3427 -> 190 (-94.46%) helped: 607 HURT: 2 total fills in shared programs: 11638 -> 6067 (-47.87%) fills in affected programs: 5879 -> 308 (-94.76%) helped: 606 HURT: 3 Looking at shaders by hand, it makes the URB between TCS and TES go from containing 32 per-vertex varyings per tessellation shader pair to a more reasonable 8-12. For a 3-vertex patch, that's at least half the URB space no matter how big the patch section is. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-17 10:50:28 -05:00
Jason Ekstrand	f210a5f4bb	anv/pipeline: Set tess IO read/written key fields in compile_* We want these to be set as close to the final compile as possible so that they are guaranteed to happen after nir_shader_gather_info is called. The next commit is going to move nir_shader_gather_info to after the linking step which makes this necessary. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-17 10:50:28 -05:00
Jason Ekstrand	2e4094cd8f	anv/pipeline: Use more fields from stage in compile_cs Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-17 10:50:28 -05:00
Jason Ekstrand	4af1a8c9e4	anv/apply_pipeline_layout: Add to the bind map instead of replacing it This commit makes three changes. One is to only walk the descriptors once and set bind map sizes at the same time as filling out the entries. The second is to make the pass additive so that we can put stuff in the bind map before applying the pipeline layout. Third, we switch to using designated initializers. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-17 10:50:28 -05:00
Jason Ekstrand	320dacb0a0	anv/lower_ycbcr: Use the binding array size for bounds checks Because lower_ycbcr gets called before apply_pipeline_layout, the indices are all logical and the binding layout HW size is actually too big for the bounds check. We should just use the regular logical array size instead. Fixes: `f3e91e78a3` "anv: add nir lowering pass for ycbcr textures" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-17 10:50:28 -05:00
Mathieu Bridon	459ec5265c	python: Open the template as text, with an explicit encoding In commit `bd27203f4d` we changed this to open in binary mode, to then explicitly decode the lines with the right encoding. Unfortunately, that broke the build on Windows, where the template file can have '\r\n' as line terminators: opening in binary mode would keep those terminators and break the regexp. We need to go back to text mode, where the "universal newlines" mode takes care of this. However, to fix the initial issue, let's specify the encoding explicitly when opening the file, and make sure it is open in text mode, so we only get unicode strings. Reviewed-by: Jose Fonseca <jfonseca@vmware>	2018-08-17 09:34:49 -06:00
Mathieu Bridon	f9415d760a	python: Help Python 2 print the line Reviewed-by: Jose Fonseca <jfonseca@vmware>	2018-08-17 09:33:16 -06:00
Rob Clark	a8ef7f5e02	freedreno/a6xx: streamout Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-17 11:04:21 -04:00
Rob Clark	7fa2a8c3c4	freedreno/a6xx: fragz fixes Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-17 11:04:21 -04:00
Rob Clark	7c73d41160	freedreno/a6xx: scissor fixes Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-17 11:04:21 -04:00
Rob Clark	b7f18e49b7	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-17 11:04:21 -04:00
Rob Clark	a4754c245b	freedreno/a6xx: fix srgb Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-17 11:04:21 -04:00
Rob Clark	2658f63701	freedreno: fix dEQP-GLES3.functional.fence_sync.* Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-17 11:04:21 -04:00
Samuel Pitoiset	d27e1584ce	radv/winsys: fix creating the BO list for virtual buffers When the number of unique BO is 0, we optimize the list creation by copying all buffers of the current CS directly into it. But this is only valid if the CS doesn't have virtual buffers, otherwise they are not added and hw might report VM faults. This fixes VM faults with: dEQP-VK.sparse_resources.image_sparse_binding.2d.rgba8ui.1024_128_1 CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-17 15:00:21 +02:00
Kristian H. Kristensen	de3b34df97	freedreno: Add a6xx backend This adds a freedreno backend for the a6xx generation GPUs, which at the time of this commit is about 98% GLES2 conformant. Much remains to be done - both performance work and feature work towards more recent GLES versions, but this is a good start. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-16 19:13:36 -04:00
Rob Clark	6ee58e8257	freedreno: update generated headers pull in a6xx registers Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-16 19:11:08 -04:00
Kristian H. Kristensen	e89683d5a2	freedreno: Fix warnings Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-16 19:11:08 -04:00
Dylan Baker	c782168751	scons: Check for mako 0.8.0 v2: - Use distutils to do the version checking Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107565 Acked-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-08-16 13:53:10 -07:00
Dylan Baker	64e4638130	scons: Require python 2.7 less than 2.7 is not supported. v2: - Remove check for python >= 2.0, since we've already enforced 2.7 Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-08-16 13:52:56 -07:00
Dylan Baker	5a8f824d8c	meson: use python3 module to find python3 This handy helper is nice for OSes that are not linux or BSD like (mac and windows) as it knows how to find python3 in odd places. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-08-16 13:51:44 -07:00
Dylan Baker	52194ae4df	meson: Ensure that mako is >= 0.8.0 It's what autotools has required for a long time. v3: - Use distutils.version.StrictVersion instead of comparing strings Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-08-16 13:50:51 -07:00
Eric Engestrom	03ec672213	svga: simplify Mesa version string Suggested-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-16 17:38:31 +01:00
Eric Engestrom	bc8abc1adf	bin: always define MESA_GIT_SHA1 to make it directly usable in code Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-16 17:38:31 +01:00
Eric Engestrom	471f708ed6	git_sha1: simplify logic Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-16 17:38:31 +01:00
Eric Engestrom	9a6a631762	i965: drop unused assignment Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-16 17:38:31 +01:00
Eric Engestrom	7a1f4340b6	anv: drop cast-to-void of used variable `device` is used 2 lines below, even visible in the diff context printed. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-16 17:38:31 +01:00
Eric Engestrom	6cf0d4f91f	anv: use safer snprintf() to ensure NULL string-terminator Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-16 17:38:31 +01:00
Eric Engestrom	d6aea40326	intel/batch-decoder: replace local ARRAY_LENGTH() macro with global ARRAY_SIZE() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-16 17:38:31 +01:00
Eric Engestrom	81c1989e4f	intel: various python cleanups Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-16 17:38:25 +01:00
Eric Engestrom	aa78b29eba	egl: check for buffer overflow before corrupting our memory Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-16 17:38:22 +01:00
Eric Engestrom	eb6b41749b	egl/wayland: remove sign from bitfield `formats` Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-16 17:38:18 +01:00
Eric Engestrom	c5d9b48a71	mailmap: add various typos of Emil's address from the log Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-16 17:38:04 +01:00
Eric Engestrom	882ed53946	egl: some spelling fixes Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-16 14:15:18 +01:00
Samuel Pitoiset	f9e8456c39	radv: initialize the DCC predicate correctly when it's compressed We have to do a fast-clear eliminate when clearing DCC metadata with 0x20202020. I don't know if that fixes anything but that seems correct to me. CC: 18.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-16 14:11:51 +02:00
Samuel Pitoiset	f3a78a9da0	radv: fix missing initialization of the conditional rendering state This was missing when VK_EXT_conditional_rendering has been implemented. The predication type should be -1 to avoid restoring previous state when performing a decompression pass with DCC enabled. Note that we don't have to handle secondary command buffers because we don't support this feature currently. CC: 18.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-16 14:11:48 +02:00
Eric Engestrom	c5dd02287f	bin: split `write_if_different()` out Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-16 12:33:35 +01:00
Eric Engestrom	c2e00f9eee	bin: whitespace cleanup Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-16 12:30:30 +01:00
Bas Nieuwenhuizen	011a811652	radv: Revert divisor = 0 case for vertex attribute extension. Seems like DXVK depends on that and it might get reverted upstream. Since apps are not supposed to use 0 in v2 anyway, we should be safe implementing the old behavior there. Fixes: `66e12451ac` "radv: Update to new VK_EXT_vertex_attribute_divisor to version 2." CC: 18.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-16 11:13:19 +02:00
Bas Nieuwenhuizen	3308db2dd7	radv: Possible on-demand compilation fix. Seems that in a single case we use the renderpass before checking the pipeline, so check the renderpass before we use it. Fixes: `fbcd167314` "radv: Add on-demand compilation of built-in shaders." Tested-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-16 11:13:19 +02:00
Gert Wollny	1560c58b12	mesa/st: fix array indices off-by-one error in remapping When moving the array sizes from the old list to the new one it was not taken into account that the array indices start with one, but the array_size array started at index zero, which resulted in incorrect array sizes when arrays were merged. Correct this by copying the array_size values of the retained arrays with an offset of -1. Also fix whitespaces for the replaced lines. Fixes: `d8c2119f9b` mesa/st/glsl_to_tgsi: Expose array live range tracking and merging Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-16 08:52:26 +02:00
Alexander Tsoy	9a96bf0ecd	meson: fix build for egl platform_x11 without dri3 and gbm Compiling EGL's platform_x11 without dri3 and gbm yields this compile failure: platform_x11 needs inc_loader: ../mesa-18.2.0-rc2/src/egl/drivers/dri2/platform_x11.c:48:10: fatal error: loader.h: No such file or directory #include "loader.h" ^~~~~~~~~~ Fixes: `108d257a16` ("meson: build libEGL") Bugzilla: https://bugs.gentoo.org/663534 Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-08-15 16:37:16 -07:00
Jason Ekstrand	10f44da775	Revert "intel/nir: Call nir_lower_io_to_scalar_early" Commit `4434591bf5` caused substantially more URB messages in geometry and tessellation shaders. Before we can really enable this sort of optimization, We either need some way of combining them back together into vectors or we need to do cross-stage vector element elimination without splitting everything into scalars. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107510 Fixes: `4434591bf5` "intel/nir: Call nir_lower_io_to_scalar_early" Acked-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Mark Janes <mark.a.janes@intel.com>	2018-08-15 17:56:50 -05:00
Erik Faye-Lund	da1f7c56da	i965: do not emit empty surface state If called with an empty size, brw_emit_buffer_surface_state asserts. We already have a dedicated helper for uploading nothing, so let's use that instead. Avoids an assert in dEQP-GLES31.functional.shaders.opaque_type_indexing.ssbo.const_literal_vertex when running a debug build of i965. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-15 23:23:16 +01:00
Sergii Romantsov	743dff1cca	intel/ppgtt: 4096 replaced by PAGE_SIZE Usage of number 4096 replaced by PAGE_SIZE. Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-15 23:23:16 +01:00
Sergii Romantsov	24839663a4	intel/ppgtt: memory address alignment Kernel (for ppgtt) requires memory address to be aligned to page size (4096). -v2: added marking that also fixes initial commit `01058a5522`. -v3: numbers replaced by PAGE_SIZE; buffer-object size is aligned instead of alignment of offsets (Chris Wilson). -v4: changes related to PAGE_SIZE moved to separate commit -v5: restored alignment to page-size for 0-size. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106997 Fixes: `a363bb2cd0` (i965: Allocate VMA in userspace for full-PPGTT systems.) Fixes: `01058a5522` (i965: Add virtual memory allocator infrastructure to brw_bufmgr.) Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-15 23:23:16 +01:00
Timothy Arceri	f0a8accb0d	radv: add Doom workaround Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-16 07:53:38 +10:00
Sergii Romantsov	efb28aa970	i965: Emitting 3DSTATE_SO_BUFFER of 0-size. Avoided filling of whole structure and bo-allocation if size of surface is 0. Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>	2018-08-15 13:15:28 -07:00
Erik Faye-Lund	98b3b6367a	virgl: report actual max-texture sizes Instead of doing conservative guesses, we should report the max levels based on the max sizes we get from GL on the host. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>	2018-08-15 18:48:16 +02:00
Erik Faye-Lund	825aaeae39	virgl: do not use SP_MAX_TEXTURE_*_LEVELS defines These macro-names are also used for softpipe, so let's avoid confusion by avoiding them. Besides, they are just used in one place in virgl, so let's just inline them into the place they are used instead. While we're at it, fixup an error in the comment for the 3D version. Mesa computes max-size as 2^(n-1), which means this should be 256 cubed, not 512 cubed. The other comments are correct. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Jakob Bornecrantz <jakob@collabora.com>	2018-08-15 18:48:08 +02:00
Dylan Baker	ef7ae84daf	docs: Add news item for 18.1.6	2018-08-15 09:09:59 -07:00
Samuel Pitoiset	71d5b2fbf8	radv: disable the auto-waitcnt-before-barrier LLVM option This option allows us to remove additional s_waitcnt instructions because s_barrier internally does s_waitcnt 0. Though, apparently there is a problem with LDS accesses that causes rendering issues with FFXV and DXVK. Disable this optimization for now (RadeonSI still uses it). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107460 CC: 18.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-15 16:21:50 +02:00
Samuel Pitoiset	85113c4d05	radv: fix memory leaks in radv_load_meta_pipeline() Reported by Coverity. Fixes: `fbcd167314` ("radv: Add on-demand compilation of built-in shaders.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-15 16:20:58 +02:00
Samuel Pitoiset	17e79865cf	radv: drop wrong initialization of COMPUTE_RESOURCE_LIMITS The last parameter of radeon_set_sh_reg_seq() is the number of dwords to emit. We were lucky because WAVES_PER_SH(0x3) is 3 but it was initialized to 0. COMPUTE_RESOURCE_LIMITS is correctly set when generating compute pipelines, so we don't need to initialize it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-15 16:20:38 +02:00
Andres Gomez	53b4701cb0	docs: update calendar 18.2.0-rc3 is out Signed-off-by: Andres Gomez <agomez@igalia.com>	2018-08-15 15:48:18 +03:00
Mauro Rossi	43318d5857	radv/meta_decompress: fix pointer to integer conversion VK_NULL_HANDLE replaces NULL to avoid the following build error: external/mesa/src/amd/vulkan/radv_meta_decompress.c:365:54: error: incompatible pointer to integer conversion passing 'void *' to parameter of type 'VkShaderModule' (aka 'unsigned long long') [-Werror,-Wint-conversion] VkResult ret = create_pipeline(cmd_buffer->device, NULL, samples, ^~~~ prebuilts/clang/host/linux-x86/clang-4053586/lib64/clang/5.0.300080/include/stddef.h:105:16: note: expanded from macro 'NULL' # define NULL ((void*)0) ^~~~~~~~~~ external/mesa/src/amd/vulkan/radv_meta_decompress.c:97:32: note: passing argument to parameter 'vs_module_h' here VkShaderModule vs_module_h, ^ 1 error generated. Fixes: `fbcd167314` ("radv: Add on-demand compilation of built-in shaders.") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-08-15 14:34:50 +02:00
Mauro Rossi	73b342c7a5	egl/android: fix regression in drm_gralloc path (v2) This patch fixes a regression in mesa 18.2 and mesa-dev branches for HAVE_DRM_GRALLOC code path which is causing black screen on Android and prevents boot due to SIGSEGV MAPERR crash related to improper handling of drm_gralloc drm FD in new droid_open_device() path. Problem is due to `c7bb82136b` ("egl/android: Add DRM node probing and filtering") To avoid the crash the former existing working droid_open_device() is restored, renamed droid_open_device_drm_gralloc() and kept within HAVE_DRM_GRALLOC braces. Tested with mesa-dev and mesa 18.2 branch and oreo-x86 bootanimation and Android GUI booting is fixed with i965, nouveau, radeon. The changes are compatible with gbm_gralloc, I've tested build with hwc too. (v2) remove indentation from HAVE_DRM_GRALLOC pre-processor directive NOTE: Definition of enum{} for GRALLOC_MODULE_PERFORM_GET_DRM_FD is not necessary and it's actually causing a redefinition building error, because in HAVE_DRM_GRALLOC path gralloc_drm.h is already exported by libgralloc_drm which is currently still a dependency. Fixes: `c7bb82136b` ("egl/android: Add DRM node probing and filtering") Cc: "18.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>	2018-08-15 14:07:49 +02:00
Tapani Pälli	656ccf4ef8	mesa: shader dump/read support for ARB programs Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106283 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-08-15 11:03:35 +03:00
Danylo Piliaiev	479a849ad6	glsl: Avoid calling get_array_element for scalar constants Accessing scalar constant as an array in function call or initializer list triggered assert in get_array_element. Examples: func(0[0]); vec2 t = { 0[0], 0 }; Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107550 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-08-15 10:01:43 +03:00
Marek Olšák	bffa025ada	radeonsi: enable 1 missing PS_SU perf counter on Polaris	2018-08-14 21:20:31 -04:00
Marek Olšák	df50099834	radeonsi: use radeon_info::name Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-14 21:20:31 -04:00
Marek Olšák	84652721b9	ac: add radeon_info::name Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-14 21:20:31 -04:00
Marek Olšák	de8d5edbc4	radeonsi: split si_clear_buffer to remove enum si_method Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:21:12 -04:00
Marek Olšák	4de92f2abb	radeonsi: replace CP_DMA_USE_L2 with enum si_cache_policy Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:21:10 -04:00
Marek Olšák	bc132d62f9	radeonsi: declare coher in si_copy_buffer Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:21:09 -04:00
Marek Olšák	cddd7ce325	radeonsi: make PFP_SYNC_ME an explicit CP DMA flag Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:21:07 -04:00
Marek Olšák	277295962c	radeonsi: don't use emit_data->args in load_emit Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:21:06 -04:00
Marek Olšák	8fb34050b5	radeonsi: don't use emit_data->args in store_emit Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:21:04 -04:00
Marek Olšák	a2c18bfbe3	radeonsi: don't use emit_data->args in atomic_emit Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:21:03 -04:00
Marek Olšák	297fb213b3	radeonsi: don't use emit_data->args in build_interp_intrinsic Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:21:01 -04:00
Marek Olšák	99ae440d4e	radeonsi: inline atomic_fetch_args Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:20:59 -04:00
Marek Olšák	267e92893c	radeonsi: inline store_fetch_args Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:20:58 -04:00
Marek Olšák	f15e55aa8a	radeonsi: inline load_fetch_args Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:20:56 -04:00
Marek Olšák	2c94f321eb	radeonsi: merge txq_emit and resq_emit Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:20:55 -04:00
Marek Olšák	a14c803166	radeonsi: inline resq_fetch_args Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:20:54 -04:00
Marek Olšák	347e52adcd	radeonsi: inline txq_fetch_args Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:20:52 -04:00
Marek Olšák	c9b2ce2672	radeonsi: use get_resinfo directly in lower_gather4_integer Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:20:36 -04:00
Marek Olšák	7804ddaf87	radeonsi: inline tex_fetch_args into build_tex_intrinsic The diff looks like it moves code that I didn't touch. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:20:34 -04:00
Marek Olšák	da1d8adc29	radeonsi: remove fetch_args callbacks for ALU instructions Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:20:33 -04:00
Marek Olšák	ac72a6bd0b	radeonsi: move internal TGSI shaders into si_shaderlib_tgsi.c Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:20:31 -04:00
Marek Olšák	0ca8294ece	radeonsi: implement EXT_window_rectangles Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:19:02 -04:00
Marek Olšák	465e929d6a	gallium/u_blitter: save/restore window rectangles Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:19:01 -04:00
Marek Olšák	15fc0f8d4a	noop: implement set_window_rectangles Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:18:59 -04:00
Marek Olšák	7c8716e4fb	ddebug: implement set_window_rectangles Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:18:51 -04:00
Rodrigo Vivi	44f1dcf9b3	i965: Add a new CFL PCI ID. One more CFL ID added to spec. Align with kernel commit d0e062ebb3a4 ("drm/i915/cfl: Add a new CFL PCI ID.") Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-14 15:46:56 -07:00
Rob Clark	70bf639328	freedreno/ir3: add support for a6xx 'merged' register set Starting with a6xx, half and full precision registers conflict. Which makes things a bit more efficient, ie. if some parts of the shader are heavy on half-precision and others on full precision, you don't have to allocate the worst case for both. But it means we need to setup some additional conflicts. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-14 17:59:02 -04:00
Rob Clark	4813060ed4	freedreno/ir3: small RA cleanup Collapse is_temp() into its only callsite, and pass compiler object as struct rather than void. Just cleanups to reduce noise in next patch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-14 17:59:02 -04:00
Rob Clark	fdd35f497b	freedreno/ir3: stop hard-coding FS input regs We originally did this because at the time we didn't know all the bitfields to configure where various frag shader sysvals went. But we do now. So switch to using sysvals for all the frag shader inputs. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-14 17:59:02 -04:00
Rob Clark	e97b56172c	freedreno/ir3: use r63.x for unused inputs This way, unused sysval inputs, like frag_vcoord, get the correct regid value to disable the input. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-14 17:59:02 -04:00
Rob Clark	066930e54d	freedreno/ir3: create all inputs in first block create_input()/create_input_compmask() should take the ctx as arg, rather than block, to enforce that all inputs are created in the first block, so that RA sees them as live at the start of the shader. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-14 17:59:02 -04:00
Rob Clark	62da068fd3	freedreno/ir3: rename s/frag_pos/frag_vcoord/g Make it more clear that this is varying fetch related. Also fixup some comments. Just cleanup for next patches. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-14 17:59:02 -04:00
Rob Clark	4a7f9feada	compiler: add SYSTEM_VALUE_VARYING_COORD Used internally in freedreno/ir3 for the vec2 value that hw passes to shader to use as coordinate for bary.f (varying fetch) instruction. This is not the same as SYSTEM_VALUE_FRAG_COORD. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-14 17:59:02 -04:00
Rob Clark	b5a098b202	freedreno/ir3: move per-generation compiler config Move it from the compile ctx to the compiler object, before adding new things for a6xx. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-14 17:59:02 -04:00
Bas Nieuwenhuizen	66e12451ac	radv: Update to new VK_EXT_vertex_attribute_divisor to version 2. Behavior wrt firstInstance got changed, and a divisor of 0 has been disallowed. The new version of the ext got published in specification 1.1.81. Sending to stable since the only known user is DXVK, which needs this for correctness. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> CC: 18.2 <mesa-stable@lists.freedesktop.org>	2018-08-14 22:13:09 +02:00
Bas Nieuwenhuizen	4bb6c49375	radv: Allow ETC2 on RAVEN and VEGA10 instead of all GFX9. Follow radeonsi. Fixes: `3665f66ef2` "radv: Add support for ETC2 textures." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 22:11:04 +02:00
Bas Nieuwenhuizen	bf33ca7512	radv: Fix missing Android platform define. CC: <mesa-stable@lists.freedesktop.org> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 22:11:04 +02:00
Rob Clark	13b9d32fb1	freedreno: move free() into fdN_context_destroy() Following patches will be doing further cleanup after calling fd_context_destroy() so it is easier if we move the free() into the per-gen backend code. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-14 15:46:34 -04:00
Jonathan Marek	dc9705f30d	freedreno: a2xx: ir2 update this patch brings a number of changes to ir2: -ir2 now generates CF clauses as necessary during assembly. this simplifies fd2_program/fd2_compiler and is necessary to implement optimization passes -ir2 now has separate vector/scalar instructions. this will make it easier to implement scheduling of scalar+vector instructions together. dst_reg is also now separate from src registers instead of a single list -ir2 now implements register allocation. this makes it possible to compile shaders which have more than 64 TGSI registers -ir2 now implements the following optimizations: removal of IN/OUT MOV instructions generated by TGSI and removal of unused instructions when some exports are disabled -ir2 now allows full 8-bit index for constants -ir2_alloc no longer allocates 4 times too many bytes Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-08-14 12:46:25 -04:00
Andres Gomez	5406eb5513	docs: update calendar 18.2.0-rc1 and 18.2.0-rc2 are out Signed-off-by: Andres Gomez <agomez@igalia.com>	2018-08-14 17:07:09 +03:00
Bas Nieuwenhuizen	fbcd167314	radv: Add on-demand compilation of built-in shaders. In environments where we cannot cache, e.g. Android (no homedir), ChromeOS (readonly rootfs) or sandboxes (cannot open cache), the startup cost of creating a device in radv is rather high, due to compiling all possible built-in pipelines up front. This meant depending on the CPU a 1-4 sec cost of creating a Device. For CTS this cost is unacceptable, and likely for starting random apps too. So if there is no cache, with this patch radv will compile shaders on demand. Once there is a cache from the first run, even if incomplete, the driver knows that it can likely write the cache and precompiles everything. Note that I did not switch the buffer and itob/btoi compute pipelines to on-demand, since you cannot really do anything in Vulkan without them and there are only a few. This reduces the CTS runtime for the no caches scenario on my threadripper from 32 minutes to 8 minutes. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-14 10:26:24 +02:00
Bas Nieuwenhuizen	24a9033d6f	radv: Refactor blit pipeline creation. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-14 10:26:11 +02:00
Bas Nieuwenhuizen	806a792b43	radv: Make fs key exemplars ordered to be a reverse fs_key lookup. While at it, share the exemplars and account for a non-occurring fs key. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-14 10:26:06 +02:00
Dave Airlie	0be5e9f5a1	virgl: ARB_texture_barrier support Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>	2018-08-14 16:55:56 +10:00
Dylan Baker	6d61aed231	docs: update calendar, add news item and link release notes for 18.1.6 Signed-off-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-13 10:06:45 -07:00
Dylan Baker	973ae7a06b	docs: Add sha256 sums for 18.1.6	2018-08-13 10:05:44 -07:00
Dylan Baker	66c8a64e67	docs: Add release notes for 18.1.6	2018-08-13 10:05:42 -07:00
Alejandro Piñeiro	668ab8aeb1	mesa/glspirv: fix compilation with MSVC From AppVeyor #8582, it seems that MSVC doesn't like uint, so this patch replaces it with unsigned. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-08-13 18:57:18 +02:00
Eric Engestrom	f976d22759	travis: install correct version of mako for each build system Meson now uses python3, so let's add a block for Autotools, move that line into the buildsys-specific blocks, and set the correct version for Meson. Fixes: `2ee1c86d71` "meson: Build with Python 3" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-13 17:29:42 +01:00
Erik Faye-Lund	ae5770171c	mesa/st/glsl_to_tgsi: fixup copy-paste mistake This is clearly a copy-paste error; if we validate the reladdr2-pointer, we don't want to traverse to the reladdr-pointer. Especially since the check above shows that reladdr could be NULL here. Noticed by Coverity. CID: 1438389, 1438390 Fixes: `568bda2f2d` ("mesa/st/glsl_to_tgsi: Split arrays whose elements are only accessed directly") Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>	2018-08-13 18:15:36 +02:00
Neil Roberts	c91a5f70fb	i965/nir: Use the nir copy of shader_info to handle gl_PatchVerticesIn Instead of using the copy of shader_info stored in gl_program, it now uses the one in nir_shader. This is needed for SPIR-V because the info.tess.tcs_vertices_out is filled in via _mesa_spirv_to_nir which happens much later than with a GLSL shader. The copy of shader_data in gl_program is only updated later via brw_shader_gather_info but that is too late. For GLSL this shouldn't create any problems because the nir copy of the shader_info is immediately copied from the gl_program in glsl_to_nir. v2: updated after commit "i965: Combine both gl_PatchVerticesIn lowering passes." (488972) (Alejandro Piñeiro) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-13 16:28:27 +02:00
Neil Roberts	a105c1e6e5	mesa/glspirv: Set separate_shader on shader_info The value is copied from the gl_program. If we don’t do this then it will get reset back to zero in brw_shader_gather_info. This isn’t a problem for GLSL because in that case the nir_shader is initialised with a copy of the shader_info from the gl_program. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-13 16:28:27 +02:00
Iago Toral Quiroga	40947d4744	mesa/glspirv: pick off the only entry point we need This is the same as what we do for Vulkan drivers. This is needed to pass the following CTS test: KHR-GL45.gl_spirv.spirv_modules_shader_binary_multiple_shader_objects_test Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-13 16:28:27 +02:00
Alejandro Piñeiro	32e1d4c34b	mesa/glspirv: compute double inputs and remap attributes Input locations used by input attributes are not handled in the same way in OpenGL vs Vulkan. There is a detailed explanation of such differences in the following commit: `c2acf97fcc`. So with this commit, the same adjustment that is done after glsl_to_nir is also done after spirv_to_nir, when it is used on OpenGL (ARB_gl_spirv). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-13 16:28:27 +02:00
Alejandro Piñeiro	d6c8066663	nir/glsl: make nir_remap_attributes public As we plan to reuse it for ARB_gl_spirv implementation. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-13 16:28:27 +02:00
Alejandro Piñeiro	af194bd38e	nir/lower_samplers: don't assume a deref for both texture and sampler srcs Since commit "nir: Use derefs in nir_lower_samplers" (`75286c2d08`), the pass assumes one deref for both the texture and the sampler. However there are cases (on OpenGL, using ARB_gl_spirv) where SPIR-V is not providing a sampler, like for texture query levels ops. Although we could make spirv_to_nir provide a sampler deref for those cases, it is not really needed, and wrong from the Vulkan point of view. This patch fixes the following (borrowed) tests run on SPIR-V mode: arb_compute_shader/execution/basic-texelFetch.shader_test arb_gpu_shader5/execution/sampler_array_indexing/fs-simple-texture-size.shader_test arb_texture_query_levels/execution/fs-baselevel.shader_test arb_texture_query_levels/execution/fs-maxlevel.shader_test arb_texture_query_levels/execution/fs-miptree.shader_test arb_texture_query_levels/execution/fs-nomips.shader_test arb_texture_query_levels/execution/vs-baselevel.shader_test arb_texture_query_levels/execution/vs-maxlevel.shader_test arb_texture_query_levels/execution/vs-miptree.shader_test arb_texture_query_levels/execution/vs-nomips.shader_test glsl-1.30/execution/fs-textureSize-compare.shader_test v2: merge lower_tex_src_to_offset and calc_sampler_offsets together, update texture/sampler index and texture_array_size directly on lower_tex_src_to_offset (Jason) v3: clarify one comment (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-13 16:28:27 +02:00
Alejandro Piñeiro	fe2de39fb2	nir/linker: take into account hidden uniforms So they are not exposed through the introspection API. It is worth noting that the number of hidden uniforms of GLSL linking vs SPIR-V linking would be somewhat different due to the different order of the nir lowerings/optimizations. For example: gl_FbWposYTransform. This is introduced as part of nir_lower_wpos_ytransform. On GLSL that is executed after the IR-based linking. So that means that on GLSL the UniformStorage will not include this uniform. With the SPIR-V linking, that uniform is already present, but marked as hidden. So it will be included on the UniformStorage, but as hidden. One alternative would be to create a special how_declared for that case, but that seemed overkill. Using hidden should be ok as long as it is used properly. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-13 16:28:27 +02:00
Alejandro Piñeiro	5332d7582d	nir: add how_declared to nir_variable.data Equivalent to the already existing how_declared in GLSL IR. The only difference is that we are not adding all the declaration_type values available in GLSL, only the ones that we will use in the short term. We would add more modes if needed in the future. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-13 16:28:26 +02:00
Neil Roberts	be6f472b23	spirv: Make VertexIndex and VertexId both non-zero-based GLSL has gl_VertexID which is supposed to be non-zero-based. SPIR-V has both VertexIndex and VertexId builtins whose meanings are defined by the APIs. Vulkan defines VertexIndex as being non-zero-based. In Vulkan VertexId and InstanceId have no meaning and are pretty much just reserved for OpenGL at this point. GL_ARB_spirv removes VertexIndex and defines VertexId to be the same as gl_VertexId (which is also non-zero-based). Previously in Mesa it was treating VertexIndex as non-zero-based and VertexId as zero-based, so it was breaking for GL. This behaviour was apparently based on Khronos bug 14255. However that bug doesn’t seem to have made a final decision for VertexId. Assuming there really is no other definition for VertexId for Vulkan it seems better to just make them both have the same value. v2: update comment and commit descriptions, based on Jason Ekstrand explanation of the meaning/rationale behind all those builtins (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-13 16:23:36 +02:00
Alejandro Piñeiro	624c00f1a6	spirv: fill info.gs.input_primitive too info.gs.output_primitive was already being filled. Not sure why this is not needed on Vulkan, but we found it to be needed for ARB_gl_spirv. Specifically, this is needed to get the following test passing: KHR-GL45.gl_spirv.spirv_validation_builtin_variable_decorations_test Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-13 12:56:51 +02:00
Tapani Pälli	ed94a5799d	docs/features: mark GL_EXT_render_snorm as done for i965 Signed-off-by: Tapani Pälli <tapani.palli@intel.com>	2018-08-13 13:08:22 +03:00
Tapani Pälli	fa9e6c235d	i965: enable EXT_render_snorm Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-08-13 12:03:17 +03:00
Tapani Pälli	0d356cf478	mesa: enable EXT_render_snorm extension Patch sets additional formats renderable and enables the extension when OpenGL ES 3.1 is supported. v2: instead of dummy_true, have a separate toggle for extension (Eric Anholt) v3: add missing checks, simplify some existing checks and fix glCopyTexImage2D check (Nanley Chery) add SHORT and BYTE support in read_pixels_es3_error_check Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-08-13 12:03:17 +03:00
Kenneth Graunke	de57926dc9	blorp: Properly handle Z24X8 blits. One of the reasons we didn't notice that R24_UNORM_X8_TYPELESS destinations were broken was that an earlier layer was swapping it out for B8G8R8A8_UNORM. That made Z24X8 -> Z24X8 blits work. However, R32_FLOAT -> R24_UNORM_X8_TYPELESS was still totally broken. The old code only considered one format at a time, without thinking that format conversion may need to occur. This patch moves the translation out to a place where it can consider both formats. If both are Z24X8, we continue using B8G8R8A8_UNORM to avoid having to do shader math workarounds. If we have a Z24X8 destination, but a non-matching source, we use our shader hacks to actually render to it properly. Fixes: `804856fa57` (intel/blorp: Handle more exotic destination formats) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-11 12:34:01 -07:00
Kenneth Graunke	8a29086285	blorp: Don't try to use R32_UNORM for R24_UNORM_X8_TYPELESS rendering. The hardware doesn't support rendering to R24_UNORM_X8_TYPELESS, so Jason decided to fake it with a bit of shader math and R32_UNORM RTs. The only problem is that R32_UNORM isn't renderable either...so we've just traded one bad format for another. This patch makes us use R32_UINT instead. Fixes: `804856fa57` (intel/blorp: Handle more exotic destination formats) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-11 12:33:27 -07:00
Jason Ekstrand	a9f7bcfdf9	intel: Switch the order of the 2x MSAA sample positions The Vulkan 1.1.82 spec flipped the order to better match D3D. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-08-11 10:58:12 -05:00
Gert Wollny	8a87138885	mesa/st/tests: Add array life range estimation and renumbering tests Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-11 12:32:42 +02:00
Gert Wollny	0981fc84df	mesa/st/tests: Add array life range tests infrastructure to common test class Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-11 12:32:42 +02:00
Gert Wollny	d8c2119f9b	mesa/st/glsl_to_tgsi: Expose array live range tracking and merging This patch ties in the array split, merge, and interleave code. shader-db changes in the TGSI code are: original code \| array-merge \| change mean max \| mean max \| best mean % worst ----------------------------------------------------------- arrays 0.05 2 \| 0.00 0 \| -2 -100 0 total temps 5.05 21 \| 4.92 20 \| -15 -2.59 1 instr 55.33 988 \| 55.20 988 \| -15 -0.24 0 Evaluation: Run shader-db in single thread mode (otherwise the output is not ordered and the best and worst column don't make sense) to get results pre-stats.txt and post-stats.txt. Then using python pandas: import pandas as pd old_stats = pd.read_csv('pre-stats.txt') new_stats = pd.read_csv('post-stats.txt') omean = old_stats.mean() omax = old_stats.max() nmean = new_stats.mean() nmax = new_stats.max() delta = new_stats - old_stats pd.concat([omean, omax, nmean, nmax, delta.min(), delta.mean()/old_stats.mean()*100, delta.max()], axis=1, keys=['mean', 'max', 'mean', 'max', 'best', 'avg change %', 'worst']) v4: - Correct typo and add bugs that are fixed by this series. - Update stats and describe stats evaluation Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105371 https://bugs.freedesktop.org/show_bug.cgi?id=100200 Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-11 12:32:42 +02:00
Gert Wollny	c317d0ab54	mesa/st/glsl_to_tgsi: add array life range evaluation into tracking code v4: Also track the register given in inst->resource. (thanks: Benedikt Schemmer for testing the patches on radeonsi, which revealed that I was missing tracking this) Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-11 12:32:42 +02:00
Gert Wollny	5e58eb37f1	mesa/st/glsl_to_tgsi: add class for array access tracking Because of the indirect access it is impossible to obtain an accurate per component and array element tracking. Therefore, the tracking is simplified to only track whether any element was accessed, and whether this happened conditionally in a loop. In addition, while tracking of temporaries requires a per-component tracking that is later fused, for arrays only the components access mask is needed. The resulting tracking code and evaluation of the array live range is sufficiently different from the evaluation of the live range of temporaries to justify implementing this in a different class instead of adding more complexity to the already existing code for temporary life range evaluation. v4: Update commit message to make it clearer why this class is separate from the tracking of temporaries. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-11 12:32:42 +02:00
Gert Wollny	7d55d01b53	mesa/st/glsl_to_tgsi: move evaluation of read mask up in the call hierarchy In preparation for the array live range tracking, the evaluation of the read mask is moved out of the register live range tracking to the enclosing call of the generalized read access tracking. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-11 12:32:42 +02:00
Gert Wollny	f2a4636339	mesa/st/glsl_to_tgsi: rename access_record to register_merge_record and some more renames In preparation for adding the tracking of the live range, the classes that refer to temporary registers are renamed. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-11 12:32:42 +02:00
Gert Wollny	8c89728889	mesa/st/tests: Add tests for array merge helper classes. v2: - Define tests also in the meson.build file. v4: - Check no-op mapping of all bits. - Convert tests to the new class layout used in the merge evaluation. - remove dependency on llvm in meson build (Thanks Dylan Baker for pointing out that this might not be needed) Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-11 12:32:42 +02:00
Gert Wollny	12316aa217	mesa/st/glsl_to_tgsi: Add array merge logic v4: - Update the code to use the new merge logic. - Use a cleaner, class-based approach for the evaluation of merges. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-11 12:32:42 +02:00
Gert Wollny	d097ef4204	mesa/st/glsl_to_tgsi: Add helper classes to apply array merging and interleaving v4: - Remove logic for evaluation of swizzles and merges since this was moved to array_live_range. This class now only handles the actual remapping. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-11 12:32:42 +02:00
Gert Wollny	d54c2f92f9	mesa/st/glsl_to_tgsi: Add helper class for array live range merging and interleaving This class holds the array length, live range, and accessed components, and it implements the logic for evaluating how arrays are merged and interleaved. v4: - Add logic to evaluate merge and interleave of a pair of arrays to the class array_live_range. - document class - update commit message Thanks Nicolai Hähnle for the pointers given. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-11 12:32:42 +02:00
Gert Wollny	331ae3cde5	mesa/st/glsl_to_tgsi: rename lifetime to register_live_range On one hand "live range" is the term used in the literature, and on the other hand a distinction is needed from the array live ranges. v4: Fix indentation and white space Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3) Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-11 12:32:42 +02:00
Gert Wollny	f40c9d0225	mesa/st/glsl_to_tgsi: Properly resolve life times in simple if/else + use constructs In constructs like the one below, the live range estimation currently extends the live range of t unnecessarily to the whole loop because it is not detected that t is unconditionally written and later read only in the "if (a)" scope. while (foo) { ... if (a) { ... if (b) t = ... else t = ... x = t; ... } ... } This patch adds a unit test for this case and corrects the minimal live range estimation accordingly. v4: update comments Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-11 12:32:42 +02:00
Gert Wollny	568bda2f2d	mesa/st/glsl_to_tgsi: Split arrays whose elements are only accessed directly Arrays whose elements are only accessed directly are replaced by the corresponding number of temporary registers. By doing so the otherwise reserved register range becomes subject to further optimizations like copy propagation and register merging. Thanks to the resulting reduced register pressure this patch makes the piglits spec/glsl-1.50/execution - variable-indexing/vs-output-array-vec3-index-wr-before-gs geometry/max-input-components pass on r600 (barts) where they would fail before with a "GPR limit exceeded" error (even with the spilling that was recently added). v2: * rename method dissolve_arrays to split_arrays * unify the tracking and remapping methods for src and dst registers * also track access to arrays via reladdr* v3: * enable this optimization only if the driver requests register merge v4: * Correct comments * Also update inst->resource if it is an array element (thanks: Benedikt Schemmer for testing the patches on radeonsi, which revealed that I was missing tracking this) Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-11 12:32:42 +02:00
Gert Wollny	b1cead3add	mesa/st/glsl_to_tgsi: Add method to collect some TGSI statistics When mesa is compiled in debug mode then this adds the possibility to print out some statistics about the translated and optimized TGSI shaders to a file. The functionality is enabled by setting the environment variable GLSL_TO_TGSI_PRINT_STATS to the file name where the statistics should be collected. The file is opened in append mode so that statistics from various runs will be accumulated. v4: Make access to log file thread safe (thanks for pointing this out Nicolai Hähnle) Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-11 12:32:42 +02:00
Gert Wollny	be95ca9be7	Gallium/tgsi: Correct signedness of return value of bit operations The GLSL operations findLSB, findMSB, and countBits always return a signed integer type. Let TGSI reflect this. v2: Properly set values in infer_(src\|dst)_type (Thanks Roland Scheidegger for pointing out problems with my 1st approach) v2: Set values in the common infer_type code path, and only add the correct source type for UMSB (Roland Scheidegger) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-08-11 11:14:29 +02:00
Mathieu Bridon	2ee1c86d71	meson: Build with Python 3 Now that all the build scripts are compatible with both Python 2 and 3, we can flip the switch and tell Meson to use the latter. Since Meson already depends on Python 3 anyway, this means we don't need two different Python stacks to build Mesa. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-10 15:15:09 -07:00
Mathieu Bridon	bd27203f4d	python: Rework bytes/unicode string handling In both Python 2 and 3, opening a file without specifying the mode will open it for reading in text mode ('r'). On Python 2, the read() method of a file object opened in mode 'r' will return byte strings, while on Python 3 it will return unicode strings. Explicitly specifying the binary mode ('rb') then decoding the byte string means we always handle unicode strings on both Python 2 and 3. Which in turn means all re.match(line) will return unicode strings as well. If we also make expandCString return unicode strings, we don't need the call to the unicode() constructor any more. We were using the ugettext() method because it always returns unicode strings in Python 2, unlike gettext(), which returns byte strings. The ugettext() method doesn't exist on Python 3, so we must use the right method on each version of Python. The last hurdles are that Python 3 doesn't let us concatenate unicode and byte strings directly, and that Python 2's stdout wants encoded byte strings while Python 3's wants unicode strings. With these changes, the script gives the same output on both Python 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-10 15:14:48 -07:00
Mathieu Bridon	15ac05fd45	python: Fix inequality comparisons On Python 3, executing `foo != bar` will first try to call foo.__ne__(bar), and fall back on the opposite result of foo.__eq__(bar). Python 2 does not do that. As a result, those __eq__ methods were never called when we were testing for inequality. Explicitly adding the __ne__ methods fixes this issue, in a way that is compatible with both Python 2 and 3. However, this means the __eq__ methods are now called when testing for `foo != None`, so they need to be guarded correctly. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-10 08:45:59 -07:00
Gert Wollny	e94095ec30	mesa/st: ETC2 now uses R8G8B8A8_SRGB as fallback The check for ETC2 compatibility was not updated when the fallback format was changed. Fixes: `71867a0a61` st/mesa: Fall back to R8G8B8A8_SRGB for ETC2 Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-10 10:09:22 +02:00
Mathieu Bridon	08fe9b3e3a	python: Simplify list sorting Instead of copying the list, then sorting the copy in-place, we can just get a new sorted copy directly. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-09 16:49:19 -07:00
Mathieu Bridon	8d3ff6244c	python: Use key-functions when sorting containers In Python 2, the traditional way to sort containers was to use a comparison function (which returned either -1, 0 or 1 when passed two objects) and pass that as the "cmp" argument to the container's sort() method. Python 2.4 introduced key-functions, which instead only operate on a given item, and return a sorting key for this item. In general, this runs faster, because the cmp-function has to get run multiple times for each item of the container. Python 3 removed the cmp-function, enforcing usage of key-functions instead. This change makes the script compatible with Python 2 and Python 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-09 16:49:19 -07:00
Mathieu Bridon	1e668ca111	python: Better check for integer types Python 3 lost the long type: now everything is an int, with the right size. This commit makes the script compatible with Python 2 (where we check for both int and long) and Python 3 (where we only check for int). Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-09 16:49:19 -07:00
Mathieu Bridon	14f1ab998f	python: Do not mix bytes and unicode strings Mixing the two is a long-standing recipe for errors in Python 2, so much so that Python 3 now completely separates them. This commit stops treating both as if they were the same, and in the process makes the script compatible with both Python 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-09 16:49:19 -07:00
Mathieu Bridon	c644b2d7a7	python: Explicitly use a list On Python 2, the builtin function filter() returns a list. On Python 3, it returns an iterator. Since we want to use those objects in contexts where we need lists, we need to explicitly turn them into lists. This makes the code compatible with both Python 2 and Python 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-09 16:49:18 -07:00
Mathieu Bridon	d9ca4a172e	python: Use the right function for the job The code was just reimplementing itertools.combinations_with_replacement in a less efficient way. This does change the order of the results slightly, but it should be ok. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-09 16:49:18 -07:00
Eric Anholt	b618d7ea59	egl: Fix leak of X11 pixmaps backing pbuffers in DRI3. This is basically copied from the DRI2 destroy path. Without this, Raspberry Pi would quickly run out of CMA during the EGL tests in the CTS due to all the pixmaps lying around. Fixes: `f35198bade` ("egl/x11: Implement dri3 support with loader's dri3 helper") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-09 13:12:13 -07:00
Kenneth Graunke	08a5c395ab	intel: Fix SIMD16 unaligned payload GRF reads on Gen4-5. When the SIMD16 Gen4-5 fragment shader payload contains source depth (g2-3), destination stencil (g4), and destination depth (g5-6), the single register of stencil makes the destination depth unaligned. We were generating this instruction in the RT write payload setup: mov(16) m14<1>F g5<8,8,1>F { align1 compr }; which is illegal: instructions with a source region spanning more than one register need to be aligned to even registers. This is because the hardware implicitly does (nr | 1) instead of (nr + 1) when splitting the compressed instruction into two mov(8)'s. I believe this would cause the hardware to load g5 twice, replicating subspan 0-1's destination depth to subspan 2-3. This showed up as 2x2 artifact blocks in both TIS-100 and Reicast. Normally, we rely on the register allocator to even-align our virtual GRFs. But we don't control the payload, so we need to lower SIMD widths to make it work. To fix this, we teach lower_simd_width about the restriction, and then call it again after lower_load_payload (which is what generates the offending MOV). Fixes: `8aee87fe4c` (i965: Use SIMD16 instead of SIMD8 on Gen4 when possible.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107212 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=13728 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Diego Viola <diego.viola@gmail.com>	2018-08-09 12:33:41 -07:00
Kenneth Graunke	11b9f63a74	i965: Only enable depth IZ signals if there's an actual depthbuffer. According to the G45 PRM Volume 2 Page 265 we're supposed to only set these signals when there is an actual depth buffer. Note that we already do this for the stencil buffer by virtue of brw->stencil_enabled invoking _mesa_is_stencil_enabled(ctx) which checks whether the current drawbuffer's visual has stencil bits (which is updated based on what buffers are bound). We just need to do it for depth as well. Not observed to fix anything. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-09 12:33:38 -07:00
Adam Jackson	63a6b719d9	glx: GLX_MESA_multithread_makecurrent is direct-only This extension is not defined for indirect contexts. Marking it as "client only", as the old code did here, would make the extension available in indirect contexts, even though the server would certainly not have it in its extension list. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-09 12:33:14 -04:00
Eric Engestrom	fcf259ef97	anv: set error in all failure paths Cc: Jason Ekstrand <jason.ekstrand@intel.com> Fixes: `5b196f39bd` "anv/pipeline: Compile to NIR in compile_graphics" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-09 11:20:27 +01:00
Eric Engestrom	aac80f7597	intel/tools: add missing variable initialisation Fixes: `6a60beba40` "intel/tools: Add an error state to aub translator" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-08-09 11:20:18 +01:00
vadym.shovkoplias	e0de26eacc	drirc: Allow extension midshader for Metro Redux This fixes both Metro 2033 Redux and Metro Last Light Redux Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99730 Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com> Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-08-09 13:13:20 +03:00
Tapani Pälli	03a5acec68	glsl: handle error case with ast_post_inc, ast_post_dec Return ir_rvalue::error_value with ast_post_inc, ast_post_dec if parser error was emitted previously. This way process_array_size won't see bogus IR generated like with commit `9c676a6427`. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98699 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-08-09 13:07:16 +03:00
Eric Anholt	fdfb689a48	vc4: Implement texture_subdata() to directly upload tiled data. This avoids a memcpy into a temporary in the upload path. Improves x11perf -putimage100 performance by 12.1586% +/- 1.38155% (n=145)	2018-08-08 18:14:31 -07:00
Eric Anholt	25bee5ef9e	vc4: Handle partial loads/stores of tiled textures. Previously, we would load out the tile-aligned area, update the raster copy, and store it back. This was a huge cost for XPutImage calls to the screen under glamor. Instead, implement a general load/store path that walks over the source x/y writing into the corresponding pixel of the destination (using clever math from https://fgiesen.wordpress.com/2011/01/17/texture-tiling-and-swizzling/). If things are aligned, we go through the previous utile-at-a-time loop. Improves x11perf -putimage10 performance by 139.777% +/- 2.83464% (n=5) Improves x11perf -putimage100 performance by 383.908% +/- 22.6297% (n=11) Improves x11perf -getimage10 performance by 2.75731% +/- 0.585054% (n=145)	2018-08-08 16:45:44 -07:00
Eric Anholt	3e06b918aa	vc4: Compile the LT image helper per cpp we might load/store. For the partial load/store support I'm about to add, we want the memcpy to be compiled out to a single load/store. This should also eliminate the calls to vc4_utile_width/height(). Improves x11perf -putimage100 performance by 3.76344% +/- 1.16978% (n=15)	2018-08-08 15:53:25 -07:00
Eric Anholt	d6a174669f	vc4: Refactor to reuse the LT tile walking code.	2018-08-08 12:34:48 -07:00
Juan A. Suarez Romero	a9fb331ea7	wayland/egl: update surface size on window resize According to EGL 1.5 spec, section 3.10.1.1 ("Native Window Resizing"): "If the native window corresponding to _surface_ has been resized prior to the swap, _surface_ must be resized to match. _surface_ will normally be resized by the EGL implementation at the time the native window is resized. If the implementation cannot do this transparently to the client, then eglSwapBuffers must detect the change and resize surface prior to copying its pixels to the native window." So far, resizing a native window in Wayland/EGL was interpreted in Mesa as a request to resize, which is not executed until the first draw call. Hence, the surface size is not updated until that draw call is executed. Thus, querying the surface size with eglQuerySurface() after a window resize still returns the old values. This commit updates the surface size values as soon as the resize is requested, even though the real resize is done in the draw call. This gives the semantics that any native window resize request takes effect immediately, so if the user calls eglQuerySurface() it returns the new resized values. v2: update surface size if there isn't a back surface (Daniel) CC: Daniel Stone <daniel@fooishbar.org> CC: mesa-stable@lists.freedesktop.org Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-08-08 18:29:58 +02:00
Juan A. Suarez Romero	1fe7cbdf05	wayland/egl: initialize window surface size to window size When creating a window surface with eglCreateWindowSurface(), the width and height returned by eglQuerySurface(EGL_{WIDTH,HEIGHT}) are invalid until buffers are updated (like calling glClear()). But according to EGL 1.5 spec, section 3.5.6 ("Surface Attributes"): "Querying EGL_WIDTH and EGL_HEIGHT returns respectively the width and height, in pixels, of the surface. For a window or pixmap surface, these values are initially equal to the width and height of the native window or pixmap with respect to which the surface was created" This fixes dEQP-EGL.functional.color_clears.* CTS tests v2: - Do not modify attached_{width,height} (Daniel) - Do not update size on resizing window (Brendan) CC: Daniel Stone <daniel@fooishbar.org> CC: Brendan King <brendan.king@imgtec.com> CC: mesa-stable@lists.freedesktop.org Tested-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-08-08 18:28:52 +02:00
Juan A. Suarez Romero	f9d0e7d3bc	travis: make drivers explicit in Meson targets Like in the autotools target, make the list of drivers to be built in each of the Meson targets explicit. This will help to identify missing dependencies and other issues more easily. CC: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-08 17:56:32 +02:00
Brian Paul	51e878cdb3	svga: use pipe_sampler_view::target in svga_set_sampler_views() instead of the underlying texture's target. This fixes an issue where the TGSI sampler type was not agreeing with the sampler view target/type. In particular, this fixes a Mint 19 XFCE desktop scaling issue because the TGSI code was using a RECT sampler but the sampler view's underlying texture was PIPE_TEXTURE_2D. We want to use the sampler view's type rather than the underlying resource, as we do for the view's surface format. No piglit regressions. VMware issue 2156696. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-08-08 08:20:10 -06:00
Brian Paul	92e5dc94ac	svga: use SVGA3D_RS_FILLMODE for vgpu9 I'm not sure why we didn't support this in the past, but fillmode is supported by all renderers nowadays. Also fix the logic in svga_create_rasterizer_state() to avoid a few swtnl cases. No piglit regressions. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-08-08 08:20:10 -06:00
Brian Paul	a45b495700	svga: add TGSI_SEMANTIC_FACE switch case in svga_swtnl_update_vdecl() Fixes failed assertion running Piglit polygon-mode-face test. Though, the test still does not pass. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-08-08 08:20:10 -06:00
Brian Paul	92e7342a6f	xlib: remove unused Fake_glXGetAGPOffsetMESA() function To silence compiler warning. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-08 08:20:09 -06:00
Brian Paul	6ff4795c62	gl.h: define GLeglImageOES depending on GL_EXT_EGL_image_storage To avoid duplicate typedef with the definition in glext.h V2: test for both GL_OES_EGL_image and GL_EXT_EGL_image_storage in case both the GL and GLES headers are included. Per Emil. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107488 Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-08-08 08:20:01 -06:00
Emil Velikov	32aa7ff647	Android: copy -fnomath options from the autotools build Add -fno-math-errno and -fno-trapping-math to the build. Mesa does not depend on the functionality provided, thus this should result in slightly faster code and smaller binaries. Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Rob Herring <robh@kernel.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Tapani Pälli <tapani.palli@intel.com>	2018-08-08 13:45:55 +01:00
Emil Velikov	315c46cfdc	autotools: use correct gl.pc LIBS when using glvnd This is more of a hack, since glvnd itself should be providing the file. Until that happens, ensure the libs is correctly set to -lGL CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-08-08 13:37:09 +01:00
Emil Velikov	8dc96416c9	glx: automake: add egl.pc/headers TODO when using glvnd Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-08-08 13:37:09 +01:00
Emil Velikov	94ed4c4a16	egl: automake: add egl.pc/headers TODO when using glvnd Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-08-08 13:37:09 +01:00
Emil Velikov	25a9450a44	autotools: error out when building with mangling and glvnd It's not a thing that can work, nor is it a wise idea to attempt. v2: Tweak error message (Dylan) CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Adam Jackson <ajax@redhat.com> (v1)	2018-08-08 13:37:09 +01:00
Emil Velikov	d5ac236471	autotools: error out when using the broken --with-{gl, osmesa}-lib-name The toggles were broken with the introduction of --enable-mangling. Fixing that up might be possible, but it's not worth the complexity since one can rename the libraries at any point. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-08-08 13:37:09 +01:00
Emil Velikov	4f2b73d9fd	meson: recommend building the surfaceless platform It has no special requirements, and its size and build time are effectively zero. v2: Rebase Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-08-08 13:37:09 +01:00
Emil Velikov	a7ea7511ba	automake: require shared glapi when using DRI based libGL This has been a requirement for ages, yet it seems like we never explicitly errored out during configure. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-08-08 13:37:09 +01:00
Emil Velikov	834036500c	ttn: remove {varying_slot, frag_result}_to_tgsi_semantic helpers The respective drivers have been updated and the helpers are no longer needed. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-08-08 13:33:07 +01:00
Juan A. Suarez Romero	db432194a1	travis: remove libedit-dev dependency in LLVM 6.0 targets In LLVM <6.0 we added explicitly libedit-dev, as it was required to satisfy apt dependencies. In LLVM 6.0, this is not required anymore, so let's remove it. CC: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-08 13:00:33 +02:00
Erik Faye-Lund	0f450e0cbe	glsl_to_tgsi: plumb image writable through to driver The virgl driver cares about the writable-flag on image definitions, because it re-emits GLSL from the TGSI. However, so far it was hardcoded to true in glsl_to_tgsi, which causes problems when virglrenderer is running on top of GLES 3.1, where not all formats are supported for writable images. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-08 09:35:09 +02:00
Eric Anholt	cfe69d0aaa	vc4: Fix vc4_fence_server_sync() on pre-syncobj kernels. We won't have an FD if we're just having the server wait on a fence created by eglCreateSyncKHR(). Our seqno fences will happen in order, so server-side waits are no-ops in that case. Fixes dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_server_sync.buffers.gen_delete Fixes: `b0acc3a562` ("broadcom/vc4: Native fence fd support")	2018-08-07 17:00:49 -07:00
Eric Anholt	69158c452b	vc4: Ignore samplers for finding uniform offsets. Fixes: dEQP-GLES2.shaders.struct.uniform.sampler_array_fragment dEQP-GLES2.shaders.struct.uniform.sampler_array_vertex dEQP-GLES2.shaders.struct.uniform.sampler_nested_fragment dEQP-GLES2.shaders.struct.uniform.sampler_nested_vertex Cc: mesa-stable@lists.freedesktop.org	2018-08-07 17:00:22 -07:00
Eric Anholt	e24a8e5232	vc4: Extend dumping of uniforms in QIR and in the command stream. Similar to what I did for V3D, provide some description of the uniforms.	2018-08-07 17:00:22 -07:00
Eric Anholt	3954331aff	vc4: Pull uinfo->data[i] dereference out to the top of the loop. Reduces the size of vc4_uniforms.o by about 10%. We would basically always end up loading the cacheline of uinfo->data[i] anyway, so it should be good for performance as well as making the code a bit cleaner.	2018-08-07 17:00:22 -07:00
Eric Anholt	550e9c917c	vc4: Make sure to emit a tile coordinates packet between two MSAA loads. The HW only executes a load once the tile coordinates packet happens, and only tracks one at a time, so by emitting our two MSAA loads back to back we would end up with an undefined color or Z buffer. The simulator doesn't seem to care, but sync up the RCL generation with the kernel anyway. Fixes dEQP-EGL.functional.render.multi_context.gles2.rgb888_window	2018-08-07 17:00:22 -07:00
Eric Anholt	9ab6912a00	vc4: Respect a sampler view's first_layer field. Fixes texturing from EGL images created from cubemap faces, as in dEQP-EGL.functional.image.create.gles2_cubemap_negative_x_rgba_texture Cc: mesa-stable@lists.freedesktop.org	2018-08-07 17:00:22 -07:00
Dave Airlie	fe0a3a45bb	virgl: add ARB_shader_clock support Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2018-08-08 08:36:40 +10:00
Mathieu Bridon	ba1ebf2ee1	python: Specify the template output encoding We're trying to write a unicode string (i.e decoded) to a file opened in binary (i.e encoded) mode. In Python 2 this works, because of the automatic conversion between byte and unicode strings. In Python 3 this fails though, as no automatic conversion is attempted. This change makes the scripts compatible with both versions of Python. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-07 13:28:35 -07:00
Mathieu Bridon	e1b88aee68	python: Fix rich comparisons Python 3 doesn't call objects' __cmp__() methods any more to compare them. Instead, it requires implementing the rich comparison methods explicitly: __eq__(), __ne__(), __lt__(), __le__(), __gt__() and __ge__(). Fortunately Python 2 also supports those. This commit only implements the comparison methods which are actually used by the build scripts. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-07 13:10:34 -07:00
Mathieu Bridon	9b6746b7c0	python: Use explicit integer divisions In Python 2, divisions of integers return an integer: >>> 32 / 4 8 In Python 3 though, they return floats: >>> 32 / 4 8.0 However, Python 3 has an explicit integer division operator: >>> 32 // 4 8 That operator exists on Python >= 2.2, so let's use it everywhere to make the scripts compatible with both Python 2 and 3. In addition, using __future__.division tells Python 2 to behave the same way as Python 3, which helps ensure the scripts produce the same output in both versions of Python. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2) Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-07 13:07:44 -07:00
Chad Versace	3dc22381fa	egl/main: Add bits for EGL_KHR_mutable_render_buffer A follow-up patch enables EGL_KHR_mutable_render_buffer for Android. This patch is separate from the Android patch because I think it's easier to review the platform-independent bits separately. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-08-07 11:11:05 -07:00
Chad Versace	5c6d6eedb3	dri: Add param driCreateConfigs(mutable_render_buffer) If set, then the config will have __DRI_ATTRIB_MUTABLE_RENDER_BUFFER, which translates to EGL_MUTABLE_RENDER_BUFFER_BIT_KHR. Not used yet. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-08-07 11:11:05 -07:00
Chad Versace	bbe2d50b58	dri: Define DRI_MutableRenderBuffer extensions Define extensions DRI_MutableRenderBufferDriver and DRI_MutableRenderBufferLoader. These are the two halves for EGL_KHR_mutable_render_buffer. Outside the DRI code there is one additional change. Add gl_config::mutableRenderBuffer to match __DRI_ATTRIB_MUTABLE_RENDER_BUFFER. Neither are used yet. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-08-07 11:11:05 -07:00
Chad Versace	eabf59791e	egl/dri2: In dri2_make_current, return early on failure This pulls an 'else' block into the function's main body, making the code easier to follow. Without this change, the upcoming EGL_KHR_mutable_render_buffer patch transforms dri2_make_current() into spaghetti. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-08-07 11:11:05 -07:00
Chad Versace	f48f9a78da	egl: Simplify queries for EGL_RENDER_BUFFER There exist two queryable EGL_RENDER_BUFFER states in EGL: eglQuerySurface(EGL_RENDER_BUFFER) and eglQueryContext(EGL_RENDER_BUFFER). These changes eliminate potentially very fragile code in the upcoming EGL_KHR_mutable_render_buffer implementation. * eglQuerySurface(EGL_RENDER_BUFFER) The implementation of eglQuerySurface(EGL_RENDER_BUFFER) contained abstruse logic which required comprehending the specification complexities of how the two EGL_RENDER_BUFFER states interact. The function sometimes returned _EGLContext::WindowRenderBuffer, sometimes _EGLSurface::RenderBuffer. Why? The function tried to encode the actual logic from the EGL spec. When did the function return which variable? Go study the EGL spec, hope you understand it, then hope Mesa mutated the EGL_RENDER_BUFFER state in all the correct places. Have fun. To simplify eglQuerySurface(EGL_RENDER_BUFFER), and to improve confidence in its correctness, flatten its indirect logic. For pixmap and pbuffer surfaces, simply return a hard-coded literal value, as the spec suggests. For window surfaces, simply return _EGLSurface::RequestedRenderBuffer. Nothing difficult here. * eglQueryContext(EGL_RENDER_BUFFER) The implementation of this suffered from the same issues as eglQuerySurface, and the solution is the same. To improve confidence in its correctness, flatten its indirect logic. For pixmap and pbuffer surfaces, simply return a hard-coded literal value, as the spec suggests. For window surfaces, simply return _EGLSurface::ActiveRenderBuffer. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-08-07 11:11:05 -07:00
Marek Olšák	d145e33e7c	radeonsi: set GLC=1 for all write-only shader resources	2018-08-07 13:52:34 -04:00
Marek Olšák	2ab8cf6de5	radeonsi: don't load block dimensions into SGPRs if they are not variable	2018-08-07 13:52:34 -04:00
Juan A. Suarez Romero	03cff7ecd8	travis: meson/Vulkan requires LLVM 6.0 RADV now requires LLVM 6.0. Fixes: `fd1121e839` ("amd: remove support for LLVM 5.0") CC: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-08-07 19:29:29 +02:00
Juan A. Suarez Romero	80f937ea4d	travis: add ubuntu-toolchain-r-test LLVM 6.0 requires libstdc++4.9, which is not available in the main Travis repository. v2: LLVM 6.0 requires libstdc++4.9, rather than GCC 4.9 (Jan Vesely) Fixes: `fd1121e839` ("amd: remove support for LLVM 5.0") CC: Marek Olšák <marek.olsak@amd.com> CC: Emil Velikov <emil.velikov@collabora.com> CC: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-07 19:27:07 +02:00
Emil Velikov	85cad15298	egl: set EGL_BAD_NATIVE_PIXMAP in the copy_buffers fallback As the spec says: EGL_BAD_NATIVE_PIXMAP is generated if the implementation does not support native pixmaps. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-07 17:59:24 +01:00
Emil Velikov	5463064f7a	egl/x11: use the no-op dri2_fallback_copy_buffers for swrast Currently dri2_copy_buffers is used for swrast, which depends on the DRI2_FLUSH extension. Since that's not a thing on software based drivers we crash out. Do the slightly more graceful thing of returning EGL_FALSE. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-07 17:59:09 +01:00
Emil Velikov	670cd4080b	egl: remove unneeded _eglGetNativePlatform check There's little point in calling _eglGetNativePlatform() in eglCopyBuffers. The platform returned should be identical to the one already stored in our _EGLDisplay. In the following corner case, the check is incorrect. The function _eglGetNativePlatform effectively invokes the old-style eglGetDisplay platform selection. Thus if the EGL_PLATFORM platform does not match with the EGL_EXT_platform_* used to create the display we'll error out. Addresses the egl-copy-buffers piglit test. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-08-07 17:58:52 +01:00
Emil Velikov	b4b277f770	travis: use https for all the links Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-07 17:27:06 +01:00
Emil Velikov	6b8657aff0	autoconf: stop exporting internal wayland details With version v1.15 the "code" option was deprecated in favour of "private-code" or "public-code". Previously, the generated interface symbol was exported (which is a bad idea since it's an internal implementation detail) and others could misuse it. That was the case with libva approx. 1 year ago. Since then libva was fixed, so we can finally hide it by using "private-code". Inspired by a similar xserver patch by Adam Jackson. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-07 17:23:17 +01:00
Emil Velikov	2f1d9e6cb8	meson: stop exporting internal wayland details With version v1.15 the "code" option was deprecated in favour of "private-code" or "public-code". Previously, the generated interface symbol was exported (which is a bad idea since it's an internal implementation detail) and others could misuse it. That was the case with libva approx. 1 year ago. Since then libva was fixed, so we can finally hide it by using "private-code". Inspired by a similar xserver patch by Adam Jackson. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-07 17:23:17 +01:00
Emil Velikov	c077b74ee8	meson: use dependency()+find_program() for wayland-scanner Helps when the native wayland-scanner is located outside of PATH. Inspired by the xserver code ;-) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-07 17:23:17 +01:00
Emil Velikov	54d844897f	swr: don't export swr_create_screen_internal With earlier rework the user and provider of the symbol are within the same binary. Thus there's no point in exporting the function. Spotted while reviewing patch from Chuck, that nearly added another unneeded PUBLIC function. Cc: Chuck Atkins <chuck.atkins@kitware.com> Cc: Tim Rowley <timothy.o.rowley@intel.com> Fixes: `f50aa21456` ("swr: build driver proper separate from rasterizer") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Chuck Atkins <chuck.atkins@kitware.com> Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-08-07 17:23:17 +01:00
Eric Engestrom	e02f061b69	meson: install KHR/khrplatform.h when needed Fixes: `f7d42ee7d3` "include: update GL & GLES headers (v2)" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-07 15:57:32 +01:00
Eric Engestrom	ed07e831a8	i965: gen_shader_sha1() doesn't use the brw_context Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-08-07 14:20:50 +01:00
Eric Engestrom	87c156183c	configure: install KHR/khrplatform.h when needed Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107511 Fixes: `f7d42ee7d3` "include: update GL & GLES headers (v2)" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Brad King <brad.king@kitware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-08-07 14:20:50 +01:00
Lionel Landwerlin	303e7b39b5	intel: don't build tools without -Dtools=intel Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107487 Fixes: 4334196ab325c6w ("intel: tools: simplify meson build") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-07 11:58:47 +01:00
Erik Faye-Lund	c4f183492d	virgl: update virgl_hw.h from virglrenderer This just makes sure we're currently up-to-date with what virglrenderer has. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-07 09:38:41 +02:00
Erik Faye-Lund	0914e1464e	virgl: rename msaa_sample_positions -> sample_locations This matches what this field is called in virglrenderer's copy of this. This reduces the diff between the two different versions of virgl_hw.h, and should make it easier to upgrade the file in the future. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Acked-by: Dave Airlie <airlied@redhat.com>	2018-08-07 09:38:27 +02:00
Eric Anholt	9507e03699	vc4: Fix a leak of the no-vertex-elements workaround BO. Fixes: `bd1925562a` ("vc4: Convert the driver to emitting the shader record using pack macros.")	2018-08-06 19:10:06 -07:00
Eric Anholt	86095e9bb1	vc4: Fix context creation when syncobjs aren't supported. Noticed when trying to run current Mesa on rpi's downstream kernel. Fixes: `b0acc3a562` ("broadcom/vc4: Native fence fd support")	2018-08-06 19:10:06 -07:00
Eric Anholt	1561e4984e	v3d: Emit the VCM_CACHE_SIZE packet. This is needed to ensure that we don't get blocked waiting for VPM space with bin/render overlapping. Cc: "18.2" <mesa-stable@lists.freedesktop.org>	2018-08-06 13:03:23 -07:00
Eric Anholt	5d49076990	v3d: Drop "VC5" from the renderer string. VC5 isn't a useful name any more, just stick to v3d.	2018-08-06 13:03:23 -07:00
Eric Anholt	50a8713d4f	v3d: Avoid spilling that breaks the r5 usage after a ldvary. Fixes bad rendering when forcing 2 spills in glxgears. Cc: "18.2" <mesa-stable@lists.freedesktop.org>	2018-08-06 13:03:23 -07:00
Eric Anholt	f2c0d310d6	v3d: Make sure that QPU instruction-has-a-dest matches VIR. Found when debugging register spilling -- we would try to spill the dest of a STVPMV, inserting spill code after entering the last segment. In fact, we were likely to choose to do this, given that the STVPMV "dest" temp was never read from, making it cheap to spill. Cc: "18.2" <mesa-stable@lists.freedesktop.org>	2018-08-06 13:03:23 -07:00
Eric Anholt	3f9cb2eb05	v3d: Wait for TMU writes to complete before continuing after a spill. The simulator complained that we had write responses outstanding at shader end. It seems that a TMU read does not guarantee that previous TMU writes by the thread have completed, which surprised me. Cc: "18.2" <mesa-stable@lists.freedesktop.org>	2018-08-06 13:03:23 -07:00
Eric Anholt	ccbe33af5b	v3d: Make sure we don't emit a thrsw before the last one finished. Found while forcing some spilling, which creates a lot of short tmua->thrsw->ldtmu sequences. Cc: "18.2" <mesa-stable@lists.freedesktop.org>	2018-08-06 13:03:23 -07:00
Eric Anholt	f9d54dc3cf	v3d: Add some debug code for forcing register spilling. This is useful for periodically testing out register spilling to see how it goes on simple shaders, rather than only failing on insanely complicated ones.	2018-08-06 13:03:23 -07:00
Chad Versace	aaa41cd297	drisw: Fix build on Android Nougat, which lacks shm (v2) In commit `cf54bd5e8`, dri_sw_winsys.c began using <sys/shm.h> to support the new functions putImageShm, getImageShm in DRI_SWRastLoader. But Android began supporting System V shared memory only in Oreo. Nougat has no shm headers. Fix the build by ifdef'ing out the shm code on Nougat. Fixes: `cf54bd5e8` "drisw: use shared memory when possible" Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: Marc-André Lureau <marcandre.lureau@gmail.com>	2018-08-06 11:09:38 -07:00
Ian Romanick	6229ee87c7	mesa: fix make check for AMD_framebuffer_multisample_advanced Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107483 Fixes: `3d6900d76e` ("glapi: define AMD_framebuffer_multisample_advanced and add its functions") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: Vinson Lee <vlee@freedesktop.org>	2018-08-06 10:31:56 -07:00
Ian Romanick	b7946f6778	glapi: Fix GLES versioning for AMD_framebuffer_multisample_advanced functions The GL_AMD_framebuffer_multisample_advanced spec says: OpenGL ES dependencies: Requires OpenGL ES 3.0. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107483 Fixes: `3d6900d76e` ("glapi: define AMD_framebuffer_multisample_advanced and add its functions") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: Vinson Lee <vlee@freedesktop.org>	2018-08-06 10:30:06 -07:00
Gert Wollny	7a46b2d641	meson, install_megadrivers: Also remove stale symlinks os.path.exists doesn't return True for stale symlinks, but they are in the way later, when a link/file with the same name is to be created. For instance it is conceivable that the pointed to file is replaced by a file with a new name, and then the symlink is dead. To handle this check specifically for all existing symlinks to be removed. (This bugged me for some time with a link libXvMCr600.so always being in the way of installing this file) v2: use only os.lexist and replace all instances of os.exist (Dylan Baker) v3: handle directory check correctly (Eric Engestrom) Fixes: `f7f1b30f81` ("meson: extend install_megadrivers script to handle symmlinking") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>(v2 minus dir check) Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Gert Wollny <gert.wollny@collabora.com>	2018-08-06 18:42:01 +02:00
Tapani Pälli	5eb4b384d9	anv: add more swapchain formats This change helps with some of the dEQP-VK.wsi.android.* tests that try to create swapchain with using such formats. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-08-06 09:25:11 +03:00
Karol Herbst	c3325097be	nvc0/ir: return 0 in imageLoad on incomplete textures We already guarded all OP_SULDP against out of bound accesses, but we ended up just reusing whatever value was stored in the dest registers. Fixes CTS test shader_image_load_store.incomplete_textures v2: fix for loads not ending up with predicates (bindless_texture) v3: fix replacing the def Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-08-04 18:25:20 +02:00
Karol Herbst	0ca046d7e9	gm200/ir: optimize rcp(sqrt) to rsq mitigates hurt shaders after adding sqrt: total instructions in shared programs : 5456166 -> 5454825 (-0.02%) total gprs used in shared programs : 647522 -> 647551 (0.00%) total shared used in shared programs : 389120 -> 389120 (0.00%) total local used in shared programs : 21064 -> 21064 (0.00%) total bytes used in shared programs : 58288696 -> 58274448 (-0.02%) local shared gpr inst bytes helped 0 0 0 516 516 hurt 0 0 27 2 2 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-08-04 15:24:08 +02:00
Karol Herbst	6f98a3065b	gm200/ir: add native OP_SQRT support ./GpuTest /test=pixmark_piano 1024x640 30sec: 301 -> 327 points shader-db: total instructions in shared programs : 5472103 -> 5456166 (-0.29%) total gprs used in shared programs : 647530 -> 647522 (-0.00%) total shared used in shared programs : 389120 -> 389120 (0.00%) total local used in shared programs : 21064 -> 21064 (0.00%) total bytes used in shared programs : 58459304 -> 58288696 (-0.29%) local shared gpr inst bytes helped 0 0 27 8281 8281 hurt 0 0 21 431 431 v2: use NVISA_GM200_CHIPSET Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-08-04 15:24:08 +02:00
Lionel Landwerlin	4334196ab3	intel: tools: simplify meson build Remove the if tools condition and just put it through the install: parameter. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-04 09:45:34 +01:00
Lionel Landwerlin	87a3c97781	intel: aubinator: simplify decoding Since we don't support streaming an aub file, we can drop the decoding status enum. v2: include stdbool (Eric) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-04 09:40:14 +01:00
Lionel Landwerlin	02ebc064ea	intel: common: add missing stdint include Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-04 09:39:01 +01:00
Lionel Landwerlin	db4770ee57	intel: decoder: remove unused variable Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-04 09:38:58 +01:00
Lionel Landwerlin	7471286bb0	intel: tools: aubwrite: reuse canonical address helper Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-04 09:38:44 +01:00
Lionel Landwerlin	35955afa7a	intel: aubinator: fix read the context/ring Up to now we've been lucky that the buffer returned was always exactly at the address we requested. Fixes: `144b40db54` ("intel: aubinator: drop the 1Tb GTT mapping") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-08-04 09:38:34 +01:00
Ian Romanick	3b07d28f81	nir: Transform expressions of b2f(a) and b2f(b) to a == b All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14276886 -> 14276838 (<.01%) instructions in affected programs: 312 -> 264 (-15.38%) helped: 2 HURT: 0 total cycles in shared programs: 532578395 -> 532570985 (<.01%) cycles in affected programs: 682562 -> 675152 (-1.09%) helped: 374 HURT: 4 helped stats (abs) min: 2 max: 200 x̄: 20.39 x̃: 18 helped stats (rel) min: 0.07% max: 11.64% x̄: 1.25% x̃: 1.28% HURT stats (abs) min: 2 max: 114 x̄: 53.50 x̃: 49 HURT stats (rel) min: 0.06% max: 11.70% x̄: 5.02% x̃: 4.15% 95% mean confidence interval for cycles value: -21.30 -17.91 95% mean confidence interval for cycles %-change: -1.30% -1.06% Cycles are helped. Sandy Bridge total instructions in shared programs: 10488123 -> 10488075 (<.01%) instructions in affected programs: 336 -> 288 (-14.29%) helped: 2 HURT: 0 total cycles in shared programs: 150260379 -> 150260439 (<.01%) cycles in affected programs: 4726 -> 4786 (1.27%) helped: 0 HURT: 2 No changes on Iron Lake or GM45. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-08-04 01:12:03 -07:00
Ian Romanick	c658b6c4c8	nir: Transform expressions of b2f(a) and b2f(b) to a ^^ b All Gen platforms had pretty similar results. (Skylake shown) total instructions in shared programs: 14276892 -> 14276886 (<.01%) instructions in affected programs: 484 -> 478 (-1.24%) helped: 2 HURT: 0 total cycles in shared programs: 532578397 -> 532578395 (<.01%) cycles in affected programs: 3522 -> 3520 (-0.06%) helped: 1 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-08-04 01:12:03 -07:00
Ian Romanick	3aca80aabc	nir: Transform expressions of b2f(a) and b2f(b) to !(a && b) All Gen platforms had pretty similar results. (Skylake shown) total cycles in shared programs: 532578400 -> 532578397 (<.01%) cycles in affected programs: 2784 -> 2781 (-0.11%) helped: 1 HURT: 1 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 0.26% max: 0.26% x̄: 0.26% x̃: 0.26% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.08% max: 0.08% x̄: 0.08% x̃: 0.08% v2: s/fmax/fmin/. Noticed by Thomas Helland. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-08-04 01:12:03 -07:00
Ian Romanick	1713c97181	nir: Transform expressions of b2f(a) and b2f(b) to a && b No changes on any Gen platform. v2: s/fmax/fmin/. Noticed by Thomas Helland. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-08-04 01:12:03 -07:00
Ian Romanick	4425f4786a	nir: Transform expressions of b2f(a) and b2f(b) to !(a \|\| b) All Gen6+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14276961 -> 14276892 (<.01%) instructions in affected programs: 3215 -> 3146 (-2.15%) helped: 28 HURT: 0 helped stats (abs) min: 1 max: 6 x̄: 2.46 x̃: 2 helped stats (rel) min: 0.47% max: 9.52% x̄: 4.34% x̃: 1.92% 95% mean confidence interval for instructions value: -2.87 -2.06 95% mean confidence interval for instructions %-change: -5.73% -2.95% Instructions are helped. total cycles in shared programs: 532577068 -> 532578400 (<.01%) cycles in affected programs: 121864 -> 123196 (1.09%) helped: 35 HURT: 30 helped stats (abs) min: 2 max: 268 x̄: 42.34 x̃: 22 helped stats (rel) min: 0.12% max: 12.14% x̄: 3.22% x̃: 1.86% HURT stats (abs) min: 2 max: 246 x̄: 93.80 x̃: 36 HURT stats (rel) min: 0.09% max: 13.63% x̄: 4.47% x̃: 2.58% 95% mean confidence interval for cycles value: -5.02 46.01 95% mean confidence interval for cycles %-change: -0.99% 1.65% Inconclusive result (value mean confidence interval includes 0). Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 7781299 -> 7781342 (<.01%) instructions in affected programs: 22300 -> 22343 (0.19%) helped: 13 HURT: 40 helped stats (abs) min: 2 max: 3 x̄: 2.85 x̃: 3 helped stats (rel) min: 1.15% max: 7.69% x̄: 3.72% x̃: 3.33% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.26% max: 1.30% x̄: 0.47% x̃: 0.43% 95% mean confidence interval for instructions value: 0.23 1.39 95% mean confidence interval for instructions %-change: -1.18% 0.07% Inconclusive result (%-change mean confidence interval includes 0). 
total cycles in shared programs: 177878928 -> 177879332 (<.01%) cycles in affected programs: 383298 -> 383702 (0.11%) helped: 7 HURT: 43 helped stats (abs) min: 2 max: 18 x̄: 10.00 x̃: 10 helped stats (rel) min: 0.17% max: 4.81% x̄: 2.62% x̃: 3.40% HURT stats (abs) min: 2 max: 38 x̄: 11.02 x̃: 12 HURT stats (rel) min: 0.08% max: 1.54% x̄: 0.25% x̃: 0.09% 95% mean confidence interval for cycles value: 5.21 10.95 95% mean confidence interval for cycles %-change: -0.51% 0.21% Inconclusive result (%-change mean confidence interval includes 0). v2: s/fmin/fmax/. Noticed by Thomas Helland. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-08-04 01:12:03 -07:00
Ian Romanick	6b3670ae80	nir: Transform -fabs(a) >= 0 to a == 0 All Gen platforms had pretty similar results. (Skylake shown) total instructions in shared programs: 14276964 -> 14276961 (<.01%) instructions in affected programs: 411 -> 408 (-0.73%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.47% max: 1.96% x̄: 1.04% x̃: 0.68% total cycles in shared programs: 532577062 -> 532577068 (<.01%) cycles in affected programs: 1093 -> 1099 (0.55%) helped: 1 HURT: 1 helped stats (abs) min: 16 max: 16 x̄: 16.00 x̃: 16 helped stats (rel) min: 7.77% max: 7.77% x̄: 7.77% x̃: 7.77% HURT stats (abs) min: 22 max: 22 x̄: 22.00 x̃: 22 HURT stats (rel) min: 2.48% max: 2.48% x̄: 2.48% x̃: 2.48% Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-08-04 01:12:03 -07:00
Ian Romanick	46e7c340d4	nir: Transform expressions of b2f(a) and b2f(b) to a \|\| b All Gen6+ platforms had pretty similar results. (Skylake shown) total instructions in shared programs: 14277184 -> 14276964 (<.01%) instructions in affected programs: 10082 -> 9862 (-2.18%) helped: 37 HURT: 1 helped stats (abs) min: 1 max: 30 x̄: 5.97 x̃: 4 helped stats (rel) min: 0.14% max: 16.00% x̄: 5.23% x̃: 2.04% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.70% max: 0.70% x̄: 0.70% x̃: 0.70% 95% mean confidence interval for instructions value: -7.87 -3.71 95% mean confidence interval for instructions %-change: -6.98% -3.16% Instructions are helped. total cycles in shared programs: 532577990 -> 532577062 (<.01%) cycles in affected programs: 170959 -> 170031 (-0.54%) helped: 33 HURT: 9 helped stats (abs) min: 2 max: 120 x̄: 30.91 x̃: 30 helped stats (rel) min: 0.02% max: 7.65% x̄: 2.66% x̃: 1.13% HURT stats (abs) min: 2 max: 24 x̄: 10.22 x̃: 8 HURT stats (rel) min: 0.09% max: 1.79% x̄: 0.61% x̃: 0.22% 95% mean confidence interval for cycles value: -31.23 -12.96 95% mean confidence interval for cycles %-change: -2.90% -1.02% Cycles are helped. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 7781539 -> 7781301 (<.01%) instructions in affected programs: 10169 -> 9931 (-2.34%) helped: 32 HURT: 0 helped stats (abs) min: 2 max: 20 x̄: 7.44 x̃: 6 helped stats (rel) min: 0.47% max: 17.02% x̄: 4.03% x̃: 1.88% 95% mean confidence interval for instructions value: -9.53 -5.34 95% mean confidence interval for instructions %-change: -5.94% -2.12% Instructions are helped. 
total cycles in shared programs: 177878590 -> 177878932 (<.01%) cycles in affected programs: 78706 -> 79048 (0.43%) helped: 7 HURT: 21 helped stats (abs) min: 6 max: 34 x̄: 24.57 x̃: 28 helped stats (rel) min: 0.15% max: 8.33% x̄: 4.66% x̃: 6.37% HURT stats (abs) min: 2 max: 86 x̄: 24.48 x̃: 22 HURT stats (rel) min: 0.01% max: 4.28% x̄: 1.21% x̃: 0.70% 95% mean confidence interval for cycles value: 0.30 24.13 95% mean confidence interval for cycles %-change: -1.52% 1.01% Inconclusive result (%-change mean confidence interval includes 0). v2: s/fmin/fmax/. Noticed by Thomas Helland. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-08-04 01:12:03 -07:00
Ian Romanick	be7d3ba34a	nir: Transform -fabs(a) < 0 to a != 0 Unlike the much older -abs(a) >= 0.0 transformation, this is not precise. The behavior changes if a is NaN. All Gen platforms had pretty similar results. (Skylake shown) total instructions in shared programs: 14277216 -> 14277184 (<.01%) instructions in affected programs: 2300 -> 2268 (-1.39%) helped: 8 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 4.00 x̃: 3 helped stats (rel) min: 0.48% max: 15.15% x̄: 4.41% x̃: 1.01% 95% mean confidence interval for instructions value: -6.45 -1.55 95% mean confidence interval for instructions %-change: -9.96% 1.13% Inconclusive result (%-change mean confidence interval includes 0). total cycles in shared programs: 532577848 -> 532577990 (<.01%) cycles in affected programs: 17486 -> 17628 (0.81%) helped: 2 HURT: 5 helped stats (abs) min: 2 max: 6 x̄: 4.00 x̃: 4 helped stats (rel) min: 0.06% max: 1.81% x̄: 0.93% x̃: 0.93% HURT stats (abs) min: 6 max: 50 x̄: 30.00 x̃: 26 HURT stats (rel) min: 0.55% max: 2.17% x̄: 1.19% x̃: 1.02% 95% mean confidence interval for cycles value: -1.06 41.63 95% mean confidence interval for cycles %-change: -0.58% 1.74% Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-08-04 01:12:03 -07:00
Ian Romanick	d49eab2757	nir: Rearrange bcsel with two bcsel sources All Gen platforms had pretty similar results. (Skylake shown) total instructions in shared programs: 14277220 -> 14277216 (<.01%) instructions in affected programs: 422 -> 418 (-0.95%) helped: 2 HURT: 0 total cycles in shared programs: 532577908 -> 532577848 (<.01%) cycles in affected programs: 2800 -> 2740 (-2.14%) helped: 2 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-08-04 01:12:03 -07:00
Ian Romanick	b92fded6eb	nir: Collapse more repeated bcsels on the same argument All Gen platforms had pretty similar results. (Skylake shown) total instructions in shared programs: 14277230 -> 14277220 (<.01%) instructions in affected programs: 751 -> 741 (-1.33%) helped: 4 HURT: 0 helped stats (abs) min: 2 max: 3 x̄: 2.50 x̃: 2 helped stats (rel) min: 1.23% max: 1.40% x̄: 1.32% x̃: 1.32% 95% mean confidence interval for instructions value: -3.42 -1.58 95% mean confidence interval for instructions %-change: -1.47% -1.17% Instructions are helped. total cycles in shared programs: 532577947 -> 532577908 (<.01%) cycles in affected programs: 10641 -> 10602 (-0.37%) helped: 4 HURT: 3 helped stats (abs) min: 1 max: 40 x̄: 13.75 x̃: 7 helped stats (rel) min: 0.11% max: 3.08% x̄: 1.10% x̃: 0.60% HURT stats (abs) min: 2 max: 8 x̄: 5.33 x̃: 6 HURT stats (rel) min: 0.13% max: 0.55% x̄: 0.30% x̃: 0.23% 95% mean confidence interval for cycles value: -20.69 9.55 95% mean confidence interval for cycles %-change: -1.63% 0.63% Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-04 01:12:03 -07:00
Ian Romanick	408330ed48	nir: Don't compare i2f or u2i with zero Broadwell and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14277620 -> 14277230 (<.01%) instructions in affected programs: 36905 -> 36515 (-1.06%) helped: 101 HURT: 6 helped stats (abs) min: 1 max: 6 x̄: 4.46 x̃: 6 helped stats (rel) min: 0.32% max: 7.69% x̄: 1.80% x̃: 1.51% HURT stats (abs) min: 1 max: 28 x̄: 10.00 x̃: 1 HURT stats (rel) min: 0.33% max: 1.74% x̄: 0.68% x̃: 0.47% 95% mean confidence interval for instructions value: -4.59 -2.70 95% mean confidence interval for instructions %-change: -1.90% -1.41% Instructions are helped. total cycles in shared programs: 532580716 -> 532577947 (<.01%) cycles in affected programs: 940575 -> 937806 (-0.29%) helped: 92 HURT: 12 helped stats (abs) min: 2 max: 158 x̄: 51.04 x̃: 62 helped stats (rel) min: 0.24% max: 3.99% x̄: 2.14% x̃: 2.41% HURT stats (abs) min: 10 max: 1112 x̄: 160.58 x̃: 63 HURT stats (rel) min: 0.06% max: 21.90% x̄: 4.22% x̃: 0.20% 95% mean confidence interval for cycles value: -50.66 -2.59 95% mean confidence interval for cycles %-change: -2.09% -0.73% Cycles are helped. total spills in shared programs: 8116 -> 8124 (0.10%) spills in affected programs: 200 -> 208 (4.00%) helped: 0 HURT: 2 total fills in shared programs: 11086 -> 11094 (0.07%) fills in affected programs: 436 -> 444 (1.83%) helped: 0 HURT: 2 Ivy Bridge and Haswell had similar results. (Haswell shown) total instructions in shared programs: 12979054 -> 12978067 (<.01%) instructions in affected programs: 33633 -> 32646 (-2.93%) helped: 120 HURT: 2 helped stats (abs) min: 1 max: 13 x̄: 8.53 x̃: 13 helped stats (rel) min: 0.30% max: 16.67% x̄: 4.55% x̃: 3.17% HURT stats (abs) min: 18 max: 18 x̄: 18.00 x̃: 18 HURT stats (rel) min: 1.15% max: 2.84% x̄: 2.00% x̃: 2.00% 95% mean confidence interval for instructions value: -9.19 -6.99 95% mean confidence interval for instructions %-change: -5.27% -3.62% Instructions are helped. 
total cycles in shared programs: 411212880 -> 411199636 (<.01%) cycles in affected programs: 696441 -> 683197 (-1.90%) helped: 107 HURT: 5 helped stats (abs) min: 2 max: 864 x̄: 124.90 x̃: 146 helped stats (rel) min: 0.03% max: 29.20% x̄: 8.58% x̃: 5.88% HURT stats (abs) min: 2 max: 50 x̄: 24.00 x̃: 22 HURT stats (rel) min: 0.01% max: 5.35% x̄: 1.29% x̃: 0.25% 95% mean confidence interval for cycles value: -136.96 -99.54 95% mean confidence interval for cycles %-change: -9.75% -6.53% Cycles are helped. total spills in shared programs: 78623 -> 78631 (0.01%) spills in affected programs: 66 -> 74 (12.12%) helped: 0 HURT: 2 total fills in shared programs: 80104 -> 80108 (<.01%) fills in affected programs: 133 -> 137 (3.01%) helped: 0 HURT: 2 No changes on Sandy Bridge, Iron Lake, or GM45. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-08-04 01:12:03 -07:00
Ian Romanick	a3845616a2	nir: Remove f2i(i2f(x)) conversions Broadwell and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14277978 -> 14277620 (<.01%) instructions in affected programs: 36957 -> 36599 (-0.97%) helped: 76 HURT: 1 helped stats (abs) min: 2 max: 90 x̄: 4.89 x̃: 4 helped stats (rel) min: 0.44% max: 5.88% x̄: 1.04% x̃: 0.87% HURT stats (abs) min: 14 max: 14 x̄: 14.00 x̃: 14 HURT stats (rel) min: 0.36% max: 0.36% x̄: 0.36% x̃: 0.36% 95% mean confidence interval for instructions value: -7.06 -2.24 95% mean confidence interval for instructions %-change: -1.28% -0.77% Instructions are helped. total cycles in shared programs: 532584581 -> 532580716 (<.01%) cycles in affected programs: 973591 -> 969726 (-0.40%) helped: 76 HURT: 1 helped stats (abs) min: 2 max: 9940 x̄: 159.80 x̃: 32 helped stats (rel) min: <.01% max: 8.70% x̄: 1.15% x̃: 1.19% HURT stats (abs) min: 8280 max: 8280 x̄: 8280.00 x̃: 8280 HURT stats (rel) min: 2.10% max: 2.10% x̄: 2.10% x̃: 2.10% 95% mean confidence interval for cycles value: -386.98 286.59 95% mean confidence interval for cycles %-change: -1.41% -0.81% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 8127 -> 8116 (-0.14%) spills in affected programs: 108 -> 97 (-10.19%) helped: 1 HURT: 0 total fills in shared programs: 11090 -> 11086 (-0.04%) fills in affected programs: 440 -> 436 (-0.91%) helped: 1 HURT: 1 Haswell total instructions in shared programs: 12979174 -> 12979054 (<.01%) instructions in affected programs: 9040 -> 8920 (-1.33%) helped: 14 HURT: 1 helped stats (abs) min: 2 max: 34 x̄: 8.79 x̃: 6 helped stats (rel) min: 0.41% max: 7.04% x̄: 2.66% x̃: 1.14% HURT stats (abs) min: 3 max: 3 x̄: 3.00 x̃: 3 HURT stats (rel) min: 0.19% max: 0.19% x̄: 0.19% x̃: 0.19% 95% mean confidence interval for instructions value: -13.58 -2.42 95% mean confidence interval for instructions %-change: -3.94% -1.01% Instructions are helped. 
total cycles in shared programs: 411227148 -> 411212880 (<.01%) cycles in affected programs: 630506 -> 616238 (-2.26%) helped: 15 HURT: 0 helped stats (abs) min: 2 max: 11192 x̄: 951.20 x̃: 38 helped stats (rel) min: <.01% max: 16.01% x̄: 3.92% x̃: 0.17% 95% mean confidence interval for cycles value: -2544.28 641.88 95% mean confidence interval for cycles %-change: -6.89% -0.94% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 78626 -> 78623 (<.01%) spills in affected programs: 42 -> 39 (-7.14%) helped: 1 HURT: 0 total fills in shared programs: 80111 -> 80104 (<.01%) fills in affected programs: 140 -> 133 (-5.00%) helped: 1 HURT: 1 Ivy Bridge total instructions in shared programs: 11684101 -> 11684030 (<.01%) instructions in affected programs: 3080 -> 3009 (-2.31%) helped: 4 HURT: 1 helped stats (abs) min: 5 max: 59 x̄: 18.50 x̃: 5 helped stats (rel) min: 6.47% max: 7.04% x̄: 6.87% x̃: 6.99% HURT stats (abs) min: 3 max: 3 x̄: 3.00 x̃: 3 HURT stats (rel) min: 0.15% max: 0.15% x̄: 0.15% x̃: 0.15% 95% mean confidence interval for instructions value: -45.59 17.19 95% mean confidence interval for instructions %-change: -9.38% -1.56% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 258407697 -> 258389653 (<.01%) cycles in affected programs: 328323 -> 310279 (-5.50%) helped: 5 HURT: 0 helped stats (abs) min: 32 max: 14908 x̄: 3608.80 x̃: 32 helped stats (rel) min: 1.26% max: 17.22% x̄: 9.30% x̃: 10.60% 95% mean confidence interval for cycles value: -11616.71 4399.11 95% mean confidence interval for cycles %-change: -16.56% -2.03% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 4537 -> 4528 (-0.20%) spills in affected programs: 64 -> 55 (-14.06%) helped: 1 HURT: 0 total fills in shared programs: 4823 -> 4815 (-0.17%) fills in affected programs: 189 -> 181 (-4.23%) helped: 1 HURT: 1 Sandy Bridge total instructions in shared programs: 10488464 -> 10488449 (<.01%) instructions in affected programs: 272 -> 257 (-5.51%) helped: 3 HURT: 0 helped stats (abs) min: 5 max: 5 x̄: 5.00 x̃: 5 helped stats (rel) min: 5.49% max: 5.56% x̄: 5.51% x̃: 5.49% total cycles in shared programs: 150263359 -> 150263263 (<.01%) cycles in affected programs: 7978 -> 7882 (-1.20%) helped: 3 HURT: 0 helped stats (abs) min: 32 max: 32 x̄: 32.00 x̃: 32 helped stats (rel) min: 1.15% max: 1.23% x̄: 1.20% x̃: 1.23% No changes on Iron Lake or GM45. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-08-04 01:12:03 -07:00
Ian Romanick	ea6c276436	nir: Mark the 0.0 < abs(a) transformation as imprecise Unlike the much older -abs(a) >= 0.0 transformation, this is not precise. The behavior changes if the source is NaN. No shader-db changes on any platform. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-08-04 01:12:03 -07:00
Marek Olšák	4bad50ded9	radeonsi: cosmetic changes	2018-08-04 03:10:30 -04:00
Marek Olšák	6508b93d78	st/mesa: expose & set limits for AMD_framebuffer_multisample_advanced Reviewed-by: Brian Paul <brianp@vmware.com>	2018-08-04 02:47:58 -04:00
Marek Olšák	7f587b57f7	st/mesa: add renderbuffer support for AMD_framebuffer_multisample_advanced Reviewed-by: Brian Paul <brianp@vmware.com>	2018-08-04 02:46:55 -04:00
Marek Olšák	8e3d0019e1	st/mesa: pass storage_sample_count parameter into st_choose_format Reviewed-by: Brian Paul <brianp@vmware.com>	2018-08-04 02:46:55 -04:00
Marek Olšák	459f05c7ec	mesa: add functional FBO changes for AMD_framebuffer_multisample_advanced - relax FBO completeness rules - validate sample counts Reviewed-by: Brian Paul <brianp@vmware.com>	2018-08-04 02:46:55 -04:00
Marek Olšák	328c1c8d99	mesa: add gl_renderbuffer::NumStorageSamples Reviewed-by: Brian Paul <brianp@vmware.com>	2018-08-04 02:46:55 -04:00
Marek Olšák	a96e946d25	mesa: implement glGet for AMD_framebuffer_multisample_advanced Reviewed-by: Brian Paul <brianp@vmware.com>	2018-08-04 02:46:55 -04:00
Marek Olšák	3d6900d76e	glapi: define AMD_framebuffer_multisample_advanced and add its functions Reviewed-by: Brian Paul <brianp@vmware.com>	2018-08-04 02:46:55 -04:00
Marek Olšák	2d115056d3	mesa: add storageSamples parameter to renderbuffer functions It's just passed to other functions but otherwise unused. It will be used in following commits. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-08-04 02:46:55 -04:00
Marek Olšák	f7d42ee7d3	include: update GL & GLES headers (v2) v2: use correct files Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-04 02:43:05 -04:00
Marek Olšák	fd1121e839	amd: remove support for LLVM 5.0 Users are encouraged to switch to LLVM 6.0 released in March 2018. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-03 18:36:11 -04:00
Marek Olšák	461a864316	winsys/amdgpu: pass the BO list via the CS ioctl on DRM >= 3.27.0	2018-08-03 18:35:19 -04:00
Marek Olšák	0f79b2015b	gallium/u_vbuf: handle indirect multidraws correctly and efficiently (v3) v2: need to do MAX{start+count} instead of MAX{count} added piglit tests v3: use malloc Cc: 18.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-08-03 18:30:46 -04:00
Mauro Rossi	1c7a2433b2	android: radv: build vulkan.radv conditionally to radeonsi A problem was reported with arm,arm64 targets build due to missing libLLVM shared library dependency with AOSP; to avoid this issue vulkan.radv is built conditionally only when radeonsi is in BOARD_GPU_DRIVERS Fixes: `0ca153f869` ("android: radv: enable build of vulkan.radv HAL module") Reported-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: "18.2" <mesa-stable@lists.freedesktop.org>	2018-08-03 20:09:16 +02:00
Roland Scheidegger	c72f91deba	util: return 0 for NaNs in float_to_ubyte d3d10 requires NaNs to get converted to 0 for float->unorm conversions (and float->int etc.). GL spec probably doesn't care in general, but it would make sense to have reasonable behavior in any case imho - the old code was converting negative NaNs to 0, and positive NaNs to 255. (Note that using float comparison isn't actually all that much more effort in any case, at least with sse2 it's just float comparison (ucomiss) instead of int one - I converted the second comparison to float too simply because it saves the probably somewhat expensive transfer of the float from simd to int domain (with sse2 via stack), so the generated code actually has 2 fewer instructions, although float comparisons are more expensive than int ones.) Reviewed-by: Brian Paul <brianp@vmware.com>	2018-08-03 17:07:38 +02:00
Jason Ekstrand	1d900e55fd	anv/pipeline: Disable FS dispatch for pointless fragment shaders Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-03 05:52:23 -07:00
Timothy Arceri	d5175d21c7	nir: add fall through comment to nir_gather_info This stops Coverity reporting a defect and helps make the code less error-prone. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-03 09:30:57 +10:00
Dan Willemsen	12e3334f1e	CleanSpec.mk: Remove HOST_OUT_release This is a forward port of a patch from the AOSP/master tree: `bd633f11de`%5E%21/ Which replaces HOST_OUT_release with HOST_OUT As per Dan's explanation, the current code was incorrect to use $(HOST_OUT_release) as $(HOST_OUT) will be set properly for whether the current build that's being cleaned during incrementals is using host debug or release builds. Additionally Dan noted it was incredibly uncommon to use a debug host build, as there was never a shortcut and one had to set an environment variable manually. Thus it was rarely if ever tested. Change-Id: I7972c0a50fa3520dcfa962d6dd7e602bfe22368d Cc: Rob Herring <rob.herring@linaro.org> Cc: Alistair Strachan <astrachan@google.com> Cc: Marissa Wall <marissaw@google.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Rob Herring <robh@kernel.org>	2018-08-02 15:42:40 -06:00
Sumit Semwal	d0b63b6583	Android.common.mk: define HAVE_TIMESPEC_GET This is a forward port of a patch from the AOSP/master tree: `bd30b663f5`%5E%21/ Since https://android-review.googlesource.com/c/718518 added timespec_get() to bionic, mesa3d doesn't build due to redefinition of timespec_get(). Avoid redefinition by defining HAVE_TIMESPEC_GET flag. Test: build and boot tested db820c to UI. Change-Id: I3dcc8034b48785e45cd3fa50e4d9cf2c684694a0 Cc: Rob Herring <rob.herring@linaro.org> Cc: Alistair Strachan <astrachan@google.com> Cc: Marissa Wall <marissaw@google.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Rob Herring <robh@kernel.org>	2018-08-02 15:42:27 -06:00
Dan Willemsen	dc030d1ec9	util: Android.mk: Convert implicit rules to static pattern rules This is a partial cherry-pick from AOSP's mesa3d tree: `a88dcf769e`%5E%21/ "We're deprecating make implicit rules, preferring static pattern rules, or just regular rules." Without this patch, the freedesktop/master branch won't build in the AOSP environment, and this patch corrects that, as tested on the Dragonboard 820c. The i965 portion of the patch this is based on collided badly, and I'm not sure how to best forward port it. However, so far we don't see build issues without that portion. Comments or feedback would be appreciated! Change-Id: Id6dfd0d018cbd665fa19d80c14abd5f75fa10b8a Cc: Rob Herring <rob.herring@linaro.org> Cc: Alistair Strachan <astrachan@google.com> Cc: Marissa Wall <marissaw@google.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Rob Herring <robh@kernel.org>	2018-08-02 15:42:23 -06:00
Darren Powell	726a48c94f	radeonsi: add new R600_DEBUG test "testclearbufperf" Signed-off-by: Darren Powell <darren.powell@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-08-02 16:09:22 -04:00
Brian Paul	977638006b	mesa: add switch case for GL 2.0 in _mesa_compute_version() Previously, I added a switch case for GL 2.1 (ed7a0770b881791dd697f3). I don't know of any driver which only supports GL 2.0, but adding this switch case avoids a failure if the app queries GL_SHADING_LANGUAGE_VERSION. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-02 13:20:00 -06:00
Andres Gomez	2d4d139877	intel/tools: add error2aub creation into autotools Tarball distribution is done through "make distcheck". We include the meson targets also into autotools so they won't fail when building from the tarball. Fixes: `6a60beba40` ("intel/tools: Add an error state to aub translator") Cc: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Dylan Baker <dylan.c.baker@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-02 21:15:57 +03:00
Jason Ekstrand	7ef6cd0ee8	anv/pipeline: Do cross-stage linking optimizations This appears to help the Aztec Ruins benchmark by about 2% on my Kaby Lake gt2 laptop. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	a5bffa061d	anv/pipeline: Pull most of the anv_pipeline_compile_* into common code This leaves us with a series of little anv_pipeline_compile_* functions which each take a compiler object, a mem_ctx, the stage to compile, and the previous stage for VUE linking purposes. Some of them do interesting things but most are little more than wrappers around brw_compile_*. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	5351339554	anv/pipeline: Add a separate "link" stage This breaks compilation up a bit into "link" and "compile". In the "link" stage, new anv_pipeline_link_* helpers are called which are responsible for setting up the binding table and doing anything needed to properly link with the next stage in the pipeline if one exists. They are called in reverse order starting with the fragment shader so you can assume linking in later stages is already done. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	5b196f39bd	anv/pipeline: Compile to NIR in compile_graphics This pulls the SPIR-V to NIR step out into common code. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	946fcd02a9	anv/pipeline: Recompile all shaders if any are missing from the cache Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	f76d6d8a63	anv/pipeline: Drop anv_pipeline_add_compiled_stage We can set active_stages much more directly and then it's just candy around setting pipeline->stages[stage]. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	703a24932a	anv/pipeline: Pull shader compilation out into a helper. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	f3c59ca947	anv/pipeline: Call anv_pipeline_compile_* in a loop Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	bdc3565c8c	anv/pipeline: Hash the entire pipeline in one go Instead of hashing each stage separately (and TES and TCS together), we hash the entire pipeline. This means we'll get fewer cache hits if they, for instance, re-use the same VS over and over again but it also means we can now safely do cross-stage optimizations. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	4a8236ae17	anv/pipeline: Populate keys up-front Instead of having each anv_pipeline_compile_* function populate the shader key, make it part of the anv_pipeline_stage struct and fill it out up-front. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jason Ekstrand	76503b319a	anv/pipeline: Add a helper struct for per-stage info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-02 10:29:20 -07:00
Jon Turney	a48c0659e1	meson: use correct keyword to fix a meson warning With a sufficiently recent meson, the following warning is produced: WARNING: Passed invalid keyword argument "extra_args". WARNING: This will become a hard error in the future. It seems that compiler.links(args:) is meant here. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-02 18:12:49 +01:00
Andres Gomez	3013e22717	docs: add 18.3.0-devel release notes template Signed-off-by: Andres Gomez <agomez@igalia.com>	2018-08-02 18:15:33 +03:00
Andres Gomez	873767cf42	mesa: bump version to 18.3.0-devel Signed-off-by: Andres Gomez <agomez@igalia.com>	2018-08-02 18:00:15 +03:00
Eric Engestrom	44265cc65e	egl/main: fix indentation Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Frank Binns <frank.binns@imgtec.com>	2018-08-02 12:54:05 +01:00
Eric Engestrom	dd007d1c2a	loader: fix indentation Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Frank Binns <frank.binns@imgtec.com>	2018-08-02 12:53:58 +01:00
Vlad Golovkin	9d3a2394e4	swr: Remove unnecessary memset call Zeroing memory after calloc is not necessary. This also avoids a possible crash when allocation fails, because memset is called before checking screen for NULL. Fixes: `a29d63ecf7` "swr: refactor swr_create_screen to allow for proper cleanup on error" Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-02 11:13:40 +01:00
Andres Gomez	8d3ccdbb9b	mesa: replace binary constants with hexadecimal constants The binary constant notation "0b" is a GCC extension. Instead, we use hexadecimal notation to fix the MSVC 2013 build: Compiling src\mesa\main\texcompress_astc.cpp ... texcompress_astc.cpp src\mesa\main\texcompress_astc.cpp(111) : error C2059: syntax error : 'bad suffix on number' ... src\mesa\main\texcompress_astc.cpp(1007) : fatal error C1003: error count exceeds 100; stopping compilation scons: *** [build\windows-x86-debug\mesa\main\texcompress_astc.obj] Error 2 scons: building terminated because of errors. v2: Fix wrong conversion (Ilia). Fixes: `38ab39f650` ("mesa: add ASTC 2D LDR decoder") Cc: Marek Olšák <marek.olsak@amd.com> Cc: Brian Paul <brianp@vmware.com> Cc: Roland Scheidegger <sroland@vmware.com> Cc: Mike Lothian <mike@fireburn.co.uk> Cc: Gert Wollny <gert.wollny@collabora.com> Cc: Dieter Nützel <Dieter@nuetzel-hh.de> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-02 10:06:44 +03:00
Andres Gomez	1090e97e77	ddebug: use util_snprintf() in dd_get_debug_filename_and_mkdir Instead of plain snprintf(). To fix the MSVC 2013 build: Compiling src\gallium\auxiliary\driver_ddebug\dd_draw.c ... dd_draw.c c:\projects\mesa\src\gallium\auxiliary\driver_ddebug\dd_util.h(60) : warning C4013: 'snprintf' undefined; assuming extern returning int ... gallium.lib(dd_draw.obj) : error LNK2001: unresolved external symbol _snprintf build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll : fatal error LNK1120: 1 unresolved externals scons: *** [build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll] Error 1120 scons: building terminated because of errors. Fixes: `6ff0c6f4eb` ("gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times") Cc: Marek Olšák <marek.olsak@amd.com> Cc: Brian Paul <brianp@vmware.com> Cc: Roland Scheidegger <sroland@vmware.com> Cc: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-08-02 10:06:44 +03:00
Andres Gomez	d7694136d3	util/queue: use util_snprintf() in util_queue_init Instead of plain snprintf(). To fix the MSVC 2013 build: Compiling src\util\u_queue.c ... u_queue.c src\util\u_queue.c(325) : warning C4013: 'snprintf' undefined; assuming extern returning int ... mesautil.lib(u_queue.obj) : error LNK2001: unresolved external symbol _snprintf scons: building terminated because of errors. Fixes: `b238e33bc9` ("util/queue: add a process name into a thread name") Cc: Marek Olšák <marek.olsak@amd.com> Cc: Brian Paul <brianp@vmware.com> Cc: Roland Scheidegger <sroland@vmware.com> Cc: Timothy Arceri <tarceri@itsqueeze.com> Cc: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-08-02 10:06:44 +03:00
Andres Gomez	18d9dc179f	gallium/aux/util: use util_snprintf() in test_texture_barrier Instead of plain snprintf(). To fix the MSVC 2013 build: Compiling src\gallium\auxiliary\util\u_tests.c ... u_tests.c src\gallium\auxiliary\util\u_tests.c(624) : warning C4013: 'snprintf' undefined; assuming extern returning int ... gallium.lib(u_tests.obj) : error LNK2019: unresolved external symbol _snprintf referenced in function _test_texture_barrier build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll : fatal error LNK1120: 1 unresolved externals scons: *** [build\windows-x86-debug\gallium\targets\graw-gdi\graw.dll] Error 1120 scons: building terminated because of errors. Fixes: `56342c97ee` ("gallium/u_tests: test FBFETCH and shader-based blending with MSAA") Cc: Marek Olšák <marek.olsak@amd.com> Cc: Brian Paul <brianp@vmware.com> Cc: Roland Scheidegger <sroland@vmware.com> Cc: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-08-02 10:06:44 +03:00
Andres Gomez	9d220fa950	glsl: use util_snprintf() Instead of plain snprintf(). To fix the MSVC 2013 build. Fixes: `6ff0c6f4eb` ("gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times") Cc: Marek Olšák <marek.olsak@amd.com> Cc: Brian Paul <brianp@vmware.com> Cc: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-08-02 10:06:44 +03:00
Jordan Justen	8fcdb71d8c	intel/compiler: Add brw_get_compiler_config_value for disk cache During code review, Jason pointed out that: `2b3064c073` "i965, anv: Use INTEL_DEBUG for disk_cache driver flags" Didn't account for INTEL_SCALER_* environment variables. To fix this, let the compiler return the disk_cache driver flags. Another possible fix would be to pull the INTEL_SCALER_* into INTEL_DEBUG bits, but as we are currently using 41 of 64 bits, I didn't think it was a good use of 4 more of these bits. (5 since INTEL_PRECISE_TRIG needs to be accounted for as well.) Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-01 23:49:16 -07:00
Jordan Justen	3887700dfd	i965: Disable shader cache with INTEL_DEBUG=shader_time Shader time hard codes an index of the shader time buffer within the gen program. In order to support shader time in the disk shader cache, we'd need to add the shader time index into the program key. This should work, but probably is not worth it for this particular debug feature. Therefore, let's just disable the disk shader cache if the shader time debug feature is used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106382 Fixes: `96fe36f7ac` "i965: Enable disk shader cache by default" Cc: Eero Tamminen <eero.t.tamminen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-01 23:30:49 -07:00
Timothy Arceri	bea4722c2e	glsl: make a copy of array indices that are used to deref a function out param Fixes new piglit test: tests/spec/glsl-1.20/execution/qualifiers/vs-out-conversion-int-to-float-vec4-index.shader_test Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-08-02 11:06:28 +10:00
Jason Ekstrand	de9e5cf35a	anv/pipeline: Add populate_tcs/tes_key helpers They don't really do anything interesting, but it's more consistent this way. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	e621f57556	anv/pipeline: Rework the parameters to populate_wm_prog_key Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	b2e0b0dad6	anv/pipeline: More aggressively optimize away color attachments Instead of just looking at the number of color attachments, look at which ones are actually used by the subpass. This lets us potentially throw away chunks of the fragment shader. In DXVK, for example, all subpasses have 8 attachments and most are VK_ATTACHMENT_UNUSED so this is very helpful in that case. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	80bc0b728c	anv: Restrict the number of color regions to those actually written The back-end compiler emits the number of color writes specified by wm_prog_key::nr_color_regions regardless of what nir_store_outputs we have. Once we've gone through and figured out which render targets actually exist and are written by the shader, we should restrict the key to avoid extra RT write messages. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	4d57e543b8	anv/pipeline: Fix up deref modes if we delete a FS output With the new deref instructions, we have to keep the modes consistent between the derefs and the variables they reference. Since we remove outputs by changing them to local variables, we need to run the fixup pass to fix the modes. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	7f75cf2a94	nir/lower_indirect: Bail early if modes == 0 There's no point in walking the program if we're never going to actually lower anything. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	4434591bf5	intel/nir: Call nir_lower_io_to_scalar_early Shader-db results on Kaby Lake: total instructions in shared programs: 15166953 -> 15073611 (-0.62%) instructions in affected programs: 2390284 -> 2296942 (-3.91%) helped: 16469 HURT: 505 total loops in shared programs: 4954 -> 4951 (-0.06%) loops in affected programs: 3 -> 0 helped: 3 HURT: 0 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	b0bb547f78	intel/nir: Split IO arrays into elements The NIR nir_lower_io_arrays_to_elements pass attempts to split I/O variables which are arrays or matrices into a sequence of separate variables. This can help link-time optimization by allowing us to remove varyings at a more granular level. Shader-db results on Kaby Lake: total instructions in shared programs: 15177645 -> 15168494 (-0.06%) instructions in affected programs: 79857 -> 70706 (-11.46%) helped: 392 HURT: 0 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	57804efa88	i965/fs: Flag all slots of a flat input as flat Otherwise, only the first vec4 of a matrix or other complex type will get marked as flat and we'll interpolate the others. This was caught by a dEQP test which started failing because it did a SSO vs. non-SSO comparison. Previously, we did the interpolation wrong consistently in both versions. However, with one of Tim Arceri's NIR linking patches, we started splitting the matrix input into vectors at link time in the non-SSO version and it started getting correctly interpolated which didn't match the broken SSO version. As of this commit, they both get correctly interpolated. Fixes: `e61cc87c75` "i965/fs: Add a flat_inputs field to prog_data" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Jason Ekstrand	4e060385e9	intel/nir: Use the correct scalar stage for consumers when linking Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-01 18:02:28 -07:00
Dave Airlie	70c34a1bd2	docs: update 18.2.0 release notes for virgl	2018-08-02 08:43:56 +10:00
Dylan Baker	34998aae18	nir/meson: fix c vs cpp args for nir test Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-01 12:51:22 -07:00
Dylan Baker	2877b6555c	gallium: fix ddebug on windows By including the proper headers for getpid and for mkdir. Fixes: `6ff0c6f4eb` ("gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-01 12:50:25 -07:00
Dylan Baker	17f49950da	util: move process.[ch] to u_process.[ch] On windows process.h is a system provided header, and it's required in include/c11/threads_win32.h. This header interferes with searching for that header, and results in windows build warnings with scons, but errors in meson which doesn't allow implicit function declarations. Just rename process to u_process, which follows the style of utils anyway. Fixes: `2e1e6511f7` ("util: extract get_process_name from xmlconfig.c") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-08-01 12:47:16 -07:00
Marek Olšák	cb6b241c30	ac,radeonsi: reduce optimizations for complex compute shaders on older APUs (v2) To make dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 finish sooner on the older CPUs. (otherwise it gets killed and we fail the test) Acked-by: Dave Airlie <airlied@gmail.com>	2018-08-01 15:25:18 -04:00
Eric Anholt	c2eab33b08	v3d: Actually put the "%s" in the snprintf. I missed an important part when porting the change over, fixing my compiler warning but breaking -Werror=format-security. Fixes: `e6ff5ac446` ("v3d: use snprintf(..., "%s", ...) instead of strncpy") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107443	2018-08-01 11:39:19 -07:00
Juan A. Suarez Romero	d742270564	vc4: Fix automake linking error. CXXLD gallium_dri.la ../../../../src/gallium/drivers/vc4/.libs/libvc4.a(vc4_cl_dump.o): In function `vc4_dump_cl': src/gallium/drivers/vc4/vc4_cl_dump.c:45: undefined reference to `clif_dump_init' src/gallium/drivers/vc4/vc4_cl_dump.c:82: undefined reference to `clif_dump_destroy' ../../../../src/broadcom/cle/.libs/libbroadcom_cle.a(cle_libbroadcom_cle_la-v3d_decoder.o): In function `v3d_field_iterator_next': src/broadcom/cle/v3d_decoder.c:902: undefined reference to `clif_lookup_bo' Fixes: `e92959c4e0` ("v3d: Pass the whole clif_dump structure to v3d_print_group().") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107423 CC: Eric Anholt <eric@anholt.net> Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-08-01 20:33:07 +02:00
Juan A. Suarez Romero	810c9a4eba	scons: require scons 2.4 or greater There is a bug with scons 2.3, used in Travis, where it fails to detect some C functions. Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-08-01 20:33:00 +02:00
Juan A. Suarez Romero	fea0b92042	travis: install scons from pip The ubuntu version provided by Travis is a bit old, and does not correctly detect some C functions. Use a more modern version through pip. Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-08-01 20:32:42 +02:00
Marek Olšák	26d3e2b4b0	docs: mark ARB_ES3_2_compatibility as done for radeonsi	2018-08-01 11:38:54 -04:00
Lionel Landwerlin	2477e516d9	intel: tools: aubwrite: split gen[89] from gen10+ Gen10+ has an additional bit in MI_BATCH_BUFFER_END to signal the end of the context image. We select the largest size for the context image regardless of the generation. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-08-01 15:31:56 +01:00
Mathieu Bridon	91939255a7	python: Use the unicode_escape codec Python 2 had string_escape and unicode_escape codecs. Python 3 only has the latter. These work the same as far as we're concerned, so let's use the future-proof one. However, the rest of the code expects unicode strings, so we need to decode them again. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-01 14:26:19 +01:00
Mathieu Bridon	ad363913e6	python: Explicitly add the 'L' suffix on Python 3 Python 2 had two integer types: int and long. Python 3 dropped the latter, as it made the int type automatically support bigger numbers. As a result, Python 3 lost the 'L' suffix on integer literals. This probably doesn't make much difference when compiling the generated C code, but adding it explicitly means that both Python 2 and 3 generate the exact same C code anyway, which makes it easier to compare and check for discrepancies when moving to Python 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-01 14:26:19 +01:00
Mathieu Bridon	a71df20855	python: Explicitly use byte strings In both Python 2 and 3, zlib.Compress.compress() takes a byte string, and returns a byte string as well. In Python 2, the script was working because: 1. string literals were byte strings; 2. opening a file in unicode mode, reading from it, then passing the unicode string to compress() would automatically encode to a byte string; On Python 3, the above two points are not valid any more, so: 1. zlib.Compress.compress() refuses the passed unicode string; 2. compressed_data, defined as an empty unicode string literal, can't be concatenated with the byte string returned by compress(); This commit fixes this by explicitly using byte strings where appropriate, so that the script works on both Python 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-01 14:26:19 +01:00
Mathieu Bridon	8678fe537a	python: Use open(), not file() The latter is a constructor for file objects, but when actually opening a file, using the former is more idiomatic. In addition, file() is not a builtin any more in Python 3, so this makes the script compatible with both Python 2 and Python 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-01 14:26:19 +01:00
Mathieu Bridon	c24d826968	python: Open file in binary mode The XML parser wants byte strings, not unicode strings. In both Python 2 and 3, opening a file without specifying the mode will open it for reading in text mode ('r'). On Python 2, the read() method of the file object will return byte strings, while on Python 3 it will return unicode strings. Explicitly specifying the binary mode ('rb') makes the behaviour identical in both Python 2 and 3, returning what the XML parser expects. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-01 14:26:19 +01:00
Mathieu Bridon	e40200e0aa	python: Don't abuse hex() The hex() builtin returns a string containing the hexadecimal representation of an integer. When the argument is not an integer, then the function calls that object's __hex__() method, if one is defined. That method is supposed to return a string. While that's not explicitly documented, that string is supposed to be a valid hexadecimal representation for a number. Python 2 doesn't enforce this though, which is why we got away with returning things like 'NIR_TRUE' which are not numbers. In Python 3, the hex() builtin instead calls an object's __index__() method, which itself must return an integer. That integer is then automatically converted to a string with its hexadecimal representation by the rest of the hex() function. As a result, we really can't make this compatible with Python 3 as it is. The solution is to stop using the hex() builtin, and instead use a hex() object method, which can return whatever we want, in Python 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-08-01 14:26:19 +01:00
Mathieu Bridon	12eb5b496b	python: Better get character ordinals In Python 2, iterating over a byte-string yields single-byte strings, and we can pass them to ord() to get the corresponding integer. In Python 3, iterating over a byte-string directly yields those integers. Transforming the byte string into a bytearray gives us a list of the integers corresponding to each byte in the string, removing the need to call ord(). This makes the script compatible with both Python 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-01 14:26:19 +01:00
Mario Kleiner	9bd8b0f700	loader_dri3: Handle mismatched depth 30 formats for Prime renderoffload. Detect if the display (X-Server) gpu and Prime renderoffload gpu prefer different channel ordering for color depth 30 formats ([X/A]BGR2101010 vs. [X/A]RGB2101010) and perform format conversion during the blitImage() detiling op from tiled backbuffer -> linear buffer. For this we need to find the visual (= red channel mask) for the X-Drawable used to display on the server gpu. We use the same proven logic for finding that visual as in commit "egl/x11: Handle both depth 30 formats for eglCreateImage()". This is mostly to allow "NVidia Optimus" at depth 30, as Intel/AMD gpus prefer xRGB2101010 ordering, whereas NVidia gpus prefer xBGR2101010 ordering, so we can offload to nouveau without getting funky colors. Tested on Intel single gpu, NVidia single gpu, Intel + NVidia prime offload with DRI3/Present. Note: An unintended but pleasant surprise of this patch is that it also seems to make the modesetting-ddx of server 1.20.0 work at depth 30 on nouveau, at least with unredirected "classic" X rendering, and with redirected desktop compositing under XRender accel, and with OpenGL compositing under GLX. Only X11 compositing via OpenGL + EGL still gives funky colors. modesetting-ddx + glamor are not yet ready to deal with nouveau's ABGR2101010 format, and treat it as ARGB2101010, also exposing X-visuals with ARGB2101010 style channel masks. Seems somehow this triggers the logic in this patch on modesetting-ddx + depth 30 + DRI3 buffer sharing and does the "wrong" channel swizzling that then cancels out the "wrong" swizzling of glamor and we end up with the proper pixel formatting in the scanout buffer :). This so far tested on a NVA5 Tesla card under KDE5 Plasma as shipping with Ubuntu 16.04.4 LTS. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-01 12:55:37 +01:00
Mario Kleiner	61a02729f7	egl/x11: Handle both depth 30 formats for eglCreateImage(). (v4) We need to distinguish if the backing storage of a pixmap is XRGB2101010 or XBGR2101010, as different gpu hw supports different formats. NVidia hw prefers XBGR, whereas AMD and Intel are happy with XRGB. Use the red channel mask of the first depth 30 visual of the x-screen to distinguish which hw format to choose. This fixes desktop composition of color depth 30 windows when the X11 compositor uses EGL. v2: Switch from using the visual of the root window to simply using the first depth 30 visual for the x-screen, as testing shows that each driver only exports either xrgb ordering or xbgr ordering for the channel masks of its depth 30 visuals, so this should be unambiguous and avoid trouble if X ever supports depth 30 pixmaps on screens with a non-depth 30 root window visual. This per Michel's suggestion. v3: No change to v2, but spent some time testing this more on AMD hw, with my software hacked up to intentionally choose pixel formats/visual with the non-preferred xBGR2101010 ordering on the ati-ddx, also with a standard non-OpenGL X-Window with depth 30 visual, to make sure that things show up properly with the right colors on the screen when going through EGL+OpenGL based compositing on KDE-5. In other words, to confirm that my explanation to the v2 patch on the mailing list of why it should work and the actual practice agree (or possibly that I am good at fooling myself during testing ;). v4: Drop the local `red_mask` and just `return visual->red_mask`/ `return 0`, as suggested by Eric Engestrom. Rebased onto current master, to take the cleanup via the new function dri2_format_for_depth() into account. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-01 12:55:37 +01:00
Daniel Stone	753f603b52	gbm: Add support for 10bpp BGR formats Add support for XBGR2101010 and ABGR2101010 formats. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com> Tested-by: Mario Kleiner <mario.kleiner.de@gmail.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-08-01 12:55:37 +01:00
Daniel Stone	275b23ed0e	egl/wayland: Add 10bpc BGR configs Add support for XBGR2101010 and ABGR2101010. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com> Tested-by: Mario Kleiner <mario.kleiner.de@gmail.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-08-01 12:55:37 +01:00
Iago Toral Quiroga	471bce5689	intel/compiler: implement 8-bit constant load Fixes VK-GL-CTS CL#2567 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-01 08:08:15 +02:00
Iago Toral Quiroga	7e6c8b0cb7	intel/compiler: add setup_imm_(u)b helpers The hardware doesn't support byte immediates, so similar to setup_imm_df() for doubles, these helpers work by loading the constant value into a VGRF. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-08-01 08:08:15 +02:00
Rhys Perry	bd56e117ff	glsl: fix function inlining with opaque parameters Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-08-01 00:10:01 -04:00
Rhys Perry	f903bce8a6	glsl, glsl_to_tgsi: fix sampler/image constants Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-08-01 00:10:01 -04:00
Rhys Perry	ea2a3f52b4	glsl: allow ?: operator with images and samplers when bindless is enabled Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-08-01 00:10:01 -04:00
Rhys Perry	42d4acb39d	glsl_to_tgsi: allow bound samplers and images to be used as l-values Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-08-01 00:10:00 -04:00
Rhys Perry	00589be6c4	gallium: add new SAMP2HND and IMG2HND opcodes This commit does not add support for the opcodes in gallivm or tgsi_to_nir.c Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-08-01 00:10:00 -04:00
Dave Airlie	1fb388cd20	docs/features: update virgl GLES 3.1/3.2 status virgl now exposes GLES3.1 and 3.2	2018-08-01 14:09:11 +10:00
Dave Airlie	e2c62170d5	docs/features: update virgl GL 4.3 support virgl with up to date host renderer now exposes GL 4.3.	2018-08-01 14:08:33 +10:00
Erik Faye-Lund	21e33f4a10	virgl: enable FBFETCH if virglrenderer supports it This fixes the following dEQP-GLES31 cases from NotSupported to Pass for me: - dEQP-GLES31.functional.blend_equation_advanced.state_query.* - dEQP-GLES31.functional.blend_equation_advanced.basic.* - dEQP-GLES31.functional.blend_equation_advanced.srgb.* - dEQP-GLES31.functional.blend_equation_advanced.msaa.* - dEQP-GLES31.functional.blend_equation_advanced.barrier.* - dEQP-GLES31.functional.draw_buffers_indexed.overwrite_advanced_blend_eq - dEQP-GLES31.functional.state_query.indexed.blend_equation_advanced_* - dEQP-GLES31.functional.debug.negative_coverage..advanced_blend. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-08-01 14:05:22 +10:00
Erik Faye-Lund	7ef86a03f0	virgl: add texture_barrier stub In gallium, supporting FBFETCH means supporting non-coherent fetches, but in virglrenderer, due to technical reasons this is backed by coherent fetches instead. This means we don't need to do anything for the barriers. However, if we don't have a texture_barrier implementation, we get crashes because the non-coherent extensions is exposed. So, let's leave this as a NOP for now. [airlied: I've got a more complete impl of this somewhere, once we land the host side]. Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2018-08-01 14:03:51 +10:00
Dave Airlie	6f5d463a78	virgl: enable robustness if the host exposes it Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-08-01 14:00:38 +10:00
Dave Airlie	2df8b80c4c	virgl: Support ARB_framebuffer_no_attachments This uses new protocol to send the default sizes to the host. Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-08-01 14:00:35 +10:00
Dave Airlie	f8a8ea6a2d	virgl: add initial ARB_compute_shader support This hooks up compute shader creation and launch grid support. Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-08-01 14:00:31 +10:00
Marek Olšák	157c6e8195	util: don't use __builtin_clz unconditionally This fixes the build if __builtin_clz is unsupported. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-31 23:28:01 -04:00
Marek Olšák	c5c6e0187f	ac/surface: fix MSAA corruption on Vega due to FMASK tile swizzle a needle in the haystack? Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-31 22:56:40 -04:00
Eric Anholt	e6ff5ac446	v3d: use snprintf(..., "%s", ...) instead of strncpy Fixes a compiler warning about terminator NUL, based on `f836d799f9` ("intel/decoder: use snprintf(..., "%s", ...) instead of strncpy")	2018-07-31 16:42:11 -07:00
Eric Anholt	3471ce9985	v3d: Add support for the TMUWT instruction. This instruction is used to ensure that TMU stores have been processed before moving on. In particular, you need any TMU ops to be done by the time the shader ends.	2018-07-31 16:05:04 -07:00
Marek Olšák	7d36c866d2	radeonsi: report supported EQAA combinations from is_format_supported Framebuffer without attachments now supports 16 samples. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-07-31 18:28:41 -04:00
Marek Olšák	20dd75a926	radeonsi: use storage_samples instead of color_samples in most places and use pipe_resource::nr_storage_samples instead of r600_texture::num_color_samples. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-07-31 18:28:41 -04:00
Marek Olšák	966f155623	gallium: add storage_sample_count parameter into is_format_supported Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-07-31 18:28:41 -04:00
Marek Olšák	8632626c81	gallium: add pipe_resource::nr_storage_samples, and set it same as nr_samples Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-07-31 18:28:41 -04:00
Marek Olšák	0caf74bbcd	gallium: add PIPE_CAP_FRAMEBUFFER_MSAA_CONSTRAINTS Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-07-31 18:28:41 -04:00
Marek Olšák	55d56dd859	docs: update radeonsi features and release notes	2018-07-31 18:12:37 -04:00
Marek Olšák	ed8b4ed6c4	st/mesa: implement ASTC 2D LDR fallback for all drivers Tested-by: Mike Lothian <mike@fireburn.co.uk> Tested-By: Gert Wollny <gert.wollny@collabora.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2018-07-31 18:09:57 -04:00
Marek Olšák	5fe52044ef	st/mesa: add ETC2 & ASTC fast path for GetTex(Sub)Image Not sure if GL/GLES can hit this path, but it's just decompression. Tested-by: Mike Lothian <mike@fireburn.co.uk> Tested-By: Gert Wollny <gert.wollny@collabora.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2018-07-31 18:09:57 -04:00
Marek Olšák	ebe03d3699	st/mesa: generalize fallback_copy_image for compressed textures in order to support ASTC Tested-by: Mike Lothian <mike@fireburn.co.uk> Tested-By: Gert Wollny <gert.wollny@collabora.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2018-07-31 18:09:57 -04:00
Marek Olšák	c3fafa127a	st/mesa: generalize code for the compressed texture map/unmap fallback in order to support ASTC Tested-by: Mike Lothian <mike@fireburn.co.uk> Tested-By: Gert Wollny <gert.wollny@collabora.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2018-07-31 18:09:57 -04:00
Marek Olšák	3d7e4311bf	st/mesa: use st_compressed_format_fallback more Tested-by: Mike Lothian <mike@fireburn.co.uk> Tested-By: Gert Wollny <gert.wollny@collabora.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2018-07-31 18:09:57 -04:00
Marek Olšák	912e0525be	st/mesa: generalize st_etc_fallback -> st_compressed_format_fallback for ASTC support later Tested-by: Mike Lothian <mike@fireburn.co.uk> Tested-By: Gert Wollny <gert.wollny@collabora.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2018-07-31 18:09:57 -04:00
Marek Olšák	38ab39f650	mesa: add ASTC 2D LDR decoder Tested-by: Mike Lothian <mike@fireburn.co.uk> Tested-By: Gert Wollny <gert.wollny@collabora.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-07-31 18:09:57 -04:00
Dave Airlie	5be352b430	docs/features: mark virgl image features and GL4.2 as done	2018-08-01 08:06:41 +10:00
Gurchetan Singh	9c136e8a07	virgl: also mark sampler views as dirty When texture buffers are used as images in compute shaders, the guest never sees the modified data since the TBO is always marked as clean. Fixes most dEQP-GLES31.functional.image_load_store.buffer.* tests. Example test cases: dEQP-GLES31.functional.image_load_store.buffer.load_store.r32ui dEQP-GLES31.functional.image_load_store.buffer.qualifiers.coherent_r32f dEQP-GLES31.functional.image_load_store.buffer.format_reinterpret.rgba8_rgba8ui Note: virglrenderer side patch also needed to bind TBOs correctly Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-01 08:05:39 +10:00
Dave Airlie	a090df0d5d	virgl: add memory barrier support Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2018-08-01 08:02:35 +10:00
Dave Airlie	6f75058359	virgl: add TXQS support Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2018-08-01 08:02:32 +10:00
Dave Airlie	452eea140d	virgl: add initial images support (v2) v2: add max image samples support Reviewed-by: Gert Wollny <gert.wollny@collabora.com>	2018-08-01 08:02:27 +10:00
Jon Turney	faa29c0e24	Make glXChooseFBConfig handle unspecified sRGB correctly Make glXChooseFBConfig properly handle the case where the only matching configs have the sRGB flag set, but no sRGB attribute is specified. Since `6e06e281`, the sRGBcapable flag is now actually compared, using MATCH_DONT_CARE. `7b0f912e` added defaulting of sRGBcapable to GL_FALSE in __glXInitializeVisualConfigFromTags(), to handle servers which don't report it, but this function is also used by glXChooseFBConfig(), so sRGBcapable is implicitly false when not explicitly specified. (This can cause e.g. glxinfo to fail to find anything matching the simple config it looks for if all the candidates have the sRGB flag set to true. I'm assuming this doesn't happen 'normally' as candidate configs with and without sRGB true are available) Move this defaulting to createConfigsFromProperties(), and set the default for glXChooseFBConfig() in init_fbconfig_for_chooser() to GLX_DONT_CARE. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-31 13:56:13 -04:00
Olivier Fourdan	03a61b977e	dri3: For 1.2, use root window instead of pixmap drawable get_supported_modifiers() and pixmap_from_buffers() requests both expect a window as drawable, passing a pixmap will fail as the Xserver will fail to match the given drawable to a window. That leads to dri3_alloc_render_buffer() to return NULL and breaks rendering when using GLX_DOUBLEBUFFER on pixmaps. Query the root window of the pixmap on first init, and use the root window instead of the pixmap drawable for get_supported_modifiers() and pixmap_from_buffers(). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107117 Fixes: `069fdd5` ("egl/x11: Support DRI3 v1.1") Signed-off-by: Olivier Fourdan <ofourdan@redhat.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-31 13:51:59 -04:00
Alejandro Piñeiro	16b5e15e91	i965: enable XFB and GeometryStreams for gen7+ Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:33:37 +02:00
Neil Roberts	b7421cda86	i965: Link XFB varyings for SPIR-V shaders Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:33:37 +02:00
Neil Roberts	b9719b4b05	nir/linker: Add the start of a pure-NIR linker for XFB v2: ignore names on purpose, for consistency with other places where we are doing the same (Alejandro) v3: changes proposed by Timothy Arceri, implemented by Alejandro Piñeiro: * Remove redundant 'struct active_xfb_varying' * Update several comments, including spec quotes if needed * Rename struct 'active_xfb_varying_array' to 'active_xfb_varyings' * Rename variable 'array' to 'active_varyings' * Replace one if condition for an assert (<MAX_FEEDBACK_BUFFERS) * Remove BufferMode initialization (was already done) v4: simplify output pointer handling (Timothy) Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:33:37 +02:00
Neil Roberts	9fbe5bd811	nir/types: Add a wrapper to access gl_type Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:33:37 +02:00
Alejandro Piñeiro	739bb9e3d4	arb_gl_spirv: add calls to several nir lowerings For now we are just adding nir lowerings that are needed/mandatory to get things working. After everything is settled, we would start to add good-to-have lowerings. This patch adds the following calls: * nir_split_var_copies and nir_split_per_member_structs: as vulkan drivers are doing now. See commit `b0c643d8f5` ("spirv: Use NIR per-member splitting") for more info. Without this commit, piglit tests like this one crash: spec/arb_gl_spirv/execution/varying/block And in general most of the shaders that include any kind of struct. * nir_copy_prop: after nir_deref_instr introduction, function calls need this. See commit "nir,spirv: Rework function calls" (`c11833ab24`) for more info. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:33:37 +02:00
Alejandro Piñeiro	d69027536c	compiler/spirv: add XFB and GeometryStreams capability check support Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:33:28 +02:00
Neil Roberts	1e3f61d1d5	nir/gather_info: Set info.gs.uses_streams Whenever a non-zero stream is written to it now sets uses_streams to true. This reflects the code in validate_geometry_shader_emissions for GLSL. v2: set uses_streams at gather_info instead of at spirv to nir (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-31 13:18:28 +02:00
Neil Roberts	b0af66bb17	spirv/nir: Fix the stream ID when emitting a primitive or vertex It looks like it was previously taking the SPIR-V instruction number directly instead of looking up the constant value. v2: use vtn_constant_value helper (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-31 13:18:28 +02:00
Neil Roberts	13b8857fcf	spirv: Handle the SpvDecorationStream decoration From SPIR-V 1.0 spec, section 3.20, "Decoration": "Stream Apply to an object or a member of a structure type. Indicates the stream number to put an output on." Note the "or", so that means that it is allowed for both a full struct or a member of a struct (although the wording is not really ideal, and somewhat error-prone, imho). We found this with some Geometry Streams tests for ARB_gl_spirv, where the full gl_PerVertex is assigned Stream 0 (default value on OpenGL for gl_PerVertex). So this commit allows structs to have this Decoration, and sets the stream at the nir variable if needed. Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> v2: squash two Decoration Stream patches (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:18:28 +02:00
Neil Roberts	d480623bef	mesa/glspirv: Set last_vert_prog v2: simplify last_vert check (Timothy) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:18:28 +02:00
Neil Roberts	cd4a14be06	spirv: Handle XFB variable decorations These set the new explicit XFB members on nir_variable. This is needed to support ARB_gl_spirv, as Vulkan doesn't support transform feedback. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:18:28 +02:00
Neil Roberts	a5ec8461f9	spirv: Handle SpvExecutionModeXfb This just sets has_transform_feedback_varyings on the shader. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:18:28 +02:00
Neil Roberts	3fd5b4c7aa	nir: Add members for the explicit XFB properties to nir_variable These are copied from the corresponding values in ir_variable. The intention is to eventually use them in a pure-NIR linker. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-31 13:18:28 +02:00
Christian Gmeiner	e1d4882d05	etnaviv: fix typo in query names Fixes: `d0bed0b494` ("etnaviv: support HI performance counters") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Chris Healy <cphealy@gmail.com>	2018-07-31 08:33:32 +02:00
Tapani Pälli	553af7a190	mesa: fix a typo (trivial) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-31 08:19:38 +03:00
Tapani Pälli	ce80abbb17	mesa: add glRenderbufferStorage support for EXT_texture_norm16 formats These bits were missing, found when extending the Piglit test. Fixes: `7f467d4f73` "mesa: GL_EXT_texture_norm16 extension plumbing" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-31 08:19:10 +03:00
David Riley	f94681b6e2	egl/surfaceless: Allow DRMless fallback. Allow platform_surfaceless to use swrast even if DRM is not available. To be used to allow a fuzzer for virgl to be run on a jailed VM without hardware GL or DRM support. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Signed-off-by: David Riley <davidriley@chromium.org>	2018-07-30 19:40:45 -07:00
David Riley	b169b84be6	egl/surfaceless: Define DRI_SWRastLoader extension when using swrast. Signed-off-by: David Riley <davidriley@chromium.org> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> [chadv: Dropped spurious hunk] Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-07-30 19:40:08 -07:00
Eric Anholt	d934492ff9	v3d: Dump the contents of all the buffers in CLIF mode. A V3D_DEBUG=clif file from a non-texturing .shader_test can now be successfully run through the CLIF runner in the simulator. Now I need to build an open source CLIF runner against the v3d DRM module.	2018-07-30 14:29:01 -07:00
Eric Anholt	99a5ac250b	v3d: Split walking the CLs to generate relocs from walking CLs to dump. We need to dump each buffer's contents in order for a CLIF file, so we need to collect all of the relocs into a buffer (such as the indirect CL full of both uniforms and GL shader states) before we start dumping.	2018-07-30 14:29:01 -07:00
Eric Anholt	2df6f1a3df	v3d: Include commands to run the BCL and RCL in CLIF dumps.	2018-07-30 14:29:01 -07:00
Eric Anholt	c6449e33e3	v3d: Use a short, underscored name for packets in CLIF/CL dumping. These will match the names that the CLIF parser expects to see. I may in the future decide to change more of the other names so that I match the names the HW/closed SW team uses for their packets, rather than the names in the spec (which only they and I can read anyway).	2018-07-30 14:29:01 -07:00
Eric Anholt	b56f8c475e	v3d: Rename "configuration" and "config" in the XML to "cfg" This matches what CLIF parsing expects, and makes TILE_BINNING_MODE_CONFIGURATION_COMMON_CONFIGURATION into a much more legible TILE_BINNING_MODE_CFG_COMMON.	2018-07-30 14:29:01 -07:00
Eric Anholt	300e609feb	v3d: s/colour/color in the XML. The CLIF format expects American English spelling, and the rest of Mesa is too. I was previously adhering to the spec's spelling, which is counterproductive.	2018-07-30 14:29:01 -07:00
Eric Anholt	3a8550ad06	v3d: Rename primitives to prims in the XML to match CLIF names. This makes us match up with the V3D HW team's names a bit more.	2018-07-30 14:29:01 -07:00
Eric Anholt	6237c64049	v3d: Print CLIF fixed-point values as just their decimal value. The parser doesn't handle float input, so we have to dump the raw value.	2018-07-30 14:29:01 -07:00
Eric Anholt	8da47b7648	v3d: When not doing terminal pretty-printing, comment struct field names. The struct field names aren't part of the CLIF ABI, just the order of fields within the struct. The comments are there for human readability.	2018-07-30 14:29:01 -07:00
Eric Anholt	103f21b13d	v3d: Add a separate flag for CLIF ABI output versus human-readable CLs. A few of the upcoming changes would make the V3D_DEBUG=cl output less readable, so let's make proper CLIF file production be under a separate V3D_DEBUG=clif flag.	2018-07-30 14:29:01 -07:00
Eric Anholt	89ac6fa403	v3d: Add pack header support for f187 values. V3D only has one of these (the top 16 bits of a float32) left in its CLs, but VC4 had many more. This gets us proper pretty-printing of the values instead of a large uint.	2018-07-30 14:29:01 -07:00
Eric Anholt	e146e3a795	v3d: Move depth offset packet setup to CSO creation time. This should be some simpler memcpying at draw time, and makes the next change easier.	2018-07-30 14:29:01 -07:00
Dave Airlie	9039cf70fa	r600: reduce num compute threads to 1024. I copied this value from radeonsi, but it was wrong, 1024 seems to be the correct answer from looking at gpuinfo. This should fix a few compute shader related hangs. (at least in CTS) Cc: <mesa-stable@lists.freedesktop.org> (airlied: pushed because it avoids hangs)	2018-07-31 04:55:38 +10:00
Rob Clark	0ea243dcd5	freedreno/a5xx: fix txf_ms Somehow this got lost from the initial MSAA patch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-30 12:31:05 -04:00
Rhys Perry	f310e86a42	nvc0: serialize before updating some constant buffer bindings on Maxwell+ To avoid serializing, this has the user constant buffer always be 65536 bytes and enabled unless it's required that something else is used for constant buffer 0. Fixes artifacts with at least XCOM: Enemy Within, 0 A.D. and Unigine Valley, Heaven and Superposition. v2: changed uniform_buffer_bound to be bool instead of a uint32_t v3: remove magic constants v3: remove pointless code in nvc0_validate_driverconst Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100177 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-07-30 15:04:26 +01:00
Eric Anholt	0a3f653180	v3d: Block bin on render when doing vertex texturing. The kernel by default serializes the BCL on previous BCLs submitted on this FD, but not RCLs. For now this fix is conservative and blocks on last RCL if any vertex texturing is done, which fails to get bin/render overlap if there was an intermediate job that doesn't draw to the BCL's buffer. I've dropped a perf_debug() in here to note that as a potential future improvement. Fixes intermittent failures in KHR-GLES3.copy_tex_image_conversions.required.*	2018-07-29 19:25:39 -07:00
Eric Anholt	34cefa7fe0	v3d: Fix meson build without vc4.	2018-07-29 19:22:33 -07:00
Eric Anholt	27f1bfe471	vc4: Fix meson build when enabled without v3d. Reported-by: Rob Clark <robdclark@gmail.com> Fixes: `e92959c4e0` ("v3d: Pass the whole clif_dump structure to v3d_print_group().")	2018-07-29 19:13:29 -07:00
Jason Ekstrand	05fb2f88ec	nir/instr_set: Fix nir_instrs_equal for derefs We weren't returning at the end of the nir_instr_type_deref case in nir_instrs_equal and it was falling through to the default of false. While we're at it, make the default unreachable because all statements in the switch now have their own returns. Had we done that before, we would have caught this bug a long time ago. Fixes: `19a4662a54` "nir: Add a deref instruction type" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-07-29 13:39:35 -07:00
Jason Ekstrand	9a4ab4c120	nir: Take if uses into account in ssa_def_components_read Fixes: `d800b7daa5` "nir: Add a helper for figuring out what..." Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-29 13:39:35 -07:00
Jason Ekstrand	5c1c6939ce	util/list: Make some helpers take const lists They're all just querying things about the list and not mutating anything. Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-07-29 13:39:35 -07:00
Rob Clark	0ddae4acae	freedreno/a5xx: small cleanup We no longer have semi-custom clear pipe that uses 3d state. Normal clears happen via hw blitter, and everything else uses u_blitter these days. So we don't need this hack. TODO a3xx+a4xx could get same treatment. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-29 14:00:06 -04:00
Rob Clark	3932db0f7e	freedreno/a5xx: remove unused prototype Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-29 13:50:19 -04:00
Rob Clark	104a49f166	freedreno: fix caps harder Fixes: `868ca81c` and `f485e567` Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-29 13:48:22 -04:00
Karol Herbst	bc0e0c2818	nir/lower_int64: mark all metadata as dirty v2: use nir_metadata_preserve preserve metadata in case of !progress Fixes: `074f5ba0b5` "nir: Add a simple int64 lowering pass" Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-28 19:59:28 +02:00
Mauro Rossi	0ca153f869	android: radv: enable build of vulkan.radv HAL module src/amd/Android.mk requires to include src/amd/vulkan/Android.mk to enable the build of vulkan.radv module Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-28 12:40:14 +02:00
Mauro Rossi	212af3c9ea	android: radv: add Android.mk for vulkan.radv HAL module radv implements the Android Vulkan HAL interface, this patch adds Android.mk building rules by porting of radv automake rules. vendor HAL module is installed as /vendor/lib/hw/vulkan.radv.so Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-28 12:40:07 +02:00
Mauro Rossi	1eb65c51ad	radv: generate entrypoints for VK_ANDROID_native_buffer Patch changes radv entrypoints generator to not skip this extension even though it is set as disabled in the vk.xml Reference: `63525ba730` ("android: enable VK_ANDROID_native_buffer") Fixes: `69f447553c` ("vulkan: Drop vk_android_native_buffer.xml") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-28 12:39:57 +02:00
Mauro Rossi	c67b36c8a1	radv: move vk_format_table.c to generated sources Android build system will try to compile vk_format_table.c as a shipped source, but at compile time it will be missing, we move it to generated source, where it belongs Fixes: `f4e499ec79` ("radv: add initial non-conformant radv vulkan driver") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-28 12:39:49 +02:00
Brian Paul	b4bda6e066	xlib: fix build break from _swrast_map_soft_renderbuffer() call We need to pass the new flip_y argument. Reviewed-by: Clayton Craft <clayton.a.craft@intel.com>	2018-07-27 21:21:24 -06:00
Brian Paul	90b189e5d2	swrast: fix crash in AA line code when there's no texture Fixes a crash running the Piglit polygon-mode-facing test (and probably others). Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-07-27 21:21:24 -06:00
Brian Paul	ce0f42dfe4	mesa: add switch case for GL 2.1 in _mesa_compute_version() The xlib/swrast driver only supports GL 2.1. This patch fixes a crash if the app calls glGetString(GL_SHADING_LANGUAGE_VERSION). Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-07-27 21:21:24 -06:00
Brian Paul	4f51e8880d	tgsi: whitespace fixes in tgsi_ureg.c Trivial.	2018-07-27 21:21:24 -06:00
Brian Paul	f02243541d	gallium/util: whitespace fixes in u_inlines.h Trivial.	2018-07-27 21:21:24 -06:00
Brian Paul	4216a1d0a8	svga: whitespace fixes in svga_tgsi_decl_sm30.c Trivial.	2018-07-27 21:21:24 -06:00
Brian Paul	2f1af8549d	mesa: replace tabs with spaces in mipmap.c Trivial.	2018-07-27 21:21:24 -06:00
Brian Paul	f39840f866	gallium/util: whitespace fixes in u_debug_memory.c Trivial.	2018-07-27 21:21:24 -06:00
Brian Paul	2261d6a403	mesa: whitespace clean-up in texstore.c Trivial.	2018-07-27 21:21:24 -06:00
Brian Paul	a67b629193	mesa: move var decls in texstore_rgba() Move them closer to where they're first used. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-07-27 21:21:24 -06:00
Brian Paul	5e2582b381	mesa: remove unneeded free() call in texstore_rgba() The pointer will always be NULL since that's what we just tested for. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-07-27 21:21:24 -06:00
Eric Anholt	942456f646	v3d: Skip printing sub-id or pad fields in CLIF dumping. The parser doesn't expect them, so our fields would end up mismatched. They're not really useful in console output, either.	2018-07-27 18:00:48 -07:00
Eric Anholt	3ee0ab599e	v3d: Emit commands to switch CLIF parser to CL/shader/attr input mode. By default after saying you are emitting a buffer, it'll expect a buffer size. Once you set a format, it'll keep parsing that format until you announce something else.	2018-07-27 18:00:46 -07:00
Eric Anholt	a57770aa37	v3d: Dump fields in CLIF output in increasing offset order. Previously, we emitted in XML order, which I happen to type in the decreasing offset order of the specifications. However, the CLIF parser wants increasing offsets.	2018-07-27 17:56:55 -07:00
Eric Anholt	95bafeeabf	v3d: Print addresses in CLIFs as references to buffers. With CLIFs, the parser will choose an address for the buffer being created, so we need to use effectively relocations to buffers instead of the addresses that the driver uses. This is also a whole lot more intelligible for console output than raw addresses!	2018-07-27 17:56:36 -07:00
Eric Anholt	3c02838d29	v3d: Stop doing pretty-printed colorful booleans in CLIF output. The parser wants to see a 1 or 0. We can put "true" and "false" in a comment to clarify that it's a boolean and the parser will skip it.	2018-07-27 17:55:57 -07:00
Eric Anholt	422910d2e7	v3d: Move clif dumping to a separate step from noting where the CLs are. Now all the printing happens from the same worklist processing.	2018-07-27 17:08:35 -07:00
Eric Anholt	01b4952773	v3d: Move clif dump BO lookup into the clif dumper. The clif dumper is going to need information about all of our BOs if we're going to dump them for replay purposes.	2018-07-27 17:08:35 -07:00
Eric Anholt	e92959c4e0	v3d: Pass the whole clif_dump structure to v3d_print_group(). To generate CLIF files that the v3dv3 simulator can parse, we're going to need to decode addresses, and for that we'll need the vaddr lookup function from the clif structure from within v3d_decoder.	2018-07-27 17:08:35 -07:00
Timothy Arceri	77207e5380	ac: pass write param to get_sampler_desc() from get_image_descriptor() Looks like a mistake from when the deref stuff landed. Fixes: `506a07e4e3` ("ac/nir: Add deref support to image intrinsics.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-28 08:57:03 +10:00
Marek Olšák	d89a123dfd	gallium/u_vbuf: split u_vbuf_get_minmax_index function (v2) This will be used by indirect multidraws. v2: clean up the function further, change return types to unsigned Reviewed-by: Eric Anholt <eric@anholt.net> (v1)	2018-07-27 17:50:40 -04:00
Alexander von Gluck IV	da8de6b757	gallium/auxiliary: Extern "c" fixes. Used by C++ code such as Haiku's renderer. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-07-27 16:19:12 -05:00
Marek Olšák	5fe943aaee	gallium/noop: implement invalidate_resource	2018-07-27 16:31:56 -04:00
Dave Airlie	5040319331	radv: fix cdw check vs tracing emit If we have tracing enabled we could do all the tracing emits and overflow the precalculated cdw_max. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-28 06:20:27 +10:00
Dave Airlie	b88468f15c	radv: return binary code_size not variant code size to cache The code sizes return here get passed to the cache shader insert function, which then memcpy from the code ptr, and causes all sorts of valgrind errors like: ==6755== Invalid read of size 8 ==6755== at 0x4C32FEE: memcpy@GLIBC_2.2.5 (vg_replace_strmem.c:1021) ==6755== by 0x2305D4C7: radv_pipeline_cache_insert_shaders (radv_pipeline_cache.c:416) ==6755== by 0x2305791D: radv_create_shaders (radv_pipeline.c:2158) ==6755== by 0x2305C523: radv_pipeline_init (radv_pipeline.c:3404) ==6755== by 0x2305C890: radv_graphics_pipeline_create (radv_pipeline.c:3515) ==6755== by 0x230188AB: radv_device_init_meta_blit_color (radv_meta_blit.c:871) ==6755== by 0x2301D50E: radv_device_init_meta_blit_state (radv_meta_blit.c:1278) ==6755== by 0x23011893: radv_device_init_meta (radv_meta.c:352) ==6755== by 0x2300744B: radv_CreateDevice (radv_device.c:1576) ==6755== by 0x5187D0F: ??? (in /usr/lib64/libvulkan.so.1.1.77) ==6755== by 0x518F6A3: ??? (in /usr/lib64/libvulkan.so.1.1.77) ==6755== by 0x5192A42: vkCreateDevice (in /usr/lib64/libvulkan.so.1.1.77) ==6755== Address 0x22a58548 is 4 bytes after a block of size 116 alloc'd ==6755== at 0x4C2EBAB: malloc (vg_replace_malloc.c:299) ==6755== by 0x23089DC4: ac_elf_read (ac_binary.c:144) ==6755== by 0x23090A60: ac_compile_module_to_binary (ac_llvm_helper.cpp:162) ==6755== by 0x23053F06: compile_to_memory_buffer (radv_llvm_helper.cpp:58) ==6755== by 0x23053F06: radv_compile_to_binary (radv_llvm_helper.cpp:98) ==6755== by 0x23052769: ac_llvm_compile (radv_nir_to_llvm.c:3394) ==6755== by 0x23052823: ac_compile_llvm_module (radv_nir_to_llvm.c:3418) ==6755== by 0x23053C05: radv_compile_nir_shader (radv_nir_to_llvm.c:3542) ==6755== by 0x23061B4E: shader_variant_create (radv_shader.c:580) ==6755== by 0x23061CFD: radv_shader_variant_create (radv_shader.c:634) ==6755== by 0x23057765: radv_create_shaders (radv_pipeline.c:2123) ==6755== by 0x2305C523: radv_pipeline_init (radv_pipeline.c:3404) ==6755== by 0x2305C890: radv_graphics_pipeline_create (radv_pipeline.c:3515) Since we are just inserting the code into the cache, we can avoid these bad reads and data in the cache by just using the binary code size here. Fixes: `939e5a382` (radv: add padding for the UMR disassembler) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-28 06:20:20 +10:00
Eric Anholt	22a1ba0403	v3d: Drop the use of the semaphores. The kernel's scheduler doesn't rely on our emitting them, and in fact we'd get in trouble if the kernel decided to schedule too many bins in a row before getting around to scheduling the corresponding render.	2018-07-27 12:56:36 -07:00
Eric Anholt	9bf9a6d6a1	v3d: Drop the VG support from the XML. This reflects a change on the HW/closed SW side to drop this unused HW. With it dropped on their side, the CLIF parser no longer expects to find VG fields.	2018-07-27 12:56:36 -07:00
Eric Anholt	5a1cc3861c	v3d: Use /* */ instead of () for enum names in CLIF output. This lets the comments be ignored by the CLIF parser.	2018-07-27 12:56:36 -07:00
Eric Anholt	95a0f99825	v3d: CLIF-dump the "Vec size" field as 0 == maximum value. That's what a user should want to see, and what the CLIF parser wants. This should maybe be generalized.	2018-07-27 12:56:36 -07:00
Eric Anholt	1c8e4632a7	v3d: Stop using spaces in the names of our buffers. For CLIF dumping, we need names to not have spaces. Rather than rewriting them after the fact, just change the two cases where I had put a space in.	2018-07-27 12:56:36 -07:00
Fritz Koenig	ab05dd183c	i965: implement GL_MESA_framebuffer_flip_y [v3] Instead of using _mesa_is_winsys_fbo or _mesa_is_user_fbo to infer if an fbo is flipped use the FlipY flag. v2: * additional window-system framebuffer checks [for jason] v3: * s/inverted_y/flip_y/g [for chadv] * s/InvertedY/FlipY/g [for chadv] Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-07-27 12:33:32 -07:00
Fritz Koenig	318c265160	mesa: GL_MESA_framebuffer_flip_y extension [v4] Adds an extension to glFramebufferParameteri that will specify if the framebuffer is vertically flipped. Historically system framebuffers are vertically flipped and user framebuffers are not. Checking to see the state was done by looking at the name field. This adds an explicit field. v2: * updated spec language [for chadv] * correctly specifying ES 3.1 [for chadv] * refactor access to rb->Name [for jason] * handle GetFramebufferParameteriv [for chadv] v3: * correct _mesa_GetMultisamplefv [for kusmabite] v4: * update spec language [for chadv] * s/GLboolean/bool/g [for chadv] * s/InvertedY/FlipY/g [for chadv] * s/inverted_y/flip_y/g [for chadv] * assert changes [for chadv] Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-07-27 12:32:25 -07:00
Chad Versace	7953399e59	gallium/auxiliary: Fix Autotools on Android (v2) Problem 1: u_debug_stack_android.cpp transitively included "pipe/p_compiler.h", but src/gallium/include was missing from the C++ include path. Problem 2: Add -std=c++11 to AM_CXXFLAGS. Android's libbacktrace headers require C++11, but the Android toolchain (at least in the Chrome OS SDK) does not enable C++11 by default. v2: Add -std=c++11. Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Cc: Eric Engestrom <eric.engestrom@intel.com>	2018-07-27 11:35:56 -07:00
Topi Pohjolainen	a5889d70f2	i965/icl: Disable binding table prefetching Gen 11 workarounds table #2056 WABTPPrefetchDisable suggests disabling prefetching of binding tables for ICLLP A0 and B0 steppings. It fixes multiple GPU hangs in ext_framebuffer_multisample* tests on ICLLP B0 h/w. Anuj: Add comments and commit message. Add gen 11 checks in the code. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-27 11:05:04 -07:00
Caio Marcelo de Oliveira Filho	1d71981b27	glsl: use only copy_propagation_elements Now that the elements version handles both cases, remove the non-elements version. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-07-27 10:51:25 -07:00
Caio Marcelo de Oliveira Filho	134b5a7047	glsl: teach copy_propagation_elements to deal with whole variables Keep information in acp_entry whether the entry is full or not, and use the ACP in more nodes when visiting the instructions: - add_copy: write whole variables to the ACP state (regardless of the type). - visit(ir_dereference_variable ): perform the propagation here if we have a full candidate. Element-wise propagation doesn't apply here because the mask isn't available at this point. - visit_leave(ir_assignment ): process beyond scalar and vector, as the full variables might have other types. Also import an improvement from opt_copy_propagation.cpp: if ir_call is an intrinsic, we know the variables affected, so keep going. v2: (all from Eric Anholt) Describe how acp_entry attributes are used. Don't do book-keeping to avoid adding repeated element to the dsts in write_elements(). v3: Use _mesa_set_remove_key. (Thomas Helland) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-07-27 10:51:25 -07:00
vadym.shovkoplias	399228ecad	i965: Disable guardband clipping on SandyBridge for odd dimensions Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104388 Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-27 10:07:44 -07:00
Dylan Baker	665fc9cf55	docs: Update release calendar, add news item, and add release notes for 18.1.5	2018-07-27 07:08:59 -07:00
Dylan Baker	2b7b5d3100	docs: Add sha-256 sums for 18.1.5	2018-07-27 07:06:55 -07:00
Dylan Baker	5cc4ee3e17	docs: add 18.1.5 release notes	2018-07-27 07:06:53 -07:00
Iago Toral Quiroga	615aaedb93	intel/compiler: fix lower conversions to account for predication The pass can create a temporary result for the instruction and then move from it to the original destination; however, if the original instruction was predicated, the mov has to be predicated as well. Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-07-27 14:48:29 +02:00
Samuel Pitoiset	df679b1643	radv: allocate enough space in radv_cmd_buffer_after_draw() The driver might emit up to 4 dwords when RADV_TRACE_FILE is used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-27 14:31:29 +02:00
Samuel Pitoiset	c08ae911d9	radv: check CS space in radv_emit_write_data_packet() This wasn't wrong but it looks better to me like this. It's only used for debugging purposes (ie. RADV_TRACE_FILE). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-27 14:31:27 +02:00
Samuel Pitoiset	434630f57c	radv: do not emit pipeline stats flushes on compute queue Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-27 14:31:26 +02:00
Samuel Pitoiset	c118c8938c	radv: reduce CB/DB meta flushes in radv_dst_access_flush() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-27 14:31:24 +02:00
Kenneth Graunke	0c4e0471f5	radv: Fix build I renamed this pass and forgot to update radv. Fixes: `488972222c` ("i965: Combine both gl_PatchVerticesIn lowering passes.")	2018-07-26 23:57:13 -07:00
Kenneth Graunke	488972222c	i965: Combine both gl_PatchVerticesIn lowering passes. Until now, we had separate passes for lowering gl_PatchVerticesIn to a statically known constant (for TES inputs when linked against a TCS), and a uniform in the other cases. Annoyingly, one had to be run before nir_lower_system_values, and the other afterward. This simplified the passes, but made life painful for the callers. This patch combines both into a single pass. If you give it a non-zero static count, it uses that. If you give it Mesa state slots, it turns it back into a built-in uniform. Otherwise, it does nothing. This also moves the i965 uniform lowering out to shared code. v2: Make token arrays const. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-26 21:51:36 -07:00
Sagar Ghuge	29dd5dda9d	i965: Expose EXT_base_instance extension in OpenGLES 3.0 The extension requires at least OpenGL 3.0 and OpenGL ES 3.0. Fixes two ext_base_instance tests: arb_base_instance-baseinstance-doesnt-affect-gl-instance-id_gles3 arb_base_instance-drawarrays_gles3 Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-26 17:25:35 -07:00
Bas Nieuwenhuizen	3665f66ef2	radv: Add support for ETC2 textures. Was surprised that is even supported by Vega. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-27 01:31:32 +02:00
Jan Vesely	1e8b8e0878	clover: Reduce wait_count in abort path. Trigger waiter condition variable. Passes 'events' CTS on carrizo and turks. v2: reduce to 0 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-07-26 15:38:22 -04:00
Jan Vesely	c2942141ae	clover: Don't extend illegal integer types. It's OK to pass them in memory, which is what kernel invocation needs. Fixes regressions since llvm r337535 ("Reapply "AMDGPU: Fix handling of alignment padding in DAG argument lowering"): scalar-arithmetic-char scalar-arithmetic-uchar scalar-arithmetic-short scalar-arithmetic-ushort scalar-comparison-char scalar-comparison-uchar scalar-comparison-short scalar-comparison-ushort Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-07-26 15:38:22 -04:00
Kenneth Graunke	8794fe3e30	intel/compiler: Delete dead VS intrinsic handling. These are lowered by brw_nir_lower_vs_inputs(). If they weren't, we would have already hit the unreachable() in emit_system_values_block(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-26 11:45:34 -07:00
Eric Anholt	deecc1ef86	v3d: Avoid the GFXH-1461 workaround if we have only Z or only S. This seems like a sensible precaution to avoid extra draws. It doesn't deal with the case of a Z24S8 buffer created by the window system for an application that happens to never use S.	2018-07-26 11:02:25 -07:00
Eric Anholt	301c32caf4	v3d: Rework the ordering of how we clear things. First, figure out if we can just sneak the clear into the TLB clear, even if drawing has already happened (since we have job->load and job->clear to tell us), taking into account GFXH-1461. For any pieces we can't TLB clear, fall back to drawing a quad without flushing the scene. Fixes extra scene flushes in glmark2 due to GFXH-1461.	2018-07-26 11:02:25 -07:00
Eric Anholt	ceecddfe77	v3d: Only store buffers that have been written to. I've seen cases where a color buffer is bound, but only Z is written, and we end up storing color.	2018-07-26 11:02:25 -07:00
Eric Anholt	d29435e7cb	v3d: Track the buffers being loaded separately. We were computing this at RCL generation time, but that means you can't unflag the store for an invalidate_resource, or not flag the store if writemasking is disabled.	2018-07-26 11:02:20 -07:00
Eric Anholt	47f5d158ae	v3d: Rename cleared/resolve to clear/store. These describe what the fields mean in RCL generation. "resolve" is left over from VC4, and sounds like MSAA resolves (which may or may not be involved in the store we generate).	2018-07-26 11:00:34 -07:00
Eric Anholt	d934d3206e	nir: Add flipping of gl_PointCoord.y in nir_lower_wpos_ytransform. This is controlled by a new nir_shader_compiler_options flag, and fixes dEQP-GLES3.functional.shaders.builtin_variable.pointcoord on V3D. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-26 11:00:34 -07:00
Rhys Perry	b5a56a11da	docs: fix incorrect placement of the ARB_sample_locations release notes Seems something went wrong somehow when it was pushed. v2: combine into one list Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-26 11:49:23 +01:00
Eric Engestrom	2cc1849afb	anv: drop unused local vars Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-26 10:21:03 +01:00
Eric Engestrom	2a4191bb38	anv: remove incorrect `UNUSED` flag Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-26 10:06:11 +01:00
Erik Faye-Lund	e68fe445f5	gallium: initialize ureg_dst::Invariant bit When this bit was added, it seems that some initialization code was omitted by mistake. Since stack-variables have kinda random contents, and we don't zero initialize the whole struct in these code-paths, we end up getting random-ish values for this bit. Spotted by Coverity in the following CIDs: - 1438115 - 1438123 - 1438130 Fixes: `70425bcfe6` ("gallium: plumb invariant output attrib thru TGSI") Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Jakob Bornecrantz <jakob@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-26 09:01:33 +02:00
Samuel Pitoiset	ff0d553818	radv: fix adjusting vertex fetches since 16bit support Move the integer conversion after the fixup. This fixes some regressions with dEQP-VK.pipeline.vertex_input.single_attribute.mat4.as_a2r10g10b10* Fixes: `b722b29f10` ("radv: add support for 16bit input/output") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-26 08:57:43 +02:00
Samuel Pitoiset	6465bf0015	nir: remove wrong assertion in print_var_decl() This breaks printing input/output variables with more than 4 components like mat4. Fixes: `1beef89ad8` ("nir: prepare for bumping up max components to 16") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-26 08:57:38 +02:00
Marek Olšák	ce8e6b970b	ac: fix typo DSL_SEL -> DST_SEL	2018-07-26 01:45:47 -04:00
Marek Olšák	7039d9299e	radeonsi: update a comment about cache behavior	2018-07-26 01:45:47 -04:00
Kenneth Graunke	37c3efca29	intel: Make the decoder just store addresses for bases, not buffers. The various base addresses are simply addresses. There may or may not be a buffer located at those addresses. So, it doesn't make much sense to request one. Just save the raw address so we can add it later, when asking about BOs at the final <base + offset> address. Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-25 14:43:54 -07:00
Kenneth Graunke	933223db3c	intel: Make the decoder handle STATE_BASE_ADDRESS not being a buffer. Normally, i965 programs STATE_BASE_ADDRESS every batch, and puts all state for a given base in a single buffer. I'm working on a prototype which emits STATE_BASE_ADDRESS only once at startup, where each base address is a fixed 4GB region of the PPGTT. State may live in many buffers in that 4GB region, even if there isn't a buffer located at the actual base address itself. To handle this, we need to save the STATE_BASE_ADDRESS values across multiple batches, rather than assuming we'll see the command each time. Then, each time we see a pointer, we need to ask the driver for the BO map for that data. (We can't just use the map for the base address, as state may be in multiple buffers, and there may not even be a buffer at the base address to map.) v2: Fix things caught in review by Lionel: - Drop bogus bind_bo.size check. - Drop "get the BOs again" code - we just get the BOs as needed - Add a message about interface descriptor data being unavailable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-25 14:43:47 -07:00
Eric Engestrom	aa59f9c8bc	anv: don't crash on vkDestroyDevice(NULL) CovID: 1438132 Fixes: `a99c9e63a0` "anv: finish the binding_table_pool on destroyDevice when use_softpin" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-07-25 21:04:30 +01:00
Eric Engestrom	270a44040c	vulkan/wsi: fix incorrect assignment in assert() CovID: 1438113, 1438118, 1438119, 1438121 Fixes: `dc1d10b396` "anv,radv: Add support for VK_KHR_get_display_properties2" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-25 20:55:35 +01:00
Eric Engestrom	bbf8316fcb	anv: fix python whitespace warning Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-25 20:55:35 +01:00
Eric Engestrom	e0347581f3	anv: cleanup python imports Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-25 20:55:35 +01:00
Eric Engestrom	ce7348507e	anv: remove unnecessary semicolons in python Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-25 20:55:35 +01:00
Kenneth Graunke	a2c63cae14	st/nir: Fix st_nir_opts() prototype. This wasn't updated for the new scalar ISA parameter. It worked anyway because all the function's callers live in the same file, so it found the correct function. Tim made this external for the new st prog_to_nir translator, which got reverted, but which I'd like to land eventually. So, fix the prototype. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-07-25 10:19:41 -07:00
Lionel Landwerlin	b21b38c46c	intel: tools: dump: only store device id on success We might fail on master node drm fd because we won't have the right permissions. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-07-25 16:53:06 +01:00
Gert Wollny	82fc6bdebf	r600: Scale integer valued texture border colors to float (v2) It seems the hardware always expects floating point border color values [0,1] for unsigned, and [-1,1] for signed texture component, regardless of pixel type, but the border colors are passed according to texture component type. Hence, before submitting the border color, convert and scale it to these ranges accordingly. This doesn't seem to work for textures with 32 bit integer components though; here, it seems that the border color is always set to zero, regardless of the BORDER_COLOR_TYPE state set in Q_TEX_SAMPLER_WORD0_0. v2: Simplify logic as suggested by Roland Scheidegger Fixes: dEQP-GLES31.functional.texture.border_clamp.formats.compressed* dEQP-GLES31.functional.texture.border_clamp.formats.r* (non 32 bit integer) dEQP-GLES31.functional.texture.border_clamp.per_axis_wrap_mode.texture_2d* and a number of piglits out of piglit run gpu -t texture -t gather -t formats Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-25 08:58:33 +02:00
Jason Ekstrand	b3b170ade9	nir: Add a couple of iand/ior optimizations Spotted in a shader in Batman: Arkham City. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-24 20:39:43 -07:00
Jordan Justen	2b3064c073	i965, anv: Use INTEL_DEBUG for disk_cache driver flags Since various options within INTEL_DEBUG could impact code generation, we need to set the disk cache driver_flags parameter based on the INTEL_DEBUG flags in use. An example that will affect the program generated by i965 is the INTEL_DEBUG=nocompact option. The DEBUG_DISK_CACHE_MASK value is added to mask the settings of INTEL_DEBUG that can affect program generation. v2: * Use driver_flags (Tim) * Also update Anvil (Jason) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-24 16:17:28 -07:00
Jordan Justen	69a686b0ae	i965, anv: Add extra unused character in disk_cache renderer temp string This extra character should not be used by snprintf, but we make it available to verify that we printed the exact number we wanted, and didn't overflow. v2: * Also update Anvil Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-24 16:17:25 -07:00
Marek Olšák	7d2e6edd89	mesa: allow indirect draws with the default VAO and compatibility profile Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-24 16:00:09 -04:00
Danylo Piliaiev	49ed075615	mesa: Fix copy-paste error in ConservativeRasterDilateRange initialization Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `4580617509` ("mesa: add support for nvidia conservative rasterization extensions") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-24 20:44:34 +01:00
Jason Ekstrand	f214baf72f	nir/serialize: Alloc constants off the variable nir_sweep assumes that constants are always allocated off the variable to which they belong. Violating this assumption causes them to get freed early and leads to use-after-free bugs. Fixes: `120da00975` "nir: add serialization and deserialization" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107366 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2018-07-24 12:34:07 -07:00
Karol Herbst	7f95564a22	nir: rename f2f16_undef to f2f16 we need rounding modes on other conversions involving floats and it is easier to rename f2f16_undef than renaming all the other ones. v2: rebased on master Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-24 20:40:05 +02:00
Karol Herbst	2083cfb6eb	nir: add builtin builder Also move over some of the GLSL builtins that we will need for implementing some OpenCL builtins. v2: replace NIR_IMM_FP by nir_imm_floatN_t in ported code; fix up changes caused by swizzle rework Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-24 20:40:05 +02:00
Rob Clark	9e90708d5d	nir/spirv: import OpenCL.std.h Lightly edited to be valid 'C' code. Is there a bug open to fix this upstream? Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-24 20:40:05 +02:00
Marek Olšák	98ab24fdab	radeonsi: handle SI_FORCE_FAMILY early before LLVM target machines are created	2018-07-24 14:21:29 -04:00
Mathieu Bridon	9ebd8372b9	python: Use range() instead of xrange() Python 2 has a range() function which returns a list, and an xrange() one which returns an iterator. Python 3 lost the function returning a list, and renamed the function returning an iterator as range(). As a result, using range() makes the scripts compatible with both Python versions 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-24 11:07:04 -07:00
Mathieu Bridon	022d2a381d	python: Better use iterators In Python 2, iterators had a .next() method. In Python 3, instead they have a .__next__() method, which is automatically called by the next() builtin. In addition, it is better to use the iter() builtin to create an iterator, rather than calling its __iter__() method. These were also introduced in Python 2.6, so using them makes the script compatible with Python 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-24 11:07:04 -07:00
Mathieu Bridon	01da2feb0e	python: Better sort dictionary keys/values In Python 2, dict.keys() and dict.values() both return a list, which can be sorted in two ways: * l.sort() modifies the list in-place; * sorted(l) returns a new, sorted list; In Python 3, dict.keys() and dict.values() do not return lists any more, but iterators. Iterators do not have a .sort() method. This commit moves the build scripts to using sorted() on dict keys and values, which makes them compatible with both Python 2 and Python 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-24 11:07:04 -07:00
Mathieu Bridon	5530cb1296	python: Better iterate over dictionaries In Python 2, dictionaries have 2 sets of methods to iterate over their keys and values: keys()/values()/items() and iterkeys()/itervalues()/iteritems(). The former return lists while the latter return iterators. Python 3 dropped the methods which return lists, and renamed the methods returning iterators to keys()/values()/items(). Using those names makes the scripts compatible with both Python 2 and 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-24 11:07:04 -07:00
Mathieu Bridon	fdf946ffbf	python: Stop using the string module Most functions in the builtin string module also exist as methods of string objects. Since the functions were removed from the string module in Python 3, using the instance methods directly makes the code compatible with both Python 2 and Python 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-24 11:07:04 -07:00
Mathieu Bridon	1d209275c2	python: Better check for keys in dicts Python 3 lost the dict.has_key() method. Instead it requires using the "in" operator. This is also compatible with Python 2. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-24 11:07:04 -07:00
Kenneth Graunke	9b34742495	intel: Make the disassembler take a const pointer to the assembly. Disassembling doesn't modify the assembly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-24 11:04:56 -07:00
Andres Gomez	3647b16675	travis: manually generate sys/syscall.h Until now, the needed bits were wrongly included in linux/memfd.h Since Travis' sys/syscall.h doesn't provide the SYS_memfd_create, we generate that header manually, including the needed bits to avoid compilation problems such as the ones observed after: `3228335b55` ("intel: aubinator: handle GGTT mappings") v2: replace fixes commit with the first direct user of syscall.h (Emil). Fixes: `3228335b55` ("intel: aubinator: handle GGTT mappings") Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: Dylan Baker <dylan.c.baker@intel.com> Cc: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-24 19:52:11 +03:00
Andres Gomez	7665a05a3a	docs: update calendar to match the 18.2 plan with the one announced Additionally, I've extended the 18.1 cycle by one more release, tentatively assigned to Dylan, due to the ~2 weeks delay for 18.2. Cc: Dylan Baker <dylan.c.baker@intel.com> Cc: Juan A. Suarez <jasuarez@igalia.com> Cc: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Acked-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-24 19:49:08 +03:00
Andres Gomez	1391892e73	docs: move releases from Fridays to Wednesdays As discussed at: https://lists.freedesktop.org/archives/mesa-dev/2018-March/188525.html Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: Dylan Baker <dylan.c.baker@intel.com> Cc: Ian Romanick <ian.d.romanick@intel.com> Cc: Carl Worth <cworth@cworth.org> Cc: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Acked-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-24 19:48:01 +03:00
Andres Gomez	b0e49a9e7a	docs: correct typo in the submitting patches instructions Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-24 19:47:40 +03:00
Bas Nieuwenhuizen	28b8c18d84	radv: Still enable inmemory & API level caching if disk cache is not enabled. That we don't have a background disk cache does not mean we should prevent the app caching anything. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-24 18:06:41 +02:00
Jose Fonseca	04d77d53aa	gallium/tests: Don't ignore S3TC errors. Now we do full S3TC decompression they should no longer fail. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-24 15:58:14 +01:00
Harish Krupo	fd734608c3	egl: Fix missing clamping in eglSetDamageRegionKHR Clamp the x and y co-ordinates of the rectangles. v2: Clamp width/height after converting to co-ordinates (Ilia Mirkin) Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-24 14:46:21 +01:00
Erik Faye-Lund	c3eaf8fe57	forward precise-flag if supported New versions of virglrenderer support the precise-flag, so let's forward it from TGSI if that's the case. This fixes a few dEQP-GLES31 tests: - dEQP-GLES31.functional.tessellation.common_edge.quads_equal_spacing_precise - dEQP-GLES31.functional.tessellation.common_edge.quads_fractional_even_spacing_precise - dEQP-GLES31.functional.tessellation.common_edge.quads_fractional_odd_spacing_precise - dEQP-GLES31.functional.tessellation.common_edge.triangles_equal_spacing_precise - dEQP-GLES31.functional.tessellation.common_edge.triangles_fractional_even_spacing_precise - dEQP-GLES31.functional.tessellation.common_edge.triangles_fractional_odd_spacing_precise Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-24 10:27:27 +02:00
Marek Olšák	6853862a58	radeonsi: fix pk2h breakage	2018-07-23 22:29:59 -04:00
Marek Olšák	86b52d4236	radeonsi: reduce LDS stalls by 40% for tessellation 40% is the decrease in the LGKM counter (which includes SMEM too) for the GFX9 LSHS stage. This will make the LDS size slightly larger, but I wasn't able to increase the patch stride without corruption, so I'm increasing the vertex stride.	2018-07-23 20:23:52 -04:00
Tom Stellard	0866edede0	radeonsi: Add debug option to enable LLVM GlobalISel (v2) R600_DEBUG=gisel will tell LLVM to use GlobalISel rather than SelectionDAG for instruction selection. v2: mareko: move the helper to src/amd/common Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <tstellar@redhat.com>	2018-07-23 20:23:48 -04:00
Jason Ekstrand	820d5e51b7	intel/compiler: Account for built-in uniforms in analyze_ubo_ranges The original pass only looked for load_uniform intrinsics but there are a number of other places that could end up loading a push constant. One obvious omission was images which always implicitly use a push constant. Legacy VS clip planes also get pushed into the shader. This fixes some new Vulkan CTS tests that test random combinations of bindings and, in particular, test lots of UBOs and images together. Cc: mesa-stable@lists.freedesktop.org Cc: Kenneth Graunke <kenneth@whitecape.org>	2018-07-23 15:28:17 -07:00
Daniel Schürmann	62024fa775	radv: enable VK_KHR_16bit_storage extension / 16bit storage features Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:26 +02:00
Daniel Schürmann	4d0b02bb5a	ac: add support for 16bit load_push_constant Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:25 +02:00
Daniel Schürmann	b722b29f10	radv: add support for 16bit input/output Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:25 +02:00
Daniel Schürmann	87989339a0	nir: add 16bit type information to glsl types Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:25 +02:00
Daniel Schürmann	7e7ee82698	ac: add support for 16bit buffer loads v2: Fixed dvec3 loads (bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:25 +02:00
Daniel Schürmann	a6a21e651d	ac: add support for 16bit UBO loads Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:25 +02:00
Daniel Schürmann	3109c5257b	ac: add support for 16bit ssbo stores Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:25 +02:00
Daniel Schürmann	f582367d49	ac: add 16bit conversion operations Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 23:16:25 +02:00
Dave Airlie	d73f1026b4	r600: enable tess_input_info for TES There might be a nicer way to do this, but this is at least correct. This fixes: KHR-GL44.tessellation_shader.single.max_patch_vertices KHR-GL44.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn Reviewed-By: Gert Wollny <gert.wollny@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2018-07-23 21:11:35 +01:00
Dave Airlie	760622c328	docs/features: fix virgl gles3.1 entries	2018-07-24 06:10:46 +10:00
Roland Scheidegger	09828feab0	draw: force draw pipeline if there's more than 65535 vertices The pt emit path can only handle 65535 - the number of vertices is truncated to a ushort, resulting in a too small buffer allocation, which will crash. Forcing the pipeline path looks suboptimal, then again this bug is probably there ever since GS is supported, so it seems it's not happening often. (Note that the vertex_id in the vertex header is 16 bit too, however this is only used by the draw pipeline, and it denotes the emit vertex nr, and that uses vbuf code, which will only emit smaller chunks, so should be fine I think.) Other solutions would be to simply allow 32bit counts for vertex allocation, however 65535 is already larger than this was intended for (the idea being it should be more cache friendly). Or could try to teach the pt emit path to split the emit in smaller chunks (only the non-index path can be affected, since gs output is always linear), but it's a bit tricky (we don't know the primitive boundaries up-front). Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=107295 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-07-23 22:07:07 +02:00
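The truncation described in the draw commit above can be sketched in isolation. This is a minimal illustration, not the draw module's code: `MAX_EMIT_VERTICES`, `stored_vertex_count` and `must_use_pipeline` are hypothetical names introduced here, and the only assumption is that the count ends up in a 16-bit unsigned field as the message states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest vertex count a 16-bit unsigned field can represent. */
#define MAX_EMIT_VERTICES 65535u

/* Hypothetical: what a 16-bit field holds after storing a wider count.
 * Anything above 65535 wraps, so a buffer sized from it is too small. */
static uint16_t stored_vertex_count(unsigned count)
{
    return (uint16_t)count;
}

/* Hypothetical guard: take the slower path when the count would not fit. */
static bool must_use_pipeline(unsigned count)
{
    return count > MAX_EMIT_VERTICES;
}
```

With 65536 vertices the stored count wraps to 0, which is why a guard of this shape has to run before any allocation is sized from the 16-bit value.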
Dave Airlie	51f67eeb21	docs/features: note ARB_copy_image is working on virgl	2018-07-24 06:06:15 +10:00
Dave Airlie	83332618c1	Revert "virgl: remove unused stride-arguments" This reverts commit `dc938b8398`. This adds warnings in vtest, and possibly breaks it.	2018-07-24 06:03:20 +10:00
Dave Airlie	69c2cd0b14	docs/features: note ssbo and atomic counters done for virgl	2018-07-24 05:56:35 +10:00
Dave Airlie	958b57ac82	virgl: add initial shader_storage_buffer_object support. (v2) This adds the guest side support for ARB_shader_storage_buffer_object. Co-authors: Gurchetan Singh <gurchetansingh@chromium.org> v2: move to using separate maximums (fixup macros) Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2018-07-24 05:54:21 +10:00
Jason Ekstrand	e4d346c86d	nir: Add a couple trivial abs optimizations Spotted in a shader in Batman: Arkham City. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-23 10:48:21 -07:00
Caio Marcelo de Oliveira Filho	52d831ff83	glsl: remove delegating constructors to allow build with C++98 Delegating constructors is a C++11 feature, so this was breaking when compiling with C++98. Change the copy_propagation_state() calls that used the convenience constructor to use a static member function instead. Since copy_propagation_state is expected to be heap allocated, this change is a good fit. Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107305	2018-07-23 10:34:43 -07:00
Eric Anholt	6b73a97f84	v3d: Implement a small immediates optimization, based on VC4's. We can do one per instruction, and we have to be careful not to overwrite raddr_b, but this greatly reduces the pressure on uniform loads (particularly around ldvpm/stvpm instructions). total instructions in shared programs: 90768 -> 88220 (-2.81%) instructions in affected programs: 82711 -> 80163 (-3.08%)	2018-07-23 10:21:43 -07:00
Eric Anholt	79e0f042bc	v3d: Return an invalid src number if asked for a missing implicit uniform. Sometimes when iterating over sources, we might want to check if it's the implicit one. We wouldn't want to match on a non-implicit src using this function.	2018-07-23 10:21:43 -07:00
Eric Anholt	f2ea936f48	v3d: Skip emitting texture config parameter 2 if it's just the defaults. shader-db: total instructions in shared programs: 91275 -> 90768 (-0.56%) instructions in affected programs: 20702 -> 20195 (-2.45%)	2018-07-23 10:21:43 -07:00
Eric Anholt	421e99d777	v3d: Update an XXX comment for a path we handled in HW on V3D 4.x.	2018-07-23 10:21:43 -07:00
Eric Anholt	e7ae900341	v3d: Switch to using the new SFU instructions on V3D 4.x. These instructions let us write directly to the phys regfile, instead of just R4. That lets us avoid moving out of R4 to avoid conflicting with other SFU results, and to avoid conflicting with thread switches. There is still an extra instruction of latency, which is not represented in the scheduler at the moment. If you use the result before it's ready, the QPU will just stall, unlike the magic R4 mode where you'd read the previous value. That means that the following shader-db results aren't quite representative (since we now cause some stalls instead of emitting nops), but they're impressive enough that I'm happy with the change. total instructions in shared programs: 95669 -> 91275 (-4.59%) instructions in affected programs: 82590 -> 78196 (-5.32%)	2018-07-23 10:21:43 -07:00
Eric Anholt	58c1d3860f	v3d: Add QPU pack/unpack for the new SFU instructions. These instructions allow writing the result to any register, instead of a special writeback to r4.	2018-07-23 10:21:43 -07:00
Eric Anholt	cdfa99657d	v3d: Fix the name of the "flpop" operation. Noticed while trying to sort a new op into the appropriate place to match the documentation.	2018-07-23 10:21:43 -07:00
Eric Anholt	91e24e5718	v3d: Print the instruction we're testing in the QPU disasm/pack round-trip. If we fail initial disassembly, it's good to know what instruction it was that failed.	2018-07-23 10:21:42 -07:00
Eric Anholt	a1beb333d8	v3d: Drop unused vir_SAT() operation. We lower saturates in NIR.	2018-07-23 10:21:42 -07:00
Eric Anholt	8dfc6ee317	v3d: Rotate through registers to improve post-RA scheduling options. Similarly to VC4's implementation, by not picking r0 immediately upon freeing it, we give the scheduler more of a chance to fit later writes in earlier. I'm not clear on whether there's any real cost to picking phys over accumulators, so keep that behavior for now. shader-db: total instructions in shared programs: 96831 -> 95669 (-1.20%) instructions in affected programs: 77254 -> 76092 (-1.50%)	2018-07-23 10:21:42 -07:00
Eric Anholt	1fb31819ae	v3d: Allow reading from physical regs written in the previous instruction. This restriction existed in V3D 2.x, but lifting it was a major change in 3.x. shader-db results: total instructions in shared programs: 98117 -> 96831 (-1.31%) instructions in affected programs: 48520 -> 47234 (-2.65%)	2018-07-23 10:21:23 -07:00
Eric Engestrom	e6e22e4207	anv: remove unnecessary runtime copy of static string It's actually also a bit safer, since now the compiler will warn if the string is larger than the `.name` array. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-23 17:56:08 +01:00
Alex Smith	54f8f1545f	anv: Pay attention to VK_ACCESS_MEMORY_(READ\|WRITE)_BIT According to the spec, these should apply to all read/write access types (so would be equivalent to specifying all other access types individually). Currently, they were doing nothing. v2: Handle VK_ACCESS_MEMORY_WRITE_BIT in dstAccessMask. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-23 15:29:43 +01:00
Erik Faye-Lund	dc938b8398	virgl: remove unused stride-arguments The IOCTLs don't pass this along, so computing them in the first place is kinda pointless. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-07-23 11:21:09 +01:00
Samuel Pitoiset	6c58bc8d9c	radv: print a big warning when RADV_TRACE_FILE is set Users shouldn't use this debugging option except when we ask them to! Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 11:34:42 +02:00
Samuel Pitoiset	6e32d9e7b0	radv: fix a memleak for merged shaders on GFX9 modules[i] can be NULL for merged shaders but we have to free the NIR code. radv_can_dump_shader_stats() already handles if modules[i] is NULL, no need to check it twice. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-23 11:34:39 +02:00
Jason Ekstrand	d0ee0a0a5d	intel/blorp: Fix blits to R8G8B8_UNORM_SRGB sRGB harder The first fix attempt contained a nasty typo which somehow didn't get caught in review. It also didn't work as intended because the sRGB conversion was happening but then throwing away all but the red channel because it didn't know it was RGB. Really, it's my fault for trying to fix a bug without first writing tests. I've now written tests and they pass with this change. :) Fixes: `11712b9ca1` "intel/blorp: Fix blits to R8G8B8_UNORM_SRGB" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-23 00:36:39 -07:00
Jason Ekstrand	abd629eb3d	anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAV We've had several broadwell hangs that have come down to this bit just not working correctly. Most recently, we've had a pile of hangs reported with apps running under DXVK: https://github.com/doitsujin/dxvk/issues/469 Instead, use the bit that doesn't try to imply weird D3D coherency things and just force-enables the PS like we want. cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-22 23:43:19 -07:00
Jason Ekstrand	b99493c628	anv: Properly handle GetImageSubresourceLayout on complex images We support mipmapped and arrayed linear images so we need to support vkGetImageSubresourceLayout on them. Fortunately, it's just a trivial call into ISL. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-22 23:24:10 -07:00
Timothy Arceri	78f391d343	radeonsi/nir: make use of nir_lower_load_const_to_scalar() This allows NIR to CSE more operations. LLVM does this also so the impact is limited, however doing this in NIR allows other opts to make progress. For example some loops in Civilization Beyond Earth shaders are unrolled. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-23 09:48:51 +10:00
Ilia Mirkin	257128079c	anv/gen9: expose VK_EXT_post_depth_coverage Note that the use of ICMS_INNER_CONSERVATIVE disagrees with the GL driver. Perhaps it's more performant than ICMS_NORMAL and is otherwise permitted? Not sure, so I left it as-is. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-22 14:56:44 -07:00
Ilia Mirkin	768f143667	spirv: add support for SPV_KHR_post_depth_coverage Allow the capability to be exposed, and convert the new execution mode into fs state. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-22 14:56:36 -07:00
Mauro Rossi	6cbbd5b4f8	android: util/disk_cache: fix building errors in gallium drivers This patch applies the necessary changes in Android.common.mk as per automake rules, to avoid following building error: external/mesa/src/gallium/drivers/nouveau/nouveau_screen.c:159:8: error: implicit declaration of function 'disk_cache_get_function_timestamp' is invalid in C99 [-Werror,-Wimplicit-function-declaration] if (disk_cache_get_function_timestamp(nouveau_disk_cache_create, ^ 1 error generated. (v2) -DENABLE_SHADER_CACHE Android cflag is kept, to leave the AS-IS capability enabled Fixes: `cc10b34` ("util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache.") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-21 12:06:38 +02:00
Chih-Wei Huang	e7ffd3fb08	Android: fix a missing nir_intrinsics.h error The commit `76dfed8ae2` changed nir_intrinsics.h to be a generated header, but the corresponding dependency was not updated for Android. It causes the error: [ 0% 19/4336] target C: libmesa_pipe_radeonsi <= external/mesa/src/gallium/drivers/radeonsi/si_debug.c ... In file included from external/mesa/src/gallium/drivers/radeonsi/si_debug.c:25: In file included from external/mesa/src/gallium/drivers/radeonsi/si_pipe.h:28: In file included from external/mesa/src/gallium/drivers/radeonsi/si_shader.h:140: In file included from external/mesa/src/amd/common/ac_llvm_build.h:30: external/mesa/src/compiler/nir/nir.h:966:10: fatal error: 'nir_intrinsics.h' file not found ^~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `76dfed8ae2` ("nir: mako all the intrinsics") Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Mauro Rossi <issor.oruam@gmail.com>	2018-07-21 08:50:23 +02:00
Bas Nieuwenhuizen	e1febbefe8	nir: Fix end of function without return warning/error. There always is a continue block, so let us just do unreachable. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: `8cacf38f52` "nir: Do not use continue block after removing it." CC: 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107312	2018-07-20 22:27:39 +02:00
Danylo Piliaiev	d24c35c3fb	st: Sweep NIR after linking phase to free held memory After optimization passes and many transformations, most of the memory NIR holds is garbage which was being freed only after shader deletion. Freeing it at the end of linking will save memory, which would be useful in case there are a lot of complex shaders being compiled. The common case for this issue is a 32bit game running under Wine. The cost of the optimization is around 3-5% of compilation speed with complex shaders. Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-20 11:26:12 -07:00
Eric Anholt	945524ba0e	st/dri: Don't require a dri_format for image creation. Nothing in EGL_KHR_gl_image.txt seems to let us deny creation based on formats, and doing so causes many failures in dEQP-EGL.functional.image.api.* The NONE value we were protecting from only gets looked at in the __DRI_IMAGE_ATTRIB_FORMAT and __DRI_IMAGE_ATTRIB_FOURCC queries, which are used from wayland and gbm (which throw an error cleanly on unknown format) and DMABUF export. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-20 11:26:12 -07:00
Eric Anholt	f6750456c5	egl: Refuse EGL_MESA_image_dma_buf_export if we don't have a DRM fourcc. The EGL CTS expects that you can make images from all sorts of things, including things like z16 and s8, which we don't have DRM fourccs for. Just return an error when trying to export one of those. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-20 11:26:12 -07:00
Eric Anholt	a221f9709e	v3d: Fix incorrect handling of two fences created back-to-back. Recreating our context's syncobj with ALREADY_SIGNALED meant that if you created two fences in a row, then waiting on the second would succeed immediately. Instead, export a sync file in the gallium fence (since we don't have a syncobj clone ioctl), and just create a new syncobj to wait on whenever we need to. Noticed while debugging dEQP-GLES3.functional.fence_sync.client_wait_sync_finish	2018-07-20 11:11:29 -07:00
Eric Anholt	fc28692a5a	v3d: Fix the timeout value passed to drmSyncobjWait(). The API wants an absolute time, so we need to go add gallium's argument to CLOCK_MONOTONIC.	2018-07-20 11:11:29 -07:00
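The relative-to-absolute conversion mentioned in the timeout commit above can be sketched as a pure function. This is an assumption-laden illustration rather than the v3d code: `absolute_timeout_ns` is a hypothetical helper, `now_ns` stands for a non-negative CLOCK_MONOTONIC reading in nanoseconds that the caller obtains, and saturating at `INT64_MAX` is a choice made here to avoid signed overflow on very large timeouts.

```c
#include <stdint.h>

/* Hypothetical helper: turn a relative timeout into an absolute deadline.
 * now_ns is assumed to be a non-negative CLOCK_MONOTONIC reading. */
static int64_t absolute_timeout_ns(int64_t now_ns, uint64_t rel_ns)
{
    /* Saturate instead of overflowing when the sum does not fit. */
    if (rel_ns > (uint64_t)(INT64_MAX - now_ns))
        return INT64_MAX;
    return now_ns + (int64_t)rel_ns;
}
```

Passing the relative value straight through would be read as a deadline that is almost always already in the past, so the wait would return without blocking.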
Eric Anholt	4f04bd68cf	v3d: Fix drmSyncobjWait() return value checking even more. It tends to return >0 in the success case (I think the value is something like "how much of the timeout remained"). Fixes dEQP-GLES3.functional.fence_sync.client_wait_sync_finish	2018-07-20 11:11:29 -07:00
Eric Anholt	2f90879a34	v3d: Use the list_first_entry/list_last_entry macros.	2018-07-20 11:11:29 -07:00
Eric Anholt	d0e53373e5	v3d: Move BO cache counting to dump time instead of cache management. This is one less way to get the dump stats wrong.	2018-07-20 11:11:29 -07:00
Eric Anholt	7d6aef6fa5	v3d: Reduce the stale BO reclamation spam with dump_stats set. This was obviously meant to be when we were actually freeing a BO, not just when there was at least one BO in the list.	2018-07-20 11:11:29 -07:00
Eric Anholt	5d11094db1	v3d: Respect a sampler view's first_layer field. Fixes texturing from EGL images created from cubemap faces, as in dEQP-EGL.functional.image.create.gles2_cubemap_negative_x_rgba_texture	2018-07-20 11:11:29 -07:00
Sonny Jiang	c6737756ad	radeonsi: emit_spi_map packets optimization v2: marek: remove an empty line before break; rename reg_val_seq -> spi_ps_input_cntl "type * x" -> "type *x" Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-20 13:50:26 -04:00
Gert Wollny	4d094993c3	virgl: Expose GL_ARB_copy_image if host supports it Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-07-20 19:15:12 +02:00
Gert Wollny	0bde9739c0	virgl: Allow RGB32* textures only as buffer objects When requesting a texture of the internal format GL_RGB32F Gallium will try to allocate a renderable texture and returns RGBA32F or RGBX32F, but when one requests GL_RGB32I or GL_RGB32UI the according 3-component texture will be returned. This leads to problems later, when one wants to use glCopyImageSubData to copy data between these textures that should be compatible, but given the way virgl and Gallium handle this the latter fails with an assertion, because the per-texel bit size is different. By allowing the GL_RGB32* only for texture buffers these problems are avoided without losing the ARB_tbo_rgb32 extension (thanks Ilia Mirkin). v2: Correct spelling (Gurchetan Singh) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-07-20 19:12:49 +02:00
Lionel Landwerlin	feb43ef674	intel: tools: dump: protect against multiple calls on destructor When running gdb, make sure to pass the LD_PRELOAD variable only to the executed program, not the debugger. Otherwise the debugger will run the preloaded constructor/destructor too and bad things will happen. Suggested-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-20 17:36:56 +01:00
Lionel Landwerlin	2a9069eb97	intel: tools: dump: make dump tool reliable under gdb The problem with passing the configuration of the dump lib through a file descriptor is that it can be read only once. But under gdb you might want to rerun your program multiple times. This change hands the configuration through a temporary file that is deleted once the command line passed to intel_dump_gpu has exited. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-20 17:36:37 +01:00
Samuel Pitoiset	1efc9094e0	radv: don't flush DB before subpass FS resolves That shouldn't be needed because the DB state is invalid. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 17:30:13 +02:00
Gert Wollny	016807161b	r600: Correct evaluation of cube array index and face The array index needs to be corrected and it must be ensured that it is rounded and its value is non-negative before it is combined with the face id. v5: Use RNDNE instead of ADD 0.5 and FLOOR (Ilia Mirkin) v6: Fix type (Roland Scheidegger) Fixes 182 from android/cts/master/gles31-master.txt: dEQP-GLES31.functional.texture.filtering.cube_array.formats.* dEQP-GLES31.functional.texture.filtering.cube_array.sizes.* dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_* dEQP-GLES31.functional.texture.filtering.cube_array.combinations.linear_mipmap_* dEQP-GLES31.functional.texture.filtering.cube_array.no_edges_visible.* Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-20 14:55:12 +02:00
Gert Wollny	01766c1db6	r600: correct texture offset for array index lookup Correct the array index for TEXTURE_1D_ARRAY, and TEXTURE_2D_ARRAY The standard says the array index is evaluated according to floor(z + 0.5) but RNDNE is sufficient also for the test cases where z is close to 1.5 and it is likely to hit 1.5, the corner case where RNDNE gives a result different from above formula. v5: - Use RNDNE instead of ADD 0.5 and FLOOR (Ilia Mirkin) - update commit message Fixes 325 tests from android/cts/master/gles3-master.txt: dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darray dEQP-GLES3.functional.shaders.texture_functions.textureoffset.sampler2darray dEQP-GLES3.functional.shaders.texture_functions.texturelod.sampler2darray* dEQP-GLES3.functional.shaders.texture_functions.texturelodoffset.sampler2darray dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler2darray dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.sampler2darray dEQP-GLES3.functional.texture.filtering.2d_array.formats.* dEQP-GLES3.functional.texture.filtering.2d_array.sizes.* dEQP-GLES3.functional.texture.filtering.2d_array.combinations.* dEQP-GLES3.functional.texture.shadow.2d_array.* dEQP-GLES3.functional.texture.vertex.2d_array.* Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-20 14:55:12 +02:00
Gert Wollny	626bd455d4	r600: Delay emission of texture gradients and lookup offsets Gradients used in texture lookups and the offsets must reside in the same fetch clause (the first is imposed by the hardware and the second is expected by sb). In order to ensure that no ALU clause is inserted between emission and use of these, delay the emission of these instructions until the texture instruction using them is also emitted. This is needed in preparation for the correction of the texture array indices. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-20 14:55:12 +02:00
Bas Nieuwenhuizen	cc10b34e9e	util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache. radv always needs it, so just check the header instead. Also do not declare the function if the variable is not set, so we get a nice compile error instead of failing to open a device at runtime. Fixes: `b87ef9e606` "util: fix MSVC build issue in disk_cache.h" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-20 12:09:19 +02:00
Bas Nieuwenhuizen	8cacf38f52	nir: Do not use continue block after removing it. Reinserting code directly before a jump means the block gets split and merged, removing the original block and replacing it in the process. Hence keeping a pointer to the continue block over a reinsert causes issues. This code changes nir_opt_if to simply look for the new continue block. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107275 CC: 18.1 <mesa-stable@lists.freedesktop.org>	2018-07-20 12:09:19 +02:00
Samuel Pitoiset	ce454d02cc	radv: simplify a condition in radv_src_access_flush() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:17 +02:00
Samuel Pitoiset	1ff25c4e6b	radv: save current state just before resolving with FS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:15 +02:00
Samuel Pitoiset	c3d5f124c6	radv: don't check if a subpass has resolve attachments twice We already check that in radv_cmd_buffer_resolve_subpass(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:13 +02:00
Samuel Pitoiset	0a8127bbfb	radv: make use of radv_subpass_barrier() when resolving subpasses The goal is to use radv_barrier()/radv_subpass_barrier() as much as possible for further optimizations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-20 10:17:11 +02:00
Rhys Perry	409a60df3b	nv50/ir: move LateAlgebraicOpt back to right after ConstantFolding total instructions in shared programs : 5480808 -> 5472107 (-0.16%) total gprs used in shared programs : 647530 -> 647532 (0.00%) total shared used in shared programs : 389120 -> 389120 (0.00%) total local used in shared programs : 21064 -> 21064 (0.00%) total bytes used in shared programs : 58551648 -> 58459352 (-0.16%) local shared gpr inst bytes helped 0 0 73 2609 2609 hurt 0 0 71 34 34	2018-07-19 23:34:58 +02:00
Rhys Perry	2afef231db	nv50/ir: handle SHLADD in IndirectPropagation An alternative solution to the problem fixed in `0bd83d0` ("nv50/ir: move LateAlgebraicOpt to the very end"). total instructions in shared programs : 5481195 -> 5480808 (-0.01%) total gprs used in shared programs : 647535 -> 647530 (-0.00%) total shared used in shared programs : 389120 -> 389120 (0.00%) total local used in shared programs : 21064 -> 21064 (0.00%) total bytes used in shared programs : 58555784 -> 58551648 (-0.01%) local shared gpr inst bytes helped 0 0 2 34 34 hurt 0 0 0 0 0	2018-07-19 23:34:58 +02:00
Rhys Perry	3b6edd0b59	gm107/ir: use CS2R for SV_CLOCK This instruction seems to be faster than S2R and requires no barrier, though the range of special registers it can read from is limited. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-07-19 23:34:58 +02:00
Lionel Landwerlin	94cf964586	intel: tools: dump: remove mentions of intel_aubdump Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-19 20:12:53 +01:00
Lionel Landwerlin	0f9d8b754f	intel: tools: aubwrite: fix invalid frees on finish Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-07-19 20:11:56 +01:00
Samuel Pitoiset	3d41757788	ac/nir: add a workaround for bitfield_extract when count is 0 LLVM 7 returns incorrect results when count is 0, something has been broken since LLVM 6. Of course, the best solution is to fix LLVM but this workaround works as expected for now. Original workaround by Philippe Rebohle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107276 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-19 20:41:10 +02:00
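For reference, the behaviour the bitfield_extract workaround has to preserve can be written as a small scalar model. This is a sketch of the expected semantics (GLSL defines the result of extracting zero bits as zero), not the LLVM IR the commit emits: `ubitfield_extract` is a hypothetical name, and the explicit `bits == 0` branch is only one possible shape for such a guard.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of an unsigned bitfield extract. */
static uint32_t ubitfield_extract(uint32_t value, unsigned offset, unsigned bits)
{
    assert(offset + bits <= 32);

    if (bits == 0)
        return 0;              /* zero-width field: result is defined as 0 */
    if (bits == 32)
        return value;          /* offset must be 0; avoids shifting by 32 */
    return (value >> offset) & ((1u << bits) - 1u);
}
```

For example, `ubitfield_extract(0xABCD, 4, 8)` yields `0xBC`, while any call with `bits == 0` yields 0 regardless of `value` and `offset`.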
Nanley Chery	e2e32b6afd	intel/isl/gen4: Make depth/stencil buffers Y-Tiled Rendering to a linear depth buffer on gen4 is causing a GPU hang in the CI system. Until a better explanation is found, assume that errata is applicable to all gen4 platforms. Fixes `fbe01625f6` ("i965/miptree: Share tiling_flags in miptree_create"). Reported-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107248 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-19 11:05:07 -07:00
Nanley Chery	44ab26d0c9	i965/misc: Use depth/stencil surf's tiling on gen4-5 Make the 3D engine aware of the depth/stencil surface's tiling before doing any render operations. Fixes `fbe01625f6` ("i965/miptree: Share tiling_flags in miptree_create"). Reported-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107248 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-19 11:05:07 -07:00
Caio Marcelo de Oliveira Filho	507a8037a7	glsl: don't let an 'if' then-branch kill copy propagation (elements) for else-branch When handling 'if' in copy propagation elements, if a certain variable was killed when processing the first branch of the 'if', then the second would not get any propagation from previous nodes. x = y; if (...) { z = x; // This would turn into z = y. x = 22; // x gets killed. } else { w = x; // This would NOT turn into w = y. } With the change, we let copy propagation happen independently in the two branches and only then apply the killed values for the subsequent code. One example in shader-db part of shaders/unity/8.shader_test: (assign (xyz) (var_ref col_1) (var_ref tmpvar_8) ) (if (expression bool < (swiz y (var_ref xlv_TEXCOORD0) )(constant float (0.000000)) ) ( (assign (xyz) (var_ref col_1) (expression vec3 + (var_ref tmpvar_8) ... ) ... ) ) ( (assign (xyz) (var_ref col_1) (expression vec3 lrp (var_ref col_1) ... ) ... ) )) The variable col_1 was replaced by tmpvar_8 in the then-part but not in the else-part. NIR deals well with copy propagation, so it already covered for the missing ones that this patch fixes. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-19 10:00:59 -07:00
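The branch handling described in the copy propagation commit above can be modelled with a toy state. This is a simplified sketch, not the GLSL pass: it tracks whole variables rather than per-channel elements, and every name in it (`copy_state`, `branch_from`, `merge_kills`, `run_example`, the `VAR_*` constants) is hypothetical. Each branch starts from its own clone of the state before the `if`, and kills from both branches are applied to the parent only afterwards.

```c
#include <stdbool.h>

enum { VAR_X, VAR_Y, VAR_Z, VAR_W, NUM_VARS };
#define NO_COPY (-1)

struct copy_state {
    int  copy_of[NUM_VARS];  /* variable this one is a known copy of, or NO_COPY */
    bool killed[NUM_VARS];   /* written to since this state was created */
};

static void state_init(struct copy_state *s)
{
    for (int v = 0; v < NUM_VARS; v++) {
        s->copy_of[v] = NO_COPY;
        s->killed[v] = false;
    }
}

/* A write to var invalidates copies of var and copies taken from var. */
static void kill_var(struct copy_state *s, int var)
{
    s->killed[var] = true;
    s->copy_of[var] = NO_COPY;
    for (int v = 0; v < NUM_VARS; v++)
        if (s->copy_of[v] == var)
            s->copy_of[v] = NO_COPY;
}

/* What a read of var can be replaced with. */
static int resolve(const struct copy_state *s, int var)
{
    return s->copy_of[var] != NO_COPY ? s->copy_of[var] : var;
}

/* Models "dst = src". */
static void assign_copy(struct copy_state *s, int dst, int src)
{
    int source = resolve(s, src);
    kill_var(s, dst);
    s->copy_of[dst] = source;
}

/* Each branch gets its own clone of the pre-'if' state. */
static struct copy_state branch_from(const struct copy_state *parent)
{
    struct copy_state s = *parent;
    for (int v = 0; v < NUM_VARS; v++)
        s.killed[v] = false;
    return s;
}

/* Kills from either branch apply to the code after the 'if'. */
static void merge_kills(struct copy_state *parent,
                        const struct copy_state *then_s,
                        const struct copy_state *else_s)
{
    for (int v = 0; v < NUM_VARS; v++)
        if (then_s->killed[v] || else_s->killed[v])
            kill_var(parent, v);
}

/* Replays the commit message's example and reports what each read of x sees. */
static void run_example(int *then_reads, int *else_reads, int *after_reads)
{
    struct copy_state parent;
    state_init(&parent);
    assign_copy(&parent, VAR_X, VAR_Y);          /* x = y; */

    struct copy_state then_s = branch_from(&parent);
    *then_reads = resolve(&then_s, VAR_X);       /* z = x; */
    assign_copy(&then_s, VAR_Z, VAR_X);
    kill_var(&then_s, VAR_X);                    /* x = 22; */

    struct copy_state else_s = branch_from(&parent);
    *else_reads = resolve(&else_s, VAR_X);       /* w = x; */
    assign_copy(&else_s, VAR_W, VAR_X);

    merge_kills(&parent, &then_s, &else_s);
    *after_reads = resolve(&parent, VAR_X);      /* read of x after the 'if' */
}
```

In this model both branches resolve `x` to `y`, and only the code after the `if` sees `x` as killed, which is the behaviour the commit message describes.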
Caio Marcelo de Oliveira Filho	e4f32dec23	glsl: change opt_copy_propagation_elements data structures Instead of keeping multiple acp_entries in lists, have a single acp_entry per variable. With this, the implementation of clone is more convenient and now fully implemented. In the previous code, clone was only partial. Before this patch, each acp_entry struct represented a write to a variable including LHS, RHS and a mask of what channels were written to. There were two main hash tables, the first (lhs_ht) stored a list of acp_entries per LHS variable, with the values available to copy for that variable; the second (rhs_ht) was a "reverse index" for the first hash table, so stored acp_entries per RHS variable. After the patch, there's a single acp_entry struct per LHS variable, it contains an array with references to the RHS variables per channel. There now is a single hash table, from LHS variable to the corresponding entry. The "reverse index" is stored in the ACP entry, in the form of a set of variables that copy from the LHS. To make the clone operation cheaper, the ACP entries are created on demand. This should not change the result of copy propagation, a later patch will take advantage of the clone operation. v2: Add note clarifying how the hashtable is destroyed. v3: (all from Eric Anholt) Add remove_unused_var_from_dsts() function for reuse. Remove from dsts as we go instead of clearing at the end. Add clarifying comment to erase(). Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-19 10:00:30 -07:00
Caio Marcelo de Oliveira Filho	7b0d395250	glsl: separate copy propagation state Separate higher level logic of visiting instructions and choosing when to store and use new copy data from the data structure holding the copy propagation information. This will also make later patches that change the structure easier. v2: Remove empty destructor and clarify how hash tables are destroyed. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-19 10:00:30 -07:00
Lionel Landwerlin	49e86f09fe	intel: tools: dump: trace memory writes Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-19 16:48:42 +01:00
Lionel Landwerlin	5ba3e5c358	intel: tools: dump: remove command execution feature In commit `86cb05a6d3` ("intel: aubinator: remove standard input processing option") we removed the ability to process aub as an input stream because we now rely on mmapping the aub file to back the buffers aubinator is parsing. intel_aubdump was the provider of the standard input data and since we've copied/reworked intel_aubdump into intel_dump_gpu within Mesa, we don't need that code anymore. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-19 10:11:54 +01:00
Danylo Piliaiev	494a206229	radv: Fix incorrect assumption about ternary operator precedence Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 10:04:27 +02:00
Marek Olšák	dcbcc83003	mesa: fix make check for AMD_performance_monitor	2018-07-19 01:17:01 -04:00
Marek Olšák	f097f0c55c	mesa: remove dead code from api_loopback This should only contain functions not set in vtxfmt.c. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 01:10:32 -04:00
Marek Olšák	987c2ece03	mesa: expose ARB_indirect_parameters in the compatibility profile Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1) v2: fix dispatch_sanity	2018-07-19 01:10:18 -04:00
Marek Olšák	d40188800e	vbo: fix ARB_multi_draw_indirect for the compatibility profile Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 00:58:51 -04:00
Marek Olšák	6c4652ea8a	mesa: expose ARB_shader_viewport_layer_array in the compatibility profile no changes needed for GL compat Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 00:58:51 -04:00
Marek Olšák	da528898bc	mesa: expose ARB_ES3_1_compatibility in the compatibility profile no changes needed for GL compat Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-19 00:58:51 -04:00
Marek Olšák	565dacc3d6	winsys/amdgpu: remove RADEON_SURF_FMASK leftover RADEON_SURF_FMASK is never set.	2018-07-19 00:58:51 -04:00
Marek Olšák	9b82d128c9	ac: run LLVM optimization passes only on the final function after inlining	2018-07-19 00:58:49 -04:00
Bas Nieuwenhuizen	17b5a59b4e	radv: Enable binning and dfsm by default on Raven. Seems like it increases performance by 2-3% for some demos and games. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:38:21 +02:00
Bas Nieuwenhuizen	978570769d	radv: Always set disable zpass increment bit when possible. When no occlusion queries are active even if out of order is enabled. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:38:10 +02:00
Bas Nieuwenhuizen	82664af6cf	radv: Select correct entries for binning. Overshot it by one every time. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:38:01 +02:00
Bas Nieuwenhuizen	760211b77c	radv: Fix number of samples used for binning. Used the wrong register ... CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:37:54 +02:00
Bas Nieuwenhuizen	c0144e915a	radv: Disable disabled color buffers in rbplus opts. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-19 02:37:47 +02:00
Marek Olšák	fb049742d6	r600: silence the signed overflow warning like radeonsi r600_gpu_load.c: In function ‘r600_gpu_load_thread’: ../../../../src/util/os_time.h:82:7: warning: assuming signed overflow does not occur when assuming that (X + c) >= X is always true [-Wstrict-overflow] if (start <= end)	2018-07-18 17:48:48 -04:00
Andres Rodriguez	d3d9513556	radv: fix wmaybe-uninitialized in radv_meta_fast_clear.c Assignment and usage of this variable both happen inside if(rad_image_has_dcc()) {} blocks. It seems gcc plays it safe and assumes that both function calls could have different return values. But in this case we should be safe. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-18 15:32:51 -04:00
Sonny Jiang	4bf7234061	radeonsi: emit_guardband packets optimization Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-18 15:04:27 -04:00
Sonny Jiang	80ade05b8d	radeonsi: Save CLEAR_STATE initial values for optimization Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-18 15:04:27 -04:00
Jan Vesely	9baacf3fa7	radeonsi: Refuse to accept code with unhandled relocations They might lead to unrecoverable GPU hang. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable@lists.freedesktop.org	2018-07-18 13:56:56 -04:00
Eric Anholt	70534dbe29	Allow AMD_perfmon on GLES contexts v2: whitespace alignment fix Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:39:21 -07:00
Eric Anholt	4ba478d7cd	egl: Use the canonical drm-uapi fourcc header to avoid local defines. We should only use a #define locally once it's been upstreamed, and at that point you should just update our drm_fourcc.h. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-18 10:37:54 -07:00
Eric Anholt	2c6279d58b	v3d: Fix tiling modifier support to use the new UIF define. You can't use T tiled buffers on V3D 3.x and newer, it's been replaced with a newer layout shared with other hardware blocks.	2018-07-18 10:37:49 -07:00
Eric Anholt	6c0482e176	drm-uapi: Update drm_fourcc.h for new format modifiers. This brings in the Broadcom VC4 SAND and V3D 3.x+ UIF modifiers, from drm-next commit 4da1d4c751c9b1b713c13043bad7c4d27cd1418c.	2018-07-18 10:37:49 -07:00
Marek Olšák	201ebf51d1	st/mesa: notify u_vbuf/driver that draw index bounds are unknown for indirect Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-18 13:33:30 -04:00
Timothy Pearson	e1621fda84	radeonsi: Use signed char for color_interp_vgpr_index color_interp_vgpr_index was declared as a generic char value. Because signed values are used in this variable, the result was not safe across architectures and crashed on ppc64[el] and arm. Declare color_interp_vgpr_index as a signed type. Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-18 13:31:29 -04:00
Jason Ekstrand	aaa6fac8f6	intel/blorp: Take an explicit filter parameter in blorp_blit This lets us move the glBlitFramebuffer nonsense into the GL driver and make the usage of BLORP much more explicit and obvious as to what it's doing. Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-07-18 09:47:28 -07:00
Jason Ekstrand	9fbe2a2007	intel/blorp: Add a blorp_filter enum for use in blorp_blit At the moment, this is entirely internal but we'll expose it to clients of the BLORP API in the next commit. Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-07-18 09:47:28 -07:00
Caio Marcelo de Oliveira Filho	ea556471a1	intel/tools: add missing include for stdarg.h Fixes build in GCC 8.1.1: FAILED: src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o gcc -Isrc/intel/tools/src@intel@tools@@intel_dump_gpu@sha -Isrc/intel/tools -I../../src/intel/tools -Isrc/../include -I../../src/../include -Isrc -I../../src -Isrc/mapi -I../../src/mapi -Isrc/mesa -I../../src/mesa -I../../src/gallium/include -I../../src/gallium/auxiliary -Isrc/intel -I../../src/intel -I../../include/drm-uapi -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -std=c99 -O2 -g -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS '-DVERSION="18.2.0-devel"' -DPACKAGE_VERSION=VERSION '-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"' -DGLX_USE_TLS -DENABLE_ST_OMX_BELLAGIO=0 -DENABLE_ST_OMX_TIZONIA=0 -DHAVE_X11_PLATFORM -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING -DGLX_USE_DRM -DHAVE_DRM_PLATFORM -DHAVE_SURFACELESS_PLATFORM -DENABLE_SHADER_CACHE -DHAVE___BUILTIN_BSWAP32 -DHAVE___BUILTIN_BSWAP64 -DHAVE___BUILTIN_CLZ -DHAVE___BUILTIN_CLZLL -DHAVE___BUILTIN_CTZ -DHAVE___BUILTIN_EXPECT -DHAVE___BUILTIN_FFS -DHAVE___BUILTIN_FFSLL -DHAVE___BUILTIN_POPCOUNT -DHAVE___BUILTIN_POPCOUNTLL -DHAVE___BUILTIN_UNREACHABLE -DHAVE_FUNC_ATTRIBUTE_CONST -DHAVE_FUNC_ATTRIBUTE_FLATTEN -DHAVE_FUNC_ATTRIBUTE_MALLOC -DHAVE_FUNC_ATTRIBUTE_PURE -DHAVE_FUNC_ATTRIBUTE_UNUSED -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT -DHAVE_FUNC_ATTRIBUTE_WEAK -DHAVE_FUNC_ATTRIBUTE_FORMAT -DHAVE_FUNC_ATTRIBUTE_PACKED -DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL -DHAVE_FUNC_ATTRIBUTE_VISIBILITY -DHAVE_FUNC_ATTRIBUTE_ALIAS -DHAVE_FUNC_ATTRIBUTE_NORETURN -D_GNU_SOURCE -DUSE_SSE41 -DUSE_GCC_ATOMIC_BUILTINS -DUSE_X86_64_ASM -DMAJOR_IN_SYSMACROS -DHAVE_SYS_SYSCTL_H -DHAVE_LINUX_FUTEX_H -DHAVE_ENDIAN_H -DHAVE_STRTOF -DHAVE_MKOSTEMP -DHAVE_POSIX_MEMALIGN -DHAVE_TIMESPEC_GET -DHAVE_MEMFD_CREATE -DHAVE_STRTOD_L -DHAVE_DLADDR -DHAVE_DL_ITERATE_PHDR -DHAVE_ZLIB 
-DHAVE_PTHREAD -DHAVE_LIBDRM -DHAVE_LLVM=0x0600 -DMESA_LLVM_VERSION_PATCH=1 -DHAVE_VALGRIND -DHAVE_LIBUNWIND -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED -DHAVE_DRI3 -DHAVE_DRI3_MODIFIERS -Wall -Werror=implicit-function-declaration -Werror=missing-prototypes -fno-math-errno -fno-trapping-math -Wno-missing-field-initializers -fPIC -fvisibility=hidden -Wno-override-init -MD -MQ 'src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o' -MF 'src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o.d' -o 'src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o' -c ../../src/intel/tools/aub_write.c ../../src/intel/tools/aub_write.c: In function ‘fail_if’: ../../src/intel/tools/aub_write.c:243:4: error: implicit declaration of function ‘va_start’; did you mean ‘assert’? [-Werror=implicit-function-declaration] va_start(args, format); ^~~~~~~~ assert ../../src/intel/tools/aub_write.c:245:4: error: implicit declaration of function ‘va_end’; did you mean ‘rand’? [-Werror=implicit-function-declaration] va_end(args); ^~~~~~ rand cc1: some warnings being treated as errors Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-18 09:19:22 -07:00
Jason Ekstrand	2be30a1a39	intel/tools: Rename error2aub to intel_error2aub Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-18 09:03:05 -07:00
Danylo Piliaiev	d219521379	i965: Sweep NIR after linking phase to free held memory After optimization passes and many transformations most of the memory NIR holds is garbage which was being freed only after shader deletion. Freeing it at the end of linking will save memory which would be useful in case there are a lot of complex shaders being compiled. The common case for this issue is a 32-bit game running under Wine. The cost of the optimization is around 3-5% of compilation speed with complex shaders. V2: by Jason Ekstrand - Move nir_sweep up, right after the last change of NIR Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103274 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org	2018-07-18 09:00:18 -07:00
Marek Olšák	51d6b163da	winsys/amdgpu: fix VDPAU interop by having one amdgpu_winsys_bo per BO (v2) Dependencies between rings are inserted correctly if a buffer is represented by only one unique amdgpu_winsys_bo instance. Use a hash table keyed by amdgpu_bo_handle to have exactly one amdgpu_winsys_bo per amdgpu_bo_handle. v2: return offset and stride properly Tested-by: Leo Liu <leo.liu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2018-07-18 11:56:28 -04:00
Marek Olšák	e06b8ec106	winsys/amdgpu: use a better hash_pointer function Tested-by: Leo Liu <leo.liu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2018-07-18 11:56:28 -04:00
Marek Olšák	53684e9163	winsys/amdgpu: clean up error handling in amdgpu_bo_from_handle Tested-by: Leo Liu <leo.liu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2018-07-18 11:56:28 -04:00
Marek Olšák	a73e3d5e00	winsys/amdgpu: shorten bo->ws in amdgpu_bo_destroy Tested-by: Leo Liu <leo.liu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2018-07-18 11:56:28 -04:00
Jason Ekstrand	6a60beba40	intel/tools: Add an error state to aub translator Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-18 08:42:53 -07:00
Jason Ekstrand	d6ad32600e	intel/tools: Break aub file writing into a helper Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-18 08:42:50 -07:00
Jason Ekstrand	0a457d987e	intel/tools: Refactor aub dumping to remove singletons Instead of having quite so many singletons, we use a struct aub_file to organize the bits we need for writing an aub file. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-18 08:42:46 -07:00
Jason Ekstrand	6953d7f5d2	intel/dump_gpu: Fix corner cases in PPGTT range calculations For large buffers which span an entire l1 page table, we got the range calculations wrong. In this case, we end up with an l1_start which is the first byte represented by the given l1 table and an l1_end which is the first byte after the range represented by the l1 table. Then l2_start_index == L2_index(l2_end) due to roll-over. Instead, compute lN_end using (1Ull << shift) - 1 so that lN_end is the last byte in the range represented by the Nth level page table. When we do this, we don't need the conditional expression anymore. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-18 08:42:38 -07:00
Caio Marcelo de Oliveira Filho	322fa3e5be	intel/blorp: fix uninitialized variable warning Compiler doesn't pick up that level and start_layer will be defined, so do as was done for num_layers in `4d8b476fa9` "intel/blorp: Fix compiler warning about num_layers." and always set it. Fixes warning ../../src/mesa/drivers/dri/i965/brw_blorp.c: In function ‘brw_blorp_clear_depth_stencil’: ../../src/mesa/drivers/dri/i965/brw_blorp.c:1439:4: warning: ‘start_layer’ may be used uninitialized in this function [-Wmaybe-uninitialized] blorp_clear_depth_stencil(&batch, &depth_surf, &stencil_surf, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ level, start_layer, num_layers, ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ x0, y0, x1, y1, ~~~~~~~~~~~~~~~ (mask & BUFFER_BIT_DEPTH), ctx->Depth.Clear, ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ stencil_mask, ctx->Stencil.Clear); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../src/mesa/drivers/dri/i965/brw_blorp.c:1439:4: warning: ‘level’ may be used uninitialized in this function [-Wmaybe-uninitialized] Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	3bf19bfdc6	util/string_buffer: fix warning in tests And also specify the maximum size when writing to static buffers. The warning below refers to the case where "str5" could be larger than "str5 - str4", then the strcat would have overlapping dst and src. Compiler doesn't pick up the bound from the snprintf above, so we make clear the bounds of str5 by using strncat() instead of strcat(). ../../src/util/tests/string_buffer/string_buffer_test.cpp: In member function ‘virtual void string_buffer_string_buffer_tests_Test::TestBody()’: ../../src/util/tests/string_buffer/string_buffer_test.cpp:106:10: warning: ‘char* strcat(char*, const char*)’ accessing 81 or more bytes at offsets 48 and 128 may overlap 1 byte at offset 128 [-Wrestrict] strcat(str4, str5); ~~~~~~^~~~~~~~~~~~ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	577c8d7288	i965/miptree: avoid uninitialized variable warnings GCC 8.1.1 is having a hard time identifying that the values are properly initialized when used. In the 'memset_value' case, we pass the uninitialized value to another function (that will use only if the conditions match the initialization). Just give enough hint to the compiler to figure things out. Fixes the warnings ../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c: In function ‘intel_miptree_alloc_aux’: ../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1839:18: warning: ‘memset_value’ may be used uninitialized in this function [-Wmaybe-uninitialized] mt->aux_buf = intel_alloc_aux_buffer(brw, &aux_surf, needs_memset, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ memset_value); ~~~~~~~~~~~~~ ../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1698:10: warning: ‘initial_state’ may be used uninitialized in this function [-Wmaybe-uninitialized] if (wants_memset) ^ ../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1772:23: note: ‘initial_state’ was declared here enum isl_aux_state initial_state; ^~~~~~~~~~~~~ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	8ec40824ae	intel/batch-decoder: fix uninitialized values warnings Code assumes that all the necessary fields will exist, but compiler doesn't know about this. Provide zero as default values, like in other decoding functions. Fixes warnings ../../src/intel/common/gen_batch_decoder.c: In function ‘handle_media_interface_descriptor_load’: ../../src/intel/common/gen_batch_decoder.c:347:7: warning: ‘binding_entry_count’ may be used uninitialized in this function [-Wmaybe-uninitialized] dump_binding_table(ctx, binding_table_offset, binding_entry_count); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../src/intel/common/gen_batch_decoder.c:347:7: warning: ‘binding_table_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized] ../../src/intel/common/gen_batch_decoder.c:346:7: warning: ‘sampler_count’ may be used uninitialized in this function [-Wmaybe-uninitialized] dump_samplers(ctx, sampler_offset, sampler_count); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../src/intel/common/gen_batch_decoder.c:346:7: warning: ‘sampler_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized] ../../src/intel/common/gen_batch_decoder.c:343:7: warning: ‘ksp’ may be used uninitialized in this function [-Wmaybe-uninitialized] ctx_disassemble_program(ctx, ksp, "compute shader"); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../src/intel/common/gen_batch_decoder.c: In function ‘decode_dynamic_state_pointers’: ../../src/intel/common/gen_batch_decoder.c:663:54: warning: ‘state_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized] const uint32_t *state_map = ctx->dynamic_base.map + state_offset; ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~ ../../src/intel/common/gen_batch_decoder.c: In function ‘gen_print_batch’: ../../src/intel/common/gen_batch_decoder.c:856:13: warning: ‘next_batch.map’ may be used uninitialized in this function [-Wmaybe-uninitialized] if (next_batch.map == 
NULL) { ^ ../../src/intel/common/gen_batch_decoder.c:860:13: warning: ‘next_batch.addr’ may be used uninitialized in this function [-Wmaybe-uninitialized] gen_print_batch(ctx, next_batch.map, next_batch.size, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ next_batch.addr); ~~~~~~~~~~~~~~~~ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	f836d799f9	intel/decoder: use snprintf(..., "%s", ...) instead of strncpy strncpy() doesn't guarantee a terminating NUL, so we would need to set it ourselves. Just use snprintf() instead. Fixes the warnings ../../src/intel/common/gen_decoder.c: In function ‘iter_decode_field’: ../../src/intel/common/gen_decoder.c:897:7: warning: ‘strncpy’ specified bound 128 equals destination size [-Wstringop-truncation] strncpy(iter->name, iter->field->name, sizeof(iter->name)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In function ‘iter_advance_field’, inlined from ‘gen_field_iterator_next’ at ../../src/intel/common/gen_decoder.c:1015:9: ../../src/intel/common/gen_decoder.c:844:7: warning: ‘strncpy’ specified bound 128 equals destination size [-Wstringop-truncation] strncpy(iter->name, iter->field->name, sizeof(iter->name)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	20fcd152a2	anv: give more room to debug report The error buffer is limited to 256, but the report contains the filename and possibly other data. So give it more space. Avoids the warnings ../../src/intel/vulkan/anv_util.c: In function ‘__anv_perf_warn’: ../../src/intel/vulkan/anv_util.c:66:42: warning: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 254 [-Wformat-truncation=] snprintf(report, sizeof(report), "%s: %s", file, buffer); ^~ ~~~~~~ ../../src/intel/vulkan/anv_util.c:66:4: note: ‘snprintf’ output 3 or more bytes (assuming 258) into a destination of size 256 snprintf(report, sizeof(report), "%s: %s", file, buffer); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../src/intel/vulkan/anv_util.c: In function ‘__vk_errorf’: ../../src/intel/vulkan/anv_util.c:96:48: warning: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 252 [-Wformat-truncation=] snprintf(report, sizeof(report), "%s:%d: %s (%s)", file, line, buffer, ^~ ~~~~~~ ../../src/intel/vulkan/anv_util.c:96:7: note: ‘snprintf’ output 8 or more bytes (assuming 263) into a destination of size 256 snprintf(report, sizeof(report), "%s:%d: %s (%s)", file, line, buffer, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error_str); ~~~~~~~~~~ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	01d02e8906	anv: avoid warning when switching in VkStructureType When one of the cases is not part of the enum, the compiler complains: ../../src/intel/vulkan/anv_formats.c: In function ‘anv_GetPhysicalDeviceFormatProperties2’: ../../src/intel/vulkan/anv_formats.c:728:7: warning: case value ‘1000001004’ not in enumerated type ‘VkStructureType’ {aka ‘enum VkStructureType’} [-Wswitch] case VK_STRUCTURE_TYPE_WSI_FORMAT_MODIFIER_PROPERTIES_LIST_MESA: ^~~~ Given the switch has a "default:" case, we don't lose anything by switching on the unsigned value to avoid the warning. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	df8f1637fa	glsl: remove unnecessary parenthesis from macro The "__inst" will contain the name used for the variable of type "__type ". Parenthesis is not necessary as the name itself shouldn't be an expression. Fixes warning: In file included from ../../src/mesa/main/mtypes.h:49, from ../../src/intel/compiler/brw_compiler.h:30, from ../../src/intel/compiler/brw_shader.h:29, from ../../src/intel/compiler/brw_fs.h:31, from ../../src/intel/compiler/brw_fs_cse.cpp:24: ../../src/intel/compiler/brw_fs_cse.cpp: In member function ‘bool fs_visitor::opt_cse_local(bblock_t)’: ../../src/compiler/glsl/list.h:675:12: warning: unnecessary parentheses in declaration of ‘entry’ [-Wparentheses] __type *(__inst); \ ^ ../../src/intel/compiler/brw_fs_cse.cpp:257:10: note: in expansion of macro ‘foreach_in_list_use_after’ foreach_in_list_use_after(aeb_entry, entry, &aeb) { ^~~~~~~~~~~~~~~~~~~~~~~~~ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	4a29ee1861	intel/compiler: fix -Wsign-compare warning Explicitly convert to signed integer. Conversion is valid since it is the same one (implicitly) used to initialize the loop. Avoids the warning: ../../src/intel/compiler/brw_fs.cpp: In member function ‘bool fs_visitor::lower_simd_width()’: ../../src/intel/compiler/brw_fs.cpp:5761:45: warning: comparison of integer expressions of different signedness: ‘int’ and ‘unsigned int’ [-Wsign-compare] split_inst.eot = inst->eot && i == n - 1; ~~^~~~~~~~ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	7df5f62768	intel/compiler: silence -Wclass-memaccess warnings Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Caio Marcelo de Oliveira Filho	ff8abce361	spirv: initialize is_vertex_input Fixes warning: ../../src/compiler/spirv/vtn_variables.c: In function ‘var_decoration_cb’: ../../src/compiler/spirv/vtn_variables.c:1400:12: warning: ‘is_vertex_input’ may be used uninitialized in this function [-Wmaybe-uninitialized] bool is_vertex_input; ^~~~~~~~~~~~~~~ The code used to set is_vertex_input in all possible codepaths, but after `23edc5b1ef` "spirv: translate default-block uniforms" the compiler isn't sure all codepaths will initialize the variable. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-07-18 08:29:51 -07:00
Rob Clark	cbad8f3cc0	freedreno/a5xx: performance counters AMD_performance_monitor support Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:19:03 -04:00
Rob Clark	33af91dc07	freedreno: batch query support (perfcounters) Core infrastructure for performance counters, using gallium's batch query interface (to support AMD_performance_monitor). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:19:03 -04:00
Rob Clark	9e30e7490d	freedreno: batch query prep-work For batch queries we have N different query_type's for one query, so mapping a single query_type to a sample_provider doesn't really work out. Instead add a new constructor to construct a query directly from a sample_provider. Also, the sample buffer size needs to be determined at runtime, as it depends on the number of query_types. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:19:03 -04:00
Rob Clark	37b724ff72	freedreno: rework accumulated query result vfunc Take the query object, rather than the ctx. The ctx ptr isn't hugely useful but for batch queries we will need the query object to properly get the results. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:19:03 -04:00
Rob Clark	1f464d5301	freedreno/ir3: output ir3 and nir asm for frameretrace See: `298dc8195b` Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:10:45 -04:00
Rob Clark	e4c225ab6f	freedreno/ir3: redirectable ir3 disasm output For now it still goes to stdout, this will make it easier to support output on stderr like what frameretrace expects. (If we eventually have a proper GL extension for this, implementation probably looks like dumping shader disasm to a tmp file and then dumping that out over whatever mechanism is used.) Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:10:45 -04:00
Rob Clark	4c58db8064	freedreno/ir3: resync ir3 disassembler Pull in latest updates from cffdump in envytools tree, so we can output to other than just stdout. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:10:45 -04:00
Rob Clark	97a9283f5d	freedreno: register usage queries Avg number of (half) regs per draw, so we can correlate fps dips to shader register usage. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:10:44 -04:00
Rob Clark	8dfc9e22c1	nir: add lowering for gl_HelperInvocation v2: reword comment about lower_helper_invocations to be more clear that it might not work on all hardware v3: add special variant of load_sample_id which does not imply per-sample shading Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:10:44 -04:00
Rob Clark	09f240eb5f	mesa: don't double incr/decr ActiveCounters Frameretrace ends up w/ excess calls to SelectPerfMonitorCountersAMD() which ends up re-enabling already enabled counters. This causes ActiveCounters[group] to be double incremented for the same counter, which in turn causes BeginPerfMonitorAMD() to fail. The AMD_performance_monitor spec doesn't say that an error should be generated in this case. So I think the safe thing to do is just safeguard against excess increments/decrements. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-18 10:10:44 -04:00
Rob Clark	426f1c60bc	mesa: fix error msg typo Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-07-18 10:10:44 -04:00
Rob Clark	640b8eb5b1	nir: fixup intrinsic comment Now the deref is the first src. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-07-18 10:10:44 -04:00
Tomeu Vizoso	3f7c2148b0	mesa: handle a bunch of formats in IMPLEMENTATION_COLOR_READ_* Virgl could save a lot of work converting buffers in the host side between formats if Mesa supported a bunch of other formats when reading pixels. This commit adds cases to handle specific formats so that the values reported by the two calls match more closely the underlying native formats. In GLES it is important that IMPLEMENTATION_COLOR_READ_* return the native format and data type because the spec only allows reading with those, besides GL_RGBA or GL_RGBA_INTEGER. Additionally, because virgl currently doesn't implement such conversions, this commit fixes several tests in dEQP-GLES3.functional.fbo.color.clear., when using virgl in the guest side. The logic is based on knowledge that is shared with _mesa_format_matches_format_and_type() but we cannot assert that the results match as we don't have all the starting information at both points. So leave the assert out and hope CI comes soon to save us all. v2: Let R10G10B10A2_UINT fall back to GL_RGBA_INTEGER (Eric Anholt) * Assert with _mesa_format_matches_format_and_type (Eric Anholt) v3: * Remove the assert, as it won't be reliable (Eric Anholt) v4: * Use _mesa_is_format_integer in the fallback (Eric Anholt) v5: * Remove superfluous call to _mesa_uncompressed_format_to_type_and_comps (Eric Anholt) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-07-18 14:52:35 +01:00
Samuel Pitoiset	e45ba51ea4	radv: add support for VK_EXT_conditional_rendering Inherited commands buffers are not supported. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-18 13:44:09 +02:00
Samuel Pitoiset	946cf3f39f	radv: add support for non-inverted conditional rendering By default, our internal rendering commands are discarded only if the predicate is non-zero (i.e. DRAW_VISIBLE). But VK_EXT_conditional_rendering also allows discarding commands when the predicate is zero, which means we have to use a different flag. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-18 13:44:06 +02:00
Samuel Pitoiset	4d99caf590	radv: set the predicate for indirect/indexed draw commands VK_EXT_conditional_rendering allows discarding draw commands (not only normal draws). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-18 13:44:04 +02:00
Samuel Pitoiset	1e83f65673	radv: set the predicate for dispatch commands VK_EXT_conditional_rendering allows discarding dispatch commands. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-18 13:44:01 +02:00
Lionel Landwerlin	83427acc87	i965: batchbuffer: write correct canonical offset with softpin Addresses in the command streams should be in canonical form (i.e. bit[63:48] == bit[47]). If the [bo->gtt_offset, bo->gtt_offset + target_offset] range contains the address 0x800000000000, the current code will fail that criterion. v2: Fix missing include (Lionel) Fixes: `1c9053d076` ("i965: Prepare batchbuffer module for softpin support.") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-18 11:29:16 +01:00
Samuel Pitoiset	1376f2824f	radv: remove unused variable in radv_CreateRenderPass2KHR() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-18 10:54:42 +02:00
Samuel Pitoiset	d9526384bd	radv: optimize radv_stage_flush() for pre fragment shader stages We don't need to emit PS_PARTIAL_FLUSH for the pre fragment shader stages (ie. geometry/tessellation). Emitting VS_PARTIAL_FLUSH is enough for these stages. Note that PS_PARTIAL_FLUSH also synchronizes all vertex stages. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-18 10:09:05 +02:00
Samuel Iglesias Gonsálvez	0f29006256	anv: fix assert in anv_CmdBindDescriptorSets() The assert checks that we are not binding more descriptor sets than are supported by the driver. When binding descriptor set number MAX_SETS-1, it was tripping the assert because descriptorSetCount = 1. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-18 08:54:23 +02:00
Jan Vesely	154fbd03cc	clover: Report error when pipe driver fails to create compute state CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-07-17 21:04:15 -04:00
Jan Vesely	866b25fd01	clover: Catch errors from executing event action Abort all dependent events. v2: Abort the current event as well. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-07-17 21:04:15 -04:00
Timothy Arceri	e105b0ca30	nir: add a couple of ior opts to nir_opt_algebraic One of these was seen in a Deus Ex shader. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-18 09:53:27 +10:00
Timothy Arceri	c4188a9b9f	nir: allow opt_peephole_select to handle nir_instr_type_deref Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-18 09:53:22 +10:00
Marek Olšák	bb5449cfee	r600: fix warnings when unref'ing pool->bo	2018-07-17 14:51:45 -04:00
Konstantin Kharlamov	3f8fa7716d	r600g: some -Wsign-compare fixes Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-17 14:47:37 -04:00
Konstantin Kharlamov	b674a1d3b9	st/glx: constify some variables Just a nice hint for both people and compilers. Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-17 14:47:37 -04:00
Konstantin Kharlamov	1379d9759f	st/nine: constify some variables Just a nice hint for both people and compilers. Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-17 14:47:37 -04:00
Konstantin Kharlamov	77ca550224	r600g: constify some variables Just a nice hint for both people and compilers. Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-17 14:47:37 -04:00
Konstantin Kharlamov	9b379591c9	r600g: do not use "fast-clear" for small textures (v3) Ported from radeonsi. Improves windowed glxgears run as vblank_mode=0 glxgears -info -geometry 0+0+512+512 from ≈2270 FPS to ≈2360 FPS. Tested with AMD TURKS. v2: it turned out glxgears ignores the option above; the correct form would be "512x512+0+0". With that it can be seen that 512x512 actually loses 30 FPS. 300×300, however, wins around a hundred FPS, and to leave some room in case results differ for other cards I do not want to nitpick in search of an optimum but simply leave 300×300 in the code. v3: remove redundant braces, and try harder for the mail to stick to the rest of the series. Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Reviewed-by: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-17 14:47:37 -04:00
Rob Clark	4cf8f329ed	freedreno: re-work fd_batch_reference() locking Annoyingly we still have to briefly drop the lock to unref resources.. but push the lock down into __fd_batch_destroy() so we can invalidate the batch and reset resources before dropping the lock. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-17 11:00:00 -04:00
Rob Clark	4b847b38ae	freedreno: make fd_batch a one-shot thing Re-allocate rather than re-use. Originally we had an unnecessarily complex design to avoid re-allocating cmdstream buffers. But now that support for "growable" cmdstream buffers has been in place for a couple years, I guess we can care a bit less about the extra overhead on older kernels. But making the batches one-shot removes a class of potential race conditions vs the flush_queue. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-17 11:00:00 -04:00
Rob Clark	f129971e71	freedreno: flush immediately when reading a pending batch Instead of the reading batch setting a dependency on the writing batch, simply flush the writing batch immediately. This avoids situations where we have to flush the context's current batch later. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-17 11:00:00 -04:00
Rob Clark	20f677f6bc	freedreno: get rid of noop render This was basically to avoid a zero-dword IB (indirect-branch), but instead just don't emit the IB packet in that case. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-17 11:00:00 -04:00
Rob Clark	15f6c0509a	freedreno: fix samples=0 vs samples=1 confusion pipe_framebuffer_state can have samples=0 in various cases, which is actually the same thing as samples=1. So use the _get_num_samples() helper to populate the key, to avoid this looking like two distinct fb states to the cache. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-17 11:00:00 -04:00
Rob Clark	d77fcdeb59	freedreno: comment for _invalidate_batch() Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-17 11:00:00 -04:00
Rob Clark	f2570409f9	freedreno: hold batch references when flushing It is possible for a batch to be freed under our feet when flushing, so it is best to hold a reference to all of them up-front. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-07-17 11:00:00 -04:00
Karol Herbst	71add09e79	nir/spirv: print id for unsupported alu opcode Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-17 13:24:09 +02:00
Karol Herbst	1beef89ad8	nir: prepare for bumping up max components to 16 OpenCL knows vector of size 8 and 16. v2: rebased on master (nir_swizzle rework) rework more declarations with nir_component_mask_t adjust print_var_decl Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-17 13:24:09 +02:00
Samuel Pitoiset	f65bee7e85	radv/winsys: use alloca() for semaphore dependencies Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-17 10:53:45 +02:00
Samuel Pitoiset	88e56804a7	radv: reduce number of CB/DB meta flushes for VK_ACCESS_TRANSFER_WRITE_BIT If we know that the given image doesn't have any metadata, we don't need to flush. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-17 09:34:20 +02:00
Samuel Pitoiset	b213947510	radv: fix implementation of VK_KHR_create_renderpass2 for multiviews The Vulkan 1.1.80 spec says: "viewMask has the same effect for the described subpass as VkRenderPassMultiviewCreateInfo::pViewMasks has on each corresponding subpass." Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-17 09:04:35 +02:00
Erik Faye-Lund	591b700944	virgl: respect max_vertex_attrib_stride cap This is required for OpenGL 4.4 and OpenGL ES 3.1 support. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-17 15:45:37 +10:00
Lepton Wu	04e278f793	virgl: Fix flush in virgl_encoder_inline_write. The current code is buggy: if there are only 12 dwords left in cbuf, we emit a zero data length command which will be rejected by virglrenderer. Fix it by calling flush in this case. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-17 14:56:25 +10:00
Erik Faye-Lund	b5db3aa6e8	virgl: implement set_min_samples This allows us to implement glMinSampleShading correctly, which up until now just got ignored. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-17 13:59:47 +10:00
Caio Marcelo de Oliveira Filho	ba1b41b504	glsl: do second pass of const propagation in loops When handling loops in constant propagation, implement the "FINISHME" comment like copy propagation: perform a first pass to find values that can't be propagated, then perform a second pass with the ACP containing still valid values. Certain values are killed because the loop may run more than one iteration, so we can't copy propagate them as they would be invalid in the later iterations. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-16 16:33:39 -07:00
Caio Marcelo de Oliveira Filho	d7849fd1da	glsl: don't let an 'if' then-branch kill const propagation for else-branch When handling 'if' in constant propagation, if a certain variable was killed when processing the first branch of the 'if', then the second would not get any propagation from previous nodes. This is similar to the change done for copy propagation code. x = 1; if (...) { z = x; // This would turn into z = 1. x = 22; // x gets killed. } else { w = x; // This would NOT turn into w = 1. } With the change, we let constant propagation happen independently in the two branches and only then apply the killed values for the subsequent code. The new code uses a single hash table for keeping the kills of both branches (the branches only write to it), and it gets deleted after we use it -- instead of waiting for mem_ctx to collect it. NIR deals well with constant propagation, so it already covered for the missing ones that this patch fixes. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-16 16:33:39 -07:00
Eric Anholt	229836fb37	v3d: Disable shader-db cycle estimates until we sort out TMU estimates. I keep having to ignore these shader-db changes since I don't trust them, so just disable the reports entirely.	2018-07-16 14:39:59 -07:00
Eric Anholt	2baab6bf2a	v3d: Emit the lowered uniform just before its first use in a block. total instructions in shared programs: 98578 -> 98119 (-0.47%) instructions in affected programs: 27571 -> 27112 (-1.66%) and it also eliminates most spills/fills on the CTS's randomized uniform usage testcases.	2018-07-16 14:39:59 -07:00
Eric Anholt	26f830d9fc	v3d: Add an assert that we don't provide invalid texture return words. The docs had an update noting this restriction, so reflect it in the code.	2018-07-16 14:39:59 -07:00
Eric Anholt	d661d78464	v3d: Apply GFXH-1625 restriction on TMUWT in the end of the shader. This doesn't affect us yet since we're not doing TMUWTs, but I think we will for GLES 3.1.	2018-07-16 14:39:59 -07:00
Sergii Romantsov	cec540fbc6	intel/batch_decoder: decoding of 3DSTATE_CONSTANT_BODY. SNB doesn't have a definition of 3DSTATE_CONSTANT_BODY, which is why we got a segmentation fault when using INTEL_DEBUG=bat. Fixed by adding 3DSTATE_CONSTANT_BODY to the 3DSTATE_CONSTANT structures of VS, GS and PS. v2: added definition of 3DSTATE_CONSTANT_BODY to the gen6.xml Fixes: `169d8e011a` (intel: Fix 3DSTATE_CONSTANT buffer decoding.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107190 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-16 12:18:36 -07:00
Marek Olšák	4054133dcc	r600: fix build after the removal of RADEON_PRIO_* flags	2018-07-16 14:33:31 -04:00
Roland Scheidegger	b3474645d4	nir: fix msvc build Empty initializer braces aren't valid c (it's a gnu extension, and it's valid in c++). Hopefully fixes appveyor / msvc build... Fixes `a3150c1d06`	2018-07-16 20:07:53 +02:00
Jason Ekstrand	f378fa94b2	nir/worklist: Rework the foreach macro This makes the arguments match the (thing, container) pattern used in other nir_foreach macros and also renames it to make that a bit more clear. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-16 11:02:10 -07:00
Eric Anholt	360714bfa5	intel: tools: Fix uninitialized variable warnings in intel_dump_gpu. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-16 10:58:40 -07:00
Jason Ekstrand	5e030deaf2	spirv: Fix a couple of image atomic load/store bugs For one thing, the NIR opcodes for image load/store always take and return a vec4 value regardless of the image type. We need to fix up both the source and destination to handle it. For another thing, we weren't actually setting up a destination in the OpAtomicLoad case. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: mesa-stable@lists.freedesktop.org	2018-07-16 10:54:50 -07:00
Marek Olšák	f8aa116c3c	winsys/amdgpu: clean up error handling in amdgpu_cs_submit_ib Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Marek Olšák	6b1e0e51e6	radeonsi: rework RADEON_PRIO flags to be <= 31 This decreases sizeof(struct amdgpu_cs_buffer) from 24 to 16 bytes. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Marek Olšák	54ad9b444c	radeonsi: merge DCC/CMASK/HTILE priority flags For a later simplification. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Marek Olšák	3e6888e5d7	radeonsi: remove non-GFX BO priority flags For a later simplification. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Marek Olšák	342fff6cbc	winsys/amdgpu: use alloca when using global_bo_list Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Marek Olšák	6ec44b7055	winsys/amdgpu: remove label bo_list_error Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Marek Olšák	7346e5296e	winsys/amdgpu: always update gfx_bo_list_counter Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Marek Olšák	caf41fb96d	winsys/amdgpu: make amdgpu_cs_context::flags & handles local Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-07-16 13:32:33 -04:00
Gert Wollny	78887e99e3	mesa/virgl: Fix off-by-one and copy-paste error in multisample position evaluation Converting from a switch statement that would not allow intermediate sample counts to an if-else chain went a bit wrong, so that in some cases the range that should be inclusive was exclusive and the line for 16 samples was copied wrongly. v2: elaborate commit message. Fixes: `91f48cdfe5` virgl: Add support for glGetMultisample Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (v1)	2018-07-16 12:51:39 +02:00
Karol Herbst	4d0d911875	nouveau: fix 3D blitter for unsigned to signed integer conversions fixes a couple of packed_pixel CTS tests. No regressions inside a CTS run. v2: simplify the changes a bit Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-15 19:28:37 +02:00
Karol Herbst	87c8af2836	nir: fix printing of vec16 type Fixes: `2f181c8c18` "glsl_types: vec8/vec16 support" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-15 19:28:37 +02:00
Rob Clark	427a3dbdb1	nir/spirv: implement BuiltInWorkDim Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-15 07:51:13 +02:00
Karol Herbst	39180d3931	nir/spirv: print id for unsupported builtins Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-15 07:51:13 +02:00
Jason Ekstrand	daa78f30b6	intel/blorp: Handle 3-component formats in clears This fixes a nasty hang in Batman: Arkham City which apparently calls vkCmdClearColorImage on a linear RGB image. cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-07-13 20:57:46 -07:00
Jason Ekstrand	11712b9ca1	intel/blorp: Fix blits to R8G8B8_UNORM_SRGB In this case, the surface faking will give us a R8_UNORM surface and we need to do an sRGB conversion in the shader. Found by inspection. cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-07-13 20:57:46 -07:00
Caio Marcelo de Oliveira Filho	4ec8b39fcd	util/hash_table: add helper to remove entry by key And the corresponding test case. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-13 14:20:49 -07:00
Jason Ekstrand	a3150c1d06	nir/lower_tex: Use nir_format_srgb_to_linear A while ago, we added a bunch of format conversion helpers; we should use them instead of hand-rolling sRGB conversions. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-13 14:02:18 -07:00
Jason Ekstrand	b52d79514c	vc4: Tell NIR to lower fdiv instructions This should allow us to use them in nir_lower_tex Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-13 14:02:18 -07:00
Dylan Baker	53aca66874	docs: Update news, calendar, and relnotes for 18.1.4	2018-07-13 13:54:46 -07:00
Dylan Baker	97870f2cd0	docs: Add sha256 sums for 18.1.4 tarballs	2018-07-13 13:53:03 -07:00
Dylan Baker	e8df2f12d6	docs: Add release notes for 18.1.4	2018-07-13 13:53:01 -07:00
Eric Anholt	d009463a65	vc4: Switch to using u_transfer_helper for MSAA maps. No requirement, just reduces code duplication.	2018-07-13 13:29:29 -07:00
Eric Anholt	afcc714c98	v3d: Work around GFXH-1461 bug losing our Z/S clears. If you load S and clear Z or vice versa, the clear may get lost. Just fall back to drawing a quad. Fixes KHR-GLES3.packed_depth_stencil.verify_read_pixels.depth24_stencil8	2018-07-13 13:29:29 -07:00
Eric Anholt	162fcdad6a	meson: Move xvmc test tools from unit tests to installed tools. These are not unit tests, as they rely on the host's XVMC and some user configuration. Switch them over to being general installed tools, to fix unit testing. Fixes: `22a817af8a` ("meson: build gallium xvmc state tracker") Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-13 13:29:29 -07:00
Gert Wollny	695a4cb0f6	r600: Add spill output to group only if register or target index changes The current spill code checks in each instruction of an instruction group whether spilling is needed. If so, it adds spilling for each component as a separate instruction and allocates a new temporary for each component. Since it takes the write mask from the TGSI representation, all components might be written each time, and as a result already written components might be overwritten with garbage like: ... y: MOV R9.y, [0x42140000 37].x t: MOV R8.x, [0x42040000 33].y ... MEM_SCRATCH WRITE_IND_ACK 0 R9.xy__, @R4.x ES:3 MEM_SCRATCH WRITE_IND_ACK 0 R8.xy__, @R4.x ES:3 ... To resolve this issue, accumulate spills to the same memory location so that only one memory write instruction is emitted for an instruction group that writes up to all four components. This fixes updated piglits (see https://patchwork.freedesktop.org/series/46064/): spec/glsl-1.30/execution fs-large-local-array-vec2.shader_test fs-large-local-array-vec3.shader_test fs-large-local-array-vec4.shader_test v2: fix some typos and add comment about piglits (Roland Scheidegger) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)	2018-07-13 21:11:34 +02:00
Nanley Chery	3b4279f772	i965/miptree: Allocate MS texture BOs as BUSY These buffer objects are never accessed with the CPU. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:36:26 -07:00
Nanley Chery	7784a9ceac	i965/miptree: Inline make_separate_stencil Note that the separate stencil miptree now has the same alloc_flag as the depth component. Only stencil renderbuffers (as opposed to textures) have BO_ALLOC_BUSY. v2: Add note about BO_ALLOC_BUSY in message (Topi). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:36:26 -07:00
Nanley Chery	74cf188985	i965/miptree: Init r8stencil_needs_update to false The current behavior masked two bugs where the flag was not set to true after modifying the stencil texture. One case was a regression introduced with commit `bdbb527a65` and another was a bug in the depthstencil mapping code. These have since been fixed. To prevent such bugs from being masked in the future, initialize r8stencil_needs_update to false. v2: Keep the delayed allocation. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:36:19 -07:00
Nanley Chery	ffac81fa5c	i965/miptree: Refactor miptree_create Enable a future patch to create the r8stencil_mt in this function. v2: Explicitly set etc_format to MESA_FORMAT_NONE (Topi). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	03cbaae03e	i965/miptree: Add and use mt_surf_usage v2: Make mt_fmt const (Topi). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	32b22592a8	i965/miptree: Share alloc_flags in miptree_create Note that this maintains BO_ALLOC_BUSY for depth renderbuffers, but not depth textures. v2: Add note about BO_ALLOC_BUSY in message (Topi). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	2321e85759	i965/miptree: Share the miptree format in miptree_create Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	fbe01625f6	i965/miptree: Share tiling_flags in miptree_create Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	6c9947c3ef	i965/miptree: Delete MIPTREE_CREATE_LINEAR This enum constant was introduced to enable blit maps with intel_miptree_create `da2880bea0`. Now that such maps use the more direct make_surface function which allows you to specify the tiling directly, the constant is no longer being used. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	684fa59eb6	i965/miptree: Use make_surface in map_blit Do this so that we don't have to special case linearly-tiled depth buffers in miptree_create. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	63d428dc17	i965/draw: Fix adding the stencil bo to the depth cache Fix the case where stencil writes are enabled on a depth stencil texture. Found by inspection. v2: Fix message to allow for depth stencil writes (Topi). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	be07cc43a2	i965/draw: Set the r8stencil flag after drawing Fixes the regression introduced with commit `bdbb527a65` "i965: Use ISL for emitting depth/stencil/hiz state on gen6+" Found by inspection. Prevents regressing the piglit test, fbo-depth-array stencil-draw, later on in this series. Cc: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	0eafe44ba7	i965/miptree: Set the r8stencil flag in map_depthstencil Found by initializing the r8stencil_needs_update to false in make_separate_stencil_surface. Prevents regressing the piglit test arb_stencil_texturing-draw, later on in the series. Cc: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Nanley Chery	cef7ce07fa	i965: Set the r8stencil flag in miptree_finish_write This seems to be the most appropriate place. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-07-13 08:31:21 -07:00
Karol Herbst	cb65246ed2	nir: cleanup oversized arrays in nir_swizzle calls There are no fixed sized array arguments in C, those are simply pointers to unsized arrays and as the size is passed in anyway, just rely on that. where possible calls are replaced by nir_channel and nir_channels. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-13 15:46:57 +02:00
Nanley Chery	0288fe8d04	i965/miptree: Use the correct BLT pitch Retile miptrees to a linear tiling less often. Retiling can cause issues with imported BOs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106738 Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-07-12 19:16:30 -07:00
Nanley Chery	3df201e3e8	i965/miptree: Drop an if case from retile_as_linear Drop an if statement whose predicate never evaluates to true. row_pitch belongs to a surface with non-linear tiling. According to isl_calc_tiled_min_row_pitch, the pitch is a multiple of the tile width. By looking at isl_tiling_get_info, we see that non-linear tilings have widths greater than or equal to 128B. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-07-12 19:16:30 -07:00
Nanley Chery	0ab2541943	i965: Make blt_pitch public We'd like to reuse this helper. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-07-12 19:16:30 -07:00
Caio Marcelo de Oliveira Filho	1f6ce1973a	nir: delete not needed for reinserted nir_cf_list It wasn't causing problems since there's nothing to delete, but better be consistent with the rest of existing codebase. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho	13cfd6cc96	glsl: remove struct kill_entry in constant propagation The only value in kill_entry is the writemask, which can be stored in the data pointer of the hash table entry. Suggested by Eric Anholt. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho	d6e869afe9	glsl: slim the kill_entry struct used in const propagation Since `4654439fdd` "glsl: Use hash tables for opt_constant_propagation() kill sets." uses a hash_table for storing kill_entries, the structs can be simplified. Remove the exec_node from kill_entry since it is not used in an exec_list anymore. Remove the 'var' from kill_entry since it is now redundant with the key of the hash table. Suggested by Eric Anholt. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho	094225d69d	i965: fix typo (wrong gen number) in comment Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho	fa0c19d17b	util/set: helper to remove entry by key v2: Add unit test. (Eric Anholt) Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho	b034facfbc	util/set: add a clone function v2: Add unit test. (Eric Anholt) Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-12 14:03:51 -07:00
Caio Marcelo de Oliveira Filho	8af0a45b47	util/set: add a basic unit test Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-12 14:03:51 -07:00
Marek Olšák	2e0b00ab7d	radeonsi: add support for Vega20 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-07-12 16:48:12 -04:00
Eric Anholt	e8dc3c0c36	u_blitter: Add an option to draw the triangles using an index buffer. For V3D, the HW will interpolate slightly differently along the shared edge of the trifan. The conformance tests manage to catch this in the nearest_consistency_* group. To get interpolation to match, we need the last vertex of the triangle to be shared. I first tried implementing draw_rectangle to do triangles instead, but that was quite a bit (147 lines) of code duplication from u_blitter, and this seems much simpler and less likely to break as u_blitter changes. Fixes dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_* on V3D. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-12 11:49:22 -07:00
Eric Anholt	c17dac0534	u_draw: Add some indices to the util_draw_elements() helpers. These helpers have been unused, and were definitely not useful since `330d0607ed` ("gallium: remove pipe_index_buffer and set_index_buffer") made it so that they never had an index buffer passed in. For an upcoming u_blitter change to use these helpers, I have just 6 bytes of index data, so pass it as user data until a more interesting caller comes along. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-12 11:49:20 -07:00
Eric Anholt	50a3a283d0	vc4: Don't automatically reallocate a PERSISTENT-mapped buffer. I had mistakenly used the COHERENT flag, which can only be set when PERSISTENT is mapped, but isn't always. Fixes: `a2014c2eb9` ("vc4: Simplify the DISCARD_RANGE handling")	2018-07-12 11:31:08 -07:00
Eric Anholt	7714896256	v3d: Don't automatically reallocate a PERSISTENT-mapped buffer. I had mistakenly used the COHERENT flag, which can only be set when PERSISTENT is mapped, but isn't always. Fixes piglit bufferstorage-persistent read	2018-07-12 11:31:08 -07:00
Eric Anholt	e48c615292	v3d: Fix stride of 1D_ARRAY mappings. All of our other texture arrays will be tiled, but 1D is an array of raster mappings and we had the wrong value plugged in here. Fixes piglit getteximage-targets 1D_ARRAY	2018-07-12 11:31:08 -07:00
Eric Anholt	97ddeed949	v3d: Fix MRT blending with independent blending disabled. We were only emitting the RT blend state for RT 0 and only enabling it for RT 0, when the gallium API for !independent_blend is for rt0's state to apply to all of them. Fixes piglit fbo-drawbuffers-blend-add.	2018-07-12 11:31:08 -07:00
Eric Anholt	e0dbbf9987	gallium/u_transfer_helper: Initialize the stride of MSAA maps. We just never set the value that was returned for MSAA mappings (directly reading back an MSAA framebuffer). Since we're handing back ss_map, it should be ss_map's stride from our nested transfer. Fixes piglit /home/anholt/src/piglit/bin/fbo-depthstencil -samples=4 cases. Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-07-12 11:31:06 -07:00
Eric Anholt	589bb5bd65	gallium/u_transfer_helper: Fix MSAA mappings with nonzero x/y. We created a temporary with box->{width,height} and then tried to map width,height from a nonzero offset when we meant to just map the whole temporary. Fixes segfaults in V3D in dEQP-GLES3.functional.prerequisite.read_pixels with --deqp-egl-config-name=rgba8888d24s8ms4 and also piglit's read-front clear-front-first -samples=4 Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-07-12 11:31:00 -07:00
Jason Ekstrand	ccb8309516	util/rb_tree: Fix a compiler warning Gcc 8 warns "cast to pointer from integer of different size" in 32-bit builds. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-12 10:25:46 -07:00
Jose Maria Casanova Crespo	62f37ee53d	i965/fs: unspills shouldn't use grf127 as dest since Gen8+ In `232ed89802` "i965/fs: Register allocator shoudn't use grf127 for sends dest" we didn't take into account the case of SEND instructions that are not send_from_grf. Since Gen7+, although the backend still uses MRFs internally for sends, they are finally assigned to GRFs. In the case of unspills, the backend directly uses the destination as the source because it is supposed to be available, so we always have a source-destination overlap. If the reg_allocator assigns registers that include the grf127 we fail the validation rule that affects Gen8+ "r127 must not be used for return address when there is a src and dest overlap in send instruction." So this patch activates the grf127_send_hack_node for Gen8+ and if we have any register spilled we add interferences to the destination of the unspill operations. We also need to make sure that the opt_bank_conflicts() optimization, which runs after register allocation, doesn't move things around and cause grf127 to be used in the condition we were avoiding. Fixes piglit test tests/spec/arb_compute_shader/linker/bug-93840.shader_test and some shader-db crashes caused by the grf127 validation rule. v2: make sure that opt_bank_conflicts() optimization doesn't change the use of grf127. (Caio) Found by Caio Marcelo de Oliveira Filho Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107193 Fixes: `232ed89802` "i965/fs: Register allocator shoudn't use grf127 for sends dest" Cc: 18.1 <mesa-stable@lists.freedesktop.org> Cc: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-12 18:02:26 +02:00
Michel Dänzer	34e89e4d38	gallium: Check pipe_screen::resource_changed before dereferencing it It's optional, only implemented by the etnaviv driver so far. Fixes: `501d0edeca` "st/mesa: call resource_changed when binding a EGLImage to a texture" Fixes: `a37cf630b4` "gallium: add pipe_screen::resource_changed callback wrappers" Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2018-07-12 17:39:12 +02:00
Jason Ekstrand	c2587ac4e5	docs/features: Add the missing KHR extensions Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 08:28:04 -07:00
Jason Ekstrand	55b68c4833	docs/features: Move the Vulkan 1.1 extensions to the 1.1 section While we're at it, add some extensions we missed along the way like the VK_KHR_maintenanceN extensions. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 08:28:04 -07:00
Jason Ekstrand	bc15d74529	docs/features: Mark some Vulkan extensions as done Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 08:28:04 -07:00
Karol Herbst	686e140ce0	nir/spirv: handle OpConstantComposites with OpUndef members Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-12 13:09:00 +02:00
Karol Herbst	154ef32e46	nir/spirv: implement BuiltInGlobalSize Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-12 13:09:00 +02:00
Karol Herbst	31cbcbdb87	nir: move lowering of SYSTEM_VALUE_LOCAL_GROUP_SIZE into a function we already have this code duplicated and we will need it for the global group size as well Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-12 13:09:00 +02:00
Karol Herbst	529aa9e646	compiler: add missing entries to gl_system_value_name also reorder to match the gl_system_value enum. It is weird that the STATIC_ASSERT doesn't trigger though. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-12 13:09:00 +02:00
Rob Clark	d4280561f5	nir/spirv: print extension name in fail msg Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-12 13:09:00 +02:00
Rob Clark	9ce0360f76	nir/spirv: Use imov where we might have 8 bit types Otherwise nir_validate may complain about 8 bit floats, which do not exist. Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-12 13:09:00 +02:00
Samuel Pitoiset	f1b3f7bfac	radv: simplify the logic in radv_set_descriptor_set() Now that 'set' can't be NULL because the meta operations no longer bind a NULL descriptor, the logic can be simplified a little bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 11:08:49 +02:00
Samuel Pitoiset	826b3a8773	radv: remove one useless check in radv_bind_descriptor_set() 'set' shouldn't be NULL. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 11:08:47 +02:00
Samuel Pitoiset	6bfbc7b38b	radv/meta: do not restore a NULL descriptor Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 11:08:45 +02:00
Samuel Pitoiset	5b32926f7e	radv: remove unnecessary verification code around ring_offsets_idx I don't want to waste CPU cycles for nothing. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 11:08:42 +02:00
Samuel Pitoiset	6248fbe5e4	radv: get rid of buffer object priorities We mostly use the same priority for all buffer objects, so I don't think that matter much. This should reduce CPU overhead a little bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 11:08:40 +02:00
Lucas Stach	501d0edeca	st/mesa: call resource_changed when binding a EGLImage to a texture When a EGLImage is newly bound to a texture, we need to make sure the driver is informed that the resource might have changed. Fixes stale texture content on Etnaviv when binding an existing EGLImage to an existing texture object. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-12 11:02:04 +02:00
Samuel Pitoiset	1f616a840e	radv: emit a dummy ZPASS_DONE to prevent GPU hangs on GFX9 A ZPASS_DONE or PIXEL_STAT_DUMP_EVENT (of the DB occlusion counters) must immediately precede every timestamp event to prevent a GPU hang on GFX9. Cc: 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 10:22:36 +02:00
Samuel Pitoiset	3a16c722cf	radv: add support for VK_KHR_create_renderpass2 VkCreateRenderPass2KHR() is quite similar to VkCreateRenderPass() but refactoring the code is a bit painful. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 10:20:10 +02:00
Samuel Pitoiset	fe28978f2a	radv: introduce radv_subpass_attachment data structure Needed for VK_KHR_create_renderpass2. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 10:20:06 +02:00
Kenneth Graunke	c0874947f1	st/mesa: Only enable depth writes if the function isn't EQUAL. If the depth function is EQUAL, then we'll only write the depth value when it already matches what's in the buffer, which is pointless. Skipping these writes can save bandwidth. The state tracker can easily take care of this, so all drivers benefit. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-11 11:23:20 -07:00
Chad Versace	be5fc0d7f1	anv/android: Fix type error in call to vk_errorf() In a single call to vk_errorf() in the Android code, the arguments were swapped. The bug has existed since day one. Chrome OS used to forgive the warning, but it is now a compilation error. CC: <mesa-stable@lists.freedesktop.org> Fixes: `053d4c32` "anv: Implement VK_ANDROID_native_buffer (v9)" Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-07-11 11:09:19 -07:00
Chad Versace	8e403bc959	anv/android: Fix Autotools build for VK_ANDROID_native_buffer Changes to vk.xml and anv_entrypoints_gen.py broke the Autotools build on Android. The changes undef'd the VK_ANDROID_native_buffer entrypoints in anv_entrypoints.h. Fix it with CPPFLAGS += -DVK_USE_PLATFORM_ANDROID_KHR. CC: <mesa-stable@lists.freedesktop.org> See-Also: `63525ba7` "android: enable VK_ANDROID_native_buffer" Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-07-11 11:09:16 -07:00
Samuel Pitoiset	4a67ce886a	radv: make sure to wait for CP DMA when needed This might fix some synchronization issues. I don't know if that will affect performance but it's required for correctness. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-11 12:11:56 +02:00
Rafael Antognolli	688d757e15	intel/tools/dump_gpu: Add option to print ppgtt mappings. Using -vv will increase the verbosity, by printing the ppgtt mappings as they get written into the aub file. Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-10 09:05:44 -07:00
Neil Roberts	45106a1c93	spirv: Fix InterpolateAt* instructions for vecs with dynamic index If the glsl is something like this: in vec4 some_input; interpolateAtCentroid(some_input[idx]) then it now gets generated as if it were: interpolateAtCentroid(some_input)[idx] This is necessary because the index will get generated as a series of nir_bcsel instructions so it would no longer be an input variable. It is similar to what is done for GLSL in `ca63a5ed3e`. Although I can’t find anything explicit in the Vulkan specs to say this should be allowed, the SPIR-V spec just says “the operand interpolant must be a pointer to the Input Storage Class”, which I guess doesn’t rule out any type of pointer to an input. This was found using the spec/glsl-4.40/execution/fs-interpolateAt* Piglit tests with the ARB_gl_spirv branch. Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> v2: update after nir_deref_instr land on master. Implemented by Alejandro Piñeiro. Special thanks to Jason Ekstrand for guidance at the new nir_deref_instr world. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 11:43:40 +02:00
Francisco Jerez	18c086a9e6	intel/ir: Uncomment definition of several unused hardware opcodes. There are a number of opcode_desc table entries for many of these unused opcodes. A symbolic opcode enum will be required in a future commit in order to keep them in the opcode description tables. The alternative would be to remove the unused opcodes from the opcode description tables. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	48d6fc5eb6	intel/fs: Initialize mlen for gen7 varying pull constant load messages. This makes the message length available at the IR level, which should save some guesswork in a future commit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	6643143f6e	intel/eu: Assert that the instruction is send-like in brw_set_desc_ex(). Constructing a descriptor in-place as part of the immediate of an ALU instruction is no longer supported. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	6f81e2b994	intel/eu: Get rid of the return value of brw_send_indirect_message(). The return value is not used anymore. This allows simplifying the code slightly, and in addition it should frustrate anybody's attempts to continue using the obsolete piecemeal approach to construct a message descriptor in combination with brw_send_indirect_message(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	b3cce4c130	intel/eu: Get rid of the return value of brw_send_indirect_surface_message(). All users of brw_send_indirect_surface_message() should be providing a full descriptor immediate up front by now, this isn't necessary anymore. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	95b5367149	intel/eu: Use descriptor constructors for dataport typed surface messages. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	94166cef40	intel/eu: Use descriptor constructors for dataport scattered byte surface messages. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	2a9605d610	intel/eu: Use descriptor constructors for dataport untyped surface messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	8e707fc2af	intel/eu: Provide single descriptor argument to brw_send_indirect_surface_message(). Instead of the current message_len, response_len and header_present arguments. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	b10b4e7c45	intel/eu: Use descriptor constructors for pixel interpolator messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:58 -07:00
Francisco Jerez	8fa4bc4676	intel/eu: Use descriptor constructors for dataport write messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	2bac890bf5	intel/eu: Use descriptor constructors for dataport read messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	27c211e30f	intel/eu: Use descriptor constructors for sampler messages. v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	1c90ae5acc	intel/eu: Provide desc immediate argument up front to brw_send_indirect_message(). The current approach of returning a setup instruction where additional descriptor fields can be specified is still supported in order to keep things working, but it will be removed later in this series. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	b382bdde1d	TRIVIAL: intel/eu: Use a local devinfo variable in brw_shader_time_add(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	c3793d49e4	intel/eu: Use brw_set_desc() along with a helper to set common descriptor controls. This replaces brw_set_message_descriptor() with the composition of brw_set_desc() and a new inline helper function that packs the common message descriptor controls into an integer. The goal is to represent all message descriptors as a 32-bit integer which is written at once into the instruction, which is more flexible (SENDS anyone?), robust (see `d2eecf0b0b` fixing an issue ultimately caused by some bits of the extended message descriptor being left undefined) and future-proof than the current approach of specifying the individual descriptor fields directly into the instruction. This approach also seems more self-documenting, since it will allow removing calls to functions with way too many arguments like brw_set_*_message() and brw_send_indirect_message(), and instead provide a single descriptor argument constructed from an appropriate combination of brw_*_desc() helpers. Note that because brw_set_message_descriptor() was (conditionally?) overriding fields of the instruction which strictly speaking weren't part of the message descriptor, this involves calling brw_inst_set_sfid() and brw_inst_set_eot() in some cases in addition to brw_set_desc(). v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	20b962232b	intel/eu: Define SET_BITS helper more easily reusable than SET_FIELD. Allows to specify a bitfield based on its upper and lower bounds instead of a symbolic field definition, kind of what the current GET_BITS macro is to GET_FIELD. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	d0f589a55b	intel/eu: Define helper to specify the descriptor immediates of a SEND instruction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Francisco Jerez	f55884cad3	intel/eu: Add brw_inst.h helpers for the SEND(C) descriptor and extended descriptor. This introduces helpers that can be used to specify or extract the whole descriptor of a SEND message instruction at once. Because the instruction encoding of these is rather awkward on some generations using the generic brw_inst.h macros doesn't seem like an option. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-09 23:46:57 -07:00
Jordan Justen	1c8a045bfb	i965: Support saving the gen program with glGetProgramBinary Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	eb5b4b0fd1	i965: Add flag_state param to brw_search_cache This allows brw_search_cache to be used to find programs without causing extra state to be emitted in the case where the program isn't being made active. (For example, to find the program to save out with the ARB_get_program_binary interface.) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	48ce7745dc	mesa: Add gl_shader_program param to ProgramBinarySerializeDriverBlob This might be required because some stages might generate different programs depending on the other stages in the program. For example, the i965 driver's tessellation control stage depends on the tessellation evaluation shader. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	36dd15f8b3	i965: Add brw_populate_default_key We will need to populate the default key for ARB_get_program_binary to allow us to retrieve the default gen program to store in the program binary. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	65f2014740	i965: Replace brw_setup_tex_for_precompile brw with devinfo Trying to make sure the setup of the default program key is not dependent on the GL state. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	e426286e21	i965: Regenerate blob without gen program for shader cache Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	3a133223b3	compiler/blob: Add blob_skip_bytes Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	8e7ee7433e	i965: Add support for driver cache blob containing the gen program Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	05bb4b4849	i965: Use brw_prog_key_set_id in disk cache load/store code Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	170d76de9f	i965: Add brw_prog_key_set_id helper to set the program id on any stage For saving programs (shader cache; get program binary) it is useful to set the id to 0, with the stage being a parameter. For restoring programs it is useful to set the id to the id allocated to the program at creation time. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:33 -07:00
Jordan Justen	1c1a7d11c8	i965: Add brw_stage_cache_id to map gl stages to brw cache_ids Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	b9f9b35431	i965: Add brw_(read\|write)_blob_program_data functions We will want to use these for both the disk shader cache, and for the ARB_get_program_binary. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	1777c23abf	i965: Add brw_program_deserialize_driver_blob brw_program_deserialize_driver_blob will be a more generic form of brw_program_deserialize_nir. In addition to nir, it will also be able to extract gen binaries and upload them to the program cache. In this commit, it continues to only support nir. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	f4c154afc1	i965: Move brw_program_*serialize_nir to brw_program_binary.c This will allow get_program_binary to add the gen program into its serialization in addition to just the nir program. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	cce3994dee	mesa: Always call ProgramBinarySerializeDriverBlob The driver may prefer to have a different blob for ARB_get_program_binary compared to the version saved out for the disk shader cache. Since they both use the driver_cache_blob field, we need to always give the driver the opportunity to fill in the driver_cache_blob when saving the program binary. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	6497be42b7	i965: Use ShaderCacheSerializeDriverBlob driver function This function is called just before the gl_program::driver_cache_blob is saved out as part of the gl_program serialization. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	450f00e39d	st/mesa: Use ShaderCacheSerializeDriverBlob driver function Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	c510dd22a9	st/mesa: Skip serializing driver_cache_blob if it exists Previously the mesa core code would not call to serialize the driver_cache_blob if it existed. We will update it to always call to serialize the driver_cache_blob meaning we should avoid re-serializing it under mesa/state_tracker. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:32 -07:00
Jordan Justen	2a55553be3	mesa: Add disk shader cache driver blob callback Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-09 23:02:28 -07:00
Iago Toral Quiroga	213491600a	intel/compiler: emit actual barriers for working-group level barriers Until now we have assumed that we could skip emitting these barriers in the general case based on empirical testing and a few assumptions detailed in a comment in the driver code, however, recent CTS tests have showed that we actually need them to produce correct behavior. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 07:46:34 +02:00
Dave Airlie	0cab6e51e3	radv: add some cxxflags for new c++ file Looks like I broke intel CI compiles. Fixes: `6f3aee40f9` (radv: using tls to store llvm related info and speed up compiles (v10)) Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2018-07-10 10:48:03 +10:00
Jason Ekstrand	dc1d10b396	anv,radv: Add support for VK_KHR_get_display_properties2 Reviewed-by: Keith Packard <keithp@keithp.com>	2018-07-09 17:09:41 -07:00
Jason Ekstrand	c0a27c5946	intel/aubinator_error_decode: Allow for more sections Error states coming from actual Vulkan applications tend to have fairly long command buffers and lots of chained batches. 30 total BOs isn't nearly enough. This commit bumps it to 256, makes some things use the actual number of sections instead of the #define, and adds asserts if we ever go over 256 sections. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 16:40:54 -07:00
Jason Ekstrand	5009e73bb1	intel/batch_decoder: Recurse for all 2nd level batches Our attempt to restart the loop with the second level batch worked at one point but got broken at some point. It was too fragile anyway and we're not likely to have enough secondaries to actually overflow the stack so we may as well recurse in both cases. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 16:40:54 -07:00
Dave Airlie	45e25adfe8	virgl/vtest: add support to vtest for new cap getting. The vtest protocol is pretty simple but also pretty dumb, and the v1 caps query was fixed size, with no nice way to expand it, however the server also ignores any command it doesn't understand. So we can query v2 caps by sending a v2 followed by a v1, if the v2 is ignored we know it's an old vtest server, and the we get a v2 answer then we can just read the v1 answer and discard it. Acked-by: Jakob Bornecrantz <jakob@collabora.com> (sounds good)	2018-07-10 09:07:37 +10:00
Anuj Phogat	2badf0e85b	i965/icl: Don't set float blend optimization bit in CACHE_MODE_SS CACHE_MODE_SS is not listed in gfxspecs table for user mode non-privileged registers. So, making any changes from Mesa will do nothing. Kernel is already setting this bit in CACHE_MODE_SS register which is saved/restored to/from the HW context image. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 15:38:42 -07:00
Anuj Phogat	c1d8300117	anv/icl: Don't set float blend optimization bit in CACHE_MODE_SS CACHE_MODE_SS is not listed in gfxspecs table for user mode non-privileged registers. So, making any changes from Mesa will do nothing. Kernel is already setting this bit in CACHE_MODE_SS register which is saved/restored to/from the HW context image. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 15:38:42 -07:00
Jason Ekstrand	227dabc266	anv: Implement VK_EXT_vertex_attribute_divisor Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-09 15:37:51 -07:00
Jason Ekstrand	2caf6c0392	anv/pipeline: Add a per-VB instance divisor Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-09 15:37:51 -07:00
Jason Ekstrand	32f4feb5a0	anv/pipeline: Use a per-VB struct instead of separate arrays Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-09 15:37:51 -07:00
Jose Maria Casanova Crespo	6db20229ab	anv: Enable SPV_KHR_8bit_storage and VK_KHR_8bit_storage Enables SPV_KHR_8bit_storage and VK_KHR_8bit_storage on gen 8+ using the VK_KHR_get_physical_device_properties2 functionality to expose if the extension is supported or not. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:50 +02:00
Jose Maria Casanova Crespo	0c01bf70e0	spirv/nir: Add support for SPV_KHR_8bit_storage Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:50 +02:00
Jose Maria Casanova Crespo	f29c19cd5c	spirv: Include headers and grammar for SPV_KHR_8bit_storage Updates headers and grammar to ff684ffc6a35d2a58f0f63108877d0064ea33feb Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:50 +02:00
Jose Maria Casanova Crespo	cd0afab99b	i965/fs: Enable store_ssbo for 8-bit types. v2: Update comment according to this patch. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:50 +02:00
Jose Maria Casanova Crespo	11c904d0d3	intel/compiler: relax brw_eu_validate for byte raw movs When the destination is a BYTE type allow raw movs even if the stride is not exact multiple of destination type and exec type, execution type is Word and its size is 2. This restriction was only allowing stride==2 destinations for 8-bit types. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo	87fc9af3fc	i965/fs: Enable conversions to 8-bit integers Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo	030472c1f0	i965: Support for 8-bit base types in helper functions Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo	232ed89802	i965/fs: Register allocator shoudn't use grf127 for sends dest Since Gen8+ Intel PRM states that "r127 must not be used for return address when there is a src and dest overlap in send instruction." This patch implements this restriction creating new grf127_send_hack_node at the register allocator. This node has a fixed assignation to grf127. For vgrf that are used as destination of send messages we create node interfereces with the grf127_send_hack_node. So the register allocator will never assign to these vgrf a register that involves grf127. If dispatch_width > 8 we don't create these interferences to the because all instructions have node interferences between sources and destination. That is enough to avoid the r127 restriction. This fixes CTS tests that raised this issue as they were executed as SIMD8: dEQP-VK.spirv_assembly.instruction.graphics.8bit_storage.8struct_to_32struct.storage_buffer_*int_geom Shader-db results on Skylake: total instructions in shared programs: 7686798 -> 7686797 (<.01%) instructions in affected programs: 301 -> 300 (-0.33%) helped: 1 HURT: 0 total cycles in shared programs: 337092322 -> 337091919 (<.01%) cycles in affected programs: 22420415 -> 22420012 (<.01%) helped: 712 HURT: 588 Shader-db results on Broadwell: total instructions in shared programs: 7658574 -> 7658625 (<.01%) instructions in affected programs: 19610 -> 19661 (0.26%) helped: 3 HURT: 4 total cycles in shared programs: 340694553 -> 340676378 (<.01%) cycles in affected programs: 24724915 -> 24706740 (-0.07%) helped: 998 HURT: 916 total spills in shared programs: 4300 -> 4311 (0.26%) spills in affected programs: 333 -> 344 (3.30%) helped: 1 HURT: 3 total fills in shared programs: 5370 -> 5378 (0.15%) fills in affected programs: 274 -> 282 (2.92%) helped: 1 HURT: 3 v2: Avoid duplicating register classes without grf127. Let's use a node with a fixed assignation to grf127 and create interferences to send message vgrf destinations. (Eric Anholt) v3: Update reference to CTS VK_KHR_8bit_storage failing tests. (Jose Maria Casanova) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: 18.1 <mesa-stable@lists.freedesktop.org>	2018-07-10 00:14:49 +02:00
Jose Maria Casanova Crespo	0e47ecb29a	intel/compiler: grf127 can not be dest when src and dest overlap in send Implement at brw_eu_validate the restriction from Intel Broadwell PRM, vol 07, section "Instruction Set Reference", subsection "EUISA Instructions", Send Message (page 990): "r127 must not be used for return address when there is a src and dest overlap in send instruction." v2: Style fixes (Matt Turner) Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: 18.1 <mesa-stable@lists.freedesktop.org>	2018-07-10 00:14:49 +02:00
Dave Airlie	6f3aee40f9	radv: using tls to store llvm related info and speed up compiles (v10) This uses the common compiler passes abstraction to help radv avoid fixed cost compiler overheads. This uses a linked list per thread stored in thread local storage, with an entry in the list for each target machine. This should remove all the fixed overheads setup costs of creating the pass manager each time. This takes a demo app time to compile the radv meta shaders on nocache and exit from 1.7s to 1s. It also has been reported to take the startup time of uncached shaders on RoTR from 12m24s to 11m35s (Alex) v2: fix llvm6 build, inline emit function, handle multiple targets in one thread v3: rebase and port onto new structure v4: rename some vars (Bas) v5: drag all code into radv for now, we can refactor it out later for radeonsi if we make it shareable v6: use a bit more C++ in the wrapper v7: logic bugs fixed so it actually runs again. v8: rebase on top of radeonsi changes. v9: drop some C++ headers, cleanup list entry v10: use pop_back (didn't have enough caffeine) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-10 07:58:03 +10:00
Adam Jackson	c1ec582059	swrast: Fix eglMakeCurrent(dpy, NULL, NULL, ctx) (v2) Fixes 14 piglits, mostly in egl_khr_create_context. v2: Also short-circuit the same-context-no-drawables case (Eric Anholt) Fixes: https://github.com/anholt/libepoxy/issues/177 Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2018-07-09 16:09:58 -04:00
Lionel Landwerlin	7205bdf41f	intel: tools: dump_gpu: fix ppgtt mapping We were not properly writing page tables when the virtual address range spans multiple subtrees of the tables. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-09 21:08:08 +01:00
Eric Anholt	beeb94402f	v3d: Implement noperspective varyings on V3D 4.x. Fixes a bunch of piglit interpolation tests, and reduces my concern about some MSAA blit shaders with noperspective varyings.	2018-07-09 11:48:32 -07:00
Eric Anholt	4b4795be9d	v3d: Refactor flat shade/centroid flag emission. The logic was duplicated in a pretty gross way, when what we really need is just a helper function for stuffing the values in the packet. This will make implementing noperspective easier.	2018-07-09 11:48:32 -07:00
Eric Anholt	93f437d128	v3d: Fix typo in dither mode offset. We weren't using the field yet, so it didn't affect anything. Fixes: `c0476d964a` ("v3d: Express dithering mode in the same way that the CLIF parser does.")	2018-07-09 11:48:32 -07:00
zhaowei yuan	73ec437627	glsl: Treat sampler2DRect and sampler2DRectShadow as reserved in ES2 "sampler2DRect" and "sampler2DRectShadow" are specified as reserved from GLSL 1.1 and GLSL ES 1.0 Signed-off-by: zhaowei yuan <zhaowei.yuan@samsung.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106906 Reviewed-by: Eric Anholt <eric@anholt.net> Fixes: `34f7e761bc` ("glsl/parser: Track built-in types using the glsl_type directly")	2018-07-09 11:37:08 -07:00
Charmaine Lee	097952abaa	st/wgl: check for NULL piAttribList in wglCreatePbufferARB() Java2d opengl pipeline passes NULL piAttribList to wglCreatePbufferARB(). So skip parsing the attribute list if it is NULL. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-07-06 17:32:49 -07:00
Jason Ekstrand	a695de5845	anv: Add support for VK_KHR_create_renderpass2 The implementation of CreateRenderPass2 uses the helpers we broke out in previous commits. The implementations of the new vkCmd functions just call the old versions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 10:11:53 -07:00
Jason Ekstrand	208be8eafa	anv: Make subpass::depth_stencil_attachment a pointer This makes certain checks a bit easier and means that we don't have the attachment information duplicated in the attachment list and in depth_stencil_attachment. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 10:11:53 -07:00
Jason Ekstrand	75e308fc44	anv/pass: Move implicit dependency setup to anv_render_pass_compile Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 10:11:53 -07:00
Jason Ekstrand	144626946e	anv/pass: Move some dependency setup into a helper This new helper takes a VkSubpassDependency2KHR for future-proofing. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 10:11:53 -07:00
Jason Ekstrand	6f9485d21f	anv/pass: Move a bunch of analysis into a separate "compile" stage Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 10:11:53 -07:00
Jason Ekstrand	55285b8404	anv/pass: Use a designated initializer for attachments Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 10:11:53 -07:00
Jason Ekstrand	6c746e8fea	anv: Bump the advertised patch version to 80 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 10:11:53 -07:00
Adam Jackson	d257ec0136	glx: Don't allow glXMakeContextCurrent() with only one valid drawable Drawable and readable need to either both be None or both be non-None. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-07-09 12:03:18 -04:00
Erik Faye-Lund	af6b7bf236	mesa: verify MaxVertexAttribStride for GLES 3.1 The OpenGL 3.1 specification, Table 20.41 ("Implementation Dependent Values"), defines the minimum-maximum value for MAX_VERTEX_ATTRIB_STRIDE to be 2048. So we shouldn't enable OpenGL ES 3.1 on implementations where this isn't the case. Let's add a check for this. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-09 17:32:31 +02:00
Erik Faye-Lund	2e64a2f2d1	mesa: verify MaxVertexAttribStride for GL 4.4 The OpenGL 4.4 specification, Table 23.55 ("Implementation Dependent Values"), defines the minimum-maximum value for MAX_VERTEX_ATTRIB_STRIDE to be 2048. So we shouldn't enable OpenGL 4.4 on implementations where this isn't the case. Let's add a check for this. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-09 17:32:31 +02:00
Erik Faye-Lund	747cf468ff	r600: report incorrect max-vertex-attrib for GL 4.4 OpenGL 4.4 requires a max vertex attrib of 2048 or higher, but r600 only supports 2047. Technically, this makes it a GL4.3 GPU, but it's currently exposing GL4.4. To avoid regressing the GL version supported in the following patches, let's just lie and pretend like we support 2048. Any applications using 2048 are already broken anyway. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-09 17:32:31 +02:00
Jose Maria Casanova Crespo	6706b421f0	intel/fs: use uint type for per_slot_offset at GS This helps us to compact original instruction: mul(8) g3<1>D g6<8,8,1>UD 0x00000006UD { align1 1Q }; So now we emit: mul(8) g3<1>UD g6<8,8,1>UD 0x00000006UD { align1 1Q compacted }; Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-07-09 15:28:48 +02:00
Samuel Pitoiset	e8f82b33fb	radv: add the trace BO to the list when starting a new cmdbuf That might reduce CPU overhead a little bit when using RADV_TRACE_FILE. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-09 13:57:01 +02:00
Samuel Pitoiset	5e5a28d52a	radv: reduce CPU overhead in radv_flush_descriptors() The number of enabled descriptors for a given pipeline stage can be computed at compile time. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-09 13:56:58 +02:00
Iago Toral Quiroga	81ca08e030	intel/compiler: remove unused function Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 13:21:48 +02:00
Iago Toral Quiroga	449c22004c	anv/pipeline: honor the pipeline_cache_enabled run-time flag v2: merge both conditions to reduce the diff (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-09 08:40:26 +02:00
Roland Scheidegger	817efd8968	r600/sb: fix crash in fold_alu_op3 fold_assoc() called from fold_alu_op3() can lower the number of src to 2, which then leads to an invalid access to n.src[2]->gvalue(). This didn't seem to have caused much harm in the past, but on Fedora 28 it will crash (presumably because -D_GLIBCXX_ASSERTIONS is used, although with libstdc++ 4.8.5 this didn't do anything, -D_GLIBCXX_DEBUG was needed to show the issue). An alternative fix would be to instead call fold_alu_op2() from within fold_assoc() when the number of src is reduced and return always TRUE from fold_assoc() in this case, with the only actual difference being the return value from fold_alu_op3() then. I'm not sure what the return value actually should be in this case (or whether it even can make a difference). https://bugs.freedesktop.org/show_bug.cgi?id=106928 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-09 07:17:29 +01:00
Jason Ekstrand	7c92c7d151	vulkan: Update the XML and headers to 1.1.80 Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-08 21:39:18 -07:00
Lionel Landwerlin	420bf14e12	i965: fix clear color bo address relocation Fixes: `7987d041fd` ("i965/surface_state: Emit the clear color address instead of value.") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-07 20:54:55 +01:00
Mauro Rossi	1a1f2b134c	radv: winsys/amdgpu: include missing pthread.h header pthread types are used in some files without explicitly including pthread.h. This leads to compile errors on Android 7.x nougat-x86 e.g. in src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c:31: In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.h:32: external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h:52:2: error: unknown type name 'pthread_mutex_t' pthread_mutex_t global_bo_list_lock; ^ 1 error generated. Including pthread.h explicitly solves the building error Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-07 20:53:59 +02:00
Karol Herbst	de13978733	nv50/ir: fix Instruction::isActionEqual for PHI instructions phi instructions don't have the same results by simply having the same sources. They need to be inside the same BasicBlock or share an equal condition resulting in a path through the shader selecting equal sources as well. short example: cond = ...; const0 = 0; const1 = 1; if (cond) { ssa_1 = const0; } else { ssa_2 = const1; } ssa_3 = phi ssa_1 ssa_2; if (!cond) { ssa_4 = const0; } else { ssa_5 = const1; } ssa_6 = phi ssa_4 ssa_5; although both phis actually have sources with equal results, merging them would be wrong due to having a different condition selecting which source to take. For now we also stick an assert into GlobalCSE, because it should never end up having to merge phi instructions. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-07-07 20:32:33 +02:00
Rhys Perry	f2cc694d8e	nvc0/ir: use the combined tid special register total instructions in shared programs : 5804448 -> 5804690 (0.00%) total gprs used in shared programs : 670065 -> 670065 (0.00%) total shared used in shared programs : 548832 -> 548832 (0.00%) total local used in shared programs : 21068 -> 21068 (0.00%) local shared gpr inst bytes helped 0 0 0 5 5 hurt 0 0 0 191 191 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-07-07 20:31:56 +02:00
Jason Ekstrand	6e88561156	nir/print: Print texture and sampler indices Commit 5fb69daa6076e56b deleted support from nir_print for printing the texture and sampler indices on texture instructions. This commit just brings it back as best as we can. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-07 09:32:33 -07:00
Ian Romanick	f8e54d02f7	intel/compiler: Relax mixed type restriction for saturating immediates At the time of commit `7bc6e455e2` (i965: Add support for saturating immediates.) we thought mixed type saturates would be impossible. We were only thinking about type converting moves from D to F, for example. However, type converting moves w/saturate from F to DF are definitely possible. This change minimally relaxes the restriction to allow cases that I have been able to trigger via piglit tests. Fixes new piglit tests: - arb_gpu_shader_fp64/execution/built-in-functions/fs-sign-sat-neg-abs.shader_test - arb_gpu_shader_fp64/execution/built-in-functions/vs-sign-sat-neg-abs.shader_test Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-06 16:20:10 -07:00
Ian Romanick	9626ea497d	i965/vec4: Properly handle sign(-abs(x)) This is achieved by copying the sign(abs(x)) optimization from the FS backend. On Gen7 and earlier platforms, this fixes new piglit tests: - glsl-1.10/execution/vs-sign-neg-abs.shader_test - glsl-1.10/execution/vs-sign-sat-neg-abs.shader_test Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-06 16:20:07 -07:00
Ian Romanick	88bd37c010	i965/fs: Properly handle sign(-abs(x)) Fixes new piglit tests: - glsl-1.10/execution/fs-sign-neg-abs.shader_test - glsl-1.10/execution/fs-sign-sat-neg-abs.shader_test - glsl-1.10/execution/vs-sign-neg-abs.shader_test - glsl-1.10/execution/vs-sign-sat-neg-abs.shader_test Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-06 16:20:04 -07:00
Lionel Landwerlin	c05c8d65ba	vulkan: utils: handle hexadecimal values in registry Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-06 22:12:00 +01:00
Marek Olšák	0eaf069679	st/dri: fix a crash in server_wait_sync Ported from i965 including the comment. This fixes: dEQP-EGL.functional.reusable_sync.valid.wait_server Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-07-06 16:23:37 -04:00
Mathieu Bridon	b39bdb0716	python: Stop using the Python 2 exception syntax We could have made this compatible with Python 3 by using: except Exception as e: But since none of this code actually uses the exception objects, let's just drop them entirely. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-07-06 10:18:43 -07:00
Mathieu Bridon	e5a8d51e54	python: Use spaces, not tabs Python 3 doesn't allow mixing spaces and tabs in a script, contrarily to Python 2. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-06 10:04:55 -07:00
Mathieu Bridon	0f7b18fa0d	python: Use the print function In Python 2, `print` was a statement, but it became a function in Python 3. Using print functions everywhere makes the script compatible with Python versions >= 2.6, including Python 3. Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr> Acked-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-06 10:04:22 -07:00
Jon Turney	b3a42fa066	vma/tests: Fix compilation if limits.h defines PAGE_SIZE (v2) per POSIX, limits.h may define PAGE_SIZE when the value is not indeterminate v2: just change the variable name, since there's no intended correlation here between this value and the machine's actual page size. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-07-06 14:01:08 +01:00
Samuel Pitoiset	85865dbe0d	radv: fix emitting the view index on GFX9 For merged shaders, VS as HS for example. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-06 10:22:53 +02:00
Ian Romanick	965a06dbd7	i965/vec4: Make the vec4_visitor::nir_emit_instr default case unreachable The bug fixed by the previous commit went undetected because extra stderr messages are not flagged by the CI. Copy the solution from fs_visitor::nir_emit_instr and mark the default case unreachable. An alternate solution is to delete the default case so that the compiler will issue a warning. That may require more work since there are other (impossible) cases that exist. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-05 21:13:32 -07:00
Ian Romanick	a4d4787327	intel/compiler: More DCE after lowering Some of the lowering passes, nir_lower_locals_to_regs for example, can cause some previously live code to be dead. This pass in particular leaves a bunch of nir_instr_type_deref instructions floating around. This causes shader-db runs on Gen5 through Haswell to spew tons of messages like: VS instruction not yet implemented by NIR->vec4 UnrealEngine4/EffectsCaveDemo/239.shader_test is one shader that generates these messages. Cleaning up the dead code fixes that. To verify, I did a shader-db before and after. Even though all the messages are gone, the results make my brain hurt. :( Haswell total cycles in shared programs: 411890163 -> 411891145 (<.01%) cycles in affected programs: 57016 -> 57998 (1.72%) helped: 3 HURT: 11 helped stats (abs) min: 2 max: 154 x̄: 96.67 x̃: 134 helped stats (rel) min: 0.08% max: 2.23% x̄: 1.42% x̃: 1.96% HURT stats (abs) min: 18 max: 686 x̄: 115.64 x̃: 20 HURT stats (rel) min: 0.81% max: 7.12% x̄: 1.87% x̃: 0.93% 95% mean confidence interval for cycles value: -51.39 191.67 95% mean confidence interval for cycles %-change: -0.14% 2.46% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total cycles in shared programs: 259114802 -> 259115032 (<.01%) cycles in affected programs: 24034 -> 24264 (0.96%) helped: 1 HURT: 9 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.08% max: 0.08% x̄: 0.08% x̃: 0.08% HURT stats (abs) min: 18 max: 48 x̄: 25.78 x̃: 20 HURT stats (rel) min: 0.80% max: 1.94% x̄: 1.08% x̃: 0.80% 95% mean confidence interval for cycles value: 12.42 33.58 95% mean confidence interval for cycles %-change: 0.54% 1.38% Cycles are HURT. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `5a02ffb733` nir: Rework lower_locals_to_regs to use deref instructions Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-05 21:13:21 -07:00
Eric Anholt	9d0406c52f	v3d: Fix leak of the default attributes BOs. The GLES3 CTS makes a lot more progress on a run now.	2018-07-05 15:50:54 -07:00
Eric Anholt	6b11131373	v3d: Fix leak of the spill BO on context destruction.	2018-07-05 15:50:52 -07:00
Eric Anholt	4b2ba18ff3	nir: Apply fragment color clamping to gl_FragData[] as well. From the ARB_color_buffer_float spec: 35. Should the clamping of fragment shader output gl_FragData[n] be controlled by the fragment color clamp. RESOLVED: Since the destination of the FragData is a color buffer, the fragment color clamp control should apply. Fixes arb_color_buffer_float-mrt mixed on v3d. Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-07-05 12:39:36 -07:00
Eric Anholt	03f6d26b62	v3d: Skip emitting per-RT blend state for RTs with blend disabled. Cleans up the CL of fbo-drawbuffers2-blend a bit. We could do better on more complicated cases by noticing if multiple RTs have the same blend state and emitting them in a single packet.	2018-07-05 12:39:36 -07:00
Eric Anholt	572f6ab489	v3d: Add proper support for GL_EXT_draw_buffers2's blending enables. I had flagged it as enabled on V3D 4.x, but not actually implemented the per-RT enables. Fixes piglit fbo_drawbuffers2-blend.	2018-07-05 12:39:36 -07:00
Eric Anholt	5601ab3981	v3d: Add support for GL_SAMPLE_ALPHA_TO_ONE. Fixes piglit ext_framebuffer_multisample-draw-buffers-alpha-to-one	2018-07-05 12:39:36 -07:00
Eric Anholt	7b63371420	v3d: Respect swap_color_rb for the f32_color_rb case. We don't actually set the two flags together, but I want to use the r/g/b/a reordered fields in the next commit.	2018-07-05 12:39:36 -07:00
Eric Anholt	dbd52585fa	st/nir: Disable varying packing when doing transform feedback. The varying packing would result in st_nir_assign_var_locations() picking new driver_locations, despite the pipe_stream_output already being set up for the old driver location. This left the gallium driver with no way to work back to what varying was referenced by pipe_stream_output. Fixes these tests on V3D: dEQP-GLES3.functional.transform_feedback.random.separate.points.3 dEQP-GLES3.functional.transform_feedback.random.separate.points.7 dEQP-GLES3.functional.transform_feedback.random.separate.points.9 dEQP-GLES3.functional.transform_feedback.random.separate.triangles.3 dEQP-GLES3.functional.transform_feedback.random.separate.triangles.8 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-05 12:38:27 -07:00
Jon Turney	ab7aa0f10c	meson: Set with_dri from with_gallium when DRI glx is explicitly configured Set with_dri from with_gallium when DRI GLX is explicitly configured, as well as when DRI GLX is chosen automatically. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-07-05 17:48:35 +01:00
Samuel Pitoiset	72fd93370f	radv/winsys: make use of radeon_emit() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-05 17:23:25 +02:00
Samuel Pitoiset	f2a310849e	radv: only flush CB meta in pipeline image barriers when needed If the given image doesn't enable CMASK, FMASK or DCC, it is useless to flush CB metadata. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-05 17:20:16 +02:00
Samuel Pitoiset	17bb4c2cf5	radv: only flush DB meta in pipeline image barriers when needed If the given image doesn't have HTILE, it is useless to flush DB metadata. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-05 17:20:12 +02:00
Samuel Pitoiset	2a3e9c89ff	radv: fix "error: initializer element is not constant" build error GCC 4.8 fails to compile with "static const", while GCC 8.1 fails to compile with only "static". Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-05 17:12:02 +02:00
Lionel Landwerlin	78d5c1c82a	util: u_queue: fix android build error mesa/src/util/u_queue.c:242:15: error: address of array 'queue->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion] Fixes: `b238e33bc9` "kutil/queue: add a process name into a thread name" Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-05 15:42:26 +01:00
Benedikt Schemmer	93a5c9bc99	Util: fix msvc build The MSVC preprocessor doesn't understand #warning Fixes: `2e1e6511f7` ("util: extract get_process_name from xmlconfig.c") Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-07-05 14:24:08 +01:00
Mathieu Bridon	f9b6dfd919	python: Specify the JSON separators On Python 2, the default JSON separators are ', ' for items and ': ' for dicts. On Python 3, the default is the same when no indent is specified, but if one is (and we do specify one) then the default items separator becomes ',' (the dict separator remains unchanged). This change explicitly specifies the Python 3 default, which helps ensuring that the output is identical, whether it was generated by Python 2 or 3. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-05 12:52:38 +01:00
Mathieu Bridon	fe8a153648	python: Stabilize some script outputs In Python, dictionaries and sets are unordered, and as a result there is no guarantee that running this script twice will produce the same output. Using ordered dicts and explicitly sorting items makes the build more reproducible, and will make it possible to verify that we're not breaking anything when we move the build scripts to Python 3. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-05 12:52:12 +01:00
Lionel Landwerlin	d337713ec4	intel: tools: remove drm-uapi defines We already embed the headers, no need to redefine defines/structs. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	87915baa23	intel: intel_dump_gpu: use simulator id in captures Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	aab21cedc6	intel: devinfo: add simulator id Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Scott D Phillips	0f53948c59	intel: tools: dump-gpu: dump 48-bit addresses For gen8+, write out PPGTT tables in aub files so that full 48-bit addresses can be serialized. v2: Fix handling of `end` index in map_ppgtt v3: Correctly mark GGTT entry as present (Rafael) Signed-off-by: Scott D Phillips <scott.d.phillips@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	6e37b949d5	intel: tools: import intel_aubdump Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	fa00b9c1c9	intel: tools: update intel_aub.h Scott added new stuff in IGT. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	5ffa35b64d	intel: batch-decoder: add missing return line Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	28476c9d81	intel: batch-decoder: don't ask for constant BO until decoding With PPGTT mappings, our aubinator implementation can be quite slow if we request a buffer that doesn't exist. Instead of doing a PPGTT walk for invalid addresses (0 lengths), wait until we're sure we want to decode the data. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Scott D Phillips	c262ec19d0	intel/batch-decoder: handle non-contiguous binding table / surface state Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-05 11:57:45 +01:00
Scott D Phillips	3ebee627cb	intel/tools/aubinator: aubinate ppgtt aubs v2: by Lionel Fix memfd_create compilation issue Fix pml4 address stored on 32 instead of 64bits Return no buffer if first ppgtt page is not mapped v3: Drop additional memfd_create() (Rafael) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	3228335b55	intel: aubinator: handle GGTT mappings We use memfd to store physical pages as they get read/written to and the GGTT entries translating virtual address to physical pages. Based on a commit by Scott Phillips. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Jason Ekstrand	2602ea89d5	util: rb-tree: A simple, invasive, red-black tree This is a simple, invasive, liberally licensed red-black tree implementation. It's an invasive data structure similar to the Linux kernel linked-list where the intention is that you embed a rb_node struct in the data structure you intend to put into the tree. The implementation is mostly based on the one in "Introduction to Algorithms", third edition, by Cormen, Leiserson, Rivest, and Stein. There were a few other key design points: * It's an invasive data structure similar to the [Linux kernel linked list]. * It uses NULL for leaves instead of a sentinel. This means a few algorithms differ a small bit from the ones in "Introduction to Algorithms". * All search operations are inlined so that the compiler can optimize away the function pointer call. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	144b40db54	intel: aubinator: drop the 1Tb GTT mapping Now that we're softpinning the address of our BOs in anv & i965, the addresses selected start at the top of the addressing space. This is a problem for the current implementation of aubinator which uses only a 40bit mmapped address space. This change keeps track of all the memory writes from the aub file and fetches them on request by the batch decoder. As a result we can get rid of the 1<<40 mmapped address space and only rely on the mmap aub file \o/ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	9d08ef6335	intel: aubinator: rework register writes handling Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	86cb05a6d3	intel: aubinator: remove standard input processing option On a follow up commit in this series, we stop copying the data from the mmap'ed file into our big gtt mmap, and start referencing data in it directly. So reallocating the read buffer and adding more data from stdin wouldn't work. For that reason, let's stop supporting stdin processing. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Lionel Landwerlin	08d85a8301	intel: aubinator: remove unused variables These memory offsets are stored in the gen_batch_decode_ctx. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-07-05 11:57:45 +01:00
Mathieu Bridon	3153bcc73e	gallium/auxiliary: Fix string matching Commit `f69bc797e1` did the following: - if format.layout in ('bptc', 'astc'): + if format.layout in ('astc'): The intention was to go from matching either 'bptc' or 'astc' to matching only 'astc'. But the new code doesn't respect this intention any more, because in Python `('astc')` is not a tuple containing a string, it is just the string. (the parentheses are simply ignored) That means we now match any substring of 'astc', for example 'a'. This commit fixes the test to respect the original intention. Fixes: `f69bc797e1` "gallium/auxiliary: Add helper support for bptc format compress/decompress" Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-05 11:48:47 +01:00
Samuel Pitoiset	8339ba827b	radv: optimize vkCmd{Set,Reset}Event() a little bit Always emitting a bottom-of-pipe event is quite dumb. Instead, start to optimize these functions by syncing PFP for the top-of-pipe and syncing ME for the post-index-fetch event. This can still be improved by emitting EOS events for syncing PS and CS stages. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-05 11:31:06 +02:00
Samuel Pitoiset	f635109140	radv: optimize radv_CmdWaitEvents() This introduces radv_barrier() (same as the draw/dispatch codepath). This helper is used for merging the code from CmdWaitEvents() and CmdPipelineBarrier because it's quite similar. We do ignore the source stage mask for CmdWaitEvents because it's irrelevant when event objects are used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-05 11:31:03 +02:00
Roland Scheidegger	620626a371	nir/linker: fix msvc build Empty initializer braces aren't valid c (it's a gnu extension, and it's valid in c++). Hopefully fixes appveyor / msvc build... Fixes `6677e131b8` Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-05 09:27:05 +02:00
Gert Wollny	806a42fc47	r600: compare structure elements instead of doing a memcmp Structures might be padded by the compiler and these padding bytes remain un-initialized which in turn makes memcmp return a difference where from the logical point of view there is none. Fixes valgrind: Conditional jump or move depends on uninitialised value(s) at 0x4C32CBA: __memcmp_sse4_1 (vg_replace_strmem.c:1099) by 0xB8D2537: r600_set_vertex_buffers (r600_state_common.c:573) by 0xB71D44A: u_vbuf_set_driver_vertex_buffers (u_vbuf.c:1129) by 0xB71F7BB: u_vbuf_draw_vbo (u_vbuf.c:1153) by 0xB3B92CB: st_draw_vbo (st_draw.c:235) by 0xB36B1AE: vbo_draw_arrays (vbo_exec_array.c:391) by 0xB36BB0D: vbo_exec_DrawArrays (vbo_exec_array.c:550) by 0x10A989: piglit_display (textureSize.c:157) by 0x4F8F174: run_test (piglit_fbo_framework.c:52) by 0x4F7BA12: piglit_gl_test_run (piglit-framework-gl.c:229) by 0x10A60A: main (textureSize.c:71) Uninitialised value was created by a stack allocation at 0xB3948FD: st_update_array (st_atom_array.c:388) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-05 07:59:07 +02:00
Gert Wollny	9c1ae6a1a1	r600: Add R4G4B4A4 and A1B5G5R5 to supported vertex formats Below tests would fail with an error message "Vertex format (R4G4B4A4\|R5G5B5A1) not supported." Add the formate to the translation routine to enable these formats. Fixes: dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_2d dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_cube dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_2d dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_cube dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_2d dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_cube dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_2d dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_cube dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_2d_array dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_3d dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_2d_array dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_3d dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_2d_array dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_3d dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_2d_array dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_3d Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-05 07:57:28 +02:00
Gert Wollny	5278436d67	r600: force LOD range to be only one value when mip.min filter is NONE For a texture that has only one LOD defined, but for which GL_TEXTURE_MAX_LEVEL is the default (1000) and GL_TEXTURE_MIN_LOD != GL_TEXTURE_MAX_LOD the reading from the texture does not properly resolve the LOD level and texture lookup might fail. Hence, when no mipmap filter is given (indicating that no mip-mapping takes place), force the LOD range to contain only one value. Fixes: dEQP-GLES3.functional.shaders.texture_functions.texture.(i\|u)sampler2d dEQP-GLES3.functional.texture.format.sized.cube.rgb* out of VK_GL_CTS/android/cts/master/gles3-master.txt Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-07-05 07:57:28 +02:00
Gert Wollny	e7dd1a84a0	mesa/st: draw_vbo: initialize restart_index too restart_index is later always used in a comparison, so it should be initialized properly. Fixes valgrind warning: Conditional jump or move depends on uninitialised value(s) at 0xB8D682F: r600_draw_vbo (r600_state_common.c:2153) by 0xB71F743: u_vbuf_draw_vbo (u_vbuf.c:1156) by 0xB3B92DB: st_draw_vbo (st_draw.c:235) by 0xB36B1AE: vbo_draw_arrays (vbo_exec_array.c:391) by 0xB36BB0D: vbo_exec_DrawArrays (vbo_exec_array.c:550) by 0x10A989: piglit_display (textureSize.c:157) by 0x4F8F174: run_test (piglit_fbo_framework.c:52) by 0x4F7BA12: piglit_gl_test_run (piglit-framework-gl.c:229) by 0x10A60A: main (textureSize.c:71) Uninitialised value was created by a stack allocation at 0xB3B90B0: st_draw_vbo (st_draw.c:143) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-07-05 07:57:16 +02:00
Timothy Arceri	0cb6537dee	mesa: enable ARB_direct_state_access in OpenGL 4.5 compat profile It's unlikely anyone will add proper ARB_direct_state_access compat support before we branch 18.2. Enabling the extension in 4.5 at least allows users to make use of MESA_GL_VERSION_OVERRIDE=4.5COMPAT for games like No Mans Sky. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-05 13:15:34 +10:00
Timothy Arceri	39063334d3	util/drirc: turn on force_glsl_extensions_warn for No Mans Sky The game forgets to enable multiple extensions in its shaders, one of those extensions is EXT_texture_array. But enabling this config entry fixes at least one other rendering issue that enabling EXT_texture_array on its own doesn't fix. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-05 13:05:47 +10:00
Marek Olšák	9b4c4fe334	util/queue: remove leftover debug code	2018-07-04 22:19:47 -04:00
Marek Olšák	7fab8a4b37	Shorten u_queue names There is a 15-character limit for thread names shared by the queue name and process name. Shorten the thread name to make space for the process name. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-04 22:03:35 -04:00
Marek Olšák	b238e33bc9	util/queue: add a process name into a thread name v2: simplifications Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)	2018-07-04 21:54:39 -04:00
Marek Olšák	7149bffe66	gallium/os: use util_get_process_name when possible Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-04 21:16:57 -04:00
Marek Olšák	2e1e6511f7	util: extract get_process_name from xmlconfig.c Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-07-04 21:16:03 -04:00
Marek Olšák	4695984dbc	ac: fold LLVMContext creation into ac_llvm_context_init Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-04 15:48:18 -04:00
Marek Olšák	f5cb4194c9	radeonsi: reorder code in si_llvm_context_init Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-04 15:48:18 -04:00
Marek Olšák	ff330055e9	radeonsi: use ac_compile_module_to_binary to reduce compile times Compile times of simple shaders are reduced by ~20%. Compile times of prologs and epilogs are reduced by up to 40%. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-04 15:48:18 -04:00
Marek Olšák	0075e5fed8	ac: add reusable helpers for direct LLVM compilation This is basically LLVMTargetMachineEmitToMemoryBuffer inlined and reworked. struct ac_compiler_passes (opaque type) contains the main pass manager. ac_create_llvm_passes -- the result can go to thread local storage ac_destroy_llvm_passes -- can be called by a destructor in TLS ac_compile_module_to_binary -- from LLVMModuleRef to ac_shader_binary The motivation is to do the expensive call addPassesToEmitFile once per context or thread. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-04 15:48:18 -04:00
Rhys Perry	c2ae9b4052	nvc0: implement multisampled images on Maxwell+ Changes in v2: - make loadSuInfo32() protected without making the rest protected - move NVC0_SU_INFO_* into nv50_ir_lowering_nvc0.h instead of duplicating NVC0_SU_INFO_MS Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-07-04 16:04:23 +02:00
Neil Roberts	2d5ddbe960	i965: Fix output register sizes when variable ranges are interleaved In `6f5abf3146` this code was fixed to calculate the maximum size of an attribute in a separate pass and then allocate the registers to that size. However this wasn’t taking into account ranges that overlap but don’t have the same starting location. For example: layout(location = 0, component = 0) out float a[4]; layout(location = 2, component = 1) out float b[4]; Previously, if ‘a’ was processed first then it would allocate a register of size 4 for location 0 and it wouldn’t allocate another register for location 2 because it would already be covered by the range of 0. Then if something tries to write to b[2] it would try to write past the end of the register allocated for ‘a’ and it would hit an assert. This patch changes it to scan for any overlapping ranges that start within each range to calculate the maximum extent and allocate that instead. Fixed Piglit’s arb_enhanced_layouts/execution/component-layout/ vs-fs-array-interleave-range.shader_test Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Fixes: `6f5abf3146` "i965: Fix output register sizes when multiple variables share a slot."	2018-07-04 10:57:51 +02:00
Dave Airlie	8c51caab24	r600/sb: cleanup if_conversion iterator to be legal C++ The current code causes: /usr/include/c++/8/debug/safe_iterator.h:207: Error: attempt to copy from a singular iterator. This is due to the iterators getting invalidated, fix the reverse iterator to use the return value from erase, and cast it properly. (used Mathias suggestion) Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-07-04 07:42:22 +01:00
Marek Olšák	45f9d58668	radeonsi: fix compiler breakage Broken by `d853d3a59b`.	2018-07-04 00:13:38 -04:00
Dave Airlie	5b32b246cf	ac: make some fns static Some of the compiler functions are no longer called outside the util file. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 10:29:26 +10:00
Dave Airlie	7398913a62	ac/radv: move llvm compiler info to struct and init in one place This ports radv to the shared code, however due to a bug in LLVM version prior to 7, radv cannot add target info at this stage, as it would leak one for every shader compile, however I'd prefer to keep this llvm damage in the shared code, since it isn't the driver at fault here. We just add a flag to denote if the driver can support leaking the target info or not, and the common code does the right thing depending on the llvm version. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 10:29:16 +10:00
Dave Airlie	d853d3a59b	ac/radeonsi: port compiler init/destroy out of radeonsi. We want to share this code with radv in the future, so port it out of radeonsi. Add a return value as radv will want that to know if this succeeds Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 10:29:03 +10:00
Dave Airlie	35c82af539	radv/radeonsi: add a check ir tm options This doesn't do much yet, but it makes it easier to move the code to a common shared code base. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:32:35 +10:00
Dave Airlie	0eb65b4944	radeonsi: rename si_compiler -> ac_llvm_compiler As precursor to moving init to common code, just rename the struct and move it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:31:32 +10:00
Dave Airlie	887ba45c93	ac: add target library info helpers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:31:29 +10:00
Dave Airlie	e1387eaf12	radv: create/destroy passmgr at the higher level. This is prep work for moving this to a per-thread struct Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:31:05 +10:00
Dave Airlie	97d9b88447	radv: port to use common passmgr code. This adds a inline always pass, but otherwise should work the same. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-04 05:30:34 +10:00
Dave Airlie	584ad1eda9	ac/radeonsi: refactor out pass manager init to common code. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:18:01 +10:00
Dave Airlie	f2b3e96e75	radv: drop copy of ac_create_target_machine. Once we split the init once stuff out, this can be shared again. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:15:35 +10:00
Dave Airlie	473be16c74	ac/radv: split the non-common init_once code from the common target code. (v2) This just splits out the non-shared code and reuses ac_get_llvm_target in radv. v2: rebase on Marek's patch - fixup brace position/whitespace Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:15:23 +10:00
Neil Roberts	590cc7c8f6	i965: Use the new nir atomic counter linker for SPIR-V shaders Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Alejandro Piñeiro	c13f8ea8ac	i965: enable AtomicStorage capability for gen7+ That is the same gen requirement for ARB_shader_atomic_counters. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Antia Puentes	7600678216	mesa/glspirv: lower workgroup access to offsets This will perform the CS shared lowering. See `8761a04d0d` Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Antia Puentes	fbcebfc5bf	nir: Fix OpAtomicCounterIDecrement for uniform atomic counters From the SPIR-V 1.0 specification, section 3.32.18, "Atomic Instructions": "OpAtomicIDecrement: <skip> The instruction's result is the Original Value." However, we were implementing it, for uniform atomic counters, as a pre-decrement operation, as was the one available from GLSL. Renamed the former nir intrinsic 'atomic_counter_dec' to 'atomic_counter_pre_dec' for clarification purposes, as it implements a pre-decrement operation as specified for GLSL. From GLSL 4.50 spec, section 8.10, "Atomic Counter Functions": "uint atomicCounterDecrement (atomic_uint c) Atomically 1. decrements the counter for c, and 2. returns the value resulting from the decrement operation. These two steps are done atomically with respect to the atomic counter functions in this table." Added a new nir intrinsic 'atomic_counter_post_dec' which implements a post-decrement operation as required by SPIR-V. v2: (Timothy Arceri) Add extra spec quotes on commit message * Use "post" instead "pos" to avoid confusion with "position" Signed-off-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Neil Roberts	6677e131b8	nir/linker: Add a pure NIR implementation of the atomic counter linker This is mostly just a straight-forward conversion of link_assign_atomic_counter_resources to C directly using nir variables instead of GLSL IR variables. It is based on the version of link_assign_atomic_counter_resources in `6b8909f2d1`. I’m noting this here to make it easier to track changes and keep the NIR version up-to-date. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Neil Roberts	1fb9984d7e	nir/types: Add wrappers for a couple of atomic counter methods Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Alejandro Piñeiro	54d7fca077	spirv/nir: add capability check for SpvCapabilityAtomicStorage Capability that informs if atomic counters are supported. From SPIR-V 1.0 spec, section 3.7, "Storage Class", item 10 from table: (Column "Storage Class"): "AtomicCounter For holding atomic counters. Visible across all functions of the current invocation. Atomic counter-specific memory." (Column "Required Capability"): "AtomicStorage" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Alejandro Piñeiro	12301766de	spirv/nir: add atomic counter support on vtn_handle_ssbo_or_shared_atomic So renamed to a more general vtn_handle_atomics Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Alejandro Piñeiro	c3eb0ba0ff	spirv/nir: initialize offset on the nir var at vtn_create_variable This is convenient when dealing with atomic counter uniforms. The alternative would be doing that at vtn_handle_atomics. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Antia Puentes	4110bc4c17	nir/spirv: Fix atomic counter (multidimensional-)arrays When constructing NIR if we have a SPIR-V uint variable and the storage class is SpvStorageClassAtomicCounter, we store as NIR's glsl_type an atomic_uint to reflect the fact that the variable is an atomic counter. However, we were tweaking the type only for atomic_uint scalars, we have to do it as well for atomic_uint arrays and atomic_uint arrays of arrays of any depth. Signed-off-by: Antia Puentes <apuentes@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> v2: update after deref patches got pushed (Alejandro Piñeiro) v3: simplify repair_atomic_type (suggested by Timothy Arceri, included on the patch by Alejandro) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Alejandro Piñeiro	480d2c56b3	spirv/nir: tweak nir type when storage class is SpvStorageClassAtomicCounter GLSL types differentiates uint from atomic uint. On SPIR-V the type is uint, and the variable has a specific storage class. So we need to tweak the type based on the storage class. Ideally we would like to get the proper type at vtn_handle_type, but we don't have the storage class at that moment. We tweak only the nir type, as is the one that really requires it. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Alejandro Piñeiro	88d3325a44	nir_types: add glsl_atomic_uint_type() helper Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:41:46 +02:00
Alejandro Piñeiro	c6230b9358	spirv/nir: add offset at vtn_variable Also initialize it on var_decoration_cb This is equivalent to nir_variable.offset, used to store the location an atomic counter is stored at. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:37:32 +02:00
Alejandro Piñeiro	768c275deb	spirv/nir: SpvStorageClassAtomicCounter support on vtn_storage_class_to_mode Atomic Counters are uniforms per spec. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:37:32 +02:00
Alejandro Piñeiro	a9e6298727	nir/linker: handle uniforms without explicit location ARB_gl_spirv points out that uniforms in general need explicit location. But there are still some cases of uniforms without location, like for example uniform atomic counters. Those don't have a location from the OpenGL point of view (they are identified with a binding and offset), but Mesa internally assigns them a location. Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com> v2: squash with another patch, minor variable name tweak (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:37:32 +02:00
Alejandro Piñeiro	b0712df6cf	compiler/glsl: refactor empty_uniform_block utilities to linker_util This includes: * Move the definition of empty_uniform_block to linker_util.h * Move find_empty_block (with a rename) to linker_util.h * Refactor some code at linker.cpp to a new method at linker_util.h (link_util_update_empty_uniform_locations) So all that code could be used by the GLSL linker and the NIR linker used for ARB_gl_spirv. v2: include just "ir_uniform.h" (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-03 12:37:32 +02:00
Ian Romanick	995d993710	i965/vec4: Don't cmod propagate from CMP to ADD if the writemask isn't compatible Otherwise we can incorrectly cmod propagate in situations like add(8) g10<1>.xD g2<0>.xD -16D ... cmp.ge.f0(8) null<1>D g2<0>.xD 16D ... (+f0) sel(8) g21<1>.xyUD g14<4>.xyyyUD g18<4>.xyyyUD Sadly, this change hurts quite a few shaders. v2: Refactor writemask compatibility check into a separate function. Suggested by Caio. Ivy Bridge and Haswell had similar results. (Haswell shown) total instructions in shared programs: 12968489 -> 12968738 (<.01%) instructions in affected programs: 60679 -> 60928 (0.41%) helped: 0 HURT: 249 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.22% max: 0.81% x̄: 0.46% x̃: 0.44% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %-change: 0.44% 0.48% Instructions are HURT. total cycles in shared programs: 409171965 -> 409172317 (<.01%) cycles in affected programs: 260056 -> 260408 (0.14%) helped: 0 HURT: 176 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.04% max: 0.34% x̄: 0.17% x̃: 0.17% 95% mean confidence interval for cycles value: 2.00 2.00 95% mean confidence interval for cycles %-change: 0.16% 0.18% Cycles are HURT. Sandy Bridge total instructions in shared programs: 10423577 -> 10423753 (<.01%) instructions in affected programs: 40667 -> 40843 (0.43%) helped: 0 HURT: 176 HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.29% max: 0.79% x̄: 0.48% x̃: 0.42% 95% mean confidence interval for instructions value: 1.00 1.00 95% mean confidence interval for instructions %-change: 0.46% 0.51% Instructions are HURT. total cycles in shared programs: 146097503 -> 146097855 (<.01%) cycles in affected programs: 503990 -> 504342 (0.07%) helped: 0 HURT: 176 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.02% max: 0.36% x̄: 0.12% x̃: 0.11% 95% mean confidence interval for cycles value: 2.00 2.00 95% mean confidence interval for cycles %-change: 0.11% 0.13% Cycles are HURT. No changes on any other platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `cd635d149b` i965/vec4: Propagate conditional modifiers from compares to adds Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-02 19:19:16 -07:00
Ian Romanick	fb6dc8e894	intel/compiler: Silence unused parameter warnings brw_nir.c src/intel/compiler/brw_nir.c: In function ‘brw_nir_lower_vue_outputs’: src/intel/compiler/brw_nir.c:464:32: warning: unused parameter ‘is_scalar’ [-Wunused-parameter] bool is_scalar) ^~~~~~~~~ src/intel/compiler/brw_nir.c: In function ‘lower_bit_size_callback’: src/intel/compiler/brw_nir.c:610:57: warning: unused parameter ‘data’ [-Wunused-parameter] lower_bit_size_callback(const nir_alu_instr *alu, void *data) ^~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-07-02 16:17:19 -07:00
Kenneth Graunke	8e38947f6c	i965: Fix BRW_NEW_NUM_SAMPLES to be in .brw, not .mesa This is the wrong kind of dirty bit. Caught by GCC warnings, due to 64-bit values being truncated to 32 bits. Fixes: `b95b0e2918` (intel/anv,blorp,i965: Implement the SKL 16x MSAA SIMD32 workaround) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-07-02 15:30:21 -07:00
Jason Ekstrand	afa8f58921	anv: Add support for the on-disk shader cache The Vulkan API provides a mechanism for applications to cache their own shaders and manage on-disk pipeline caching themselves. Generally, this is what I would recommend to application developers and I've resisted implementing driver-side transparent caching in the Vulkan driver for a long time. However, not all applications do this and, for some use-cases, it's just not practical. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-02 14:52:05 -07:00
Jason Ekstrand	e0f7a3aa5b	anv/pipeline_cache: Add a _locked suffix to a function Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-02 13:07:06 -07:00
Jason Ekstrand	f5c38f4a30	anv: Add device-level helpers for searching for and uploading kernels Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-02 13:07:06 -07:00
Jason Ekstrand	eae192bf5f	anv/pipeline: Stop optimizing for not having a cache Before, we were only hashing the shader if we had a shader cache to cache things in. This means that if we ever get it wrong, we could end up trying to cache a shader with an undefined hash. Since not having a shader cache is an extremely uncommon case, let's optimize for code clarity and obvious correctness over avoiding a hash operation. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-02 13:07:06 -07:00
Jason Ekstrand	76fdc8a85c	anv: Use a default pipeline cache if none is specified If a client is dumb enough to not specify a pipeline cache, give it a default. We have to create one anyway for blorp so we may as well let the client cache shaders in it. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-02 13:07:06 -07:00
Jason Ekstrand	d1c778b362	anv: Be more careful about hashing pipeline layouts Previously, we just hashed the entire descriptor set layout verbatim. This meant that a bunch of extra stuff such as pointers and reference counts made its way into the cache. It also meant that we weren't properly hashing in the Y'CbCr conversion information from bound immutable samplers. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-07-02 13:07:06 -07:00
Jason Ekstrand	06412bfc98	anv,intel: Enable nir_opt_large_constants for Vulkan According to RenderDoc, this shaves 99.6% of the run time off of the ambient occlusion pass in Skyrim Special Edition when running under DXVK and shaves 92% off the runtime for a reasonably representative frame. When running the actual game, Skyrim goes from being a slide-show to a very stable and playable framerate on my SKL GT4e machine. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:50 -07:00
Jason Ekstrand	70ce880434	anv: Add state setup support for shader constants Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:49 -07:00
Jason Ekstrand	3a5ed18c51	anv: Add support for shader constant data to the pipeline cache Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:47 -07:00
Jason Ekstrand	1235850522	nir: Add a large constants optimization pass This pass searches for reasonably large local variables which can be statically proven to be constant and moves them into shader constant data. This is especially useful when large tables are baked into the shader source code because they can be moved into a UBO by the driver to reduce register pressure and make indirect access cheaper. v2 (Jason Ekstrand): - Use a size/align function to ensure we get the right alignments - Use the newly added deref offset helpers Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:45 -07:00
Jason Ekstrand	c90f221e0a	nir: Add a concept of constant data associated with a shader This commit adds a concept to NIR of having a blob of constant data associated with a shader. Instead of being a UBO or uniform that can be manipulated by the client, this constant data is considered part of the shader and remains constant across all invocations of the given shader until the end of time. To access this constant data from the shader, we add a new load_constant intrinsic. The intention is that drivers will eventually lower load_constant intrinsics to load_ubo, load_uniform, or something similar. Constant data will be used by the optimization pass in the next commit but this concept may also be useful for OpenCL. v2 (Jason Ekstrand): - Rename num_constants to constant_data_size (anholt) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:42 -07:00
Jason Ekstrand	e8e159e9df	nir/deref: Add helpers for getting offsets These are very similar to the related function in nir_lower_io except that they don't handle per-vertex or packed things (that could be added, in theory) and they take a more detailed size/align function pointer. One day, we should consider switching nir_lower_io over to using the more detailed size/align functions and then we could make it use these helpers instead of having its own. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:41 -07:00
Jason Ekstrand	2bf8be99b0	nir/types: Add a natural size and alignment helper The size and alignment are "natural" in the sense that everything is aligned to a scalar. This is a bit tighter than std430 where vec3s are required to be aligned to a vec4. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:39 -07:00
Jason Ekstrand	893fc2d07d	nir: Add a deref_instr_has_indirect helper Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:37 -07:00
Jason Ekstrand	70b16963fc	util/macros: Import ALIGN_POT from ralloc.c v2 (Jason Ekstrand): - Rename y to pot_align (Brian) - Also use ALIGN_POT in build_id.c and slab.c (Brian) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-07-02 12:09:14 -07:00
Eric Anholt	4819da2301	v3d: Claim PIPE_CAP_TGSI_CAN_READ_OUTPUTS. Fixes warning at screen creation. We store our outputs in normal temps and just emit them to shader I/O at the end, due to our I/O ordering requirements, so reading "outputs" in NIR is fine.	2018-07-02 11:35:41 -07:00
Marek Olšák	32e413ca59	ac: move all LLVM module initialization into ac_create_module This removes some ugly code around module initialization. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-02 14:34:39 -04:00
Eric Anholt	49f7631c9f	v3d: Emit a TF flush after each draw using TF. This fixes GPU hangs on 7278 in transform feedback tests such as GTF-GLES3.gtf.GL3Tests.transform_feedback2.transform_feedback2_basic	2018-07-02 10:05:14 -07:00
Karol Herbst	c7726fbfa5	nv50/ir: handle clipvertex for geom and tess shaders as well this will be needed for compatibility profiles v2: handle tess shaders Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-02 16:21:31 +02:00
Erik Faye-Lund	4c87705705	gallium/u_vbuf: drop min/max-scanning for empty indirect draws When building with asserts enabled, we'll end up triggering an assert in pipe_buffer_map_range down this code-path, due to trying to map an empty range. Even if we avoid that, we'll trigger another assert a bit later, because u_vbuf_get_minmax_index returns a min-index of -1 here, which gets promoted to an unsigned value, and gives us an out-of-bounds buffer-mapping offset. Since we can't really have a well-defined min/max range here when the range is empty anyway, we should just drop this dance in the first place. After all, no rendering is going to be produced. This fixes a crash in dEQP-GLES31.functional.draw_indirect.random.0 on VirGL for me. Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-02 10:51:29 +02:00
Samuel Pitoiset	02db2363f0	radv: reset the image's predicate after a color decompression pass After performing a fast-clear eliminate, a FMASK decompress, or a DCC decompress, we can reset the predicate to FALSE. With that, the GPU should be able to skip unnecessary color decompression passes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-02 10:43:33 +02:00
Samuel Pitoiset	ff7daadca1	radv: enable/disable predication for the DCC decompression pass Performing a DCC decompression pass is currently pretty rare, but using predication allows the GPU to skip unnecessary passes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-07-02 10:43:17 +02:00
Samuel Pitoiset	939e5a3823	radv: add padding for the UMR disassembler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-02 10:42:17 +02:00
Gert Wollny	91f48cdfe5	virgl: Add support for glGetMultisample Use caps to obtain the multisample sample positions for up to 16 positions and implement the according Gallium interface. This implementation (plus its counterpart in virglrenderer) assumes that the fixed sample positions are always the same for a given number of samples over the whole lifetime of a qemu session. It also assumes that sample series are only given for 2, 4, 8, and 16 samples, and for intermediate numbers N of samples the next higher supported set from above list is picked and the sample positions for the first N samples are returned accordingly. Fixes (when run on GL host): dEQP-GLES31.functional.texture.multisample.samples_1.sample_position dEQP-GLES31.functional.texture.multisample.samples_2.sample_position dEQP-GLES31.functional.texture.multisample.samples_3.sample_position dEQP-GLES31.functional.texture.multisample.samples_4.sample_position dEQP-GLES31.functional.texture.multisample.samples_8.sample_position dEQP-GLES31.functional.texture.multisample.samples_10.sample_position dEQP-GLES31.functional.texture.multisample.samples_12.sample_position dEQP-GLES31.functional.texture.multisample.samples_13.sample_position dEQP-GLES31.functional.texture.multisample.samples_16.sample_position v2: remove unrelated chunk (thanks Ilia Mirkin) v3: - also return positions for intermediate sample counts - fix unused variable warning - update description v4: explain better what this patch assumes and how it handles sample numbers that are not directly advertised (thanks go to Erik Faye-Lund for making me aware that this should be documented) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>	2018-07-02 09:33:55 +02:00
Tomeu Vizoso	ba78e78cd5	st/mesa: Also check for PIPE_FORMAT_A8R8G8B8_SRGB for texture_sRGB and PIPE_FORMAT_R8G8B8A8_SRGB, as well. The reason for this is that when Virgl runs with GLES on the host, it cannot directly upload textures in BGRA. So to avoid a conversion step, consider the RGB sRGB formats as well for this extension. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-02 09:33:48 +02:00
Tomeu Vizoso	71867a0a61	st/mesa: Fall back to R8G8B8A8_SRGB for ETC2 If the driver doesn't support PIPE_FORMAT_B8G8R8A8_SRGB, fall back to PIPE_FORMAT_R8G8B8A8_SRGB. Drivers such as Virgl will have a hard time supporting PIPE_FORMAT_B8G8R8A8_SRGB when the host runs GLES, as GL_BGRA isn't as well supported there. So go with PIPE_FORMAT_R8G8B8A8_SRGB so these drivers can avoid a conversion copy. v2: Fix typo in commit message Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-02 09:33:41 +02:00
Tomeu Vizoso	e5604ef78b	st/mesa/i965: Allow decompressing ETC2 to GL_RGBA When Mesa itself implements ETC2 decompression, it currently decompresses to formats in the GL_BGRA component order. That can be problematic for drivers which cannot upload the texture data as GL_BGRA, such as Virgl when it's backed by GLES on the host. So this commit adds a flag to _mesa_unpack_etc2_format so callers can specify the optimal component order. In Gallium's case, it will be requested if the format isn't in PIPE_FORMAT_B8G8R8A8_SRGB format. For i965, it will remain GL_BGRA, as before. v2: * Remove unnecessary include (Emil Velikov) Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-02 09:33:33 +02:00
Iago Toral Quiroga	1b54824687	anv/cmd_buffer: make descriptors dirty when emitting base state address Every time we emit a new state base address we will need to re-emit our binding tables, since they might have been emitted with a different base state address. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> CC: <mesa-stable@lists.freedesktop.org>	2018-07-02 08:31:20 +02:00
Iago Toral Quiroga	6a1d8350c9	anv/cmd_buffer: clean dirty push constants flag after emitting push constants Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> CC: <mesa-stable@lists.freedesktop.org>	2018-07-02 08:31:02 +02:00
Iago Toral Quiroga	198a72220b	anv/cmd_buffer: never shrink the push constant buffer size If we have to re-emit push constant data, we need to re-emit all of it. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> CC: <mesa-stable@lists.freedesktop.org>	2018-07-02 08:30:40 +02:00
Denis Pauk	2854c0f795	gallium/llvmpipe: Enable support bptc format. v2: none v3: none Signed-off-by: Denis Pauk <pauk.denis@gmail.com> CC: Marek Olšák <maraeo@gmail.com> CC: Rhys Perry <pendingchaos02@gmail.com> CC: Matt Turner <mattst88@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-01 15:42:37 -04:00
Denis Pauk	530130e74f	gallium/softpipe: Enable support bptc format. v2: none v3: none Signed-off-by: Denis Pauk <pauk.denis@gmail.com> CC: Marek Olšák <maraeo@gmail.com> CC: Rhys Perry <pendingchaos02@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-01 15:42:37 -04:00
Denis Pauk	f69bc797e1	gallium/auxiliary: Add helper support for bptc format compress/decompress Reuse code shared with mesa/main/texcompress_bptc. v2: Use block decompress function v3: Include static bptc code from texcompress_bptc_tmp.h Suggested-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Denis Pauk <pauk.denis@gmail.com> CC: Nicolai Hähnle <nicolai.haehnle@amd.com> CC: Marek Olšák <maraeo@gmail.com> CC: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-01 15:42:37 -04:00
Denis Pauk	bf4871f9e8	mesa: add header for share bptc decompress functions Move shared bptc functions to texcompress_bptc_tmp.h: * fetch_rgba_unorm_from_block * fetch_rgb_float_from_block * compress_rgba_unorm * compress_rgb_float Create decompress functions: * decompress_rgba_unorm * decompress_rgb_float Functions will be reused in gallium/auxiliary code. v2: Add block decompress function v3: Move all shared code to header Suggested-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Denis Pauk <pauk.denis@gmail.com> CC: Marek Olšák <maraeo@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-01 15:42:36 -04:00
Marek Olšák	99c6cae227	glsl/cache: save and restore ExternalSamplersUsed Shaders that need special code for external samplers were broken if they were loaded from the cache. Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-30 01:04:16 -04:00
Timothy Arceri	463f849097	nir: fix selection of loop terminator when two or more have the same limit We need to add loop terminators to the list in the order we come across them otherwise if two or more have the same exit condition we will select that last one rather than the first one even though its unreachable. This fix is for simple unrolls where we only have a single exit point. When unrolling these type of loops the unreachable terminators and their unreachable branch are removed prior to unrolling. Because of the logic change we also switch some list access in the complex unrolling logic to avoid breakage. Fixes: `6772a17acc` ("nir: Add a loop analysis pass") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-30 10:13:03 +10:00
Timothy Arceri	18293be622	radeonsi: enable OpenGL 4.4 compat profile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	ddb351f7fe	mesa: enable ARB_vertex_attrib_64bit in compat profile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	c283b413c1	mesa: add outstanding ARB_vertex_attrib_64bit dlist support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Dave Airlie	98d02104a7	vbo_save: add support for doubles to display list code Required for ARB_vertex_attrib_64bit compat profile support. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	d2caa37741	mesa: add compat profile support for ARB_multi_draw_indirect v2: add missing ARB_base_instance support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	103b8f11d6	mesa: make valid_draw_indirect_multi() accessible externally We will use this to add compat support to ARB_multi_draw_indirect in the following patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	5f90fb4007	mesa: add ARB_draw_indirect support to compat profile v2: add missing ARB_base_instance support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	9b32c80357	mesa: generate GL_INVALID_OPERATION using draw indirect in dlist The spec doesn't explicitly say to generate an error but since DrawArraysInstanced* and DrawElementsInstanced* do, it makes sense to do it for these functions also. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	03f1a2e8df	mesa: add missing display list support for ARB_compute_shader The extension is enabled for compat profile but there is currently no display list support. I filed a spec bug and it has been agreed that glDispatchComputeIndirect should generate an INVALID_OPERATION error when called during display list compilation. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	87d6093583	mesa: expose some ARB_viewport_array dependent extensions in compat Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	d87913e72a	mesa: enable ARB_viewport_array in compat profile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	d332986589	mesa: add ARB_viewport_array display list support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	df5e22cb7d	mesa: enable ARB_shader_subroutine in compat profile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	05f3589e67	mesa: add glUniformSubroutinesuiv() display list support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	52e3ef2400	mesa: stop hiding remaining query parameters from OpenGL compat I managed to miss these two in my last pass at this. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	9f77a9729e	mesa: enable ARB_gpu_shader_fp64 in compat profile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:33 +10:00
Timothy Arceri	a138fbc955	mesa: add ProgramUniform*d display list support This is required for fp64 to be enabled in compat profile. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:32 +10:00
Timothy Arceri	145f517cbd	mesa: add Uniform*d support to display lists This is required so we can enable fp64 support in compat profile. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-30 08:38:32 +10:00
Karol Herbst	04b443104d	st/glsl_to_nir: run lower_output_reads on !PIPE_CAP_TGSI_CAN_READ_OUTPUTS this is required for Drivers which don't allow reading from outputs. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-06-29 23:43:26 +02:00
Eric Anholt	a77cb724da	v3d: Move GL shader state dumping out of per-version compilation. It doesn't depend on V3D_VER, since it's just calling v3d_print_group.	2018-06-29 13:36:28 -07:00
Eric Anholt	c2901ff80f	v3d: Add missing Stream field to transform feedback specs on V3D 4.1. Noticed when trying to CLIF parse a transform feedback job that hangs on HW.	2018-06-29 13:36:28 -07:00
Eric Anholt	69efc1e025	v3d: Add missing "tri trip or fan" flag in Primitive List Format.	2018-06-29 13:36:28 -07:00
Eric Anholt	b341b39db3	v3d: Fix the shader code address field widths on V3D 4.1+ We were overlapping it with the threadable/nan flags, resulting in incorrect relocations (threadable/nan included in the offset) and wrong ordering in the CLIF files.	2018-06-29 13:36:28 -07:00
Eric Anholt	6c3c11ba19	v3d: Add missing "no prim pack" field to the V3D4.1+ GL shader state. It looks like we don't need this flag for anything (not that I'm clear on what it does), but it makes our struct dumping line up with CLIF parsing.	2018-06-29 13:36:28 -07:00
Eric Anholt	c0476d964a	v3d: Express dithering mode in the same way that the CLIF parser does.	2018-06-29 13:36:28 -07:00
Eric Anholt	24d2f1347d	v3d: Add missing "number of bin tile lists" field. Noticed when trying to feed our dumps through the CLIF parser. Since this is a "minus one" field, we were already filling in the value we wanted (0).	2018-06-29 13:36:28 -07:00
Eric Anholt	b65b61cefe	v3d: Rewrite the color write masks to match CLIF format. The render_target_* fields gave us pretty(ish) printing, but meant we were incompatible with CLIF, and had much more verbose code generating them.	2018-06-29 13:36:28 -07:00
Eric Anholt	38172dcba9	v3d: Merge the V3D 4.1 and 4.2 XML into V3D 3.3'x XML. The XML ends up noisier if you're only looking at one version, but from the diffstat there's obvious wins in terms of deduplication. This will get even more significant if we ever support 3.2 or 4.0.	2018-06-29 13:36:28 -07:00
Eric Anholt	725561c0b6	v3d: Switch v3d_decoder.c to the XML's top min_ver/max_ver fields. The XML zipper wants one XML per version for filling out its tables, but we want to do more than one GPU version per XML now. Assume that the "gen" field will be the same as min_ver and look up our XML text assuming that they're listed in increasing min_ver.	2018-06-29 13:36:28 -07:00
Eric Anholt	f8af5c58c3	v3d: Create XML fields for min_ver and max_ver of a packet/struct/enum. This will be used to merge together the V3D 3.3-4.1 XML with the variants disabled based on the version.	2018-06-29 13:36:28 -07:00
Eric Anholt	6f7ad7ed11	v3d: Pass the version being generated to the pack generator script. It turns out that most V3D versions change very few packets, so keeping separate copies of the XML per version makes changing the XML a pain as you have to replicate your changes to each one. This is the start of changing it so that one XML can generate headers for multiple versions.	2018-06-29 13:36:28 -07:00
Jose Maria Casanova Crespo	a99c9e63a0	anv: finish the binding_table_pool on destroyDevice when use_softpin Running VK-CTS in batch execution mode was raising the VK_ERROR_INITIALIZATION_FAILED error in multiple tests. But when the same failing tests were run isolated they always passed. createDevice and destroyDevice were called before and after every tests. Because the binding_table_pool was never closed, we reached the maximum number of open file descriptors (ulimit -n) and when that happened every call to createDevice implied a VK_ERROR_INITIALIZATION_FAILED error. Fixes: `c7db0ed4e9` ("anv: Use a separate pool for binding tables when soft pinning") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-29 21:49:31 +02:00
Marek Olšák	ea8b55b49f	gallium/util: remove dummy function util_format_is_supported Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2018-06-29 15:31:49 -04:00
Dylan Baker	82bf8a6a82	docs: update calendar, add news and link release notes to 18.1.3	2018-06-29 11:04:22 -07:00
Dylan Baker	9dfcf044f7	docs: Add SHA256 sums to notes for 18.1.3	2018-06-29 11:02:41 -07:00
Dylan Baker	2fa6c3821f	docs: Add release notes for 18.1.3	2018-06-29 11:02:39 -07:00
Rhys Perry	ffba56cc3c	nv50/ir: improve maintainability of Target*::initOpInfo() This is mainly useful for when one needs to add new opcodes in a painless and reliable way. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-06-29 16:47:27 +02:00
Rhys Perry	d885303a38	nv50/ir: fix image stores with indirect handles Having this if statement here prevented the next if statement from being reached in the case of image stores, which is needed for instructions with indirect bindless handles like "STORE TEMP[ADDR[2].x+1](1) ...". Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-06-29 16:07:59 +02:00
Ross Burton	d7c4ce1d1d	egl: fix build race in automake There is a parallel make build issue in src/egl/drivers/dri2/ for wayland builds. Can be reproduced with: $ rm src/egl/drivers/dri2/*.h src/egl/drivers/dri2/platform_wayland.lo $ make -C src/egl/ drivers/dri2/platform_wayland.lo ../../../mesa-18.1.2/src/egl/drivers/dri2/platform_wayland.c:50:10: fatal error: linux-dmabuf-unstable-v1-client-protocol.h: No such file or directory This patch adds the missing dependency. Fixes: `02cc359372` "egl/wayland: Use linux-dmabuf interface for buffers" Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> [Eric: fixed up the commit title] Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-29 12:49:51 +01:00
Marek Olšák	5a6414f135	radeonsi: implement vertex color clamping for tess and GS	2018-06-28 22:41:12 -04:00
Marek Olšák	034b385fc2	radeonsi: move VS_STATE_SGPR before draw SGPRs for vertex color clamping.	2018-06-28 22:27:25 -04:00
Marek Olšák	0c554bc5d5	radeonsi: don't use malloc in si_generate_gs_copy_shader	2018-06-28 22:27:25 -04:00
Marek Olšák	7bac3b589c	radeonsi: disable DCC statistics gathering on everything but Stoney I think we don't need it on other chips.	2018-06-28 22:27:25 -04:00
Marek Olšák	0da94fa19c	radeonsi: don't enable DCC statistics gathering for small surfaces	2018-06-28 22:27:25 -04:00
Marek Olšák	f8b0c54e3f	radeonsi: simplify logic around vi_separate_dcc_try_enable	2018-06-28 22:27:25 -04:00
Marek Olšák	41f80373b4	radeonsi: fix memory exhaustion issue with DCC statistics gathering with DRI2 Cc: 18.1 <mesa-stable@lists.freedesktop.org>	2018-06-28 22:27:25 -04:00
Marek Olšák	fb28bf23db	radeonsi: remove references to Evergreen	2018-06-28 22:27:25 -04:00
Marek Olšák	1542169a4a	radeonsi: enable shader caching for compute shaders Compute shaders were not using the shader cache.	2018-06-28 22:27:25 -04:00
Marek Olšák	d77557c9db	radeonsi: store compute local_size into tgsi_shader_info This is kinda a hack, but it's enough for the shader cache.	2018-06-28 22:27:25 -04:00
Marek Olšák	d13f240269	radeonsi: unify duplicated code for initial shader compilation	2018-06-28 22:27:25 -04:00
Marek Olšák	8e9c57a7fe	ac: set +auto-waitcnt-before-barrier when needed This removes useless s_waitcnt before barriers. Only radeonsi uses this function.	2018-06-28 22:27:25 -04:00
Marek Olšák	7d6ec9d43b	radeonsi/gfx9: insert the barrier between merged shaders inside the if block	2018-06-28 22:27:25 -04:00
Joe M. Kniss	70425bcfe6	gallium: plumb invariant output attrib thru TGSI Add support for glsl 'invariant' modifier for output data declarations. Gallium drivers that use TGSI serialization currently lose invariant modifiers in glsl shaders. v2: use boolean for invariant instead of unsigned. Tested: chromiumos on qemu with virglrenderer. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-06-29 11:11:54 +10:00
Francisco Jerez	c2c803be7b	intel/fs: Build 32-wide FS shaders. Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-28 13:25:21 -07:00
Jason Ekstrand	b95b0e2918	intel/anv,blorp,i965: Implement the SKL 16x MSAA SIMD32 workaround Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-28 13:25:18 -07:00
Jason Ekstrand	d5e028a57b	intel/fs: Add fields to wm_prog_data for SIMD32 dispatch Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	bcbc7d3a17	intel/fs: Fix nir_intrinsic_load_helper_invocation for SIMD32. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	7144247c2c	intel/fs: Fix fs_builder::sample_mask_reg() for 32-wide FS dispatch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	37c1df28c9	intel/fs: Fix Gen6+ interpolation setup for SIMD32 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	e208bc3bb7	intel/fs: Get rid of MOV_DISPATCH_TO_FLAGS We can just emit the MOV in the two places where we use this. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	5e3028d826	intel/fs: Emit MOV_DISPATCH_TO_FLAGS once for the centroid workaround There's no reason for us to emit it a pile of times and then have a whole pass to clean it up. Just emit it once like we really want. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	40fe108e2b	intel/fs: Generalize the unlit centroid workaround This generalizes the unlit centroid workaround so it's less code and now supports SIMD32. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	1d381731e0	intel/fs: Fix sample id setup for SIMD32. v2 (Jason Ekstrand): - Disallow gl_SampleId in SIMD32 on gen7 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	2fd0aed89a	intel/fs: Fix Gen7 compressed source region alignment restriction for SIMD32 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	6909aed90e	intel/fs: Implement 32-wide FS payload setup on Gen6+ Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	f6c4aace22	intel/fs: Extend thread payload layout to SIMD32 And handle 32-wide payload register reads in fetch_payload_reg(). v2 (Jason Ekstrand); - Fix some whitespace and brace placement Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	8f143f70d6	intel/fs: Wrap FS payload register look-up in a helper function. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	d996e5b812	intel/fs: Use fs_regs instead of brw_regs in the unlit centroid workaround While we're here, we change to using horiz_offset() instead of abusing half(). v2 (Jason Ekstrand): - Use horiz_offset() instead of half() Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	38aee1a06d	intel/fs: Simplify fs_visitor::emit_samplepos_setup The original code manually handled splitting the MOVs to 8-wide to handle various regioning restrictions. Now that we have a SIMD width splitting pass that handles these things, we can just emit everything at the full width and let the SIMD splitting pass handle it. We also now have a useful "subscript" helper which is designed exactly for the case where you want to take a W type and read it as a vector of Bs so we may as well use that too. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	244a0ff3a8	i965: Add plumbing for shader time in 32-wide FS dispatch mode. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	2d7d652d5c	intel/fs: Disable opt_sampler_eot() in 32-wide dispatch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	db6ca13efc	intel/fs: Emit LINE+MAC for LINTERP with unaligned coordinates On g4x through Sandy Bridge, src1 (the coordinates) of the PLN instruction is required to be an even register number. When it's odd (which can happen with SIMD32), we have to emit a LINE+MAC combination instead. Unfortunately, we can't just fall through to the gen4 case because the input registers are still set up for PLN which lays out the four src1 registers differently in SIMD16 than LINE. v2 (Jason Ekstrand): - Take advantage of both accumulators and emit LINE LINE MAC MAC (Based on a patch from Francisco Jerez) - Unify the gen4 and gen4x-6 cases using a loop v3 (Jason Ekstrand): - Don't unify gen4 with gen4x-6 as this turns out to be more fragile than first thought without reworking the gen4 barycentric coordinate layout. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	566e6abd6d	intel/fs: Mark LINTERP opcode as writing accumulator on platforms without PLN When we don't have PLN (gen4 and gen11+), we implement LINTERP as either LINE+MAC or a pair of MADs. In both cases, the accumulator is written by the first of the two instructions and read by the second. Even though the accumulator value isn't actually ever used from a logical instruction perspective, it is trashed so we need to make the scheduler aware. Otherwise, the scheduler could end up re-ordering instructions and putting a LINTERP between an instruction which writes the accumulator and another which tries to use that result. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	73d60455e9	intel/fs: Rework INTERPOLATE_AT_PER_SLOT_OFFSET This reworks INTERPOLATE_AT_PER_SLOT_OFFSET to work more like an ALU operation and less like a send. This is less code over-all and, as a side-effect, it now properly handles execution groups and lowering so SIMD32 support just falls out. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	74b477039d	intel/fs: Add the group to the flag subreg number on SNB and older We want consistent behavior in the meaning of the flag_subreg field between SNB and IVB+. v2 (Jason Ekstrand): - Add some extra commentary Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	2aefa5e19f	intel/fs: Fix FB read header setup for SIMD32. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	e06f5b30cc	intel/fs: Fix logical FB write lowering for SIMD32 Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	ce370902d4	intel/fs: Fix FB write message control codegen for SIMD32. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	8b788069fb	intel/fs: Don't enable dual source blend if no outputs are written This prevents a crash in some arb_enhanced_layouts tests that would be caused by the next commit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	48241c780a	intel/fs: Fix codegen of FS_OPCODE_SET_SAMPLE_ID for SIMD32. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	789d20df36	intel/eu: Fix pixel interpolator queries for SIMD32. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	1650442026	intel/fs: Disable SIMD32 dispatch for fragment shaders with discard. Current discard handling requires dedicating the second flag register to discard. However, control-flow in SIMD32 requires both flag registers so it's incompatible with the current discard handling. Just don't support SIMD32+discard for now. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	1811cbdc25	intel/fs: Disable SIMD32 dispatch on Gen4-6 with control flow The hardware's control flow logic is 16-wide so we're out of luck here. We could, in theory, support SIMD32 if we know the control-flow is uniform but we don't have that information at this point. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	d5b617a28e	intel/fs: Split instructions low to high in lower_simd_width Commit `0d905597f` fixed an issue with the placement of the zip and unzip instructions. However, as a side-effect, it reversed the order in which we were emitting the split instructions so that they went from high group to low instead of low to high. This is fine for most things like texture instructions and the like but certain render target writes really want to be emitted low to high. This commit just switches the order back around to be low to high. Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `0d905597f` "intel/fs: Be more explicit about our placement of [un]zip"	2018-06-28 13:19:38 -07:00
Jason Ekstrand	0b830081f0	intel/fs: Rework KSP data to be SIMD width-based Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	9d78abbef8	intel/compiler: Add and use helpers for working with KSP indices The pixel shader dispatch table is kind-of a confusing mess. This adds some helpers for dealing with it and for easily extracting the correct data from wm_prog_data. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	85750348bc	i965: Re-arrange shader kernel setup in WM state Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	5b6e91dd35	intel/fs: Remove program key argument from generator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	a14fb0184a	intel/fs: Set up FB write message headers in the visitor Doing instruction header setup in the generator is awful for a number of reasons. For one, we can't schedule the header setup at all. For another, it means lots of implied writes which the instruction scheduler and other passes can't properly read about. The second isn't a huge problem for FB writes since they always happen at the end. We made a similar change to sampler handling in `ff4726077d`. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	dda31a7bbc	intel/fs: Fix implied_mrf_writes() for headerless FB writes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	90643689aa	intel/fs: Fix fs_inst::flags_written() for Gen4-5 FB writes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Francisco Jerez	ed09e78023	intel/eu: Return new instruction to caller from brw_fb_WRITE(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	c0a1c248b8	intel/fs: Pull FB write implied headers from src[0] Now that we have the implied header in src[0] for tracking purposes, we may as well use it in the generator. This makes things a tiny bit more general. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	b1cc9a9ae1	intel/fs: Properly track implied header regs read by FB writes The FB write opcode on gen4-5 does implied copies from g0 and g1 to the message payload. With this commit, we start tracking that as part of the IR by having the FB write read from g0-1. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Jason Ekstrand	d91fa20655	intel/fs: FS_OPCODE_REP_FB_WRITE has side effects It doesn't matter since we don't ever run replicated write shaders through the optimizer but it's good to be complete. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-28 13:19:38 -07:00
Dylan Baker	e83cd38eac	docs: Add news item for mesa 18.1.2 Which I forgot to do when 18.1.2 came out. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-06-28 10:06:44 -07:00
Rhys Perry	c92eb71a65	nvc0: remove magic values in nve4_set_tex_handles() With this commit, things no longer break if NVC0_CB_AUX_TEX_INFO is changed to anything other than 0x20. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-06-28 18:22:06 +02:00
Rhys Perry	6bb0f87c60	nvc0/ir: fix TargetNVC0::insnCanLoadOffset() Previously, TargetNVC0::insnCanLoadOffset() returned whether the offset could be set to a specific value. The IndirectPropagation pass expected it to return whether the offset could be increased by a specific value, which is what TargetNV50::insnCanLoadOffset() does. Fixes: `37b67db6ae` ("nvc0/ir: be careful about propagating very large offsets into const load") Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-06-28 18:22:06 +02:00
Alok Hota	5b7d4f9428	swr/rast: Updating code style based on current clang-format rules Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-06-28 08:18:14 -05:00
Vinson Lee	f90a60fe79	swr/rast: Fix addPassesToEmitFile usage with llvm-7.0. Fix build error after llvm-7.0svn r332881 ("CodeGen: Add a dwo output file argument to addPassesToEmitFile and hook it up to dwo output."). CXX rasterizer/jitter/libmesaswr_la-JitManager.lo rasterizer/jitter/JitManager.cpp:368:93: error: too few arguments to function call, expected at least 4, have 3 pTarget->addPassesToEmitFile(*pMPasses, filestream, TargetMachine::CGFT_AssemblyFile); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-06-28 08:18:06 -05:00
Alok Hota	c7e9102d89	swr/rast: Handling removed LLVM intrinsics in trunk - Functionality replaced with emulated intrinsics - Fixes Bug 106558 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-06-28 08:18:00 -05:00
Alok Hota	83d3ddd0ec	swr/rast: Adding SCATTERPS functionality to BuilderGfxMem Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-06-28 08:17:55 -05:00
Alok Hota	4509cdbb37	swr/rast: Adding Read/Write specifier to TranslateGfxAddress stack - Removing unused generic translate function - Requiring read/write specifier in builder_gfx_mem Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-06-28 08:17:33 -05:00
Chad Versace	dc6665422a	gallium: Fix automake for Android (v2) Chromium OS uses Autotools and pkg-config when building Mesa for Android. The gallium drivers were failing to find the headers and libraries for zlib and Android's libbacktrace. v2: - Don't add a check for zlib.pc. configure.ac already checks for zlib.pc elsewhere. [for tfiga] - Check for backtrace.pc separately from the other Android libs. [for tfiga] Reviewed-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-27 19:58:16 -07:00
Timothy Arceri	2a5121bf35	glsl: skip comparison opt when adding vars of different size The spec allows adding scalars with a vector or matrix. In this case the opt was losing swizzle and size information. This fixes a bug with Doom (2016) shaders. Fixes: `34ec1a24d6` ("glsl: Optimize (x + y cmp 0) into (x cmp -y).") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-28 12:15:17 +10:00
Jason Ekstrand	e8eb182ec5	Revert "anv: Print the actual enum for ignored structure types" This reverts commit `fda7014c35`. It was hitting an unreachable when the sType was unknown.	2018-06-27 14:10:37 -07:00
Jason Ekstrand	fda7014c35	anv: Print the actual enum for ignored structure types Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-06-27 12:43:18 -07:00
Jason Ekstrand	6a35ba5ce9	i965/bufmgr: Use the correct argument order for bo_alloc_internal The memzone and flags parameters were accidentally flipped in the call from brw_bo_alloc_tiled_2d. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-27 12:43:18 -07:00
Keith Packard	60e6b6fa96	vulkan/wsi_common_display: Return SURFACE_LOST for fatal DRM errors Instead of encouraging the client to re-create the swapchain and keep going with an OUT_OF_DATE error, tell the client that further use of the current surface will not succeed as the associated kernel objects are no longer valid. In particular, when a DRM lease is revoked, then the client needs to get another lease and create a new surface for that. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-27 10:02:18 -07:00
Eric Anholt	6bb046cd29	glsl: Make sure that packed varyings reflect always_active_io properly. The always_active_io flag was only set according to the first variable that got packed in, so NIR io compaction would end up compacting XFB varyings that shouldn't move at that point. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-27 09:35:55 -07:00
Eric Anholt	ad1a4cb563	v3d: Fix Z clipping when viewport.scale[2] is negative. Fixes: dEQP-GLES3.functional.shaders.builtin_variable.depth_range_fragment dEQP-GLES3.functional.shaders.builtin_variable.depth_range_vertex	2018-06-27 09:35:51 -07:00
Eric Anholt	9f80bcc2bc	v3d: Convert a bunch of our "minus one" fields over to the new XML attr. This fixes up their formatting for CLIF files and makes the code more legible.	2018-06-27 09:13:48 -07:00
Eric Anholt	18b1bb0b63	v3d: Add pack/unpack/decode support for fields with a "- 1" modifier. Right now, we name these fields as "field name minus one" so that your C code obviously states what the value should be. However, it's easy enough to handle at the codegen level with another little XML attribute, meaning less C code and easier-to-read values in CLIF dumping and gdb as well. (The actual CLIF format for simulator and FPGA replay takes in pre-minus-one values, so we need it there too).	2018-06-27 09:13:48 -07:00
Tapani Pälli	e9a77c3e96	i965: small cleanup in blorp debug printing output (trivial) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-27 11:05:48 +03:00
Tapani Pälli	9a92acec67	mesa: add a space between headers and source (trivial) There used to be one and it looks like it was removed by `eb63640c1d`. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-27 11:05:48 +03:00
Tapani Pälli	58ba7ab535	features.txt: mark some extensions as done Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-27 11:05:48 +03:00
Danylo Piliaiev	e7cdaa895a	mesa: Return number of result bits for GL_ANY_SAMPLES_PASSED_CONSERVATIVE Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106986 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-06-27 11:02:34 +03:00
Samuel Pitoiset	7a57c82767	radv: use separate bind points for the dynamic buffers The Vulkan spec says: "pipelineBindPoint is a VkPipelineBindPoint indicating whether the descriptors will be used by graphics pipelines or compute pipelines. There is a separate set of bind points for each of graphics and compute, so binding one does not disturb the other." CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-27 09:48:31 +02:00
Samuel Pitoiset	9c09e7d66e	radv: remove unused 'predicated' parameter from some functions It's always false. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-27 09:48:15 +02:00
Dave Airlie	a6b64d6dde	virgl: add ARB_texture_view support Reviewed-By: Gert Wollny <gert.wollny@collabora.com>	2018-06-27 14:08:00 +10:00
Jason Ekstrand	ff6db94c18	nir/opt_if: Remove unneeded phis if we make progress Now that SSA values can be derefs and they have special rules, we have to be a bit more careful about our LCSSA phis. In particular, we need to clean up in case LCSSA ended up creating a phi node for a deref. This fixes validation issues with some Vulkan CTS tests with the new deref instructions. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-06-26 10:47:26 -07:00
Samuel Pitoiset	fa42fa1a60	radv: emit PIPELINESTAT_{START,STOP} events for pipeline stats queries Ported from RadeonSI. This appears to fix some random fails with: dEQP-VK.query_pool.statistics_query.* Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-26 18:23:16 +02:00
Tapani Pälli	ab2643e4b0	glsl: serialize data from glTransformFeedbackVaryings While XFB has been enabled for cache, we did not serialize enough data for the whole API to work (such as glGetProgramiv). Fixes: `6d830940f7` "Allow shader cache usage with transform feedback" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106907 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-26 12:44:22 +03:00
Samuel Pitoiset	bcbd8dd6c9	radv: enable VK_EXT_shader_stencil_export The driver already supports exporting the stencil value. The following CTS test now passes: dEQP-VK.pipeline.shader_stencil_export.op_replace Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-26 10:40:10 +02:00
Samuel Pitoiset	ba5e25ed29	radv: ignore pInheritanceInfo for primary command buffers From the Vulkan spec: "If this is a primary command buffer, then this value is ignored." CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-26 10:39:43 +02:00
Andrii Simiklit	232c5d75ea	i965/gen6/gs: Handle case where a GS doesn't allocate VUE We can not use the VUE Dereference flags combination for EOT message under ILK and SNB because the threads are not initialized there with initial VUE handle unlike Pre-IL. So to avoid GPU hangs on SNB and ILK we need to avoid usage of the VUE Dereference flags combination. (Was tested only on SNB but according to the specification SNB Volume 2 Part 1: 1.6.5.3, 1.6.5.6 ILK must behave in a similar way) v2: Approach to fix this issue was changed. Instead of different EOT flags in the program end we will create VUE every time even if GS produces no output. v3: Clean up the patch. Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105399 CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2018-06-26 08:18:55 +02:00
Dave Airlie	318ff60ccd	radeon: duplicate cmask surface for now. The radeon winsys isn't linked against the ac code; I have vague memories of this causing some problems before. For now, fix the build by just duplicating the code. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-26 11:26:35 +10:00
Marek Olšák	bd963f8430	radeonsi: rename r600_transfer -> si_transfer Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	eabeeb86b2	radeonsi: properly set cmask_buffer in si_reallocate_texture_inplace Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	d4755ef389	radeonsi: remove redundant si_texture::cmask_size cmask_buffer and surface.cmask_size can replace its role. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	2a8d1039b6	radeonsi: inline struct r600_cmask_info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	166250f4e5	radeonsi: move CMASK size computation into ac_surface Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	3da693b7d9	ac/surface: move cmask_size/alignment into radeon_surf cmask_size is changed to uint32_t because it can't be greater than 4GB. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	2d64a68c6f	radeonsi: rename r600_surface -> si_surface Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	218e133695	radeonsi: rename r600_memory_object -> si_memory_object Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	e5df04f13d	radeonsi: remove unused r600_memory_object::offset The real offset is passed through resource_from_memobj. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	45004abfd5	radeonsi: unify duplicated texture_from_handle & texture_from_memobj Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	cac7ab1192	radeonsi: reorder and initialize more fields in si_reallocate_texture_inplace Some fields shouldn't be initialized, like framebuffers_bound and other stats. It's hopefully complete now. Cc: 18.1 <mesa-stable@lists.freedesktop.org>	2018-06-25 18:33:58 -04:00
Marek Olšák	7888245ef3	radeonsi: stop using lp_build_emit_llvm_unary/binary Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	0810f15046	radeonsi: stop using lp_build_alloc Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	21ba8a204e	radeonsi: use gallivm less Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	965904eebd	radeonsi: stop using lp_bld_intr.h Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	6ab54d25a6	radeonsi: remove last uses of lp_build_context::undef Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	30f3e2200a	radeonsi: stop using lp_bld_arit.h Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	5f54fc3ad1	radeonsi: stop using lp_build_gather_values Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	7bd40dc2f2	radeonsi: clean up some #includes Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	f154555733	radeonsi: clean up passing the is_monolithic flag for compilation Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Robert Foss	c7bb82136b	egl/android: Add DRM node probing and filtering This patch both adds support for probing & filtering DRM nodes and switches away from using the GRALLOC_MODULE_PERFORM_GET_DRM_FD gralloc call. Currently the filtering is based just on the driver name, and the desired name is supplied using the "drm.gpu.vendor_name" Android property. Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2018-06-25 18:54:10 +02:00
Rob Herring	3f7bca44d9	egl/android: #ifdef out flink name support Maintaining both flink names and prime fd support which are provided by 2 different gralloc implementations is problematic because we have a dependency on a specific gralloc implementation header. This mostly disables the dependency on the gralloc implementation and headers. The dependency on GRALLOC_MODULE_PERFORM_GET_DRM_FD remains for now, but the definition is added locally to remove the header dependency. drm_gralloc support can be enabled by setting BOARD_USES_DRM_GRALLOC=true in BoardConfig.mk. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org>	2018-06-25 18:54:09 +02:00
Robert Foss	5a34aba07d	gallium/util: Fix build error due to cast to different size Signed-off-by: Robert Foss <robert.foss@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-25 18:54:09 +02:00
Samuel Pitoiset	07cb1373a2	radv: fix HTILE metadata initialization in presence of subpass clears If the driver ends up by performing a slow depthstencil clear, the HTILE metadata won't be initialized correctly. This fixes random VM faults on Polaris while running CTS with Bas's runner. This doesn't seem to regress performance. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-25 17:38:59 +02:00
Gert Wollny	eebb65258d	r600/sb: give the scheduler more margin to find valid instructions groups For instruction sequences that change the address register with every load the current limit to bail out of the scheduler and reject the optimisation was too tight, i.e. it was expected that at least one pending instruction would be scheduled each time. Give the scheduler more margin to sort out these load sequences by allowing a number of rounds where no instruction is scheduled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106163 Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-25 05:40:19 +01:00
Gert Wollny	cd7db0ab0a	r600/sb: fix rotated register in while loop This patch is based on https://lists.freedesktop.org/archives/mesa-dev/2018-February/185805.html Dave Airlie: "A bunch of CTS tests led me to write tests/shaders/ssa/fs-while-loop-rotate-value.shader_test which r600/sb always fell over on. GCM seems to move some of the copies into other basic blocks, if we don't allow this to happen then it doesn't seem to schedule them badly. Everything I've read on SSA/phi copies say they have to happen in parallel, so keeping them in the same basic block seems like a good way to keep some of that property." This patch differs from the one proposed by Dave in that it only adds the NF_DONT_MOVE flag to copy_move instructions that are created by split_phi* and that are located in loops. Fixes piglit: tests/shaders/ssa/fs-while-loop-rotate-value.shader_test (no regressions in the shader set). It also fixes all failing tests from dEQP-GLES3.functional.shaders.loops.* Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-25 05:39:41 +01:00
Rob Clark	1977e92ee3	freedreno/ir3: fix deref conversion fallout Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-23 18:23:11 -04:00
Rob Clark	445871de94	freedreno/ir3: fix unused variable warning Fixes: `cf0c7258ee` freedreno/a5xx: MSAA Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-23 18:23:11 -04:00
Rob Clark	868ca81cbe	freedreno: fix HW_ATOMIC_COUNTERS cap This was mistakenly exposed, even though we want atomic counters to be lowered to atomic ops on an SSBO like nearly every other GPU. Which somehow recently started getting segfaults due to calling a null pipe->set_hw_atomic_buffers(). Fixes a crash in stk, and probably other things. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-23 18:23:11 -04:00
Keith Packard	1df586be12	radv: add VK_EXT_display_control to radv driver [v5] This extension provides fences and frame count information to direct display contexts. It uses new kernel ioctls to provide 64-bits of vblank sequence and nanosecond resolution. v2: Rework fence integration into the driver so that waiting for any of a mixture of fence types (wsi, driver or syncobjs) causes the driver to poll, while a list of just syncobjs or just driver fences will block. When we get syncobjs for wsi fences, we'll adapt to use them. v3: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v4: Adapt to WSI fence API change. It now returns VkResult and no longer has an option for relative timeouts. v5: wsi_register_display_event and wsi_register_device_event now use the default allocator when NULL is provided, so remove the computation of 'alloc' here. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-23 07:59:00 -07:00
Keith Packard	16eb390834	anv: add VK_EXT_display_control to anv driver [v5] This extension provides fences and frame count information to direct display contexts. It uses new kernel ioctls to provide 64-bits of vblank sequence and nanosecond resolution. v2: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Add extension to list in alphabetical order Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v3: Adapt to WSI fence API change. It now returns VkResult and no longer has an option for relative timeouts. v4: wsi_register_display_event and wsi_register_device_event now use the default allocator when NULL is provided, so remove the computation of 'alloc' here. v5: use zalloc2 instead of alloc2 for the WSI fence. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2018-06-23 07:59:00 -07:00
Keith Packard	86c8d93e5a	vulkan: add VK_EXT_display_control [v10] This extension provides fences and frame count information to direct display contexts. It uses new kernel ioctls to provide 64-bits of vblank sequence and nanosecond resolution. v2: Remove DRM_CRTC_SEQUENCE_FIRST_PIXEL_OUT flag. This has been removed from the proposed kernel API. Add NULL parameter to drmCrtcQueueSequence ioctl as we don't care what sequence the event was actually queued to. v3: Adapt to pthread clock switch to MONOTONIC v4: Fix scope for wsi_display_mode and wsi_display_connector allocs Suggested-by: Jason Ekstrand <jason@jlekstrand.net> v5: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Use wsi_rel_to_abs_time helper function to convert relative timeouts to absolute timeouts without causing overflow. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v6: Change WSI fence wait function to return VkResult instead of bool. This makes the meaning of the return value easier to understand, and allows for the indication of failure. Also change the WSI fence wait function to take only absolute timeouts and not provide an option for a relative timeout. No users wanted relative timeouts, and it's simpler if that option isn't available. Terminate the DPMS property loop once we've found the property. Assert that the fence hasn't already been destroyed in wsi_display_fence_destroy. Rearrange the event handler function order in the file to place routines in an easier to find order. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v7: Adapt to API changes for surface_get_capabilities v8: Use wsi->alloc in register_display_event so that callers don't have to dig out an allocator for us. v9: Fix a few minor formatting issues Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v10: Use wsi->alloc if none provided in wsi_display_fence_alloc. Now that drivers are expected to pass the allocator argument straight through from the application, we need to check those for NULL everywhere. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2018-06-23 07:59:00 -07:00
Keith Packard	5581dd5c32	anv: Support wait for heterogeneous list of fences [v3] Handle the case where the set of fences to wait for is not all of the same type by either waiting for them sequentially (waitAll), or polling them until the timer has expired (!waitAll). We hope the latter case is not common. While the current code makes sure that it always has fences of only one type, that will not be true when we add WSI fences. Split out this refactoring to make merging that clearer. v2: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v2: Cast INT64_MAX to uint64_t to make its use as the maximum possible timeout clearly unsigned to the reader. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Make anv_wait_for_fences with !waitAll check all fences at least once, even if the requested timeout has already passed. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2018-06-23 07:59:00 -07:00
Bas Nieuwenhuizen	8c4f430d43	radv: Enable lower_io_to_temporaries after deref changes. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 21:23:06 -07:00
Jason Ekstrand	aef4213fca	nir/lower_system_values: Assert/assume direct var derefs System values are never arrays or structs so we can assume a direct var deref. This simplifies things a bit and prevents us from accidentally throwing away an array index. Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 21:23:06 -07:00
Jason Ekstrand	a331d7d1cd	nir: Remove old-school deref chain support Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 21:23:06 -07:00
Jason Ekstrand	9800b81ffb	nir: Remove deref chain support from analyze_loops Note that this patch needs to come late in the series since this pass can be run after any pass that damages nir_metadata_loop_analysis. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 21:23:06 -07:00
Rob Clark	2db8784167	freedreno/ir3: convert to deref instructions Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 21:23:05 -07:00
Rob Clark	95683bdce3	nir: promote intrinsic_get_var() to helper Useful in a few other places.. let's not copy-pasta Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	5a02ffb733	nir: Rework lower_locals_to_regs to use deref instructions This completely reworks the pass to support deref instructions and delete support for old deref chains Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	2fa7a4a541	intel,ir3: Re-enable nir_opt_copy_prop_vars Now that it's rewritten for deref instructions, we can turn it back on. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	67df3739c5	radeonsi: Remove deref chain support in nir scan pass. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	9cb345588b	radv: Remove deref chain support in radv shader info pass. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	a1e9d799ad	ac/nir: Remove deref chain support. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	9bfd81b217	radeonsi: Add deref support to the nir scan pass. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	ba2bd20f87	nir: Rework opt_copy_prop_vars to use deref instructions Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	fa6ffcc083	nir/copy_prop_vars: Re-order some logic in compare_derefs Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	c5d9a65944	nir: Remove deref chain support from split_per_member_structs Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	18175ab66f	nir: Remove deref chain support from opt_undef Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	aeb4bbfd1e	nir: Remove deref chain support from split_var_copies Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	636256cdc7	nir: Remove deref chain support from dead_variables Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	378d7cf3ba	nir: Remove deref chain support from propagate_invariant Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	c6a9c2b60b	nir: Remove deref chain support from lower_var_copies Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	fc59230a46	nir: Remove deref chain support from lower_drawpixels Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	d4dd2ca4a7	nir: Remove deref chain support from opt_peephole_select Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	54bfc0cbcf	nir: Remove deref chain support from lower_tex Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	a3589bb01f	nir: Remove deref chain support from lower_wpos_ytransform Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	3992665c52	nir: Remove deref chain support from lower_wpos_center Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	8a62db7712	nir: Remove deref chain support from lower_system_values Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	e5db1b951c	nir: Remove deref chain support from remove_unused_varyings Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	6bdd867968	nir: Delete lower_io_types It's only used by the ir3 stand-alone compiler and Rob said we could delete it. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	c6fc653232	nir: Remove deref chain support from lower_phis_to_scalar Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	47ffb893e6	nir: Convert lower_io to deref instructions This deletes support for _var intrinsics and legacy deref chains in favor of deref instructions. The internals are also reworked a bit to use deref instructions directly. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	0d03c63e91	nir/lower_io: Convert atomic lowering to deref instructions No one is currently using it so we can make this change irrespective of driver. We may use it again in i965 so it's best to pretend to keep it working. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	c290e8c4b0	nir: Remove deref chain support from lower_global_vars_to_local Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	41c52c963a	nir: Remove deref chain support from lower_clamp_color_outputs Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	d2adc08abe	nir: Remove deref chain support from lower_alpha_test Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	81f29d6d33	nir: Remove deref chain support from lower_atomics Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	4b0ea65333	nir: Remove deref chain support from lower_clip_cull_distance_arrays Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	a42af8d0d6	nir: Remove deref chain support from lower_indirect_derefs Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	69866af357	nir: Rework gather_info to entirely use deref instructions Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	b1a18b8797	nir/vars_to_ssa: Rework to entirely use deref instructions This commit reworks nir_lower_vars_to_ssa to use deref instructions and deref paths internally instead of deref chains. We also drop support for the old load/store/copy_var intrinsics. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	f747ff1969	nir/vars_to_ssa: Add an is_direct field to deref_node This makes us build the is_direct parameter as the nodes are constructed rather than as we walk the chain. This will be useful later. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Eric Anholt	e1f0a1b029	broadcom/vc4: Remove deref chain support from nir_lower_txf_ms. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	3d19f116ad	st,ir3,radeonsi: push lower_deref_instrs back into driver vc4+vc5 is not really affected by the deref chain to deref instr conversion, so it no longer needs this pass. For others, now that all the passes mesa/st uses are using deref instructions, push the lowering to deref chains back into driver. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	3e8879be5c	nir/lower_samplers: remove legacy version Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	a20929fed2	nir: convert lower_samplers_as_deref to deref instructions This also removes the legacy version of lower_samplers. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	0bc15340be	mesa/st: re-enable lower_io_to_elements() Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	245ce114c9	nir: convert lower_io_arrays_to_elements to deref instructions Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	c409cfddcf	mesa/st/nir: convert lower_builtins to deref instructions Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	3859e0b4fe	mesa/st: temporarily disable lower_io_to_elements() Not required for correctness, and makes the order of converting passes to deref instructions hard to get right for both prog_to_nir and glsl_to_nir cases. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	c6009a1e8e	nir: convert lower_io_to_scalar to deref instructions Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	d143f6c856	move lower_deref_instrs Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	d7b0be48ef	nir: Use deref instructions in lower_constant_initializers Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	85f4149f8a	nir/builder: Use deref instructions for load/store/copy_var Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	3573570afe	radv: Disable lower_io_to_temporaries during deref changes. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	75286c2d08	nir: Use derefs in nir_lower_samplers We change glsl_to_nir to provide derefs for both textures and samplers while we're at it. This makes the lowering much easier since we only either replace sources or remove them. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	36efae1d66	nir/lower_samplers: Clean up function arguments This little refactor makes us stop passing stage around and puts the builder as the first parameter to some functions. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Rob Clark	a6ebbbc594	nir/lower_samplers: split out _legacy version for deref chains To simplify the transition, and make things bisectable, split out a legacy copy of lower_samplers. This way the i965 and gallium drivers can independently switch over to deref instructions. Since the lower_samplers_as_deref pass is only used by gallium drivers, it can be converted in lock-step with moving the lower_deref_instrs pass, and so does not need a corresponding _legacy clone. This legacy pass will be removed in a future commit. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	3891c1906f	intel/blorp: Stop setting tex->texture/sampler nir_tex_instr_create uses rzalloc so it's already NULL Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	606eb56ab9	intel/nir: Only lower load/store derefs Everything else should already be handled. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	71cd9ebed9	intel/fs: Use image_deref intrinsics instead of image_var Since we had to rewrite the deref walking loop anyway, I took the opportunity to make it a bit clearer and more efficient. In particular, in the AoA case, we will now emit one minmax instead of one per array level. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	032b845edf	anv/pipeline: Convert apply_pipeline_layout to deref instructions Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	43bb707fa4	anv/apply_pipeline_layout: Simplify extract_tex_src_plane Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	9fb36011d1	anv/pipeline: Convert lower_multiview to deref instructions Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	d57e724a45	anv/pipeline: Convert YCbCr lowering to deref instructions Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	38f1b89805	anv/pipeline: Convert lower_input_attachments to deref instructions Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Jason Ekstrand	5cd7324a57	anv/pipeline: Do less deref instruction lowering This commit removes most of the deref instruction lowering. Instead of lowering early, we only lower textures and images and we only do so right before any of the anv image lowering passes. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	1d59034de2	radv: Remove image_var stores. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	43af92edc5	radv: Use deref instructions for tex derefs in meta shaders. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	657cedb12f	ac/nir: Add deref interp support. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	d00e7d42f5	ac/nir: Add shared atomic deref instr support. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	302884d121	radv: Gather info for deref instr based load/store. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	547d970122	ac/nir: Add deref based var loads/stores. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:03 -07:00
Bas Nieuwenhuizen	5780af9880	radv: Add shader info support for image deref instructions. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:02 -07:00
Bas Nieuwenhuizen	506a07e4e3	ac/nir: Add deref support to image intrinsics. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:54:00 -07:00
Bas Nieuwenhuizen	bb5781c9a7	ac/nir: Implement derefs for integer gather4 lowering. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:58 -07:00
Bas Nieuwenhuizen	ca271e266e	ac/nir: Support deref instructions in tex instructions. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:58 -07:00
Bas Nieuwenhuizen	9b14eacf0e	ac/nir: Support deref instructions in get_sampler_desc. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:58 -07:00
Bas Nieuwenhuizen	4a888beea9	ac/nir: Implement the deref instr for shared memory. v2: Store the result in ctx->ssa_defs. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:58 -07:00
Jason Ekstrand	c11833ab24	nir,spirv: Rework function calls This commit completely reworks function calls in NIR. Instead of having a set of variables for the parameters and return value, nir_call_instr now simply has a number of sources which get mapped to load_param intrinsics inside the functions. It's up to the client API to build an ABI on top of that. In SPIR-V, out parameters are handled by passing the result of a deref through as an SSA value and storing to it. The virtue of this approach can be seen by how much it allows us to delete from core NIR. In particular, nir_inline_functions gets halved and goes from a fairly difficult pass to understand in detail to almost trivial. It also simplifies spirv_to_nir somewhat because NIR functions never were a good fit for SPIR-V. Unfortunately, there is no good way to do this without a mega-commit. Core NIR and SPIR-V have to be changed at the same time. This also requires changes to anv and radv because nir_inline_functions couldn't handle deref instructions before this change and can't work without them after this change. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:58 -07:00
Jason Ekstrand	58799b6a5b	spirv/cfg: Make the builder fully capable for both walks We were only initializing vtn_builder::func for the pre-walk where we build the CFG. We were only initializing the nir_builder for the later walk through the instructions even though we were setting b->cursor for the pre-walk. Let's set both in both places so that everything is consistent. This is useful because we handle OpFunctionParameter in the pre-walk and we're going to need to be able to emit instructions. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:58 -07:00
Jason Ekstrand	3fc3798677	spirv: Record the type of functions Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	2f9bfd7dd9	spirv: Update vtn_pointer_to/from_ssa to handle deref pointers Now that pointers can be derefs and derefs just produce SSA values, we can convert any pointer to/from SSA. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	d5930c222c	spirv: Allow pointers to have a deref at the base Previously, pointers fell into two categories: index/offset for UBOs, SSBOs, etc. and var + access chain for logical pointers. This commit adds another logical pointer mode that's deref + access chain. It's tempting to think that we can just replace variable-based pointers with deref-based or at least replace the access chain with a deref chain. Unfortunately, there are a few sticky bits that prevent this: 1) We can't return deref-based pointers from OpVariable because those opcodes may come outside of a function so there's no place to emit the deref instructions. 2) We can't always use variable-based pointers because we may not always know the variable. (We do now, but the upcoming function rework will take that option away.) 3) We also can't replace the access chain struct with a deref. Due to the re-ordering we do in order to handle loop continues, the derefs we would emit as part of OpAccessChain may not dominate their uses. We normally fix this up with nir_repair_ssa but that generates phi nodes which we don't want in the middle of our deref chains. All in all, we have no real better option than to support partial access chains while also re-emitting the deref instructions on the spot. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	fdd5ffee32	spirv: Clean up vtn_pointer_to_offset Now that push constants are using on-the-fly offsets, we no longer need to handle access chains in vtn_pointer_to_offset. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	7dfa440922	spirv: Make push constants an offset-based pointer Push constants have been a weird edge-case for a while in that they have explicit offsets but we've been internally building access chains for them. This mostly works but it means that passing pointers to push constants through as function arguments is broken. The easy thing to do for now is to just treat them like UBOs or SSBOs only without a block index. This does lose a bit of information since we no longer have an accurate access range and any indirect access will look like it could read the whole block. Unfortunately, there's not much we can do about that. Once NIR derefs get a bit more powerful, we can plumb these through as derefs and be able to reason about them again. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	b0c643d8f5	spirv: Use NIR per-member splitting Before, we were doing structure splitting in spirv_to_nir. Unfortunately, this doesn't really work when you think about passing struct pointers into functions. Doing it later in NIR is a much better plan. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	2100c2f3a2	nir/spirv: Pass nir_variable_data into apply_var_decoration Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	39bf61aa37	nir: Add a concept of per-member structs and a lowering pass This adds a concept of "members" to a variable with an interface type. It allows you to specify the full variable data for each member of the interface instead of once for the variable. We also add a lowering pass to lower those variables to a sequence of variables and rewrite all the derefs accordingly. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	eb40540b8a	spirv: Use deref instructions for most variables The only thing still using old-school derefs are function calls. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	e5130012e4	st/nir: Move lower_deref_instrs later Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	152057b138	i965: Move nir_lower_deref_instrs to right before locals_to_regs Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:57 -07:00
Jason Ekstrand	a649610ace	nir/lower_tex: Always copy deref and offset sources This should make nir_lower_tex properly handle deref instructions as well as make it more correct when texture arrays are used and it's called after lowering samplers to binding table indices. Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	261fe676e5	intel/nir: Fixup deref modes after lowering patch vertices Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	d7d5aab45b	intel,ir3: Disable nir_opt_copy_prop_vars This pass doesn't handle deref instructions yet. Making it handle both legacy derefs and deref instructions would be painful. Since it's not important for correctness, just disable it for now. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	5dc58908b7	nir: Support deref instructions in opt_undef Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	f46ecdc441	nir: Consider deref instructions in opt_peephole_select Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	1e1733aaf0	nir: Consider deref instructions in lower_phis_to_scalar Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	775ef13384	nir: Support deref instructions in lower_drawpixels Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	932c6577a0	nir: Support deref instructions in lower_clamp_color_outputs Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	076b6627c2	nir: Support deref instructions in lower_alpha_test Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	414148cdc1	nir: Support deref instructions in loop_analyze Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	e786fcf777	nir: Support deref instructions in remove_unused_varyings Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:56 -07:00
Jason Ekstrand	933c2851ab	nir: Support deref instructions in lower_pos_center Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	64057fd333	nir: Support deref instructions in lower_wpos_ytransform Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	2c9ca29372	nir: Support deref instructions in lower_atomics Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	d029167ea0	nir: Support deref instructions in lower_io Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	59b43be105	nir: Support deref instructions in gather_info Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	1442969ae1	nir: Support deref instructions in propagate_invariant Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	f23356a4dd	nir: Support deref instructions in lower_clip_cull Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	61b7bef3a3	nir: Support deref instructions in lower_system_values Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	1285cc9616	nir: Support deref instructions in lower_indirect_derefs Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	dccb3acb63	nir: Support deref instructions in lower_vars_to_ssa Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	9fe99129df	nir: Support deref instructions in split_var_copies Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	4a4e175738	nir: Support deref instructions in lower_var_copies Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:55 -07:00
Jason Ekstrand	a406f7e0c9	nir: Add a deref path helper struct This commit introduces a new nir_deref.h header for helpers that are less common and really only needed by a few heavy-duty passes. In this header is a new struct for representing a full deref path which can be walked in either direction. v2 (Jason Ekstrand): - Assert that deref != NULL (Caio) - Fill _short_path with 0xdeadbeef in debug builds when not used (Caio) - Make nir_deref_path a typedef (Rob) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	535289a3a9	nir: Support deref instructions in lower_io_to_temporaries Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	21befc46ef	nir: Support deref instructions in lower_global_vars_to_local Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	54e440945e	nir: Add a pass for fixing deref modes This will be needed by anything which changes variable modes without rewriting derefs. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	f917814c14	nir: Support deref instructions in remove_dead_variables Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Rob Clark	f03a33a19a	ttn: convert to deref instructions Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	82c498510e	prog/nir: Use deref instructions for params Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	2c7b892909	glsl/nir: Use deref instructions instead of deref chains Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	7f41a99cac	glsl/nir: Only claim to handle intrinsic functions Non-intrinsic function handling has never actually been tested and probably doesn't work. Just get rid of it for now. We can always add it back in later if it's useful. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Rob Clark	d80c342d89	nir: add deref lowering sanity checking This will be removed at the end of the transition, but add some tracking plus asserts to help ensure that lowering passes are called at the correct point (pre or post deref instruction lowering) as passes are converted and the point where lower_deref_instrs() is called is moved. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	74212c2414	anv,i965,radv,st,ir3: Call nir_lower_deref_instrs This inserts a call to nir_lower_deref_instrs at every call site of glsl_to_nir, spirv_to_nir, and prog_to_nir. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:54 -07:00
Jason Ekstrand	8b7aa66169	nir/deref: Add some deref cleanup functions Sometimes it's useful for a pass to be able to clean up its own derefs instead of waiting for DCE. This little helper makes it very easy. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:53 -07:00
Jason Ekstrand	a80fa2766e	nir: Add helpers for working with deref instructions This commit adds a pass for lowering deref instructions to deref chains as well as some smaller helpers to ease the transition. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:53 -07:00
Jason Ekstrand	5286b5d832	nir: Add deref sources to texture instructions Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:53 -07:00
Jason Ekstrand	f1dc2088e2	nir: Add _deref versions of all of the _var intrinsics Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:53 -07:00
Jason Ekstrand	de7f60b653	nir/builder: Add deref building helpers Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:53 -07:00
Jason Ekstrand	19a4662a54	nir: Add a deref instruction type This commit adds a new instruction type to NIR for handling derefs. Nothing uses it yet but this adds the data structure as well as all of the code to validate, print, clone, and [de]serialize them. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:53 -07:00
Jason Ekstrand	5fbbbda37a	nir/validate: Rework intrinsic type validation This moves the switch statement for specific intrinsics above source and destination validation. We also rework the source and destination validation to use different bit_size values for each source and/or destination. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Rob Clark <robdclark@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-22 20:15:53 -07:00
Karol Herbst	133e8bf4de	nv50/ir: only avoid spilling constrained def if a mov is added Fix spilling regression introduced by `5428066f5e`. This is just a minor mistake made while moving the code out into a new function. The function contained a loop which might have terminated early and skipped setting noSpill to 1. After the refactoring it was always set. Fixes: `5428066f5e` ("nv50/ir: make a copy of tex src if it's referenced multiple times") Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-06-23 03:00:24 +02:00
Dylan Baker	ced3df5623	meson: Fix typo that breaks -Dgallium-xvmc=false _xmvc -> _xvmc. Sigh Fixes: `a6943bb4ce` ("meson: Fix auto option for xvmc") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2018-06-22 10:16:27 -07:00
Dylan Baker	94cf397092	meson: Fix auto option for va The same as the previous two patches, but for the libva state tracker. Fixes: `724916c8a8` ("meson: dedup gallium-xvmc logic") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-22 09:51:25 -07:00
Dylan Baker	a6943bb4ce	meson: Fix auto option for xvmc This fixes the same problem as the previous patch did for vdpau, but for xvmc. Fixes: `724916c8a8` ("meson: dedup gallium-xvmc logic") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-22 09:51:18 -07:00
Dylan Baker	d9a8008a93	meson: Correct behavior of vdpau=auto Currently if vdpau is set to auto, it will be disabled only in cases where gallium is disabled or the host OS is not supported (mac, haiku, windows). However on (for example) Linux if libvdpau is not installed then the build will error because of the unmet dependency. This corrects auto to do the right thing, and not error if libvdpau is not installed. Fixes: `992af0a4b8` ("meson: dedup gallium-vdpau logic") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-22 09:51:11 -07:00
Samuel Pitoiset	ca59c3906d	radv: always check the return error when submitting a CS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-22 17:47:10 +02:00
Samuel Pitoiset	68d9517690	radv: check the return values of radv_signal_fence() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-22 17:47:09 +02:00
Samuel Pitoiset	07832083d3	radv: change the returned error in radv_signal_fence() From my point of view, when we aren't able to submit a CS something terribly wrong has happened and we are most likely going to lose the device. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-22 17:47:06 +02:00
Jonathan Marek	94bc06b196	freedreno: a2xx: fix clear color The format of the CLEAR_COLOR register doesn't depend on the target format. This fixes clear color when rendering to 32-bit RGBA and 16-bit targets. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-22 08:23:10 -04:00
Jonathan Marek	dd8553dd95	freedreno: a2xx: fix crash when freeing context Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-22 08:23:10 -04:00
Jonathan Marek	6eeac34cee	freedreno: a2xx: fix crash on first clear blend can be NULL, so check for that Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-22 08:23:10 -04:00
Jonathan Marek	17e16ba9db	freedreno: add a20x this patch adds support for a20x, which has some differences with a220: -no VGT_MAX_VTX_INDX register -no CLEAR_COLOR register -set RB_BC_CONTROL in restore (hangs without) -different CP_DRAW_INDX format tested with kmscube and glmark2 scenes, on par with a220 Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-22 08:23:10 -04:00
Jonathan Marek	d5ff36b97b	freedreno: a2xx: increase size of the offset field in instr_fetch_vtx_t The offset field is 22 bit large. 11 bits are necessary because MaxVertexAttribRelativeOffset = 2047 Signed-off-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-22 08:23:10 -04:00
Eric Anholt	69ae42ca4c	v3d: Don't forget to initialize the buffer offset of a new winsys handle.	2018-06-21 15:56:18 -07:00
Eric Anholt	ee9a6a13fb	v3d, vc4: Disable valgrind checking of CLE inputs when NDEBUG is set. For a meson -Db_ndebug=true release build on x86_64, reduces text size of libv3d.a from 53.0k to 51.6k. Inspired by `0d5329d626` ("anv: Disable __gen_validate_value if NDEBUG is set.")	2018-06-21 15:46:40 -07:00
Marek Olšák	a2790b134a	mesa: fix glGetInteger64v for arrays of integers Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:55:15 -04:00
Marek Olšák	ce4b8b952a	ac/surface: disallow rotated micro tile mode Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-21 14:42:14 -04:00
Marek Olšák	9410cd53c3	radeonsi: fix occlusion queries with 16x AA without FBO attachments on Stoney Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-21 14:42:14 -04:00
Marek Olšák	9c21002f6e	radeonsi: handle non-clearable DCC buffers as MSAA resolve dst This is reproducible on Stoney, but other chips may be affected too. Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-21 14:42:14 -04:00
Marek Olšák	587e712eda	radeonsi: disable DCC MSAA for 128bpp formats on Stoney Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-21 14:42:14 -04:00
Rob Clark	6764aae169	docs: update freedreno features Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-21 08:54:48 -04:00
Rob Clark	fbd154294f	mesa: fix GLES 3.1 version calculation All of ARB_gpu_shader5 is most certainly not required for GLES 3.1 (most of it is in OES_gpu_shader5 on top of GLES 3.1). Some of what is required from ARB_gpu_shader5 is provided by ARB_texture_gather, so check for that. The remaining subset of ARB_gpu_shader5 doesn't have individual extensions to check for, but I guess it is unlikely that some driver has all of these extensions but not, say, integer bitfield manipulation. Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-21 08:54:47 -04:00
Rob Clark	cf0c7258ee	freedreno/a5xx: MSAA Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-21 08:54:47 -04:00
Rob Clark	b6e690ef80	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-21 08:54:47 -04:00
Rob Clark	418b3fd184	freedreno/ir3: txf_ms support Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-21 08:54:47 -04:00
Rob Clark	d03bd103f8	freedreno/a5xx: fix gpu hangs with large compute shaders Similar to the combined limit for VS+FS, there is an upper limit for shader size to run from internal memory. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-21 08:54:47 -04:00
Rob Clark	e1e40935b4	freedreno/ir3: fix base_vertex Fixes: `c366f422f0` nir: Offset vertex_id by first_vertex instead of base_vertex Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-21 08:54:47 -04:00
Eduardo Lima Mitev	77e790f99a	i965: Link uniforms of SPIR-V programs using the NIR linker v2: nir_link_uniforms renamed to gl_nir_link_uniforms Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Neil Roberts	ae0208e5b4	i965: Setup glsl uniforms by index rather than name matching Previously when setting up a uniform it would try to walk the uniform storage slots and find one that matches the name of the given variable. However, each variable already has a location which is an index into the UniformStorage array so we can just directly jump to the right slot. Some of the variables take up more than one slot so we still need to calculate how many it uses. The main reason to do this is to support ARB_gl_spirv because in that case the uniforms don’t have names so the previous approach won’t work. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	57b6184931	i965: account for NIR uniforms without name Right now, the BRW linker code assumes nir_variable::name is always non-NULL, but thanks to ARB_gl_spirv we will soon be linking SPIR-V programs, and those explicitly require matching uniforms by location. The name is just a debug hint. Instead of checking for the name this patch makes it check for var->num_state_slots on the assumption that everything that had an internal name also had some state slots. This seems likely because the two code paths that are taken when the name begins with "gl_" already have an assert that var->state_slots is not NULL. v2: simplified, most of it moved to glsl/nir/spirv (Neil Roberts) v3: check for num_state_slots instead of the name. This is needed because we do actually have nameless builtins with SPIR-V such as PatchVerticesIn and we want them to hit the _mesa_add_state_reference code path (Neil Roberts) Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Neil Roberts	7dd96a0653	i965: Update TexturesUsed after linking the shaders Otherwise if the shader is SPIR-V then SamplerUsed won’t have been initialised yet so it will end up thinking no textures are used. This was causing a crash later on if nothing causes it to regenerate TexturesUsed before the next render. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	4bf8b80f54	i965: Build SPIR-V programs' resource list using NIR v2: tweak after nir_linker.h being renamed to gl_nir_linker.h Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	3cf12c6317	nir/linker: Add nir_build_program_resource_list() This function is equivalent to the linker.cpp build_program_resource_list() but will extract the resources from NIR shaders instead. For now, only uniforms and program inputs are implemented. v2: move from compiler/nir to compiler/glsl (Timothy Arceri) v3: remove support for inputs, that is still WIP (spotted by Timothy Arceri) Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Alejandro Piñeiro	215c9359ed	compiler/link: move add_program_resource to linker_util So it could be used by the GLSL and NIR linker. v2: (Timothy Arceri) * Moved from compiler to compiler/glsl * Method renamed to link_util_add_program_resource Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Neil Roberts	2bf91733fc	nir/linker: Set the uniform initial values This is based on link_uniform_initializers.cpp. v2: move from compiler/nir to compiler/glsl (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	7a9e5cdfbb	nir/linker: Add gl_nir_link_uniforms() This function will be the entry point for linking the uniforms from the nir_shader objects associated with the gl_linked_shaders of a program. This patch includes initial support for linking uniforms from NIR shaders. It is tailored for the ARB_gl_spirv needs, and it is far from complete, but it should handle most cases of uniforms, array uniforms, structs, samplers and images. There are some FIXMEs related to specific features that will be implemented in following patches, like atomic counters, UBOs and SSBOs. Also, note that ARB_gl_spirv makes explicit location mandatory for normal uniforms, so this code only handles uniforms with explicit location. But there are cases, like uniform atomic counters, that don't have a location from the OpenGL point of view (they have a binding), but to which Mesa internally assigns a location. That will be handled in following patches. A nir_linker.h file is also added. More NIR-linking related API will be added in subsequent patches and those will include stuff from Mesa, so reusing nir.h didn't seem a good idea. v2: move from compiler/nir to compiler/glsl (Timothy Arceri) v3: sets var->driver.location if the uniform was found from a previous stage (Neil Roberts). Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Alejandro Piñeiro	aa95f0bc5b	compiler/link: add linker_util.h, move linker_error/warning to it Linker utilities common to the GLSL IR and NIR linker (the latter to be used for ARB_gl_spirv). We need to move it to a new header as the NIR linker doesn't need to know about ir_variable, and others, included at linker.h. v2: move from src/compiler to src/compiler/glsl (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Neil Roberts	b995bda9bc	spirv: Set nir_variable->explicit_binding When SpvDecorationBinding is encountered in the SPIR-V source it now sets explicit_binding on the nir_variable. This will be used to determine whether to initialise sampler and image uniforms with the binding value. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Neil Roberts	386f09be9b	spirv: Get rid of vtn_variable_mode_image/sampler vtn_variable_mode_image and _sampler are instead replaced with vtn_variable_mode_uniform which encompasses both of them. In the few places where it was necessary to distinguish between the two, the GLSL type of the pointer is used instead. The main reason to do this is that on OpenGL it is permitted to put images and samplers into structs and declare a uniform with them. That means that variables can now have a mix of uniform, sampler and image modes so picking a single one of those modes for a variable no longer makes sense. This fixes OpLoad on a sampler within a struct which was previously using the variable mode to determine whether it was a sampler or not. The type of the variable is a struct so it was not being considered to be uniform mode even though the member being loaded should be sampler mode. The previous code appeared to be using var->interface_type as a place to store the type of the variable without the enclosing array for images and samplers. I guess this worked because opaque types can not appear in interfaces so the interface_type is sort of unused. This patch removes the overloading of var->interface_type and any places that needed the type without the array can now just deduce it from var->type. v2: squash in this patch the changes to anv/nir (Timothy) Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Nicolai Hähnle	23edc5b1ef	spirv: translate default-block uniforms They are supported by SPIR-V for ARB_gl_spirv. v2 (changes on top of Nicolai's original patch): * Handle UniformConstant storage class for uniforms other than samplers and images. (Eduardo Lima) * Handle location decoration also for samplers and images. (Eduardo Lima) * Rebase update (spirv_to_nir options added, logging changes, and others) (Alejandro Piñeiro) Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	3d6664763d	nir/types: Add a utility wrapper to glsl_type::sampler_index() I think it is more accurate to call it a sampler target (?). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	f1ab16cf17	nir/types: Add a glsl_get_component_slots() utility It is basically a wrapper around glsl_type::component_slots(). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	2b8765b824	nir/lower_samplers: Limit assert to GLSL shader programs Vulkan has the concept of separate image and sampler objects in the SPIR-V code whereas GL conflates them into one. nir_lower_samplers contains an assert to verify that the sampler operand is not being set on the nir instruction. However when the code comes from spirv_to_nir the sampler operand is always set. GL_ARB_gl_spirv explicitly states that OpTypeSampler is not supported so it retains the GL behaviour of not being able to separate them. Therefore the sampler will always be the same as the texture. This GL version of the lowering code ignores instr->sampler and sets instr->sampler_index to the same value as instr->texture_index. Some other places in the code (such as in nir_print) assume that once the instruction is lowered then both instr->texture and instr->sampler will be NULL, so to keep this behaviour we now set instr->sampler to NULL after ignoring it to fill in instr->sampler_index. Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Neil Roberts	652be1563f	nir: Add explicit_binding to nir_variable This is copied from the corresponding value in ir_variable. The intention is to eventually use it in a pure-NIR linker. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Alejandro Piñeiro	8d1ec2ed5a	mesa/main: add NULL name check when searching for a resource name With ARB_gl_spirv, name reflection can be missing. piglit shader_runner performs several resource checks, so this commit is useful to get even the simplest piglit tests running without crashing in SPIR-V mode. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Alejandro Piñeiro	a6dc3d22eb	i965: use gl_shader_program_data::spirv Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Eduardo Lima Mitev	a940683733	mesa/main: Add a 'spirv' flag to gl_shader_program_data This will be used by the linker code to differentiate between programs made out of SPIR-V or GLSL shaders. This was rejected in the past, assuming that it was equivalent to check for "shProg->_LinkedShaders[stage]->spirv_data != NULL". But: * At some points of the linking process it would be necessary to check if _LinkedShaders[stage] is present, so the full check would be: "shProg->_LinkedShaders[stage] != NULL && shProg->_LinkedShaders[stage]->spirv_data != NULL" * Sometimes you would like to do something specific to SPIR-V independently of the stage, or for any stage. For example, "link all the uniforms, for all stages". In that case checking for the flag would be equivalent to iterating over all the _LinkedShaders and checking if there is any spirv_data available. The former makes readability much worse. Both could be solved by adding two helpers. But adding a flag seems much simpler and more readable. v2: added justification for the flag on the commit message (Alejandro) Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-21 14:25:05 +02:00
Emil Velikov	697254111b	docs/release-calendar: restore the missing 18.1 column Earlier commit removed the column, instead of adjusting the height. Cc: Dylan Baker <dylan@pnwbakers.com> Fixes: `0d4f338a11` ("docs: Update release-notes and calendar") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-21 12:09:39 +01:00
Emil Velikov	dfb1f2759c	configure: use compliant grep regex checks The current `grep "foo\\|bar"' trips on some grep implementations, like the FreeBSD one. Instead use `egrep "foo\|bar"' as suggested by Stefan. Cc: Stefan Esser <se@FreeBSD.org> Reported-by: Stefan Esser <se@FreeBSD.org> Bugzilla: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=228673 Fixes: `1914c814a6` ("configure: error out if building OMX w/o supported platform") Fixes: `63e11ac2b5` ("configure: error out if building VA w/o supported platform") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-21 12:09:39 +01:00
Emil Velikov	d589eddc8b	glsl/tests/glcpp: reinstate "error out if no tests found" With the recent rework of converting the shell script to a python one the check for actual tests was dropped. Bring that back, since it was explicitly added considering we had a ~2 year period, during which the tests were not run. v2: use raise Exception() over print() & return false (Dylan) Fixes: `db8cd8e367` ("glcpp/tests: Convert shell scripts to a python script") Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-21 12:09:39 +01:00
Emil Velikov	a2f5292c82	glsl/glcpp/tests: reinstate srcdir/abs_builddir blurb Bring back the "detection" of the said variables, to allow standalone execution. Fixes: `db8cd8e367` ("glcpp/tests: Convert shell scripts to a python script") Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-21 12:09:39 +01:00
Emil Velikov	87cebace54	glsl: fold glcpp-test-cr-lf.sh into glcpp-test.sh As of recently both of these have been reworked so they invoke a python script. At the same time the latter can be executed with the combined arguments of both scripts. AKA we no longer need to have them separate. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-21 12:09:39 +01:00
Emil Velikov	1c1f70d12f	st/dri: constify dri_fill_st_visual's screen As the function says - only the visual is changed. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-21 12:09:39 +01:00
Emil Velikov	ccaa9f09cc	mesa: remove struct gl_extensions::ATI_separate_stencil Virtually every driver that supports ATI_separate_stencil also supports EXT_stencil_two_side. Use the latter boolean for both extensions. With that in mind we can drop the explicit true from the drivers and the nasty comment in compute_version(). Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-21 12:09:39 +01:00
Eric Engestrom	1714dfca8a	travis: add libXrandr and its randrproto dependency Fixes: `3f960c1338` "vulkan: EXT_acquire_xlib_display requires libXrandr headers to build" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-21 11:46:47 +01:00
Juan A. Suarez Romero	d24839be70	swr: bump minimum supported LLVM version to 5.0 RADV now requires LLVM 5.0 or greater, and thus we can't build dist tarball because swr requires LLVM 4.0. Let's bump required LLVM to 5.0 in swr too. Fixes: `f9eb1ef870` ("amd: remove support for LLVM 4.0") Cc: Tim Rowley <timothy.o.rowley@intel.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-06-21 12:16:46 +02:00
Grazvydas Ignotas	f966929805	radeonsi: add a debug flag to zero vram allocations This allows to avoid having to see garbage in Dying Light loading screen at least, which probably expects Windows/NV behavior of all allocations being zeroed by default. Analogous to radv flag with the same name. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-21 12:18:50 +03:00
Grazvydas Ignotas	4e0d93dc0e	radeonsi: use shifts for sign extension Avoids a branch and reduces code size a tiny bit: text data bss dec hex filename 10804563 398653 2070368 13273584 ca89f0 /tmp/radeonsi_dri.so.old 10804499 398653 2070368 13273520 ca89b0 /tmp/radeonsi_dri.so Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-21 12:17:34 +03:00
Samuel Pitoiset	af17a29ad8	radv: set EVENT_WRITE_EOP.INT_SEL = wait for write confirmation Ported from RadeonSI. Not sure why this is needed but AMDVLK does something similar. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-21 10:31:03 +02:00
Samuel Pitoiset	41f6096c26	radv: use EOP_DATA_SEL_* instead of magic numbers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-21 10:31:02 +02:00
Roland Scheidegger	53959fcbd8	r600: fix copy/paste bug for sampleMaskIn workaround The sampleMaskIn workaround (`b936f4d1ca`) tries to figure out if the shader is running at per-sample frequency, but there's a typo bug so it will only recognize per-sample linear inputs, not per-sample perspective ones. Spotted by Eric Engestrom <eric.engestrom@intel.com> Fixes: b936f4d1ca0d2ab1e828a "r600: partly fix sampleMaskIn value"	2018-06-21 02:37:11 +02:00
Eric Anholt	edb7890750	v3d: Fix min vs mag determination when not doing mip filtering. Fixes all 128 failing tests in dEQP-GLES3.functional.texture.filtering.*.combinations	2018-06-20 12:31:54 -07:00
Keith Packard	3f960c1338	vulkan: EXT_acquire_xlib_display requires libXrandr headers to build When VK_USE_PLATFORM_XLIB_XRANDR_EXT is defined, vulkan.h includes X11/extensions/Xrandr.h for the RROutput typedef which is used in the vkGetRandROutputDisplayEXT interface. Make sure we have the required header by checking during the build, and also set CFLAGS to point at the right directory. We don't need to link against the library as we don't use any functions from there, so don't add the _LIBS value in the autotools build. Signed-off-by: Keith Packard <keithp@keithp.com> Fixes: `dbac8e25f8` "radv: Add EXT_acquire_xlib_display to radv driver [v2]" Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-20 10:42:05 -07:00
Eric Anholt	f49d112a01	v3d: Implement ALPHA_TO_COVERAGE. There's a convenient "FTOC" instruction for generating the coverage now, unlike vc4. This fixes dEQP-GLES3.functional.multisample.fbo_4_samples.proportionality_alpha_to_coverage	2018-06-20 09:30:46 -07:00
Eric Anholt	94f7c011d6	v3d: Track write reference to the separate stencil buffer. Otherwise, a blit from separate stencil may fail to flush the job that initialized it, or new drawing could fail to flush a blit reading from stencil. Fixes: dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_basic dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_scale dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_stencil_only dEQP-GLES3.functional.fbo.msaa.2_samples.depth32f_stencil8 dEQP-GLES3.functional.fbo.msaa.4_samples.depth32f_stencil8	2018-06-20 09:30:46 -07:00
Eric Anholt	a52c357a65	v3d: Add missing reference to the separate stencil buffer. Noticed while debugging a missing flush of rendering in the z32f_s8 case.	2018-06-20 09:30:46 -07:00
Eric Anholt	1334295f29	v3d: Fix return value from fence_finish. We needed to convert from a -errno to a boolean success value. Fixes: GTF-GLES3.gtf.GL3Tests.sync.sync_functionality_clientwaitsync_flush GTF-GLES3.gtf.GL3Tests.sync.sync_functionality_clientwaitsync_signaled	2018-06-20 09:30:46 -07:00
Christian Gmeiner	8b3099353e	mesa/st: only do scalar lowerings if driver benefits As not every (upcoming) backend compiler is happy with nir_lower_xxx_to_scalar lowerings do them only if the backend is scalar (and not vec4) based. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-06-20 17:56:37 +02:00
Christian Gmeiner	f485e5671c	gallium: add scalar isa shader cap v1 -> v2: - nv30 is _NOT_ scalar as suggested by Ilia Mirkin. - Change from a screen cap to a shader cap as suggested by Eric Anholt. - radeonsi is scalar as suggested by Marek Olšák. - Change missing ones to be scalar. v2 -> v3: - r600 prefers vec4 as suggested by Marek Olšák. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-20 17:55:39 +02:00
Keith Packard	050d8a4b42	radv: Add VK_EXT_display_surface_counter to radv driver This extension is required to support EXT_display_control as it offers a way to query whether the vblank counter is supported. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-20 08:16:45 -07:00
Keith Packard	1801d7c73c	anv: Add VK_EXT_display_surface_counter to anv driver [v2] This extension is required to support EXT_display_control as it offers a way to query whether the vblank counter is supported. v2: Add extension to list in alphabetical order Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-20 08:16:34 -07:00
Jason Ekstrand	b1a013d035	Vulkan/wsi: Implement VK_EXT_display_surface_counter This extension is required to support EXT_display_control as it offers a way to query whether the vblank counter is supported. Internally, it is implemented using a fake MESA extension which provides a chain-in to GetSurfaceCapabilities2KHR which contains the one added field. This has the advantage of reducing number of callbacks needed in the back-ends. It also means that anything chained into GetSurfaceCapabilities2EXT through VkSurfaceCapabilities2KHR::pNext so we only need to handle crawling the pNext chain once per back-end. Reviewed-by: Keith Packard <keithp@keithp.com>	2018-06-20 08:16:03 -07:00
Jason Ekstrand	8f3b58ebee	vulkan/wsi: Get rid of the get_capabilities hook Instead, we can just use get_capabilities2. This way back-ends only have to implement one hook. Reviewed-by: Keith Packard <keithp@keithp.com>	2018-06-20 08:16:03 -07:00
Eric Engestrom	7f3cb7db08	intel/aubinator: drop unused functions Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-20 15:17:26 +01:00
Samuel Pitoiset	65b3fed037	radv: always initialize the clear depth/stencil values to 0 Similar to the clear color values. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-20 13:21:42 +02:00
Samuel Pitoiset	204cf5714a	radv: always initialize the clear color values to 0 Having random data in there is probably not the best. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-20 13:21:42 +02:00
Samuel Pitoiset	4b564bd612	radv: always initialize the DCC predicate to FALSE This might eventually skip some useless DCC decompression passes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-20 13:21:42 +02:00
Samuel Pitoiset	70c1bee187	radv: do not use an user SGPR for the sample position offset We know the number of samples at compile time. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-20 13:21:42 +02:00
Samuel Pitoiset	20170865db	radv: don't store the number of samples as log2 Needed for the following patch. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-20 13:21:42 +02:00
Gert Wollny	8a6e3f0c5d	gallium/aux/util/u_cpu_detect.h: Fix -Wsign-compare warning in u_cpu_detect.c Change the type of util_cpu_caps::nr_cpus to int because sysconfig returns a signed value, fixes: u_cpu_detect.c: In function 'util_cpu_detect': u_cpu_detect.c:317:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (util_cpu_caps.nr_cpus == -1) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	33f4e8a043	gallium/aux/util/u_debug.h: Fix "noreturn" warnings in debug mode Only decorate function as noreturn when DEBUG is not defined, because when compiled in DEBUG mode the function actually executes an int3 and may return, fixes: u_debug.c: In function '_debug_assert_fail': u_debug.c:309:1: warning: 'noreturn' function does return Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	70f632962a	gallium/aux/util: Fix some warnings util/u_cpu_detect.c: In function 'util_cpu_detect': util/u_cpu_detect.c:377:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (util_cpu_caps.nr_cpus == ~0u) ^~ util/u_hash_table.c:274:21: warning: unused parameter 'k' [-Wunused-parameter] util_hash_inc(void k, void v, void d) ^ util/u_hash_table.c:274:30: warning: unused parameter 'v' [-Wunused-parameter] util_hash_inc(void k, void v, void d) ^ util/u_tests.c: In function 'test_texture_barrier': util/u_tests.c:652:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = 0; i < num_samples / 2; i++) { ^ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	3e091d5a7a	gallium/aux/tgsi_ureg.c: remove unused parameter from match_or_expand_immediate64 remove "type" from "match_or_expand_immediate64", fixes: tgsi/tgsi_ureg.c: In function 'match_or_expand_immediate64': tgsi/tgsi_ureg.c:837:34: warning: unused parameter 'type' [-Wunused-parameter] int type, ^~~~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	f79b980486	gallium/aux/tgsi_two_side.c: Fix -Wsign-compare warnings Integer propagation rules can sometimes be irritating. With "unsigned x" "x + 1" gets propagated to a signed integer, so explicitly assign the sum to an unsigned and use that for comparison. In file included from tgsi/tgsi_two_side.c:41:0: tgsi/tgsi_two_side.c: In function 'xform_decl': ./util/u_math.h:660:29: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) ^ tgsi/tgsi_two_side.c:86:24: note: in expansion of macro 'MAX2' ts->num_inputs = MAX2(ts->num_inputs, decl->Range.Last + 1); ^~~~ ./util/u_math.h:660:40: warning: signed and unsigned type in conditional expression [-Wsign-compare] #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) ^ tgsi/tgsi_two_side.c:86:24: note: in expansion of macro 'MAX2' ts->num_inputs = MAX2(ts->num_inputs, decl->Range.Last + 1); ^~~~ ./util/u_math.h:660:29: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) ^ tgsi/tgsi_two_side.c:89:23: note: in expansion of macro 'MAX2' ts->num_temps = MAX2(ts->num_temps, decl->Range.Last + 1); ^~~~ ./util/u_math.h:660:40: warning: signed and unsigned type in conditional expression [-Wsign-compare] #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) ^ tgsi/tgsi_two_side.c:89:23: note: in expansion of macro 'MAX2' ts->num_temps = MAX2(ts->num_temps, decl->Range.Last + 1); ^~~~ tgsi/tgsi_two_side.c: In function 'xform_inst': tgsi/tgsi_two_side.c:184:45: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (inst->Src[i].Register.Index == ts->front_color_input[j]) { ^~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	dc5ba7e17c	gallium/aux/tgsi_ureg.c: Fix various warnings tgsi/tgsi_ureg.c: In function 'ureg_DECL_sampler': tgsi/tgsi_ureg.c:721:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (ureg->sampler[i].Index == nr) ^~ tgsi/tgsi_ureg.c: In function 'match_or_expand_immediate64': tgsi/tgsi_ureg.c:837:34: warning: unused parameter 'type' [-Wunused-parameter] int type, ^~~~ tgsi/tgsi_ureg.c: In function 'emit_decls': tgsi/tgsi_ureg.c:1821:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (ureg->properties[i] != ~0) ^~ tgsi/tgsi_ureg.c: In function 'ureg_create_with_screen': tgsi/tgsi_ureg.c:2193:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ARRAY_SIZE(ureg->properties); i++) ^ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	c5e8280504	gallium/aux/tgsi_text.c: Fix -Wsign-compare warnings tgsi/tgsi_text.c: In function 'parse_identifier': tgsi/tgsi_text.c:218:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (i == len - 1) ^~ tgsi/tgsi_text.c: In function 'parse_optional_swizzle': tgsi/tgsi_text.c:873:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < components; i++) { ^ tgsi/tgsi_text.c: In function 'parse_instruction': tgsi/tgsi_text.c:1103:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < info->num_dst + info->num_src + info->is_tex; i++) { ^ tgsi/tgsi_text.c:1118:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] else if (i < info->num_dst + info->num_src) { ^ tgsi/tgsi_text.c: In function 'parse_immediate': tgsi/tgsi_text.c:1660:24: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (type = 0; type < ARRAY_SIZE(tgsi_immediate_type_names); ++type) { ^ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	b16b6d0889	gallium/aux/tgsi_point_sprite.c: Fix -Wsign-compare warnings tgsi/tgsi_lowering.c: In function 'emit_twoside': tgsi/tgsi_lowering.c:1179:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->two_side_colors; i++) { ^ tgsi/tgsi_lowering.c:1208:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->two_side_colors; i++) { ^ tgsi/tgsi_lowering.c:1216:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->two_side_colors; i++) { ^ tgsi/tgsi_lowering.c: In function 'emit_decls': tgsi/tgsi_lowering.c:1280:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->numtmp; i++) { ^ tgsi/tgsi_lowering.c: In function 'rename_color_inputs': tgsi/tgsi_lowering.c:1311:28: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (src->Index == ctx->two_side_idx[j]) { ^~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	3792d85755	gallium/aux/tgsi_lowering.c: Fix -Wsign-compare warnings tgsi/tgsi_lowering.c: In function 'emit_twoside': tgsi/tgsi_lowering.c:1179:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->two_side_colors; i++) { ^ tgsi/tgsi_lowering.c:1208:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->two_side_colors; i++) { ^ tgsi/tgsi_lowering.c:1216:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->two_side_colors; i++) { ^ tgsi/tgsi_lowering.c: In function 'emit_decls': tgsi/tgsi_lowering.c:1280:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < ctx->numtmp; i++) { ^ tgsi/tgsi_lowering.c: In function 'rename_color_inputs': tgsi/tgsi_lowering.c:1311:28: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (src->Index == ctx->two_side_idx[j]) { ^~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	7a3daaab41	gallium/aux/tgsi_build.c: Fix -Wsign-compare warnings tgsi/tgsi_build.c: In function 'tgsi_build_full_immediate': tgsi/tgsi_build.c:622:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for( i = 0; i < full_imm->Immediate.NrTokens - 1; i++ ) { ^ tgsi/tgsi_build.c: In function 'tgsi_build_full_property': tgsi/tgsi_build.c:1393:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for( i = 0; i < full_prop->Property.NrTokens - 1; i++ ) { ^ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	94f40d3ac0	gallium/aux/tgsi_build.c: Remove now unused variable Removing the unused prev_token from the function calls made this local variable also unused. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	dc46b2aa99	gallium/aux/tgsi_build.c: Remove unused parameters prev_token from various functions remove parameter prev_token unused in tgsi_build_instruction_label tgsi_build_instruction_texture tgsi_build_instruction_memory tgsi_build_texture_offset This fixes the following warnings: tgsi/tgsi_build.c: In function 'tgsi_build_instruction_label': tgsi/tgsi_build.c:716:24: warning: unused parameter 'prev_token' [-Wunused-parameter] struct tgsi_token prev_token, ^~~~~~~~~~ tgsi/tgsi_build.c: In function 'tgsi_build_instruction_texture': tgsi/tgsi_build.c:749:23: warning: unused parameter 'prev_token' [-Wunused-parameter] struct tgsi_token prev_token, ^~~~~~~~~~ tgsi/tgsi_build.c: In function 'tgsi_build_instruction_memory': tgsi/tgsi_build.c:784:23: warning: unused parameter 'prev_token' [-Wunused-parameter] struct tgsi_token prev_token, ^~~~~~~~~~ tgsi/tgsi_build.c: In function 'tgsi_build_texture_offset': tgsi/tgsi_build.c:819:23: warning: unused parameter 'prev_token' [-Wunused-parameter] struct tgsi_token prev_token, ^~~~~~~~~~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	f06194b012	gallium/aux/tgsi_exec.c: Fix various -Wsign-compare tgsi/tgsi_exec.c: In function 'exec_tex': tgsi/tgsi_exec.c:2254:46: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] assert(shadow_ref >= dim && shadow_ref < ARRAY_SIZE(args)); ^ ./util/u_debug.h:189:30: note: in definition of macro 'debug_assert' #define debug_assert(expr) ((expr) ? (void)0 : _debug_assert_fail(#expr, __FILE__, __LINE__, __FUNCTION__)) ^~~~ tgsi/tgsi_exec.c:2254:7: note: in expansion of macro 'assert' assert(shadow_ref >= dim && shadow_ref < ARRAY_SIZE(args)); ^~~~~~ tgsi/tgsi_exec.c:2290:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = dim; i < ARRAY_SIZE(args); i++) ^ In file included from ./util/u_memory.h:39:0, from tgsi/tgsi_exec.c:62: tgsi/tgsi_exec.c: In function 'exec_lodq': tgsi/tgsi_exec.c:2357:15: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] assert(dim <= ARRAY_SIZE(coords)); ^ ./util/u_debug.h:189:30: note: in definition of macro 'debug_assert' #define debug_assert(expr) ((expr) ? (void)0 : _debug_assert_fail(#expr, __FILE__, __LINE__, __FUNCTION__)) ^~~~ tgsi/tgsi_exec.c:2357:4: note: in expansion of macro 'assert' assert(dim <= ARRAY_SIZE(coords)); ^~~~~~ tgsi/tgsi_exec.c:2363:20: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = dim; i < ARRAY_SIZE(coords); i++) { ^ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	a7cbb9ba46	gallium/aux/tgsi_exec.c: remove superfluous parameter from fetch_source_d Remove unused parameter src_datatype from fetch_source_d, fixes warning: tgsi/tgsi_exec.c: In function 'fetch_source_d': tgsi/tgsi_exec.c:1594:40: warning: unused parameter 'src_datatype' [-Wunused-parameter] enum tgsi_exec_datatype src_datatype) ^~~~~~~~~~~~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	5fe1b3b848	gallium/aux/tgsi_exec.c: remove superfluous parameter from store_dest_dstret remove unused parameter inst from store_dest_dstret (and consequently also from store_dest_double), fixes warning: tgsi/tgsi_exec.c: In function 'store_dest_dstret': tgsi/tgsi_exec.c:1765:47: warning: unused parameter 'inst' [-Wunused-parameter] const struct tgsi_full_instruction *inst) ^~~~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	c9b53c6410	gallium/aux/tgsi_exec.c: Remove unused parameter from fetch_src_file_channel remove unused parameter chan_index from fetch_src_file_channel, fixes warning: tgsi/tgsi_exec.c: In function 'fetch_src_file_channel': tgsi/tgsi_exec.c:1480:35: warning: unused parameter 'chan_index' [-Wunused-parameter] const uint chan_index, ^~~~~~~~~~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	38a9b42d8e	gallium/aux/tgsi_exec.c: Remove parameter inst from exec_kill Fixes warning: tgsi/tgsi_exec.c: In function 'exec_kill': tgsi/tgsi_exec.c:2049:47: warning: unused parameter 'inst' [-Wunused-parameter] const struct tgsi_full_instruction *inst) ^~~~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	b8fca73e47	gallium/aux/tgsi_aa_point.c: Fix -Wsign-compare warnings tgsi/tgsi_aa_point.c:32:0: tgsi/tgsi_aa_point.c: In function 'aa_decl': ./util/u_math.h:660:29: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) ^ tgsi/tgsi_aa_point.c:76:21: note: in expansion of macro 'MAX2' ts->num_tmp = MAX2(ts->num_tmp, decl->Range.Last + 1); ^~~~ ./util/u_math.h:660:40: warning: signed and unsigned type in conditional expression [-Wsign-compare] #define MAX2( A, B ) ( (A)>(B) ? (A) : (B) ) ^ tgsi/tgsi_aa_point.c:76:21: note: in expansion of macro 'MAX2' ts->num_tmp = MAX2(ts->num_tmp, decl->Range.Last + 1); ^~~~ tgsi/tgsi_aa_point.c: In function 'aa_inst': tgsi/tgsi_aa_point.c:220:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] dst->Register.Index == ts->color_out) { Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	09b3b37b95	gallium/aux/tgsi_sanity.c: Fix -Wsign-compare warnings tgsi_sanity.c: In function 'iter_instruction': tgsi_sanity.c:316:29: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (ctx->index_of_END != ~0) { ^~ tgsi_sanity.c: In function 'epilog': tgsi_sanity.c:488:26: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (ctx->index_of_END == ~0) { Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	bf6b695a90	gallium/aux/tgsi/tgsi_parse.c: Fix two warnings tgsi_parse.c: In function 'tgsi_parse_free': tgsi_parse.c:54:31: warning: unused parameter 'ctx' [-Wunused-parameter] struct tgsi_parse_context *ctx ) ^~~ tgsi_parse.c: In function 'tgsi_parse_end_of_tokens': tgsi_parse.c:62:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] return ctx->Position >= Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	fc9e259e58	gallium/aux/tgsi/tgsi_dump.c: Fix -Wsign-compare warnings tgsi_dump.c: In function 'iter_property': tgsi_dump.c:443:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < prop->Property.NrTokens - 1; ++i) { ^ tgsi_dump.c:459:13: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (i < prop->Property.NrTokens - 2) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	03ac9708cf	gallium/aux/cso_cache: Fix various warnings cso_cache.c: In function 'delete_blend_state': cso_cache/cso_cache.c:90:51: warning: unused parameter 'data' [-Wunused-parameter] static void delete_blend_state(void state, void data) ^~~~ cso_cache/cso_cache.c: In function 'delete_depth_stencil_state': cso_cache/cso_cache.c:98:59: warning: unused parameter 'data' [-Wunused-parameter] static void delete_depth_stencil_state(void state, void data) ^~~~ cso_cache/cso_cache.c: In function 'delete_sampler_state': cso_cache/cso_cache.c:106:53: warning: unused parameter 'data' [-Wunused-parameter] static void delete_sampler_state(void state, void data) ^~~~ cso_cache/cso_cache.c: In function 'delete_rasterizer_state': cso_cache/cso_cache.c:114:56: warning: unused parameter 'data' [-Wunused-parameter] static void delete_rasterizer_state(void state, void data) ^~~~ cso_cache/cso_cache.c: In function 'delete_velements': cso_cache/cso_cache.c:122:49: warning: unused parameter 'data' [-Wunused-parameter] static void delete_velements(void state, void data) ^~~~ cso_cache/cso_cache.c: In function 'sanitize_cb': cso_cache/cso_cache.c:166:52: warning: unused parameter 'user_data' [-Wunused-parameter] int max_size, void user_data) ^~~~~~~~~ gallium/aux/cso_context.c: a -Wunused-parameter warning cso_cache/cso_context.c: In function 'delete_sampler_state': cso_cache/cso_context.c:163:57: warning: unused parameter 'ctx' [-Wunused-parameter] static boolean delete_sampler_state(struct cso_context ctx, void *state) ^~~ Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-20 11:08:28 +02:00
Gert Wollny	81e5bf3cfe	configure.ac: Add CFLAG -Wno-missing-field-initializers (v5) This warning is misleading: When a struct is partially initialized without assigning to the structure members by name, then the remaining fields will be zeroed out, and this warning will be issued (if enabled). If, on the other hand, the partial initialization is done by assigning to named members, the remaining structure elements may hold random data, but the warning is not issued. Since in Mesa the first approach to initialize structure elements is used very often, and it is usually assumed that the remaining elements are zeroed out, heeding this warning would be counter-productive. v2: - add -Wno-missing-field-initializers to meson-build - fix empty line error (both Eric Engestrom) v3: * check for -Wmissing-field-initializers warning and then disable it because gcc and clang always accept -Wno-* (Dylan Baker) * Also disable this warning for C++ v4: * meson.build add -Wno-missing-field-initializers to c_args instead of no_override_init_args (Eric Engestrom) v5: * configure.ac: Correct copy/paste error with CFLAGS/CXXFLAGS Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v2) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Gert Wollny <gert.wollny@collabora.com>	2018-06-20 11:08:28 +02:00
Samuel Pitoiset	916dda5cf7	radv: remove unnecessary code around CACHE_FLUSH_AND_INV_TS_EVENT AMDVLK also always uses CACHE_FLUSH_AND_INV_TS_EVENT. The other workaround is to flush DB metadata after emitting the framebuffer, but that seems slower. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-20 10:08:37 +02:00
Bas Nieuwenhuizen	4705a5dfda	radv: Fix flush_bits being used uninitialized. A case of making things worse while trying to fix something minor ... Fixes: `ef79457004` "radv: Merge the flush bits of CMASK & DCC clear." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-20 10:02:39 +02:00
Keith Packard	dbac8e25f8	radv: Add EXT_acquire_xlib_display to radv driver [v2] This extension adds the ability to borrow an X RandR output for temporary use directly by a Vulkan application to the radv driver. v2: Simplify addition of VK_USE_PLATFORM_XLIB_XRANDR_KHR to vulkan_wsi_args Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	46090a642d	anv: Add EXT_acquire_xlib_display to anv driver [v3] This extension adds the ability to borrow an X RandR output for temporary use directly by a Vulkan application to the anv driver. v2: Simplify addition of VK_USE_PLATFORM_XLIB_XRANDR_KHR to vulkan_wsi_args Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> v3: Add extension to list in alphabetical order Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	7ab1fffcd2	vulkan: Add EXT_acquire_xlib_display [v5] This extension adds the ability to borrow an X RandR output for temporary use directly by a Vulkan application. For DRM, we use the Linux resource leasing mechanism. v2: Clean up xlib_lease detection * Use separate temporary '_xlib_lease' variable to hold the option value to avoid changing the type of a variable. * Use boolean expressions instead of additional if statements to compute resulting with_xlib_lease value. * Simplify addition of VK_USE_PLATFORM_XLIB_XRANDR_KHR to vulkan_wsi_args Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> Move mode list from wsi_display to wsi_display_connector Fix scope for wsi_display_mode and wsi_display_connector allocs Suggested-by: Jason Ekstrand <jason@jlekstrand.net> v3: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Explicitly forbid multiple DRM leases. Making the code support this looks tricky and will require additional thought. Use xcb_randr_output_t throughout the internals of the implementation. Convert at the public API (wsi_get_randr_output_display). Clean up check for usable active_crtc (possible when only the desired output is connected to the crtc). Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v4: Move output resource fetching closer to use in wsi_display_get_output. This simplifies the error returns in earlier parts of the code a bit. Return VK_ERROR_INITIALIZATION_FAILED from wsi_acquire_xlib_display. Jason says this is the right error message. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v5: randr doesn't pass vscan over the wire, so we set vscan to 0 for randr-acquired modes, and test wsi modes for vscan <= 1 when comparing against randr modes. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	5a2efefb0a	radv: Add EXT_direct_mode_display to radv driver Add support for the EXT_direct_mode_display extension. This just provides the vkReleaseDisplayEXT function. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	f89d3874fb	anv: Add EXT_direct_mode_display to anv driver [v2] Add support for the EXT_direct_mode_display extension. This just provides the vkReleaseDisplayEXT function. v2: Add extension to list in alphabetical order Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	352d320a07	vulkan: Add EXT_direct_mode_display [v2] Add support for the EXT_direct_mode_display extension. This just provides the vkReleaseDisplayEXT function. v2: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	451b58a51e	radv: Add KHR_display extension to radv [v5] This adds support for the KHR_display extension to the radv Vulkan driver. The driver now attempts to open the master DRM node when the KHR_display extension is requested so that the common winsys code can perform the necessary operations. v2: * Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to vulkan_wsi_args Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> v3: Adapt to new wsi_device_init API (added display_fd) v4: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v5: Add vkCreateDisplayModeKHR. This doesn't actually create new modes, it only looks to see if the requested parameters match an existing mode and returns that. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	54d0daa481	anv: Add KHR_display extension to anv [v7] This adds support for the KHR_display extension to the anv Vulkan driver. The driver now attempts to open the master DRM node when the KHR_display extension is requested so that the common winsys code can perform the necessary operations. v2: Make sure primary fd is usable When KHR_display is selected, we try to open the primary node instead of the render node in case the user wants to use KHR_display for presentation. However, if we're actually going to end up using RandR leases, then we don't care if the resulting fd can't be used for display, but the kernel also prevents us from using it for drawing when someone else has master. v3: Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to vulkan_wsi_args Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> v4: Adapt primary node usage to new wsi_device_init API v5: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Remove spurious MM_PER_PIXEL define Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v6: Open DRM master before initializing WSI layer. The DRM master FD is passed to the WSI layer during initialization, so we need to open the device slightly earlier in the function. Close DRM master in device_finish. Use anv_gem_get_param to detect working master_fd instead of directly using the ioctl. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v7: Add vkCreateDisplayModeKHR. This doesn't actually create new modes, it only looks to see if the requested parameters match an existing mode and returns that. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-19 14:17:46 -07:00
Keith Packard	da997ebec9	vulkan: Add KHR_display extension using DRM [v10] This adds support for the KHR_display extension to the vulkan WSI layer. Driver support will be added separately. v2: * fix double ;; in wsi_common_display.c * Move mode list from wsi_display to wsi_display_connector * Fix scope for wsi_display_mode and wsi_display_connector allocs * Switch all allocations to vk_zalloc instead of vk_alloc. * Fix DRM failure in wsi_display_get_physical_device_display_properties When DRM fails, or when we don't have a master fd (presumably due to application errors), just return 0 properties from this function, which is at least a valid response. * Use vk_outarray for all property queries This is a bit less error-prone than open-coding the same stuff. * Remove VK_COMPOSITE_ALPHA_INHERIT_BIT_KHR from surface caps Until we have multi-plane support, we shouldn't pretend to have any multi-plane semantics, even if undefined. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> * Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to vulkan_wsi_args Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> v3: Add separate 'display_fd' and 'render_fd' arguments to wsi_device_init API. This allows drivers to use different FDs for the different aspects of the device. Use largest mode as display size when no preferred mode. If the display doesn't provide a preferred mode, we'll assume that the largest supported mode is the "physical size" of the device and report that. v4: Make wsi_image_state enumeration values uppercase. Follow more common mesa conventions. Remove 'render_fd' from wsi_device_init API. The wsi_common_display code doesn't use this fd at all, so stop passing it in. This avoids any potential confusion over which fd to use when creating display-relative object handles. Remove call to wsi_create_prime_image which would never have been reached as the necessary condition (use_prime_blit) is never set. 
whitespace cleanups in wsi_common_display.c Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Add depth/bpp info to available surface formats. Instead of hard-coding depth 24 bpp 32 in the drmModeAddFB call, use the requested format to find suitable values. Destroy kernel buffers and FBs when swapchain is destroyed. We were leaking both of these kernel objects across swapchain destruction. Note that wsi_display_wait_for_event waits for anything to happen. wsi_display_wait_for_event is simply a yield so that the caller can then check to see if the desired state change has occurred. Record swapchain failures in chain for later return. If some asynchronous swapchain activity fails, we need to tell the application eventually. Record the failure in the swapchain and report it at the next acquire_next_image or queue_present call. Fix error returns from wsi_display_setup_connector. If a malloc failed, then the result should be VK_ERROR_OUT_OF_HOST_MEMORY. Otherwise, the associated ioctl failed and we're either VT switched away, or our lease has been revoked, in which case we should return VK_ERROR_OUT_OF_DATE_KHR. Make sure both sides of if/else brace use matches Note that we assume drmModeSetCrtc is synchronous. Add a comment explaining why we can idle any previous displayed image as soon as the mode set returns. Note that EACCES from drmModePageFlip means VT inactive. When vt switched away drmModePageFlip returns EACCES. Poll once a second waiting until we get some other return value back. Clean up after alloc failure in wsi_display_surface_create_swapchain. Destroy any created images, free the swapchain. Remove physical_device from wsi_display_init_wsi. We never need this value, so remove it from the API and from the internal wsi_display structure. Use drmModeAddFB2 in wsi_display_image_init. This takes a drm format instead of depth/bpp, which provides more control over the format of the data. 
v5: Set the 'currentStackIndex' member of the VkDisplayPlanePropertiesKHR record to zero, instead of indexing across all displays. This value is the stack depth of the plane within an individual display, and as the current code supports only a single plane per display, should be set to zero for all elements Discovered-by: David Mao <David.Mao@amd.com> v6: Remove 'platform_display' bits from the build and use the existing 'platform_drm' instead. v7: Ensure VK_ICD_WSI_PLATFORM_MAX is large enough by setting to VK_ICD_WSI_PLATFORM_DISPLAY + 1 v8: Simplify wsi_device_init failure from wsi_display_init_wsi by using the same pattern as the other wsi layers. Adopt Jason Ekstrand's white space and variable declaration suggestions. Declare variables at first use, eliminate extra whitespace between types and names, add list iterator helpers, switch to lower-case list_ macros. Respond to Jason's April 8 review: * Create a function to convert relative to absolute timeouts to catch overflow issues in one place * use VK_NULL_HANDLE to clear prop->currentDisplay * Get rid of available_present_modes array. * return OUT_OF_DATE_KHR when display_queue_next called after display has been released. * Make errors from mode setting fatal in display_queue_next * Remove duplicate pthread_mutex_init call * Add wsi_init_pthread_cond_monotonic helper function to isolate pthread error handling from wsi_display_init_wsi Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v9: Fix vscan handling by using MAX2(vscan, 1) everywhere. Vscan can be zero anywhere, which is treated the same as 1. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> v10: Respond to Vulkan CTS failures. 1. Initialize planeReorderPossible in display_properties code 2. Only report connected displays in get_display_plane_supported_displays 3. Return VK_ERROR_OUT_OF_HOST_MEMORY when pthread cond initialization fails. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> 4. Add vkCreateDisplayModeKHR. 
This doesn't actually create new modes, it only looks to see if the requested parameters match an existing mode and returns that. Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Keith Packard <keithp@keithp.com>	2018-06-19 14:17:46 -07:00
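The v8 notes for da997ebec9 mention a single helper that converts relative timeouts to absolute ones so that overflow is caught in one place. A minimal sketch of that idea follows, in Python for brevity (Mesa itself is C). The function name, and the choice to saturate at the 64-bit maximum, are assumptions made for illustration and are not taken from the commit.

```python
UINT64_MAX = 2**64 - 1  # largest value a 64-bit unsigned timeout can hold


def rel_to_abs_timeout(now_ns: int, rel_ns: int) -> int:
    """Hypothetical helper: return now_ns + rel_ns as an absolute deadline.

    If the sum would not fit in 64 unsigned bits, saturate at UINT64_MAX
    instead of wrapping around to a deadline in the past.
    """
    if rel_ns > UINT64_MAX - now_ns:
        return UINT64_MAX
    return now_ns + rel_ns
```

Keeping the check in one function means callers never add two 64-bit values directly, so no call site has to repeat the overflow test.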
Bas Nieuwenhuizen	ef79457004	radv: Merge the flush bits of CMASK & DCC clear. Probably won't be much different in practice, but still wrong. Fixes Coverity issue 1435002. Not CC'ing to stable since this is only hit if you enable MSAA DCC via RADV_DEBUG. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-19 22:35:13 +02:00
Bas Nieuwenhuizen	ed06b1cdca	radv: Don't check for pipeline being set in draw. Draws without pipeline are definitely not allowed. Fixes Coverity issue 1434216. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-19 22:35:13 +02:00
Marek Olšák	1ba87f4438	radeonsi: rename r600_texture -> si_texture, rxxx -> xxx or sxxx Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-19 13:08:50 -04:00
Marek Olšák	6703fec58c	amd,radeonsi: rename radeon_winsys_cs -> radeon_cmdbuf Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-19 13:08:50 -04:00
Rob Clark	39b4fdc45f	freedreno/a5xx: move emit_marker5() into a5xx backend The scratch registers move again in a6xx.. so for post-a4xx let's just move this into the backend, and move the one place it used to be needed in core into fd5_emit_ib(). For a6xx we will do similar, calling emit_marker6() from fd6_emit_ib(). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	0c8d9e923a	freedreno/a5xx: fix crash in dEQP-GLES31.stress.vertex_attribute_binding.buffer_bounds.bind_vertex_buffer_offset_near_wrap_10 This is kind of a hack, but really the only problem is the debug_assert() in OUT_RELOC(). But the debug_assert() is useful to catch real issues. So just add some #ifdef DEBUG code to filter things out before we hit the assert. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	4a41b02d46	freedreno/a5xx: don't crash if compute shader compile fails It is impolite, and a bit annoying with dEQP (all tests running in single process). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	658f1f6003	freedreno/ir3: fix missing recursion into block condition Fixes a problem seen with dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.row_major_mat4 Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	1a6150207c	freedreno/a5xx: better FOUR_QUAD/TWO_QUAD decision for compute If we aren't going to get full occupancy, then use TWO_QUAD. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	f07154421a	freedreno/a5xx: bordercolor fixes Need a bit of hand-holding for stencil bordercolor, and add border color values for sRGB. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	ced14f1c7a	freedreno: remove per-stateobj dirty_mask's These never got updated in fd_context_all_dirty() so actually trying to rely on them (in the case of fd5_emit_images()) ends up in some cases where state is not emitted but should be. Best to just rip this out. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	5708440597	freedreno/a5xx: remove one image stateblock I think this ends up just setting uniform/const memory. But we upload x/y/z stride differently. At best this is unneeded, at worst it could possibly clobber other uniform/const memory. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	e0c6135625	freedreno/a5xx: cubemap image fixes Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	0bb0cac8dc	freedreno/ir3: handle image buffer Similar to txf case, we need to insert a 2nd coordinate (zero). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	d1d2b13518	freedreno/ir3: handle arrays of images Unlike textures, this doesn't get lowered for us. (Would be nice if they were.. at least until we are ready to deal w/ indirect indexing..) Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	5b2ef78532	freedreno/ir3: images can be arrays too Seems I previously totally forgot about 2d-arrays, etc.. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	f489fa1f3f	freedreno/ir3: use move_load_const pass Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-19 13:02:28 -04:00
Rob Clark	7235c144a6	nir: add pass to move load_const Run this pass late (after opt loop) to move load_const instructions back into the basic blocks which use the result, in cases where a load_const is only consumed in a single block. This helps reduce register usage in cases where the backend driver cannot lower the load_const to a uniform. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-19 13:02:28 -04:00
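The rule described in 7235c144a6 (move a load_const into the block that uses it when exactly one block consumes it) can be illustrated with a toy model. This is a sketch only: the tuple-based instruction format, the function name, and placing moved constants at the top of the consuming block are all assumptions for illustration and do not reflect NIR's actual data structures or the pass's real placement logic.

```python
from collections import defaultdict


def move_load_consts(blocks):
    """Toy model: blocks maps a block name to a list of (dest, op, srcs) tuples.

    Returns a new mapping in which every load_const whose value is read in
    exactly one other block has been moved to the start of that block.
    """
    # Record which block defines each constant.
    def_block = {}
    for name, instrs in blocks.items():
        for dest, op, _srcs in instrs:
            if op == "load_const":
                def_block[dest] = name

    # Record the set of blocks that read each constant.
    users = defaultdict(set)
    for name, instrs in blocks.items():
        for _dest, _op, srcs in instrs:
            for src in srcs:
                if src in def_block:
                    users[src].add(name)

    kept = {name: [] for name in blocks}
    moved = defaultdict(list)
    for name, instrs in blocks.items():
        for instr in instrs:
            dest, op, _srcs = instr
            if op == "load_const":
                consumers = users[dest]
                if len(consumers) == 1 and name not in consumers:
                    (target,) = consumers
                    moved[target].append(instr)
                    continue
            kept[name].append(instr)

    # Moved constants go first so they are defined before any use.
    return {name: moved[name] + kept[name] for name in blocks}
```

In this model a constant defined in an entry block but read only inside a later block no longer occupies a register across the blocks in between, which is the register-pressure benefit the commit message describes.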
Rob Clark	c9d6e579ec	mesa/st/nir: fix driver_location for arrays of image/sampler We can have arrays of images or samplers. But I forgot to handle that case long ago. Surprised no one complained yet. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-19 13:02:28 -04:00
Rob Clark	228457234c	nir: add comment for loop_unroll pass Save the next person from digging through the code to figure out what the indirect_mask parameter actually does. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-19 13:02:28 -04:00
Rob Clark	e3bbc1eaf4	glsl: fix random typo Just something I stumbled across. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-19 13:02:28 -04:00
Marek Olšák	dfeb61c5cf	radeonsi: ignore PIPE_RESOURCE_FLAG_MAP_COHERENT We treat coherent and non-coherent buffers the same. And move external_usage for better packing. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Marek Olšák	9322974ec7	radeonsi: always put persistent buffers into GTT on radeon This improves performance for certain games. Cc: 18.1 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Marek Olšák	ffbbc008be	radeonsi: fix si_get_num_queries for radeon Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Marek Olšák	94b29763a4	radeonsi: don't expose performance counters for non-existent blocks Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Marek Olšák	a2451a4c23	ac/gpu_info: add radeon_info::num_tcc_blocks The values for the radeon winsys were copied from the kernel driver. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Marek Olšák	166c00e28e	radeonsi: set a better NUM_PATCHES hard limit AMDVLK uses 64 (distributed) and 16 (non-distributed). radeonsi will use 63 and 16. * This might improve tessellation performance on Hawaii, Bonaire, Tahiti, Pitcairn. (they will use 16) * I'm not sure if this matters for 1 SE configs. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Marek Olšák	0d685ba290	radeonsi: make sure LS-HS vector lanes are reasonably occupied Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Marek Olšák	e93fe403bc	radeonsi: properly compute an LS-HS thread group size limit "64 / max * 4" is less than "64 * 4 / max". Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
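The one-line rationale in e93fe403bc is about operand order under integer division: dividing first discards the remainder before it can be scaled. A small sketch, in Python with `//` standing in for C's unsigned integer division; the function names and sample values are illustrative assumptions, not the driver's code.

```python
def limit_divide_first(max_size: int) -> int:
    """The "64 / max * 4" form: the remainder of 64 / max is lost, then scaled."""
    return 64 // max_size * 4


def limit_multiply_first(max_size: int) -> int:
    """The "64 * 4 / max" form: truncation happens once, at the very end."""
    return 64 * 4 // max_size
```

For a value of 48 the first form gives 4 and the second gives 5. The two agree whenever the value divides 64 evenly, and the divide-first form is never the larger of the two.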
Eric Anholt	da0115b1c3	v3d: Fix blitting from a linear winsys BO. This is the case for the simulator environment, and broke many blitter tests by trying to texture from linear while the HW can only actually do UIF/UBLINEAR/LT. Just make a temporary and copy into it with the CPU, then blit from that. This is the kind of path that should use the TFU, but I haven't exposed that hardware yet. Fixes dEQP-GLES3.functional.fbo.blit.default_framebuffer.*	2018-06-19 09:42:20 -07:00
Eric Anholt	07b243674f	v3d: Add missing always_flush debug flag. The #define existed and was checked in the driver.	2018-06-19 09:42:20 -07:00
Tomeu Vizoso	9b1cb50ba4	virgl: Remove debugging left-overs Some fprintfs were probably left unintentionally a few years ago and are a bit of a nuisance. Fixes: `2d3301e4d5` ("virgl: fix reference counting of prime handles") Cc: Rob Herring <robh@kernel.org> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-19 13:35:13 +02:00
Timothy Arceri	6c243ac2dd	glsl: fix desktop glsl linking regression The prog->Shaders[i]->IsES check was accidentally removed causing ES linking rules to be applied to desktop GLSL. Fixes: `725b1a406d` ("mesa/util: add allow_glsl_relaxed_es driconfig override")	2018-06-19 17:58:05 +10:00
Timothy Arceri	a9114b5e3e	util: add allow_glsl_relaxed_es to drirc for Google Earth VR Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-19 12:09:56 +10:00
Timothy Arceri	725b1a406d	mesa/util: add allow_glsl_relaxed_es driconfig override This relaxes a number of ES shader restrictions allowing shaders to follow more desktop GLSL like rules. This initial implementation relaxes the following: - allows linking ES shaders with desktop shaders - allows mismatching precision qualifiers - always enables standard derivative builtins These relaxations allow Google Earth VR shaders to compile. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-19 12:09:56 +10:00
Timothy Arceri	781c23ece6	util: add allow_glsl_builtin_const_expression to drirc for Google Earth VR Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-19 12:09:56 +10:00
Timothy Arceri	90dbab0f9a	mesa/util: add allow_glsl_builtin_const_expression driconf override Google Earth VR shaders use builtins in constant expressions with GLSL 1.10. That feature wasn't allowed until GLSL 1.20. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-19 12:09:56 +10:00
Timothy Arceri	de93f546a7	util: manually extract the program name from program_invocation_name Glibc has the same code to get program_invocation_short_name. However for some reason the short name gets mangled for some wine apps. For example with Google Earth VR I get: program_invocation_name: "/home/tarceri/.local/share/Steam/steamapps/common/EarthVR/Earth.exe" program_invocation_short_name: "e" Acked-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-19 12:09:56 +10:00
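As de93f546a7 notes, the short program name is the part of program_invocation_name after the last slash. A minimal sketch of that extraction, written in Python rather than Mesa's C; the function name is hypothetical and the commit's actual implementation is not shown here.

```python
def short_program_name(invocation_name: str) -> str:
    """Return the text after the last '/', or the whole string if there is none."""
    _head, _sep, tail = invocation_name.rpartition("/")
    return tail
```

Applied to the path quoted in the commit message, this yields `Earth.exe` rather than the mangled `e`.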
Bas Nieuwenhuizen	1a8501a9dd	ac/surface: Set compressZ for stencil-only surfaces. We HTILE compress stencil-only surfaces too. CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-19 02:52:01 +02:00
Jason Ekstrand	0146d79636	anv: Use a single global API patch version The Vulkan API has only one patch version shared among all of the major.minor versions. We should also advertise the same patch version regardless of major.minor. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106941 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-18 17:11:52 -07:00
Timothy Arceri	68bf94a8b0	radeonsi: enable OpenGL 3.3 compat profile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-19 09:21:33 +10:00
Timothy Arceri	89a5d6f715	mesa: add ff fragment shader support for geom and tess shaders This is required for compatibility profile support. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-06-19 09:21:33 +10:00
Eric Anholt	e636199c1c	v3d: Set the SO offsets correctly if we have to re-emit. This should fix TF across a glFlush() or TF pause/restart. Fixes dEQP-GLES3.functional.transform_feedback.array.interleaved.lines.highp_float and many, many others.	2018-06-18 14:54:16 -07:00
Marek Olšák	94178044d5	gallium/hud: = should rename the last added data source Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-18 17:53:15 -04:00
Rafael Antognolli	ba2c18763b	anv: Disable constant buffer 0 being relative. If we are on gen8+ and have context isolation support, just make that constant buffer address be absolute, so we can use it for push UBOs too. v2: Do not duplicate constant_buffer_0_is_relative flag (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-18 14:41:38 -07:00
Rafael Antognolli	be18d5a0ce	anv/device: Check for kernel support of context isolation. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-18 14:41:38 -07:00
Rafael Antognolli	056214ebfc	intel/genxml: Add bitmasks for CS_DEBUG_MODE2/INSTPM. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-18 14:41:38 -07:00
Alok Hota	a678f40e46	swr/rast: Clang-Format most rasterizer source code Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-06-18 13:57:38 -05:00
Eric Engestrom	d85fef1e34	radv: fix reported number of available VGPRs It's a bit late to round up after an integer division. Fixes: `de88979413` "radv: Implement VK_AMD_shader_info" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Alex Smith <asmith@feralinteractive.com>	2018-06-18 17:08:22 +01:00
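The remark in d85fef1e34 that it is "a bit late to round up after an integer division" can be shown with plain arithmetic: once the quotient has been truncated, a later round-up has nothing left to act on. The sketch below uses Python, with made-up operand values and hypothetical function names; it does not reproduce the radv code.

```python
import math


def round_up_too_late(total: int, divisor: int) -> int:
    """Integer division truncates first, so ceil() sees a whole number (a no-op)."""
    return math.ceil(total // divisor)


def round_up_div(total: int, divisor: int) -> int:
    """Round up as part of the division itself (valid for positive operands)."""
    return (total + divisor - 1) // divisor
```

With 256 and 10 as sample operands, the first form returns 25 while the second returns 26.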
Eric Engestrom	9a4bd6b45f	mesa: add missing return in error path Fixes: `67f40dadaa` "mesa: add support for ARB_sample_locations" Cc: Rhys Perry <pendingchaos02@gmail.com> Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-06-18 16:19:48 +01:00
Bas Nieuwenhuizen	a3d93eec7c	radv: Use less conservative approximation for context rolls. Drops the number of times we set the scissor by 4x for F1 2017, which results in a consistent performance improvement of about 4%. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-18 16:21:10 +02:00
Eric Engestrom	4d08c1e7d1	radv: fix bitwise check Fixes: `922cd38172` "radv: implement out-of-order rasterization when it's safe on VI+" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-18 12:15:18 +01:00
Eric Engestrom	e8eb84826e	meson: fix i965/anv/isl genX static lib names Shouldn't make any functional difference, just that `liblibanv_gen90.a` will now be called `libanv_gen90.a`. Fixes: `3218056e0e` "meson: Build i965 and dri stack" Fixes: `d1992255bb` "meson: Add build Intel "anv" vulkan driver" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-18 12:03:24 +01:00
Timothy Arceri	66673bef94	mesa: Unconditionally enable floating-point textures ARB_texture_float references US Patent #6,650,327 [1] which has a filing date of June 16 1998. According to [2], patents filed after 1995 expire 20 years from the filing date, giving an expiration of June 17 2018. [1] https://www.google.com/patents/US6650327 [2] https://en.wikipedia.org/wiki/Term_of_patent_in_the_United_States Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-18 09:29:38 +10:00
Jose Maria Casanova Crespo	b8e099e7d5	intel/fs: shuffle_64bit_data_for_32bit_write is not used anymore Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	a4965842d6	intel/fs: Use new shuffle_32bit_write for all 64-bit storage writes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	a4d445b93c	intel/fs: shuffle_32bit_load_result_to_64bit_data is not used anymore Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	71b319a285	intel/fs: Use shuffle_from_32bit_read for 64-bit FS load_input The previous use of shuffle_32bit_load_result_to_64bit_data had a source/destination overlap for 64-bit. Now a temporary destination is used for 64-bit cases, because shuffle_from_32bit_read doesn't handle src/dst overlaps. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	8003ae87f4	intel/fs: shuffle_from_32bit_read at load_per_vertex_input at TCS/TES Previously, the shuffle function had a source/destination overlap that needs to be avoided to use shuffle_from_32bit_read. We can use the destination of the removed MOVs as the shuffle destination. This change also avoids the internal MOVs done by the previous shuffle to deal with possible overlaps. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	5565630f85	intel/fs: Use shuffle_from_32bit_read at VS load_input shuffle_from_32bit_read manages 32-bit reads to 32-bit destination in the same way that the previous loop so now we just call the new function for all bitsizes, simplifying also the 64-bit load_input. v2: Add comment about future 16-bit support (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	152bffb69b	intel/fs: Use shuffle_from_32bit_read for 64-bit gs_input_load This implementation avoids two unneeded MOVs for each 64-bit component. One was done in the old shuffle, to avoid cases of src/dst overlap but this is not the case. And the removed MOV was already being done in the shuffle. Copy propagation wasn't able to remove them because shuffle destination values are defined with partial writes because they have stride == 2. v2: Reword commit log summary (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	8b26a2d96d	intel/fs: shuffle_from_32bit_read for 64-bit do_untyped_vector_read do_untyped_vector_read is used at load_ssbo and load_shared. The previous MOVs are removed because shuffle_from_32bit_read can handle storing the shuffle results in the expected destination just using the proper offset. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	c2297bdf19	intel/fs: Remove old 16-bit shuffle/unshuffle functions Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	fd3d8a8f79	intel/fs: Use shuffle_for_32bit_write for 16-bits store_ssbo Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	20e4732f7d	intel/fs: Use shuffle_from_32bit_read to read 16-bit SSBO Using shuffle_from_32bit_read instead of 16-bit shuffle functions avoids the need of retype. At the same time the new functions are ready for 8-bit type SSBO reads. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	a0891eabca	intel/fs: Use shuffle_from_32bit_read at VARYING_PULL_CONSTANT_LOAD shuffle_from_32bit_read can manage the shuffle/unshuffle needed for different 8/16/32/64 bit-sizes at VARYING PULL CONSTANT LOAD. To get the specific component the first_component parameter is used. In the case of the previous 16-bit shuffle, the shuffle operation was generating unneeded MOVs whose results were never used. This behaviour passed unnoticed on SIMD16 because the dead_code_eliminate pass removed the generated instructions, but for SIMD8 they couldn't be removed because they were partial writes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	22c654941b	intel/fs: New shuffle_for_32bit_write and shuffle_from_32bit_read These new shuffle functions deal with the shuffle/unshuffle operations needed for read/write operations using 32-bit components when the read/written components have a different bit-size (8, 16, 64-bits). Shuffle from 32-bit to 32-bit becomes a simple MOV. shuffle_src_to_dst takes care of doing a shuffle when source type is smaller than destination type and an unshuffle when source type is bigger than destination. So these new read/write functions just need to call shuffle_src_to_dst assuming that writes use a 32-bit destination and reads use a 32-bit source. As shuffle_for_32bit_write/from_32bit_read components take components in unit of source/destination types and shuffle_src_to_dst takes units of the smallest type component, we adjust components and first_component parameters. To enable these new functions there must be no source/destination overlap in the case of shuffle_from_32bit_read. That never happens on shuffle_for_32bit_write as it allocates a new destination register, as shuffle_64bit_data_for_32bit_write did. v2: Reword commit log and add comments to explain why first_component and components parameters are adjusted. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Maria Casanova Crespo	a5665056e5	intel/fs: general 8/16/32/64-bit shuffle_src_to_dst function This new function takes care of shuffling/unshuffling components of a particular bit-size in components with a different bit-size. If source type size is smaller than destination type size the operation needed is a component shuffle. The opposite case would be an unshuffle. Component units are measured in terms of the smaller type between source and destination, as we are un/shuffling the smaller components from/into a bigger one. The operation allows skipping the first first_component components from the source. Shuffle MOVs are retyped using integer types, avoiding problems with denorms and float types if source and destination bitsize is different. This simplifies uses of shuffle functions that were dealing with these retypes individually. Now there is a new restriction: source and destination cannot overlap anymore when calling this shuffle function. Following patches that migrate to this new function will individually take care of avoiding source and destination overlaps. v2: (Jason Ekstrand) - Rewrite overlap asserts. - Manage type_sz(src.type) == type_sz(dst.type) case using MOVs from source to dest. This works for 64-bit to 64-bit operations on Gen7, which doesn't support Q registers. - Explain that component units are based on the smallest type. v3: - Fix unshuffle overlap assert (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-16 22:39:08 +02:00
Jose Fonseca	d882331f7a	appveyor: Consume LLVM 5.0.1. https://ci.appveyor.com/project/jrfonseca/mesa/build/47 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-06-16 18:09:20 +01:00
Bas Nieuwenhuizen	c4714f698b	ac: Clear meminfo to avoid valgrind warning. Somehow valgrind misses that the value is initialized by the ioctl. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-16 19:03:47 +02:00
Samuel Pitoiset	5917761e3d	radv: fix emitting the TCS regs on GFX9 The primitive ID is NULL and this generates an invalid select instruction which crashes because one operand is NULL. This fixes crashes in The Long Journey Home, Quantum Break and Just Cause 3 with DXVK. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106756 CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-16 10:18:51 +02:00
Ian Romanick	355868dbfc	nir: Document a couple instances of parent_instr nir_ssa_def::parent_instr and nir_src::parent_instr have the same name, but they mean really different things. I choose to save the next person the hour+ that I just spent figuring that out. Even now that I know, I doubt I'd notice in code review that someone typed foo->parent_instr when they actually meant foo->ssa->parent_instr. v2: Minor wording tweak in nir_ssa_def::parent_instr. Suggested by Jason. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-15 17:36:51 -07:00
Ian Romanick	4467040cb6	i965/fs: Propagate conditional modifiers from not instructions Skylake total instructions in shared programs: 14399081 -> 14399010 (<.01%) instructions in affected programs: 26961 -> 26890 (-0.26%) helped: 57 HURT: 0 helped stats (abs) min: 1 max: 6 x̄: 1.25 x̃: 1 helped stats (rel) min: 0.16% max: 0.80% x̄: 0.30% x̃: 0.18% 95% mean confidence interval for instructions value: -1.50 -0.99 95% mean confidence interval for instructions %-change: -0.35% -0.25% Instructions are helped. total cycles in shared programs: 532978307 -> 532976050 (<.01%) cycles in affected programs: 468629 -> 466372 (-0.48%) helped: 33 HURT: 20 helped stats (abs) min: 3 max: 360 x̄: 116.52 x̃: 98 helped stats (rel) min: 0.06% max: 3.63% x̄: 1.66% x̃: 1.27% HURT stats (abs) min: 2 max: 172 x̄: 79.40 x̃: 43 HURT stats (rel) min: 0.04% max: 3.02% x̄: 1.48% x̃: 0.44% 95% mean confidence interval for cycles value: -81.29 -3.88 95% mean confidence interval for cycles %-change: -1.07% 0.12% Inconclusive result (%-change mean confidence interval includes 0). All Gen6+ platforms, except Ivy Bridge, had similar results. (Haswell shown) total instructions in shared programs: 12973897 -> 12973838 (<.01%) instructions in affected programs: 25970 -> 25911 (-0.23%) helped: 55 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.07 x̃: 1 helped stats (rel) min: 0.16% max: 0.62% x̄: 0.28% x̃: 0.18% 95% mean confidence interval for instructions value: -1.14 -1.00 95% mean confidence interval for instructions %-change: -0.32% -0.24% Instructions are helped. 
total cycles in shared programs: 410355841 -> 410352067 (<.01%) cycles in affected programs: 578454 -> 574680 (-0.65%) helped: 47 HURT: 5 helped stats (abs) min: 3 max: 360 x̄: 85.74 x̃: 18 helped stats (rel) min: 0.05% max: 3.68% x̄: 1.18% x̃: 0.38% HURT stats (abs) min: 2 max: 242 x̄: 51.20 x̃: 4 HURT stats (rel) min: <.01% max: 0.45% x̄: 0.15% x̃: 0.11% 95% mean confidence interval for cycles value: -104.89 -40.27 95% mean confidence interval for cycles %-change: -1.45% -0.66% Cycles are helped. Ivy Bridge total instructions in shared programs: 11679351 -> 11679301 (<.01%) instructions in affected programs: 28208 -> 28158 (-0.18%) helped: 50 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.12% max: 0.54% x̄: 0.23% x̃: 0.16% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.27% -0.19% Instructions are helped. total cycles in shared programs: 257445362 -> 257444662 (<.01%) cycles in affected programs: 419338 -> 418638 (-0.17%) helped: 40 HURT: 3 helped stats (abs) min: 1 max: 170 x̄: 65.05 x̃: 24 helped stats (rel) min: 0.02% max: 3.51% x̄: 1.26% x̃: 0.41% HURT stats (abs) min: 2 max: 1588 x̄: 634.00 x̃: 312 HURT stats (rel) min: 0.05% max: 2.97% x̄: 1.21% x̃: 0.62% 95% mean confidence interval for cycles value: -97.96 65.41 95% mean confidence interval for cycles %-change: -1.56% -0.62% Inconclusive result (value mean confidence interval includes 0). No changes on Iron Lake or GM45. v2: Move 'if (cond != BRW_CONDITIONAL_Z && cond != BRW_CONDITIONAL_NZ)' check outside the loop. Suggested by Iago. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-15 17:22:27 -07:00
Ian Romanick	f2d8bb7a7b	i965/fs: Rearrange code to remove most of the gotos Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-15 17:22:27 -07:00
Ian Romanick	77f269bb56	i965/fs: Refactor propagation of conditional modifiers from compares to adds Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-15 17:22:27 -07:00
Ian Romanick	22f9fbc0d9	i965/vec4: Optimize OR with 0 into a MOV All of the affected shaders are geometry shaders... the same ones from the similar fs changes. The "No changes on any other platforms" comment below is not quite right. Without the previous change to register coalescing, this optimization caused quite a few regressions in tests that either used gl_ClipVertex or used different interpolation modes. I observed that with both patches applied, glsl-1.10/execution/interpolation/interpolation-none-gl_BackSecondaryColor-smooth-vertex.shader_test was one instruction shorter. I suspect other shaders would be similarly affected. Since this is all based on NOS, shader-db does not reflect it. Haswell total instructions in shared programs: 12954955 -> 12954918 (<.01%) instructions in affected programs: 3603 -> 3566 (-1.03%) helped: 37 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.21% max: 2.50% x̄: 1.99% x̃: 2.50% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -2.30% -1.69% Instructions are helped. total cycles in shared programs: 410012108 -> 410012098 (<.01%) cycles in affected programs: 3540 -> 3530 (-0.28%) helped: 5 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.28% max: 0.28% x̄: 0.28% x̃: 0.28% 95% mean confidence interval for cycles value: -2.00 -2.00 95% mean confidence interval for cycles %-change: -0.28% -0.28% Cycles are helped. Ivy Bridge total instructions in shared programs: 11679387 -> 11679351 (<.01%) instructions in affected programs: 3292 -> 3256 (-1.09%) helped: 36 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.21% max: 2.50% x̄: 2.04% x̃: 2.50% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -2.34% -1.74% Instructions are helped. No changes on any other platforms. 
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-15 17:22:27 -07:00
Ian Romanick	e6a9bd97b9	i965/vec4: Don't register coalesce into source of VS_OPCODE_UNPACK_FLAGS_SIMD4X2 This prevents regressions in a bunch of clipping and interpolation tests caused by the next patch (i965/vec4: Optimize OR with 0 into a MOV). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-15 17:22:27 -07:00
Ian Romanick	284b563fb0	i965/fs: Optimize OR with 0 into a MOV fs_visitor::set_gs_stream_control_data_bits generates some code like "control_data_bits | stream_id << ((2 * (vertex_count - 1)) % 32)" as part of EmitVertex. The first time this (dynamically) occurs in the shader, control_data_bits is zero. Many times we can determine this statically and various optimizations will collaborate to make one of the OR operands literal zero. Converting the OR to a MOV usually allows it to be copy-propagated away. However, this does not happen in at least some shaders (in the assembly output of shaders/closed/UnrealEngine4/EffectsCaveDemo/301.shader_test, search for shl). All of the affected shaders are geometry shaders. Broadwell and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14375452 -> 14375413 (<.01%) instructions in affected programs: 6422 -> 6383 (-0.61%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.14% max: 2.56% x̄: 1.91% x̃: 2.56% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -2.26% -1.57% Instructions are helped. total cycles in shared programs: 531981179 -> 531980555 (<.01%) cycles in affected programs: 27493 -> 26869 (-2.27%) helped: 39 HURT: 0 helped stats (abs) min: 16 max: 16 x̄: 16.00 x̃: 16 helped stats (rel) min: 0.60% max: 7.92% x̄: 5.94% x̃: 7.92% 95% mean confidence interval for cycles value: -16.00 -16.00 95% mean confidence interval for cycles %-change: -6.98% -4.90% Cycles are helped. No changes on earlier platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-15 17:22:27 -07:00
Eric Anholt	4106f6ce54	v3d: Handle a no-intersection scissor even if it's outside of the VP. The min/maxes ended up producing a negative clip width/height for dEQP-GLES3.functional.fragment_ops.scissor.outside_render_line. Just make sure they stay at 0 (or v3d 3.x's workaround) if that happens.	2018-06-15 16:09:39 -07:00
Eric Anholt	9aa670e52a	v3d: Use the proper depth texture type for sampling. Fixes failing tests in dEQP-GLES3.functional.texture.shadow	2018-06-15 16:09:39 -07:00
Eric Anholt	778594ae12	v3d: Limit shader threading according to our maximum TMU fifo usage. Fixes simulator assertion failures in dEQP-GLES3.functional.shaders.texture_functions.texture.samplercubeshadow_bias_fragment and similar complicated cases.	2018-06-15 16:09:39 -07:00
Eric Anholt	e130ada243	v3d: Fix shaders using pixel center W but no varyings. The docs called this field "uses both center W and centroid W", but actually it's "do you need center W even if varyings don't obviously call for it?" Fixes dEQP-GLES3.functional.shaders.builtin_variable.fragcoord_w	2018-06-15 16:09:39 -07:00
Dylan Baker	0d4f338a11	docs: Update release-notes and calendar	2018-06-15 13:53:25 -07:00
Dylan Baker	3c454fc84a	docs: Add release notes for 18.1.2	2018-06-15 13:52:44 -07:00
Rafael Antognolli	9e1f208795	intel/aubinator: Use int to store getopt_long flags. getopt_long flag parameter is an int pointer, so if we use bool to store those values, when getopt_long writes to one of them, it might end up overwriting the next one. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-15 09:03:10 -07:00
Samuel Pitoiset	f8e2c4c57c	Revert "radv: always set/load both depth and stencil clear values" This fixes a rendering regression with RoTR. This reverts commit `4bdad9fadd`. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 16:52:06 +02:00
Samuel Pitoiset	a2f6e72138	radv: don't check for linear images in emit_fast_color_clear() We don't enable CMASK for linear surfaces and addrlib only enables DCC for tiling surfaces. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:12 +02:00
Samuel Pitoiset	3befac52db	radv: allow RADV_PERFTEST=dccmsaa on GFX9 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:10 +02:00
Samuel Pitoiset	bfca15e16a	radv: add RADV_DEBUG=checkir This allows running the LLVM verifier pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:08 +02:00
Samuel Pitoiset	706d51de7f	radv: update ZRANGE_PRECISION in radv_update_bound_fast_clear_ds() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:06 +02:00
Samuel Pitoiset	fa8bc821a8	radv: clean up radv_{set,load}_depth_clear_regs() helpers And replace _regs by _metadata because it makes more sense. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:04 +02:00
Samuel Pitoiset	4bdad9fadd	radv: always set/load both depth and stencil clear values I don't think it matters much to emit both values, and that makes the code a bit simpler. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:02 +02:00
Samuel Pitoiset	2193a6a828	radv: update the fast ds clear values only if the image is bound It's unnecessary to update the fast depth/stencil clear values if the fast cleared depth/stencil image isn't currently bound. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:54:00 +02:00
Samuel Pitoiset	be794fa26b	radv: clean up radv_{set,load}_color_clear_regs() helpers And replace _regs by _metadata because it makes more sense. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:53:58 +02:00
Samuel Pitoiset	d7b772abb4	radv: update the fast color clear values only if the image is bound It's unnecessary to update the fast color clear values if the fast cleared color image isn't currently bound. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 15:53:55 +02:00
Christian Gmeiner	efae127993	util/bitset: include util/macro.h BITSET_FFS(x) macro makes use of ARRAY_SIZE(x) macro which is defined in util/macro.h. Include it directly to make usage more straightforward. Fixes: `692bd4a1ab` ("util: replace Elements() with ARRAY_SIZE()") Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-15 11:26:30 +01:00
Lukas Rusak	4cfc4cef80	meson: fix private libs when building without glx I noticed that the generated pkg-config files will include glx and x11 dependencies even when x11 isn't a selected platform. This fixes the private libs and was tested by building kmscube V2: - check if gallium-xlib is being used for glx Fixes: `108d257a16` "meson: build libEGL" Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-15 10:43:22 +01:00
Rhys Perry	30f1ab7a59	docs: document addition of GL_ARB_sample_locations for nvc0 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v2)	2018-06-14 20:09:45 -06:00
Rhys Perry	66ca7e400b	nvc0: add support for programmable sample locations Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>	2018-06-14 20:09:45 -06:00
Rhys Perry	9f217facbd	st/mesa: add support for ARB_sample_locations Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)	2018-06-14 20:09:45 -06:00
Rhys Perry	51a221e378	gallium: add support for programmable sample locations Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)	2018-06-14 20:09:45 -06:00
Rhys Perry	67f40dadaa	mesa: add support for ARB_sample_locations Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)	2018-06-14 20:09:45 -06:00
Eric Anholt	cd2e673abc	v3d: Fix polygon offset for Z16 buffers. Fixes: dEQP-GLES3.functional.polygon_offset.fixed16_displacement_with_units dEQP-GLES3.functional.polygon_offset.fixed16_render_with_units	2018-06-14 17:03:16 -07:00
Eric Anholt	d91e06a065	v3d: Fix configuration setup of mixed f32 and f16 render targets. Fixes dEQP-GLES3.functional.fragment_out.random.26 and 6 others.	2018-06-14 16:52:25 -07:00
Eric Anholt	6784aa9870	v3d: Don't set the first_ez_state to DISABLED if after only UNDECIDED draws. We need to have the RCL start with EZ enabled, since those undecided draws had EZ enabled. But we do need to update from UNDECIDED to LT or GT as necessary still. Fixes many simulator assertion fails in deqp fragment_ops/interaction/basic_shader/*	2018-06-14 16:52:25 -07:00
Eric Anholt	9080642449	v3d: Use the right size for v3d 4.x TEXTURE_SHADER_STATE BO. This doesn't really matter, since they both get rounded up to 4096.	2018-06-14 16:52:25 -07:00
Eric Anholt	31548187cf	v3d: Add static asserts for other packed packet sizes.	2018-06-14 16:52:25 -07:00
Eric Anholt	0eef4d7f8f	v3d: Fix the size of the packed attribute state. Fixes segfaults in dEQP-GLES3.functional.vertex_array_objects.all_attributes.	2018-06-14 16:52:25 -07:00
Eric Anholt	7d8fe50af3	v3d: Remove some unused context fields from vc4.	2018-06-14 16:52:25 -07:00
Eric Anholt	48011c42aa	v3d: Remove unused QUNIFORM_STENCIL left over from vc4.	2018-06-14 16:52:25 -07:00
Eric Anholt	4564537222	v3d: Use our #define for max attributes in shader caps.	2018-06-14 16:52:25 -07:00
Eric Anholt	a40bc33b11	v3d: Fix undefined results for a swap_color_rb RT from a float shader output. Fixes segfaults and undefined behavior in dEQP-GLES3.functional.fragment_out.basic.fixed.srgb8_alpha8_lowp_float	2018-06-14 16:52:25 -07:00
Dave Airlie	600d34c822	radv: remove multisample bit from shader key. This wasn't being used anywhere inside the shader from what I can see. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-15 09:33:20 +10:00
Kenneth Graunke	f6898f2b55	intel/compiler: Properly consider UBO loads that cross 32B boundaries. The UBO push analysis pass incorrectly assumed that all values would fit within a 32B chunk, and only recorded a bit for the 32B chunk containing the starting offset. For example, if a UBO contained the following, tightly packed: vec4 a; // [0, 16) float b; // [16, 20) vec4 c; // [20, 36) then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1, which means that we ought to record two 32B chunks in the bitfield. Similarly, dvec4s would suffer from the same problem. v2: Rewrite the accounting, my calculations were wrong. v3: Write a comment about partial values (requested by Jason). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> [v1] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v3]	2018-06-14 14:58:59 -07:00
Ian Romanick	37bd9ccd21	glsl: Don't copy propagate elements from SSBO or shared variables either Since SSBOs can be written by a different GPU thread, copy propagating a read can cause the value to magically change. SSBO reads are also very expensive, so doing it twice will be slower. The same shader was helped by this patch and the previous. Haswell, Broadwell, and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14399119 -> 14399113 (<.01%) instructions in affected programs: 683 -> 677 (-0.88%) helped: 1 HURT: 0 total cycles in shared programs: 532973113 -> 532971865 (<.01%) cycles in affected programs: 524666 -> 523418 (-0.24%) helped: 1 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774	2018-06-14 11:28:12 -07:00
Ian Romanick	461a5c899c	glsl: Don't copy propagate from SSBO or shared variables either Since SSBOs can be written by other GPU threads, copy propagating a read can cause the value to magically change. SSBO reads are also very expensive, so doing it twice will be slower. Haswell, Broadwell, and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14399120 -> 14399119 (<.01%) instructions in affected programs: 684 -> 683 (-0.15%) helped: 1 HURT: 0 total cycles in shared programs: 532978931 -> 532973113 (<.01%) cycles in affected programs: 530484 -> 524666 (-1.10%) helped: 1 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774	2018-06-14 11:26:33 -07:00
Lukas Rusak	1d92d6486a	meson: only build vl_winsys_dri.c when x11 platform is used This seems to have been missed in the move from autotools This fixes the following build issue: ../src/gallium/auxiliary/vl/vl_winsys_dri.c:34:10: fatal error: X11/Xlib-xcb.h: No such file or directory #include <X11/Xlib-xcb.h> ^~~~~~~~~~~~~~~~ Fixes: `b1b65397d0` ("meson: Build gallium auxiliary") Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-14 10:34:51 -07:00
Brian Paul	b9e6438adf	st/mesa: add missing switch cases in glsl_to_tgsi_visitor::visit() To silence compiler warning about unhandled switch cases. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-06-14 11:29:51 -06:00
Bas Nieuwenhuizen	41dabdc475	radv: Fix output for sparse MRTs. We need to init the cb_shader_format correctly with the changed col_format, so this moves the col_format adjustment to before the cb_shader_mask gets generated. Fixes: `06d3c65098` "radv: fix a GPU hang when MRTs are sparse" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106903 CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-14 11:48:24 +02:00
Samuel Pitoiset	68dead112e	radv: update the ZRANGE_PRECISION value for the TC-compat bug On GFX8+, there is a bug that affects TC-compatible depth surfaces when the ZRange is not reset after LateZ kills pixels. The workaround is to always set DB_Z_INFO.ZRANGE_PRECISION to match the last fast clear value. Because the value is set to 1 by default, we only need to update it when clearing Z to 0.0. We also need to set the depth clear regs and to update ZRANGE_PRECISION when initializing a TC-compat depth image to 0. Original patch from James Legg. This fixes random CTS fails with dEQP-VK.renderpass.suballocation.formats.d32_sfloat_s8_uint.input.* Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396 CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-14 11:38:29 +02:00
Samuel Iglesias Gonsálvez	183adc51f8	anv: reduce maxFragmentInputComponents If the application asks for the maximum number of fragment input components (128), use all of them plus some builtins that are passed in the VUE, then we exceed the maximum number of used VUE slots (32) and we break one assert that checks this limit. Also, with separate shader objects, we add CLIP_DIST0, CLIP_DIST1 builtins in brw_compute_vue_map() because we don't know if gl_ClipDistance is going to be read/write by an adjacent stage. Fixes VK-GL-CTS CL#2569. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-14 09:54:28 +02:00
Marek Olšák	6d671078a8	radeonsi/gfx9: fix si_get_buffer_from_descriptors for 48-bit pointers This fixes: GL45-CTS.pipeline_statistics_query_tests_ARB.functional_compute_shader_invocations Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-06-13 22:00:12 -04:00
Marek Olšák	a4312742a5	radeonsi/gfx9: update & clean up a DPBB heuristic Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:43 -04:00
Marek Olšák	47b780be21	radeonsi/gfx9: set POPS_DRAIN_PS_ON_OVERLAP due to a hw bug This may not be needed yet, but let's set it now. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:42 -04:00
Marek Olšák	a152ca70f2	radeonsi/gfx9: remove UINT_MAX array terminators in bin size tables Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:40 -04:00
Marek Olšák	cd0be6cdc8	radeonsi/gfx9: update bin sizes This is based on our docs (recently updated), not amdvlk. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:39 -04:00
Marek Olšák	2f51081a93	radeonsi/gfx9: update primitive binning code for EQAA Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:37 -04:00
Marek Olšák	22e994bb75	radeonsi: assume that rasterizer state is non-NULL in draw_vbo Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:36 -04:00
Marek Olšák	f3b3ee6974	radeonsi: micro-optimize prim checking and fix guardband with lines+adjacency Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:34 -04:00
Marek Olšák	d6974feb90	radeonsi: move the guardband registers into a separate state atom They have a different frequency of updates and don't change when scissors change. I think this even fixes something in si_update_vs_viewport_state. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:31 -04:00
Marek Olšák	68b1c669e7	radeonsi/gfx9: implement the scissor bug workaround without performance drop This might improve performance on Vega10 and Raven. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:27 -04:00
Marek Olšák	73b0d10152	radeonsi: don't set VGT_LS_HS_CONFIG if it doesn't change Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:25 -04:00
Marek Olšák	28ee825e19	radeonsi: move VGT_GS_OUT_PRIM_TYPE into si_shader_gs same as amdvlk. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:23 -04:00
Marek Olšák	99e0ba6868	radeonsi: record CLIPVERTEX output usage properly for compatibility profiles This was missed when adding CLIPVERTEX support into GS & tess. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:20 -04:00
Marek Olšák	47a57a709d	radeonsi: fix FBFETCH with 2D MSAA arrays Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:17 -04:00
Marek Olšák	e5e57c3a5e	ac: handle undefined EQAA samples in ac_apply_fmask_to_sample RADV might wanna use this helper too. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:12 -04:00
Marek Olšák	a2d4c8ff6d	radeonsi: return real memory usage instead of per-process usage Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-13 21:47:36 -04:00
Marek Olšák	95ecde42eb	ac/gpu_info: report real total memory sizes The change from MIN2 to MAX2 is intentional. Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-13 21:47:36 -04:00
Dave Airlie	f11b664f48	docs: mark virgl GL 4.0 features as complete. virgl should now expose GL4.1 where it can.	2018-06-14 10:38:11 +10:00
Dave Airlie	7b6f2704eb	virgl: add ARB_tessellation_shader support. (v2) This should add all the pieces to enable tess shaders on virgl. v2: fixup transform to handle tess and strip out precise. set default for max patch varyings to work around issue when tess gets enabled from v1 caps but v2 caps aren't in place. (Elie) Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-06-14 10:36:31 +10:00
Dave Airlie	babd1d526b	glsl: allow standalone semicolons outside main() GLSL 4.60 officially added this but games and older CTS suites actually had shaders that did this, we may as well enable it everywhere. Adding stable because it appears apps in the wild do this. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: <mesa-stable@lists.freedesktop.org>	2018-06-14 10:21:51 +10:00
Samuel Pitoiset	51e23d3419	radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8 This causes rendering issues in Shadow Warrior 2 with DXVK. Cc: mesa-stable@lists.freedesktop.org Fixes: `ccc64f3133` ("radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106912 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-13 20:30:04 +02:00
Andrew Galante	baf16b2ea3	configure.ac: Test for __atomic_add_fetch in atomic checks Some platforms have 64-bit __atomic_load_n but not 64-bit __atomic_add_fetch, so test for both of them. Bug: https://bugs.gentoo.org/655616 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-13 10:09:46 -07:00
Andrew Galante	9d547a7617	meson: Test for __atomic_add_fetch in atomic checks Some platforms have 64-bit __atomic_load_n but not 64-bit __atomic_add_fetch, so test for both of them. Bug: https://bugs.gentoo.org/655616 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-13 10:09:46 -07:00
Matt Turner	b29b5a82a1	meson: Fix -latomic check Commit `54ba73ef10` (configure.ac/meson.build: Fix -latomic test) fixed some checks for -latomic, and then commit `54bbe600ec` (configure.ac: rework -latomic check) further extended the fixes in configure.ac but not in Meson. This commit extends those fixes to the Meson tests. Fixes: `54bbe600ec` (configure.ac: rework -latomic check) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-13 10:09:46 -07:00
Dylan Baker	9cc577761f	meson: Remove various completed todos v3: - Remove "won't do" todos, so only completed todo's are now removed. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)	2018-06-13 10:07:03 -07:00
Dylan Baker	0ce3f3538b	meson: Make use of optional modules meson 0.43 gained support for optional modules, which clover would like to use. Since we require 0.44.1 now we can rely on them being available for clover. compile tested only. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-13 10:06:58 -07:00
Dylan Baker	34bbb24ce7	meson: Add support for ppc assembly/optimizations v2: - Use -mpower8-vector in compiler test for altivec - rename altivec option to power8 - reword power8 option description to be more clear, originally I had made it a boolean, but replaced it with an auto option. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-13 10:06:54 -07:00
Dylan Baker	e26af22143	meson: Add support for SPARC assembly This was blindly copied from autotools and tested by a helpful gentoo user. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-13 10:06:25 -07:00
Dylan Baker	6eaa013685	meson: Set include dirs for asm v2: - split this from the next patch - Only include x86-64 and not x86 when building x86_64 Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-13 10:06:23 -07:00
Dylan Baker	65e447c5df	meson: move cc and cpp definitions to top of main meson.build This just makes using cc and cpp easier. v2: - Add this patch to fix altivec Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-13 10:06:16 -07:00
Jason Ekstrand	51376cd749	Revert "intel/compiler: Properly consider UBO loads that cross 32B boundaries." This reverts commit `b8fa847c2e`. This broke about 30k Vulkan CTS tests.	2018-06-13 09:23:55 -07:00
Kenneth Graunke	b8fa847c2e	intel/compiler: Properly consider UBO loads that cross 32B boundaries. The UBO push analysis pass incorrectly assumed that all values would fit within a 32B chunk, and only recorded a bit for the 32B chunk containing the starting offset. For example, if a UBO contained the following, tightly packed: vec4 a; // [0, 16) float b; // [16, 20) vec4 c; // [20, 36) then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1, which means that we ought to record two 32B chunks in the bitfield. Similarly, dvec4s would suffer from the same problem. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-06-13 02:07:58 -07:00
Ross Burton	3c288da5ee	drivers/dri/i965: add missing #include <time.h> brw_bufmgr.h uses time_t without including time.h, so the build fails under musl. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-12 12:08:30 +01:00
Mauro Rossi	fb9ab2fbd3	anv/android: Use an address for each anv_image plane Fixes to avoid building error after change in image->planes[] structure, {bo,bo_offset} has to be replaced by address.{bo,offset} and update is needed also in the assert() for debug builds. external/mesa/src/intel/vulkan/anv_android.c:188:21: error: no member named 'bo' in 'struct anv_image::(anonymous at external/mesa/src/intel/vulkan/anv_private.h:2647:4)' image->planes[0].bo = bo; ~~~~~~~~~~~~~~~~ ^ 1 error generated. Fixes: `bf34ef16ac` ("anv: Use an address for each anv_image plane") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-12 11:17:43 +03:00
Mauro Rossi	a1220e7311	anv/android: Set the BO flags in bo_cache_import (v2) Changes to avoid building error: external/mesa/src/intel/vulkan/anv_android.c:131:72: error: too few arguments to function call, expected 5, have 4 result = anv_bo_cache_import(device, &device->bo_cache, dma_buf, &bo); ~~~~~~~~~~~~~~~~~~~ ^ 1 error generated. (v2) Set the correct bo_flags based on support of 48bit addresses and soft-pin Fixes: `b0d50247a7` ("anv/allocator: Set the BO flags in bo_cache_alloc/import") Fixes: `e7d0378bd9` ("anv: Soft-pin client-allocated memory") Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-12 11:16:39 +03:00
Kenneth Graunke	0d5329d626	anv: Disable __gen_validate_value if NDEBUG is set. We were enabling undefined memory checking for genxml values based on Valgrind being installed at build time, even for release builds. This generates piles and piles of assembly whenever you touch genxml. With gcc 7.3.1 and -O3 and -march=native on a Kabylake with Valgrind installed at build time: text data bss dec hex filename 5978385 262884 13488 6254757 5f70a5 libvulkan_intel.so 3799377 262884 13488 4075749 3e30e5 libvulkan_intel.so That's a 36% reduction in text size. Fixes: `047ed02723` (vk/emit: Use valgrind to validate every packed field) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-11 14:55:32 -07:00
Eric Engestrom	06e8771dec	README: wording fix for previous commit Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-11 18:34:58 +01:00
Eric Engestrom	d9f54dceca	README: add link to WhosWho for IRC nicks Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-11 18:33:12 +01:00
Eric Engestrom	eadc068406	add project README Now that we're using GitLab, let's take advantage of the "landing page" README feature with some minimal information, mostly to point people to the right resources. Acked-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-11 18:02:35 +01:00
Eric Engestrom	e43c012433	i965: fix resource leak v2: intel_miptree_release() already takes care of the planes, no need to hand-code the loop (Lionel) Coverity ID: 1436909 Fixes: `3352f2d746` "i965: Create multiple miptrees for planar YUV images" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Eric Engestrom <eric@engestrom.ch>	2018-06-11 14:54:23 +01:00
Rob Clark	55d1a77c29	freedreno/ir3: use pipe_image_view's cpp At least for PIPE_BUFFER, we could get the resource used as (for example) R32F imageBuffer. So using cpp=1 from the rsc is wrong. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	9bb90a3255	freedreno/ir3: fix image dimensions offset copy-pasta fail from how SSBO sizes are handled. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	e9fc9c16c9	freedreno/a5xx: correct image/ssbo offset Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	132e5b0b34	freedreno/ir3: use saml always if we have lod In some cases we get plain tex opcodes (but w/ a lod argument).. in this case always use the saml instruction. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	cf5dda3349	freedreno/ir3: don't cp absneg into meta:fi If using a fanin (collect) to collect consecutive registers together, we can CP mov's into the fanin, but not (abs) or (neg). No places that allow those modifiers are consuming a fanin anyways. But this caused an absneg to be lost between a ldgb and stgb for shaders like: outputs[n] = abs(input[n]) Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	39e7a39e91	freedreno/ir3: rework size/type conversion instructions With 8b and 16b, there are a lot more to handle. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	a52e698219	freedreno/ir3: propagate HALF flag across fanout If we have a fanout (split) meta instruction to split the result of a vector instruction, propagate the HALF flag back to the original instruction. Otherwise result ends up in a full precision register while instruction(s) that use the result look in a half-precision register. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	fc1690c9d9	freedreno/a5xx: add sample-id/sample-mask-in Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	619d2317cd	freedreno/ir3: add sample-id/sample-mask-in Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	a49c87956e	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Rob Clark	067d89c2cd	freedreno/ir3: image atomics use image-store path image reads are handled via tex state, whereas image writes and atomics are handled via SSBO state block. Previously we were only considering image write, and not image atomics which also uses the SSBO state block. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-06-11 09:06:03 -04:00
Kyle Brenneman	41642bdbca	egl/glvnd: Fix a segfault in eglGetProcAddress. If FindProcIndex in egldispatchstubs.c is called with a name that's less than the first entry in the array, it would end up trying to store an index of -1 in an unsigned integer, wrap around to 2^32 - 1, and then crash when it tries to look that up. Change FindProcIndex so that it uses bsearch(3) instead of implementing its own binary search, like the GLX equivalent FindGLXFunction does. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-11 12:17:07 +01:00
Jordan Justen	e266b32059	mesa/program_binary: add implicit UseProgram after successful ProgramBinary Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106810 Fixes: `b4c37ce214` "i965: Add ARB_get_program_binary support using nir_serialization" Ref: `3fe8d04a6d` "mesa: don't always set _NEW_PROGRAM when linking" Ref: `c505d6d852` "mesa: use gl_program for CurrentProgram rather than gl_shader_program" Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-10 21:12:46 -07:00
Dave Airlie	525cfe5dab	features.txt: update virgl GL4.1 status. All the features for GL4.1 are done (64-bit attribs were part of the fp64 enable). Once tessellation shaders land this will be advertised	2018-06-11 10:49:14 +10:00
Dave Airlie	77d7d7acab	virgl: enable ARB_gpu_shader_fp64 This enables ARB_gpu_shader_fp64 if the host provides it. Tested-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-06-11 08:35:03 +10:00
Samuel Pitoiset	135e4d434f	radv: add a workaround for DXVK hangs by setting amdgpu-skip-threshold Workaround for bug in llvm that causes the GPU to hang in presence of nested loops because there is an exec mask issue. The proper solution is to fix LLVM but this might require a bunch of work. This fixes a bunch of GPU hangs that happen with DXVK. Vega10: Totals from affected shaders: SGPRS: 110456 -> 110456 (0.00 %) VGPRS: 122800 -> 122800 (0.00 %) Spilled SGPRs: 7478 -> 7478 (0.00 %) Spilled VGPRs: 36 -> 36 (0.00 %) Code Size: 9901104 -> 9922928 (0.22 %) bytes Max Waves: 7143 -> 7143 (0.00 %) Code size slightly increases because it inserts more branch instructions but that's expected. I don't see any real performance changes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105613 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-09 14:16:49 +02:00
Samuel Pitoiset	94706f0de4	radv: fix missing ZRANGE_PRECISION(1) for GFX9+ ZRANGE_PRECISION(1) seems to be the default optimal value, but it was only set for VI and older chips. This fixes a rendering issue with Banished through DXVK, and might fix more than that. There is still the ZRANGE_PRECISION bug that we need to handle but that can be fixed later. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-09 10:57:01 +02:00
Gustavo Lima Chaves	7dfaf025c5	anv: enable VK_EXT_shader_stencil_export Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-08 11:16:01 -07:00
Gustavo Lima Chaves	7cc5178bba	spirv: add/hookup SpvCapabilityStencilExportEXT v2: An attempt to support SpvExecutionModeStencilRefReplacingEXT's behavior also follows, with the interpretation to said mode being we prevent writes to the built-in FragStencilRefEXT variable when the execution mode isn't set. v3: A more cautious reading of `1db44252d0` led me to a missing change that would stop (what I later discovered were) GPU hangs on the CTS test written to exercise this. v4: Turn FragStencilRefEXT decoration usage without StencilRefReplacingEXT mode into a warning, instead of trying to make the variable read-only. If we are to follow the originating extension on GL, the built-in variable in question should never be readable anyway. v5/v6: rebases. v7: Fix check for gen9 lost in rebase. (Ilia) Reduce the scope of the bool used to track whether SpvExecutionModeStencilRefReplacingEXT was used. Was in shader_info, moved to vtn_builder. (Jason) v8: Assert for fragment shader handling StencilRefReplacingEXT execution mode. (Caio) Remove warning logic, since an entry point might not have StencilRefReplacingEXT execution mode, but the global output variable might still exist for another entry point in the module. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-08 11:15:37 -07:00
Eric Anholt	22cc83cf87	travis: Add the v3d driver to the automake build. Hopefully this reduces the number of fixup commits we need for the automake build. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-08 09:50:38 -07:00
Eric Anholt	3db39d84d2	travis: Do our automake build tests with srcdir != builddir. This will catch many automake bugs that end-users get to experience first, otherwise. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-06-08 09:50:28 -07:00
Eric Engestrom	37eb56d239	autotools/meson: compile against wayland-egl-backend Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106861 Fixes: `1db4ec0546` "egl: rewire the build systems to use libwayland-egl" Suggested-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Andreas Hartmetz <ahartmetz@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-08 16:45:43 +01:00
Cameron Kumar	cb03803253	vulkan/wsi: Destroy swapchain images after terminating FIFO queues The queue_manager thread can access the images from x11_present_to_x11, hence this reorder prevents dereferencing of dangling pointers. Cc: "18.1" <mesa-stable@lists.freedesktop.org> Fixes: `e73d136a02` ("vulkan/wsi/x11: Implement FIFO mode.") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-08 14:06:46 +01:00
Sonny Jiang	ce64c1b70a	radeonsi: emit_dpbb_state packets optimization Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-06-07 23:26:40 -04:00
Sonny Jiang	7dcfa1f46e	radeonsi: emit_clip_state packets optimization Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-06-07 23:26:36 -04:00
Sonny Jiang	06b47005d3	radeonsi: emit_msaa_sample_locs packets optimization Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-06-07 23:26:36 -04:00
Sonny Jiang	a1b4b00ce2	radeonsi: emit_msaa_config packets optimization Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-06-07 23:26:36 -04:00
Sonny Jiang	2bad413f55	radeonsi: emit_cb_render_state packets optimization Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-06-07 23:26:25 -04:00
Sonny Jiang	43b0269ce3	radeonsi: emit_db_render_state packets optimization Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-06-07 23:26:25 -04:00
Jan Vesely	d797f1f47e	drisw: Fix invalid pointer arithmetic Use of void * in pointer arithmetic is illegal, use char * instead. Fixes: `cf54bd5e83` ("drisw: use shared memory when possible") Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2018-06-07 21:01:29 -04:00
Timothy Arceri	03c370d2f1	radeonsi: fix possible truncation on renderer string Fixes truncation warning in gcc 8.1 Fixes: `8539c9bf31` ("gallium/radeon: add the kernel version into the renderer string") Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-06-08 10:07:55 +10:00
Timothy Arceri	fae3b38770	ac: fix possible truncation of intrinsic name Fixes the gcc warning: ‘snprintf’ output between 26 and 33 bytes into a destination of size 32 Fixes: `d5f7ebda3e` ("ac: add LLVM build functions for subgroup instrinsics") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-08 09:24:15 +10:00
Bas Nieuwenhuizen	4fc2d5e141	amd/common: Fix number of coords for getlod. The LLVM 6 code reduced it to a non-array call. We need to do that with the new code too. This fixes dEQP-VK.glsl.texture_functions.query.texturequerylod.array for radv. Fixes: `a9a7993441` "amd/common: use the dimension-aware image intrinsics on LLVM 7+" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-07 23:59:52 +02:00
Dave Airlie	9be56316cf	features: add virgl to the GL features list This hopefully adds virgl to the correct places and current statuses of various extensions. virgl of course relies on two external things a) host driver that can support the features b) up to date host virglrenderer library that can support the features. This list will be maintained as latest (a) + (b) + mesa. Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-06-08 07:34:53 +10:00
Matt Turner	a5abb2da74	meson: Add support for read-only text segment on x86 Port of `6dfc5e28f7` (configure.ac: Add support to enable read-only text segment on x86.) to Meson. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-07 14:16:44 -07:00
Dylan Baker	8f2421d73b	meson: work around gentoo applying -m32 to host compiler in cross builds Gentoo's ebuild system always adds -m32 to the compiler for doing x86_64 -> x86 cross builds, while meson expects it not to do that. This results in an x86 -> x86 cross build, and assembly gets disabled. Fixes: `2d62fc0646` ("meson: disable x86 asm in fewer cases.") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-06-07 11:54:06 -07:00
Jason Ekstrand	e0fa239962	i965/screen: Sanity check that all formats we advertise are useable Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-07 11:23:34 -07:00
Jason Ekstrand	0e7f3febf7	i965/screen: Use RGBA non-sRGB formats for images Not all of the MESA_FORMAT and ISL_FORMAT helpers we use can properly handle RGBX formats. Also, we don't want to make decisions based on those in the first place because we can't render to RGBA and we use the non-sRGB version to determine whether or not to allow CCS_E. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-07 11:23:34 -07:00
Jason Ekstrand	a266934935	i965/screen: Return false for unsupported formats in query_modifiers Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-07 11:23:34 -07:00
Jason Ekstrand	eeae485149	i965/screen: Refactor query_dma_buf_formats This reworks it to work like query_dma_buf_modifiers and, in particular, makes it more flexible so that we can disallow a non-static set of formats. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-07 11:23:34 -07:00
Jason Ekstrand	3b54dd87f7	intel/isl: Add bounds-checking assertions for the format_info table We follow the same convention as isl_format_get_layout in having two assertions to ensure that only valid formats are passed in. We also check against the array size of the table because some valid formats such as CCS formats may be past the end of the table. This fixes some potential out-of-bounds array access even in valid cases. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-07 11:23:34 -07:00
Jason Ekstrand	778e2881a0	intel/isl: Add bounds-checking assertions in isl_format_get_layout We add two assertions instead of one because the first assertion that format != ISL_FORMAT_UNSUPPORTED is more descriptive and checks for a real but unsupported enumerant while the second ensures that they don't pass in garbage values. We also update some other helpers to use isl_format_get_layout instead of using the table directly so that they get bounds checking too. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-07 11:23:34 -07:00
Dylan Baker	c267f46ef2	meson: Clarify why asm cannot be used in cross compile This makes the reasoning for why a cross compile is not using asm clearer (hopefully). v2: - fix typos Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-07 10:40:35 -07:00
Eric Engestrom	f436ae237b	docs: talk about Wayland instead of libwayland Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-07 18:06:40 +01:00
Jason Ekstrand	237c5ac4f9	anv: Set fence/semaphore types to NONE in impl_cleanup There were some places that were calling anv_semaphore_impl_cleanup and neither deleting the semaphore nor setting the type back to NONE. Just set it to NONE in impl_cleanup to avoid these issues. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106643 Fixes: `031f57eba` "anv: Add a basic implementation of VK_KHX_external..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-07 09:46:45 -07:00
Plamena Manolova	3ba16d640e	nir: Add global invocation id intrinsic. Add the missing nir intrinsic for the gl_GlobalInvocationID compute shader variable. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-07 14:53:12 +01:00
Eric Engestrom	61edad216e	travis: bump libwayland to the first version with libwayland-egl Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-07 11:10:11 +01:00
Kenneth Graunke	3ea2d791f3	i965: Require softpin support for Cannonlake and later. This isn't strictly necessary, but anyone running Cannonlake will already have Kernel 4.5 or later, so there's no reason to support the relocation model on Gen10+. This will let us avoid dealing with them for new features. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-06 19:45:09 -07:00
Kenneth Graunke	a363bb2cd0	i965: Allocate VMA in userspace for full-PPGTT systems. This patch enables soft-pinning of all buffers, allowing us to skip relocation processing entirely. All systems with full PPGTT and > 4GB of VMA should gain these benefits. This should be most Gen8+. Unfortunately, this excludes a few systems: - Cherryview (only has 32-bit addressing, despite 48-bit pointers) - Broadwell with a 32-bit kernel - Anybody running pre-4.5 kernel. We may enable it for Cherryview in the future, but it would require some tweaks to the memory zone. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-06 19:45:09 -07:00
Kenneth Graunke	74259b98aa	intel/blorp: Emit VF cache invalidates for 48-bit bugs with softpin. commit `92f01fc5f9` made i965 start emitting VF cache invalidates when the high bits of vertex buffers change. But we were not tracking vertex buffers emitted by BLORP. This was papered over by a mistake where I emitted VF cache invalidates all the time, which Chris fixed in commit `3ac5fbadfd`. This patch adds a new hook which allows the driver to track addresses and request a VF cache invalidate as appropriate. v2: Make the driver do the PIPE_CONTROL so it can apply workarounds (caught by Jason Ekstrand). Rebase on anv bug fix. v3: Don't screw up the boolean (caught by Jason Ekstrand). Fixes: `92f01fc5f9` ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-06 19:45:09 -07:00
Timothy Arceri	2a74296f24	nir: add opt_if_loop_terminator() This pass detects potential loop terminators and moves instructions from the non-breaking branch after the if-statement. This enables both the new opt_if_simplification() pass and loop unrolling to potentially progress further. Unexpectedly, this change sped up shader-db run times by ~3% Ivy Bridge shader-db results (all changes in dolphin/ubershaders): total instructions in shared programs: 9995662 -> 9995338 (-0.00%) instructions in affected programs: 87845 -> 87521 (-0.37%) helped: 27 HURT: 0 total cycles in shared programs: 230931495 -> 230925015 (-0.00%) cycles in affected programs: 56391385 -> 56384905 (-0.01%) helped: 27 HURT: 0 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-07 11:33:04 +10:00
Timothy Arceri	1098bc5e85	nir: move ends_in_break() helper to nir_loop_analyze.h We will use the helper while simplifying potential loop terminators in the following patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-07 11:33:04 +10:00
Timothy Arceri	186988e28f	radv: fix Coverity no effect control flow issue swizzle is unsigned so "desc->swizzle[c] < 0" is never true. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-07 10:10:57 +10:00
Jason Ekstrand	44c614843c	intel/blorp: Don't vertex fetch directly from clear values On gen8+, we have to VF cache flush whenever a vertex binding aliases a previous binding at the same index modulo 4GiB. We deal with this in Vulkan by ensuring that vertex buffers and the dynamic state (from which BLORP pulls its vertex buffers) are in the same 4GiB region of the address space. That doesn't work if we're reading clear colors with the VF unit. In order to work around this we switch to using MI commands to copy the clear value into the vertex buffer we allocate for the normal constant data. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-06 16:32:38 -07:00
Lionel Landwerlin	b28a2510cc	dri: add missing 16bits formats mapping i965 advertises the 16-bit R and RG formats through eglQueryDmaBufFormatsEXT but falls over when a client tries to use or asks for more information about such a format because driImageFormatToGLFormat returns MESA_FORMAT_NONE. Found by Eero Tamminen. v2: Add G16R16 formats (Lionel) v3: Fix G16R16 mapping to mesa format (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106642 Reviewed-by: Plamena Manolova <plamena.manolova@intel.com> (v2) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-07 00:09:21 +01:00
Eric Anholt	833c404600	nir: Look into uniform structs for samplers when counting num_textures. mesa/st decides whether to update samplers after a program change based on whether num_textures is nonzero. By not counting samplers in a uniform struct, we would segfault in KHR-GLES3.shaders.struct.uniform.sampler_vertex if it was run in the same context after a non-vertex-shader-uniform testcase (as is the case during a full conformance run). v2: Implement using two separate pure functions instead of updating pointers. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-06 13:46:55 -07:00
Eric Anholt	f69473a712	v3d: Work around GFXH-1461/GFXH-1689 by using CLEAR_TILE_BUFFERS. This doesn't seem to have done anything to my test results. However, given that we've still got a class of GPU hangs, following the workarounds that the closed driver does so that we get the same command sequences seems like a good idea.	2018-06-06 13:46:55 -07:00
Eric Anholt	9d5860310d	v3d: Enable the new NIR bitfield operation lowering paths. These together get the GLSL 3.00 unorm/snorm pack functions and MESA_shader_integer operations working. v2: Fix commit message typo. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
Eric Anholt	73953b0713	nir: Add lowering for nir_op_bit_count. This is basically the same as the GLSL lowering path. v2: Fix typo in the link Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
Eric Anholt	7afa26d4e3	nir: Add lowering for nir_op_bitfield_reverse. This is basically the same as the GLSL lowering path. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
Eric Anholt	6e1597c2d9	nir: Add an ALU lowering pass for mul_high. This is based on the glsl/lower_instructions.cpp implementation, but should be much more readable. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
Eric Anholt	6a0db5f08f	nir: Add lowering for find_lsb. There is a fairly simple relation to turn this into ufind_msb. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
Eric Anholt	d4c7c3c225	nir: Add lowering for ifind_msb to ufind_msb. ufind_msb is easily expressed in terms of clz, and we can reduce ifind_msb to that. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
Eric Anholt	af88acf4c4	nir: Add lowering from ibitfield_extract/ubitfield_extract to shifts. V3D doesn't have opcodes for ibfe/ubfe, so we need to lower similarly to glsl/lower_instructions.cpp. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
Eric Anholt	74618ccbca	nir: Add lowering for bitfieldInsert without using bfi. If you don't have HW to do bfi, then lowering bitfieldInsert to bfi makes things harder than keeping the "bits" argument around. This still uses bfm, but I've added the obvious lowering of bfm if you need it. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-06 13:44:28 -07:00
Eric Engestrom	735b104707	docs: add note about moving to libwayland-egl in 18.2.0 Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Daniel Stone <daniels@collabora.com> Cc: Andres Gomez <agomez@igalia.com> Cc: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-06 12:12:03 -07:00
Eric Engestrom	b9361c9df0	egl: remove wayland-egl now that we're using libwayland-egl Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Daniel Stone <daniels@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-06 12:12:01 -07:00
Eric Engestrom	1db4ec0546	egl: rewire the build systems to use libwayland-egl Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Daniel Stone <daniels@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-06 12:11:57 -07:00
zhaowei yuan	67f7a16b59	glsl: Take 'double' as reserved after GLSL ES 1.0 GLSL ES 1.0.17 specifies that "double" is a reserved keyword. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106823 Signed-off-by: zhaowei yuan <zhaowei.yuan@samsung.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-05 23:39:25 -07:00
Marek Olšák	17a42062cc	r300g/swtcl: make pipe_context uploaders use malloc'd memory as before Discovered by Roland Scheidegger. The resource_create code uses GPU memory for PIPE_BIND_CUSTOM, but malloc'd memory otherwise. Vertex and index buffers should use malloc'd memory. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>	2018-06-05 22:52:08 -04:00
Jason Ekstrand	01ad2067bb	intel/eu: Use a struct copy instead of a memcpy The memcpy had the wrong size and this was causing crashes on 32-bit builds of the driver. Fixes: `6a9525bf67` "intel/eu: Switch to a logical state stack" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106830 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-05 15:51:01 -07:00
Philip Rebohle	cc21e96d5f	radv: Use correct color format for fast clears Using the image format is incorrect when the view has a different format than the image. Instead, the view format needs to be used. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> CC: 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106687	2018-06-05 23:51:03 +02:00
Eric Anholt	2b1b2cbf61	v3d: Be more explicit about include directory from our generated code. You'd need src/broadcom/cle/ in the -I previously, for srcdir != builddir. nir was fine at that, but automake didn't have it. Bugzilla: https://github.com/anholt/mesa/issues/104	2018-06-05 12:44:49 -07:00
Bas Nieuwenhuizen	2a10fd902d	radv: Do not hardcode fast clear formats. except for the odd one out. This should support many more formats. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-05 20:53:21 +02:00
Scott D Phillips	6fb22114a0	intel/tools: add intel_sanitize_gpu to EXTRA_DIST Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106778 Fixes: `cc41603d6d` ("intel/tools: new intel_sanitize_gpu tool") Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-05 10:32:35 -07:00
Scott D Phillips	08535dd886	util/tests/vma: Fix warning c++11-narrowing Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106801 Fixes: `943fecc569` ("util: Add a randomized test for the virtual memory allocator") Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-05 10:32:07 -07:00
Scott D Phillips	4b123fb74b	util: tests: vma test depends on C++11 support Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106776 Fixes: `943fecc569` ("util: Add a randomized test for the virtual memory allocator") Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-05 10:13:14 -07:00
Michel Dänzer	6b8f3724c8	glx: Fix number of property values to read in glXImportContextEXT We were trying to read twice as many as the X server sent us, which upset XCB: [xcb] Too much data requested from _XRead [xcb] This is most likely caused by a broken X extension library [xcb] Aborting, sorry about that. glx-free-context: ../../src/xcb_io.c:732: _XRead: Assertion `!xcb_xlib_too_much_data_requested' failed. Fixing this takes 3 GLX piglit tests from crash to pass. Fixes: `0852162950` "glx: Be more tolerant in glXImportContext (v2)" Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-06-05 18:56:43 +02:00
Eric Engestrom	c765c39ea7	configure: radv depends on mako Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106784 Fixes: `17201a2eb0` "radv: port to using updated anv entrypoint/extension generator." Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-05 16:32:48 +01:00
Eric Engestrom	5bdc38f356	travis: use correct form for array options I'd like to eventually drop support for the confusing "an array of a single empty string is meant to be interpreted as an empty array", so let's start by not using it anymore. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-05 16:31:23 +01:00
Lionel Landwerlin	9aedee64ac	anv: intel: add softpin flag on imported BOs Looks like we forgot to update this bit of the driver for softpin. Fixes: `4affeba1e9` ("anv: Soft-pin everything else") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-05 14:18:35 +01:00
Eric Engestrom	66c61797ad	autotools: add missing android file to package Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106779 Fixes: `ff904978a1` "gallium/util: Android backtrace support" Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-05 10:39:04 +01:00
Eric Engestrom	7c4423cce9	meson: fix platforms check for `-D egl=true` Fixes: `0ed6a87a10` "meson: fix platforms=[]" Reported-by: Christoph Haag <haagch@frickel.club> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-05 10:38:57 +01:00
Mathias Fröhlich	1ac4439d62	mesa: Make sure that imm draws are flushed before other draws execute. The recent patch mesa: Remove FLUSH_VERTICES from VAO state changes. Pending draw calls on immediate mode or display list calls do not depend on changes of the VAO state. So, remove calls to FLUSH_VERTICES and flag _NEW_ARRAY as appropriate. uncovered a problem that non immediate mode draw calls do only flush outstanding immediate mode draws if FLUSH_UPDATE_CURRENT is set in ctx->Driver.NeedFlush. In that case, due to the sequence of _mesa_set_draw_vao commands we could end up with the VAO from the FLUSH_VERTICES call set into gl_context::Array._DrawVAO when the array draw is executed. So the change pulls FLUSH_CURRENT out of _mesa_validate_* calls into the array draw calls being validated. The change introduces a new macro FLUSH_FOR_DRAW beside FLUSH_VERTICES and FLUSH_CURRENT that flushes on changed current attributes as well as on outstanding immediate mode draw calls. Use FLUSH_FOR_DRAW in the non immediate mode draw code paths. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106594 Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-06-05 07:05:24 +02:00
gurchetansingh@chromium.org	a7b74a77fa	virgl: use bits in caps set v2 Let's add another field to caps v2, that can help report boolean values. Suggested-by: Gert Wollny <gert.wollny@collabora.com> Suggested-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-05 14:29:00 +10:00
gurchetansingh@chromium.org	6ce94a50bb	virgl: add shader offset alignment to to v2 caps struct This is the SSBO analogue to fe0647. User supplied data must be a multiple of GL_SHADER_STORAGE_BUFFER_OFFSET_ALIGNMENT. This fixes 44 GLES31 tests on airlied@'s GLES31 sketch branches with Nvidia hardware, but this patch standalone can be applied to master. The alignment restriction on Nvidia is 32, hence the default value. Example tests: dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.0 dEQP-GLES31.functional.ssbo.layout.multi_basic_types.single_buffer.std430 v2: Move to a better place in case statement v3: Rebase Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-05 14:28:49 +10:00
Kenneth Graunke	1c9053d076	i965: Prepare batchbuffer module for softpin support. If EXEC_OBJECT_PINNED is set, we don't want to emit any relocations. We simply want to add the BO to the validation list, and possibly mark it as writeable. The new brw_use_pinned_bo() interface does just that. To avoid having to make every caller consider both the relocation and softpin cases, we make emit_reloc() call brw_use_pinned_bo() when given a softpinned buffer. We also can't grow buffers that are softpinned - the mechanism places a larger BO at the same offset as the original, which requires moving BOs around in the VMA. With softpin, we only allocate enough VMA for the original size of the BO. v2: Assert that BOs aren't pinned if the kernel says we should move them (feedback from Chris Wilson) Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-04 18:38:41 -07:00
Kenneth Graunke	01058a5522	i965: Add virtual memory allocator infrastructure to brw_bufmgr. This introduces a new fast virtual memory allocator integrated with our BO cache bucketing. For larger objects, it falls back to the simple free-list allocator (util_vma). This puts the allocators in place but doesn't enable softpin yet. v2: (feedback from Chris Wilson) - Check (bo->kflags & EXEC_OBJECT_PINNED) instead of a global flag - Avoid vma_free(0ull) on the err_free path. - Only enable if the kernel says we have full PPGTT support - Make bucketing allocators more resistant to failing to grow arrays (feedback from Scott Phillips) - Don't use node after popping it from the list. - Avoid undefined behavior in canonicalization by reusing new helper - Comment updates (feedback from myself) - Avoid __vma_alloc vs. vma_alloc by making a zero_high_bits helper to return a non-canonical address with the high bits zeroed. - Don't shadow loop variable 'i' when destroying things (ugly; worked) v3: - Replace zero_high_bits with new common gen_48b_address helper. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-04 18:38:41 -07:00
Jason Ekstrand	e99b32d4d6	i965: Disable internal CCS for shadows of multi-sampled windows If window system supports Y-tiling but not CCS_E, we currently create an internal CCS for any window system buffers and then resolve right before handing it off to X or Wayland. In the case of the single-sampled shadow of a multi-sampled window system buffer, this is pointless because the only thing we do with it is use it as a MSAA resolve target so we do MSAA resolve -> CCS resolve -> hand to the window system. Instead, just disable CCS for the shadow and then the MSAA resolve will write uncompressed directly into it. If the window system supports CCS_E, we will still use CCS_E, we just won't do internal CCS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 15:27:29 -07:00
Jason Ekstrand	6ab9fe7673	i965/miptree: Rename a parameter to create_for_dri_image Instead of having it be a general "is this a winsys image" boolean, make it more specific to the actual purpose. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 15:27:16 -07:00
Jason Ekstrand	6a9525bf67	intel/eu: Switch to a logical state stack Instead of the state stack that's based on copying a dummy instruction around, we start using a logical stack of brw_insn_states. This uses a bit less memory and is way less conceptually bogus. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 14:03:03 -07:00
Jason Ekstrand	db9675f5a4	intel/eu: Set flag [sub]register number differently for 3src Prior to gen8, the flag [sub]register number is in a different spot on 3src instructions than on other instructions. Starting with Broadwell, they made it consistent. This commit fixes bugs that occur when a conditional modifier gets propagated into a 3src instruction such as a MAD. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 14:03:03 -07:00
Jason Ekstrand	2d20303e18	intel/eu: Copy fields manually in brw_next_insn Instead of doing a memcpy, this moves us to start with a blank instruction (memset to zero) and copy the fields over one at a time. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 14:03:03 -07:00
Jason Ekstrand	381fac2740	intel/eu: Add some brw_get_default_ helpers This is much cleaner than everything that wants a default value poking at the bits of p->current directly. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-06-04 14:03:03 -07:00
Jose Fonseca	db38c3b4ba	trace: Fix parsing of recent traces. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-06-04 21:06:31 +01:00
Jose Fonseca	8652ff7cdf	trace: Fix trace_context_transfer_unmap methods. The emitted buffer_subdata/texture_subdata call didn't match the respective signatures. v2: Actually emit buffer_subdata call. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-06-04 21:06:31 +01:00
Nicolai Hähnle	a9a7993441	amd/common: use the dimension-aware image intrinsics on LLVM 7+ Requires LLVM trunk r329166. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-06-04 21:34:59 +02:00
Kenneth Graunke	b3ba47c592	i965: Fix batch-last mode to properly swap BOs. On pre-4.13 kernels, which don't support I915_EXEC_BATCH_FIRST, we move the validation list entry to the end...but incorrectly left the exec_bo array alone, causing a mismatch where exec_bos[0] no longer corresponded with validation_list[0] (and similarly for the last entry). One example of resulting breakage is that we'd update bo->gtt_offset based on the wrong buffer. This wreaked total havoc when trying to use softpin, and likely caused unnecessary relocations in the normal case. Fixes: `29ba502a4e` (i965: Use I915_EXEC_BATCH_FIRST when available.) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-06-04 09:43:09 -07:00
Samuel Pitoiset	06d3c65098	radv: fix a GPU hang when MRTs are sparse When the i-th target format is set, all previous target formats must be non-zero to avoid hangs. In other words, without this if a fragment shader exports mrt0, mrt2 and mrt3, the GPU hangs because the target format of mrt1 is zero. This fixes DXVK GPU hangs with "Seven: The Days Long Gone", "GTA V" and probably more games. Cc: "18.0" "18.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-04 14:01:33 +02:00
Bas Nieuwenhuizen	2835b6baf4	radv: Don't pass a TESS_EVAL shader when tesselation is not enabled. Otherwise on pre-GFX9, if the constant layout allows both TESS_EVAL and GEOMETRY shaders, but the PIPELINE has only GEOMETRY, it would return the GEOMETRY shader for the TESS_EVAL shader. This would cause the flush_constants code to emit the GEOMETRY constants to the TESS_EVAL registers and then conclude that it did not need to set the GEOMETRY shader registers. Fixes: `dfff9fb6f8` "radv: Handle GFX9 merged shaders in radv_flush_constants()" CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-04 13:46:24 +02:00
Samuel Pitoiset	e3e929f8c3	nir: implement the GLSL equivalent of if simplication in nir_opt_if This pass turns: if (cond) { } else { do_work(); } into: if (!cond) { do_work(); } else { } Here's the vkpipeline-db stats (from affected shaders) on Polaris10: Totals from affected shaders: SGPRS: 17272 -> 17296 (0.14 %) VGPRS: 18712 -> 18740 (0.15 %) Spilled SGPRs: 1179 -> 1142 (-3.14 %) Code Size: 1503364 -> 1515176 (0.79 %) bytes Max Waves: 916 -> 911 (-0.55 %) This pass only affects Serious Sam 2017 (Vulkan) on my side. The stats are not really good for now. Some shaders look quite dumb but this will be improved with further NIR passes, like ifs combination. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-04 12:41:10 +02:00
Samuel Pitoiset	e44f90eccf	nir: make is_comparison() a non-static helper function Rename and change the prototype for consistency regarding nir_tex_instr_is_query(). This function will be used in the following patch. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-06-04 12:41:08 +02:00
Dave Airlie	67eccd6aa2	nir: use num_components wrappers in print/validate. These wrappers were introduced, so start using them. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-04 05:58:42 +10:00
Juan A. Suarez Romero	bad7332f7c	doc: update calendar, add news and link release notes for 18.0.5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-06-03 10:19:32 +00:00
Juan A. Suarez Romero	41c01d79ee	docs: add sha256 checksums for 18.0.5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `aba161e63a`)	2018-06-03 10:12:02 +00:00
Juan A. Suarez Romero	a89cb6711b	docs: add release notes for 18.0.5 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `ca0037aaef`)	2018-06-03 10:12:00 +00:00
Jose Fonseca	8841c2cda5	scons: Fix MinGW cross compilation with LLVM 5.0. LLVM 5.0 requires additional Win32 libraries, and MinGW with pthreads. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-06-02 09:58:50 +01:00
Jason Ekstrand	64e619674e	anv: Don't even bother processing relocs if we have softpin Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 16:34:26 -07:00
Jason Ekstrand	c7be17c8d3	anv: Refactor reloc handling in execbuf_add_bo This just separates the reloc list vs. BO set cases and lets us avoid an allocation if relocs->deps->entries == 0. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 16:34:25 -07:00
Jason Ekstrand	7105b7890a	anv: Assert that the kernel leaves pinned BO addresses alone Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 16:33:07 -07:00
Scott D Phillips	4affeba1e9	anv: Soft-pin everything else v2 (Jason Ekstrand): - Break up Scott's mega-patch Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 14:27:13 -07:00
Scott D Phillips	f3dbe0419d	anv: Soft-pin batch buffers Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 14:27:12 -07:00
Jason Ekstrand	a0b133286a	anv/batch_chain: Simplify secondary batch return chaining Previously, we did this weird thing where we left space and an empty relocation for use in a hypothetical MI_BATCH_BUFFER_START that would be added to the secondary later. Then, when it came time to chain it into the primary, we would back that out and emit an MI_BATCH_BUFFER_START. This worked well but it was always a bit hacky, fragile and ugly. This commit instead adds a helper for rewriting the MI_BATCH_BUFFER_START at the end of an anv_batch_bo and we use that helper for both batch bo list cloning and handling returns from secondaries. The new helper doesn't actually modify the batch in any way but instead just adjusts the relocation as needed. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 14:27:12 -07:00
Jason Ekstrand	4f20c665b4	anv/batch_chain: Call batch_bo_finish at the end of end_batch_buffer The only reason we were calling it in the middle was that one of the cases for figuring out the secondary command buffer execution type wanted batch_bo->length which gets set by batch_bo_finish. It's easy enough to recalculate and now batch_bo_finish is called in a sensible location. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 14:27:11 -07:00
Jason Ekstrand	e7d0378bd9	anv: Soft-pin client-allocated memory Now that we've done all that refactoring, addresses are now being directly written into surface states by ISL and BLORP whenever a BO is pinned so there's really nothing to do besides enable it. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 14:27:11 -07:00
Jason Ekstrand	caf41c78ca	anv/allocator: Support softpin in the BO cache Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 14:27:11 -07:00
Jason Ekstrand	b0d50247a7	anv/allocator: Set the BO flags in bo_cache_alloc/import It's safer to set them there because we have the opportunity to properly handle combining flags if a BO is imported more than once. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-06-01 14:27:10 -07:00
Scott D Phillips	27cc68d9e9	anv: For pinned BOs, skip relocations, but track bo usage References to pinned BOs won't need to be relocated at a later point, so just write the final value of the reference into the bo directly. Add a `set` to the relocation lists for tracking dependencies that were previously tracked by relocations. When a batch is executed, we add the referenced pinned BOs to the exec list. v2: - visit bos from the dependency set in a deterministic order (Jason) v3: - compar => compare, drat (Jason) - Reworded commit message, provided by (Jordan) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-01 14:27:10 -07:00
Scott D Phillips	c7db0ed4e9	anv: Use a separate pool for binding tables when soft pinning Soft pinning lets us satisfy the binding table address requirements without using both sides of a growing state_pool. If you do use both sides of a state pool, then you need to read the state pool's center_bo_offset (with the device mutex held) to know the final offset of relocations that target the state pool bo. By having a separate pool for binding tables that only grows in the forward direction, the center_bo_offset is always 0 and relocations don't need an update pass to adjust relocations with the mutex held. v2: - don't introduce a separate state flag for separate binding tables (Jason) - replace bo and map accessors with a single binding_table_pool accessor (Jason) v3: - assert bt_block->offset >= 0 for the separate binding table (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-01 14:27:10 -07:00
Scott D Phillips	e662bdb820	anv: Soft-pin state pools The state_pools reserve virtual address space of the full BLOCK_POOL_MEMFD_SIZE, but maintain the current behavior of growing from the middle. v2: - rename block_pool::offset to block_pool::start_address (Jason) - assign state pool start_address statically (Jason) v3: - remove unnecessary bo_flags tampering for the dynamic pool (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-06-01 13:49:22 -07:00
Ian Romanick	f00fcfb7a2	nir: Lower !f2b(x) to x == 0.0 Some trivial help now, but it also prevents ~40 regressions caused by Samuel's "nir: implement the GLSL equivalent of if simplication in nir_opt_if" patch. All Gen4+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14369557 -> 14369555 (<.01%) instructions in affected programs: 442 -> 440 (-0.45%) helped: 2 HURT: 0 total cycles in shared programs: 532425772 -> 532425743 (<.01%) cycles in affected programs: 6086 -> 6057 (-0.48%) helped: 2 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-06-01 10:14:53 -07:00
Ian Romanick	619c51722b	nir: Add some missing "optimization undo" patterns `d8d18516b0` and `03fb13f646` added some patterns to undo conversions like (('ior', ('flt', a, b), ('flt', a, c)), ('flt', a, ('fmax', b, c))) If further optimizations cause some of the operands to either be the same or be constants, undoing the transformation can lead to further savings. I don't know why these patterns were not added in those patches. I did not check to see which specific patterns actually helped. I just added all of them for symmetry. This prevents some loop unrolling regressions in Plane Shift caused by Samuel's "nir: implement the GLSL equivalent of if simplication in nir_opt_if" patch. Skylake and Broadwell had similar results. (Skylake shown) total instructions in shared programs: 14369768 -> 14369557 (<.01%) instructions in affected programs: 44076 -> 43865 (-0.48%) helped: 141 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 1.50 x̃: 1 helped stats (rel) min: 0.07% max: 1.52% x̄: 0.66% x̃: 0.60% 95% mean confidence interval for instructions value: -1.67 -1.32 95% mean confidence interval for instructions %-change: -0.72% -0.59% Instructions are helped. total cycles in shared programs: 532430629 -> 532425772 (<.01%) cycles in affected programs: 1170832 -> 1165975 (-0.41%) helped: 101 HURT: 5 helped stats (abs) min: 1 max: 160 x̄: 48.54 x̃: 32 helped stats (rel) min: <.01% max: 8.49% x̄: 2.76% x̃: 2.03% HURT stats (abs) min: 2 max: 22 x̄: 9.20 x̃: 4 HURT stats (rel) min: <.01% max: 0.05% x̄: 0.02% x̃: <.01% 95% mean confidence interval for cycles value: -53.64 -38.00 95% mean confidence interval for cycles %-change: -3.06% -2.20% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-01 10:13:16 -07:00
Eric Engestrom	57fbc2ac50	docs/meson: mention how to use array options Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-01 17:53:06 +01:00
Eric Engestrom	03a2e7b662	meson: drop unused empty string array element Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-01 17:53:06 +01:00
Eric Engestrom	0ed6a87a10	meson: fix platforms=[] Fixes: `5608d0a2ce` ("meson: use array type options") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-01 17:53:06 +01:00
Eric Engestrom	a92cdcd598	meson: fix vulkan-drivers=[] Fixes: `5608d0a2ce` ("meson: use array type options") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-01 17:53:06 +01:00
Eric Engestrom	a425db4d7d	meson: fix gallium-drivers=[] Fixes: `5608d0a2ce` ("meson: use array type options") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-01 17:53:06 +01:00
Eric Engestrom	393abd6a57	meson: fix dri-drivers=[] Fixes: `5608d0a2ce` ("meson: use array type options") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-01 17:53:06 +01:00
Eric Engestrom	8faa22c146	REVIEWERS: add root meson.build to the Meson reviewers group Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-06-01 17:53:06 +01:00
Juan A. Suarez Romero	cbe4baed1f	glsl: Add ir_binop_vector_extract in NIR Implement ir_binop_vector_extract using NIR operations. Based on SPIR-V to NIR approach. This fixes: dEQP-GLES3.functional.shaders.indexing.moredynamic.with_value_from_indexing_expression_fragment Piglit's glsl-fs-vec4-indexing-8.shader_test CC: mesa-stable@lists.freedesktop.org Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Iago Toral <itoral@igalia.com>	2018-06-01 18:09:22 +02:00
Dylan Baker	4ad8e2ac82	doc: update calendar, add news and link release notes for 18.1.1	2018-06-01 08:39:17 -07:00
Dylan Baker	55ee53ea19	docs/relnotes: Add sha256 sums for mesa 18.1.1	2018-06-01 08:39:17 -07:00
Dylan Baker	423c4fe954	docs: Add release notes for 18.1.1	2018-06-01 08:39:17 -07:00
Plamena Manolova	939312702e	i965: Add ARB_fragment_shader_interlock support. Adds support for ARB_fragment_shader_interlock. We achieve the interlock and fragment ordering by issuing a memory fence via sendc. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-06-01 16:36:39 +01:00
Plamena Manolova	60e843c4d5	mesa: Add GL/GLSL plumbing for ARB_fragment_shader_interlock. This extension provides new GLSL built-in functions beginInvocationInterlockARB() and endInvocationInterlockARB() that delimit a critical section of fragment shader code. For pairs of shader invocations with "overlapping" coverage in a given pixel, the OpenGL implementation will guarantee that the critical section of the fragment shader will be executed for only one fragment at a time. Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-06-01 16:36:36 +01:00
Martin Pelikán	53719f818c	compiler/spirv: reject invalid shader code properly After `bebe3d626e`, b->fail_jump is prepared after vtn_create_builder which can longjmp(3) to it through its vtx_assert()s. This corrupts the stack and creates confusing core dumps, so we need to avoid it. While there, I decided to print the offending values for debuggability. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-06-01 08:09:35 -07:00
Juan A. Suarez Romero	360bfb619f	docs: change release manager for 18.1 Dylan will replace Emil as the release manager for 18.1.x series. CC: Emil Velikov <emil.l.velikov@gmail.com> CC: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-06-01 15:24:02 +02:00
Gert Wollny	ef3a6e3d98	virgl: Always assume that ORIGIN_UPPER_LEFT and PIXEL_CENTER* are supported The driver must support at least one of PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT and one of PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER otherwise glsl_to_tgsi will fire an assert. ORIGIN_UPPER_LEFT is the default convention, and is supported by all mesa drivers, hence it seems reasonable to always report the caps to be enabled. On gles ORIGIN_LOWER_LEFT is generally not supported, so we rely on the caps reported by the host that depend on whether we run on a GL or an EGL host. For PIXEL_CENTER it is completely host driver dependent on what is supported, and since we do not report the actual host driver capabilities it is best to mark both as supported, this is how it works for a GL host too. Fixes: dEQP-GLES3.functional.shaders.builtin_variable.fragcoord_xyz dEQP-GLES3.functional.shaders.metamorphic.bubblesort_flag.variant_1 dEQP-GLES3.functional.shaders.metamorphic.bubblesort_flag.variant_2 Reviewed-by: Gurchetan Singh <gurcetansingh@chromium.org> Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-06-01 12:04:21 +01:00
Alex Smith	01a2414045	radeonsi: Fix crash on shaders using MSAA image load/store The value returned by tgsi_util_get_texture_coord_dim() does not account for the sample index. This means image_fetch_coords() will not fetch it, leading to a null deref in ac_build_image_opcode() which expects it to be present (the return value of ac_num_coords() does include the sample index). Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "18.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-01 08:53:38 +01:00
Alex Smith	dfff9fb6f8	radv: Handle GFX9 merged shaders in radv_flush_constants() This was not previously handled correctly. For example, push_constant_stages might only contain MESA_SHADER_VERTEX because only that stage was changed by CmdPushConstants or CmdBindDescriptorSets. In that case, if vertex has been merged with tess control, then the push constant address wouldn't be updated since pipeline->shaders[MESA_SHADER_VERTEX] would be NULL. Use radv_get_shader() instead of getting the shader directly so that we get the right shader if merged. Also, skip emitting the address redundantly - if two merged stages are set in push_constant_stages this change would have made the address get emitted twice. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "18.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-01 08:53:34 +01:00
Alex Smith	7ca0167ae9	radv: Consolidate GFX9 merged shader lookup logic This was being handled in a few different places, consolidate it into a single radv_get_shader() function. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "18.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-01 08:53:31 +01:00
Alex Smith	0fa51bfdbe	radv: Set active_stages the same whether or not shaders were cached With GFX9 merged shaders, active_stages would be set to the original stages specified if shaders were not cached, but to the stages still present after merging if they were. Be consistent and use the original stages. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "18.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-01 08:53:01 +01:00
Marek Olšák	9e61147ef6	st/mesa: relax requirements for ARB_ES3_compatibility Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106748 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-01 01:04:17 -04:00
Scott D Phillips	29a139b308	anv/blorp: Write relocated values into surface states v2 (Jason Ekstrand): - Split the blorp bit into its own patch and re-order a bit - Use anv_address helpers Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-31 16:51:47 -07:00
Jason Ekstrand	bf34ef16ac	anv: Use an address for each anv_image plane This is better than having BO and offset fields. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Jason Ekstrand	1f2328c3b7	anv/cmd_buffer: Rework surface relocation helpers This commit renames add_surface_state_reloc to add_surface_reloc and makes it take an address. We also rename add_image_view_relocs to add_surface_state_relocs because it takes an anv_surface_state and doesn't really care about the image view anymore. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Jason Ekstrand	f270a09737	anv: Use an anv_address in anv_buffer Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Jason Ekstrand	8a8bd39d5e	anv/cmd_buffer: Use anv_address for handling indirect parameters Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Jason Ekstrand	1029458ee3	anv: Use an anv_address in anv_buffer_view Instead of storing a BO and offset separately, use an anv_address. This changes anv_fill_buffer_surface_state to use anv_address and we now call anv_address_physical and pass that into ISL. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Jason Ekstrand	de1c5c1b50	anv: Use full anv_addresses in anv_surface_state This refactors surface state filling to work entirely in terms of anv_addresses instead of offsets. This should make things simpler for when we go to soft-pin image buffers. Among other things, add_image_view_relocs now only cares about the addresses in the surface state and doesn't really need the image view anymore. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Jason Ekstrand	94081ffc80	anv: Add some anv_address helpers Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Scott D Phillips	aaea46242d	anv: Add vma_heap allocators in anv_device These will be used to assign virtual addresses to soft pinned buffers in a later patch. Two allocators are added for separate 'low' and 'high' virtual memory areas. Another alternative would have been to add a double-sided allocator, which wasn't done here just because it didn't appear to give any code complexity advantages. v2 (Scott Phillips): - rename has_exec_softpin to use_softpin (Jason) - Only remove bottom one page and top 4 GiB from virt (Jason) - refer to comment in anv_allocator about state address + size overflowing 48 bits (Jason) - Mention hi/lo allocators vs double-sided allocator in commit message (Chris) - assign state pool memory ranges statically (Jason) v3 (Jason Ekstrand): - Use (LOW\|HIGH)_HEAP_(MIN\|MAX)_ADDRESS rather than (1 << 31) for determining which heap to use in anv_vma_free - Only return de-canonicalized addresses to the heap Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:46 -07:00
Jason Ekstrand	6e4672f881	intel/common: Add an address de-canonicalization helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-31 16:51:45 -07:00
Scott D Phillips	943fecc569	util: Add a randomized test for the virtual memory allocator The test pseudo-randomly makes allocations and deallocations with the virtual memory allocator and checks that the results are consistent. Specifically, we test that: * no result from the allocator overlaps an already allocated range * allocated memory fulfills the stated alignment requirement * a failed result from the allocator could not have been fulfilled * memory freed to the allocator can later be allocated again v2: - fix if() in test() to actually run fill() v3: - add c++11 build flag (Jason) - test the full 64-bit range (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-31 16:51:35 -07:00
Jason Ekstrand	f19ad5d31f	util: Add a virtual memory allocator This is a simple linear-walk first-fit allocator roughly based on the allocator in the radeon winsys code. This allocator has two primary functional differences: 1) It cleanly returns 0 on allocation failure 2) It allocates addresses top-down instead of bottom-up. The second one is needed for Intel because high addresses (with bit 47 set) need to be canonicalized in order to work properly. If we allocate bottom-up, then high addresses will be very rare (if they ever happen). We'd rather always have high addresses so that the canonicalization code gets better testing. v2: - [scott-ph] remove _heap_validate() if NDEBUG is defined (Jordan) Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Tested-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-31 16:17:35 -07:00
Bas Nieuwenhuizen	b9fb2c266a	radv: Add startup debug option. This adds a RADV_DEBUG=startup option to dump more info about instance creation and device enumeration. A common question end users have is why the driver is not loading for them, and this has two common reasons: 1) They did not install the driver. 2) AMDGPU is not used for the card in the kernel. This adds some info messages so we can easily get some useful output from end users. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-31 11:51:23 +02:00
Bas Nieuwenhuizen	38933c1151	radv: Add option to print errors even in optimized builds. Errors are not that common of a case so we can eat a slight perf hit in having to call a function and do a runtime check. In turn this makes debugging random errors happening for end users easier, because they don't have to have a debug build on hand. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-31 11:51:23 +02:00
Bas Nieuwenhuizen	729f7373de	radv: Make the sem_info allocate/free functions static. They are only used in 1 file. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-31 11:51:23 +02:00
Samuel Pitoiset	70f9e2589e	nir: optimize iand(ieq(a, 0), ieq(b, 0)) to ieq(ior(a, b), 0) Totals from affected shaders: SGPRS: 80 -> 80 (0.00 %) VGPRS: 48 -> 48 (0.00 %) Code Size: 2120 -> 2096 (-1.13 %) bytes Max Waves: 16 -> 16 (0.00 %) Only two Rise of Tomb Raider shaders are affected on my side. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-31 10:57:16 +02:00
Tapani Pälli	c983c6abaf	mesa: don't call Driver.TexEnv with invalid arguments Patch skips useless and possibly dangerous calls down to the driver in case invalid arguments were given. I noticed this would be happening with demo of Darwinia game. AFAIK this does not fix anything but makes this path safer and more like how other API functions are implemented. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-31 09:24:17 +03:00
Vinson Lee	d511bba2f9	v3d: Fix automake linking error. CXXLD gallium_dri.la ../../../../src/broadcom/.libs/libbroadcom.a(clif_dump.o): In function `clif_dump_packet': src/broadcom/clif/clif_dump.c:87: undefined reference to `v3d33_clif_dump_packet' src/broadcom/clif/clif_dump.c:85: undefined reference to `v3d41_clif_dump_packet' ../../../../src/broadcom/.libs/libbroadcom.a(clif_dump.o): In function `clif_process_worklist': src/broadcom/clif/clif_dump.c:140: undefined reference to `v3d41_clif_dump_gl_shader_state_record' src/broadcom/clif/clif_dump.c:144: undefined reference to `v3d33_clif_dump_gl_shader_state_record' Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-30 11:55:09 -07:00
Jakob Bornecrantz	d6cee5a162	virgl: Update virgl_hw.h Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-05-30 17:07:26 +01:00
Dave Airlie	e2b6d830b2	virgl: add ARB_transform_feedback_overflow_query support Reviewed-by: Jakob Bornecrantz <jakob@collabora.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-05-30 17:02:55 +01:00
Dave Airlie	22b072c194	virgl: add polygon offset clamp Reviewed-by: Jakob Bornecrantz <jakob@collabora.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-05-30 17:02:51 +01:00
Dave Airlie	49204ff8ad	virgl: add derivative control support Reviewed-by: Jakob Bornecrantz <jakob@collabora.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-05-30 17:02:47 +01:00
Dave Airlie	46fe349af2	virgl: add ARB_conditional_render_inverted support Reviewed-by: Jakob Bornecrantz <jakob@collabora.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-05-30 17:02:40 +01:00
Dave Airlie	f9eb7e8b76	virgl: update caps bitset to latest version. This makes this use all 32 bits, so future sets need to be defined in a new struct. Reviewed-by: Jakob Bornecrantz <jakob@collabora.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-05-30 17:02:19 +01:00
Timothy Arceri	e8b368ad1c	nir: add unsigned comparison simplifications This avoids loop unrolling regressions in Wolfenstein II on DXVK with an upcoming optimisation series from Samuel. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-30 22:48:37 +10:00
Bas Nieuwenhuizen	c2799574eb	radv: Only expose subgroup shuffles on VI+. The current implementation depends on bpermute, which is VI+. Fixes: `f2c6a55061` "radv: enable subgroup capabilities" Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-30 13:49:46 +02:00
Samuel Pitoiset	02c7916298	radv: fix emitting descriptor pointers with LLVM < 7 This was terribly wrong, I forced use of 32-bit pointers when emitting shader descriptor pointers. This fixes GPU hangs with LLVM 5&6 because 32-bit pointers are only supported with LLVM 7. Fixes: `88d1ed0f81` ("radv: emit shader descriptor pointers consecutively") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-30 11:38:54 +02:00
Ilia Mirkin	04fff21c62	nv30: add a couple of missed shader caps Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-30 02:06:28 -04:00
Ilia Mirkin	30918b77ac	nv30: ensure that displayable formats are marked accordingly Fixes: `f7604d8af5` ("st/dri: only expose config formats that are display targets") Cc: "18.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-30 02:06:28 -04:00
Marek Olšák	858ac8942d	mesa: expose ARB_tessellation_shader in the compatibility profile Gallium drivers don't expose this yet due to: "st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	16ac832392	mesa: expose AMD_vertex_shader_layer in the compatibility profile This requires layered FBOs from GL 3.2. Gallium drivers don't expose this yet due to: "st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	518d8065ce	mesa: expose ARB_gpu_shader5 in the compatibility profile Gallium drivers don't expose this yet due to: "st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	dd93bc4f34	st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	34ea55d820	gallium: add PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	e453fc76e7	mesa: update fixed-func state constants for TCS, TES, GS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	27a9f27310	mesa: print Compatibility Profile in the version string Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	d3a87537dd	glsl: parse #version XXX compatibility Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-29 20:13:24 -04:00
Marek Olšák	a7d0c53ab8	st/mesa: fix assertion failures with GL_UNSIGNED_INT64_ARB (v2) Bindless texture handles can be passed via vertex attribs using this type. They use the double codepath, so don't use st_pipe_vertex_format. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-29 20:09:00 -04:00
Marek Olšák	a8e1413876	mesa: handle GL_UNSIGNED_INT64_ARB properly (v2) Bindless texture handles can be passed via vertex attribs using this type. This fixes a bunch of bindless piglit tests on radeonsi. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-29 20:09:00 -04:00
Timothy Arceri	1f7a3a1102	mesa: add display list support for glPatchParameter{i,fv}() This is required for tessellation shader Compat profile support. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-30 09:37:35 +10:00
Dave Airlie	d3ff478732	glx/drisw: make the shm/non-shm loader extensions separately. I disliked removing the const here, function tables are meant to be const just to avoid having to think about them, make a second table for the shm vs non-shm paths to use. Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:54 +10:00
Marc-André Lureau	33ce3aa512	drisw/glx: implement getImageShm Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:54 +10:00
Marc-André Lureau	17b27725fe	drisw: use getImageShm() if available Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:54 +10:00
Marc-André Lureau	9feaf33371	drisw: learn to query shmid handle type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:54 +10:00
Marc-André Lureau	bcd80be49a	drisw/glx: use XShm if possible Implements putImageShm from DRIswrastLoaderExtension. If XShm extension is not available, or fails, it will fallback on regular XPutImage(). Tested on Linux only with 16bpp and 32bpp visual. (airlied: tested on 24bpp as well) Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:54 +10:00
Marc-André Lureau	cf54bd5e83	drisw: use shared memory when possible If drisw_loader_funcs implements put_image_shm, allocates display target data with shared memory and display with put_image_shm(). Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:54 +10:00
Marc-André Lureau	63c427fa71	drisw: use putImageShm if available If the DRIswrastLoaderExtension implements putImageShm, bind it to drisw_loader_funcs. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:53 +10:00
Marc-André Lureau	de8085e649	dri: add putImageShm and getImageShm to swrastLoader Add new API to put and get an image using shared memory. Instead of only passing the data pointer, 3 arguments are given: the shmid, the data offset and the shmaddr. Bump interface version. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:53 +10:00
Dave Airlie	b7ac0779e0	gallium/winsys: rename DRM_API_HANDLE_* to WINSYS_HANDLE_* This just renames this as we want to add an shm handle which isn't really drm related. Originally by: Marc-André Lureau <marcandre.lureau@gmail.com> (airlied: I used this sed script instead) This was generated with: git grep -l 'DRM_API_' \| xargs sed -i 's/DRM_API_/WINSYS_/g' Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-30 09:11:53 +10:00
Marc-André Lureau	d2eaff33d0	gallium: move winsys handle to its own file. This will be used in the drisw interface later, which isn't drm specific. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-30 09:11:53 +10:00
Francisco Jerez	4bd2047dee	intel/fs: Add explicit last_rt flag to fb writes orthogonal to eot. When using multiple RT write messages to the same RT such as for dual-source blending or all RT writes in SIMD32, we have to set the "Last Render Target Select" bit on all write messages that target the last RT but only set EOT on the last RT write in the shader. Special-casing for dual-source blend works today because that is the only case which requires multiple RT write messages per RT. When we start doing SIMD32, this will become much more common so we add a dedicated bit for it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Francisco Jerez	d3cd6b7215	intel/fs: Replace the CINTERP opcode with a simple MOV The only reason it was its own opcode was so that we could detect it and adjust the source register based on the payload setup. Now that we're using the ATTR file for FS inputs, there's no point in having a magic opcode for this. v2 (Jason Ekstrand): - Break the bit which removes the CINTERP opcode into its own patch Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Francisco Jerez	39de901a96	intel/fs: Use the ATTR file for FS inputs This replaces the special magic opcodes which implicitly read inputs with explicit use of the ATTR file. v2 (Jason Ekstrand): - Break into multiple patches - Change the units of the FS ATTR to be in logical scalars Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Francisco Jerez	4bfa2ac2ea	intel/fs: Rename a local variable so it doesn't shadow component() v2 (Jason Ekstrand): - Break the refactor into its own patch Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Francisco Jerez	11c71f0e75	intel/eu: Remove brw_codegen::compressed_stack. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Jason Ekstrand	71a86d1fc6	intel/fs: Use groups for SIMD16 LINTERP on gen11+ This is better than compression control because it naturally extends to SIMD32. v2: - Push/pop instruction state around adjusted codegen (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Jason Ekstrand	a1a850cd34	intel/fs: Assert that the gen4-6 plane restrictions are followed The fall-back does not work correctly in SIMD16 mode and the register allocator should ensure that we never hit this case anyway. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Jan Vesely	ed834aefa2	travis: Add clover llvm-6.0 build v2: Don't force build using gcc-4.8 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-By: Aaron Watry <awatry@gmail.com>	2018-05-29 17:36:16 -04:00
Jan Vesely	41b878e1bd	clover: Cleanup compat code for llvm < 3.9 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-By: Aaron Watry <awatry@gmail.com>	2018-05-29 17:36:16 -04:00
Jan Vesely	d424be0fed	clover: Fix build after llvm r332881. v2: fix whitespace and indentation r332881 added an extra parameter to the emit function. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106619 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-By: Aaron Watry <awatry@gmail.com> Tested-By: Aaron Watry <awatry@gmail.com> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>	2018-05-29 17:36:16 -04:00
Chris Wilson	3ac5fbadfd	i965: Only emit VF cache invalidations when the high bits change Commit `92f01fc5f9` ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.") tried to only emit the VF invalidate if the high bits changed, but it accidentally always set need_invalidate to true, causing it to unconditionally emit the pipe control before every primitive. Fixes: `92f01fc5f9` ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106708 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-29 12:16:26 -07:00
Eric Engestrom	e4fe2fd3bb	vulkan: don't free uninitialised memory The modifiers array hasn't been initialised by then, much less with data that would need freeing. Move the label after the loop to fix this. Fixes: `c80c08e226` ("vulkan/wsi/x11: Add support for DRI3 v1.2") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-29 17:44:13 +01:00
Eric Engestrom	51a17e7fee	dri: replace two-way switch case with a table lookup Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> --- v2: rebased on top of `432df741e0` "dri_util: Add R10G10B10{A,X}2 translation between DRI and mesa_format."	2018-05-29 17:44:13 +01:00
Eric Engestrom	d3ca7bd452	dri: fix error value returned by driGLFormatToImageFormat() 0 is not a valid value for the __DRI_IMAGE_FORMAT_* enum. It is, however, the value of MESA_FORMAT_NONE, which two of the callers (i915 & i965) checked for. The other callers (that check for errors, ie. st/dri) already check for __DRI_IMAGE_FORMAT_NONE. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-29 17:44:13 +01:00
Eric Engestrom	1945231b48	egl/x11: fix build with DRI3 disabled Fixes: `473af0b541` "egl/x11: deduplicate depth-to-format logic" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Yogesh Marathe <yogesh.marathe@intel.com>	2018-05-29 17:01:21 +01:00
Emil Velikov	63b95fb291	meson: require shared glapi when using DRI based libGL Just like we do in the autotools build. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-29 16:56:19 +01:00
Emil Velikov	728d1da159	meson: remove unreachable with_glx == 'auto' check Cannot happen, thanks to the autodetection further up. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-29 16:31:46 +01:00
Thierry Reding	9e539012df	tegra: Treat resources with modifiers as scanout Resources created with modifiers are treated as scanout because there is no way for applications to specify the usage (though that capability may be useful to have in the future). Currently all the resources created by applications with modifiers are for scanout, so make sure they have bind flags set accordingly. This is necessary in order to properly export buffers for such resources so that they can be shared with scanout hardware. Tested-by: Daniel Kolesa <daniel@octaforge.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-05-29 16:48:37 +02:00
Thierry Reding	9603d81df0	tegra: Fix scanout resources without modifiers Resources created for scanout but without modifiers need to be treated as pitch-linear. This is because applications that don't use modifiers to create resources must be assumed to not understand modifiers and in turn won't be able to create a DRM framebuffer and pass along which modifiers were picked by the implementation. Tested-by: Daniel Kolesa <daniel@octaforge.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-05-29 16:48:34 +02:00
Thierry Reding	bd3e97e5aa	tegra: Remove usage of non-stable UAPI This code path is no longer required with framebuffer modifier support. Tested-by: Daniel Kolesa <daniel@octaforge.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-05-29 16:47:45 +02:00
Eric Engestrom	f736be86bb	docs: add favicon to the website favicon.png is just gears.png resized to 64x64, and favicon.ico is generated using this command, adapted from the ImageMagick example [1]: $ convert favicon.png -background black \ $ -clone 0 -resize 16x16 $ \ $ -clone 0 -resize 32x32 $ \ $ -clone 0 -resize 48x48 $ \ $ -clone 0 -resize 64x64 $ \ -delete 0 -alpha off -colors 256 favicon.ico We could edit every html page to add `<link rel="icon" href="favicon.ico" />`, but there's not much point as pretty much every browser will pick it up automatically if the file is named `favicon.ico` and is in the root folder. [1] http://www.imagemagick.org/Usage/thumbnails/#favicon Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-29 14:48:21 +01:00
Eric Engestrom	e6a1aca0b2	docs: add missing html closing tag Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-29 14:48:21 +01:00
Eric Engestrom	3b5376330f	docs: add missing html tag Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-29 14:48:21 +01:00
Karol Herbst	56792a0876	nir/print: fix printing of 8/16 bit constant variables v2 (Jose Maria Casanova Crespo <jmcasanova@igalia.com>): add float16 support Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-05-29 13:43:49 +02:00
Pierre Moreau	f0e80e123c	nv50/ir: Extend ImmediateValue::applyLog2 to 64-bit integers Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-29 13:37:45 +02:00
Pierre Moreau	03f592a164	util/u_math: Implement a logbase2 function for unsigned long v2 (Karol Herbst <kherbst@redhat.com>): * removed unneeded ll * ll -> ull Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-29 13:37:45 +02:00
Eric Engestrom	539aa604a0	docs: trivial typo fix Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-29 12:10:14 +01:00
Samuel Pitoiset	88d1ed0f81	radv: emit shader descriptor pointers consecutively This reduces the number of SET_SH_REG packets which are emitted for applications that use more than one descriptor set per stage. We should be able to emit more SET_SH_REG packets consecutively (like push constants and vertex buffers for the vertex stage), but this will be improved later. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-29 10:07:18 +02:00
Samuel Pitoiset	21baf33a94	radv: allow radv_emit_shader_pointer_head() to emit more pointers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-29 10:07:16 +02:00
Samuel Pitoiset	288fe7ec71	radv: split radv_emit_shader_pointer() This will allow emitting consecutive shader pointers to reduce the number of emitted SET_SH_REG packets, which is recommended. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-29 10:07:13 +02:00
Rhys Perry	57e721a456	gm107/ir: prevent WaW hazards in instruction scheduling Previously, findFirstUse() only considered reads "uses". This fixes that by making it check both an instruction's sources and definitions. It also shortens both findFirstUse() and findFirstDef() along the way. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-28 13:59:56 -04:00
Bas Nieuwenhuizen	a29bc043ae	radv: Implement VK_KHR_draw_indirect_count. Literally the same as the AMD ext. Passes indirect_draw_count CTS tests. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-28 12:08:26 +02:00
Bas Nieuwenhuizen	b0002e4e05	vulkan: Update header+vk.xml to 1.1.76 Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-05-28 12:08:20 +02:00
Bas Nieuwenhuizen	6914d5a2c0	radv: Implement alternate GFX9 scissor workaround. This improves dota2 performance for me by 11% when I force the GPU DPM level to low (otherwise dota2 is CPU limited for 4k on my threadripper), which should be a large part of the radv-amdvlk gap. (For me with that was radv 60.3 -> 66.6, while AMDVLK does about 68 fps) It looks like dota2 rendered the GUI with a bunch of draws with a SetScissors before almost each draw, causing a lot of pipeline stalls. I'm not really happy with the duplication of code, but overriding radeon_set_context_reg would also be messy since we have the pre-recorded pipelines and a bunch of si_cmd_buffer code, as well as some memory->context reg loads for which things would be more complicated. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-28 12:04:25 +02:00
Eric Anholt	3b6dfcf7ae	Revert "st/nir: use NIR for asm programs" This reverts commit `5c33e8c772`. It broke fixed function vertex programs on vc4 and v3d, and apparently caused trouble for radeonsi's NIR paths as well. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> https://bugs.freedesktop.org/show_bug.cgi?id=106673	2018-05-28 14:41:03 +10:00
Scott D Phillips	4714784dae	anv: move canonical_address calculation into a separate function A later patch will make use of this in other places. Also, remove dependency on undefined behavior of left-shifting a signed value. v2: - move function into a separate header (Chris) v3: (by Ken) Add new header to the various build systems. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-27 19:24:33 -07:00
Gert Wollny	1aec4a07d4	r600: Fix SSG when not all components are written Make sure only those components are written to that are specified in the write mask. Fixes: dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_float_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_float_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_float_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_float_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_float_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_float_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_vec3_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_vec3_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_vec3_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_vec3_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_vec3_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_vec3_fragment Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-05-28 02:57:46 +01:00
Gert Wollny	42cd2810aa	r600: Correct IDIV if DST and SRC use the same temporary In cases like IDIV TEMP[0].xy TEMP[0].xx TEMP[1].yy the result will be written to the same register that is also a source register. Since the components are evaluated one by one, this may result in overwriting the source value for a later operation. Work around this by adding another temporary to store the result if the destination temporary index is equal to one of the source temporary indices. Fixes: dEQP-GLES2.functional.shaders.operator.binary_operator.div.* Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-05-28 02:57:46 +01:00
Kenneth Graunke	58fb613a51	i965: Revert recent tiled memcpy changes. This reverts commit `79fe00efb4`. This reverts commit `f5e8b13f78`. This reverts commit `d21c086d81`. They broke the Android build and I'd rather not leave it broken for the long holiday weekend.	2018-05-26 16:25:50 -07:00
Scott D Phillips	79fe00efb4	i965/miptree: Use cpu tiling/detiling when mapping Rename the (un)map_gtt functions to (un)map_map (map by returning a map) and add new functions (un)map_tiled_memcpy that return a shadow buffer populated with the intel_tiled_memcpy functions. Tiling/detiling with the cpu will be the only way to handle Yf/Ys tiling, when support is added for those formats. v2: Compute extents properly in the x\|y-rounded-down case (Chris Wilson) v3: Add units to parameter names of tile_extents (Nanley Chery) Use _mesa_align_malloc for the shadow copy (Nanley) Continue using gtt maps on gen4 (Nanley) v4: Use streaming_load_memcpy when detiling v5: (edited by Ken) Move map_tiled_memcpy above map_movntdqa, so it takes precedence. Add intel_miptree_access_raw, needed after rebasing on commit `b499b85b0f`. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-25 21:35:50 -07:00
Chris Wilson	f5e8b13f78	i915: Fix streaming loads for intel_tiled_memcpy We stream from a tiled and aligned source into an unaligned user buffer, so we need to use _mm_storeu_si128. Fixes: `d21c086d81` (i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-25 21:35:50 -07:00
Marek Olšák	18c50498db	radeonsi: remove unused variable addr_vec trivial	2018-05-25 18:37:57 -04:00
Jason Ekstrand	ae514ca695	intel/blorp: Support blits and clears on surfaces with offsets For certain EGLImage cases, we represent a single slice or LOD of an image with a byte offset to a tile and X/Y intratile offsets to the given slice. Most of i965 is fine with this but it breaks blorp. This is a terrible way to represent slices of a surface in EGL and we should stop some day but that's a very scary and thorny path. This gets blorp to start working with those surfaces and fixes some dEQP EGL test bugs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-25 14:01:44 -07:00
Marek Olšák	2f65c67043	radeonsi: fix passing gl_ClipVertex for GS and tess Also add the fprintf call. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-25 16:46:00 -04:00
Marek Olšák	a7d61c0753	radeonsi: fix color inputs/outputs for GS and tess GS is tested, tessellation is untested. Have outputs_written_before_ps for HW VS and outputs_written for other stages. The reason is that COLOR and BCOLOR alias for HW VS, which drives elimination of VS outputs based on PS inputs. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-25 16:46:00 -04:00
Marek Olšák	92ea9329e5	radeonsi: fix incorrect parentheses around VS-PS varying elimination I don't know if it caused issues. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-25 16:46:00 -04:00
Marek Olšák	a4ba7cd6a2	st/mesa: simplify lastLevel determination in st_finalize_texture This fixes shader images where we always bind stObj->pt and not individual gl_texture_images. Roughly based on i965 commit `845ad2667a` which does a similar thing but for a different reason. This fixes GL CTS assertion failures introduced by Ilia. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-25 16:31:36 -04:00
Scott D Phillips	d21c086d81	i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear The reference for MOVNTDQA says: For WC memory type, the nontemporal hint may be implemented by loading a temporary internal buffer with the equivalent of an aligned cache line without filling this data to the cache. [...] Subsequent MOVNTDQA reads to unread portions of the WC cache line will receive data from the temporary internal buffer if data is available. This hidden cache line sized temporary buffer can improve the read performance from wc maps. v2: Add mfence at start of tiled_to_linear for streaming loads (Chris) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-25 11:05:46 -07:00
Alok Hota	fb20ae0374	swr/rast: Adjusted avx512 primitive assembly for msvc codegen Optimize AVX-512 PA Assemble (PA_STATE_OPT). Reduced generated code by about 4x, MSVC compiler was going crazy making temporaries and split-loading inputs onto the stack unless explicit AVX-512 load ops were added Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:57:02 -05:00
Alok Hota	b3360f5c8b	swr/rast: Moved memory init out of core swr init Added two new files for a wrapper function for initialization v2: added missing include for single architecture builds Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:56:55 -05:00
Alok Hota	b6b114c1ae	swr/rast: Removed superfluous JitManager argument from passes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:56:49 -05:00
Alok Hota	98d0201577	swr/rast: Renamed MetaData calls Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:56:43 -05:00
Alok Hota	14b5cac0be	swr/rast: Use metadata to communicate between passes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:56:37 -05:00
Alok Hota	f09636e2e1	swr/rast: Check gCoreBuckets/CORE_BUCKETS equal length at compile time Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:56:01 -05:00
Alok Hota	cfe75cc7b5	swr/rast: Added in-place building to SCATTERPS SCATTERPS previously assumed it was being used with an existing basic block Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:55:37 -05:00
Samuel Pitoiset	45eb24fedf	radv: run the EarlyCSEMemSSA LLVM pass It's recommended by the instruction combining pass, and RadeonSI also runs it. This pass used to segfault with one shader of F12017 in the past, but it no longer crashes. Maybe the LLVM IR generated by RADV has changed. Polaris10: Totals from affected shaders: SGPRS: 441352 -> 441648 (0.07 %) VGPRS: 310888 -> 300784 (-3.25 %) Spilled SGPRs: 13576 -> 12983 (-4.37 %) Code Size: 22560328 -> 22420544 (-0.62 %) bytes Max Waves: 40755 -> 41366 (1.50 %) Vega10: Totals from affected shaders: SGPRS: 442848 -> 442000 (-0.19 %) VGPRS: 310396 -> 300460 (-3.20 %) Spilled SGPRs: 13708 -> 12906 (-5.85 %) Code Size: 22479428 -> 22336216 (-0.64 %) bytes Max Waves: 45783 -> 46506 (1.58 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-25 14:24:14 +02:00
Samuel Pitoiset	66e38654c9	radv: fix dumping compute shader on the graphics queue The graphics pipeline can be NULL. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-25 11:58:07 +02:00
Samuel Pitoiset	de06dfa9ea	radv: add radv_dump_pipeline_state() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-25 11:58:05 +02:00
Samuel Pitoiset	6f0530ecfe	radv: rework how shaders are dumped when generating a hang report Use a flag for the active stages instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-25 11:58:03 +02:00
Samuel Pitoiset	8c406f0b4d	radv: remove unused parameter in radv_dump_annotated_shader() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-25 11:57:59 +02:00
Jose Dapena Paz	6c61c31dc2	mesa: do not leak ctx->Shader.ReferencedProgram references When glUseProgram is used, references to the included shaders are added in ctx->Shader.ReferencedProgram. But those references are not decreased when the shader data is deallocated. Thus, those shaders are leaked. Explicitly remove the pending references to these shaders. Fixes: `e6506b3cd2` ("mesa: retain gl_shader_programs after glDeleteProgram if they are in use") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-25 10:38:09 +10:00
Marek Olšák	508b423dd6	radeonsi: set DB_EQAA.MAX_ANCHOR_SAMPLES correctly Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:57 -04:00
Marek Olšák	07e02c8617	radeonsi: round ps_iter_samples in set_min_samples Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:57 -04:00
Marek Olšák	510c88f9d1	radeonsi: remove redundant ps_iter_samples clamp si_get_ps_iter_samples already does this. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:56 -04:00
Marek Olšák	25cdf754e4	radeonsi: remove some old gfx 9.x registers Leftover from bring up. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:56 -04:00
Marek Olšák	b936f9aa32	radeonsi: disable primitive binning for all blitter ops same as amdvlk. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:56 -04:00
Marek Olšák	8c1c451a90	ac/surface/gfx6: don't overallocate mipmapped HTILE Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:56 -04:00
Eric Engestrom	473af0b541	egl/x11: deduplicate depth-to-format logic Suggested-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-24 18:01:45 +01:00
Tapani Pälli	7b54404c9d	i965: enable OES_texture_view for gen8+ Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-24 12:53:07 +03:00
Tapani Pälli	3ddcdcf94d	mesa: changes to expose OES_texture_view extension Functionality already covered by ARB_texture_view, patch also adds missing 'gles guard' for enums (added in `f1563e6392`). Tested via arb_texture_view.*_gles3 tests and individual app utilizing texture view with ETC2. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-24 12:53:07 +03:00
Juan A. Suarez Romero	046b2b651e	docs: update release calendar for 18.1 series v2: extend 18.1 series (Andres) v3: fix copy/paste typo (Engestrom) CC: Andres Gomez <agomez@igalia.com> CC: Emil Velikov <emil.l.velikov@gmail.com> CC: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-05-24 11:47:47 +02:00
Samuel Pitoiset	38a8c5903b	radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS Do not lower FS inputs because this moves all load_var instructions at beginning of shaders and because interp_var_at_sample (and friends) seem broken. That might be eventually enabled later on if we really want to preload all FS inputs at beginning. Polaris10: Totals from affected shaders: SGPRS: 54072 -> 54264 (0.36 %) VGPRS: 38580 -> 38124 (-1.18 %) Spilled SGPRs: 652 -> 652 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 2128116 -> 2127380 (-0.03 %) bytes Max Waves: 8048 -> 8086 (0.47 %) Vega10: Totals from affected shaders: SGPRS: 52616 -> 52656 (0.08 %) VGPRS: 37536 -> 37116 (-1.12 %) Spilled SGPRs: 828 -> 828 (0.00 %) Code Size: 2043756 -> 2042672 (-0.05 %) bytes Max Waves: 9176 -> 9254 (0.85 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-24 09:18:57 +02:00
Samuel Pitoiset	ded1509587	radv: call nir_split_var_copies() before nir_lower_var_copies() This does nothing special currently because we don't create any copy_var instructions, but this is needed for the next patch. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-24 09:18:54 +02:00
Francisco Jerez	936cd3c87a	i965: Use intel_bufferobj_buffer() wrapper in image surface state setup. Instead of directly using intel_obj->buffer. Among other things intel_bufferobj_buffer() will update intel_buffer_object::gpu_active_start/end, which are used by glBufferSubData() to decide which path to take. Fixes a failure in the Piglit ARB_shader_image_load_store-host-mem-barrier Buffer Update/WaW tests, which could be reproduced with a non-standard glGetTexSubImage implementation (see bug report). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105351 Reported-by: Nanley Chery <nanleychery@gmail.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-05-23 16:21:34 -07:00
Francisco Jerez	e989acb03b	i965: Handle non-zero texture buffer offsets in buffer object range calculation. Otherwise the specified surface state will allow the GPU to access memory up to BufferOffset bytes past the end of the buffer. Found by inspection. v2: Protect against out-of-range BufferOffset (Nanley). Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-05-23 16:21:28 -07:00
Francisco Jerez	156d2c6e62	i965: Move buffer texture size calculation into a common helper function. The buffer texture size calculations (should be easy enough, right?) are repeated in three different places, each of them subtly broken in a different way. E.g. the image load/store path was never fixed to clamp to MaxTextureBufferSize, and none of them are taking into account the buffer offset correctly. It's easier to fix it all in one place. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106481 Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-05-23 16:21:09 -07:00
Francisco Jerez	5a68147803	Revert "mesa: simplify _mesa_is_image_unit_valid for buffers" This reverts commit `c0ed52f614`. It was preventing the image format validation from being done on buffer textures, which is required to ensure that the application doesn't attempt to bind a buffer texture with an internal format incompatible with the image unit format (e.g. of different texel size), which is not allowed by the spec (it's not allowed for any texture target, whether or not there is spec wording restricting this behavior specifically for buffer textures) and will cause the driver to calculate texel bounds incorrectly and potentially crash instead of the expected behavior. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106465 Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-05-23 16:21:09 -07:00
Bas Nieuwenhuizen	699e1f5aac	ac: Use DPP for build_ddxy where possible. WQM is pretty reliable now on LLVM 7, so let us just use DPP + WQM. This gives approximately a 1.5% performance increase on the vrcompositor built-in benchmark. v2: Use ac_build_quad_swizzle. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-23 21:02:45 +02:00
Miguel Casas	b73b340c37	i965: add {X,A}BGR2101010 to 'intel_image_formats' This patch adds {X,A}BGR2101010 entries to the list of supported 'intel_image_formats'. Bug: https://crbug.com/776093 Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-23 10:19:04 -07:00
Miguel Casas	432df741e0	dri_util: Add R10G10B10{A,X}2 translation between DRI and mesa_format. Add R10G10B10{A,X}2 translation between mesa_format and DRI format to driGLFormatToImageFormat() and driImageFormatToGLFormat(). Bug: https://crbug.com/776093 Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-23 10:17:45 -07:00
Dylan Baker	c8acfd5ab2	bin/get-pick-listh.sh: force git --pretty=medium Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-05-23 09:54:17 -07:00
Dylan Baker	5a639bdb81	bin/bugzilla_mesa.sh: explicitly set the --pretty argument Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-05-23 09:54:00 -07:00
Eric Engestrom	ec986241f3	docs: drop unnecessary out-of-frame target I'm guessing an earlier version of the website used to have the page contents in <frames>, but this isn't the case anymore so just drop the unnecessary `target="_main"` :) Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-23 16:52:23 +01:00
Eric Engestrom	09a6cb7be6	docs: fix various html tags mistakes Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-23 16:52:23 +01:00
Eric Engestrom	8034f5f623	docs: fix `<` & `>` used in html code Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-23 16:52:23 +01:00
Juan A. Suarez Romero	6db0660d08	docs: add news notes to 18.1.0 CC: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-05-23 13:06:55 +02:00
Dave Airlie	f2f464de57	tgsi/scan: add hw atomic to the list of memory accessing files This fixes 4 out of 5 cases in: arb_framebuffer_no_attachments-atomic on cayman. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>	2018-05-23 03:51:40 +01:00
Roland Scheidegger	7b89fcec41	llvmpipe: improve rasterization discard logic This unifies the explicit rasterization discard as well as the implicit rasterization disabled logic (which we need for another state tracker), which really should do the exact same thing. We'll now toss out the prims early on in setup with (implicit or explicit) discard, rather than do setup and binning with them, which was entirely pointless. (We should eventually get rid of implicit discard, which should also enable us to discard stuff already in draw, hence draw would be able to skip the pointless clip and fallback stages in this case.) We still need separate logic for only null ps - this is not the same as rasterization discard. But simplify the logic there and don't count primitives simply when there's an empty fs, regardless of depth/stencil tests, which seems perfectly acceptable by d3d10. While here, also fix statistics for primitives if face culling is enabled. No piglit changes. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-05-23 04:23:32 +02:00
Bas Nieuwenhuizen	047438287c	ac/surface/gfx6: Don't force a tile index for fmask. The bpe of the fmask often differs from the bpe of the main surface. On SI that means it has to get a different tile index. addrlib is capable of figuring this out itself, so just pass -1 instead to let it know that it is not preset. Fixes: `9bf3570fed` "ac/surface/gfx6: compute FMASK together with the color surface" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106511 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106499 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-23 02:23:03 +02:00
Jason Ekstrand	a347a5a12c	i965: Remove ring switching entirely Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:39 -07:00
Jason Ekstrand	b499b85b0f	i965/miptree: Move the access_raw call to the individual map functions The only function that doesn't need to call access_raw is map_blit. If it takes the blitter path, it will happen as part of intel_miptree_copy. If map_blit takes the blorp path, brw_blorp_copy_miptrees will handle doing whatever resolves are needed. This should save us resolves in quite a few cases and will probably help performance a bit. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:37 -07:00
Jason Ekstrand	f566a1264c	i965: Remove support for the BLT ring We still support the blitter on gen4-5 but it's on the same ring as 3D. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:35 -07:00
Jason Ekstrand	33affda8bf	i965/miptree: Use blorp for blit maps on gen6+ Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:34 -07:00
Jason Ekstrand	0eedb0fca9	i965/miptree: Use blorp for validation tex copies on gen6+ It's faster than the blitter and can handle things like stencil properly so it doesn't require software fallbacks. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:32 -07:00
Jason Ekstrand	80fc3896f3	i965: Delete the blitter path for CopyTexSubImage The blorp path (called first) can do anything the blitter path can do so it's just dead code. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:31 -07:00
Jason Ekstrand	8162256b01	i965: Don't fall back to the blitter in BlitFramebuffer On gen4-5, we try the blitter before we even try blorp. On newer platforms, blorp can do everything the blitter can so there's no point in even having the blitter fall-back path. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:29 -07:00
Jason Ekstrand	e596563b08	i965: Remove some unused includes of intel_blit.h Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:27 -07:00
Jason Ekstrand	a9499374a9	i965/blit: Delete intel_emit_linear_blit This function is no longer used. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:25 -07:00
Jason Ekstrand	7fd962093f	i965: Use meta for pixel ops on gen6+ Using meta for anything is fairly awful and definitely has more CPU overhead. However, it also uses the 3D pipe and is therefore likely faster in terms of GPU time than the blitter. Also, the blitter code has so many early returns that it's probably not buying us that much. We may as well just use meta all the time instead of working over-time to find the tiny case where we can use the blitter. We keep gen4-5 using the old blit paths to avoid perturbing old hardware too much. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-22 15:46:20 -07:00
Kenneth Graunke	92f01fc5f9	i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin. We'd like to start using soft-pin to assign BO addresses up front, and never move them again. Our previous plan for dealing with 48-bit VF cache bugs was to relocate vertex buffers to the low 4GB, so we'd never have addresses that alias in the low 32 bits. But that requires moving buffers dynamically. This patch tracks the last seen BO address for each vertex/index buffer, and emits a VF cache invalidate if the high bits change. (Ideally, we won't hit this case very often.) This should work for the soft-pin case, but unfortunately won't work in the relocation case, as we don't actually know the addresses. So, we have to use both methods. v2: Mention that the cache uses a <VertexBufferIndex, Address> tuple more explicitly (suggested by Scott). Mention "single batch" too (suggested by Chris). Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-22 10:02:28 -07:00
Kenneth Graunke	c7259259d4	i965: Introduce a "memory zone" concept on BO allocation. We're planning to start managing the PPGTT in userspace in the near future, rather than relying on the kernel to assign addresses. While most buffers can go anywhere, some need to be restricted to within 4GB of a base address. This commit adds a "memory zone" parameter to the BO allocation functions, which lets the caller specify which base address the BO will be associated with, or BRW_MEMZONE_OTHER for the full 48-bit VMA. Eventually, I hope to create a 4GB memory zone corresponding to each state base address. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-05-22 10:01:09 -07:00
Jason Ekstrand	417b9e5770	intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0 Fixes: `d6cd14f213` "i965/fs: Define new shader opcode to..." Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-05-22 09:53:23 -07:00
Michel Dänzer	fe2edb25dd	dri3: Stricter SBC wraparound handling Prevents corrupting the upper 32 bits of draw->recv_sbc when draw->send_sbc resets to 0 (which currently happens when the window is unbound from a context and bound to one again), which in turn caused loader_dri3_swap_buffers_msc to calculate target_msc with corrupted upper 32 bits. This resulted in hangs with the Xorg modesetting driver as of xserver 1.20 (older versions and other drivers ignored the upper 32 bits of the target MSC, which is why this wasn't noticed earlier). Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/106351 Tested-by: Mike Lothian <mike@fireburn.co.uk>	2018-05-22 17:59:53 +02:00
Samuel Pitoiset	75e919c045	radv: fix computation of user sgprs for 32-bit pointers With 32-bit pointers we only need one user SGPR per desc set. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:29 +02:00
Samuel Pitoiset	c5536fc813	radv: drop user_sgpr_info::sgpr_count It's only used inside allocate_user_sgprs(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:26 +02:00
Samuel Pitoiset	36a4d6d081	radv: add support for 32-bit pointers in user data SGPRs We still use 64-bit GPU pointers for all ring buffers because llvm.amdgcn.implicit.buffer.ptr doesn't seem to support 32-bit GPU pointers for now. This can be improved later anyways. Vega10: Totals from affected shaders: SGPRS: 1008722 -> 1026710 (1.78 %) VGPRS: 706580 -> 707136 (0.08 %) Spilled SGPRs: 22555 -> 22209 (-1.53 %) Spilled VGPRs: 75 -> 75 (0.00 %) Code Size: 34819208 -> 35202140 (1.10 %) bytes Max Waves: 175423 -> 175086 (-0.19 %) Polaris10: Totals from affected shaders: SGPRS: 1029849 -> 1036517 (0.65 %) VGPRS: 709984 -> 708872 (-0.16 %) Spilled SGPRs: 22672 -> 22309 (-1.60 %) Spilled VGPRs: 82 -> 66 (-19.51 %) Scratch size: 76 -> 60 (-21.05 %) dwords per thread Code Size: 34915336 -> 35309752 (1.13 %) bytes Max Waves: 151221 -> 151677 (0.30 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:22 +02:00
Samuel Pitoiset	b654ef5808	radv: add set_loc_shader_ptr() helper This helper will help for switching to 32-bit GPU pointers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:20 +02:00
Samuel Pitoiset	14a7547c08	radv: allocate descriptor BOs in the 32-bit addr space Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:18 +02:00
Samuel Pitoiset	0d1406ad12	radv: allocate the upload BO in the 32-bit addr space Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:17 +02:00
Samuel Pitoiset	d8a61d3232	radv: set amdgpu-32bit-address-high-bits LLVM attribute Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:15 +02:00
Samuel Pitoiset	fe2649d3ad	radv/winsys: allow to allocate BOs in the 32-bit addr space This introduces a new flag called RADEON_FLAG_32BIT. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:13 +02:00
Samuel Pitoiset	b60e0ee789	radv/winsys: request high address This is needed for 32-bit GPU pointers. Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-22 15:53:09 +02:00
Anuj Phogat	0748383a60	i965/glk: Add l3 banks count for 2x6 configuration 2x6 configuration with pci-id 0x3185 has same number of banks (2) as 3x6 configuration (pci-id 0x3184). Reported-by: Clayton Craft <clayton.a.craft@intel.com> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Clayton Craft <clayton.a.craft@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `eb23be1d97` "i965: Add and initialize l3_banks field for gen7+" Cc: Francisco Jerez <currojerez@riseup.net>	2018-05-21 16:43:26 -07:00
Vinson Lee	85f61197df	v3d: Include v3d_drm.h path. Fix build error. CC v3d_blit.lo In file included from v3d_blit.c:27:0: v3d_context.h:39:10: fatal error: v3d_drm.h: No such file or directory #include "v3d_drm.h" ^~~~~~~~~~~ Fixes: `8a793d42f1` ("v3d: Switch the vc5 driver to using the finalized V3D UABI.") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-21 11:15:47 -07:00
Samuel Pitoiset	73df16dcee	radv: fix centroid interpolation It's legal to set the centroid and sample interpolation modes when MSAA disabled. So, we have to initialize the centroid inputs because the hardware doesn't. This fixes rendering issues with DXVK and The Witness, World of Warcraft, Trackmania and probably more games. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106315 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102390 CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-21 13:57:46 +02:00
Bas Nieuwenhuizen	f26b008e28	radv: Cleanup unused prime blit path. Since we have the common WSI code, we use vkCmdCopyImageToBuffer instead. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-21 10:33:41 +02:00
Bas Nieuwenhuizen	a63a0960e3	radv: Fix SRGB compute copies. SRGB stores are broken. We had compensation code in the resolve path but none in the copy path. Since we don't want any conversion and it does not matter for DCC, just make everything UNORM instead. This happened to cause wrong colors for the PRIME path, as that uses image->buffer copies which always use the compute path. CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106587 Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-05-21 10:33:41 +02:00
Tapani Pälli	63525ba730	android: enable VK_ANDROID_native_buffer Patch changes entrypoints generator to not skip this extension even though it is set as disabled in the xml. We also need compilation flag VK_USE_PLATFORM_ANDROID_KHR to be enabled. It looks like this extension got disabled in commit `69f447553c`. v2: just remove the whole 'supported' attrib check + remove vk_icd.h compilation fix (fix in VulkanHeaders instead) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-21 09:26:50 +03:00
Tapani Pälli	437acae704	vulkan: update vk_icd.h to current upstream Import from commit eb0c1fd on branch 'master' of https://github.com/KhronosGroup/Vulkan-Headers.git. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-21 09:26:50 +03:00
Dave Airlie	bfa74bb44d	virgl: set texture buffer offset alignment to disable ARB_texture_buffer_range. The host side hasn't got support for this feature yet, so don't enable it unless we get the caps from the host. This makes the texture buffer range piglit tests skip now. Fixes: `fe0647df5a` (virgl: add offset alignment values to to v2 caps struct) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>	2018-05-21 12:44:55 +10:00
Timothy Arceri	2e6c987a85	mesa: stop hiding query parameters from OpenGL compat Just let the extension detection do its job as we will be adding compat profile support in future, also we want these to work with compat profile version overrides. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-21 09:39:03 +10:00
Christoph Haag	549e54270b	radv: fix VK_EXT_descriptor_indexing GetPhysicalDeviceProperties2KHR() was crashing because features was null Fixes: `0e10790558` "radv: Enable VK_EXT_descriptor_indexing." CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-20 13:36:07 +02:00
Bas Nieuwenhuizen	a1c87235a9	ac/surface: Only align linear power of two fmt textures. We're not sharing 32_32_32 formats between different GPUs, so we do not have to align for vega on pre-vega cards. Fixes: `e361970ed7` "radv: Add support for IMG_DATA_FORMAT_32_32_32." Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-20 11:57:59 +02:00
Bas Nieuwenhuizen	62e0e089d7	amd/addrlib: Use defines in autotools build. Otherwise stuff like NDEBUG would not be passed through. CC: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106479 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-20 11:57:59 +02:00
Aaron Watry	cfe582f9dc	r600/compute: Mark several functions as static They're not used anywhere else, so keep them private Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>	2018-05-19 10:22:16 -05:00
Aaron Watry	d21e64c626	r600/compute: Remove unused compute_memory_pool functions Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>	2018-05-19 10:21:57 -05:00
Roland Scheidegger	6f558fb0f7	draw: get rid of special logic to not emit null tris I've confirmed after `77554d220d` we no longer need this to pass some tests from another api (as we no longer generate the bogus extra null tris in the first place). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-05-19 02:49:58 +02:00
Dylan Baker	c86e9a5fe5	docs: Add sha sums for release	2018-05-18 16:44:50 -07:00
Dylan Baker	1d46852830	docs: Add release notes for 18.1.0	2018-05-18 16:44:43 -07:00
Alyssa Rosenzweig	5d85a0a55b	nir: Implement optional b2f->iand lowering This pass is required by the Midgard compiler; our instruction set uses NIR-style booleans (~0 for true) but lacks a dedicated b2f instruction. Normally, this lowering pass would be implemented in a backend-specific algebraic pass, but this conflicts with the existing iand->b2f pass in nir_opt_algebraic.py, hanging the compiler. This patch thus makes the existing pass optional (default on -- all other backends should remain unaffected), adding an optional pass for lowering the opposite direction. v2: Defer lowering until late algebraic optimisations to allow optimising the b2f instruction itself. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-05-18 22:44:09 +02:00
Jan Vesely	8ed2cabd04	travis: Adapt to radeonsi dropping support for LLVM 4 meson Vulkan, Clover, and autotools Vulkan need to be switched to llvm 5 Fixes: `f9eb1ef870` Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-18 13:59:37 -04:00
Marek Olšák	3d64ed5785	radeonsi: skip ES output stores for undefined output components Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-18 13:38:07 -04:00
Nanley Chery	0ab25f05ab	i965: isl: Move the MCS gen7+ assertion into ISL This is useful for every user of ISL. Drop the comment along the way to match similar functions in ISL. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-18 09:53:06 -07:00
Nanley Chery	f88caf2321	i965/miptree: Remove format assertion in alloc_aux intel_miptree_supports_{ccs,mcs,hiz} ensures the format is valid for the color or depth miptree before the miptree is assigned an aux_usage. alloc_aux switches on the aux_usage so don't assert that the format is valid. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-18 09:53:06 -07:00
Nanley Chery	8007b2d78b	i965/miptree: Simplify the switch in supports_ccs Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-18 09:53:06 -07:00
Nanley Chery	da98441fef	i965: Make get_ccs_surf succeed in alloc_aux Synchronize the requirements listed in isl_surf_get_ccs_surf with intel_miptree_supports_ccs by importing a restriction from ISL. Some implications: * We successfully create every aux_surf in alloc_aux * We only return false from alloc_aux if we run out of memory Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-18 09:53:06 -07:00
Brian Paul	42aee8f4f6	llvmpipe: fix check for a no-op shader The tgsi_info.num_tokens fix broke llvmpipe's detection of no-op shaders. Fix the code to check for num_instructions <= 1 instead. Fixes: `8fde9429c3` ("tgsi: fix incorrect tgsi_shader_info::num_tokens computation") Tested-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-05-18 09:09:41 -06:00
Samuel Pitoiset	03c4816093	radv: pass radv_nir_compiler_options directly to create_llvm_function() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-18 11:07:01 +02:00
Christian Gmeiner	2eb3f794d9	st/mesa: only define GLSL 1.4 for compat if driver supports it Currently GLSL 1.4 is defined for all gallium drivers even if only GLSL 1.2 is supported, as seen on etnaviv. v1 -> v2: - use _min(..) as suggested by Lucas Stach and Michel Dänzer Fixes: `4560aad780` ("mesa: add GLSLVersionCompat constant") Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-18 10:46:24 +02:00
Dave Airlie	48e28ab961	vbo: remove MaxVertexAttribStride assert check. Some drivers (virgl) don't support GL4.4 or GLES3.1 yet, so never fill in this const. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-05-18 14:58:15 +10:00
Timothy Arceri	c0c69bd8dd	mesa: drop GL_EXT_polygon_offset support glPolygonOffset() has been part of the GL standard since 1.1. Also neither AMD nor Nvidia support this in their binary drivers. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61761	2018-05-18 09:21:24 +10:00
Brian Paul	8fde9429c3	tgsi: fix incorrect tgsi_shader_info::num_tokens computation We were incrementing num_tokens in each loop iteration while parsing the shader. But each call to tgsi_parse_token() can consume more than one token (and often does). Instead, just call the tgsi_num_tokens() function. Luckily, this issue doesn't seem to affect any current users of this field (llvmpipe just checks for <= 1, for example). Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-05-17 15:02:05 -06:00
Samuel Pitoiset	fcba3934fc	radv: add radv_emit_shader_pointer() helper For future work (support for 32-bit GPU pointers). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 21:28:59 +02:00
Samuel Pitoiset	9b2c310a70	radv: add some helpers for cleaning up radv_get_preamble_cs() Because this function looks a bit ugly to me. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 21:28:57 +02:00
Marek Olšák	f9eb1ef870	amd: remove support for LLVM 4.0 It doesn't support GFX9. Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-17 14:54:41 -04:00
Juan A. Suarez Romero	11a0d5563f	docs: update calendar, add news and link release notes to 18.0.4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-05-17 18:45:26 +00:00
Juan A. Suarez Romero	042e21976a	docs: add sha256 checksums for 18.0.4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `69ef6e4a75`)	2018-05-17 18:40:53 +00:00
Juan A. Suarez Romero	bb7750e8da	docs: add release notes for 18.0.4 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `3b49ab6219`)	2018-05-17 18:40:51 +00:00
Mathias Fröhlich	6fac626193	mesa: The glArrayElement api is independent of the current program. All the shader program dependent handling is done on the level of the gl_Context::Array._DrawVAO/_DrawVAOEnabledAttribs. So, skip array element invalidation on _NEW_PROGRAM. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-17 20:13:40 +02:00
Mathias Fröhlich	984cb4e512	mesa: Flag _NEW_ARRAY only if we are changing ctx->Array.VAO. For the VAO internal helper functions that may be called with a non current VAO, flag the _NEW_ARRAY state only if it is the current ctx->Array.VAO. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-17 20:13:39 +02:00
Mathias Fröhlich	5c7e3a90ed	mesa: Remove flush_vertices argument from VAO methods. The flush_vertices argument is now unused, remove it. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-17 20:13:39 +02:00
Mathias Fröhlich	9c7be67968	mesa: Remove FLUSH_VERTICES from VAO state changes. Pending draw calls on immediate mode or display list calls do not depend on changes of the VAO state. So, remove calls to FLUSH_VERTICES and flag _NEW_ARRAY as appropriate. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-17 20:13:39 +02:00
Juan A. Suarez Romero	0a2c947556	docs: add 18.0.5 in the release calendar Mesa 18.1 series has not been released yet, so let's extend 18.0 lifetime. v2: Add missing closing TR tags (Eric Engestrom) CC: Andres Gomez <agomez@igalia.com> CC: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-05-17 19:01:19 +02:00
Alok Hota	936ce75285	swr/rast: Added FEClipRectangles event and also added some comments Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-05-17 10:53:14 -05:00
Alok Hota	a33d376133	swr/rast: Whitespace and tab-to-spaces changes Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-05-17 10:53:10 -05:00
Alok Hota	7970fcff25	swr/rast: fix VCVTPD2PS generation for AVX512 Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-05-17 10:53:06 -05:00
Alok Hota	a0dddac1cb	swr/rast: Rectlist support for GS Add rectlist as an option for GS. Needed to support some driver optimizations. Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-05-17 10:53:01 -05:00
Alok Hota	7926d18fa5	swr/rast: Remove unneeded virtual from methods Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-05-17 10:52:21 -05:00
Stefan Schake	b0acc3a562	broadcom/vc4: Native fence fd support With the syncobj support in place, let's use it to implement the EGL_ANDROID_native_fence_sync extension. This mostly follows previous implementations in freedreno and etnaviv. v2: Drop the flags (Eric) Handle in_fence_fd already in job_submit (Eric) Drop extra vc4_fence_context_init (Eric) Dup fds with CLOEXEC (Eric) Mention exact extension name (Eric) Signed-off-by: Stefan Schake <stschake@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-17 16:04:30 +01:00
Stefan Schake	44036c354d	broadcom/vc4: Store job fence in syncobj This gives us access to the fence created for the render job. v2: Drop flag (Eric) Signed-off-by: Stefan Schake <stschake@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-17 16:04:28 +01:00
Stefan Schake	9ed05e2520	broadcom/vc4: Detect syncobj support We need to know if the kernel supports syncobj submission since otherwise all the DRM syncobj calls fail. v2: Use drmGetCap to detect syncobj support (Eric) Signed-off-by: Stefan Schake <stschake@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-17 16:04:26 +01:00
Stefan Schake	4fc0ebdff5	broadcom/vc4: Bump libdrm requirement Require a version of libdrm with syncobj support. v2: Don't require a libdrm_vc4, just bump core libdrm if vc4 enabled (by anholt) Signed-off-by: Stefan Schake <stschake@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-17 16:04:24 +01:00
Stefan Schake	580d1f4c60	drm-uapi: Update vc4 header with syncobj submit support v2: Synchronized with kernel v2 v3: Update for the finalized kernel ABI (pad2 field) Signed-off-by: Stefan Schake <stschake@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-17 16:04:21 +01:00
Stefan Schake	1ec01a911b	broadcom/vc4: Drop libdrm_vc4 requirement This was missed in the move back to the local uapi copy. libdrm_vc4 only seems to consist of headers that also exist in the Mesa tree. Signed-off-by: Stefan Schake <stschake@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-17 16:04:12 +01:00
Eric Anholt	97894b1267	v3d: Add support for glSampleMask / glSampleCoverage.	2018-05-17 15:09:46 +01:00
Eric Anholt	9bbc3f8cf1	v3d: Enable NaN propagation in the VS and CS as well. Fixes piglit vs-isnan-*.shader_test at the expense of gl-1.0-spot-light.	2018-05-17 15:09:12 +01:00
Nanley Chery	edfb57c0a0	i965/blorp: Disable BLORP clear color updates With the previous patches, we now update the indirect clear color buffer every time the clear color changes. Avoid redundant updates. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:42 -07:00
Nanley Chery	02f5512fed	intel/blorp: Add a NO_UPDATE_CLEAR_COLOR batch flag Allow callers to handle updating the indirect clear color buffer themselves. This can reduce the number of clear color updates in the case where a caller performs multiple fast clears with the same clear color. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:42 -07:00
Nanley Chery	f8ac11d69f	i965/blorp: Also skip the fast clear if the clear color differs If the aux state is CLEAR and clear color value has changed, only the surface state must be updated. The bit-pattern in the aux buffer is exactly the same. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:42 -07:00
Nanley Chery	43616404be	i965/clear: Drop a stale comment in fast_clear_depth This comment made more sense when it was above the calls to intel_miptree_slice_set_needs_depth_resolve(). We stopped using these functions at commit `554f7d6d02` ("i965: Move depth to the new resolve functions"). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	82849fb6d5	i965: Update the indirect buffer in set_clear_color For depth buffers, we avoid fast-clearing if the aux_state is already CLEAR. We do the same for color buffers only if the clear color doesn't change. We require that the clear colors match because, in that case, we don't update the indirect clear color outside of BLORP. Update the indirect clear color for color buffers as well. We'll enable the same depth buffer optimization for color buffers in a later patch. Note that we're now actually updating the indirect clear color twice in the case where we use BLORP to perform the fast-clear. This is only temporary. In later patches, we'll prevent BLORP from performing the update. v2: Add more context to the commit message (Topi). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-17 07:06:41 -07:00
Nanley Chery	5b315f3ad1	i965/clear: Remove an early return in fast_clear_depth Reduce complexity and allow the next patch to delete some code. With this change, clear operations will still be skipped and setting the aux_state will cause no side-effects. Remove the associated comment which implies an early return. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	6f609ca609	i965: Use set_clear_color for depth miptrees Reduce code duplication now and prevent it in the following commits. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	92a0a87b6f	Revert "i965: Make the miptree clear color setter take a gl_color_union" This reverts commit `1d94aa1987`. The next patch will make depth miptrees use the clear color setter that was originally being used for color miptrees. Go back to using the isl_color_value parameter because it's the same type as the fast_clear_color field used by color and depth miptrees. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	bb18af82c3	i965/miptree: Unify aux buffer allocation There isn't much that changes between the aux allocation functions. Remove the duplicated code. v2: Inline the switch statement (Jason). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	6c41a2ef3b	i965: Prepare to delete intel_miptree_alloc_ccs() We're going to delete intel_miptree_alloc_ccs() in the next commit. With that in mind, replace the use of this function in do_single_blorp_clear() with intel_miptree_alloc_aux() and move the delayed allocation logic to its callers. v2: Duplicate the delayed allocation comment (Topi Pohjolainen). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	beed9c4550	i965/miptree: Drop the mt param from alloc_aux_buffer Drop an unused parameter. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	6b1836aabe	i965/miptree: Drop the alloc_flags param from alloc_aux_buffer We have enough information to determine the optimal flags internally. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	3dd7f600e0	i965/miptree: Drop the name param from alloc_aux_buffer A name of "aux-miptree" should be sufficient. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	58d99a21f1	i965/miptree: Initialize the indirect clear color to zero The indirect clear color isn't correctly tracked in intel_miptree::fast_clear_color. The initial value of ::fast_clear_color is zero, while that of the indirect clear color is undefined. Topi Pohjolainen discovered this issue with MCS buffers. This issue is apparent when fast-clearing an MCS buffer for the first time with glClearColor = {0.0,}. Although the indirect clear color is undefined, the initial aux state of the MCS is CLEAR and the tracked clear color is zero, so we avoid updating the indirect clear color with {0.0,}. Make the indirect clear color match the initial value of ::fast_clear_color. Note: although we only have to drop HiZ's BO_ALLOC_BUSY flag for gen10+, we also drop it pre-gen10 to keep things simple. We add this flag back for pre-gen10 in a later patch. v2: Add a note about dropping HiZ's BO_ALLOC_BUSY flag (Topi). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	b58675e93f	i965/miptree: Add and use a memset option in alloc_aux_buffer Add infrastructure for initializing the clear color BO. intel_miptree_init_mcs is no longer needed with this change. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	8a9491058d	i965/miptree: Zero-initialize CCS_D buffers Before this patch, the aux_state was actually AUX_INVALID because the BO was never defined. This was fine on single slice miptrees because we would fast-clear the resource right after creation. For multi-slice miptrees on SKL+ however, this results in undefined behavior when accessing a non-base slice. Here's a specific example: 1) Fast clear level 0 * Undefined CCS_D buffer allocated in "PASS_THROUGH" state. * Level 0 transitions to the CLEAR state. 2) Render to level 1 * Level 1 may have a 2-bit pattern of 2's. * Rendering with a 2 in the CCS is undefined. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Nanley Chery	816f2dc67d	i965/miptree: Fix handling of uninitialized MCS buffers Before this patch, if we failed to initialize an MCS buffer, we'd end up in a state in which the miptree thinks it has an MCS buffer, but doesn't. We also leaked the clear_color_bo if it existed. With this patch, we now free the miptree aux buffer resources and let intel_miptree_alloc_mcs() know that the MCS buffer no longer exists. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-17 07:06:41 -07:00
Samuel Pitoiset	1fba2e10b3	radv: only declare the ESGS rings for pre GFX9 chips GFX9 uses LDS instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 14:14:20 +02:00
Samuel Pitoiset	d349d4bd24	radv: allow to print GPU info with RADV_DEBUG=info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 14:14:17 +02:00
Samuel Pitoiset	56d53ed1d6	radv: do not emit unnecessary ES output stores GFX9: Totals from affected shaders: SGPRS: 472 -> 464 (-1.69 %) VGPRS: 576 -> 584 (1.39 %) Code Size: 45432 -> 44324 (-2.44 %) bytes Max Waves: 40 -> 40 (0.00 %) VI: SGPRS: 720 -> 720 (0.00 %) VGPRS: 728 -> 728 (0.00 %) Code Size: 45348 -> 43992 (-2.99 %) bytes Max Waves: 120 -> 120 (0.00 %) This affects Rise of Tomb Raider and the three Vulkan demos that use a geometry shader (geometryshader, deferredshadows and viewportarray). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 14:14:13 +02:00
Samuel Pitoiset	a6e44d1271	radv: do not emit unnecessary GS output stores Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 14:14:11 +02:00
Samuel Pitoiset	507402ada6	radv: only pass the global BO list at submit time if enabled That way the winsys might use a faster path when the global BO list is NULL. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 13:48:27 +02:00
Samuel Pitoiset	6211799aff	radv: remove the radv_finishme() when compiling shaders Having an entrypoint different than "main" doesn't mean we have multiple shaders per module. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 13:48:24 +02:00
Samuel Pitoiset	1e86eaf7d8	radv: remove radv_device::llvm_supports_spill It's always true. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-17 13:48:21 +02:00
Timothy Arceri	f71714022b	mesa: add glUniform*ui{v} support to display lists Fixes: `a017c7ecb7` "mesa: display list support for uint uniforms" Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78097	2018-05-17 13:07:48 +10:00
Dieter Nützel	7f1dc93357	radeonsi: create .gitignore Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-05-16 21:48:17 -04:00
Dave Airlie	eba4cf797c	ac/llvm: use amdgcn.tbuffer.store instead of SI.tbuffer.store intrinsic Drop the use of the old intrinsic. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-17 11:46:53 +10:00
Eric Anholt	b2e7c32703	v3d: Fix wiring filters to NEAREST for 32-bit texture returns. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104626	2018-05-16 21:19:07 +01:00
Eric Anholt	795488d2bf	v3d: Enable the driver by default. Now that we have a stabilized ABI and a fairly conformant driver, turn it on.	2018-05-16 21:19:07 +01:00
Eric Anholt	01ae6a9181	v3d: Rename driver functions from vc5 to v3d. This is the final step of the driver rename.	2018-05-16 21:19:07 +01:00
Eric Anholt	8c47ebbd23	v3d: Rename the driver files from "vc5" to "v3d".	2018-05-16 21:19:07 +01:00
Eric Anholt	c4c488a2ae	v3d: Rename the vc5_dri.so driver to v3d_dri.so. This allows the driver to load against the merged kernel DRM driver. In the process, rename most of the build system variables and gallium plumbing functions.	2018-05-16 21:19:07 +01:00
Eric Anholt	8a793d42f1	v3d: Switch the vc5 driver to using the finalized V3D UABI. In the process of merging to the kernel, I renamed the driver to the general product line's name (since we have both vc5 and vc6 supported already). Since the ABI is finalized, move the header to include/drm-uapi.	2018-05-16 21:19:07 +01:00
Charmaine Lee	33a86acd78	svga: fix incompatible bind flags at buffer validation time At buffer resource validation time, if the resource handle is not yet created and if the initial buffer bind flags and the tobind flags are incompatible, just use the tobind flags to create the resource handle. On the other hand, if the bind flags are compatible, we can combine the bind flags for the resource handle creation. Fixes piglit gl-3.1-buffer-bindings crash. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-05-16 13:04:16 -06:00
jenny.q.cao	1261b34cd5	mesa: cast the GLenum16 to GLint to avoid compile warning on android Cast the enum to GLint to avoid the compile warning: /src/mesa/main/get.c:3005:19: warning: comparison of constant -32768 with expression of type 'GLenum16' (aka 'unsigned short') is always false -Wtautological-constant-out-of-range-compare Tests: compilation without this warning Signed-off-by: jenny.q.cao <jenny.q.cao@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-05-16 13:02:43 -06:00
Stuart Young	f806cc9eb6	etnaviv: Fix missing rnndb file in tarballs Seems that when the rnndb files for etnaviv were updated/included back in Nov 2017, hw/texdesc_3d.xml.h was missed from Makefile.sources and meson.build. This was all during the conversion to meson, so it appears to have slipped through the cracks. As such, this file has been missing from the official tarballs since inclusion in Mesa, so the git trees and tarballs differ. Found due to lintian errors in the Debian packages. Fixes: `f1e1c60ff6` ("etnaviv: Update from rnndb") Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-05-16 19:36:10 +02:00
Matthias Groß	71892fbe19	gallium/hud: add frametime graph (v2) Thanks for your comment. This version has an additional boolean in the fps_info struct to distinguish between fps and frame time calculation. The struct is initialised in the respecting install functions for this purpose. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-05-15 19:30:12 -04:00
Jan Vesely	f3521ce2c4	eg/compute: Use reference counting to handle compute memory pool. Use pipe_reference to release old RAT surfaces. RAT surface adds a reference to pool bo, so use reference counting for pool->bo as well. v2: Use the same pattern for both defrag paths Drop confusing comment CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-05-15 19:01:47 -04:00
Roland Scheidegger	e01af38d6f	gallivm: Use alloca_undef with array type instead of alloca_array Use a single allocation of array type instead of the old-style array allocation for the temp and immediate arrays. Probably only makes a difference if they aren't used indirectly (so, if we used them solely because there's too many temps or immediates). In this case the sroa and early-cse passes can sometimes do some optimizations which they otherwise cannot. (As a side note, for the temp reg array, we actually really should use one allocation per array id, not just one for everything.) Note that the instcombine pass would actually promote such allocations to single alloc of array type as well, but it's too late for some artificial shaders we've seen to help (we don't want to run instcombine at the beginning due to its cost, hence would need another sroa/cse pass after instcombine). sroa/early-cse help there because they can actually eliminate all of the huge shader, reducing it to a single const output (don't ask...). (Interestingly, instcombine also removes all the bitcasts we do on that allocation for single-value gathering, and in the end directly indexes into the single vector elements, which according to spec is only semi-valid, but this happens regardless. Another thing instcombine also does is use inbound GEPs, which is probably something we should do manually as well - for indirectly indexed reg files llvm may not be able to figure it out on its own, but we should be able to guarantee all pointers are always inbound. In any case, by the looks of it using single allocation with array type seems to be the right thing to do even for ordinary shaders.) No piglit change. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-05-16 00:04:48 +02:00
Dieter Nützel	bd0b6b9f17	radv: add generated files to .gitignore(s) Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-15 22:53:55 +02:00
Samuel Pitoiset	6bde8c5608	spirv: fix visiting inner loops with same break/continue block We should stop walking through the CFG when the inner loop's break block ends up as the same block as the outer loop's continue block because we are already going to visit it. This fixes the following assertion which ends up by crashing in RADV or ANV: SPIR-V parsing FAILED: In file ../src/compiler/spirv/vtn_cfg.c:381 block->node.link.next == NULL 0 bytes into the SPIR-V binary This also fixes a crash with a camera shader from SteamVR. v2: make use of vtn_get_branch_type() and add an assertion Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106090 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106504 CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-15 21:38:19 +02:00
Rob Clark	d89f58a6b8	mesa/st: handle vert_attrib_mask in nir case too Note, actually fixes `9987a072cb`, but the problems don't show up until `19a91841c3`. Fixes: `19a91841c3` st/mesa: Use Array._DrawVAO in st_atom_array.c. Fixes: `9987a072cb` st/mesa: Make the input_to_index array available. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-05-15 15:15:33 -04:00
Marek Olšák	3e27b377f2	cso: check count == 0 in cso_set_vertex_buffers The code didn't expect that, leading to crashes. Fixes: `86d63b53a2` "gallium: remove aux_vertex_buffer_slot code" Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2018-05-15 12:36:27 -04:00
Rob Clark	dace607245	vc5: use util_copy_framebuffer_state Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-15 08:48:13 -04:00
Rob Clark	dae4c98dd7	vc4: use util_copy_framebuffer_state Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-15 08:47:35 -04:00
Rob Clark	f897b67dc1	freedreno/a5xx: remove fd5_shader_stateobj Extra level of indirection that serves no purpose. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-05-15 08:46:46 -04:00
Rob Clark	d48a2404a2	freedreno/a4xx: remove fd4_shader_stateobj Extra level of indirection that serves no purpose. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-05-15 08:46:46 -04:00
Rob Clark	2c40f2ba32	freedreno/a3xx: remove fd3_shader_stateobj Extra level of indirection that serves no purpose. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-05-15 08:46:46 -04:00
Rob Clark	273f7d8404	freedreno: fence should hold a ref to pipe Since the fence can outlive the context, and all it really needs to wait on a fence is the pipe, use the new fd_pipe reference counting to hold a ref to the pipe and drop the ctx pointer. This fixes a crash seen with (for example) glmark2: #0 fd_pipe_wait_timeout (pipe=0xbf48678b3cd7b32b, timestamp=0, timeout=18446744073709551615) at freedreno_pipe.c:101 #1 0x0000ffffbdf75914 in fd_fence_finish (pscreen=0x561110, ctx=0x0, fence=0xc55c10, timeout=18446744073709551615) at ../src/gallium/drivers/freedreno/freedreno_fence.c:96 #2 0x0000ffffbde154e4 in dri_flush (cPriv=0xb1ff80, dPriv=0x556660, flags=3, reason=__DRI2_THROTTLE_SWAPBUFFER) at ../src/gallium/state_trackers/dri/dri_drawable.c:569 #3 0x0000ffffbecd8b44 in loader_dri3_flush (draw=0x558a28, flags=3, throttle_reason=__DRI2_THROTTLE_SWAPBUFFER) at ../src/loader/loader_dri3_helper.c:656 #4 0x0000ffffbecbc36c in glx_dri3_flush_drawable (draw=0x558a28, flags=3) at ../src/glx/dri3_glx.c:132 #5 0x0000ffffbecd91e8 in loader_dri3_swap_buffers_msc (draw=0x558a28, target_msc=0, divisor=0, remainder=0, flush_flags=3, force_copy=false) at ../src/loader/loader_dri3_helper.c:827 #6 0x0000ffffbecbcfc4 in dri3_swap_buffers (pdraw=0x5589f0, target_msc=0, divisor=0, remainder=0, flush=1) at ../src/glx/dri3_glx.c:587 #7 0x0000ffffbec98218 in glXSwapBuffers (dpy=0x502bb0, drawable=2097154) at ../src/glx/glxcmds.c:840 #8 0x000000000040994c in CanvasGeneric::update (this=0xfffffffff400) at ../src/canvas-generic.cpp:114 #9 0x0000000000411594 in MainLoop::step (this=this@entry=0x5728f0) at ../src/main-loop.cpp:108 #10 0x0000000000409498 in do_benchmark (canvas=...) at ../src/main.cpp:117 #11 0x00000000004071b0 in main (argc=<optimized out>, argv=<optimized out>) at ../src/main.cpp:210 Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-05-15 08:46:46 -04:00
Rob Clark	a8c0daa172	freedreno: batch cache doesn't hold a ref to batch The cache doesn't hold a (strong) reference to the batch. So we shouldn't be trying to drop a reference, as that leads to: #0 0x0000ffffbecb37a0 in raise () from /lib64/libc.so.6 #1 0x0000ffffbeca159c in abort () from /lib64/libc.so.6 #2 0x0000ffffbecacf48 in __assert_fail_base () from /lib64/libc.so.6 #3 0x0000ffffbecacfa8 in __assert_fail () from /lib64/libc.so.6 #4 0x0000ffffbd28def0 in pipe_reference_described (ptr=0x4f47130, reference=0x0, get_desc=0xffffbd2e0f08 <__fd_batch_describe>) at ../src/gallium/auxiliary/util/u_inlines.h:88 #5 0x0000ffffbd28e188 in fd_batch_reference_locked (ptr=0x4f40de0, batch=0x0) at ../src/gallium/drivers/freedreno/freedreno_batch.h:258 #6 0x0000ffffbd28e9a8 in fd_bc_invalidate_resource (rsc=0x4f40ca0, destroy=true) at ../src/gallium/drivers/freedreno/freedreno_batch_cache.c:244 #7 0x0000ffffbd293778 in fd_resource_destroy (pscreen=0xedc170, prsc=0x4f40ca0) at ../src/gallium/drivers/freedreno/freedreno_resource.c:644 #8 0x0000ffffbd922674 in u_transfer_helper_resource_destroy (pscreen=0xedc170, prsc=0x4f40ca0) at ../src/gallium/auxiliary/util/u_transfer_helper.c:144 #9 0x0000ffffbd29527c in pipe_resource_reference (ptr=0x4f455d8, tex=0x0) at ../src/gallium/auxiliary/util/u_inlines.h:144 #10 0x0000ffffbd29548c in fd_surface_destroy (pctx=0x1012720, psurf=0x4f455d0) at ../src/gallium/drivers/freedreno/freedreno_surface.c:78 #11 0x0000ffffbd1f9c48 in pipe_surface_reference (ptr=0x4f471d0, surf=0x0) at ../src/gallium/auxiliary/util/u_inlines.h:113 #12 0x0000ffffbd1f9ef4 in util_copy_framebuffer_state (dst=0x4f471c8, src=0x0) at ../src/gallium/auxiliary/util/u_framebuffer.c:114 #13 0x0000ffffbd2e0e30 in __fd_batch_destroy (batch=0x4f47130) at ../src/gallium/drivers/freedreno/freedreno_batch.c:225 #14 0x0000ffffbd28e1b0 in fd_batch_reference_locked (ptr=0xfffffffff010, batch=0x0) at ../src/gallium/drivers/freedreno/freedreno_batch.h:262 #15 0x0000ffffbd28e6b0 in fd_bc_invalidate_context (ctx=0x1012720) at ../src/gallium/drivers/freedreno/freedreno_batch_cache.c:190 #16 0x0000ffffbd2e2b6c in fd_context_destroy (pctx=0x1012720) at ../src/gallium/drivers/freedreno/freedreno_context.c:139 #17 0x0000ffffbd2c3280 in fd5_context_destroy (pctx=0x1012720) at ../src/gallium/drivers/freedreno/a5xx/fd5_context.c:56 #18 0x0000ffffbd5b7a8c in st_destroy_context_priv (st=0xfd72f0, destroy_pipe=true) at ../src/mesa/state_tracker/st_context.c:281 Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-05-15 08:46:26 -04:00
Eric Engestrom	37d44e2608	docs/meson: mark code/commands as <code> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-15 10:33:39 +01:00
Eric Engestrom	5829f616ec	docs/meson: replace plaintext url with a link Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-15 10:33:36 +01:00
Eric Engestrom	67c550708a	docs/meson: fix various html issues Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-15 10:33:34 +01:00
Eric Engestrom	dc2dc1fa30	docs/meson: fix various typos Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-15 10:33:28 +01:00
Eric Engestrom	6c5df78d8b	meson: fix copyright symbol Fixes: `bd68f1013c` "autotools, meson: add tileset.h" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-15 10:31:46 +01:00
Juan A. Suarez Romero	bd68f1013c	autotools, meson: add tileset.h Fixes: `4e52cb51b5` ("swr/rast: Thread locked tiles improvement") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-15 10:00:11 +02:00
Thomas Hellstrom	3d0b4979ee	st/xa: Bump minor Bump xa minor to signal that the underlying mesa version is suitable for dri3. This is a bit ugly since it doesn't relate to a specific xa interface change. Recently there have been a number of fixes in mesa that help enable dri3 without any significant regressions in automated testing and common desktop usage latency. However, the xf86-video-vmware driver has no way to tell other than inspecting the xa version. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-05-15 09:27:46 +02:00
Dave Airlie	9585e70206	virgl: enable vertex streams when glsl level is high enough. This enables vertex streams when the host supports GL 4.0.	2018-05-15 14:56:57 +10:00
Kai Wasserbäch	b691d9192c	opencl: autotools: Fix linking order for OpenCL target Otherwise the build fails with an undefined reference to clang::FrontendTimesIsEnabled. Bugzilla: https://bugs.freedesktop.org/106209 Cc: Jan Vesely <jan.vesely@rutgers.edu> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Acked-by: Jan Vesely <jan.vesely@rutgers.edu> Tested-by: Aaron Watry <awatry@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-05-14 22:45:01 -04:00
Samuel Pitoiset	97b179570c	radv: reduce the number of parameters exported by the GS copy shader By using the geometry shader output usage mask. This improves all Vulkan demos that use a geometry shader (i.e. geometryshader, deferredshadows, viewportarray). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-14 21:38:23 +02:00
Samuel Pitoiset	560bd9eb67	radv: scan the geometry shader output usage mask For reducing the number of parameters that are exported by the GS copy shader. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-14 21:38:21 +02:00
Samuel Pitoiset	ea43d935ab	radv: run the shader info pass before emitting the GS copy shader For further optimizations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-14 21:38:19 +02:00
Samuel Pitoiset	7cbc6f2621	radv: check that layout isn't NULL in radv_nir_shader_info_pass() An upcoming patch will run the shader info pass on the geometry shader just before emitting the GS copy shader. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-14 21:38:17 +02:00
Jason Ekstrand	18f8200a99	intel/blorp: Use linear formats for CCS_E clear colors in copies It's clear that the original code meant to do this and there is even a 10-line comment explaining why. Originally, we had a simple function for packing the clear colors which was unaware of sRGB. However, in `a6b66a7b26`, when we started using ISL to do the packing, the wrong format was used. Fixes: `a6b66a7b26` "intel/blorp: Use ISL instead of bitcast_color..." Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-14 10:41:26 -07:00
Bas Nieuwenhuizen	f944a59996	radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega. The hardware always interprets the alpha as unsigned and fixing it in the shader is going to add unacceptable overheads. CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-14 18:58:30 +02:00
Bas Nieuwenhuizen	3d4d388e39	radv: Fix up 2_10_10_10 alpha sign. Pre-Vega HW always interprets the alpha for this format as unsigned, so we have to implement a fixup to do the sign correctly for signed formats. v2: Improve indexing mess. CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-14 18:58:20 +02:00
Bas Nieuwenhuizen	e361970ed7	radv: Add support for IMG_DATA_FORMAT_32_32_32. Basic sampling support for linear tiling. No CTS regressions, but it seems the blitting coverage is not very extensive. https://bugs.freedesktop.org/show_bug.cgi?id=106331 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-14 18:58:12 +02:00
Bas Nieuwenhuizen	dd102405de	radv: Translate logic ops. radeonsi could pass them through but the enum changed between Gallium and Vulkan, so we have to translate. In the process I made the register defines a bit more readable. CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100430 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-14 16:49:06 +02:00
Bas Nieuwenhuizen	62f50df7b7	radv: Fix multiview queries. This moves the extra queries to after the main query ended, instead of doing it after the begin and hence doing nesting. We also emit only (view count - 1) extra queries, as the main query is already there for the first view. This fixes the CTS occasionally getting stuck in dEQP-VK.multiview.queries* waiting on results. Fixes: `32b4f3c38d` "radv/query: handle multiview queries properly. (v3)" CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-14 16:49:06 +02:00
Eric Engestrom	f0cdc39b13	meson: remove dependency antipattern `dep_valgrind != []` now (0.45) produces a warning that is quite explicit: WARNING: Trying to compare values of different types (DependencyHolder, list) using !=. The result of this is undefined and will become a hard error in a future Meson release. `dep_valgrind = []` used to be the recommended way to deal with a non-existent dependency, but this doesn't work with `.found()`, so now the recommended way is to declare an impossible dependency, which null_dep does for us in Mesa. In short, we don't need and shouldn't check for `!= []` anywhere anymore. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>	2018-05-14 14:55:36 +01:00
Samuel Pitoiset	ece398277c	radv: remove useless check in radv_create_shaders() radv_can_dump_shader() already handles if module is NULL. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-14 12:38:01 +02:00
Samuel Pitoiset	8ade3e4684	radv: allow to dump the GS copy shader with RADV_DEBUG="shaders" Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-14 12:38:00 +02:00
Samuel Pitoiset	553418af1e	radv: move {load,store}_var intrinsics scanning in different functions These are going to be crazy and we are probably going to add more scan stuff in the future. Also use switch cases instead. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-14 12:37:58 +02:00
jenny.q.cao	ff7521c9ba	android: change include "cutils/log.h" to "log/log.h" on Android API >=26 Since Android 8 (API version 26), including "cutils/log.h" produces a compile warning: "Deprecated: don't include cutils/log.h, use either android/log.h or log/log.h" [-W#warnings]. Include "log/log.h" on Android 8 or later to avoid this warning. Signed-off-by: jenny.q.cao <jenny.q.cao@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-14 08:08:31 +03:00
Roland Scheidegger	cf3fb42fb5	llvmpipe: Fix random number generation for unit tests We were never producing negative numbers for signed types. Also fix only producing half the valid range for uint32, and properly clamp signed values. Because this now also properly tests snorm with actually negative values, we need to increase eps for such conversions. I believe these cannot actually be hit in ordinary operation (e.g. if a snorm texture is sampled and output to snorm RT, it will still go through snorm->float and float->snorm conversion), so don't bother to do anything to fix the bad accuracy (might be quite complex). Basically, the issue is for something like snorm16->snorm8 that in the end this will just use an 8-bit arithmetic right shift. But the math behind it says we should actually do a division by 32767 / 127, which is ~258, not 256. So the result can be one bit off (values have too large magnitude), and furthermore, the shift has incorrect rounding (always rounds down). For positive numbers, these errors have different directions, but for negative ones they have the same, hence for some values the error will be 2 bits in the end. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=106232	2018-05-14 03:14:00 +02:00
Dave Airlie	5978d54a09	radv: use compute path for multi-layer images. I don't think the hw resolve path can handle multi-layer images. This fixes all the: dEQP-VK.renderpass.multisample_resolve.layers_* tests on my VI card. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: <mesa-stable@lists.freedesktop.org>	2018-05-14 08:57:54 +10:00
Dave Airlie	98dbaa445a	radv: resolve all layers in compute resolve path. This path should iterate across all layers, I've some ideas for doing this in a single pass, but this is simpler for now. This passes the tests because we don't use the fragment path unless we have DCC, and we don't have DCC on layered images. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: <mesa-stable@lists.freedesktop.org>	2018-05-14 08:57:27 +10:00
Dave Airlie	b16fc6cda1	radv/resolve: do fmask decompress on all layers. For a multi-layer subpass resolve we want to make sure we flush all the layers. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: <mesa-stable@lists.freedesktop.org>	2018-05-14 08:56:47 +10:00
Rhys Perry	8f6cbb8c7d	nvc0: fix setting of subpixel precision during conservative rasterization Fixes: `07dac3e040` ("nvc0: add conservative rasterization support") Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-13 13:21:41 -04:00
Rhys Perry	c879011c72	anv,nir: add generated files to .gitignore(s) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-12 20:14:49 -07:00
Marek Olšák	86d63b53a2	gallium: remove aux_vertex_buffer_slot code The slot index is always 0, and is pretty unlikely to change in the future. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-05-12 21:08:09 -04:00
Timothy Arceri	ce188813bf	radv: add initial support for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT When VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT is set we skip NIR linking optimisations and only run over the NIR optimisation loop once similar to the GLSLOptimizeConservatively constant used by some GL drivers. We need to run over the opts at least once to avoid errors in LLVM (e.g. dead vars it can't handle) and also to reduce the time spent compiling the IR in LLVM. With this change the Blacksmith Unity demos compilation times go from 329760 ms -> 299881 ms when using Wine and DXVK. V2: add bit to radv_pipeline_key Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106246	2018-05-13 09:58:33 +10:00
Vinson Lee	26ddc4f9e1	scons: Add PROGRAM_NIR_FILES. Fix SCons build error. Linking build/linux-x86_64-debug/gallium/targets/libgl-xlib/libGL.so.1.5 ... build/linux-x86_64-debug/mesa/libmesa.a(st_program.os): In function `st_translate_prog_to_nir': src/mesa/state_tracker/st_program.c:392: undefined reference to `prog_to_nir' Fixes: `5c33e8c772` ("st/nir: use NIR for asm programs") Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2018-05-12 00:50:05 -07:00
Timothy Arceri	5c33e8c772	st/nir: use NIR for asm programs Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-12 14:48:21 +10:00
Timothy Arceri	0b3e9564bd	st/nir: make st_nir_opts() available externally The following patch will make use of this for asm style programs. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-05-12 14:48:21 +10:00
Boyuan Zhang	0907d3ab9c	radeon/vce: add firmware support for ver 53 and up All vce firmwares with major version greater than or equal to 53 are supported Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-05-11 14:59:00 -04:00
Rob Clark	a7c81a7f67	etnaviv: remove pipe_fence_handle::ctx A fence can outlive the ctx it was created from (see glmark2). etnaviv doesn't actually need fence->ctx, so let's remove it before someone makes the mistake of assuming it is a valid pointer. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-05-11 18:42:13 +02:00
George Kyriazis	4e52cb51b5	swr/rast: Thread locked tiles improvement - Change tilemgr TILE_ID encoding to use Morton-order (Z-order). - Change locked tiles set to bitset. Makes clear, set, get much faster. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-11 11:26:35 -05:00
George Kyriazis	8238c791dc	swr/rast: Add Builder::GetVectorType() Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-11 11:25:47 -05:00
George Kyriazis	8cb55dae2e	swr/rast: Prepend the console output with a newline It can get jumbled with output from other threads. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-11 11:25:24 -05:00
George Kyriazis	db25fcfcde	swr/rast: Add ConcatLists() for concatenating lists Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-11 11:22:57 -05:00
George Kyriazis	dcaca3c7b3	swr/rast: Add constant initializer for uint64_t Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-11 11:22:17 -05:00
George Kyriazis	70f0a28b83	swr/rast: Use binner topology to assemble backend attributes Previously was using the draw topology, which may change if GS or Tess are active. Only affected attributes marked with constant interpolation, which limited the impact. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-11 11:21:52 -05:00
George Kyriazis	b3b0f0e0ec	swr/rast: Change formatting Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-11 11:21:22 -05:00
Ville Syrjälä	659910eda0	meson: Fix build for egl platform_x11 with dri3 platform_x11 with dri3 needs inc_loader. In file included from ../src/egl/drivers/dri2/platform_x11_dri3.c:35:0: ../src/egl/drivers/dri2/egl_dri2.h:41:32: fatal error: loader_dri3_helper.h: No such file or directory In file included from ../src/egl/drivers/dri2/platform_x11.c:46:0: ../src/egl/drivers/dri2/egl_dri2.h:41:32: fatal error: loader_dri3_helper.h: No such file or directory In file included from ../src/egl/drivers/dri2/egl_dri2.c:61:0: ../src/egl/drivers/dri2/egl_dri2.h:41:32: fatal error: loader_dri3_helper.h: No such file or directory Cc: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2018-05-11 17:41:57 +03:00
Samuel Pitoiset	efc10949cc	radv: move ac_build_if_state on top of radv_nir_to_llvm.c These helpers will be needed for future work. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-11 12:35:07 +02:00
Samuel Pitoiset	3a410f0afc	radv: minor cleanups in radv_fill_shader_variant() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-11 12:35:05 +02:00
Jan Vesely	58272c1ad7	winsys/amdgpu: Destroy dev_hash table when the last winsys is removed. Fixes memory leak on module unload. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-10 23:23:50 -04:00
Marek Olšák	a2e9d9b4c1	ac/gpu_info: add has_read_registers_query Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:11 -04:00
Marek Olšák	9b1fdfc541	ac/gpu_info: add has_2d_tiling Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:10 -04:00
Marek Olšák	d26696283d	ac/gpu_info: add has_sparse_vm_mappings Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:08 -04:00
Marek Olšák	125adc92ad	ac/gpu_info: add has_unaligned_shader_loads Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:07 -04:00
Marek Olšák	8b9694da4b	radeonsi: expose ARB_query_buffer_object on ancient kernels too It doesn't use indirect dispatches. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:04 -04:00
Marek Olšák	e9c08bc658	ac/gpu_info: add has_indirect_compute_dispatch Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:03 -04:00
Marek Olšák	64265ac8d5	ac/gpu_info: add kernel_flushes_tc_l2_after_ib Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:01 -04:00
Marek Olšák	14c5a93bfa	ac/gpu_info: add has_format_bc1_through_bc7 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:40:00 -04:00
Marek Olšák	2bd2c173e8	ac/gpu_info: add has_eqaa_surface_allocator Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:39:58 -04:00
Marek Olšák	e720cb6135	radeonsi: clean up the reset status query implementation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:39:57 -04:00
Marek Olšák	3060f62340	ac/gpu_info: add has_bo_metadata Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:39:56 -04:00
Marek Olšák	09f1bab483	ac/gpu_info: add si_TA_CS_BC_BASE_ADDR_allowed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:39:54 -04:00
Marek Olšák	8b58a14ef7	ac/gpu_info: add htile_cmask_support_1d_tiling Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:39:53 -04:00
Marek Olšák	b81149e258	ac/gpu_info: add kernel_flushes_hdp_before_ib Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:39:47 -04:00
Marek Olšák	a969f184cf	radeonsi: add an environment variable that forces EQAA for MSAA allocations This is for testing and experiments. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:34:37 -04:00
Marek Olšák	2309cedf44	radeonsi: set up EQAA image descriptors properly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:34:36 -04:00
Marek Olšák	7ac4ef097d	radeonsi: add EQAA SC,DB,CB register programming Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:34:34 -04:00
Marek Olšák	9d00580e75	radeonsi: support creating EQAA color textures Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:34:32 -04:00
Marek Olšák	912b0163dc	ac/surface: add EQAA support Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:34:31 -04:00
Marek Olšák	ee31762ef5	radeonsi: use better sample locations for 8x EQAA Verified with the piglit MSAA accuracy test. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:32:57 -04:00
Marek Olšák	4b6df225f7	radeonsi: improve quality of 16 sample locations This results in better 16x and 8x quality when using these locations. Verified with the piglit MSAA accuracy test. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:29:02 -04:00
Marek Olšák	01fd543c82	radeonsi: use better sample locations for 4x MSAA Discovered by luck. Verified with the piglit MSAA accuracy test. It also shows that EQAA 16s4f results in very good 4x MSAA in the worst case. Nine might not like these positions, but they are prettier to the eye and GL doesn't care. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:28:12 -04:00
Marek Olšák	8d8b71ccfa	radeonsi: reorder sample locations as required by EQAA Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:27:46 -04:00
Marek Olšák	5769a5ec01	radeonsi: simplify si_get_sample_position Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	9f456b3a3c	radeonsi: simplify arrays of sample locations Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	3d70b5beae	radeonsi: set DB_EQAA the same as Vulkan These never change, but they only affect EQAA, which isn't implemented. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	b5ed039325	radeonsi: remove CM_ prefixes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	656fd607be	radeonsi: don't update clear color registers if they don't change Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	835095973d	radeonsi: remove r600_fmask_info radeon_surf contains almost everything. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	bdc3e410f7	ac/surface: unify common legacy and gfx9 fmask fields Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	9bf3570fed	ac/surface/gfx6: compute FMASK together with the color surface instead of invoking FMASK computation separately. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	276acda835	ac/surface/gfx9: fix a typo in CMASK RB/pipe alignment No change in behavior because it's always aligned. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:32 -04:00
Marek Olšák	6841845b00	ac: set correct LLVM processor names for Raven & Vega12 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:32 -04:00
Marek Olšák	6f7f10d285	ac: sort raster configs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:32 -04:00
Marek Olšák	e7b82a9978	ac: remove 1 RB raster config for Iceland Iceland always reports 2 RBs. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:32 -04:00
Marek Olšák	cb0f5cddcc	ac: move the Fiji kernel workaround for raster config out of the switch Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:32 -04:00
Marek Olšák	ce954ac6f3	ac: enable both RBs on Kaveri This can result in 2x increase in performance on non-harvested Kaveris. v2: don't do it on radeon Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:32 -04:00
Marek Olšák	597b9e8810	radeonsi/gfx9: work around a GPU hang due to broken indirect indexing in LLVM Fixes: `6d19120da8` "radeonsi/gfx9: workaround for INTERP with indirect indexing" Cc: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:32 -04:00
Jason Ekstrand	b784561c1a	intel/isl/storage: Don't lower most UNORM formats on gen11+ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-05-10 14:13:24 -07:00
Jason Ekstrand	399962e7c6	intel/isl: Several UNORM formats support typed writes on gen11+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-05-10 14:12:55 -07:00
Brian Paul	e4211b36bb	mesa: revert GL_[SECONDARY_]COLOR_ARRAY_SIZE glGet type to TYPE_INT Since size can be 3, 4 or GL_BGRA we need to keep these glGet types as TYPE_INT, not TYPE_UBYTE. Fixes: `d07466fe18` ("mesa: fix glGetInteger/Float/etc queries for vertex arrays attribs") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106462 cc: mesa-stable@lists.freedesktop.org Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-05-10 09:49:40 -06:00
Andres Rodriguez	34e9e4023f	radv: disable DCC for shareable images on GFX9+ This seems to be broken at the moment for opengl interop. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-10 11:27:12 -04:00
Thomas Petazzoni	54bbe600ec	configure.ac: rework -latomic check The configure.ac logic added in commit `2ef7f23820` ("configure: check if -latomic is needed for __atomic_*") makes the assumption that if a 64-bit atomic intrinsic test program fails to link without -latomic, it is because we must use -latomic. Unfortunately, this is not completely correct: libatomic only appeared in gcc 4.8, and therefore gcc versions before that will not have libatomic, and therefore don't provide atomic intrinsics for all architectures. This issue was for example encountered on PowerPC with a gcc 4.7 toolchain, where the build fails with: powerpc-ctng_e500v2-linux-gnuspe/bin/ld: cannot find -latomic This commit aims at fixing that, by not assuming -latomic is available. The commit re-organizes the atomic intrinsics detection as follows: (1) Test if a program using 64-bit atomic intrinsics links properly, without -latomic. If this is the case, we have atomic intrinsics, and we're good to go. (2) If (1) has failed, then test to link the same program, but this time with -latomic in LDFLAGS. If this is the case, then we have atomic intrinsics, provided we link with -latomic. This has been tested in three situations: - On x86-64, where atomic intrinsics are all built-in, with no need for libatomic. In this case, config.log contains: GCC_ATOMIC_BUILTINS_SUPPORTED_FALSE='#' GCC_ATOMIC_BUILTINS_SUPPORTED_TRUE='' LIBATOMIC_LIBS='' This means: atomic intrinsics are available, and we don't need to link with libatomic. - On NIOS2, where atomic intrinsics are available, but some of them (64-bit ones) require using libatomic. In this case, config.log contains: GCC_ATOMIC_BUILTINS_SUPPORTED_FALSE='#' GCC_ATOMIC_BUILTINS_SUPPORTED_TRUE='' LIBATOMIC_LIBS='-latomic' This means: atomic intrinsics are available, and we need to link with libatomic. - On PowerPC with an old gcc 4.7 toolchain, where 32-bit atomic intrinsics are available, but not 64-bit atomic intrinsics, and there is no libatomic. In this case, config.log contains: GCC_ATOMIC_BUILTINS_SUPPORTED_FALSE='' GCC_ATOMIC_BUILTINS_SUPPORTED_TRUE='#' This means that atomic intrinsics are not usable. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>	2018-05-10 08:13:57 -07:00
Brian Paul	d07466fe18	mesa: fix glGetInteger/Float/etc queries for vertex arrays attribs The vertex array Size and Stride attributes are now ubyte and short, respectively. The glGet code needed to be updated to handle those types, but wasn't. Fixes the new piglit gl-1.5-get-array-attribs test. v2: fix inadvertent whitespace change, change COLOR_ARRAY_SIZE to UBYTE, misc fixes suggested by Justin Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106450 Fixes: `d5f42f96e1` ("mesa: shrink size of gl_array_attributes (v2)") Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-05-10 08:08:11 -06:00
Jan Vesely	45dfa6f4e7	winsys/radeon: Destroy fd_hash table when the last winsys is removed. Fixes memory leak on module unload. v2: Use util_hash_table helper function CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2018-05-10 05:12:48 -04:00
Jan Vesely	d146768d13	gallium/auxiliary: Add helper function to count the number of entries in hash table CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2018-05-10 05:12:43 -04:00
Samuel Pitoiset	0defc55547	radv: move handling nosisched option in a better place It's a per-application optimization, so it makes more sense to do that in radv_handle_per_app_options(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-10 10:57:41 +02:00
Grazvydas Ignotas	4fdce205dd	radv: assorted typo fixes Trivial. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-10 11:50:46 +03:00
Mathias Fröhlich	f660683027	mesa/vbo/tnl: Move gl_vertex_array related stuff to tnl. The only remaining users of gl_vertex_array are tnl based drivers. So move everything related to that into tnl and rename it accordingly. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:16 +02:00
Mathias Fröhlich	881d2fcafa	mesa: Remove Array._DrawArrays. Only tnl based drivers still use this array. So remove it from core mesa and use Array._DrawVAO instead. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:16 +02:00
Mathias Fröhlich	899476b6b1	i965: Remove the now unused gl_vertex_array. Was meant to be temporary in i965. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:16 +02:00
Mathias Fröhlich	0fabd55306	i965: Remove the gl_vertex_array indirection. For now store binding and attrib in brw_vertex_element. The i965 driver still provides lots of opportunity to make use of the unique binding information in the VAO which is currently not taken from the VAO. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:16 +02:00
Mathias Fröhlich	172c9a908f	i965: Implement all_varyings_in_vbos in terms of Array._DrawVAO. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:16 +02:00
Mathias Fröhlich	79eb6ab7b6	st/mesa: Remove the now unused gl_vertex_array. Was meant to be temporary in gallium. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:16 +02:00
Mathias Fröhlich	4c77f0d065	st/mesa: Make feedback draw and rasterpos use _DrawVAO. Instead of playing with Array._DrawArrays, make the feedback draw path use Array._DrawVAO. Also st_RasterPos needs to use the VAO then. v2: Use helper methods to get the offset values for array and binding. Update comments. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:16 +02:00
Mathias Fröhlich	19a91841c3	st/mesa: Use Array._DrawVAO in st_atom_array.c. Finally make use of the binding information in the VAO when setting up arrays for draw. v2: Emit less relocations also for interleaved userspace arrays. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:15 +02:00
Mathias Fröhlich	9987a072cb	st/mesa: Make the input_to_index array available. The input_to_index array is already available internally when preparing vertex programs. Store the map in struct st_vertex_program. Also store the bitmask of mesa vertex processing inputs in struct st_vp_variant. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:15 +02:00
Mathias Fröhlich	f24bf45210	st/mesa: Use _DrawVAO for edgeflag enabled check. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:15 +02:00
Mathias Fröhlich	d1698d4311	mesa: Compute effective buffer bindings in the vao. Compute VAO buffer binding information past the position/generic0 mapping. Scan for duplicate buffer bindings and collapse them into derived effective buffer binding index and effective attribute mask variables. Provide a set of helper functions to access the distilled information in the VAO. All of them prefixed with _mesa_draw_... to indicate that they are meant to query draw information. v2: Also group user space arrays containing interleaved arrays. Add _Eff*Offset to be copied on attribute and binding copy. Update comments. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-05-10 07:06:15 +02:00
Gert Wollny	fb4011ace9	virgl: Add support for passing GL_ANY_SAMPLES_PASSED_CONSERVATIVE This is needed for fixing CTS: dEQP-GLES3.functional.occlusion_query.conservative* Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Gert Wollny <gert.wollny@collabora.com>	2018-05-10 12:26:57 +10:00
Dave Airlie	ce027ac5c7	r600: fix constant buffer bounds. If you have an indirect access to a constant buffer on r600/eg use a vertex fetch in the shader. However apps have expected behaviour on those out of bounds accesses (even if illegal). If the constants were being uploaded as part of a larger upload buffer, we'd set the range of allowed access to a lot larger than required so apps would get values back from other parts of the upload buffer instead of the expected out of bounds access. This fixes rendering bugs in Trine and Witcher 1, thanks to iive for nagging me effectively until I figured it out :-) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91808 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-05-10 02:14:32 +01:00
Jason Ekstrand	a8a740f272	i965,anv: Set the CS stall bit on the ISP disable PIPE_CONTROL From the bspec docs for "Indirect State Pointers Disable": "At the completion of the post-sync operation associated with this pipe control packet, the indirect state pointers in the hardware are considered invalid" So the ISP disable is a post-sync type of operation which means that it should be combined with a CS stall. Without this, the simulator throws an error. Fixes: `766d801ca` "anv: emit pixel scoreboard stall before ISP disable" Fixes: `f536097f6` "i965: require pixel scoreboard stall prior to ISP disable" Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-09 18:03:28 -07:00
Dave Airlie	56766b8515	radv: handle arrays in the fmask descriptor. This fixes the fmask descriptor generation to handle 2d ms arrays. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-10 10:42:49 +10:00
Matt Turner	0f959215c3	gallium/tests: Fix assignment of EXTRA_DIST Fixes: `6754c2e83d` ("autotools: Include new meson files")	2018-05-09 16:38:47 -07:00
Matt Turner	0097940223	configure.ac: Check for grep with AC_PROG_GREP Perhaps with a new version of autoconf, I began seeing: \| checking the name lister (/usr/bin/nm -B) interface... ./configure: line 6973: External.some_variable: command not found \| BSD nm This is because AC_PROG_NM expands to ... if $GREP 'External.some_variable' conftest.out > /dev/null; then lt_cv_nm_interface="MS dumpbin" fi ... I'm not sure if it's a bug in AC_PROG_NM that it doesn't call AC_PROG_GREP, but it's easy enough for us to do it.	2018-05-09 16:38:47 -07:00
Xiong, James	0ab266dc1b	main: fail texture_storage() call if the size is not okay Signed-off-by: Xiong, James <james.xiong@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 09:34:31 +10:00
Xiong, James	08c1444c95	main: return 0 length when the queried program object's not linked Signed-off-by: Xiong, James <james.xiong@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-05-10 09:34:19 +10:00
Kenneth Graunke	a83face48a	i965: Shut up unused variable warnings. These are only used in assertions.	2018-05-09 16:20:50 -07:00
Ross Burton	1755654d9f	src/intel/Makefile.vulkan.am: add missing MKDIR_GEN Out of tree builds can try to write into a directory that doesn't exist yet: \| Traceback (most recent call last): \| File "../../../mesa-18.0.2/src/intel/vulkan/anv_icd.py", line 46, in <module> \| with open(args.out, 'w') as f: \| IOError: [Errno 2] No such file or directory: 'vulkan/intel_icd.x86_64.json' \| Makefile:4882: recipe for target 'vulkan/intel_icd.x86_64.json' failed Add missing MKDIR_GEN calls to solve this. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-09 16:08:52 -07:00
Rhys Perry	5ac16ed047	mesa: fix error handling in get_framebuffer_parameteriv CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-09 14:32:40 -07:00
Lionel Landwerlin	766d801ca3	anv: emit pixel scoreboard stall before ISP disable We want to make sure that all indirect state data has been loaded into the EUs before disabling the pointers. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Fixes: `78c125af39` ("anv/gen10: Ignore push constant packets during context restore.")	2018-05-09 20:11:57 +01:00
Lionel Landwerlin	f536097f67	i965: require pixel scoreboard stall prior to ISP disable Invalidating the indirect state pointers might affect a previously scheduled & still running 3DPRIMITIVE (causing page fault). So stall on pixel scoreboard before that. v2: Fix compile issue :( v3: Stall on pixel scoreboard v4: Drop the post sync operation (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Fixes: `ca19ee33d7` ("i965/gen10: Ignore push constant packets during context restore.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106243	2018-05-09 20:11:51 +01:00
Jason Ekstrand	561348caa1	intel/isl: Allow CCS_E on 1010102 formats On CNL and above, CCS_E supports 1010102 formats and R11G11B10F. We had shut them off during early enabling because blorp_copy couldn't handle them. Now it can handle 1010102 formats so we can turn them back on. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	ccb44b8a94	intel/blorp: Allow CCS copies of 1010102 formats Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	1978de66f7	intel/blorp: Add support for more format bitcasting nir_format_bitcast_uint_vec_unmasked can only be used to cast between formats with uniform channel sizes. In particular, it cannot handle 10_10_10_2 formats. By making use of the NIR helper for uint vector casts, we should now be able to bitcast between any two uint formats so long as their channels are in RGBA order (possibly with channels missing). In order to do this we need to rework the key a bit to pass the actual formats instead of just the number of bits in each. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	7998fe268e	intel/blorp: Use nir_format_bitcast_uint_vec_unmasked Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	047e68389f	nir/format_convert: Add code for bitcasting vectors This is a fairly direct port from blorp. The only real change is that the nir_format_convert version doesn't assume that everything is a vec4. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	a6b66a7b26	intel/blorp: Use ISL instead of bitcast_color_value_to_uint Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	09ced65420	intel/isl: Add format conversion code This adds helpers to ISL to convert an isl_color_value to and from binary data encoded with a given isl_format. The conversion is done using ISL's built-in format introspection so it's fairly slow as format conversions go but it should be fine for a single pixel value. In particular, we can use this to convert clear colors. As a side-effect, we now rely on the sRGB helpers in libmesautil so we need to tweak the build system a bit. All prior uses of src/util in ISL were header-only. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	8152c60e01	intel/isl/format: Get rid of the ALPHA colorspace Alpha-only formats are just linear. There's no need to specially deliminate them as being in their own colorspace. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	8ab73790ef	intel/isl/format: Add field locations informations to channel_layout Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	96598fbc02	intel/isl/format: Add a column for channel order to the table Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	d08d6a3da8	i965/blorp: Remove a pile of blorp_blit restrictions Previously, blorp could only blit into something that was renderable. Thanks to recent additions to blorp, it can now blit into basically anything so long as it isn't compressed. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	465d8566cd	i965/blorp: Allow blorp blits for 16x MSAA BLORP has supported 16x MSAA for quite a while now, we just never bothered to enable it for CopyTexSubImage. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	09eede9c9d	anv: Allow blitting to/from any supported format Now that blorp handles all the cases, why not? The only real change we have to make is to stop using anv_swizzle_for_render() in blorp_blit because it doesn't work for B4G4R4A4 and blorp now natively handles that. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	8ce31c9cc5	intel/blorp: Support the RGB workaround on more formats Previously we only supported UINT formats because that's what blorp_copy required. If we want to use it in blorp_blit, however, we need to support everything. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	4e26e3dea9	intel/blorp: Silently convert RGBX destination formats to RGBA Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	08cd834996	intel/isl: Add some helpers for working with RGBX formats Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	804856fa57	intel/blorp: Handle more exotic destination formats This commit adds support for the following formats as destination formats even though the hardware does not support rendering to them: - ISL_FORMAT_R24_UNORM_X8_TYPELESS - ISL_FORMAT_A4B4G4R4_UNORM - ISL_FORMAT_L8_UNORM_SRGB - ISL_FORMAT_R9G9B9E5_SHAREDEXP This is done by using a different format and emitting shader code to fake it the rest of the way. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	9e492bb92e	intel/blorp: Include nir_format_convert.h in blorp_blit.c nir_mask_shift_or is now defined in nir_format_convert.h so we can delete the copy in blorp_blit.c. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	9981709d8f	nir/format_convert: Add a function to pack RGB9_E5 formats Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	4e337b42f9	nir/format_convert: Add pack/unpack for R11F_G11F_B10F Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	98156b0019	nir/format_convert: Add linear <-> sRGB helpers Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	2fdd966e3d	nir: Add the start of a format conversion helper header Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	906c32ce87	intel/blorp: Add swizzle support for all hardware This commit makes blorp capable of swizzling anything even on hardware that doesn't support texture swizzle. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	1ef4f5aff1	intel/isl: Add a helper for inverting swizzles Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	242f6f7492	intel/isl: Add a helper for composing swizzles Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	dad67cc245	intel/isl: Add an isl_swizzle_supports_rendering helper This helper encodes more details, specifically about Haswell, than the previous asserts in isl_surface_state.c. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	23d703de1f	i965/surface_state: Use an identity swizzle pre-Haswell Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-09 11:16:33 -07:00
Jason Ekstrand	293b8de161	blorp: Handle the RGB workaround more like other workarounds The previous version was sort-of strapped on in that it just adjusted the blit rectangle and trusted in the fact that we would use texelFetch and round to the nearest integer to ensure that the component positions matched. This new version, while slightly more complicated, is more accurate because all three components end up with exactly the same dst_pos and so they will get interpolated and sampled at the same texture coordinate. This makes the workaround suitable for using with scaled blits. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-05-09 11:16:33 -07:00
Lionel Landwerlin	3853f1c6f4	i965: silence unused variable Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `2dc29e095f` ("i965: Don't leak blorp on Gen4-5.") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-05-09 18:12:10 +01:00
Lionel Landwerlin	11d36c373a	intel: devinfo: silence coverity warning It's just not possible to have a device with no subslices. CID: 1433511 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-05-09 15:21:01 +01:00
Michel Dänzer	6f81e07ecb	dri3: Only update number of back buffers in loader_dri3_get_buffers And only free no longer needed back buffers there as well. We want to stick to the same back buffer throughout a frame, otherwise we can run into various issues. Bugzilla: https://bugs.freedesktop.org/105906 Bugzilla: https://bugs.freedesktop.org/106399 Fixes: `3160cb86aa` "egl/x11: Re-allocate buffers if format is suboptimal" Reported-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Tested-by: Eero Tamminen <eero.t.tamminen@intel.com> Acked-by: Daniel Stone <daniels@collabora.com>	2018-05-09 15:40:41 +02:00
Samuel Iglesias Gonsálvez	2cf64fdb46	anv: ignore pColorBlendState if all color attachments of the subpass are unused According to Vulkan spec: "pColorBlendState is a pointer to an instance of the VkPipelineColorBlendStateCreateInfo structure, and is ignored if the pipeline has rasterization disabled or if the subpass of the render pass the pipeline is created against does not use any color attachments." Fixes tests from CL#2505: dEQP-VK.renderpass.*.simple.color_unused_omit_blend_state v2: - Check that blend is not NULL before usage. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-09 07:01:10 +02:00
Timothy Arceri	e7a7b712fe	mesa: remove hard-coded OpenGL 3.2 compat limit Just let validate_context_version() do it instead. This fixes MESA_GL_VERSION_OVERRIDE for compat, it will also allow us to enable new compat versions on a per-driver basis in future. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-09 14:24:43 +10:00
Timothy Arceri	4560aad780	mesa: add GLSLVersionCompat constant This allows drivers to define what version of GLSL they support in compat. This will be needed in order to support compat 3.2 without breaking drivers that won't support it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-09 14:24:36 +10:00
Timothy Arceri	be3ee9d141	mesa: dont call _mesa_override_glsl_version() in _mesa_init_constants() All drivers that support GLSL will later set their default GLSL versions overriding this override call. They currently all call _mesa_override_glsl_version() again later in order to support overrides. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-09 14:24:29 +10:00
Timothy Arceri	2a621acc8d	mesa: dont set GLSLVersion in _mesa_init_constants() Just leave it as 0 and let the drivers set it (as they already do) to avoid redundantly initialising it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-09 14:24:22 +10:00
Jan Vesely	0783399d79	pipe-loader: Free driver_name in error path CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-08 21:35:07 -04:00
Brian Paul	901db25d5b	glsl: change ast_type_qualifier bitset size to work around GCC 5.4 bug Change the size of the bitset from 128 bits to 96. This works around an apparent GCC 5.4 bug in which bad SSE code is generated, leading to a crash in ast_type_qualifier::validate_in_qualifier() (ast_type.cpp:654). This can be repro'd with the Piglit test tests/spec/glsl-1.50/execution/varying-struct-basic-gs-fs.shader_test Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105497 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Charmaine Lee <charmainel@vmware.com> Tested-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-08 19:06:09 -06:00
Kenneth Graunke	20f06bc72b	i965: Dump validation list on INTEL_DEBUG=bat,submit. This is really useful when debugging any sort of buffer management issues, so just printing it during INTEL_DEBUG=bat,submit seems reasonable. With bat, we're already spamming so much output that it doesn't really hurt. With submit, it's still easy to grep for the older information, and the new information is nice too. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-08 10:08:16 -07:00
Jason Ekstrand	06d3841882	i965/miptree: Remove redundant fields from intel_miptree_aux_buffer Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:27:46 -07:00
Jason Ekstrand	4f4779b367	i965: Simplify brw_emit_depthbuffer and brw_emit_depth_stencil_hiz Now that we're using ISL, a good chunk of brw_emit_depthstencil is pointless checks which ISL will do for us anyway. Since we only have one manual depth buffer emit function, move the useful bits into it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:27:45 -07:00
Jason Ekstrand	96f01501d7	i965: Move brw_emit_depth_stencil_hiz higher up in the file Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:27:45 -07:00
Jason Ekstrand	bdbb527a65	i965: Use ISL for emitting depth/stencil/hiz state on gen6+ We leave gen4-5 alone because the ISL code hasn't really been well-tested on gen4-5 or with combined depth-stencil because we don't use BLORP for depth operations on gen4-5. Also, the gen4-5 code has to deal with intratile offsets for LOD hacks and ISL doesn't handle those yet. We could make ISL handle gen4-5 capable or we could just not bother. Among other things, this should make future platform enabling easier because it means we don't have to update multiple (or hand-rolled!) depth stencil emit paths. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:27:44 -07:00
Jason Ekstrand	ccd3dce3c0	i965: Use the brw_depthbuffer atom on all gens The only reason why we had two atoms was that the one we used for gen7+ depended on _NEW_DEPTH and _NEW_STENCIL as well as _NEW_BUFFERS. Since this is no longer true, we can combine them into one atom. We do add a dependence on BRW_NEW_AUX_STATE but that should never get set on gen4-5 so adding it is a no-op for those platforms. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:27:44 -07:00
Jason Ekstrand	514bb6f41e	i965: Always set depth/stencil write enables on gen7+ The hardware will AND these fields with the corresponding fields in DEPTH_STENCIL_STATE so there's no real reason to toggle them on and off based on state bits. This removes our reliance on the _NEW_DEPTH and _NEW_STENCIL state bits and better matches what ISL does. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:27:43 -07:00
Jason Ekstrand	c4d00da7b7	i965: Re-order depth/stencil/hiz/clear packets to match ISL Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:27:42 -07:00
Jason Ekstrand	6fc3404911	i965: Re-emit depth/stencil/hiz on BRW_NEW_AUX_STATE Certain things can change the aux usage or fast clear color of a depth surface and we want to re-emit if that happens. For instance, if you do a fast depth clear of an already clear depth surface, we will just set the clear color and not do anything else. In that case, we could fail to re-emit 3DSTATE_CLEAR_PARAMS and not get the new fast-clear color. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 08:23:55 -07:00
Lionel Landwerlin	3cdf1bf97d	intel: devinfo: fix assertion on devices with odd number of EUs I forgot to change the assert in the second helper function in a previous change. This hit the assert() on a Broadwell platform with 1 slice, 3 subslices but all EUs disabled in subslice 1 & 2. Fixes: `c1900f5b0f` ("intel: devinfo: add helper functions to fill fusing masks values") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-08 15:15:54 +01:00
Bas Nieuwenhuizen	b17cfb08a3	vulkan/wsi: Only use LINEAR modifier for prime if supported. This was setting the LINEAR modifier if neither the X server nor the driver supported modifiers. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106180 Fixes: `c80c08e226` "vulkan/wsi/x11: Add support for DRI3 v1.2" CC: 18.1 <mesa-stable@lists.freedesktop.org> Tested-by: Abel Garcia Dorta <mercuriete@gmail.com> Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-08 15:47:16 +02:00
Jan Vesely	a9e4be9212	eg/compute: Drop reference to kernel_param bo in destructor CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-08 09:02:38 -04:00
Jan Vesely	a1e8fcce3e	r600: Cleanup constant buffers on context destruction CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-08 09:02:30 -04:00
Alejandro Piñeiro	b6648798cf	mesa/formatquery: remove online compression check on is_resource_supported is_resource_supported returns if the combination of target/internalformat is supported in at least one operation. Online compression is only mandatory for glTexImage2D. Some formats don't support online compression, but can be used in any case, with glCompressed*D methods. Without this commit, ETC2 internalformats were returning FALSE, even for the drivers supporting it. So any other query (like TEXTURE_COMPRESSED) was returning FALSE/NONE instead of the proper value. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-08 08:19:38 +02:00
Kenneth Graunke	e6fb8196ce	intel/genxml: Assert that genxml field start and ends are sane. Chris recently fixed a bunch of genxml end < start bugs, as well as booleans that are wider than a bit. These are way too easy to write, so asserting that the fields are sane is a good plan. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-07 23:06:52 -07:00
Kenneth Graunke	f83fd929b7	intel/genxml: Fix some more fake booleans in genxml. None of these are actually booleans. Tile Parameter is a tiling mode enum. Display pipes take plane numbers. Predicate Enable has some operations (and the default value of 6 was particularly bogus). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-07 23:06:52 -07:00
Kenneth Graunke	33906eeaca	intel/genxml: Make assert in gen_pack_header print a message. Python's assert can take both a condition and a string, which will cause it to print the string if the assertion trips. (You can't use parens as that creates a tuple.) Doing "condition and string" works in C, but doesn't have the desired effect in Python. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-07 23:06:52 -07:00
Kenneth Graunke	2dc29e095f	i965: Don't leak blorp on Gen4-5. We used to only initialize BLORP on Gen6+. When we added it on Gen4-5, we forgot to destroy it unconditionally. Fixes: `752d7af77a` (i965: Add blorp support for gen4-5) Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-07 23:05:59 -07:00
Matt Turner	ed5af94373	nir: Transform discard_if(true) into discard Noticed while reviewing Tim Arceri's NIR inlining series. Without his series: instructions in affected programs: 16 -> 14 (-12.50%) helped: 2 With his series: instructions in affected programs: 196 -> 174 (-11.22%) helped: 22 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-07 13:50:23 -07:00
Jan Vesely	ea1fff4416	eg/compute: Drop reference on code_bo in destructor. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-07 15:04:03 -04:00
Nicolas Boichat	54ba73ef10	configure.ac/meson.build: Fix -latomic test When compiling with LLVM 6.0 on x86 (32-bit) for Android, the test fails to detect that -latomic is actually required, as the atomic call is inlined. In the code itself (src/util/disk_cache.c), we see this pattern: p_atomic_add(cache->size, - (uint64_t)size); where cache->size is an uint64_t *, and results in the following link time error without -latomic: src/util/disk_cache.c:628: error: undefined reference to '__atomic_fetch_add_8' Fix the configure/meson test to replicate this pattern, which then correctly realizes the need for -latomic. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>	2018-05-07 10:14:53 -07:00
Scott D Phillips	8b519075ea	anv: remove unused field anv_queue::pool The last use of the field was removed in 2015's ("48a87f4ba06 anv/queue: Get rid of the serial") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-07 09:03:46 -07:00
Kenneth Graunke	0b1cfd01ff	i965: Set initial kflags on BO creation. This simplifies kflag initialization, by creating a bufmgr-wide setting for initial kflags, and just applying it whenever we create a new BO. This also properly allows 48-bit addresses for imported BOs (via prime or flink), which I had missed in my earlier 48-bit support series. This will be useful when adding softpin support, as we'll want to add EXEC_OBJECT_PINNED to initial_kflags as well. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-05-07 08:47:21 -07:00
Juan A. Suarez Romero	7ee54fc33d	docs: update calendar, add news and link release notes to 18.0.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-05-07 11:25:54 +00:00
Juan A. Suarez Romero	78e103da8b	docs: add sha256 checksums for 18.0.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `ae12c5e990`)	2018-05-07 11:19:36 +00:00
Juan A. Suarez Romero	6c06d4e17b	docs: add sha256 checksums for 18.0.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `6dc2658fd6`)	2018-05-07 11:19:34 +00:00
Chris Wilson	cf440d85db	intel/genxml: Fix a few invalid field widths A couple of typos found by inspecting field.end - field.start, revealed a few wide integers declared as bool and some that ended before they started. Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-07 11:34:13 +01:00
Vinson Lee	cd5319a64f	swr/rast: Fix include for createInstructionCombiningPass with llvm-7.0. Fix build error after llvm-7.0.0svn r330669 ("InstCombine: Fix layering by not including Scalar.h in InstCombine"). CXX rasterizer/jitter/libmesaswr_la-blend_jit.lo rasterizer/jitter/blend_jit.cpp:816:20: error: use of undeclared identifier 'createInstructionCombiningPass'; did you mean 'createInstructionSimplifierPass'? passes.add(createInstructionCombiningPass()); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ createInstructionSimplifierPass Suggested-by: George Kyriazis <george.kyriazis@intel.com> Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-05-05 13:20:53 -07:00
Jan Vesely	2f1ad72ac1	clover: Add explicit virtual destructor to argument class It is needed to destroy the v vector in scalar_argument. Fixes memory leaks on parameter set/bind. v2: Drop redundant scalar_argument destructor Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-05-05 13:17:08 -04:00
Iago Toral Quiroga	e4c667b9e8	anv/device: expose shaderInt16 support in gen8+ This rolls back the revert of this patch introduced with commit `7cf284f18e`. Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-05 12:41:14 +02:00
Iago Toral Quiroga	5a12bdac09	i965/compiler: handle conversion to smaller type in the lowering pass for that This rolls back the revert of this same patch introduced in commit `7b9c15628a`. And also squashes the following patch to prevent a piglit regression caused by this change: intel/compiler: Fix lower_conversions for 8-bit types. Author: Jose Maria Casanova Crespo <jmcasanova@igalia.com> For 8-bit types the execution type is word. A byte raw MOV has 16-bit execution type and 8-bit destination and it shouldn't be considered a conversion case. So there is no need to change alignment and enter in lower_conversions for these instructions. Fixes a regression in the piglit test "glsl-fs-shader-stencil-export" that is introduced with this patch from the Vulkan shaderInt16 series: 'i965/compiler: handle conversion to smaller type in the lowering pass for that'. The problem is caused because there is already a case in the driver that injects Byte instructions like this: mov(8) g127<1>UB g2<32,8,4>UB And the aforementioned pass was not accounting for the special handling of the execution size of Byte instructions. This patch fixes this. v2: (Jason Ekstrand) - Simplify is_byte_raw_mov, include reference to PRM and not consider B <-> UB conversions as raw movs. v3: (Matt Turner) - Indentation style fixes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-05 12:41:02 +02:00
Iago Toral Quiroga	a75f967388	intel/compiler: handle 16-bit to 64-bit conversions in BSW platforms These are subject to the general restriction that anything that is converted to 64-bit needs to be aligned to 64-bit. We had this already in place for 32-bit to 64-bit conversions, so this patch generalizes the implementation to take effect on any conversion to 64-bit from a source smaller than 64-bit. Fixes assembly validation errors in the following CTS tests in BSW: dEQP-VK.spirv_assembly.instruction.compute.sconvert.int16_to_int64 dEQP-VK.spirv_assembly.instruction.compute.uconvert.uint16_to_uint64 dEQP-VK.spirv_assembly.instruction.compute.sconvert.int16_to_uint64 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-05 12:26:37 +02:00
Caio Marcelo de Oliveira Filho	9d1ff2261c	intel/genxml: recognize 0x, 0o and 0b when setting default value Remove the need of converting values that are documented in hexadecimal. This patch would allow writing <field name="3D Command Sub Opcode" ... default="0x1B"/> instead of <field name="3D Command Sub Opcode" ... default="27"/> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-04 23:58:10 +01:00
Ian Romanick	9a10a2fd5f	r200: Enable NV_fog_distance With the previous fixes in place, it appears to just work. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-04 15:29:30 -07:00
Ian Romanick	9d0bf720ed	i965: Enable NV_fog_distance With the previous fixes in place, it appears to just work. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-04 15:29:28 -07:00
Ian Romanick	df80ffa4aa	ffvertex: Don't try to read output registers in fog calculation Gallium drivers use _mesa_remove_output_reads() via st_program to lower output reads away. It seems better to just generate the right thing in the first place. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-04 15:27:50 -07:00
Ian Romanick	f2db3be620	mesa: Add missing support for glFogiv(GL_FOG_DISTANCE_MODE_NV) Found by inspection, so I made a piglit test too. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-04 15:27:44 -07:00
Ian Romanick	d350276b03	mesa: Silence an unused parameter warning main/framebuffer.c: In function ‘update_color_draw_buffers’: main/framebuffer.c:629:46: warning: unused parameter ‘ctx’ [-Wunused-parameter] update_color_draw_buffers(struct gl_context *ctx, struct gl_framebuffer *fb) ^~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-04 15:27:40 -07:00
Gert Wollny	e695a35f40	mesa/main/readpix: Correct handling of packed floating point values Make sure that clamping in the pixel transfer operations is enabled/disabled for packed floating point values just like it is done for single normal and half precision floating point values. This fixes a series of CTS tests with virgl that use r11f_g11f_b10f buffers as target, and where virglrenderer reads these surfaces back using the format GL_UNSIGNED_INT_10F_11F_11F_REV. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-05-04 10:47:46 -07:00
Scott D Phillips	5c075b0855	util/set: add a set_clear function Clear a set back to the state of having zero entries. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-04 10:13:33 -07:00
Tapani Pälli	affe63b1da	egl: add EGL_BAD_MATCH error case for surfaceless and android Just like is done for other backends when suitable config is not found (added in `fd4eba4929`). Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-05-04 14:04:03 +03:00
Nicolai Hähnle	c0acb596f4	amd/common: use llvm.amdgcn.wqm for explicit derivatives To comply with an upcoming change in LLVM, see https://reviews.llvm.org/D46051 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-04 11:02:48 +02:00
Rhys Perry	b30949a9c2	nv50/ir: fix printing of pixld Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-03 22:57:46 -04:00
Drew Davenport	4373dd3215	st/va: Support YUV formats in vaCreateSurfaces Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2018-05-03 15:48:35 -07:00
Mark Janes	7cf284f18e	Revert "anv/device: expose shaderInt16 support in gen8+" This reverts commit `0ba0ac815e`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393 Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-03 15:26:59 -07:00
Mark Janes	7b9c15628a	Revert "i965/compiler: handle conversion to smaller type in the lowering pass for that" This reverts commit `96b5153790`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106393 Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-05-03 15:26:59 -07:00
Vinson Lee	589622a2fe	swr/rast: Fix WriteBitcodeToFile usage with llvm-7.0. Fix build error after llvm-7.0svn r325155 ("Pass a reference to a module to the bitcode writer."). CXX rasterizer/jitter/libmesaswr_la-JitManager.lo rasterizer/jitter/JitManager.cpp:548:30: error: reference to type 'const llvm::Module' could not bind to an lvalue of type 'const llvm::Module *' llvm::WriteBitcodeToFile(M, bitcodeStream); ^ Suggested-by: George Kyriazis <george.kyriazis@intel.com> Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-05-03 14:06:09 -07:00
Deepak Rawat	9a21c96126	egl/x11: Send invalidate to driver on copy_region path in swap_buffer Similar to the swap_available path, send invalidate to the driver because egl/X11 is not watching for the server's invalidate events. The dri2_copy_region path is triggered when the server supports DRI2 version minor 1. Tested with piglit egl tests for regression. V2: Move invalidate from dri2_copy_region to swap_buffer common. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Deepak Rawat <drawat@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2018-05-03 13:55:58 +02:00
Juan A. Suarez Romero	fd4eba4929	egl: check if colorspace/surface type is supported According to EGL 1.4 spec, section 3.5.1 ("Creating On-Screen Rendering Surfaces"), if config does not support the colorspace or alpha format attributes specified in attrib_list (as defined for eglCreateWindowSurface), an EGL_BAD_MATCH error is generated. This fixes dEQP-EGL.functional.wide_color.*_888_colorspace_srgb (still not merged, https://android-review.googlesource.com/c/platform/external/deqp/+/667322), which is crashing when trying to create a windows surface with RGB888 configuration and sRGB colorspace. v2: Handle the fix in other backends (Tapani) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-05-03 12:26:12 +02:00
Iago Toral Quiroga	0ba0ac815e	anv/device: expose shaderInt16 support in gen8+ Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	002cb6f2b3	anv/pipeline: support SpvCapabilityInt16 in gen8+ Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	f07c05576f	compiler/spirv: add implementation to check for SpvCapabilityInt16 support Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	dd41630d9a	intel/compiler: implement 16-bit pack/unpack opcodes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	1dacb56279	compiler/spirv: implement 16-bit bitcasts Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	2d648e5ba3	compiler/lower_64bit_packing: rename the pass to be more generic It can do 32-bit packing too now. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	d2564af842	nir/lower_64bit_packing: extend the pass to handle packing from / to 16-bit. With 16-bit support we can now do 32-bit packing, a follow-up patch will rename the pass to something more generic. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	c9653cc14c	nir: add opcodes for 16-bit packing and unpacking Notice that we don't need 'split' versions of the 64-bit to / from 16-bit opcodes which we require during pack lowering to implement these operations. This is because these operations can be expressed as a collection of 32-bit from / to 16-bit and 64-bit to / from 32-bit operations, so we don't need new opcodes specifically for them. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:26 +02:00
Iago Toral Quiroga	6318808a05	intel/compiler: fix 16-bit comparisons NIR assumes that booleans are always 32-bit, but Intel hardware produces 16-bit booleans for 16-bit comparisons. This means that we need to convert the 16-bit result to 32-bit. In the future we want to add an optimization pass to clean this up and hopefully remove the conversions. v2 (Jason): use the type of the source for the temporary and use brw_reg_type_from_bit_size for the conversion to 32-bit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Iago Toral Quiroga	b11e9425df	intel/compiler: lower some 16-bit integer operations to 32-bit These are not supported in hardware for 16-bit integers. We do the lowering pass after the optimization loop to ensure that we lower ALU operations injected by algebraic optimizations too. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Iago Toral Quiroga	b9a3d8c23e	compiler/nir: add a lowering pass to convert the bit size of ALU operations Not all bit-sizes may be supported natively in hardware for all operations. This pass allows drivers to lower such operations to a bit-size that is actually supported and then converts the result back to the original bit-size. Compiler backends control which operations and which bit-sizes require the lowering through a callback function. v2: generalize this pass and make it available in NIR core (Rob, Jason) v3: remove some temporaries and reduce nesting in instruction loop using a continue statement (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Jose Maria Casanova Crespo	f575277f7e	intel/compiler: support negate and abs of half float immediates Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Jose Maria Casanova Crespo	f0e6dacee5	intel/compiler: fix brw_imm_w for negative 16-bit integers 16-bit immediates need to replicate the 16-bit immediate value in both words of the 32-bit value. This needs to be careful to avoid sign-extension, which the previous implementation was not handling properly. For example, with the previous implementation, storing the value -3 would generate imm.d = 0xfffffffd due to signed integer sign extension, which is not correct. Instead, we should cast to uint16_t, which gives us the correct result: imm.ud = 0xfffdfffd. We only had a couple of cases hitting this path in the driver until now, one with value -1, which would work since all bits are one in this case, and another with value -2 in brw_clip_tri(), which would hit the aforementioned issue (this case only affects gen4 although we are not aware of whether this was causing an actual bug somewhere). v2: Make explicit uint32_t casting for left shift (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org>	2018-05-03 11:40:25 +02:00
Jose Maria Casanova Crespo	2a76f03c90	intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate From Intel Skylake PRM, vol 07, "Immediate" section (page 768): "For a word, unsigned word, or half-float immediate data, software must replicate the same 16-bit immediate value to both the lower word and the high word of the 32-bit immediate field in a GEN instruction." This fixes the int16/uint16 negate and abs immediates that weren't taking into account the replication in lower and upper words. v2: Integer cases are different to Float cases. (Jason Ekstrand) Included reference to PRM (Jose Maria Casanova) v3: Make explicit uint32_t casting for left shift (Jason Ekstrand) Split half float implementation. (Jason Ekstrand) Fix brw_abs_immediate (Jose Maria Casanova) Cc: "18.0 18.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Jose Maria Casanova Crespo	e5fc3c0717	intel/compiler: implement nir_instr_type_load_const for 16-bit constants Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Iago Toral Quiroga	939501c8ed	intel/compiler: implement conversions from 16-bit int/float to bool Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Iago Toral Quiroga	d5a419176f	intel/compiler: implement conversion between float/int 16-bit types Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Iago Toral Quiroga	96b5153790	i965/compiler: handle conversion to smaller type in the lowering pass for that The lowering pass was specialized to act on 64-bit to 32-bit conversions only, but the implementation is valid for other cases. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Iago Toral Quiroga	5361a87ee7	intel/compiler: fix isign for 16-bit integers We need to use 16-bit constants with 16-bit instructions, otherwise we get the following validation error: "Destination stride must be equal to the ratio of the sizes of the execution data type to the destination type" Because the execution data type is 4B due to the 32-bit integer constant. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 11:40:25 +02:00
Chris Wilson	b5e266765a	i965: Always try to create a logical context Always enable use of HW logical contexts to preserve GPU state between batches when the kernel supports such constructs, continuing to enforce the required support for gen6+. At runtime, this effectively removes the BRW_NEW_CONTEXT flag (and the upload of invariant state) from the start of every batch for any kernel supporting contexts. So long as the older atoms are correctly listening to the right flag (NEW_CONTEXT rather than NEW_BATCH) this should eliminate a few redundant state uploads for the older platforms. No piglits were harmed on ctg and ilk, both with and without logical contexts. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-03 01:39:33 -07:00
Neil Roberts	e17d0ccbbd	spirv: Apply OriginUpperLeft to FragCoord This behaviour was changed in `1e5b09f42f`. The commit message for that says it is just a “tidy up” so my assumption is that the behaviour change was a mistake. It’s a little hard to decipher looking at the diff, but the previous code before that patch was: if (builtin == SpvBuiltInFragCoord || builtin == SpvBuiltInSamplePosition) nir_var->data.origin_upper_left = b->origin_upper_left; if (builtin == SpvBuiltInFragCoord) nir_var->data.pixel_center_integer = b->pixel_center_integer; After the patch the code was: case SpvBuiltInSamplePosition: nir_var->data.origin_upper_left = b->origin_upper_left; /* fallthrough */ case SpvBuiltInFragCoord: nir_var->data.pixel_center_integer = b->pixel_center_integer; break; Before the patch origin_upper_left affected both builtins and pixel_center_integer only affected FragCoord. After the patch origin_upper_left only affects SamplePosition and pixel_center_integer affects both variables. This patch tries to restore the previous behaviour by changing the code to: case SpvBuiltInFragCoord: nir_var->data.pixel_center_integer = b->pixel_center_integer; /* fallthrough */ case SpvBuiltInSamplePosition: nir_var->data.origin_upper_left = b->origin_upper_left; break; This change will be important for ARB_gl_spirv which is meant to support OriginLowerLeft. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Fixes: `1e5b09f42f` "spirv: Tidy some repeated if checks..."	2018-05-03 10:08:42 +02:00
Samuel Iglesias Gonsálvez	b291a3a4a3	spirv: convert some operands for bitwise shift and bitwise ops to uint32 SPIR-V allows to define the shift, offset and count operands for shift and bitfield opcodes with a bit-size different than 32 bits, but in NIR the opcodes have that limitation. As agreed in the mailing list, this patch adds a conversion to 32 bits to fix this. For more info, see: https://lists.freedesktop.org/archives/mesa-dev/2018-April/193026.html v2: - src_bit_size will have zero value for variable bit-size operands (Jason). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-03 07:07:24 +02:00
Timothy Arceri	58c05ede96	mesa: enable geom shaders in OpenGL 3.2 Compat profile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-03 12:08:21 +10:00
Bas Nieuwenhuizen	ffa15861ef	radv: Use EnumerateInstanceVersion for the default version. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-02 21:57:08 +02:00
Bas Nieuwenhuizen	467c562a29	radv: Don't check the incoming apiVersion on CreateInstance. This fixes dEQP-VK.api.device_init.create_instance_invalid_api_version CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-02 21:57:08 +02:00
Bas Nieuwenhuizen	9267ff9883	radv: Allow vkEnumerateInstanceVersion ProcAddr without instance. Apparently somewhere between 1.1.70 and 1.1.73 the loader started depending on this. The loader then creates a 1.0 instance, which gets into a funny situation because we have a 1.1 device. No idea how to do line wrapping in Mako though, my random guesses did not work. CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-02 21:57:08 +02:00
Lionel Landwerlin	336decd67e	intel: aubinator: add an option to limit the number of decoded VBO lines Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 19:46:47 +01:00
Lionel Landwerlin	000452aebc	intel: decoder: limit the number of decoded lines from VBO By default we set no limit, but the debug batch decoder in i965 sets it to 100. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 19:46:47 +01:00
Jason Ekstrand	bd35345e85	anv: Advertise variableMultisampleRate Initially, I didn't understand this feature. Turns out that all it means is that you can switch multisample rates in the middle of a zero-attachment subpass. We've been able to do this since forever. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-05-02 10:59:03 -07:00
Rob Clark	28e410f6a5	nir: add missing dependency in meson.build nir_builder_opcodes.h also depends on nir_intrinsics.py for generating the system-value builders. Reported-by: Christoph Haag <haagch@frickel.club> Reported-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 13:57:51 -04:00
Matthew Nicholls	97d57ef917	radv: fix multisample image copies Previously before `fb077b0728`, the LOD parameter was being used in place of the sample index, which would only copy the first sample to all samples in the destination image. After that multisample image copies wouldn't copy anything from my observations. This fixes some copy_and_blit CTS tests. v3.1: - set lod to 0 for nir_txf_ms (Samuel) v2: - use GLSL_SAMPLER_DIM_MS instead of 2D (Samuel) - updated commit description (Samuel) Fix this properly by copying each sample in a separate radv_CmdDraw and using a pipeline with the correct rasterizationSamples for the destination image. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-02 19:32:00 +02:00
Kenneth Graunke	169d8e011a	intel: Fix 3DSTATE_CONSTANT buffer decoding. First, this was iterating over the 3DSTATE_CONSTANT_* instruction but trying to process fields of the 3DSTATE_CONSTANT_BODY substructure. Secondly, the fields have been called Buffer[0] and Read Length[0], for a while now, and we were not handling the subscripts correctly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-02 10:09:28 -07:00
Lionel Landwerlin	cf1d587879	intel: fix aubinator include Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `7c22c150c4` ("intel: Move batch decoder/disassembler from tools/ to common/") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 17:54:29 +01:00
Kenneth Graunke	0ab423388c	i965: Reuse batch decoder infrastructure rather than open coding it. With the new callback, Jason's newer batch decoder infrastructure should be able to do just as well as the old open coded INTEL_DEBUG=bat handling, with much less code. If there are any limitations, we'd like to improve the common code rather than doing one-off hacks here. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-02 09:27:56 -07:00
Kenneth Graunke	bf91b81a0b	intel: Give the batch decoder a callback to ask about state size. Given an arbitrary batch, we don't always know what the size of certain things are, such as how many entries are in a binding table. But it's easy for the driver to track that information, so with a simple callback we can calculate this correctly for INTEL_DEBUG=bat. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-02 09:27:56 -07:00
Kenneth Graunke	7c22c150c4	intel: Move batch decoder/disassembler from tools/ to common/ Making these part of libintel_common allows us to use them in the DRI driver. The standalone tool binaries already link against the common library, too, so it's no harder for them. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-02 09:27:56 -07:00
Kenneth Graunke	5c04971831	i965: Allocate shadow batches to explicitly be the BO size. This unfortunately makes it malloc/realloc on every new batch, rather than once at startup. But it ensures that the shadow buffer's size will absolutely match the BO size. Otherwise, as we tune BATCH_SZ/STATE_SZ or bufmgr cache bucket sizes, we may get a BO size that's rounded up, and fail to allocate the shadow buffer large enough. This doesn't fix any bugs today, as BATCH_SZ/STATE_SZ are the size of a cache bucket, but it's better to be safe than sorry. Reported-by: James Xiong <james.xiong@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-02 09:26:55 -07:00
Lionel Landwerlin	ec5df73803	intel: batch-decoder: iterate VERTEX_BUFFER_STATE fields The gen_field_iterator only iterates the fields of a given gen_group. If we want to iterate the fields of another gen_group contained as field, we need to do it manually. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 17:11:28 +01:00
Lionel Landwerlin	acbce2ac57	intel: decoder: fix starting dword of struct fields Struct fields might span several dwords, but iter_dword is incremented up to the last dword of the current field before we print out the struct's fields. We can't use iter_dword for computing the offset into the pointer of data to decode. v2: Fix displayed offset number (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 17:11:28 +01:00
Lionel Landwerlin	467430ddcc	intel: decoder: document when fields should be used Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 17:10:37 +01:00
Lionel Landwerlin	4f128f7850	intel: decoder: identify groups with fixed length <register> & <struct> elements always have fixed length. The get_length() method implies that we're dealing with an instruction in which the length is encoded into the variable data but the field iterator uses it without checking what kind of gen_group it is dealing with. Let's make get_length() report the correct length regardless of the gen_group (register, struct or instruction). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 17:10:37 +01:00
Lionel Landwerlin	3c416a50d8	intel: decoder: make the field iterator use more natural while (iter_next()) { ... } instead of do { ... } while (iter_next()); Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-02 17:10:37 +01:00
Vlad Golovkin	967aabca06	nv50: Extract needed value bits without shifting them before calling bitcount This can save one instruction since bitcount doesn't care about specific bits' positions. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-05-02 15:12:48 +02:00
Antia Puentes	3a1df14a7b	intel: activate the gl_BaseVertex lowering Surplus code related to the basevertex is removed. The Vertex Elements contain now: * VE 1: <firstvertex, BaseInstance, VertexID, InstanceID> * VE 2: <DrawID, is_indexed_draw, 0, 0> Also fixes unreachable message. Fixes OpenGL CTS tests: * KHR-GL46.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters * KHR-GL46.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters * KHR-GL46.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters * KHR-GL46.shader_draw_parameters_tests.ShaderDrawArraysParameters * KHR-GL46.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters Fixes Piglit tests: * arb_shader_draw_parameters-drawid-indirect baseinstance * arb_shader_draw_parameters-basevertex Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678	2018-05-02 11:24:46 +02:00
Antia Puentes	0fb204fac1	compiler/nir: Add conditional lowering for gl_BaseVertex Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-02 11:24:31 +02:00
Antia Puentes	0cbf29fa55	intel: emit is_indexed_draw in the same VE as gl_DrawID The Vertex Elements are now: * VE 1: <BaseVertex/firstvertex, BaseInstance, VertexID, InstanceID> * VE 2: <DrawID, is-indexed-draw, 0, 0> VE1 is kept as it was before, VE2 additionally contains the new system value. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-02 11:23:34 +02:00
Antia Puentes	6ba9088d9c	intel/compiler: Add uses_is_indexed_draw flag Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-02 11:20:48 +02:00
Antia Puentes	9e6b886cf2	compiler: Add SYSTEM_VALUE_IS_INDEXED_DRAW and intrinsics This VS system value indicates whether the draw command used to start the rendering was an indexed draw command or a non-indexed one (~0/0 respectively). Useful to calculate the gl_BaseVertex as: (SYSTEM_VALUE_IS_INDEXED_DRAW & SYSTEM_VALUE_FIRST_VERTEX). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-05-02 11:20:40 +02:00
Samuel Pitoiset	0737c1e3a6	radv: enable out-of-order rasterization by default As the implementation is conservative, we can now enable it by default. It can be disabled with RADV_DEBUG=nooutoforder. Don't expect much more than 1% of improvements, but the gain seems consistent. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-02 10:33:24 +02:00
Samuel Pitoiset	1d766b0196	radv: only disable out-of-order rast for perfect occlusion queries Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-02 10:33:22 +02:00
Kenneth Graunke	1122fb2d98	i965: Drop unused gen5 sampler default color struct. Trivial.	2018-05-01 23:09:25 -07:00
Kenneth Graunke	9f6082f6c7	i965: Make brw_vs_outputs_written static. Drop a prototype. Trivial.	2018-05-01 23:09:16 -07:00
Nanley Chery	3e56e4642f	i965/tex_image: Avoid the ASTC LDR workaround on gen9lp Both the internal documentation and the results of testing this in the CI suggest that this is unnecessary. Add the fixes tag because this reduces an internal benchmark's startup time by about 17 seconds (reported by Eero). Fixes: `710b1d2e66` "i965/tex_image: Flush certain subnormal ASTC channel values" Tested-by: Eero Tamminen <eero.t.tamminen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-01 16:47:39 -07:00
Eric Anholt	800be7f277	freedreno: Fix ir3_cmdline.c build. Fixes: `6487e7a30c` ("nir: move GL specific passes to src/compiler/glsl") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-05-01 16:38:37 -07:00
Jason Ekstrand	d216ffc604	anv: Allow lookup of vkEnumerateInstanceVersion without an instance Fixes: `cbab2d1da5` Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-01 14:45:51 -07:00
Jason Ekstrand	d5a0787f03	anv: Don't advertise Float64 or Int64 on HW without 64-bit types Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-05-01 14:45:50 -07:00
Samuel Pitoiset	d8db5986ce	radv: compute the number of subpass attachments correctly Only count color attachments twice if resolves are used, also account for the depth stencil attachment if present. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-01 22:18:03 +02:00
Dave Airlie	e66f64c285	radv: set fmask_surf_index on fmask surfaces. This is needed for gfx9 and later for all fmask surface index. (Mentioned by Marek on irc) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-02 06:01:42 +10:00
Brian Paul	f298ed93d9	gallium/i915: fix PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE typo Fixes: `fffe5e2d14` ("gallium: add initial support for conservative rasterization") Trivial.	2018-05-01 09:52:22 -06:00
Rhys Perry	07dac3e040	nvc0: add conservative rasterization support Subpixel precision bias, dilation and the post-snap mode are supported on GM200 and newer. The pre-snap mode is supported for triangle primitives on GP100. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-30 21:13:53 -06:00
Rhys Perry	97f5f399ef	st/mesa: add support for nvidia conservative rasterization extensions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-30 21:13:53 -06:00
Rhys Perry	fffe5e2d14	gallium: add initial support for conservative rasterization Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-30 21:13:53 -06:00
Rhys Perry	4580617509	mesa: add support for nvidia conservative rasterization extensions Although the specs are written against compatibility GL 4.3 and allow core profile and GLES2+, the extensions are exposed for GL 1.0+, GLES1 and GLES2+. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-30 21:13:53 -06:00
Brian Paul	31ab0427a7	glsl/tests: add GLSL_TYPE_UINT8, GLSL_TYPE_INT8 cases to switch statements To silence warnings about unhandled switch values. Untested otherwise. v2: move the INT/UINT8 cases after the INT/UINT16 cases, per Eric. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-30 21:13:53 -06:00
Brian Paul	efec712d51	tgsi: use enums instead of unsigned in ureg code Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-04-30 21:13:53 -06:00
Timothy Arceri	6487e7a30c	nir: move GL specific passes to src/compiler/glsl With this we should have no passes in src/compiler/nir with any dependencies on headers from core GL Mesa. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-05-01 12:39:33 +10:00
Andres Rodriguez	f56e22e496	radv/winsys: fix leaking resources from bo's imported by fd A bo's ref_count was not being initialized when imported from an fd. Therefore, we would fail to free the resource during VkFreeMemory(). This patch fixes applications like hifi VR in threaded mode, which perform frequent imports/releases of IPC shared memory. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> CC: 18.0 18.1 <mesa-stable@lists.freedesktop.org>	2018-04-30 18:20:30 -04:00
Scott D Phillips	2a08ae3c7c	i965/tiled_memcpy: ytiled_to_linear a cache line at a time Similar to the transformation applied to linear_to_ytiled, also align each readback from the ytiled source to a cacheline (i.e. transfer a whole cacheline from the source before moving on to the next column). This will allow us to utilize movntdqa (_mm_stream_load_si128) in a subsequent patch to obtain near WB readback performance when accessing the uncached ytiled memory, an order of magnitude improvement. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-30 15:18:36 -07:00
Chris Wilson	682bdaa658	i965: Record mipmap resolver for unmapping When mapping a region of the mipmap_tree, record which complementary method to use to unmap it afterwards. By doing so we can avoid duplicating the decision tree used when mapping and thereby eliminate trivial errors that can be introduced if the two if-chains become out of sync. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 14:06:23 -07:00
Chris Wilson	5367295e1a	i965: Move unmap_depthstencil before map_depthstencil Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 14:06:23 -07:00
Chris Wilson	ab2825c898	i965: Move unmap_etc before map_etc Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 14:06:23 -07:00
Chris Wilson	9e7e88049f	i965: Move unmap_s8 before map_s8 Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 14:06:23 -07:00
Chris Wilson	b3ad6f5ca6	i965: Move unmap_movntdqa before map_movntdqa Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 14:06:23 -07:00
Chris Wilson	f348d07a62	i965: Move unmap_blit before map_blit Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 14:06:23 -07:00
Chris Wilson	359624142d	i965: Move unmap_gtt before map_gtt Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 14:06:23 -07:00
Dave Airlie	8d3529872c	ac/nir: expand 64-bit vec3 loads to fix shuffling. If loading 64-bit vec3 values, a 4 component load would be followed by a 2 component load and the resulting shuffle would fail as it requires two 4-component vectors. This just expands the second results vector out to 4 components. This fixes 100 CTS tests: dEQP-VK.spirv_assembly.type.vec3.64 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-01 05:58:14 +10:00
Kenneth Graunke	bde12f75e1	i965: Don't stomp initial kflags for program cache. We want to flag EXEC_OBJECT_CAPTURE, but we ought to preserve any existing kflags. Today, there are none (as the program cache doesn't support 48-bit addressing), but once we start using softpin, we'll need to preserve EXEC_OBJECT_PINNED. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-04-30 11:34:19 -07:00
Kenneth Graunke	0cc98522f9	i965: Let batchbuffers be placed anywhere in the 48-bit address space. We were trying to mark batch buffers with EXEC_OBJECT_CAPTURE, and accidentally stomped EXEC_OBJECT_SUPPORTS_48B_ADDRESS in the process. There's no reason to restrict batch buffers to the lower 4GB. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-04-30 11:34:19 -07:00
Scott D Phillips	8ffc6ee251	intel: fix check for 48b ppgtt support The previous logic of the supports_48b_addresses wasn't actually checking if i915.ko was running with full_48bit_ppgtt. The ENOENT it was checking for was actually coming from the invalid context id provided in the test execbuffer. There is no path in the kernel driver where the presence of EXEC_OBJECT_SUPPORTS_48B_ADDRESS leads to an error. Instead, check the default context's GTT_SIZE param for a value greater than 4 GiB v2 (Ken): Fix in i965 as well. v3 Check GTT_SIZE instead of HAS_ALIASING_PPGTT (Chris Wilson) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-30 11:34:19 -07:00
Leo Liu	1c5f4f4e17	st/omx/enc: fix blit setup for YUV LoadImage The blit here involves scaling since it's copying from I8 format to R8G8 format. Half of the source will be filtered out with PIPE_TEX_FILTER_NEAREST, and it looks like the GPU always uses the second half as the source. Currently we use "1" as the start point of x for R, which shifts the U component right by 1 source pixel. So "-1" should be the start point for the U component. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-30 11:55:36 -04:00
Juan A. Suarez Romero	4d449c94e4	autotools, meson: bump up required VA version Because of the new VP9 config we use, VA API 0.39 is required. Fixes: `413c5ca372` ("travis: update libva required version") CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-30 13:59:37 +02:00
Juan A. Suarez Romero	96ed3714fc	docs: update calendar, add news and link release notes to 18.0.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-04-28 17:01:48 +00:00
Juan A. Suarez Romero	8f1159bf9a	docs: add sha256 checksums for 18.0.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `b3eed3ad03`)	2018-04-28 16:58:39 +00:00
Juan A. Suarez Romero	14f85260de	docs: add release notes for 18.0.2 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `d38da7bd2d`)	2018-04-28 16:58:36 +00:00
Marek Olšák	8b7358fe43	radeonsi: increase the number of compiler threads depending on the CPU The compiler queue was limited to 3 threads, so shader-db running on a 16-thread CPU would have a bottleneck on the 3-thread queue. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	3f0eaaf6d9	radeonsi: avoid a crash in gallivm_dispose_target_library_info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	e75fc8d033	radeonsi: move data_layout into si_compiler Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	797d673c9a	radeonsi: move passmgr into si_compiler Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	c1823ff661	radeonsi: move target_library_info into si_compiler Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	5a94f15aa7	radeonsi: use si_compiler::triple in si_llvm_optimize_module Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	43f0a10051	radeonsi: add triple into si_compiler Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	87eb597758	radeonsi: add struct si_compiler containing LLVMTargetMachineRef It will contain more variables. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	788d66553a	radeonsi: rename r600_texture::resource to buffer r600_resource could be renamed to si_buffer. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	6fadfc01c6	radeonsi: use r600_resource() typecast helper Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	3160ee876a	radeonsi: remove unused atom parameter from si_atom::emit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	de344209ad	radeonsi: inline 2 trivial state structures Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	e395475096	radeonsi: remove function si_init_atom Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	ccebcba893	radeonsi: remove si_atom::id Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	639b673fc3	radeonsi: don't use an indirect table for state atoms Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	9054799b39	radeonsi: rename r600_atom -> si_atom Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	a8abbbb172	radeonsi: remove r600_pipe_common.h Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	6d19120da8	radeonsi/gfx9: workaround for INTERP with indirect indexing and clean up the conditions. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>	2018-04-27 17:56:04 -04:00
Marek Olšák	2d69b485f5	radeonsi: rewrite DCC format compatibility checking code It might be better to use a slow compressed clear when clearing to 1. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	c732d069b3	radeonsi: implement DCC fast clear swizzle constraints more accurately Reduce swizzle constraints to the ALPHA_IS_ON_MSB constraint and the clear value of 1. This significantly changes the DCC fast clear code, and fixes fast clear for RGB formats without alpha. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	9ef423f720	radeonsi: rename variables and document stuff around DCC fast clear Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	1cc2e0cc6b	radeonsi: fully enable 2x DCC MSAA for array and non-array textures The clear code is exactly the same as for 1 sample buffers - just clear the whole thing. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	ca33d961a4	radeonsi: enable fast color clear for level 0 of mipmapped textures on <= VI GFX9 is more complicated and needs a compute shader that we should just copy from amdvlk. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	174e11c3f5	ac/surface: handle DCC subresource fast clear restriction on VI v2: require the previous level to be clearable for determining whether the last unaligned level is clearable Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
George Kyriazis	838f15650e	swr/rast: No need to export GetSimdValidIndicesGfx Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	7caeee3432	swr/rast: Small editorial changes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	f276517ebf	swr/rast: Use new processor detection mechanism Use specific avx512 selection mechanism based on avx512er bit instead of getHostCPUName(). LLVM 6.0.0 has a bug that reports wrong string for KNL (fixed in 6.0.1). Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	8ace547e8d	swr/rast: Output rasterizer dir to console since it's process specific Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	c328c5d0f4	swr/rast: Add TranslateGfxAddress for shader Also add GFX_MEM_CLIENT_SHADER Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	edc41f73b8	swr/rast: jit PRINT improvements. Sign-extend integer types to 32bit when specifying "%d" and add new %u which zero-extends to 32bit. Improves printing of sub 32bit integer types (i1 specifically). Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	5d403178e6	swr/rast: Fix regressions. Bump jit cache revision number to force recompile. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	577af2bed4	swr/rast: Cleanup old cruft. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	aeab9db50a	swr/rast: Package events.proto with core output However only if the file exists in DEBUG_OUTPUT_DIR. The expectation is that AR rasterizerLauncher will start placing it there when launching a workload (which is in a subsequent checkin) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	b97bb0ea6d	swr/rast: Fix init in EventHandlerWorkerStats Make sure we initialize variables. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	9a72d4c03e	swr/rast: Fix return type of VCVTPS2PH. expecting <8xi16> return. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	3f008c5505	swr/rast: WIP Translation handling Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	7986519d50	swr/rast: Use different handling for stream masks Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	6b1c852ebc	swr/rast: Silence warnings Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	e6daa62a48	swr/rast: Add support for TexelMask evaluation Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	cec1b52cac	swr/rast: Internal core change Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	7b343a215e	swr/rast: Fix x86 lowering 64-bit float handling - 64-bit cvt-to-float needs to be explicitly handled - gathers need the right parameter types to work with doubles Fixes draw-vertices piglit tests Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	fa4ab7910e	swr/rast: Add some SIMD_T utility functors VecEqual and VecHash Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	18c9cb85d1	swr/rast: Fix wrong type allocation ALLOCA pointer elements, not pointers. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	1cdbce8805	swr: touch generated files to update timestamp previous change in generators necessitates this change Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
George Kyriazis	9ceeb671a3	swr/rast: Fix byte offset for non-indexed draws for the case when USE_SIMD16_SHADERS == FALSE Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-27 14:36:41 -05:00
Marek Olšák	7083ac7290	util/u_queue: fix a deadlock in util_queue_finish Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 13:28:17 -04:00
Dylan Baker	7772de5283	meson: fix race condition revealed by using 0.44 Previously there was a special target that blocked for the generation of anv_entrypoints.h, with meson 0.44 we don't need this, we can use a new language feature instead. The problem is that previously that blocking target would hide a race condition for the generation of another header, anv_extensions.h. Now the build sometimes fails when anv_extensions.h is not generated in time. v2: - clarify the race condition in the commit message (Emil) CC: Mark Janes <mark.a.janes@intel.com> Fixes: `92550d9b16` ("meson: remove workaround for custom target creating .h and .c files") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-27 10:24:51 -07:00
Dylan Baker	0c23bd76d1	bin: force git show to use default pretty setting I have pretty default to short, which breaks this script. v2: - Fix both places that don't define a --pretty (Emil) cc: Juan A. Suarez <jasuarez@igalia.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Andres Gomez <agomez@igalia.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-27 10:19:55 -07:00
Tapani Pälli	b3ad4b6971	mesa: add TBO support for GL_EXT_texture_norm16 Earlier plumbing missed interaction with texture buffer objects. Fixes: `7f467d4f73` "mesa: GL_EXT_texture_norm16 extension plumbing" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-27 14:34:43 +03:00
Samuel Pitoiset	d38425ce87	ac: fix texture query LOD for 1D textures on GFX9 1D textures are allocated as 2D which means we only need one coordinate for texture query LOD. Fixes: `625dcbbc45` ("amd/common: pass address components individually to ac_build_image_intrinsic") Cc: 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 11:15:35 +02:00
Christian Gmeiner	3e69127939	etnaviv: remove not needed includes Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>	2018-04-27 09:04:56 +02:00
Christian Gmeiner	2ba587aac7	etnaviv: remove redundant include Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>	2018-04-27 09:04:53 +02:00
Timothy Arceri	79b0556f29	glsl: replace some asserts with unreachable when processing the ast Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-04-27 10:18:47 +10:00
Timothy Arceri	410f901bee	mesa: drop the buffer mode param from the DrawBuffer driver function No drivers used it. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-27 10:09:10 +10:00
Anuj Phogat	b695a7bd8e	anv/icl: Enable Vulkan on Ice Lake This patch enables the Vulkan driver on Ice Lake h/w with added warning about preliminary support. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-04-26 16:31:27 -07:00
Caio Marcelo de Oliveira Filho	c9bdc7f7e2	anv: enable VK_EXT_shader_viewport_index_layer Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-26 15:32:05 -07:00
Jason Ekstrand	3db93f9128	anv/allocator: Don't shrink either end of the block pool Previously, we only tried to ensure that we didn't shrink either end below what was already handed out. However, due to the way we handle relocations with block pools, we can't shrink the back end at all. It's probably best to not shrink in either direction. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105374 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106147 Tested-by: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Cc: mesa-stable@lists.freedesktop.org	2018-04-26 13:17:14 -07:00
Eric Anholt	76ee9edcb4	broadcom/vc5: Add support for centroid varyings. It would be nice to share the flags packet emit logic with flat shade flags, but I couldn't come up with a good way while still using our pack macros. We need to refactor this to shader record setup at compile time, anyway. Fixes ext_framebuffer_multisample-interpolation * centroid-*	2018-04-26 11:30:22 -07:00
Eric Anholt	e2f3317801	broadcom/vc5: Add an assert about GFXH-1559. Our TF outputs always start at 6 or 7 currently, so we don't hit the broken 8 case. Let's make sure that doesn't change somehow.	2018-04-26 11:30:22 -07:00
Eric Anholt	77b4f30bae	broadcom/vc5: Add validation that we don't violate GFXH-1633 requirements. We don't use ldunifa yet, but we will eventually for UBOs.	2018-04-26 11:30:22 -07:00
Eric Anholt	089c32eefd	broadcom/vc5: Add validation that we don't violate GFXH-1625 requirements. We don't use TMUWT yet, but we will once we do SSBOs.	2018-04-26 11:30:22 -07:00
Eric Anholt	57ceb95c84	broadcom/vc5: Implement GFXH-1742 workaround (emit 2 dummy stores on 4.x). This should help with intermittent GPU hangs in tests switching formats while rendering small frames. Unfortunately, it didn't help with the tests I'm having troubles with.	2018-04-26 11:30:22 -07:00
Eric Anholt	dc4cb04ee5	broadcom/vc5: Add QPU validation for register writes after thrend. The next shader gets to start writing the register file during these slots, so make sure we don't stomp over them. The only case of hitting this that I could imagine would be dead writes.	2018-04-26 11:30:22 -07:00
Eric Anholt	8adf813f83	st: Choose a 2101010 format for GL_RGB/GL_RGBA with a 2_10_10_10 type. GLES's GL_EXT_texture_type_2_10_10_10_REV allows uploading this type to an unsized internalformat, and it should be non-color-renderable. fbobject.c's implementation of the check for color-renderable checks that the texture has a 2101010 mesa format, so make sure that we have chosen a 2101010 format so that check can do what it was meant to. Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgb on vc5. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-26 11:30:22 -07:00
Charmaine Lee	8aef7fccb7	st/mesa: fix missing setting of _ElementSize in new_draw_rasterpos_stage With this patch, _ElementSize is initialized along with the rest of the vertex array attributes in new_draw_rasterpos_stage(). This fixes a crash in st_pipe_vertex_format() when running topogun-1.06-orc-84k-resize trace file with VMware svga driver. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-26 10:29:02 -07:00
Drew Davenport	e923e8151d	st/va: Fix typos s/attibute/attribute/ s/suface/surface/ v2: rebased(Leo) Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-04-26 11:16:05 -04:00
Drew Davenport	893808006a	st/va: Fix potential buffer overread VASurfaceAttribExternalBuffers.pitches is indexed by plane. Current implementation only supports single plane layout. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-04-26 11:16:05 -04:00
Boyuan Zhang	deba56accf	radeon/vcn: fix mpeg4 msg buffer settings The previous bit-field assignments are incorrect and cause certain mpeg4 decodes to fail due to wrong flag values. This patch fixes these assignments. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2018-04-26 11:16:05 -04:00
Ian Romanick	bf5e0276b6	radeon: Drop broken front_buffer_reading/drawing optimization Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-26 09:38:51 -04:00
Ian Romanick	0b3231966f	radeon: Use _mesa_is_front_buffer_drawing Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-26 09:38:51 -04:00
Samuel Pitoiset	d7ffe3b384	radv: set ac_surf_info::num_channels correctly num_channels was introduced by "ac/surface: don't set the display flag for obviously unsupported cases". Based on RadeonSI. Fixes: `e29facff31` ("ac/surface: don't set the display flag for obviously unsupported cases (v2)") Cc: 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-26 15:34:14 +02:00
Samuel Pitoiset	a6fbefa67b	radv: fix DCC enablement since partial MSAA implementation dcc_msaa_allowed is always false on GFX9+ and only true on VI if RADV_PERFTEST=dccmsaa is set. This means DCC was disabled in some situations where it should not. This is likely going to fix a performance regression. Fixes: `2f63b3dd09` ("radv: enable DCC for MSAA 2x textures on VI under an option") Cc: 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-26 15:34:11 +02:00
Karol Herbst	227b1af866	nir/opt_constant_folding: fix folding of 8 and 16 bit ints Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-26 11:16:15 +02:00
Karol Herbst	14943add44	nir: print 8 and 16 bit constants correctly Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-26 11:16:15 +02:00
Karol Herbst	543a8c66a7	nir: support converting to 8-bit integers in nir_type_conversion_op Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-26 11:16:15 +02:00
Neil Roberts	c4ab1bdcc9	spirv: Don’t check for NaN for most OpFOrd* comparisons For all of the OpFOrd* comparisons except OpFOrdNotEqual the hardware should probably already return false if one of the operands is NaN so we don’t need to have an explicit check for it. This seems to at least work on Intel hardware. This should reduce the number of instructions generated for the most common comparisons. For what it’s worth, the original code to handle this was added in `e062eb6415`. The commit message for that says that it was to fix some CTS tests for OpFUnord* opcodes. Even if the hardware doesn’t handle NaNs this patch shouldn’t affect those tests. At any rate they have since been moved out of the mustpass list. Incidentally those tests fail on the nvidia proprietary driver so it doesn’t seem like handling NaNs correctly is a priority. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-26 10:08:14 +02:00
Matt Atwood	3ba5a646e5	Intel: Add a Kaby Lake PCI ID v2: Branding changed Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-04-25 13:31:55 -07:00
Eric Anholt	069c409f43	gallium/util: Fix incorrect refcounting of separate stencil. The driver may have a reference on the separate stencil buffer for some reason (like an unflushed job using it), so we can't directly free the resource and should instead just decrement the refcount that we own. Fixes double-free in KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8 on vc5. Fixes: `e94eb5e600` ("gallium/util: add u_transfer_helper") Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-04-25 12:14:33 -07:00
Eric Anholt	0d4ce00d70	broadcom/vc5: Fix reloads of separate stencil buffers. Like for stores, we need to emit a separate load_general packet.	2018-04-25 09:21:54 -07:00
Eric Anholt	9f3f4284c0	broadcom/vc5: Fix cpp of MSAA surfaces on 4.x. The internal-type-bpp path is for surfaces that get stored in the raw TLB format. For 4.x, we're storing MSAA as just 2x width/height at the original format.	2018-04-25 09:21:54 -07:00
Eric Anholt	ac207acb97	broadcom/vc5: Implement stencil blits using RGBA. Fixes piglit fbo-depthstencil blit default_fb	2018-04-25 09:21:54 -07:00
Eric Anholt	503716fa86	broadcom/vc5: Remove leftover vc4 MSAA lowering setup in the FS key.	2018-04-25 09:21:54 -07:00
Eric Anholt	5710532e9e	broadcom/vc5: Fix tile load/store of MSAA surfaces on 4.x. For single-sample we have to always program SAMPLE_0, but for multisample we want to store all the samples.	2018-04-25 09:21:54 -07:00
Juan A. Suarez Romero	413c5ca372	travis: update libva required version Commit `fa328456e8` added VP9 config support, but this needs a newer libva version, 1.7.0 or above. Fixes: `fa328456e8` ("st/va: add VP9 config to enable profile2") CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-25 16:09:20 +02:00
Tapani Pälli	7f467d4f73	mesa: GL_EXT_texture_norm16 extension plumbing Patch enables use of short and unsigned short data for texture uploads, rendering and reading of framebuffers within the restrictions specified in GL_EXT_texture_norm16 spec. Patch also enables those 16bit format layout qualifiers listed in GL_NV_image_formats that depend on EXT_texture_norm16. v2: expose extension with dummy_true fix layout qualifier map changes (Ilia Mirkin) v3: use _mesa_has_EXT_texture_norm16, other fixes and cleanup (Ilia Mirkin) v4: fix rest of the issues found Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-25 14:26:20 +03:00
Jordan Justen	b0c5774027	meson: Fix with_intel_vk and with_amd_vk variables Fixes: `5608d0a2ce` "meson: use array type options" Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-04-24 23:12:42 -07:00
Roland Scheidegger	77554d220d	draw: fix different sign logic when clipping The logic was flawed, since mul(x,y) will be <= 0 (exactly 0) when the sign is the same but both numbers are sufficiently small (if the product is smaller than 2^-128). This could apparently lead to emitting a sufficient amount of additional bogus vertices to overflow the allocated array for them, hitting an assertion (still safe with release builds since we just aborted clipping after the assertion in this case - I'm however unsure if this is now really no longer possible, so that code stays). Not sure if the additional vertices could cause other grief, I didn't see anything wrong even when hitting the assertion. Essentially, both +-0 are treated as positive (the vertex is considered to be inside the clip volume for this plane), so integrate the logic determining different sign into the branch there. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-04-25 04:50:20 +02:00
Roland Scheidegger	98578df27b	draw: simplify clip null tri logic Simplifies the logic when to emit null tris (albeit the reasons why we have to do this remain unclear). This is strictly just logic simplification, the behavior doesn't change at all. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-04-25 04:50:20 +02:00
Ilia Mirkin	c17ddcb4b4	nvc0/ir: all short immediates are sign-extended, adjust LIMM test Some analysis suggests that all short immediates are sign-extended. The insnCanLoad logic already accounted for this, but we could still pick the wrong form when emitting actual instructions that support both short and long immediates (with the long form usually having additional restrictions that insnCanLoad should be aware of). This also reverses a bunch of commits that had previously "worked around" this issue in various emitters: `9c63224540`: gm107/ir: make use of ADD32I for all immediates `83a4f28dc2`: gm107/ir: make use of LOP32I for all immediates `b84c97587b`: gm107/ir: make use of IMUL32I for all immediates `d30768025a`: gk110/ir: make use of IMUL32I for all immediates as well as the original import for UMUL in the nvc0 emitter. Reported-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-04-24 21:37:44 -04:00
Boyan Ding	6695f9d5c5	mesa: call DrawBufferAllocate driver hook in update_framebuffer for window-system FB When draw buffers are changed on a bound framebuffer, DrawBufferAllocate() hook should be called. However, it is missing in update_framebuffer with window-system framebuffer, in which FB's draw buffer state should match context state, potentially resulting in a change. Note: This is needed because gallium delays creating the front buffer, i965 works fine without this change. V2 (Timothy Arceri): - Rebased on merged/simplified DrawBuffer driver function - Move DrawBuffer call outside fb->ColorDrawBuffer[0] != ctx->Color.DrawBuffer[0] check to make piglit pass. v3 (Timothy Arceri): - Call new DrawBufferAllocate() driver function. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (v2) Reviewed-by: Brian Paul <brianp@vmware.com> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99116	2018-04-25 09:08:26 +10:00
Timothy Arceri	6ca09f3a60	st/mesa: add new driver function DrawBufferAllocate Unlike some of the classic drivers the st was only using DrawBuffer() to allocate some buffers on-demand. Creating a separate function will allow us to call it from update_framebuffer() in the following patch without regressing some of the older classic drivers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-25 09:08:26 +10:00
Timothy Arceri	2554b8cb00	mesa: some C99 tidy ups for framebuffer.c Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-04-25 09:08:26 +10:00
Dylan Baker	1d01b52d76	meson: Fix no-rtti in llvm detection Because I clearly wasn't thinking and clearly didn't do a good job testing. Sigh Fixes: `c5a97d658e` ("meson: fix builds against LLVM built without rtti") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-24 15:26:51 -07:00
Dylan Baker	be0a2cfc65	meson: use new warning function Instead of emulating it with message. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	5608d0a2ce	meson: use array type options This option type is nice since it involves less converting strings into lists, and because it validates the values that are provided. v2: - Set with_any_vk to true if any vulkan driver is built (Eric) Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	c5a97d658e	meson: fix builds against LLVM built without rtti Building without rtti is fraught with peril, but it's something that autotools supports so we need to support it too. Since we've moved to version 0.44 as a whole we can use the meson functionality for accessing random llvm-config options we can check for rtti and add -fno-rtti to all C++ code accordingly. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-24 14:08:15 -07:00
Dylan Baker	595021bf1a	meson: remove dummy_cpp meson has gotten pretty smart about tracking C and C++ dependencies (internal and external), and using the right linker. This wasn't always the case and we created empty c++ files to force the use of the c++ linker. We don't need that any more. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	db90c8627c	meson: allow empty sources when using link_whole meson used to get grumpy if the sources list was empty, even when using --whole-archive (link_whole). In more recent versions that's not true, so remove the workaround. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	92550d9b16	meson: remove workaround for custom target creating .h and .c files In more modern versions of meson a custom_target returns an index-able object. This allows us to create accurate dependency models for targets that rely only on the header and not on the code from anv_entrypoints. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	5a670d08c0	meson: raise required version to 0.44.1 We have already required 0.44 for building clover and swr, so it was already partially required. This just makes it required across the board instead of just for clover and swr. There is a bug in 0.44 which makes it impossible to build mesa in some configurations, so require 0.44.1 which fixes this. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	1546f76a39	meson: fix graw-xlib after auxiliary consolidation This one's completely my fault, I didn't do good enough testing after rebasing and this got missed. Fixes: `d28c246501` ("meson: build graw tests") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	c73abb4f82	meson: only build mesa_st tests when build-tests is true Since we have an option to turn test building on and off, we should honor that. Fixes: `34cb4d0ebc` ("meson: build tests for gallium mesa state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Dylan Baker	aaab624245	meson: don't build classic mesa tests without dri_drivers Since mesa_classic is build-on-demand the tests will create a demand and add a bunch of extra compilation. Fixes: `43a6e84927` ("meson: build mesa test.") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-24 14:08:15 -07:00
Nanley Chery	0e8b16e0a2	i965/meta_util: Re-enable sRGB-encoded fast-clears on CNL The paths which sample with the clear color are now using a getter which performs the sRGB decode needed to enable this fast clear. This path can be exercised by fast-clearing a texture, then performing an operation which requires sRGB decoding. Test coverage for this feature is provided with the following tests: * Shader texture calls: - spec@ext_texture_srgb@tex-srgb * Shader texelfetch calls: - spec@arb_framebuffer_srgb@fbo-fast-clear - spec@arb_framebuffer_srgb@msaa-fast-clear * Blending: - spec@arb_framebuffer_srgb@arb_framebuffer_srgb-fast-clear-blend * Blitting: - spec@arb_framebuffer_srgb@blit texture srgb msaa enabled clear Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 13:41:14 -07:00
Nanley Chery	129ad66dd5	i965/miptree: Extend the sRGB-blending WA to future platforms The blending issue seems to be present on CNL as well. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 13:41:14 -07:00
Nanley Chery	7ea013c6d3	i965: Add and use a getter for the clear color It returns both the inline clear color and a clear address which points to the indirect clear color buffer (or NULL if unused/non-existent). This getter allows CNL to sample from fast-cleared sRGB textures correctly by doing the needed sRGB-decode on the clear color (inline) and making the indirect clear color buffer unused. v2 (Rafael): * Have a more detailed commit message. * Add a comment on the sRGB conversion process. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 13:41:14 -07:00
Jason Ekstrand	b55077a8bc	util/srgb: Add a float sRGB -> linear helper Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 13:41:14 -07:00
Nanley Chery	cd5ce363e3	i965/wm_surface_state: Use the clear address if clear_bo is non-NULL We want to add and use a getter that turns off the indirect path by returning zero for the clear color bo and offset. v2: Fix usage of "clear address" in commit message (Jason). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 13:41:14 -07:00
Nanley Chery	af4e9295fe	i965: Add and use a single miptree aux_buf field We want to add and use a function that accesses the auxiliary buffer's clear_color_bo and doesn't care if it has an MCS or HiZ buffer specifically. v2 (Jason Ekstrand): * Drop intel_miptree_get_aux_buffer(). * Mention CCS in the aux_buf field. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 13:41:14 -07:00
Nanley Chery	5503b65103	i965: Add and use a getter for the miptree aux buffer Make the next patch easier to read by eliminating most of the would-be duplicate field accesses now. v2: Update the HiZ comment instead of deleting it (Rafael). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-04-24 13:41:14 -07:00
Karol Herbst	e4f675dc42	gm107/ir/lib: fix sched in div u32 builtin Imad needs to set a read barrier. With significantly big work groups I was getting wrong results for div u32. Turns out the issue was with the sched opcodes. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-24 22:31:59 +02:00
Ian Romanick	0d5ce25c1c	intel/compiler: Add scheduler deps for instructions that implicitly read g0 Otherwise the scheduler can move the writes after the reads. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95009 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95012 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Mark Janes <mark.a.janes@intel.com> Cc: Clayton A Craft <clayton.a.craft@intel.com> Cc: mesa-stable@lists.freedesktop.org	2018-04-24 14:31:21 -04:00
Ian Romanick	cd32a4e5f4	intel/compiler: Silence unused parameter warnings in empty vec4_instruction_scheduler methods src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual void vec4_instruction_scheduler::count_reads_remaining(backend_instruction)’: src/intel/compiler/brw_schedule_instructions.cpp:764:72: warning: unused parameter ‘be’ [-Wunused-parameter] vec4_instruction_scheduler::count_reads_remaining(backend_instruction be) ^~ src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual void vec4_instruction_scheduler::setup_liveness(cfg_t)’: src/intel/compiler/brw_schedule_instructions.cpp:769:51: warning: unused parameter ‘cfg’ [-Wunused-parameter] vec4_instruction_scheduler::setup_liveness(cfg_t cfg) ^~~ src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual void vec4_instruction_scheduler::update_register_pressure(backend_instruction)’: src/intel/compiler/brw_schedule_instructions.cpp:774:75: warning: unused parameter ‘be’ [-Wunused-parameter] vec4_instruction_scheduler::update_register_pressure(backend_instruction be) ^~ src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual int vec4_instruction_scheduler::get_register_pressure_benefit(backend_instruction)’: src/intel/compiler/brw_schedule_instructions.cpp:779:80: warning: unused parameter ‘be’ [-Wunused-parameter] vec4_instruction_scheduler::get_register_pressure_benefit(backend_instruction be) ^~ src/intel/compiler/brw_schedule_instructions.cpp: In member function ‘virtual int vec4_instruction_scheduler::issue_time(backend_instruction)’: src/intel/compiler/brw_schedule_instructions.cpp:1550:61: warning: unused parameter ‘inst’ [-Wunused-parameter] vec4_instruction_scheduler::issue_time(backend_instruction inst) ^~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 14:31:21 -04:00
Ian Romanick	bdb15c2344	intel/compiler: Silence unused parameter warning in compile_cs_to_nir src/intel/compiler/brw_fs.cpp: In function ‘nir_shader* compile_cs_to_nir(const brw_compiler, void, const brw_cs_prog_key, brw_cs_prog_data, const nir_shader, unsigned int)’: src/intel/compiler/brw_fs.cpp:7205:44: warning: unused parameter ‘prog_data’ [-Wunused-parameter] struct brw_cs_prog_data prog_data, ^~~~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 14:31:21 -04:00
Ian Romanick	d84b2ed1d7	intel/compiler: Silence unused parameter warnings in generate_foo methods Since all of the fs_generator::generate_foo methods take a fs_inst * as the first parameter, just remove the name to quiet the compiler. src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_barrier(fs_inst, brw_reg)’: src/intel/compiler/brw_fs_generator.cpp:743:41: warning: unused parameter ‘inst’ [-Wunused-parameter] fs_generator::generate_barrier(fs_inst inst, struct brw_reg src) ^~~~ src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_discard_jump(fs_inst)’: src/intel/compiler/brw_fs_generator.cpp:1326:46: warning: unused parameter ‘inst’ [-Wunused-parameter] fs_generator::generate_discard_jump(fs_inst inst) ^~~~ src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_pack_half_2x16_split(fs_inst, brw_reg, brw_reg, brw_reg)’: src/intel/compiler/brw_fs_generator.cpp:1675:54: warning: unused parameter ‘inst’ [-Wunused-parameter] fs_generator::generate_pack_half_2x16_split(fs_inst inst, ^~~~ src/intel/compiler/brw_fs_generator.cpp: In member function ‘void fs_generator::generate_shader_time_add(fs_inst, brw_reg, brw_reg, brw_reg)’: src/intel/compiler/brw_fs_generator.cpp:1743:49: warning: unused parameter ‘inst’ [-Wunused-parameter] fs_generator::generate_shader_time_add(fs_inst inst, ^~~~ src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_set_simd4x2_header_gen9(brw_codegen, brw::vec4_instruction, brw_reg)’: src/intel/compiler/brw_vec4_generator.cpp:1412:52: warning: unused parameter ‘inst’ [-Wunused-parameter] vec4_instruction inst, ^~~~ src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_mov_indirect(brw_codegen, brw::vec4_instruction, brw_reg, brw_reg, brw_reg, brw_reg)’: src/intel/compiler/brw_vec4_generator.cpp:1430:41: warning: unused parameter ‘inst’ [-Wunused-parameter] vec4_instruction inst, ^~~~ src/intel/compiler/brw_vec4_generator.cpp:1432:63: warning: unused parameter ‘length’ [-Wunused-parameter] struct brw_reg indirect, struct brw_reg length) ^~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-24 14:31:21 -04:00
Eric Anholt	3d21fc193e	broadcom/vc5: Set up internal_format for imported resources. Without this, we'd assertion fail in u_transfer_helper when mapping an imported resource.	2018-04-24 10:37:29 -07:00
Eric Anholt	f08f477a93	broadcom/vc5: Assert that created BOs have offset != 0. The kernel shouldn't return a bo at NULL, and the HW special-cases NULL address values for things like OQs.	2018-04-24 10:37:29 -07:00
Eric Anholt	482f2e24b5	broadcom/vc5: Don't allocate simulator BOs at offset 0. The kernel won't return us BOs at offset 0 (because things like OQs wouldn't work there), so we shouldn't in the simulator either.	2018-04-24 10:37:29 -07:00
Eric Anholt	82cdb801fd	broadcom/vc5: Add sim support for the GET_BO_OFFSET ioctl. Otherwise we'd crash immediately upon importing a BO through EGL interfaces.	2018-04-24 10:37:29 -07:00
Eric Anholt	3cdd055ed2	broadcom/vc5: Treat imports of DRM_FORMAT_MOD_INVALID BOs as linear. We don't have any kernel metadata about BO tiling, so this probably is all we should do for the moment.	2018-04-24 10:37:29 -07:00
Tapani Pälli	c2e159d050	i965: expose MESA_FORMAT_R8G8B8A8_SRGB visual Exposing the visual makes following dEQP tests pass on Android: dEQP-EGL.functional.wide_color.window_8888_colorspace_srgb dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_srgb Visual is exposed only when DRI_LOADER_CAP_RGBA_ORDERING is set. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-24 14:55:18 +03:00
Tapani Pälli	fa4d4d97f3	dri: Add __DRI_IMAGE_FORMAT_SABGR8 Add format definition and required plumbing to create images. Note that there is no match to drm_fourcc definition, just like with existing _DRI_IMAGE_FOURCC_SARGB8888. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-24 14:55:18 +03:00
Marek Olšák	4559aefb5c	Revert "st/dri: Fix dangling pointer to a destroyed dri_drawable" This reverts commit `dab02dea34`. It causes crashes of qtcreator and firefox. Fixes: `dab02de` "st/dri: Fix dangling pointer to a destroyed dri_drawable" Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>	2018-04-24 00:00:20 -04:00
Roland Scheidegger	e8e1d287a3	gallivm: dump bitcode before optimization If we dump the bitcode for off-line debug purposes, we really want the pre-optimized bitcode, otherwise it's useless in identifying problems with IR optimization (if you have a shader which takes an hour to do IR optimization, it's also nice you don't have to wait that hour...). Also, print out the function passes for opt which correspond to what was used for jit compilation (and also the opt level for codegen). Using opt/llc this way should then pretty much mimic what was done for jit. (When specifying something like -time-passes -debug-pass=[Structure\|Arguments] (for either opt or llc) that also gives very useful information in which passes all the time was spent, and which passes are really run along with the order - llvm will add passes due to dependencies on its own, and of course -O2 for llc comes with a ~100 pass list.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-04-24 04:49:39 +02:00
Roland Scheidegger	e89cf59c27	gallivm: (trivial) do division by 1000 with int64 Conversion to int can otherwise overflow if compile times are over ~71min. (Yes this can happen...) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-04-24 04:49:39 +02:00
Roland Scheidegger	45b8f620a5	gallivm: remove LICM pass LICM is simply too expensive, even though it presumably can help quite a bit in some cases. It was definitely cheaper in llvm 3.3, though as far as I can tell with llvm 3.3 it failed to do anything in most cases. early-cse also actually seems to cause licm to be able to move things when it previously couldn't, which causes noticeable compile time increases. There's more loop passes in llvm, but I'm not sure which ones are helpful, and I couldn't find anything which would roughly do what the old licm in llvm 3.3 did, so ditch it. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-04-24 04:49:39 +02:00
Roland Scheidegger	8b9ab674b9	gallivm: add early cse pass This pass is quite cheap, and can simplify the IR quite a bit for our generated IR. In particular on a variety of shaders I've found the time saved by other passes due to the simplified IR more than makes up for the cost of this pass, and on top of that the end result is actually better. The only downside I've found is this enables the LICM pass to move some things out of the main shader loop (in the case I've seen, instanced vertex fetch (which is constant within the jit shader) plus the derived instructions in the shader) which it couldn't do before for some reason. This would actually be desirable but can increase compile time considerably (licm seems to have considerable cost when it actually can move things out of loops, due to alias analysis). But blaming early cse for this seems inappropriate. (Note that the first two sroa / earlycse passes are similar to what a standard llvm opt -O1/-O2 pipeline would do, albeit this has some more passes even before but I don't think they'd do much for us.) It also in particular helps some crazy shader used for driver verification (don't ask...) a lot (about factor of 6 faster in compile time) (due to simplifying the ir before LICM is run). While here, also move licm behind simplifycfg. For some shaders there seems to be very significant compile time gains (we've seen a factor of 10000 albeit that was a really crazy shader you'd certainly never see in a real app), because LICM is quite expensive and there's cases where running simplifycfg (along with sroa and early-cse) before licm reduces IR complexity significantly. (I'm not entirely sure if it would make sense to also run it afterwards.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-04-24 04:49:39 +02:00
Vlad Golovkin	1ff1dc1c63	glsl/glcpp: Handle hex constants with 0X prefix GLSL 4.6 spec describes hex constant as: hexadecimal-constant: 0x hexadecimal-digit 0X hexadecimal-digit hexadecimal-constant hexadecimal-digit Right now if you have a shader with the following structure: #if 0X1 // or any hex number with the 0X prefix // some code #endif the code between #if and #endif gets removed because the checking is performed only for "0x" prefix which results in strtoll being called with the base 8 and after encountering the 'X' char the strtoll returns 0. Letting strtoll detect the base makes this limitation go away and also makes code easier to read. From the strtoll Linux man page: "If base is zero or 16, the string may then include a "0x" prefix, and the number will be read in base 16; otherwise, a zero base is taken as 10 (decimal) unless the next character is '0', in which case it is taken as 8 (octal)." This matches the behaviour in the GLSL spec. This patch also adds a test for uppercase hex prefix. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-24 09:55:05 +10:00
Timothy Arceri	295f57e09a	mesa: rename api_validate.{c,h} -> draw_validate.{c,h} Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65422	2018-04-24 09:23:30 +10:00
Dave Airlie	a90c9f33cf	ac/radv/radeonsi: refactor harvest config register getters. This refactors the code out to share it between radv and radeonsi. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-24 09:08:34 +10:00
Dave Airlie	8e4d54505a	radv: only set raster_config_1 outside the index registers. This follows what radeonsi does. Ported from radeonsi: radeonsi: emit PA_SC_RASTER_CONFIG_1 only once Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-24 09:08:34 +10:00
Dave Airlie	f77caa7411	ac/radv/radeonsi: refactor max simd waves into common code. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-24 09:08:33 +10:00
Dave Airlie	899df55ee0	ac/radv/radeonsi: refactor raster_config default values getters. This just makes this common code between the two drivers. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-24 09:07:51 +10:00
Dave Airlie	8de7ff91be	radeonsi: use common gs_table_depth code Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-24 09:05:43 +10:00
Dave Airlie	9afe9c0fe2	radv: use common gs_table_depth code. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-24 09:05:43 +10:00
Dave Airlie	5e2ef28390	ac/info: move gs table depth to common code. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-24 09:05:38 +10:00
Dave Airlie	b25f6cde89	radeonsi: don't runtime check gs table info We can just unreachable here, this aligns with radv code, makes it easier to move to common code. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-24 09:05:29 +10:00
Dave Airlie	40783a7fa3	radv/gfx9: don't use gs_table_depth on gfx9. Missed this on initial radeonsi port, we shouldn't use this value on gfx9, but also in gfx8 only for when we have a geom shader. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-04-24 09:04:42 +10:00
Jason Ekstrand	de1f22d595	i965/fs: Return mlen * 8 for size_read() for INTERPOLATE_AT_* They are send messages and this makes size_read() and mlen agree. For both of these opcodes, the payload is just a dummy so mlen == 1 and this should decrease register pressure a bit. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: mesa-stable@lists.freedesktop.org	2018-04-23 14:04:42 -07:00
Samuel Pitoiset	d136a5fad9	ac: fix the number of coordinates for ac_image_get_lod and arrays This fixes crashes for the following CTS: dEQP-VK.glsl.texture_functions.query.texturequerylod.* Cubemaps are the same as 2D arrays. Fixes: `625dcbbc45` ("amd/common: pass address components individually to ac_build_image_intrinsic") Cc: 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-23 21:48:38 +02:00
Lionel Landwerlin	2964e16e51	i965: perf: enable GPA query statistics The combination of GPA/MDAPI components expects a particular name & layout for their pipeline statistics query. v2: Limit the query GPA/MDAPI statistics to gen7->9 (Lionel) v3: Add curly braces (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-23 18:30:10 +01:00
Lionel Landwerlin	2e3025c817	i965: perf: add support for raw queries The INTEL_performance_query extension provides a list of queries that a user can select to monitor a particular workload. Each query reports different sets of counters (roughly looking at different parts of the hardware, i.e. caches/fixed functions/etc...). Each query has an associated configuration that we need to program into the hardware before using the query. Up to now, we provided predefined queries. This change allows the user to build its own query (and associated configuration) externally, and have the i965 driver use that configuration through a new query named : Intel_Raw_Hardware_Counters_Set_0_Query When this query is selected, the i965 driver will report raw counters deltas (meaning their values need to be interpreted by the user, as opposed to existing queries that provide human readable values). This change is also useful for debug purposes for building new pre-defined queries and verifying the underlying numbers make sense before writing equations for user readable output. This change's purpose is also to enable GPA. GPA uses a library called MDAPI that processes raw counter data. MDAPI expects raw data to have a certain layout (per generation which is a bit unfortunate...). This change also embeds the expected data layouts. v2: Enable raw queries on gen 7->11, v1 had 7->9 (Lionel) v3: Don't assert on cherryview for gen7... (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-23 18:30:10 +01:00
Lionel Landwerlin	c61d445a5a	i965: perf: read slice/unslice frequencies from OA reports v2: Add comment breaking down where the frequency values come from (Ken) v3: More documentation (Ken/Lionel) Adjust clock ratio multiplier to reflect the divider's behavior (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-23 18:30:10 +01:00
Lionel Landwerlin	43fcb72d2c	i965: perf: snapshot RPSTAT register This register contains the current/previous frequency of the GT, it's one of the values GPA would like to have as part of their queries. v2: Don't use this register on baytrail/cherryview (Ken) Use GET_FIELD() macro (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-23 18:30:10 +01:00
Lionel Landwerlin	d71b442416	i965: perf: extract utility functions We would like to reuse a number of the functions and structures in another file in a future commit. We also move the previous content of brw_performance_query.h into brw_performance_query_metrics.h to be included by generated metrics files. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-23 18:30:10 +01:00
Samuel Pitoiset	e37e643589	ac: teach get_ac_sampler_dim() about subpass attachments Suggested by Nicolai. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-23 19:10:56 +02:00
Samuel Pitoiset	84fef802fb	ac/nir: add missing round_slice for 1D arrays This fixes a bunch of CTS fails with 1D arrays: dEQP-VK.glsl.texture_functions.texture.sampler1darray_ Fixes: `625dcbbc45` ("amd/common: pass address components individually to ac_build_image_intrinsic") Cc: 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-23 19:10:52 +02:00
Dylan Baker	10e4290524	bin/install_megadrivers: rename a few variables to make things clearer Originally the "each" variable was just a part of the "drivers" variable. It's not anymore so it's a bit ambiguous. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-23 09:57:35 -07:00
Dylan Baker	ae3f45c11e	bin/install_megadrivers: fix DESTDIR and -D*-path This fixes -Ddri-drivers-path, -Dvdpau-libs-path, etc. with DESTDIR when those paths are absolute. Currently due to the way python's os.path.join handles absolute paths these will ignore DESTDIR, which is bad. This fixes them to be relative to DESTDIR if that is set. Fixes: `3218056e0e` ("meson: Build i965 and dri stack") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-23 09:57:35 -07:00
Dylan Baker	dbf5b772b3	compiler/glsl: close fd's in glcpp_test.py I would have thought falling out of scope would allow the gc to collect these, but apparently it doesn't, and this hits an fd limit on macos. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106133 Fixes: `db8cd8e367` ("glcpp/tests: Convert shell scripts to a python script") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Vinson Lee <vlee@freedesktop.org>	2018-04-23 09:55:17 -07:00
Bas Nieuwenhuizen	0e945fdf23	nir: Do not use progress for unreachable code in return lowering. We seem to use progress for two cases: 1) When we lowered some returns. 2) When we remove unreachable code. If just case 2 happens we assert as state->return_flag has not been allocated yet, but we are still trying to insert all predicates based on it. This splits the concerns. We only use progress internally for case 1 and then keep track of 2 in a separate variable to indicate progress in the return value of the pass. This is slightly better than transforming the assert into if (!state->return_flag) return, as the solution in this patch avoids inserting predicates even if some other part of the might need them. Fixes: `6e22ad6edc` "nir: return early when lowering a return at the end of a function" CC: 18.1 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106174 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-23 16:55:15 +02:00
Józef Kucia	8328c64eb1	radv: advertise 8 bits of subpixel precision for viewports This is what radeonsi does. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-23 11:16:11 +02:00
Johan Klokkhammer Helsing	dab02dea34	st/dri: Fix dangling pointer to a destroyed dri_drawable If an EGLSurface is created, made current and destroyed, and then a second EGLSurface is created, the second malloc in driCreateNewDrawable may return the same pointer address the first surface's drawable had. Consequently, when dri_make_current later tries to determine if it should update the texture_stamp it compares the surface's drawable pointer against the drawable in the last call to dri_make_current and assumes it's the same surface (which it isn't). When texture_stamp is left unset, then dri_st_framebuffer_validate thinks it has already called update_drawable_info for that drawable, leaving it unvalidated and this is when bad things start to happen. In my case it manifested itself by the width and height of the surface being unset. This is fixed by setting the pointer to NULL before freeing the surface. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106126 Signed-off-by: Johan Klokkhammer Helsing <johan.helsing@qt.io> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org>	2018-04-23 04:25:40 -04:00
Ilia Mirkin	5428066f5e	nv50/ir: make a copy of tex src if it's referenced multiple times For nv50 we coalesce the srcs and defs into a single node. As such, we can end up with impossible constraints if the source is referenced after the tex operation (which, due to the coalescing of values, will have overwritten it). This logic already exists for inserting moves for MERGE/UNION sources. It's the exact same idea here, so leverage that code, which also includes a few optimizations around not extending live ranges unnecessarily. Fixes tests/spec/glsl-1.30/execution/fs-textureSize-components.shader_test Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-22 23:03:16 -04:00
Lepton Wu	6c5abb68c7	virgl: disable virgl when no 3D for virtio gpu. If users are running mesa under old version of qemu or have turned off GL at runtime, virtio gpu driver actually doesn't work. Adds a detection here so mesa can fall back to software rendering. v2: - move detection from loader to virgl (Ilia, Emil) Signed-off-by: Lepton Wu <lepton@chromium.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-04-23 12:35:29 +10:00
Dave Airlie	a8420e2530	radv: mark const structs as extern in header file to avoid lto damage The copr repo from che was using LTO and he reported radv broke recently with it. When testing with lto builds here I noticed that we weren't seeing any instance extensions reported. It appears LTO was treating the const without extern as an empty struct, this is possibly a gcc bug, but we can work around it just by marking these with extern. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-04-23 05:55:22 +10:00
Dylan Baker	f8c4716854	Bump version after 18.1 Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-22 09:35:56 -07:00
Ilia Mirkin	3f1cad48b8	gallium/tests/trivial: fix viewport depth transform These were getting mapped off into outer space, which would cause nv50 and nvc0 to clip the primitives (as depth_clip was enabled). These drivers are configured to clip everything outside the [0, 1] range, even though the hardware supports other view settings. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-04-21 23:31:48 -04:00
Ilia Mirkin	fe8b6d7e1f	trace: allow image resource to be null Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-21 23:29:39 -04:00
Karol Herbst	63572091b5	nv50/ir/ra: prefer def == src2 for fma with immediates on nvc0 This helps with the PostRALoadPropagation pass moving long immediates into FMA/MAD instructions. changes in shader-db: total instructions in shared programs : 5894114 -> 5886074 (-0.14%) total gprs used in shared programs : 666558 -> 666563 (0.00%) total shared used in shared programs : 520416 -> 520416 (0.00%) total local used in shared programs : 53524 -> 53524 (0.00%) total bytes used in shared programs : 54006744 -> 53932472 (-0.14%) local shared gpr inst bytes helped 0 0 2 4192 4192 hurt 0 0 7 9 9 Signed-off-by: Karol Herbst <karolherbst@gmail.com> [imirkin: minor edits to separate nv50 and nvc0+ cases] Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-21 10:53:59 -04:00
Rhys Perry	cc35b76e99	docs/features: mark GL_ARB_post_depth_coverage as DONE for nvc0 This was done a while ago but never marked on features.txt. Note that this is only supported on GM200+. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-21 10:02:55 -04:00
Dylan Baker	6754c2e83d	autotools: Include new meson files Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-20 20:26:56 -07:00
Dylan Baker	5c8e2501a6	autotools: Add passes.h to sources so it will be included in the tarball This was introduced in commit `8f848ada8a` but not added to the sources list, which is necessary for it to be included in release tarballs. Fixes: `8f848ada8a` ("swr/rast: Start refactoring of builder/packetizer.") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-20 20:26:54 -07:00
Dylan Baker	cfd7d2ba0d	autotools: include include/vulkan headers This is needed to provide vk_android_native_buffer.h for vk_enum_to_str. v2: - remove accidentally included changes Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-20 20:26:49 -07:00
Rhys Perry	a0e57432b7	nvc0: fix line width on GM20x+ This has the side-effect of fixing polygon-offset piglit test failures. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-20 20:43:59 -04:00
Nanley Chery	7b20329107	i965/miptree: Delete an unused function We're going to combine ::mcs_buf and ::hiz_buf in later commits. Once that happens, this function no longer makes sense. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-20 17:14:37 -07:00
Nanley Chery	010abacc95	i965/miptree: Don't leak the clear_color_bo Free the clear_color_bo in addition to freeing the intel_miptree_aux_buffer which holds the reference to it. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-20 17:14:37 -07:00
Jason Ekstrand	9d2ef3c9ec	i965/blorp: Do the gen11 BTI flush Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-04-20 16:30:14 -07:00
Jason Ekstrand	185630c6bc	anv/blorp: Do the gen11 BTI flush Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-04-20 16:30:14 -07:00
Lucas Stach	52e93e309f	etnaviv: fix texture_format_needs_swiz memcmp returns 0 when both swizzles are the same, which means we don't need any hardware swizzling. texture_format_needs_swiz should return true when the return value of the memcmp is non-zero. Fixes: `751ae6afbe` ("etnaviv: add support for swizzled texture formats") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Marek Vasut <marex@denx.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2018-04-20 18:54:10 +02:00
Samuel Pitoiset	8f13975713	ac/nir: fix image dimension for subpass attachments For subpass attachments we need one more coordinate with the layer, so make them array types. This fixes a bunch of CTS fails with RADV. Fixes: `24fb3e6aa1` ("ac/nir: use ac_build_image_opcode for image intrinsics") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 18:44:51 +02:00
Bas Nieuwenhuizen	e1df849c3c	radv: Mark GTT memory as device local for APUs. Otherwise a lot of games complain about not having enough memory, and it is sort of local so this seems reasonable to me. CC: 18.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-20 18:16:16 +02:00
Samuel Pitoiset	fedd0a4215	radv/winsys: allow to submit up to 4 IBs for chips without chaining The SI family doesn't support chaining which means the maximum size in dwords per CS is limited. When that limit was reached we failed to submit the CS and the application crashed. This patch allows to submit up to 4 IBs which is currently the limit, but recent amdgpu supports more than that. Please note that we can reach the limit of 4 IBs per submit but currently we can't improve that. The only solution is to upgrade libdrm. That will be improved later but for now this should fix crashes on SI or when using RADV_DEBUG=noibs. Fixes: `36cb5508e8` ("radv/winsys: Fail early on overgrown cs.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105775 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 18:12:26 +02:00
Stefan Schake	ff904978a1	gallium/util: Android backtrace support We can't use any of the existing implementations in u_debug_stack. Android technically has libunwind, but it's been modified to the point where it no longer compiles with the Mesa usage. The library is also not meant to be referenced by vendor libraries. The officially sanctioned way of obtaining backtraces is through the Android own libbacktrace, a C++ library. Access it through a separate C++ source file on Android only. Signed-off-by: Stefan Schake <stschake@gmail.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-04-20 18:49:49 +03:00
Stefan Schake	2abd4f4b49	gallium/util: Don't stub u_debug_stack on Android The fallback path for no libunwind ends up being stubs for Android. Don't compile them in so we can provide our own implementation. Signed-off-by: Stefan Schake <stschake@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-04-20 18:49:37 +03:00
Samuel Pitoiset	dd069e9b41	ac/nir: handle nir_intrinsic_load_first_vertex like base_vertex This fixes a ton of CTS crashes. Fixes: `c366f422f0` ("nir: Offset vertex_id by first_vertex instead of base_vertex") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 17:07:38 +02:00
Samuel Pitoiset	b21a4efb55	radv/winsys: allow local BOs on APUs Ported from RadeonSI. Local BOs ignore BO priorities, and we don't need those on APUs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 16:18:24 +02:00
Samuel Pitoiset	5c1233ed62	radv: use a global BO list only for VK_EXT_descriptor_indexing Maintaining two different paths is annoying but this gets rid of the performance regression introduced by the global BO list. We might find a better solution in the future, but for now just keep two paths. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 16:18:18 +02:00
Samuel Pitoiset	7bd5367546	Revert "radv: Don't store buffer references in the descriptor set." In order to reduce a performance regression introduced by `4b13fe55a4` ("radv: Keep a global BO list for VkMemory."), we are going to maintain two different paths. One when VK_EXT_descriptor_indexing is enabled by the application because we need to have a global BO list, and one (the old one) when it's not enabled. With Talos on Polaris, the global BO list reduces performance by 10% which is too much for me. This reverts commit `ab6cadd3ec`. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-20 16:18:13 +02:00
Jose Maria Casanova Crespo	eb96bd57c7	i965/fs: retype offset_reg to UD at load_ssbo All operations with offset_reg at do_vector_read are done with UD type. So copy propagation was not working through the generated MOVs: mov(8) vgrf9:UD, vgrf7:D This change allows removing the MOV generated for reading the first components for 16-bit and 64-bit ssbo reads with non-constant offsets. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-04-20 13:30:12 +02:00
Nicolai Hähnle	24fb3e6aa1	ac/nir: use ac_build_image_opcode for image intrinsics So that we'll use the dimension-aware intrinsics in the future. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-04-20 09:30:07 +02:00
Nicolai Hähnle	74063431f1	radeonsi: generate image load/store/atomic ops using ac_build_image_opcode In preparation of dimension-aware LLVM image intrinsics. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-04-20 09:29:57 +02:00
Nicolai Hähnle	625dcbbc45	amd/common: pass address components individually to ac_build_image_intrinsic This is in preparation for the new image intrinsics. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-04-20 09:23:52 +02:00
Nicolai Hähnle	f931583828	amd/common: pass new enum ac_image_dim to ac_build_image_opcode This is in preparation for the new, dimension-aware LLVM image intrinsics. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-04-20 09:23:40 +02:00
Nicolai Hähnle	9cb52d470a	radeonsi/nir: fix crash in test involving the sample mask Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-20 09:21:50 +02:00
Nicolai Hähnle	552bc37c6f	radeonsi/nir: set FS properties only when scanning a fragment shader Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-20 09:21:47 +02:00
Nicolai Hähnle	a807a9b215	ac/nir: fix atomic compare-and-swap The LLVM instruction returns { i32, i1 }, where the i1 indicates success. We're only interested in the first part, which is the loaded value. Fixes dEQP-GLES31.functional.compute.shared_var.atomic.compswap.* Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-20 09:21:40 +02:00
Nicolai Hähnle	e788b987d8	radeonsi: fix error paths of si_texture_transfer_map trans is zero-initialized, but trans->resource is setup immediately so needs to be dereferenced. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-20 09:21:33 +02:00
Nicolai Hähnle	68ee1d5796	glsl: prevent spurious Valgrind errors when serializing NIR It looks as if the structure fields array is fully initialized below, but in fact at least gcc in debug builds will not actually overwrite the unused bits of bit fields. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-20 09:21:23 +02:00
Aaron Watry	354b12681b	clover: Fix host access validation for sub-buffer creation From CL 1.2 Section 5.2.1: CL_INVALID_VALUE if buffer was created with CL_MEM_HOST_WRITE_ONLY and flags specify CL_MEM_HOST_READ_ONLY , or if buffer was created with CL_MEM_HOST_READ_ONLY and flags specify CL_MEM_HOST_WRITE_ONLY , or if buffer was created with CL_MEM_HOST_NO_ACCESS and flags specify CL_MEM_HOST_READ_ONLY or CL_MEM_HOST_WRITE_ONLY . Fixes CL 1.2 CTS test/api get_buffer_info v2: Correct host_access_flags check (Francisco) Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-04-19 20:57:37 -05:00
Neil Roberts	c366f422f0	nir: Offset vertex_id by first_vertex instead of base_vertex base_vertex will be zero for non-indexed calls and in that case we need vertex_id to be offset by the ‘first’ parameter instead. That is what we get with first_vertex. This is true for both GL and Vulkan. The freedreno driver is also setting vertex_id_zero_based on nir_options. In order to avoid breakage this patch switches the relevant code to handle SYSTEM_VALUE_FIRST_VERTEX so that it can retain the same behavior. v2: change a3xx/fd3_emit.c and a4xx/fd4_emit.c from SYSTEM_VALUE_BASE_VERTEX to SYSTEM_VALUE_FIRST_VERTEX (Kenneth). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: Rob Clark <robdclark@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-04-19 15:57:45 -07:00
Neil Roberts	c4f30a9100	spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEX The base vertex in Vulkan is different from GL in that for non-indexed primitives the value is taken from the firstVertex parameter instead of being set to zero. This coincides with the new SYSTEM_VALUE_FIRST_VERTEX instead of BASE_VERTEX. v2 (idr): Add comment describing why SYSTEM_VALUE_FIRST_VERTEX is used for SpvBuiltInBaseVertex. Suggested by Jason. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-19 15:57:45 -07:00
Antia Puentes	c32e1035cb	intel: Handle firstvertex in an identical way to BaseVertex Until we set gl_BaseVertex to zero for non-indexed draw calls both have an identical value. The Vertex Elements are kept like that: * VE 1: <BaseVertex/firstvertex, BaseInstance, VertexID, InstanceID> * VE 2: <Draw ID, 0, 0, 0> v2 (idr): Mark nir_intrinsic_load_first_vertex as "unreachable" in emit_system_values_block and fs_visitor::nir_emit_vs_intrinsic.	2018-04-19 15:57:45 -07:00
Neil Roberts	0c8395e15d	intel/compiler: Add a uses_firstvertex flag Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-19 15:57:45 -07:00
Antia Puentes	5ff848df7b	compiler: Add SYSTEM_VALUE_FIRST_VERTEX and intrinsics This VS system value will contain the value passed as <basevertex> for indexed draw calls or the value passed as <first> for non-indexed draw calls. It can be used to calculate the gl_VertexID as SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX. From the OpenGL 4.6 spec, 10.4 "Drawing Commands Using Vertex Arrays": - Page 352: "The index of any element transferred to the GL by DrawArraysOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is first + i." - Page 355: "The index of any element transferred to the GL by DrawElementsOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is the sum of basevertex and the value stored in the currently bound element array buffer at offset indices + i." Currently the gl_VertexID calculation uses SYSTEM_VALUE_BASE_VERTEX but this will have to change when the value of gl_BaseVertex is fixed. Currently its value is broken for non-indexed draw calls because it must be zero but we are setting it to <first>. v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth). v3 (idr): Rebase on Rob Clark converting nir_intrinsics.h to be generated. Reformat commit message to 72 columns. Reviewed-by: Neil Roberts <nroberts@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-04-19 15:57:45 -07:00
Mike Lothian	051fddb4a9	meson: Build st_tests_common with gtest Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106131 Fixes: `34cb4d0ebc` ("meson: build tests for gallium mesa state tracker") Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-04-19 09:04:51 -07:00
Bas Nieuwenhuizen	dffdef6737	radv: Add Vega M support. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-19 16:36:21 +02:00
Bas Nieuwenhuizen	d1ce31d36c	radv: Add bound checking workaround for dynamic buffers. I have seen a few applications and games do the dynamic buffer bounds incorrectly; this makes it easier to work around, e.g. for debugging. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-19 16:13:25 +02:00
Thomas Hellstrom	e0c08183fb	svga: Fix incorrect advertizing of EGL_KHR_gl_colorspace When advertizing this extension, egl_dri2 uses the DRI2_RENDERER_QUERY extension to query whether an sRGB format is supported. That extension will query our driver with the BIND flag PIPE_BIND_RENDER_TARGET rather than PIPE_BIND_DISPLAY_TARGET which is used when building the configs. We only return the correct value for PIPE_BIND_DISPLAY_TARGET. The inconsistency causes EGL to crash at surface initialization if sRGB is not supported. Fix this by supporting both bind flags. Testing done: piglit egl_gl_colorspace srgb Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-04-19 13:42:51 +02:00
Mike Lothian	79487c427e	swr: Fix include for createPromoteMemoryToRegisterPass Include llvm/Transforms/Utils.h with the newest LLVM 7 v2: Include with " " rather than < > (Vinson Lee) v3: Use LLVM_VERSION_MAJOR rather than HAVE_LLVM (George Kyriazis) Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-04-19 00:39:04 -07:00
Samuel Pitoiset	2f63b3dd09	radv: enable DCC for MSAA 2x textures on VI under an option This can be enabled with RADV_PERFTEST=dccmsaa. DCC for MSAA textures is actually not as easy to implement. It looks like there are some corner cases. I will improve support incrementally. Vega support, as well as Polaris improvements, will be added later. No CTS changes on Polaris using RADV_DEBUG=zerovram and RADV_PERFTEST=dccmsaa. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:55 +02:00
Samuel Pitoiset	dc3d39771f	radv: decompress DCC for multisampled source images before resolving Multisampled source images (ie. color attachments) can be now DCC compressed, so the driver needs to perform a DCC decompression pass before resolving Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:52 +02:00
Samuel Pitoiset	1aefb62f1e	radv: add a workaround for fast clears with DCC and MSAA textures This should be fixed at some point in order to improve performance. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:50 +02:00
Samuel Pitoiset	373fa0b599	radv: allocate CMASK for DCC fast clear with MSAA CMASK is required because it should be cleared to 0xCCCCCCCC for MSAA textures. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:48 +02:00
Samuel Pitoiset	255506c4e0	radv: implement fast color clear for DCC with MSAA When DCC is enabled with MSAA textures, CMASK should be cleared to 0xCCCCCCCC. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:10:45 +02:00
Samuel Pitoiset	796b6f4aab	radv: make sure to sync after resolving using the compute path This fixes some random CTS failures: dEQP-VK.renderpass.multisample.*. Performing a fast-clear eliminate is still useless, but it seems that we need to sync. Found while running CTS with RADV_DEBUG=zerovram. Fixes: `56a171a499` ("radv: don't fast-clear eliminate after resolving a subpass with compute") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:09:55 +02:00
Samuel Pitoiset	4a698660ae	radv: dump the SHA1 of SPIRV in the hang report Might be useful for debugging purposes, especially when we want to replace a shader on the fly. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-19 09:09:52 +02:00
Bas Nieuwenhuizen	0e10790558	radv: Enable VK_EXT_descriptor_indexing. This adds everything except non-uniform indexing, which needs a bit more work and testing. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	5f7ebb5206	spirv: Add support for runtime descriptor array cap. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	c48feaf2d1	spirv: Add support for VK_EXT_descriptor_indexing uniform indexing caps. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	b5e04e9217	radv: Support allocating variable size descriptor sets. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	78c54acbe8	radv: Add support for variable descriptor set layouts. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	082c11e8a5	radv: Fix GetDescriptorSetLayoutSupport. The continue means we do alignment differently than during creation, making the buffer smaller than expected. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	d02bbde1a8	radv: Use sorted bindings for set layout creation. Previously we did not care about having the set storage in order, but for variable descriptor count we want the highest binding at the end of the storage. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	ab6cadd3ec	radv: Don't store buffer references in the descriptor set. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	4b13fe55a4	radv: Keep a global BO list for VkMemory. With update after bind we can't attach bo's to the command buffer from the descriptor set anymore, so we have to have a global BO list. I am somewhat surprised this works really well even though we have implicit synchronization in the WSI based on the bo list associations and with the new behavior every command buffer is associated with every swapchain image. But I could not find slowdowns in games because of it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Bas Nieuwenhuizen	22d6b89e39	spirv: Update spirv.h to 12f8de9f04327336b699b1b80aa390ae7f9ddbf4 Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-18 22:56:54 +02:00
Kenneth Graunke	da25ae92be	i965: Fix shadow batches to be the same size as the real BO. brw_bo_alloc may round up our allocation size to the next bucket size. In this case, we would malloc a shadow buffer that was the original intended size, but use bo->size (the larger size) for all of our checks. This could cause us to run off the end of the shadow buffer. v2: Actually use the new BO size (caught by Lionel) Reported-by: James Xiong <james.xiong@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c7dcee58b5` (i965: Avoid problems from referencing orphaned BOs after growing.)	2018-04-18 13:55:08 -07:00
Marek Olšák	7bd24d951a	glsl_to_tgsi: try harder to lower unsupported ir_binop_vector_extract This fixes some piglits. Cc: 18.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-18 15:34:52 -04:00
Leo Liu	90de03708f	radeon/vce: disable vce dual pipe on VegaM Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-18 14:45:35 -04:00
Marek Olšák	c6f1d36019	radeonsi: add support for VegaM Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-18 14:45:33 -04:00
Marek Olšák	d6a66bc8db	amd/addrlib: add support for VegaM Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-18 14:45:32 -04:00
Marek Olšák	d15fb766aa	radeonsi/gfx9: fix a hang with an empty first IB This packet causes the no-op IB detection to fail, so the IB is always submitted. Also fix the no-op IB detection by moving the begin call. Cc: 18.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-18 14:42:06 -04:00
Dylan Baker	d28c246501	meson: build graw tests This only enables the null and xlib target, so no windows support yet. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
Dylan Baker	34cb4d0ebc	meson: build tests for gallium mesa state tracker v2: - Fix typo Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
Dylan Baker	de01018293	meson: build gallium unit tests v2: - gate unit tests on swrast being enabled (Eric A) v3: - rebase on libtrace being merged with gallium auxiliary Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> (v2)	2018-04-18 09:03:57 -07:00
Dylan Baker	4c794c7834	meson: Build gallium trivial tests Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
Dylan Baker	7fee8fed16	meson: Remove TODO about mesa/main tests They're already done. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
Dylan Baker	5d16c86add	meson: enable glcpp test Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
Dylan Baker	db8cd8e367	glcpp/tests: Convert shell scripts to a python script This ports glcpp-test.sh and glcpp-test-cr-lf.sh to a python script that accepts arguments for each line ending type. This should allow for better reporting to users. v2: - Use $PYTHON2 to be consistent with other tests in mesa Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
Dylan Baker	8cb96c4031	glsl/tests: Remove unused compare_ir.py script Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-18 09:03:57 -07:00
Dylan Baker	877d250ea1	meson: enable optimization-test Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-18 09:03:57 -07:00
Dylan Baker	97c28cb082	glsl/tests: Convert optimization-test.sh to pure python This patch converts optimization-test.sh to python, in this process it removes external shell dependencies including diff. It replaces the python script that generates shell scripts with a python library that generates test cases and runs them using subprocess. v2: - use $PYTHON2 to be consistent with other tests in mesa Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-18 09:03:57 -07:00
Dylan Baker	ad9c2f2018	meson: run glsl compiler warnings test Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
Dylan Baker	3b52d29227	glsl/tests: reimplement warnings-test in python This reimplements the test in python with a shell script wrapper that allows autotools to continue to run the test without realizing that anything has changed. Using python has two advantages, first it's portable so this test can be run on windows as well as Linux since it just requires python, no more diff, pwd or sh. It's also no longer tied to autotools implementation details, like the environment variables $srcdir and $abs_builddir, though the autotools shell wrapper still uses those, which makes it possible to run the test in meson. v2: - Use $PYTHON2 in script to be consistent with other scripts in mesa Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-18 09:03:57 -07:00
George Kyriazis	12a002a3a1	swr/rast: Fix VGATHERPD lowering Also Implement VHSUBPS in x86 lowering pass. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	99fe90722d	swr/rast: Replace x86 VMOVMSK with llvm-only implementation Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	0899122c03	swr/rast: Optimize late/bindless JIT of samplers Add per-worker thread private data to all shader calls Add per-worker sampler cache and jit context Add late LoadTexel JIT support Add per-worker-thread Sampler / LoadTexel JIT Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	ec7154abc0	swr/rast: Implement VROUND intrinsic in x86 lowering pass Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	bb02da3c1b	swr/rast: Refactor to improve code sharing. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	94ca1c018f	swr/rast: minimize codegen redundant work Move filtering of redundant codegen operations into gen scripts themselves Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	7f34860125	swr/rast: double-pump in x86 lowering pass Add support for double-pumping a smaller SIMD width intrinsic. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	96ad8f5a23	swr/rast: Fix 64bit float loads in x86 lowering pass Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	1ffbbbee97	swr/rast: Add shader stats infrastructure (WIP) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	a81c625cb7	swr/rast: Type-check TemplateArgUnroller Allows direct use of enum values in conversion to template args. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	2966ee1028	swr/rast: Add vgather to x86 lowering pass. Add support for generic VGATHERPD intrinsic in x86 lowering pass. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	e4929b5d26	swr/rast: fix comment Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	670a99c233	swr/rast: add cvt instructions in x86 lowering pass Support generic VCVTPD2PS and VCVTPH2PS in x86 lowering pass. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	aa482014e5	swr/rast: Fix alloca usage in jitter Fix issue where temporary allocas were getting hoisted to function entry unnecessarily. We now explicitly mark temporary allocas and skip hoisting during the hoist pass. Should reduce stack usage. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	81371a5909	swr/rast: Change gfx pointers to gfxptr_t Changing type to gfxptr for indices and related changes to fetch and mem builder code. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	71239478d3	swr/rast: Fix byte offset for non-indexed draws Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	c57b594317	swr/rast: Add support for setting optimization level for JIT compilation Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	4f0df5e2f7	swr/rast: Adding translate call to builder_gfx_mem. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	f135f54b18	swr/rast: Fix codegen for typedef types Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	c5d7b37fe7	swr: add x86 lowering pass to fragment shader Needed because some FP paths (namely stipple) use gather intrinsics that now need to be lowered to x86. v2: fix typo in commit message Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	9161c40d14	swr/rast: Enable generalized fetch jit Enable generalized fetch jit with 8 or 16 wide SIMD target. Still some work needed to remove some simd8 double pumping for 16-wide target. Also removed unused non-gather load vertices path. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	d73082b98b	swr/rast: Add builder_gfx_mem.{h\|cpp} Abstract usage scenarios for memory accesses into builder_gfx_mem. Builder_gfx_mem will convert gfxptr_t from 64-bit int to regular pointer types for use by builder_mem. v2: reworded commit message; renamed enum more appropriately Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	1eb72673fc	swr/rast: Lower VGATHERPS and VGATHERPS_16 to x86. Some more work to do before we can support simultaneous 8-wide and 16-wide and remove the VGATHERPS_16 version. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	b15fb78df5	swr/rast: Cleanup of JitManager convenience types Small cleanup. Remove convenience types from JitManager and standardize on the Builder's convenience types. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	d68694016c	swr/rast: Lower PERMD and PERMPS to x86. Add support for providing an emulation callback function for arch/width combinations that don't map cleanly to an x86 intrinsic. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	8f848ada8a	swr/rast: Start refactoring of builder/packetizer. Move x86 intrinsic lowering to a separate pass. Builder now instantiates generic intrinsics for features not supported by llvm. The separate x86 lowering pass is responsible for lowering to valid x86 for the target SIMD architecture. Currently it's a port of existing code to get it up and running quickly. Will eventually support optimized x86 for AVX, AVX2 and AVX512. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	ffc0aeb4ec	swr/rast: Simplify #define usage in gen source file Removed preprocessor defines from structures passed to LLVM jitted code. The python scripts do not understand the preprocessor defines and ignore them. So for fields that are compiled out due to a preprocessor define the LLVM script accounts for them anyway because it doesn't know what the defines are set to. The sanitize defines for open source are fine in that they're safely used. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	f36026ce2e	swr/rast: Move CallPrint() to a separate file Needed work for jit code debug. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	67c8bb4db7	swr/rast: Fix name mangling for LLVM pow intrinsic Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	7a5054aa1c	swr/rast: Add some archrast counters Hook up archrast counters for shader stats: instructions executed. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	f52a501716	swr/rast: Code cleanup Removing some code that doesn't seem to do anything meaningful. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	093c1aee88	swr/rast: Add "Num Instructions Executed" stats intrinsic. Added a SWR_SHADER_STATS structure which is passed to each shader. The stats pass will instrument the shader to populate this. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	5fbee5e4ef	swr/rast: Add MEM_ADD helper function to Builder. mem[offset] += value This function will be heavily used by all stats intrinsics. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	9103119cb3	swr/rast: Permute work for simd16 Fix slow permutes in PA tri lists under SIMD16 emulation on AVX Added missing permute (interlane, immediate) to SIMDLIB Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	4c69823d15	swr/rast: WIP builder rewrite (2) Finish up the remaining explicit intrinsic uses. At this point all explicit Intrinsic::getDeclaration() usage has been replaced with auto generated macros generated with gen_llvm_ir_macros.py. Going forward, make sure to only use the intrinsics here, adding new ones as needed. Next step is to remove all references to x86 intrinsics to keep the builder target-independent. Any x86 lowering will be handled by a separate pass. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	c2163dc56a	swr/rast: Add autogen of helper llvm intrinsics. Replace sqrt, maskload, fp min/max, cttz, ctlz with llvm equivalent. Replace AVX maskedstore intrinsic with LLVM intrinsic. Add helper llvm macros for stacksave, stackrestore, popcnt. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	6427315e43	swr/rast: WIP builder rewrite. Start removing avx2 macros for functionality that exists in llvm. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	a16f8e0554	swr/rast: LLVM 6 fix for getting masked gather intrinsic (also compatible with LLVM 4) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	a92cc09c7a	swr/rast: Changes to allow jitter to compile with LLVM5 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	0f6fef9632	swr/rast: Add some archrast stats Add stats for degenerate and backfacing primitive counts Wire archrast stats for alpha blend and alpha test. pass value to jitter, upon return have archrast event increment a value Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	b488028854	swr/rast: Silence some unused variable warnings Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	e84bfec4ab	swr/rast: Add debug type info for i128 Help support debug info in 16 wide shaders. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	a3edcfe1fb	swr/rast: Use blend context struct to pass params Stuff parameters into a blend context struct before passing down through the PFN_BLEND_JIT_FUNC function pointer. Needed for stat changes. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	be6cf0fd7c	swr/rast: Introduce JIT_MEM_CLIENT Add assert for correct usage of memory accesses v2: reworded commit message; renamed enum more appropriately Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	d34edffe48	swr/rast: Add some instructions to jitter VPHADDD, PMAXUD, PMINUD Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
Juan A. Suarez Romero	4aa03581b5	docs: update calendar, add news and link release notes to 18.0.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-04-18 15:29:12 +00:00
Juan A. Suarez Romero	ad51d8871e	docs: add sha256 checksums for 18.0.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `a1c421c638`)	2018-04-18 15:25:32 +00:00
Juan A. Suarez Romero	76cadaa1de	docs: add release notes for 18.0.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `8bd719e3fa`)	2018-04-18 15:25:30 +00:00
Juan A. Suarez Romero	193d615917	docs: update calendar, add news and link release notes to 17.3.9 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-04-18 09:45:11 +00:00
Juan A. Suarez Romero	6372227209	docs: add sha256 checksums for 17.3.9 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `cf0864dc63`)	2018-04-18 09:40:44 +00:00
Juan A. Suarez Romero	6a1261bd09	docs: add release notes for 17.3.9 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `6d88ea9dd4`)	2018-04-18 09:40:42 +00:00
Dylan Baker	b9ad5282ba	Revert "meson: add wrap for libdrm" This reverts commit `6217eedc9b`. I was using this for testing and accidentally put it on master Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-17 13:48:55 -07:00
Dylan Baker	efcbcfa7c8	Revert "Add subprojects directory and git ignore" This reverts commit `21e2e73f71`. I was using this for testing and accidentally put it on master Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-17 13:48:43 -07:00
Jan Alexander Steffens (heftig)	5cf752b18b	meson: Version libMesaOpenCL like autotools does This is for parity with autotools. It names the library libMesaOpenCL.so.1.0.0 and points mesa.icd to the .1 symlink. opencl_version now matches configure.ac's OPENCL_VERSION. Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com> Tested-By: Aaron Watry <awatry@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-04-17 13:46:15 -07:00
Jan Alexander Steffens (heftig)	5bb98cfd92	meson: Add library versions to swr drivers This is for parity with autotools. Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2018-04-17 13:46:15 -07:00
Dylan Baker	6217eedc9b	meson: add wrap for libdrm Currently this requires libdrm from git, since the version reported by meson is wrong.	2018-04-17 13:46:15 -07:00
Dylan Baker	21e2e73f71	Add subprojects directory and git ignore For meson wraps.	2018-04-17 13:46:15 -07:00
Samuel Pitoiset	893e19efb7	radv: fix scissor computation when using half-pixel viewport offset 'scale[i]' can be non-integer. Original patch by Philip Rebohle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106074 Fixes: `0f3de89a56` ("radv: Use the guard band.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-17 22:12:14 +02:00
Neil Roberts	608d70bc02	spirv: Accept doubles in FaceForward, Reflect and Refract The SPIR-V spec doesn’t specify a size requirement for these and the equivalent functions in the GLSL spec have explicit alternatives for doubles. Refract is a little bit more complicated due to the fact that the final argument is always supposed to be a scalar 32- or 16- bit float regardless of the other operands. However in practice it seems there is a bug in glslang that makes it convert the argument to 64-bit if you actually try to pass it a 32-bit value while the other arguments are 64-bit. This adds an optional conversion of the final argument in order to support any type. These have been tested against the automatically generated tests of glsl-4.00/execution/built-in-functions using the ARB_gl_spirv branch which tests it with quite a large range of combinations. The issue with glslang has been filed here: https://github.com/KhronosGroup/glslang/issues/1279 v2: Convert the eta operand of Refract from any size in order to make it eventually cope with 16-bit floats. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-17 20:58:11 +02:00
Neil Roberts	6e499572b9	spirv: Add a 64-bit implementation of OpIsInf The only change necessary is to change the type of the constant used to compare against. This has been tested against the arb_gpu_shader_fp64/execution/ fs-isinf-dvec tests using the ARB_gl_spirv branch. v2: Use nir_imm_floatN_t for the constant. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-17 20:58:06 +02:00
Neil Roberts	696f4abcbc	spirv: Use nir_imm_floatN_t for constants for GLSL450 builtins There is an existing macro that is used to choose between either a float or a double immediate constant based on the bit size of the first operand to the builtin. This is now changed to use the new nir_imm_floatN_t helper function to reduce the number of places that make this decision. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-17 20:58:03 +02:00
Neil Roberts	e7b2c125c3	nir/builder: Add a nir_imm_floatN_t helper This lets you easily build float immediates just given the bit size. If we have this single place here to handle this then it will be easier to add support for 16-bit floats later. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-17 20:57:36 +02:00
Timothy Arceri	6e22ad6edc	nir: return early when lowering a return at the end of a function Otherwise we create unused conditional return flags and things get unnecessarily ugly fast when lowering nested functions. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-17 14:17:56 +10:00
Timothy Arceri	d3cafc18fc	mesa: merge the driver functions DrawBuffers and DrawBuffer The extra params were unused by the drivers that used DrawBuffers. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-17 14:17:48 +10:00
Marc Dietrich	268d8f244b	glsl: fix gcc 8 parenthesis warning fixes warnings like this: [184/1137] Compiling C++ object 'src/compiler/glsl/glsl@sta/lower_jumps.cpp.o'. In file included from ../src/mesa/main/mtypes.h:48, from ../src/compiler/glsl_types.h:149, from ../src/compiler/glsl/lower_jumps.cpp:59: ../src/compiler/glsl/lower_jumps.cpp: In member function '{anonymous}::block_record {anonymous}::ir_lower_jumps_visitor::visit_block(exec_list)': ../src/compiler/glsl/list.h:650:17: warning: unnecessary parentheses in declaration of 'node' [-Wparentheses] for (__type (__inst) = (__type *)(__list)->head_sentinel.next; \ ^ ../src/compiler/glsl/lower_jumps.cpp:510:7: note: in expansion of macro 'foreach_in_list' foreach_in_list(ir_instruction, node, list) { ^~~~~~~~~~~~~~~ Signed-off-by: Marc Dietrich <marvin24@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-17 11:53:59 +10:00
Rob Clark	2a55344e7d	compiler: int8/uint8 fixes A couple spots were missed for handling of the new INT8/UINT8 base type. Also de-duplicate get_base_type().. get_scalar_type() had nearly the same switch statement, with the exception that anything with base_type that was not scalar would return error_type. So just handle that one special case in get_scalar_type(). Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-16 20:41:18 -04:00
Marek Olšák	60299e9abe	radeonsi: don't emit partial flushes for internal CS flushes only Tested-by: Benedikt Schemmer <ben@besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-16 16:58:10 -04:00
Marek Olšák	692f550740	winsys/amdgpu: always set AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE There is a kernel patch that adds the new flag. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Benedikt Schemmer <ben@besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-16 16:58:10 -04:00
Marek Olšák	1b3199d14d	radeonsi: implement mechanism for IBs without partial flushes at the end (v6) (This patch doesn't enable the behavior. It will be enabled in a later commit.) Draw calls from multiple IBs can be executed in parallel. v2: do emit partial flushes on SI v3: invalidate all shader caches at the beginning of IBs v4: don't call si_emit_cache_flush in si_flush_gfx_cs if not needed, only do this for flushes invoked internally v5: empty IBs should wait for idle if the flush requires it v6: split the commit If we artificially limit the number of draw calls per IB to 5, we'll get a lot more IBs, leading to a lot more partial flushes. Let's see how the removal of partial flushes changes GPU utilization in that scenario: With partial flushes (time busy): CP: 99% SPI: 86% CB: 73% Without partial flushes (time busy): CP: 99% SPI: 93% CB: 81% Tested-by: Benedikt Schemmer <ben@besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-16 16:58:10 -04:00
Erico Nunes	d19b488339	nir: fix ir_binop_gequal glsl_to_nir conversion ir_binop_gequal needs to be converted to nir_op_sge when native integers are not supported in the driver. Otherwise it becomes no different than ir_binop_less after the conversion. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-16 07:59:25 -07:00
Jason Ekstrand	72ab499c9f	anv,radv: Drop XML workarounds for VK_ANDROID_native_buffer Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-16 07:59:25 -07:00
Jason Ekstrand	35ef0f767e	vulkan: Update the XML and headers to 1.1.73 Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-16 07:59:25 -07:00
Samuel Pitoiset	62510846b6	radv: clean up radv_decompress_resolve_subpass_src() To handle the source color image transitions in the same place. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:21:05 +02:00
Samuel Pitoiset	56a171a499	radv: don't fast-clear eliminate after resolving a subpass with compute That looks useless, and I think radv_handle_image_transition() will do a fast-clear eliminate because it's called after the resolve. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:21:02 +02:00
Samuel Pitoiset	7e84d69861	radv: handle CMASK/FMASK transitions only if DCC is disabled DCC implies a fast-clear eliminate, so I think this sounds reasonable. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:59 +02:00
Samuel Pitoiset	584d1f2711	radv: merge radv_handle_{dcc,cmask}_image_transition() functions Into radv_handle_color_image_transition(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:56 +02:00
Samuel Pitoiset	d5812b900b	radv: add radv_init_color_image_metadata() helper In order to separate initialization from decompression. In the future, that will allow us to init DCC/FMASK/CMASK in one shot. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:54 +02:00
Samuel Pitoiset	fde7b90ecf	radv: make radv_initialise_cmask() static Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:51 +02:00
Samuel Pitoiset	790f6e4718	radv: clean up radv_handle_image_transition() a bit Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:49 +02:00
Samuel Pitoiset	6967d32beb	radv: add radv_handle_color_image_transition() helper To handle CMASK, FMASK and DCC transitions in the same place. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:45 +02:00
Samuel Pitoiset	c6b1f1c97a	radv: handle DCC image transitions before CMASK/FMASK transitions Mostly because DCC implies a fast-clear eliminate and we should be able to skip some DCC decompressions by setting a predicate like for CMASK and FMASK. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:42 +02:00
Samuel Pitoiset	79c87a45b6	radv: disable prediction only if it has been enabled When decompressing DCC we don't enable it, so it's useless to disable it. This reduces the number of prediction packets sent to the GPU when performing color decompression passes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-16 14:20:39 +02:00
Bas Nieuwenhuizen	b0e3a9b19f	ac/nir: Make the GFX9 buffer size fix apply to image loads/atomics too. No clue how I missed those ... Fixes: `4503ff760c` "ac/nir: Add workaround for GFX9 buffer views." CC: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105320 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-16 11:55:48 +02:00
Brian Paul	6a519a157b	gallium/osmesa: link with winsock2 library on Windows To fix the MSVC build. The build broke because we started to compile the ddebug code on Windows after the mtypes.h changes. Building ddebug caused us to also use the u_network.c code for the first time. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-04-13 19:06:55 -06:00
Brian Paul	201c08c463	gallium/util: put (void) in a few function signatures To match the header file. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-04-13 19:06:55 -06:00
Brian Paul	65d1040435	ddebug: add PIPE_OS_UNIX/LINUX checks to fix MSVC build Don't include Unix headers or use Unix functions when building with MSVC. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-04-13 19:06:55 -06:00
Brian Paul	6d41edbf8a	mesa: protect #include of unistd.h with _MSC_VER check unistd.h is unix only. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-04-13 19:06:55 -06:00
Brian Paul	bf67fec235	mesa: remove unused 'i' in dimensions_error_check() Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-13 19:06:55 -06:00
Marek Olšák	976db661ff	radeonsi: restore si_emit_cache_flush call at the end of IBs Fixes: `918b798668` "radeonsi: make sure CP DMA is idle at the end of IBs"	2018-04-13 20:05:53 -04:00
Daniel Schürmann	f2c6a55061	radv: enable subgroup capabilities Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 01:03:15 +02:00
Daniel Schürmann	4b0616e533	ac: handle subgroup intrinsics Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 01:03:15 +02:00
Daniel Schürmann	d5f7ebda3e	ac: add LLVM build functions for subgroup intrinsics Co-authored-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 01:03:09 +02:00
Daniel Schürmann	d19f20e793	ac: make ballot and umsb capable of 64bit inputs Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 00:52:22 +02:00
Daniel Schürmann	79701b414c	nir: lower 64bit subgroup shuffle intrinsics Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 00:52:22 +02:00
Daniel Schürmann	fd5b0e0a64	nir/spirv: Fix warning and add missing breaks. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 00:52:22 +02:00
Daniel Schürmann	54937d820d	nir: use ballot_bit_size when lowering ballot_bitfield_extract Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 00:52:22 +02:00
Daniel Schürmann	4d802df3aa	nir: subgroups instructions for 64bit ballot sizes Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-14 00:52:22 +02:00
Brian Paul	1098c18af3	glsl: #undef THIS macro to fix MSVC build THIS is a macro in one of the MSVC header files. It's also a token in the GLSL lexer. This causes a compilation failure with MSVC. This issue seems to be newly exposed after the recent mtypes.h removal patches. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-04-13 13:53:12 -06:00
Brian Paul	5dc7233f44	glsl: rename 'interface' var to 'iface' to fix MSVC build The recent mtypes.h removal patches seem to have exposed an MSVC issue where 'interface' is defined as a macro in an MSVC header file. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-04-13 13:53:08 -06:00
Brian Paul	73f1e33d34	mesa: remove snprintf macro in imports.h to fix MSVC build snprintf is a macro in the MSVC stdio.h header and we needed to include that header before imports.h where we also defined an snprintf macro. Otherwise, the MSVC build would fail. The recent mtypes.h removal patches seem to have exposed this issue. This patch simply removes our snprintf macro and replaces one use of it in teximage.c with _mesa_snprintf(). There are other calls to snprintf() in DRI drivers, but none of them are built on Windows. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-04-13 13:52:57 -06:00
Lionel Landwerlin	0a6547014f	anv: fix number of planes for depth & stencil We're not counting correctly with depth & stencil images. Additionally we need to move an assert that is meant just for color attachments. v2: Move an assert() (Reported by Craig) Change aspect mask checks (Francesco) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `a62a979335` ("anv: enable multiple planes per image/imageView") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105994 Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-04-13 11:44:53 -07:00
Marek Olšák	6ff0c6f4eb	gallium: move ddebug, noop, rbug, trace to auxiliary to improve build times which also simplifies the build scripts.	2018-04-13 14:08:14 -04:00
Marek Olšák	918b798668	radeonsi: make sure CP DMA is idle at the end of IBs	2018-04-13 14:07:20 -04:00
Marek Olšák	b6ad7075b9	gallium/hud: add a simple HUD view that only draws text Add this prefix to the env var: "simple," For example: GALLIUM_HUD=simple,fps The X coordinates are the same, but the Y coordinates are different, because there is only text. '+' happens to behave the same as "\n". ',' happens to behave the same as "\n\n".	2018-04-13 14:07:20 -04:00
Dylan Baker	506671594a	mesa: Include unistd.h in program_lexer Which was previously provided implicitly by mtypes.h CC: Marek Olšák <marek.olsak@amd.com> CC: Mark Janes <mark.a.janes@intel.com> Fixes: `43d66c8c2d` ("mesa: include mtypes.h less") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-04-13 11:03:37 -07:00
Marek Olšák	9a1363427e	radeonsi: always prefetch later shaders after the draw packet so that the draw is started as soon as possible. v2: only prefetch the API VS and VBO descriptors Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	e4b7974ec7	radeonsi: emit shader pointers before cache flushes & waits This code was written with the constant engine in mind. We can simplify it now. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	82799c5035	radeonsi/gfx9: don't use the workaround for gather4 + stencil it doesn't seem to be needed. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	1372ccfe6f	radeonsi: disable TC-compat HTILE on Tonga and Iceland Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	afe0bd2c55	radeonsi: force 2D tiling on VI only when TC-compat HTILE is really enabled just pass the flag that indicates it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	29a09e1d38	radeonsi: don't flush HTILE if there is no HTILE clear Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	5fb31a1734	radeonsi: merge 2 identical if statements in si_clear and other cleanups Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	8a28679987	radeonsi: don't do GFX-specific texture decompression for compute Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	307bccc6df	radeonsi: simplify generating the renderer string HAVE_LLVM > 0 is a tautology. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Marek Olšák	a3b785be4d	winsys/amdgpu: allow local BOs on APUs Local BOs ignore BO priorities, and we don't need those on APUs. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Juan A. Suarez Romero	b37b35a5d2	getteximage: assume texture image is empty for non defined levels The current code returns INVALID_OPERATION when calling GetTextureImage() on a level that has not been explicitly defined. That is, if we define a mipmapped Texture2D with 3 levels and call GetTextureImage() for the 4th level, INVALID_OPERATION is returned. Nevertheless, this case is not listed as an error in the OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"), where all the error cases for this function are defined. So it seems this is a valid operation. On the other hand, in section 8.22 ("Texture State and Proxy State") it states: "Each initial texture image is null. It has zero width, height, and depth, internal format RGBA, or R8 for buffer textures, component sizes set to zero and component types set to NONE, the compressed flag set to FALSE, a zero compressed size, and the bound buffer object name is zero." We can assume that we are reading this initialized empty image when calling GetTextureImage() with a non defined level. With this assumption, we will reach one of the other error cases defined for the functions. In the end this means that we would end up returning INVALID_VALUE to the caller. This fixes arb_get_texture_sub_image piglit tests. v2: just return INVALID_VALUE if there is no defined level (Iago) Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-04-13 17:47:37 +02:00
Juan A. Suarez Romero	8d411eb6b3	gettextureimage: verify cube map is complete According to OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"), relative to errors for GetTexImage, GetTextureImage, and GetnTexImage: "An INVALID_OPERATION error is generated by GetTextureImage if the effective target is TEXTURE_CUBE_MAP or TEXTURE_CUBE_MAP_ARRAY, and the texture object is not cube complete or cube array complete, respectively." This fixes arb_get_texture_sub_image piglit tests. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-04-13 17:47:27 +02:00
Juan A. Suarez Romero	42891dbaa1	gettextsubimage: verify zoffset and depth are correct According to OpenGL 4.6 spec, section 8.11.4 ("Texture Image Queries"), relative to errors for GetTextureSubImage() function: "An INVALID_VALUE error is generated if the effective target is TEXTURE_1D and either yoffset is not zero, or height is not one. An INVALID_VALUE error is generated if the effective target is TEXTURE_1D, TEXTURE_1D_ARRAY, TEXTURE_2D or TEXTURE_RECTANGLE, and either zoffset is not zero, or depth is not one." The commit fixes the check for height and depth. This fixes arb_get_texture_sub_image piglit tests. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-04-13 17:47:27 +02:00
Timothy Arceri	a63e69f5f0	mesa: free debug messages when destroying the debug state Fixes: `04a8baad37` "mesa: refactor _mesa_PopDebugGroup and _mesa_free_errors_data" Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98281	2018-04-13 22:20:48 +10:00
Timothy Arceri	c500ab2735	mesa: fix x86 builds Fixes: `43d66c8c2d` "mesa: include mtypes.h less"	2018-04-13 22:13:46 +10:00
Marek Olšák	e961824ba8	Fix make check	2018-04-12 20:03:13 -04:00
Marek Olšák	6d6b1b3890	Fix scons build	2018-04-12 19:55:01 -04:00
Marek Olšák	43d66c8c2d	mesa: include mtypes.h less - remove mtypes.h from most header files - add main/menums.h for often used definitions - remove main/core.h v2: fix radv build Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-12 19:31:30 -04:00
Marek Olšák	57f4268da4	mesa: include dispatch.h less Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-12 19:31:28 -04:00
Bas Nieuwenhuizen	6ff98dbf7c	radv: Implement VK_EXT_vertex_attribute_divisor. Pretty straight forward, just pass the divisors through the shader key and then do a LLVM divide. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-12 22:57:23 +02:00
Bas Nieuwenhuizen	7eff8d7d35	ac/surface: Allow S swizzle for displayable surfaces. For dcn1 && < 64 bpp displayable surfaces, addrlib only accepts S swizzles. At the same time addrlib prefers D swizzles if allowed, so we can just allow S swizzles as fallback. Fixes: `b64b712558` "ac/surface/gfx9: request desired micro tile mode explicitly" Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-12 21:24:55 +02:00
Eric Anholt	7bc77dbb00	broadcom/vc5: Fix a stray '`' in a comment.	2018-04-12 11:20:50 -07:00
Eric Anholt	b225cdcecc	broadcom/vc5: Update the UABI for in/out syncobjs This is the ABI I'm hoping to stabilize for merging the driver. seqnos are eliminated, which allows for the GPU scheduler to task-switch between DRM fds even after submission to the kernel. In/out sync objects are introduced, to allow the Android fencing extension (not yet implemented, but should be trivial), and to also allow the driver to tell the kernel to not start a bin until a previous render is complete.	2018-04-12 11:20:50 -07:00
Eric Anholt	d9c525ed22	broadcom/vc5: Drop the finished_seqno optimization. With the DRM scheduler changes, I'm about to remove all seqnos from the UABI.	2018-04-12 11:20:50 -07:00
Eric Anholt	aedfd8ede4	broadcom/vc5: Drop the throttling code. Since I'll be using the DRM scheduler, we won't run into the problem of a runaway client starving other clients of GPU time.	2018-04-12 11:20:50 -07:00
Eric Anholt	dd9c476165	broadcom/vc5: Move flush_last_load into load_general, like for stores. This should avoid mistakes with not flushing as we change the series of loads. Already, it fixes a hopefully unreachable case where we were emitting just the TILE_COORDINATES and not the dummy store that needs to go with it.	2018-04-12 11:20:50 -07:00
Eric Anholt	6a21a582fb	broadcom/vc5: Rename read_but_not_cleared to loads_pending. This is a more obvious name for what the variable means, and matches what it's called for stores.	2018-04-12 11:20:50 -07:00
Eric Anholt	b946218c48	broadcom/vc5: Refactor the implicit coords/stores_pending logic. Since I just fixed a bug due to forgetting to do these right, do it once in the helper func.	2018-04-12 11:20:50 -07:00
Eric Anholt	ec60559f97	broadcom/vc5: Emit missing TILE_COORDINATES_IMPLICIT in separate z/s stores. Fixes a simulator assertion failure in KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8	2018-04-12 11:20:50 -07:00
Eric Anholt	8f2999120d	broadcom/vc5: Add checks that we don't try to do raw Z+S load/stores. This was dying in the simulator on GTF-GLES3.gtf.GL3Tests.packed_depth_stencil.packed_depth_stencil_blit. We'll need to do basically the same thing as Z32F/S8 does in the MSAA Z24S8 case.	2018-04-12 11:20:50 -07:00
Eric Anholt	7553cbfc9d	broadcom/vc5: Fix MSAA depth/stencil size setup. The v3dX(get_internal_type_bpp_for_output_format)() call only handles color output formats (which overlap in enum numbers with depth output formats), so for depth we just need to take the normal cpp times the number of samples.	2018-04-12 11:20:50 -07:00
Leo Liu	fa328456e8	st/va: add VP9 config to enable profile2 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	dac0024b58	radeonsi: use PIPE_FORMAT_P016 format for VP9 profile2 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	f1277dabbc	radeon/vcn: add VP9 profile2 support Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	e8724bd1e3	vl: add VP9 profile2 support Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	d9a31341ec	st/va: add VP9 config to enable profile0 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	ef52ba8aa0	st/va: parse VP9 uncompressed frame header To get some of UVD required parameters. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	bf0f5fe929	st/va: add slice parameter handling for VP9 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	05176fe65e	st/va: add picture parameter handling for VP9 Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	9ff83d13e5	st/va: add handles for VP9 buffers Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	30438fbf46	st/va: add VP9 picture to context Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	0f373a65e5	radeonsi: cap VP9 support to progressive buffer Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	6adaf6de6d	radeonsi: cap VP9 support to Raven Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	905368669d	radeon/vcn: add VP9 context buffer Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	e2ce7c0a62	radeon/vcn: get VP9 msg buffer Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	6000bdb75b	radeon/vcn: fill probability table to prob buffers Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	93c0f3cc13	radeon/vcn: add VP9 message buffer interface Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:13 -04:00
Leo Liu	caaecf3d3b	radeon/vcn: add VP9 prob table buffer Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:12 -04:00
Leo Liu	b628ea039f	vl: add VP9 probability tables Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:12 -04:00
Leo Liu	eb22785bd8	radeon/vcn: add VP9 dpb buffer size The current FW has restricted the size to the worst case. New dynamic dpb buffer support is on the way from the firmware side, and we will change accordingly. Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:12 -04:00
Leo Liu	f73befdd9b	radeon/vcn: add VP9 stream type for decoder Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:12 -04:00
Leo Liu	ca1646db89	vl: add VP9 picture description Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:12 -04:00
Leo Liu	29bc354684	vl: add VP9 profile0 and format Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-04-12 11:15:12 -04:00
Samuel Pitoiset	9eac49246c	radv: fix radv_layout_dcc_compressed() when image doesn't have DCC num_dcc_levels means that DCC is supported, but this doesn't mean that it's enabled by the driver. Instead, we should rely on radv_image_has_dcc(). This fixes some multisample regressions since `0babc8e5d6` ("radv: fix picking the method for resolve subpass") on Vega. This is because the resolve method changed from HW to FS, but those failures are totally unexpected, so there might be some differences between Polaris and Vega here. Fixes: `44fcf58744` ("radv: Disable DCC for GENERAL layout and compute transfer dest.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-12 09:58:46 +02:00
Samuel Pitoiset	ab0e625a67	radv: add radv_decompress_resolve_{subpass}_src() helpers This helper shares common code before resolving using either a fragment or a compute shader. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-12 09:58:44 +02:00
Samuel Pitoiset	ed93d90a67	radv: add radv_init_dcc_control_reg() helper And add some comments. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-12 09:58:41 +02:00
Timothy Arceri	c7e3d31b0b	glsl: fix compat shaders in GLSL 1.40 The compatibility and core tokens were not added until GLSL 1.50, for GLSL 1.40 just assume all shaders built with a compat profile are compat shaders. Fixes rendering issues in Dawn of War II on radeonsi which has enabled OpenGL 3.1 compat support. Fixes: `a0c8b49284` "mesa: enable OpenGL 3.1 with ARB_compatibility" Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105807	2018-04-12 11:51:08 +10:00
Ian Romanick	f3b14ca2e1	mesa: Silence remaining unused parameter warnings in teximage.c src/mesa/main/teximage.c: In function ‘_mesa_test_proxy_teximage’: src/mesa/main/teximage.c:1301:51: warning: unused parameter ‘level’ [-Wunused-parameter] GLuint numLevels, GLint level, ^~~~~ src/mesa/main/teximage.c: In function ‘texsubimage_error_check’: src/mesa/main/teximage.c:2186:30: warning: unused parameter ‘dsa’ [-Wunused-parameter] bool dsa, const char *callerName) ^~~ src/mesa/main/teximage.c: In function ‘copytexture_error_check’: src/mesa/main/teximage.c:2297:32: warning: unused parameter ‘width’ [-Wunused-parameter] GLint width, GLint height, GLint border ) ^~~~~ src/mesa/main/teximage.c:2297:45: warning: unused parameter ‘height’ [-Wunused-parameter] GLint width, GLint height, GLint border ) ^~~~~~ src/mesa/main/teximage.c: In function ‘check_rtt_cb’: src/mesa/main/teximage.c:2679:21: warning: unused parameter ‘key’ [-Wunused-parameter] check_rtt_cb(GLuint key, void *data, void *userData) ^~~ src/mesa/main/teximage.c: In function ‘override_internal_format’: src/mesa/main/teximage.c:2756:55: warning: unused parameter ‘width’ [-Wunused-parameter] override_internal_format(GLenum internalFormat, GLint width, GLint height) ^~~~~ src/mesa/main/teximage.c:2756:68: warning: unused parameter ‘height’ [-Wunused-parameter] override_internal_format(GLenum internalFormat, GLint width, GLint height) ^~~~~~ src/mesa/main/teximage.c: In function ‘texture_sub_image’: src/mesa/main/teximage.c:3293:24: warning: unused parameter ‘dsa’ [-Wunused-parameter] bool dsa) ^~~ src/mesa/main/teximage.c: In function ‘can_avoid_reallocation’: src/mesa/main/teximage.c:3788:53: warning: unused parameter ‘x’ [-Wunused-parameter] mesa_format texFormat, GLint x, GLint y, GLsizei width, ^ src/mesa/main/teximage.c:3788:62: warning: unused parameter ‘y’ [-Wunused-parameter] mesa_format texFormat, GLint x, GLint y, GLsizei width, ^ src/mesa/main/teximage.c: In function ‘valid_texstorage_ms_parameters’: 
src/mesa/main/teximage.c:5987:40: warning: unused parameter ‘samples’ [-Wunused-parameter] GLsizei samples, unsigned dims) ^~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-11 16:20:56 -07:00
Ian Romanick	fa44941072	mesa: Silence unused parameter warning in compressedteximage_only_format Passing ctx to compressedteximage_only_format was the only use of the ctx parameter in _mesa_format_no_online_compression, so that parameter had to go too. ../../SOURCE/master/src/mesa/main/teximage.c: In function ‘compressedteximage_only_format’: ../../SOURCE/master/src/mesa/main/teximage.c:1355:57: warning: unused parameter ‘ctx’ [-Wunused-parameter] compressedteximage_only_format(const struct gl_context *ctx, GLenum format) ^~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-11 16:20:42 -07:00
Nanley Chery	377da9eb78	blorp: Silence unused function warnings vulkan/genX_blorp_exec.c:69:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function] blorp_get_surface_base_address(struct blorp_batch *batch) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from vulkan/genX_blorp_exec.c:35:0: ./blorp/blorp_genX_exec.h:1249:1: warning: ‘blorp_emit_memcpy’ defined but not used [-Wunused-function] blorp_emit_memcpy(struct blorp_batch *batch, ^~~~~~~~~~~~~~~~~ genX_blorp_exec.c:99:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function] blorp_get_surface_base_address(struct blorp_batch *batch) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from genX_blorp_exec.c:33:0: ../../../../../src/intel/blorp/blorp_genX_exec.h:1249:1: warning: ‘blorp_emit_memcpy’ defined but not used [-Wunused-function] blorp_emit_memcpy(struct blorp_batch *batch, ^~~~~~~~~~~~~~~~~ Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-04-11 13:04:49 -07:00
Caio Marcelo de Oliveira Filho	89542c9ce6	nir/vars_to_ssa: Simplify node matching code The matching code doesn't make real use of the return value. The main function's return value is ignored, and while the worker function propagates its return value, the actual callback never returns false. v2: Style fixes. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-11 11:05:05 -07:00
Caio Marcelo de Oliveira Filho	fac9dd1b93	nir/vars_to_ssa: Remove an unnecessary deref_arry_type check Only fully-qualified direct derefs, collected in direct_deref_nodes, are checked for aliasing, so it is already known up front that they have only array derefs of type direct. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-11 11:05:05 -07:00
Caio Marcelo de Oliveira Filho	1c9bccdeb8	nir/vars_to_ssa: Rework register_variable_uses() The return value was needed to make use of the old nir_foreach_block helper, but not needed anymore with the macro version. Then go one step further and move the foreach directly into the register variable uses function. v2: Move foreach to register_variable_uses(). (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-11 11:05:05 -07:00
Jason Ekstrand	bc2b170d68	nir: Use nir_builder in lower_io_to_temporaries Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-04-11 11:03:22 -07:00
Bas Nieuwenhuizen	bd95397d65	radv: Enable RB+ on Raven. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-11 18:46:55 +02:00
Tapani Pälli	9f29b1a4c8	vulkan: fix build issue on android (both anv/radv) Fixes linking errors against: anv_GetPhysicalDeviceImageFormatProperties2KHR radv_GetPhysicalDeviceImageFormatProperties2KHR Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-11 13:55:49 +03:00
Nicolai Hähnle	41e6ffee49	radeonsi: correctly parse disassembly with labels LLVM now emits labels as part of the disassembly string, which is very useful but breaks the old parsing approach. Use the semicolon to detect the boundary of instructions instead of going by line breaks. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-11 12:44:30 +02:00
Nicolai Hähnle	0630e52c9e	radeonsi: pass -O halt_waves to umr for hang debugging This will give us meaningful wave information in the case of a hang where shaders are still running in an infinite loop. Note that we call umr multiple times for different sections of the ddebug hang dump, and so the wave information will not necessarily match up between sections. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-11 12:44:24 +02:00
Jason Ekstrand	69f447553c	vulkan: Drop vk_android_native_buffer.xml All the information in vk_android_native_buffer.xml is now in vk.xml. The only exception is the extension type attribute which we can work around in the generators while we wait for the XML to be fixed. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-04-10 19:29:49 -07:00
Jason Ekstrand	ae3a856c34	nir/lower_atomics: Rework the main walker loop a bit This replaces some "if (...) { }" with "if (...) continue;" to reduce nesting depth and makes nir_metadata_preserve conditional on progress for the given impl. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-04-10 19:28:49 -07:00
Bas Nieuwenhuizen	ed94638156	radv: Enable RB+ where possible. According to Marek, not enabling it on Stoney has a significant negative performance impact. (And I guess this might impact performance on Raven as well) The register settings are pretty much copied from radeonsi. I did not put this in the pipeline as that would make the pipeline more dependent on the format, which means we would have to have more pipelines for the meta shaders. v2: Don't clear RB+ regs if not enabled as the CLEAR_STATE packet does already. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-11 01:19:10 +02:00
Topi Pohjolainen	5d895a1f37	nir: Check if u_vector_init() succeeds However, it only fails when running out of memory. Now, if we are about to check that, we should be consistent and check the allocation of the worklist as well. CID: 1433512 Fixes: `edb18564c7` nir: Initial implementation of a nir_instr_worklist Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-04-11 01:49:56 +03:00
Topi Pohjolainen	98d3874754	mesa: Assert base format before truncating to unsigned short CID: 1433709 Fixes: `ca721b3d8`: mesa: use GLenum16 in a few more places Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-04-11 01:49:56 +03:00
Topi Pohjolainen	26f48fe010	intel/dev: Assert the number of slices is not zero Fixes: `c1900f5b` intel: devinfo: add helper functions to fill... CID: 1433511 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-04-11 01:49:56 +03:00
Kenneth Graunke	8960903c90	i965: Remove brw_bo_alloc_tiled_2d from intel_detect_swizzling. I'd like to drop this pre-isl function. This drops one of the two uses. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-04-10 15:31:31 -07:00
Timothy Arceri	a05faf80c3	mesa: fix glsl version mismatch in compat profile Drivers that only support compat 3.0 were reporting GLSL 1.40 support. This fixes issues with the menu of Dawn of War II. Fixes: `a0c8b49284` "mesa: enable OpenGL 3.1 with ARB_compatibility" Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105807	2018-04-11 08:05:19 +10:00
Samuel Pitoiset	0babc8e5d6	radv: fix picking the method for resolve subpass The source and destination image parameters were swapped. No CTS changes on Polaris10, but I suspect this might fix something. Fixes: `2a04f5481d` ("radv/meta: select resolve paths") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-10 21:55:28 +02:00
Samuel Pitoiset	9f6a28eb27	radv: add shader BOs to the list at pipeline bind time Otherwise, the shader BOs are not added to the list on SI because prefetching isn't supported. Calling radv_cs_add_buffer() in the prefetch codepath was a bad idea. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105952 Fixes: `4ad7595f35` ("radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Turo Lamminen <turo@alternativegames.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-10 21:55:28 +02:00
Marek Olšák	e29facff31	ac/surface: don't set the display flag for obviously unsupported cases (v2) This enables the tile swizzle for some cases of the displayable micro mode, and it also fixes an addrlib assertion failure on Vega. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-04-10 13:06:03 -04:00
Marek Olšák	19ce5048ee	radeonsi: add shader binary padding for UMR	2018-04-10 13:05:20 -04:00
Marek Olšák	b64b712558	ac/surface/gfx9: request desired micro tile mode explicitly Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-10 12:44:41 -04:00
Emil Velikov	5dd02123a0	docs/release-calendar: update to include 18.1 and 18.2 Dylan has kindly stepped up to help with 18.1.0, while I've taken the liberty to nominate Andres for 18.2.0 ;-) As always, people are welcome to swap/adjust where needed. v2: Add Juan for 18.0.x (Juan) Cc: Andres Gomez <agomez@igalia.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Acked-by: Dylan Baker <dylan@pnwbakers.com> (v1) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-10 16:08:54 +01:00
Emil Velikov	8eceac9de7	glsl: remove unreachable assert() Earlier commit enforced that we'll bail out if the number of terminators is different than 2. With that in mind, the assert() will never trigger. Fixes: `56b867395d` ("glsl: fix infinite loop caused by bug in loop unrolling pass") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-10 16:04:50 +01:00
Juan A. Suarez Romero	0d0ef8ae33	spirv: autotools: add vtn_gather_types_c.py in distribution tarball Fixes: `042ee4bea2` ("spirv: Move SPIR-V building to Makefile.spirv.am and spirv/meson.build") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-10 10:37:46 +02:00
Juan A. Suarez Romero	15ed757834	radeonsi: autotools: add si_build_pm4.h in dist tarball Fixes: `5777488406` ("radeonsi: move r600_cs.h contents into si_pipe.h, si_build_pm4.h") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-10 10:33:28 +02:00
Bas Nieuwenhuizen	4381be4648	ac/nir: Use an array instead of hashtable for SSA defs. Saves about 2% of compile time for F1 2017, as well as reduce code size of an optimized libvulkan_radeon.so by about 1 KiB. This still keeps the hashtable, as we also stored blocks in there. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-10 09:53:16 +02:00
Timothy Arceri	6066f08ee9	st/mesa: finalise tcs/tes/geom NIR before storing it to the cache We don't create variants of the NIR so here we finalise it before caching to avoid unnecessary processing when restoring it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 15:10:16 +10:00
Timothy Arceri	bc71e20993	st/mesa: exit st_translate_fragment_program() earlier for NIR path This avoids a bunch of scanning that is only used by the TGSI path. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 15:10:16 +10:00
Timothy Arceri	494a5c3501	radeonsi/nir: tidy up si_nir_load_sampler_desc() This makes it easier to follow the code, and also initialises dynamic_index which will be useful for adding bindless textures support. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 14:43:45 +10:00
Timothy Arceri	d7cbe795ed	radeonsi/nir: set uses_bindless_images for images V2: add missing intrinsics (Spotted-by: Samuel Pitoiset) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 14:43:45 +10:00
Timothy Arceri	74b3fc2ce0	nir: dont lower bindless samplers We need to skip the var if it's not a uniform here as well as checking the bindless flag since UBOs can contain bindless samplers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 14:43:45 +10:00
Timothy Arceri	bd4cc54c8b	st/glsl_to_nir: set paramater value offset as driver location for packed uniforms This allows us to simplify the code and will also be useful for supporting bindless textures. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 14:43:45 +10:00
Timothy Arceri	222d862cd3	radeonsi/nir: don't add bindless samplers/images to declared bitmasks Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 14:43:45 +10:00
Timothy Arceri	f33d9036b9	st/mesa: stop calling _mesa_init_shader_object_functions() This sets the LinkShader function for the driver, but for the st we set it properly with the following call to st_init_program_functions(). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-10 14:43:45 +10:00
Jason Ekstrand	c3f9d5c235	anv/pipeline: Lower more constant initializers earlier Once we've gotten rid of everything but the main entrypoint, there's no reason why we shouldn't go ahead and lower them all. This is what radv does and it will make future work easier. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-04-09 19:45:25 -07:00
Jason Ekstrand	14e0a222d9	spirv: Use the LOCAL_GROUP_SIZE system value Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-04-09 19:45:25 -07:00
Jason Ekstrand	131d454c35	nir/lower_system_values: Support SYSTEM_VALUE_LOCAL_GROUP_SIZE Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-04-09 19:45:25 -07:00
Lionel Landwerlin	f3353e53db	intel: aubinator: print out addresses of invalid instructions Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-04-10 00:58:38 +01:00
Bas Nieuwenhuizen	41fbcc7901	radv: Always reset draw user SGPRs after secondary command buffer. As we sometimes reset them to -1, -1 does not mean that they are not written by the secondary command buffer. Fixes: `ad11fc3571` "radv: don't emit unneeded vertex state." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-09 23:04:42 +02:00
Bas Nieuwenhuizen	74b0b869dd	radv: Don't set instance count using predication. The packet can sometimes be skipped, but we still think the change takes effect. This just makes the packet always take effect. Fixes: `ad11fc3571` "radv: don't emit unneeded vertex state." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105942 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-09 23:04:35 +02:00
Rob Clark	d66dc34316	mesa/st/nir: fix instruction removal At one point this kinda worked (or at least didn't cause problems). But with deref-instructions it results in dangling deref instructions not being properly removed. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-09 15:36:21 -04:00
Rob Clark	becf2d1fac	mesa/st/nir: fix naked lowering pass call Not using the macro means no nir_validate in debug builds, resulting in problems showing up only after later passes. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-09 15:36:21 -04:00
Rob Clark	c4457113e9	nir: add comment about nir_src_copy() So it is more clear about when to use nir_instr_rewrite_src() Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-09 15:36:21 -04:00
Nanley Chery	1d94aa1987	i965: Make the miptree clear color setter take a gl_color_union We want to hide the internal details of how the miptree's clear color is calculated. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-09 10:56:48 -07:00
Nanley Chery	3dbb49a978	i965/miptree: Move the clear color and value setter implementations These will get more complex in later commits. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-09 10:56:48 -07:00
Nanley Chery	1ce7ae391e	i965: Use the brw_context for the clear color and value setters Do what all the other functions in the miptree API do. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-09 10:56:48 -07:00
Bas Vermeulen	c63bef15fc	radeonsi: convert dispatch packet to little endian The parameters for the compute engine are wrong when using an E8860 on a big endian machine. To fix this, convert the contents of struct dispatch_packet to little endian. This ensures that get_global_id(0) and similar functions in the OpenCL code get the correct endian values, and makes my simple OpenCL program work correctly. Signed-off-by: Bas Vermeulen <bas@daedalean.ai> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-04-09 13:47:52 -04:00
Bas Vermeulen	be628e4749	radeonsi: correct si_vgt_param_key on big endian machines Using mesa OpenCL failed on a big endian PowerPC machine because si_vgt_param_key is using bitfields and a 32 bit int for an index into an array. Fix si_vgt_param_key to work correctly on both little endian and big endian machines. Signed-off-by: Bas Vermeulen <bas@daedalean.ai> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-04-09 13:42:30 -04:00
Marek Olšák	f33e4482b3	radeonsi: don't set RB+ registers on GFX9 chips without RB+ CLEAR_STATE initializes them properly. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-09 13:40:25 -04:00
Emil Velikov	ea2536cd26	etnaviv: meson: add etnaviv_query_pm.[ch] to the sources Otherwise building the driver will fail with unresolved symbols. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105960 Fixes: `72d2043be0` ("etnaviv: add perfmon query implementation") Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Clayton Craft <clayton.a.craft@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-04-09 19:09:24 +02:00
Xiong, James	f23b45dce3	i965: return the fourcc saved in __DRIimage when possible When creating an image from a texture, the image's dri_format is set to the first plane's format, and used to look up the fourcc. e.g. for FOURCC_NV12 texture, the dri_format is set to __DRI_IMAGE_FORMAT_R8, we end up with a wrong entry in function intel_lookup_fourcc(): { __DRI_IMAGE_FOURCC_R8, __DRI_IMAGE_COMPONENTS_R, 1, { { 0, 0, 0, __DRI_IMAGE_FORMAT_R8, 1 }, } }, instead of the correct one: { __DRI_IMAGE_FOURCC_NV12, __DRI_IMAGE_COMPONENTS_Y_UV, 2, { { 0, 0, 0, __DRI_IMAGE_FORMAT_R8, 1 }, { 1, 1, 1, __DRI_IMAGE_FORMAT_GR88, 2 } } }, as a result, a wrong fourcc __DRI_IMAGE_FOURCC_R8 was returned. To fix this bug, the image inherits the texture's planar_format that has the original fourcc; Upon querying, if planar_format is set, return the saved fourcc; Otherwise fall back to the old way. v3: add a bug description and "cc mesa-stable" tag (Jason) remove redundant null pointer check (Tapani) squash 2 patches into one (James) v2: fall back to intel_lookup_fourcc() when planar_format is NULL (Dongwon & Matt Roper) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Xiong, James <james.xiong@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-04-09 18:16:59 +03:00
Bastien Orivel	42c2f5b579	nir: Fix a typo in src/compiler/Makefile.nir.am Since `31d91f019b`, the makefile tries to find the file SConstript.spirv instead of SConscript.spirv which breaks the make dist command. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-09 08:32:45 -06:00
Samuel Pitoiset	04e609f1f8	radv: fix prefetching of vertex shader and VBOs on SI Forgot one check... Too many mistakes for a simple change. Fixes: `f1d7c16e85` ("radv: fix prefetching compute shaders on CIK and older chips") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 16:14:12 +02:00
Samuel Pitoiset	56a4d03b0c	radv: implement VK_AMD_shader_core_properties Simple extension that only returns information for AMD hw. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 14:28:13 +02:00
Samuel Pitoiset	466aba9fa2	radv: add RADV_NUM_PHYSICAL_VGPRS constant Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 14:28:13 +02:00
Samuel Pitoiset	2f7bb93146	radv: add radv_get_num_physical_sgprs() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 14:28:13 +02:00
Samuel Pitoiset	b30dec738a	vulkan: Update the XML and headers to 1.1.72 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 14:28:13 +02:00
Andres Gomez	a055f5108d	docs: properly escape characters Signed-off-by: Andres Gomez <agomez@igalia.com>	2018-04-09 13:47:40 +03:00
Andres Gomez	7cf3932098	mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage Fixes: `03fd6704db` ("mesa: Add support for a new override string MESA_GLES_VERSION_OVERRIDE") Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-09 13:47:40 +03:00
Marek Olšák	806ab42c0f	mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override v2: - Provide a correct explanation on the envvars documentation (Ian). - Provide a more correct explanation on the function comments (Andres). v3: - Homogenize documentation and inline comments (Emil). - Correct a typo (Emil). Fixes: `2599b92eb9` ("mesa: allow forcing >=3.1 compatibility contexts with MESA_GL_VERSION_OVERRIDE") Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Ian Romanick <ian.d.romanick@intel.com> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Cc: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-09 13:47:40 +03:00
Andres Gomez	c6067fcd07	dri_util: don't fail when not supporting ARB_compatibility with GL3.1 Currently, any driver that does not support the ARB_compatibility extension will fail on GL3.1 context creation if the application does not request the forward-compatibility flag. Restore the original check which changes mesa_api to API_OPENGL_CORE, only when: - GL3.1 is requested, without the forward-compatibility flag. - driver does not support ARB_compatibility - as deduced by max_gl_compat_version. Fixes: `a0c8b49284` ("mesa: enable OpenGL 3.1 with ARB_compatibility") v2: - Improve commit log (Emil). - Provide a correct explanation on the features documentation (Ian). Cc: Marek Olšák <marek.olsak@amd.com> Cc: Ian Romanick <ian.d.romanick@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Cc: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-09 13:46:34 +03:00
Andres Gomez	044acd3569	dri_util: when overriding, always reset the core version This way we won't fail when validating just because we may have a non-overridden core version that is lower than the requested one, even when the compat version is high enough. For example, running glcts from VK-GL-CTS with i965, this will succeed: $ MESA_GL_VERSION_OVERRIDE=4.6 ./glcts --deqp-case=KHR-GL46.info.vendor While, this will fail: $ MESA_GL_VERSION_OVERRIDE=4.6COMPAT ./glcts --deqp-case=KHR-GL46.info.vendor Fixes: `464c56d3d5` ("dri_util: Use _mesa_override_gl_version_contextless") Cc: Ian Romanick <ian.d.romanick@intel.com> Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-04-09 13:18:16 +03:00
Samuel Pitoiset	b0f8ad189c	radv: add radv_image_is_tc_compat_htile() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:26 +02:00
Samuel Pitoiset	95d5ad80e9	radv: add radv_use_dcc_for_image() helper And add some TODOs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:24 +02:00
Samuel Pitoiset	fab5fe4284	radv: rename radv_image_is_tc_compat_htile() ... to radv_use_tc_compat_htile_for_image(). This function name makes more sense to me because we want to know if and only if TC-compat HTILE should be used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:21 +02:00
Samuel Pitoiset	2692736cee	radv: simplify a check in radv_initialise_color_surface() If the image has FMASK metadata, the number of samples is > 1 because radv_image_can_enable_fmask() handles that already. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:16 +02:00
Samuel Pitoiset	ed41e776d0	radv: clean up radv_vi_dcc_enabled() And rename to radv_dcc_enabled() to be consistent. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:14 +02:00
Samuel Pitoiset	e213f19907	radv: clean up radv_htile_enabled() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:12 +02:00
Samuel Pitoiset	0fc9113ac5	radv: add radv_image_has_{cmask,fmask,dcc,htile}() helpers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:10 +02:00
Samuel Pitoiset	32f5174ce8	radv: add radv_get_cmask_fast_clear_value() helper DCC for MSAA textures are currently unsupported but that will be used later on. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:08 +02:00
Samuel Pitoiset	f882c62218	radv: add radv_clear_{cmask,dcc} helpers They will help for DCC MSAA textures and if we support mipmaps in the future. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-09 11:21:05 +02:00
Axel Davy	d899826733	st/nine: Do not use scratch for face register Scratch registers are reused every instruction. Since vFace is reused, a new temporary register should be used. Fixes: https://github.com/iXit/Mesa-3D/issues/311 Signed-off-by: Axel Davy <davyaxel0@gmail.com> CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>	2018-04-08 22:49:43 +02:00
Christian Gmeiner	9e80273693	etnaviv: expose perfmon query groups Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:23:45 +02:00
Christian Gmeiner	c320b158f5	etnaviv: add query_group_info for perfmon counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:23:38 +02:00
Christian Gmeiner	5a3b744ed2	etnaviv: assign group_ids to perfmon queries Prep work for AMD_performance_monitor support. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:23:34 +02:00
Christian Gmeiner	4020fa3e08	etnaviv: support MC performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:21:40 +02:00
Christian Gmeiner	3c3f936ae1	etnaviv: support TX performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:21:12 +02:00
Christian Gmeiner	f380ce13f0	etnaviv: support RA performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:21:04 +02:00
Christian Gmeiner	3af0e228e5	etnaviv: support SE performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:20:50 +02:00
Christian Gmeiner	9ae86c1306	etnaviv: support PA performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:20:46 +02:00
Christian Gmeiner	69bebe06e3	etnaviv: support SH performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:20:42 +02:00
Christian Gmeiner	1f603402f6	etnaviv: support PE performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:20:37 +02:00
Christian Gmeiner	d0bed0b494	etnaviv: support HI performance counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:20:32 +02:00
Christian Gmeiner	72d2043be0	etnaviv: add perfmon query implementation Add needed infrastructure to use performance monitor requests for queries. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com>	2018-04-08 22:20:25 +02:00
Christian Gmeiner	7e3dba301e	etnaviv: sw queries: return correct number of groups Fixes: `3d912bd742` ("etnaviv: add query_group_info for sw counters") Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-04-08 22:13:04 +02:00
Lucas Stach	208891650b	etnaviv: advertise YUV formats as external only We only support importing YUV as OES external resources. This will change in the future, but for now this fixes the advertised capabilities in eglQueryDmaBufModifiersEXT. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-04-08 22:11:46 +02:00
Lucas Stach	dfe4a08ccd	gallium/util: implement util_format_is_yuv This adds a helper to check if a pipe format is in YUV color space. Drivers want to know about this, as YUV mostly needs special handling. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-04-08 22:10:57 +02:00
Rhys Perry	19254a977b	nvc0: finish implementation of PIPE_QUERY_SO_OVERFLOW_PREDICATE This also removes some useless code leftover from old changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-07 16:45:00 -04:00
Rhys Perry	14cc8c55ea	nvc0: change ACQUIRE_EQUAL to ACQUIRE_GEQUAL in nvc0_hw_query_fifo_wait If a fence is created in between nvc0_hw_end_query and nvc0_hw_query_fifo_wait, the sequence number in nvc0->screen->fence.bo can be larger than hq->fence->sequence before the semaphore is created, resulting in the semaphore never being triggered. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-07 16:45:00 -04:00
Rhys Perry	98d15e0550	nvc0: ensure the query's fence has been emitted in nvc0_hw_query_fifo_wait If the fence has not been emitted, hq->fence->sequence would be zero. This would result in the semaphore never being triggered, blocking all later commands in the pushbuf. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> [imirkin: use nouveau_fence_emit instead] Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-07 16:45:00 -04:00
Ilia Mirkin	90bb2d7152	st/mesa: tex offsets can't be in a const or 2d-indexed All consts are now implicitly 2d (they set .Dimension), so trigger asserts. Also, the texture offset can't handle any sort of 2d indexing. While this could be tacked on, this seems unnecessary, just move it off into a separate temp. Fixes assertion failure in tests/spec/arb_gpu_shader5/compiler/builtin-functions/fs-gatherOffset-uniform-offset.frag Note that this was an issue even before the const-always-2d thing, since there was no detection of when even a proper second dimension was used, e.g. for UBO or geom/tess inputs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-07 16:45:00 -04:00
Ilia Mirkin	2a2b22e9b1	nvc0: restore image binding on RGB10A2, remove from BGR10A2 Fixes a bunch of new CTS pbo tests that use those as an output format, which the state tracker converts into buffer image writes. No part of the driver is ready for BGR10A2. It could probably be enabled on Maxwell+, but seems unnecessary. This error was introduced when flipping the displayable bit on those formats, which accidentally also moved the image bit. Fixes: `e1a70aed10` (nv50,nvc0: mark ABGR format as displayable instead of ARGB format) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-07 16:45:00 -04:00
Rob Clark	684f7cd7e3	freedreno/ir3: use lower_global_vars_to_local in cmdline compiler tgsi_to_nir emits things with arrays as global vars.. and nir->ir3 does lower_locals_to_regs. But nothing was lowering global to local, which breaks compiling tgsi shaders Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-04-07 11:33:41 -04:00
Kenneth Graunke	a3782a612f	i965: Use %x instead of %u in debug print. I mistakenly printed out the address as 0x<decimal number> instead of printing a proper hex number. This was...surprising.	2018-04-06 22:57:48 -07:00
Dylan Baker	b5f92b6fd4	meson: fix warnings about comparing unlike types In the old days (0.42.x), when mesa's meson system was written the recommendation for handling conditional dependencies was to define them as empty lists. When meson would evaluate the dependencies of a target it would recursively flatten all of the arguments, and empty lists would be removed. There are some problems with this, among them that lists and dependencies have different methods (namely .found()), so the recommendation changed to use `dependency('', required : false)` for such cases. This has the advantage of providing a .found() method, so there is no need to do things like `dep_foo != [] and dep_foo.found()`, such a dependency should never exist. I've tested this with 0.42 (the minimum we claim to support) and 0.45. On 0.45 this removes warnings about comparing unlike types, such as: meson.build:1337: WARNING: Trying to compare values of different types (DependencyHolder, list) using !=. v2: - Use dependency('', required : false) instead of declare_dependency(), the latter will always report that it is found, which is not what we want. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2018-04-06 15:29:53 -07:00
Ian Romanick	81ed629b38	intel/compiler: Explicitly cast register type in switch brw_reg::type is "enum brw_reg_type type:4". For whatever reason, GCC is treating this as an int instead of an enum. As a result, it doesn't detect missing switch cases and it doesn't detect that flow can get out of the switch. This silences the warning: src/intel/compiler/brw_reg.h: In function ‘bool brw_regs_negative_equal(const brw_reg, const brw_reg)’: src/intel/compiler/brw_reg.h:305:1: warning: control reaches end of non-void function [-Wreturn-type] } ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-06 15:22:10 -07:00
Axel Davy	39240926cd	st/nine: Declare lighting consts for ff shaders The lighting constants were not declared previously, but were accessed with indirect addressing, which is illegal. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=105442 Signed-off-by: Axel Davy <davyaxel0@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>	2018-04-06 23:34:31 +02:00
Caio Marcelo de Oliveira Filho	67c728f7a9	nir: rename variables in nir_lower_io_to_temporaries for clarity In the emit_copies() function, the use of "newv" and "temp" names made sense when only copies from temporaries to the new variables were being done. But now there are other calls to copy with other pairings, and "temp" doesn't always refer to a temporary created in this pass. Use the names "dest" and "src" instead. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-06 11:08:08 -07:00
Samuel Pitoiset	8f9f62c2db	radv: don't pass the pipeline to radv_flush_constants() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 19:46:27 +02:00
Samuel Pitoiset	2bd50cceff	radv: rename radv_cmd_buffer_update_vertex_descriptors() ... to radv_flush_vertex_descriptors(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-04-06 19:46:23 +02:00
Samuel Pitoiset	e829a0cc1e	radv: do not try to skip draw calls when VBOs upload failed This is unnecessary because we record an error which should be returned by vkEndCommandBuffer(), and the app shouldn't submit a command buffer when this happens. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 19:46:21 +02:00
Samuel Pitoiset	f1d7c16e85	radv: fix prefetching compute shaders on CIK and older chips Because the check was moved to radv_emit_prefetch_L2(). Fixes: `4ad7595f35` ("radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2()") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 19:46:18 +02:00
Samuel Pitoiset	7fe586f6fb	radv: only enable PERFECT_ZPASS_COUNTS for precision occlusion queries This is unnecessary when the precision bit flag is not set, and this might hurt performance. The Vulkan spec explains that not setting VK_QUERY_CONTROL_PRECISE_BIT might be more efficient on some implementations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 09:07:34 +02:00
Samuel Pitoiset	d53dff3bfc	radv: enable the Polaris small primitive filter control Enable it directly in the preamble, but do not enable line on Polaris10/11/12 because there is a hw bug. There is possibly an issue when MSAA is off, but this doesn't regress any CTS and AMDVLK doesn't have a workaround as well. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 09:07:31 +02:00
Jason Ekstrand	c5b87c94d8	anv: Add WSI support for the I915_FORMAT_MOD_Y_TILED_CCS v2 (Jason Ekstrand): - Return the correct enum values from anv_layout_to_fast_clear_type v3 (Jason Ekstrand): - Always return ANV_FAST_CLEAR_NONE and leave doing the right thing for the patch which adds a modifier which supports fast-clears. Reviewed-by: Daniel Stone <daniels@collabora.com> Tested-by: Daniel Stone <daniels@collabora.com> Acked-by: Nanley Chery <nanley.g.chery@intel.com>	2018-04-05 21:17:02 -07:00
Anuj Phogat	ff8b82666a	Add more Coffee Lake brand strings Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-04-05 14:50:11 -07:00
Jan Vesely	2406e8848e	radeonsi: Reorder checks in si_check_render_feedback si_get_total_colormask accesses NULL pointer on compute shaders Fixes crashes on clover Fixes: `0669dca9c0` ("radeonsi: skip DCC render feedback checking if color writes are disabled") CC: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-05 17:11:18 -04:00
Kevin Rogovin	cc41603d6d	intel/tools: new intel_sanitize_gpu tool Adds a new debug tool to pad each GEM BO allocated with (weak) pseudo-random noise values which are then checked after each batchbuffer dispatch to the kernel. This can be quite valuable to find difficult to track down heisenberg style bugs. [scott.d.phillips@intel.com: split to separate tool] v2: (by Scott D Phillips) - track gem handles per fd (Kevin) - remove handles on GEM_CLOSE (Kevin) - ignore prime handles - meson & shell script v3: (by Scott D Phillips) - don't track prime bos at all (Kevin) - protect the hash table with a mutex (Kevin) - hook fds by drm_version.name, not path (Chris Wilson) Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com> Reviewed-by: Kevin Rogovin <kevin.rogovin@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-04-05 13:52:49 -07:00
Jason Ekstrand	e85b95269e	prog/nir: Simplify some load/store operations Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-05 13:20:39 -07:00
Marek Olšák	c7dd59b06d	radeonsi: fix a crash if ps_shader.cso is NULL in si_get_total_colormask	2018-04-05 15:53:52 -04:00
Marek Olšák	be4250aa88	radeonsi: remove more R600 references Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	c0dfc0c6df	radeonsi: try to fix android Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	f55d1f806e	radeonsi: try to fix meson This is not fully tested. Meson can't link LLVM even though automake can. PATH=/usr/llvm/x86_64-linux-gnu/bin:$PATH meson build/ -Dgallium-va=false \ -Dplatforms=x11,drm -Dgallium-drivers=radeonsi -Ddri-drivers= \ -Dgallium-omx=disabled -Dgallium-xvmc=false -Dgles1=false \ -Dtexture-float=true -Dvulkan-drivers= src/gallium/auxiliary/libgallium.a(gallivm_lp_bld_misc.cpp.o): (.data.rel.ro._ZTI26DelegatingJITMemoryManager[_ZTI26DelegatingJITMemoryManager]+0x10): undefined reference to `typeinfo for llvm::RTDyldMemoryManager' Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	38faac43e3	radeonsi: don't build libradeon.la separately for better parallelism Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	f9323ddbb9	radeonsi: clean up GET_MAX_VIEWPORT_RANGE definition Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	6a93441295	radeonsi: remove r600_common_context Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	5f77361d2e	radeonsi: remove r600_pipe_common::screen Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	321bd6c280	radeonsi: move r600_buffer_common.c and r600_texture.c into radeonsi Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	d58080b318	radeonsi: move r600_gpu_load.c to si_gpu_load.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	f7f4ba5306	radeonsi: move r600_query.c/h files to si_query.c/h Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	5777488406	radeonsi: move r600_cs.h contents into si_pipe.h, si_build_pm4.h Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	eced536ed6	radeonsi: rename query definitions R600_ -> SI_ Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	72e9e98076	radeonsi: move and rename R600_ERR out of r600_pipe_common.h Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	076afb4f0e	radeonsi: rename a few R600/r600_ -> SI_/si_ Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	5f1cddde78	radeonsi: move definitions out of r600_pipe_common.h Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	a67ee02388	radeonsi: move functions out of and remove r600_pipe_common.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	90d12f1d77	radeonsi: rename r600 -> si in some places Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	50c7aa6756	radeonsi: use si_context instead of pipe_context in parameters pt3 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	e332ba61f4	radeonsi: use si_context instead of pipe_context in parameters pt2 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	c424f86180	radeonsi: use si_context instead of pipe_context in parameters pt1 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	2a62e5eec9	radeonsi: pass sctx to si_rebind_buffer and clean up Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	605ba1b9ae	radeonsi: use r600_common_context less pt7 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	0b2f2a6a18	radeonsi: use r600_common_context less pt6 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	4c5efc40f4	radeonsi: update copyrights Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	95bc30275b	radeonsi: switch radeon_add_to_buffer_list parameter to si_context Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	e5053060eb	radeonsi: use r600_common_context less pt5 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	884fd97f6b	radeonsi: use r600_common_context less pt4 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	a8291a23c5	radeonsi: use r600_common_context less pt3 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	3069cb8b78	radeonsi: use r600_common_context less pt2 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	71d9028b7a	radeonsi: use r600_common_context less pt1 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	0606190059	radeonsi: don't use r600_common_context in si_emit_cache_flush Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	3de323f9bb	radeonsi: switch r600_atom::emit parameter to si_context Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	2b70dd8c8a	radeonsi: flatten / remove struct r600_ring Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	f7de8686de	radeonsi: remove r600_ring::flush callback Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	4598ad6a00	radeonsi: make radeon_add_to_buffer_list_check_mem be gfx-only Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	426ef367f3	radeonsi: add_to_buffer_list functions can return void Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	c0987d8adf	radeonsi: move saved_cs functions from r600_pipe_common.c to si_debug.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	37ef4765ff	radeonsi: move DMA CS functions from r600_pipe_common.c to si_dma_cs.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	19f550f1d2	radeonsi: move EOP event code from r600_pipe_common.c to si_fence.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	fc6a44e169	radeonsi: rename si_hw_context.c -> si_gfx_cs.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	42500d1dab	radeonsi: move si_destroy_saved_cs to si_debug.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	02a61e71a2	radeonsi: rename si_begin_new_cs -> si_begin_new_gfx_cs Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	fa09388704	radeonsi: rename si_need_cs_space -> si_need_gfx_cs_space Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	85e75b2da5	radeonsi: remove r600_pipe_common::blit_decompress_depth Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	e04389cc2a	radeonsi: remove r600_pipe_common::decompress_dcc Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	9d7f809c03	radeonsi: remove r600_pipe_common::invalidate_buffer Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	898500c440	radeonsi: remove r600_pipe_common::rebind_buffer Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	fbf1bf9b8f	radeonsi: remove r600_common_context::set_occlusion_query_state and remove unused old_enable parameter. Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	5ed8b54ffe	radeonsi: remove r600_pipe_common::save_qbo_state Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	72842d15ac	radeonsi: remove unused query code The get_size perf counter callback is also inlined and removed. Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	3f55fe99d6	radeonsi: use num_cs_dw_queries_suspend Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	54f28359b5	radeonsi: remove r600_pipe_common::need_gfx_cs_space Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	0447e8e59e	radeonsi: remove r600_pipe_common::set_atom_dirty Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	5c125ab1ba	radeonsi: remove r600_pipe_common::check_vm_faults Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	17e8f1608e	radeonsi: call CS flush functions directly whenever possible Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	0669dca9c0	radeonsi: skip DCC render feedback checking if color writes are disabled	2018-04-05 15:34:58 -04:00
Dylan Baker	6ac87c1769	meson: fix megadriver symlinking Which should be relative instead of absolute. Fixes: `f7f1b30f81` ("meson: extend install_megadrivers script to handle symmlinking") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105567 Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-05 10:48:38 -07:00
Dylan Baker	19dbed6477	meson: Set .so version for xa like autotools does Fixes: `0ba909f0f1` ("meson: build gallium xa state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-04-05 10:46:14 -07:00
Rafael Antognolli	7728720f07	anv: Make blorp update the clear color. Instead of updating the clear color in anv before a resolve, just let blorp handle that for us during fast clears. v5: Update comment about HiZ clear color (Jordan). Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	e8cadb673d	anv: Use clear address for HiZ fast clears too. Store the default clear address for HiZ fast clears on a global bo, and point to it when needed. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	021e1885d0	anv: Emit the fast clear color address, instead of value. On Gen10+, instead of copying the clear color from the state buffer to the surface state, just use the address of the state buffer in the surface state directly. This way we can avoid the copy from state buffer to surface state. v4: - Remove use_clear_address from anv code. (Jason) - Use the helper to extract clear color from attachment (Jason) Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	3f96b459f4	anv: Add a helper to extract clear color from the attachment. Extract the code from color_attachment_compute_aux_usage, so we can later reuse it to update the clear color state buffer. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	7987d041fd	i965/surface_state: Emit the clear color address instead of value. On Gen10, when emitting the surface state, use the value stored in the clear color entry buffer by using a clear color address in the surface state. v4: Use the clear color offset from the clear_color_bo, when available. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	2efe8309d3	i965/blorp: Update the fast clear value buffer. On Gen10, whenever we do a fast clear, blorp will update the clear color state buffer for us, as long as we set the clear color address correctly. However, on a hiz clear, if the surface is already on the fast clear state we skip the actual fast clear operation and, before gen10, only updated the miptree. On gen10+ we need to update the clear value state buffer too, since blorp will not be doing a fast clear and updating it for us. v4: - do not use clear_value_size in the for loop - Get the address of the clear color from the aux buffer or the clear_color_bo, depending on which one is available. - let core blorp update the clear color, but also update it when we skip a fast clear depth. v5: Better subject (Jordan). v6: Remove outdated comment (Jason). Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	5449f942f2	i965: Add aux_buf variable to simplify code. In a follow up patch, we make use of clear_color_bo, which is in mt->mcs_buf or mt->hiz_buf. To avoid duplicating more code that does the same thing on both aux buffers, just use aux_buf already. v5: Add aux_buf to brw_wm_surface_state too. v6: Drop aux_surf and use aux_buf->surf instead (Jason). Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	8735c86ce0	i965/miptree: Add new clear color BO for winsys aux buffers Add an extra BO to store clear color when we receive the aux buffer from the window system. Since we have no control over the aux buffer size in this case, we need the new BO to store only the clear color. v5: - Better subject (Jordan). - Drop alignment from brw_bo_alloc(). Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	ab633c2d61	i965/miptree: Add space to store the clear value in the aux surface. Similarly to vulkan where we store the clear value in the aux surface, we can do the same in GL. v2: Remove unneeded extra function. v3: Use clear_value_state_size instead of clear_value_size. v4: - rename to clear_color_state_size - store clear_color_bo and clear_color_offset in the aux buf struct v5: Unreference clear color bo (Jordan) Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	14260e7c60	intel/blorp: Update clear color state buffer during fast clears. We always want to update the fast clear color during a fast clear on i965. On anv, we are doing that before a resolve, but by adding support to blorp, we can do a similar thing and update it during a fast clear instead. The goal is to remove some code from anv that does such update, and centralize everything in blorp, hopefully removing a lot of code duplication. It also allows us to have a similar behavior on gen < 9 and gen >= 10. v5: s/we/we are/ (Jordan) Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	92eb5bbc68	intel/blorp: Only copy clear color when doing a resolve. We only need to copy the clear color from the state buffer to the inlined surface state when doing a resolve. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	188a473b9a	intel/blorp: Add support for fast clear address. On gen10+, if surface->clear_color_addr is present, use it directly instead of copying it to the surface state. v4: Remove redundant #if clause for GEN <= 10 (Jason) v5: Move flush after the reloc, and keep lower bits (Topi). Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	b8f45cf967	intel/isl: Add support to emit clear value address. gen10 can emit the clear color by setting it on a buffer somewhere, and then adding only the address to the surface state. This commit adds support for that on isl_surf_fill_state, and if that is requested, skips setting the clear value itself. v2: Add assert to make sure we are at least on gen10. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	94675edcfd	intel: Use Clear Color struct size. The size of the clear color struct (expected by the hardware) is 8 dwords (isl_dev.ss.clear_value_state_size here). But we still need to track the size of the clear color, used when memcopying it to/from the state buffer. For that we keep isl_dev.ss.clear_value_size. v4: - Add struct to gen11 too (Jason, Jordan) - Add field for Converted Clear Color to gen11 (Jason) - Add clear_color_state_offset to differentiate from clear_value_offset. - Fix all the places where clear_value_size was used. v5 (Jason): - Split genxml changes to another commit. - Remove unnecessary gen checks. - Bring back missing offset increment to init_fast_clear_color(). v6 (Jason): - On init_fast_clear_color, change: addr.offset += 4 => sdi.Address.offset += i * 4 - Use GEN_GEN instead of GEN_VERSIONx10. [jordan.l.justen@intel.com: isl_device_init changes] Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	f77789a3f0	intel/genxml: Add Clear Color struct to gen10+. v5: Split genxml changes into its own commit (Jason). Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	7e616ae201	intel/genxml: Use a single field for clear color address on gen10. genxml does not support having two address fields with different names but same position in the state struct. Both "Clear Color Address" and "Clear Depth Address Low" mean the same thing, only for different surface types. To work around this genxml limitation, rename "Clear Color Address" to "Clear Value Address" and use it for both color and depth. Do the same for the high bits. TODO: add support for multiple addresses at the same position in the xml. v2: Combine high and low order bits into a single address field. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	8e1f2e1d2d	genxml: Preserve fields that share dword space with addresses. Some instructions contain fields that are either an address or a value of some type based on the content of other fields, such as clear color values vs address. That works fine if these fields are in the less significant dword, the lower 32 bits of the address, because they get OR'ed with the address. But if they are in the higher 32 bits, they get discarded. On Gen10 we have fields that share space with the higher 16 bits of the address too. This commit makes sure those fields don't get discarded. v5: Remove spurious whitespace (Jason). Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-05 07:42:45 -07:00
Rafael Antognolli	f421a31637	anv/image: Do not override lower bits of dword. The lower bits seem to have extra fields in every platform but gen8 (even though we don't use them in gen9). So just go ahead and avoid using them for the address. v4: Use Jason's suggestion for comment explaining the change. v5: Fix aux_address comment in anv_private.h (Jason) Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-04-05 07:42:45 -07:00
Samuel Pitoiset	942fdfe357	radv: implement a fast prefetch path for the vertex stage This allows draws to start as soon as possible. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-05 10:03:48 +02:00
Samuel Pitoiset	4ad7595f35	radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-05 10:03:45 +02:00
Samuel Pitoiset	a8a696a38f	radv: use a mask for VBOs and shaders prefetching Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-05 10:03:42 +02:00
Marek Olšák	8cd58df2f2	gallium/pp: fix MLAA shaders Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99549	2018-04-04 20:01:43 -04:00
Marek Olšák	096942be2c	gallium/pp: use user constant buffers This fixes a radeonsi crash. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105026	2018-04-04 20:01:43 -04:00
Marek Olšák	d9dc26c94e	st/mesa: set stencil border color the same as intensity This fixes some stencil border color tests on Vega and Raven chips. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-04-04 16:55:52 -04:00
Jon Turney	498d9d0f4d	Fix use of alloca() without #include <c99_alloca.h> Fix use of alloca() without #include <c99_alloca.h> in `1da345e5` vbo/vbo_context.c: In function '_vbo_draw_indirect': vbo/vbo_context.c:284:34: error: implicit declaration of function 'alloca' [-Werror=implicit-function-declaration] struct _mesa_prim *space = alloca(draw_count*sizeof(struct _mesa_prim)); ^~~~~~ vbo/vbo_context.c:284:34: warning: initialization makes pointer from integer without a cast [-Wint-conversion] Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-04-04 14:34:07 +01:00
Samuel Pitoiset	922cd38172	radv: implement out-of-order rasterization when it's safe on VI+ Disabled by default for now, it can be enabled with RADV_PERFTEST=outoforder. No CTS regressions on Polaris, and all Vulkan games I tested look good as well. Expect small performance improvements for applications where out-of-order rasterization can be enabled by the driver. Loosely based on RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	d6709c91a6	radv: change blend_enable field to use four bits per CB Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	a8818d1af2	radv: scan which color blend attachments are enabled With cb_target_enabled_4bit in order to have four bits per CB. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	ac456d0d1b	radv: put more fields in radv_blend_state Some will be used for further optimizations (ie. out-of-order rast). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	e4976ca33b	radv: do not always disable dual quad mode when chip has RbPlus For GFX9+ only, RadeonSI does this too. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	b8c06a961c	radv: don't use the SPI barrier management bug workaround Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Samuel Pitoiset	ab147cba77	radv: mask out high VM address bits in registers where needed Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-04 13:32:00 +02:00
Lionel Landwerlin	1beb80cb56	intel: compiler: silence compiler warning ../src/intel/compiler/brw_reg.h: In function ‘bool brw_regs_negative_equal(const brw_reg, const brw_reg)’: ../src/intel/compiler/brw_reg.h:305:1: warning: control reaches end of non-void function [-Wreturn-type] Introduced by `8f83eea71e` ("i965: Add negative_equals methods"). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-04-04 11:57:39 +01:00
Iago Toral Quiroga	41ac0b1443	compiler/spirv: set is_shadow for depth comparitor sampling opcodes From the SPIR-V spec, OpTypeImage: "Depth is whether or not this image is a depth image. (Note that whether or not depth comparisons are actually done is a property of the sampling opcode, not of this type declaration.)" The sampling opcodes that specify depth comparisons are OpImageSample{Proj}Dref{Explicit,Implicit}Lod, so we should set is_shadow only for these (we were using the depth property of the image until now). v2: - Do the same for OpImageDrefGather. - Set is_shadow to false if the sampling opcode is not one of these (Jason) - Reuse an existing switch statement instead of adding a new one (Jason) Fixes crashes in: dEQP-VK.spirv_assembly.instruction.graphics.image_sampler.depth_property.* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org	2018-04-04 07:57:58 +02:00
Sergii Romantsov	98b860e311	i965: Extend the negative 32-bit deltas to 64-bits Gen8+ uses 48-bit address relocations, so we need to sign-extend the delta to the 64-bit return value. Without it, the higher bits are zeroed and the negative values are lost. Haswell and older use 32-bit deltas so are unaffected by this issue. v2: used int32_t function parameter instead of explicit type conversion. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101408 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Tested-by: Andriy Khulap <andriy.khulap@globallogic.com> Tested-by: Stuart Young <cefiar@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "18.0 17.3" <mesa-stable@lists.freedesktop.org>	2018-04-03 22:48:09 -07:00
Jason Ekstrand	800df942ea	nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination Otherwise we may end up trying to coalesce in a case such as ssa_1 = fadd r1, r2 r3.x = fneg(r2); r3 = vec4(ssa_1, ssa_1.y, ...) and that would cause us to move the writes to r3 from the vec to the fadd which would re-order them with respect to the write from the fneg. In order to solve this, we just don't coalesce if the destination of the vec is not SSA. We could try to get clever and still coalesce if there are no writes to the destination of the vec between the vec and the ALU source. However, since registers only come from phi webs and indirects, the chances of having a vec with a register destination that is actually coalescable into its source is very slim. Shader-db results on Haswell: total instructions in shared programs: 13657906 -> 13659101 (<.01%) instructions in affected programs: 149291 -> 150486 (0.80%) helped: 0 HURT: 592 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105440 Fixes: `2458ea95c5` "nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible" Reported-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Tested-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-04-03 22:21:23 -07:00
Kevin Strasser	5bbde9b80f	anv: Fix close(fd) before import issue in vkCreateDmaBufImageINTEL If we close the fd before calling DRM_IOCTL_PRIME_FD_TO_HANDLE the kernel will hit a -EBADF error. Move the close(fd) call to the end of anv_CreateDmaBufImageINTEL(). Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-03 18:33:17 -07:00
Timothy Arceri	b42633db8e	glsl: always call do_lower_jumps() after loop unrolling This fixes a bug in radeonsi where LLVM cannot handle the case where a break exists but it's not the last instruction in the block. LLVM would fail with: Terminator found in the middle of a basic block! LLVM ERROR: Broken function found, compilation aborted! Fixes: `96fe8834f5` "glsl_to_tgsi: do fewer optimizations with GLSLOptimizeConservatively" Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105317	2018-04-04 08:40:16 +10:00
James Legg	a58fdc61e9	vulkan/wsi/wayland: fix leaks Fixes: `bfa22266cd` ("vulkan/wsi/wayland: Add support for zwp_dmabuf") Reviewed-by: Daniel Stone <daniels@collabora.com> CC: Jason Ekstrand <jason@jlekstrand.net>	2018-04-03 22:09:57 +01:00
Juan A. Suarez Romero	06076ead28	docs: update calendar, add news and link release notes to 17.3.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-04-03 17:38:36 +00:00
Juan A. Suarez Romero	ca71b7bab8	docs: add sha256 checksums for 17.3.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `ba371c7262`)	2018-04-03 17:34:16 +00:00
Juan A. Suarez Romero	d89ef8ce62	docs: add release notes for 17.3.8 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `3bf5c10c5c`)	2018-04-03 17:34:16 +00:00
Jakob Bornecrantz	88e958257c	st/mesa: Also use PIPE_FORMAT_R8G8B8A8_SRGB for framebuffer_sRGB. When running virgl on a GLES host the only sRGB formats that support rendering are RGBA and RGBX. That pipe format is in the sRGB default lists that the state tracker uses when mapping mesa formats. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-04-03 17:48:52 +01:00
Lionel Landwerlin	78c18d99dc	intel: gen-decoder: print all dword a field belongs to Prior to printing a decoded field, print out all dwords that field belongs to. In particular with address fields spanning multiple dwords, we want to have all the dwords presented before the field is decoded to make it easier to read. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-04-03 16:55:53 +01:00
Lionel Landwerlin	4d59127213	intel: genxml: decode variable length MI_LRI MI_LOAD_REGISTER_IMM can load multiple (register, value) tuples in one command. In our drivers we only use one tuple at a time, but the kernel might load more than one at a time. Instead of making all the tuples part of a group, we leave out the first tuple (the one we use in the generated packing structures). This is particularly useful for looking at error stats generated by the kernel. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-04-03 16:55:53 +01:00
Lionel Landwerlin	2841af6238	intel: gen-decoder: don't decode fields beyond a dword length For example, a PIPE_CONTROL with DWordLength = 2 should look like this : 0xffffe374: 0x7a000002: PIPE_CONTROL 0xffffe374: 0x7a000002 : Dword 0 DWord Length: 2 0xffffe378: 0x00800000 : Dword 1 Depth Cache Flush Enable: false Stall At Pixel Scoreboard: false State Cache Invalidation Enable: false Constant Cache Invalidation Enable: false VF Cache Invalidation Enable: false DC Flush Enable: false Pipe Control Flush Enable: false Notify Enable: false Indirect State Pointers Disable: false Texture Cache Invalidation Enable: false Instruction Cache Invalidate Enable: false Render Target Cache Flush Enable: false Depth Stall Enable: false Post Sync Operation: 0 (No Write) Generic Media State Clear: false TLB Invalidate: false Global Snapshot Count Reset: false Command Streamer Stall Enable: false Store Data Index: 0 LRI Post Sync Operation: 1 (MMIO Write Immediate Data) Destination Address Type: 0 (PPGTT) Flush LLC: false 0xffffe37c: 0x00000000 : Dword 2 Address: 0x00000000 0xffffe384: 0x05000000: MI_BATCH_BUFFER_END Prior to this change, fields beyond the length of the command would be decoded (notice the MI_BATCH_BUFFER_END decoded as part of the previous PIPE_CONTROL) : 0xffffe374: 0x7a000002: PIPE_CONTROL 0xffffe374: 0x7a000002 : Dword 0 DWord Length: 2 0xffffe378: 0x00800000 : Dword 1 Depth Cache Flush Enable: false Stall At Pixel Scoreboard: false State Cache Invalidation Enable: false Constant Cache Invalidation Enable: false VF Cache Invalidation Enable: false DC Flush Enable: false Pipe Control Flush Enable: false Notify Enable: false Indirect State Pointers Disable: false Texture Cache Invalidation Enable: false Instruction Cache Invalidate Enable: false Render Target Cache Flush Enable: false Depth Stall Enable: false Post Sync Operation: 0 (No Write) Generic Media State Clear: false TLB Invalidate: false Global Snapshot Count Reset: false Command Streamer Stall Enable: false Store Data Index: 0 LRI Post Sync Operation: 1 (MMIO Write Immediate Data) Destination Address Type: 0 (PPGTT) Flush LLC: false 0xffffe37c: 0x00000000 : Dword 2 Address: 0x00000000 0xffffe380: 0x00000000 : Dword 3 0xffffe384: 0x05000000 : Dword 4 Immediate Data: 83886080 0xffffe384: 0x05000000: MI_BATCH_BUFFER_END Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-04-03 16:55:53 +01:00
Lionel Landwerlin	81375516b2	intel: error_decode: add an option to decode all buffers The kernel reports workaround batch buffers, but we're not presenting them currently. Although they might not be useful for debugging purely userspace driver issues, when problems arise because of interactions between kernel & userspace drivers, it's nice to be able to decode them. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-04-03 16:55:53 +01:00
Lionel Landwerlin	b3aa18dfd6	intel: genxml: add preemption control instructions Helpful to debug kernel workaround batchbuffers. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-04-03 16:55:53 +01:00
Dylan Baker	6f6e711c72	mesa: ensure that variable is initialized This variable controls whether we link using the glsl code path or the spirv path. It's set when we validate that all shaders are glsl or spirv, but if there are no shaders attached to the program it will remain unset, resulting in undefined behavior. We want to go down the glsl path in that case, so initialize to false. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105820 Fixes: `16f6634e7f` ("mesa/program: Link SPIR-V shaders using the SPIR-V code-path") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-04-03 08:47:59 -07:00
Marek Olšák	d3e96b1063	radeonsi/gfx9: fix bad LLVM params in monolithic LS+HS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-03 11:07:28 -04:00
Samuel Pitoiset	acf60abc54	radv: enable VK_EXT_shader_viewport_index_layer The driver already supports exporting the Layer and ViewportIndex built-ins from vertex or tessellation shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-03 14:05:46 +02:00
Rob Clark	51888bf07d	nir+drivers: add helpers to get # of src/dest components Add helpers to get the number of src/dest components for an intrinsic, and update spots that were open-coding this logic to use the helpers instead. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-04-03 06:08:56 -04:00
Rob Clark	91f9450b32	freedreno/ir3: fix fallout of unused false-depth elimination Since we were using the MARK flag both for preventing loops and for tracking whether instructions were used, we could end up in an infinite loop due to `bd2ca2bcdd`. Instead invert the logic: mark all instructions UNUSED up front and clear the flag as we visit them. Fixes: `bd2ca2bcdd` freedreno/ir3: eliminate unused false-deps Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-04-03 06:08:56 -04:00
Timothy Arceri	7e9b7ec094	gallium/pipebuffer: fix parenthesis location Without this the return value will never get set to -1. This was first added in `49866c8f34` and copied in `2b396eeed9`. Fixes: `2b396eeed9` "gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager" Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102342	2018-04-03 16:05:59 +10:00
Tapani Pälli	6b21391729	Revert "mesa: add GL_HALF_FLOAT as supported type to readpixels" This reverts commit `41cf30b8bc`. Commit caused regressions with KHR-GLES3.packed_pixels.* tests. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Suggested-by: Eric Anholt <eric@anholt.net>	2018-04-03 08:43:30 +03:00
Mike Lothian	0bdbe4583f	gallivm: Fix include for LLVMAddPromoteMemoryToRegisterPass Include llvm-c/Transforms/Utils.h with the newest LLVM 7 Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:27:29 -04:00
Mike Lothian	5e07881305	radeonsi: Fix include for LLVMAddPromoteMemoryToRegisterPass Include llvm-c/Transforms/Utils.h with the newest LLVM 7 Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:27:29 -04:00
Mike Lothian	7e144ace95	ac/nir: Fix include for LLVMAddPromoteMemoryToRegisterPass Include llvm-c/Transforms/Utils.h with the newest LLVM 7 Signed-off-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:27:29 -04:00
Daniel Stone	4cbecb6168	st/dri: Initialise modifier to INVALID for DRI2 When allocating a buffer for DRI2, set the modifier to INVALID to inform the backend that we have no supplied modifiers and it should do its own thing. The missed initialisation forced linear, even if the implementation had made other decisions. This resulted in VC4 DRI2 clients failing with: Modifier 0x0 vs. tiling (0x700000000000001) mismatch Signed-off-by: Daniel Stone <daniels@collabora.com> Reported-by: Andreas Müller <schnitzeltony@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Fixes: `3f8513172f` ("gallium/winsys/drm: introduce modifier field to winsys_handle")	2018-04-02 19:07:57 +01:00
Marek Olšák	2be6143032	radeonsi: implement GL_KHR_blend_equation_advanced MSAA is supported using sample shading. Layered rendering and all texture targets are also supported. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:55:25 -04:00
Marek Olšák	e04631b0f2	radeonsi: rename unpack_param -> si_unpack_param Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:55:23 -04:00
Marek Olšák	dc04e4bba2	radeonsi: move FMASK shader logic to shared code We'll need it for FBFETCH in both TGSI and NIR paths. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:55:22 -04:00
Marek Olšák	eb77961292	radeonsi: add R600_DEBUG=nofmask to disable MSAA compression For testing. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:55:20 -04:00
Marek Olšák	56342c97ee	gallium/u_tests: test FBFETCH and shader-based blending with MSAA Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:55:18 -04:00
Marek Olšák	5d91c2ccea	ac/gpu_info: print GB_ADDR_CONFIG	2018-04-02 13:10:37 -04:00
Marek Olšák	b1f33086ec	ac/gpu_info: reorder the fields and print them nicely	2018-04-02 13:10:37 -04:00
Marek Olšák	a0a96819e1	ac/gpu_info: rename has_virtual_memory -> r600_has_virtual_memory	2018-04-02 13:10:37 -04:00
Marek Olšák	32b3932de1	ac/gpu_info: don't print irrelevant fields	2018-04-02 13:10:37 -04:00
Marek Olšák	f754217517	st/mesa: don't draw if the bound element array buffer is not allocated Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:10:36 -04:00
Iago Toral Quiroga	31881079af	anv/cmd_buffer: honor pending clear views for depth/stencil attachments v2: rebased on top of subpass rework. v3: rebased v4: - rebased - reset pending clear views in one go rather than one bit at a time (Caio) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-02 09:53:24 +02:00
Iago Toral Quiroga	f60c5fc17e	anv/cmd_buffer: consider multiview masks for tracking pending clear aspects When multiview is active a subpass clear may only clear a subset of the attachment layers. Other subpasses in the same render pass may also clear, and we want to honor those clears as well; however, we need to ensure that we only clear a layer once, on the first subpass that uses a particular layer (view) of a given attachment. This means that when we check if a subpass attachment needs to be cleared we need to check if all the layers used by that subpass (as indicated by its view_mask) have already been cleared in previous subpasses or not, in which case, we must clear any pending layers used by the subpass, and only those pending. v2: - track pending clear views in the attachment state (Jason) - rebased on top of fast-clear rework. v3: - rebased on top of subpass rework. v4: rebased. v5 (Caio): - Rebased. - Initialize pending clear views to only have bits set for layers that exist. - Reset pending clear views in one go rather than one bit at a time. - Put "last subpass for this attachment" condition in a separate function to simplify the conditional that resets pending_clear_aspects. Fixes: dEQP-VK.multiview.readback_implicit_clear.* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-02 09:53:15 +02:00
Timothy Arceri	c88e7fe29e	radeonsi/nir: fix explicit component packing for geom/tess doubles Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:56:00 +10:00
Timothy Arceri	dd3d3cc877	radeonsi/nir: gather buffers declared more accurately and use const fast path For now we skip SI && HAVE_LLVM < 0x0600 for simplicity. We also skip setting the more accurate masks for builtin uniforms for now as it causes some piglit regressions. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:56:00 +10:00
Timothy Arceri	56017d8100	radeonsi: create load_const_buffer_desc_fast_path() helper This will be shared by the TGSI and NIR backends. For simplicity we leave the SI LLVM 5.0 and lower work around only in the TGSI backend. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:56:00 +10:00
Timothy Arceri	7aad5e15f6	radeonsi/nir: set TGSI_PROPERTY_NEXT_SHADER Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:56:00 +10:00
Timothy Arceri	2ca5d9548f	st/glsl_to_nir: gather next_stage in shader_info Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-04-02 14:56:00 +10:00
Rob Clark	2f175bfe5d	freedreno/a5xx: don't align height for PIPE_BUFFER Buffers can be large, so we probably don't want to make them all 32x bigger. But they can't be rendered to (at least in GL) so we don't need this workaround to prevent page faults on mem<->gmem. Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-04-01 11:26:01 -04:00
Rob Clark	1866f76f7b	freedreno/a5xx: fix page faults on last level We could alternatively fall back to using "old style" draws for mem<->gmem (i.e. what <= a4xx do) when height is not aligned to 32, but that is somewhat more work (and not really something that could be applied to stable). Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-04-01 10:50:11 -04:00
Rob Clark	afde9294b5	freedreno/ir3: fix issue w/ glamor composite shaders Fixes an issue that became possible when we started lowering phi webs to regs (`a7ea2b4e`) (although was not really seen until we also switched to using peephole select pass (`ec8bc54a`) instead of lowering all if/else to select). If texture coord (or anything else that uses create_collect() to collect scalar values in a sequence of scalar registers) was consuming a value produced on either side of an if/else (ie. a phi lowered to nir reg, which in ir3 is an "array" of length 1) then register allocation would happen incorrectly and we'd end up sampling from garbage coordinates. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 16:25:13 -04:00
Rob Clark	2191a18e75	freedreno/ir3: more half-precision fixes Some instructions require src/dst to be in full or half precision register depending on src/dst type. So do a better job of propagating register type. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 15:16:16 -04:00
Rob Clark	e04e068f75	freedreno/ir3: add helper to create immed of specified size We'll also need to be able to create a half-precision immediate. So re-work create_immed(). Prep work for following patch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 15:13:11 -04:00
Rob Clark	1f45320e51	freedreno/ir3: pass ctx instead of block to create_collect() Prep work for following patch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 15:12:33 -04:00
Rob Clark	bd2ca2bcdd	freedreno/ir3: eliminate unused false-deps Previously false-dependencies would get flagged as used, even if the only "use" was a false dep to (for example) prevent a load from being scheduled after a store. In addition to being pointless instructions, in some cases they can cause problems. For example, ldg (and similar instructions) depend on an immed arg getting CP'd into the instruction, but this doesn't happen if an instruction is otherwise unused. Which can result in undefined results (overwriting unintended registers). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 15:11:46 -04:00
Rob Clark	4f78383809	freedreno/ir3: add local_group_size Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 15:10:56 -04:00
Rob Clark	96e7927fb2	freedreno/ir3: clear SSA flag when assigning "ARRAY" regs too Avoids a misleading "INVALID FLAGS" warning in debug builds. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 15:10:16 -04:00
Rob Clark	6514b4e3fd	freedreno/ir3: print array live ranges This is also useful to see if optmsgs are enabled. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-31 15:09:42 -04:00
Wladimir J. van der Laan	e8e3aa68d6	freedreno: a2xx: Implement DP2 instruction Use DOT2ADDv instruction with 0.0f constant add. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan	79d6b194f2	freedreno: a2xx: implement SEQ/SNE instructions Extend translate_sge_slt to emit these, in analogous fashion but using CNDEv. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan	837fabaaa3	freedreno: a2xx: Compressed textures support Add support for: - PIPE_FORMAT_ETC1_RGB8 - PIPE_FORMAT_DXT1_RGB - PIPE_FORMAT_DXT1_RGBA - PIPE_FORMAT_DXT3_RGBA - PIPE_FORMAT_DXT5_RGBA Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan	92d529e7e4	freedreno: a2xx: Support TEXTURE_RECT Denormalized texture coordinates are required for text rendering in GALLIUM_HUD. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan	6be017fdc4	freedreno: a2xx: Prevent crash in emit_texture if view is not set Textures will sometimes be updated if texture view state was un-set, without this change that causes an assertion crash or segfault. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan	fb41372761	freedreno: a2xx: Fix fd2_tex_swiz Compose swizzles using util_format_compose_swizzles instead of the custom code (which somehow had a bug). This makes the GL_ALPHA internal format work. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan	faed84a615	freedreno: a2xx: Change use of BLEND_ to BLEND2_ Change use of BLEND_ to BLEND2_, BLEND_* a3xx_rb_blend_opcode BLEND2_* is a2xx_rb_blend_opcode This makes no effective difference as the used enumerant has the same value (0), but the other enumerants do not match 1-to-1 so this will avoid future problems. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Wladimir J. van der Laan	cb6dd7070f	freedreno: a2xx: Update rnndb header for formats enumeration The format enumeration comes from the yamato register headers that are part of the amd-gpu kernel driver. (see freedreno envytools commit b8fb7978e7ae106d0d11d0b238ab2ba2d4dd9d43) Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-31 06:17:59 +00:00
Mathias Fröhlich	1da345e569	vbo: Use alloca for _vbo_draw_indirect. Avoid using malloc in the draw path of mesa. Since the draw_count is a user api input, fall back to malloc if the amount of consumed stack space may get too high. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:15 +02:00
Mathias Fröhlich	3f1cd957d3	vbo: Remove unused includes to vbo_private.h Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:14 +02:00
Mathias Fröhlich	6e9f00e3fc	vbo: Move vbo_split into the tnl module. Move the files, adapt to the naming scheme in tnl, update callers and build system. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:14 +02:00
Mathias Fröhlich	245f9a3977	vbo: Readd the arrays argument to the legacy draw methods. The legacy draw paths from back before 2012 contained a gl_vertex_array array for the inputs to be used for draw. So all draw methods from legacy drivers and everything that goes through tnl are originally written for this calling convention. The same goes for tools like t_rebase or vbo_split, that even partly still have the original calling convention with a currently unused such pointer. Back in 2012 patch `50f7e75` mesa: move gl_client_array[] from vbo_draw_func into gl_context introduced Array._DrawArrays, which was something that was IMO aiming for a similar direction as Array._DrawVAO introduced recently. Now several tools like t_rebase and vbo_split, which are mostly used by tnl based drivers, would need to be converted to use the internal Array._DrawVAO instead of Array._DrawArrays. The same goes for the driver backends that use any of these tools. Alternatively we can reintroduce the gl_vertex_array array in its call argument list and put these tools finally into the tnl directory. So this change reintroduces this gl_vertex_array array for the legacy draw paths that are still required for the tools t_rebase and vbo_split. A followup will move vbo_split also into tnl. Note that none of the affected drivers use the DriverFlags.NewArray driver bit. So it should be safe to remove this also for the legacy draw path. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:14 +02:00
Mathias Fröhlich	461698af26	vbo: Remove the now unused vbo draw path. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:13 +02:00
Mathias Fröhlich	784fdef4e7	tnl: Push down the gl_vertex_array inputs into tnl drivers. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:13 +02:00
Mathias Fröhlich	7f8db5ca47	vbo: Remove vbo_indirect_draw_func. Remove the vbo_indirect_draw_func vbo callback and make the default implementation use the drivers main draw callback function directly. This will be needed with the next changes when drivers without own main drivers DrawIndirect implementation get moved to the main drivers Draw method. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:13 +02:00
Mathias Fröhlich	4db9d83a2d	i965: Push down the gl_vertex_array inputs into i965. Let the i965 backend have its own gl_vertex_array array and basically reimplement the way _vbo_draw works. Note that brw_draw_indirect_prims calls brw_draw_prims internally and gets its update to Array._DrawArray by this way. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:12 +02:00
Mathias Fröhlich	fca1550550	gallium: Push down the gl_vertex_array inputs into gallium. Let the gallium backend have its own gl_vertex_array array and basically reimplement the way _vbo_draw works. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-31 06:32:12 +02:00
Jason Ekstrand	9978f55cd1	nir/validator: Validate that all used variables exist We were validating this for locals but nothing else. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-30 17:20:27 -07:00
Jason Ekstrand	2b977989f3	intel/vec4: Set channel_sizes for MOV_INDIRECT sources Otherwise, any indirect push constant access results in an assertion failure when we start digging through the channel_sizes array. This fixes dEQP-VK.pipeline.push_constant.graphics_pipeline.dynamic_index_vert on Haswell. It should be a harmless no-op for GL since indirect push constants aren't used there. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `e69e5c7006` "i965/vec4: load dvec3/4 uniforms first in the..."	2018-03-30 17:20:27 -07:00
Jason Ekstrand	6018f5b079	nir/lower_indirect_derefs: Support interp_var_at intrinsics This fixes the fs-interpolateAtCentroid-block-array piglit test on i965. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2018-03-30 17:20:27 -07:00
Jason Ekstrand	0517d65f96	nir/vars_to_ssa: Remove copies from the correct set Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2018-03-30 17:20:27 -07:00
Jason Ekstrand	a1452a94fc	nir: Return a cursor from nir_instr_remove Because nir_instr_remove is an inline wrapper around nir_instr_remove_v, the compiler should be able to tell that the return value is unused and not emit the extra code in most cases. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-30 17:20:27 -07:00
Jason Ekstrand	956f17395b	nir: Add src/dest num_components helpers We already have these for bit_size Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-30 17:20:27 -07:00
Brian Paul	bebf758c49	docs: document WGL_SWAP_INTERVAL env var Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-03-30 14:45:05 -06:00
Brian Paul	c8906b8459	st/wgl: check if WGL_SWAP_INTERVAL is defined in wglSwapIntervalEXT() This allows the WGL_SWAP_INTERVAL env var to override any application calls to wglSwapIntervalEXT(). Useful for debugging, or to set the interval to zero to effectively disable the swap interval. Note: we also rename the previous instance of SVGA_SWAP_INTERVAL to WGL_SWAP_INTERVAL since this is a WGL feature and not related to the svga driver. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-03-30 14:44:50 -06:00
Brian Paul	1bf201ddce	glapi: define GL_API to be KEYWORD1 in glapi_dispatch.c (v2) This fixes a Windows build warning where the prototypes for the ES function in the header file don't match the prototypes in this file because the GL_API and GLAPI macros are defined differently. v2: defined GL_API to KEYWORD1 instead of GLAPI, per Mathias. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-03-30 14:33:33 -06:00
Brian Paul	26bc983c83	spirv: s/uint/unsigned/ to fix MSVC build Reviewed-by: Neil Roberts <nroberts@igalia.com>	2018-03-30 14:33:33 -06:00
Brian Paul	f3164c2ed9	nir/spirv: s/uint32_t/SpvOp/ in various functions The MSVC compiler warns when the function parameter types don't exactly match with respect to enum vs. uint32_t. Use SpvOp everywhere. Alternately, uint32_t could be used everywhere. There doesn't seem to be an advantage to one over the other. Reviewed-by: Neil Roberts <nroberts@igalia.com>	2018-03-30 14:33:33 -06:00
Brian Paul	cb619a3c9a	nir/spirv: fix MSVC syntax error in vtn_handle_texture() Reviewed-by: Neil Roberts <nroberts@igalia.com>	2018-03-30 14:33:33 -06:00
Brian Paul	c58c9f712d	nir/spirv: move NORETURN annotation on _vtn_fail() prototype This needs to be before the function, not after, to compile with MSVC. This works with gcc too. Reviewed-by: Neil Roberts <nroberts@igalia.com>	2018-03-30 14:33:33 -06:00
Brian Paul	84be45fc20	nir/spirv: fix MSVC warning in vtn_align_u32() Fixes warning that "negation of an unsigned value results in an unsigned value". Reviewed-by: Neil Roberts <nroberts@igalia.com>	2018-03-30 14:33:33 -06:00
Neil Roberts	31d91f019b	spirv: Fix building with SCons The SCons build broke with commit `ba975140d3` because a SPIR-V function is called from Mesa main. This adds a convenience library for SPIR-V and adds it to everything that was including nir. It also adds both nir and spirv to drivers/x11/SConscript. Also add nir/spirv modules to osmesa and libgl-gdi targets. (Brian Paul) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105817 Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2018-03-30 14:33:03 -06:00
Brian Paul	cdc34e2cea	mesa: fix MSVC bitshift overflow warnings In the BITFIELD_MASK() macro, if b==32 the expression evaluates to ~0u, but the compiler still sees the expression (1 << 32) in the unused part and issues a warning about integer bitshift overflow. Fix that by using (b) % 32 to ensure the max shift is 31 bits. This issue has been present for a while, but shows up much more often because of the recent VBO changes. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-03-30 11:04:32 -06:00
Brian Paul	fa18a427e9	st/mesa: add missing GLSL_TYPE_[U]INT8 cases in st_glsl_type_dword_size() Silences a compiler warning about unhandled enum switch cases. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-03-30 11:04:32 -06:00
Jakob Bornecrantz	e16b92ad7e	vbo: MaxVertexAttribStride is not always set This assert is hit on hardware which does not expose GL 4.4 or GLES 3.1. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>	2018-03-30 17:23:08 +01:00
Daniel Stone	696762eef5	x11: Only report supported DRI3/Present versions The version passed to QueryVersion requests is the version that the client supports. We were just passing in whatever version of XCB was present on the system, which may not be a version that Mesa actually explicitly supports, e.g. it might bring unwanted semantics. Set specific protocol versions which we support, and only pass those. Signed-off-by: Daniel Stone <daniels@collabora.com> Fixes: `7aeef2d4ef` ("dri3: allow building against older xcb (v3)") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-30 16:53:51 +01:00
Samuel Pitoiset	2a329f4ada	radv: set SAMPLE_RATE to the number of samples of the current fb Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-30 17:32:15 +02:00
Brian Paul	fc1d1dbe81	nir: s/uint/unsigned/ to fix MSVC/MinGW build Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-30 08:37:59 -06:00
Eduardo Lima Mitev	e7fc18097e	i965: Don't call process_glsl_ir() for SPIR-V shaders v2: Use 'spirv_data' from gl_linked_shader instead, to check if shader is SPIR-V. (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev	e7d97aa75d	i965: Call spirv_to_nir() instead of glsl_to_nir() for SPIR-V shaders This is the main fork of the shader compilation code-path, where a NIR shader is obtained by calling spirv_to_nir() or glsl_to_nir(), depending on its nature. v2: Use 'spirv_data' member from gl_linked_shader to know which method to call. (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev	abb6d0797c	mesa/glspirv: Add a _mesa_spirv_to_nir() function This is basically a wrapper around spirv_to_nir() that includes arguments setup and post-conversion validation. v2: * Rebase update (SpirVCapabilities not a pointer anymore, spirv_to_nir_options added, and others). * Code-style improvements and remove debug hunk. (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev	16f6634e7f	mesa/program: Link SPIR-V shaders using the SPIR-V code-path Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev	9c36e9f862	mesa/glspirv: Add _mesa_spirv_link_shaders() function This is the equivalent to link_shaders() from src/compiler/glsl/linker.cpp, but for SPIR-V programs. It just creates the program and its gl_linked_shader objects, giving drivers the opportunity to implement any linking of SPIR-V shaders they choose, at a later stage. v2: Bail out if we see more than one shader for the same stage, and add a corresponding comment. (Timothy Arceri) v3: * Adds also a linker error log to the condition above, with a reference to the specification issue. (Timothy Arceri) * Squash with the patch adding the function boilerplate (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Eduardo Lima Mitev	22b6b3d0a7	mesa: Add a reference to gl_shader_spirv_data to gl_linked_shader This is a reference to the spirv_data object stored in gl_shader, which stores shader SPIR-V data that is needed during linking too. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Nicolai Hähnle	ba975140d3	mesa: Implement glSpecializeShaderARB v2: * Use gl_spirv_validation instead of spirv_to_nir. This method just validates the shader. The conversion to NIR will happen later, during linking. (Alejandro Piñeiro) * Use gl_shader_spirv_data struct to store the SPIR-V data. (Eduardo Lima) * Use the 'spirv_data' member to tell if the gl_shader is a SPIR-V shader, instead of a dedicated flag. (Timothy Arceri) Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Alejandro Piñeiro	9063bf7ad8	nir/spirv: add gl_spirv_validation method ARB_gl_spirv adds the ability to use SPIR-V binaries, and a new method, glSpecializeShader. Here we add a new function to do the validation for this function: From OpenGL 4.6 spec, section 7.2.1 "Shader Specialization", error table: "INVALID_VALUE is generated if <pEntryPoint> does not name a valid entry point for <shader>. INVALID_VALUE is generated if any element of <pConstantIndex> refers to a specialization constant that does not exist in the shader module contained in <shader>." v2: rebase update (spirv_to_nir options added, changes on the warning logging, and others) v3: include passing options on common initialization, doesn't call setjmp on common_initialization v4: (after Jason comments): * Rename common_initialization to vtn_builder_create * Move validation method and their helpers to own source file. * Create own handle_constant_decoration_cb instead of reuse existing one v5: put vtn_build_create refactoring to their own patch (Jason) v6: update after vtn_builder_create method renamed, add explanatory comment, tweak existing comment and commit message (Timothy)	2018-03-30 09:14:56 +02:00
Alejandro Piñeiro	bebe3d626e	spirv: add vtn_create_builder Refactored from spirv_to_nir, in order to be reused later. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> v2: renamed method (from vtn_builder_create), add explanatory comment (Timothy)	2018-03-30 09:14:56 +02:00
Alejandro Piñeiro	3761e675e2	i965: initialize SPIR-V capabilities Needed for ARB_gl_spirv. These are not the same as those of the Intel Vulkan driver. From the ARB_spirv_extensions spec: "3. If a new GL extension is added that includes SPIR-V support via a new SPIR-V extension does it's SPIR-V extension also get enumerated by the SPIR_V_EXTENSIONS_ARB query?. RESOLVED. Yes. It's good to include it for consistency. Any SPIR-V functionality supported beyond the SPIR-V version that is required for the GL API version should be enumerated." So in addition to the core SPIR-V support, there is the possibility of specific GL extensions enabling specific SPIR-V extensions (and so capabilities). That means OpenGL and Vulkan may not support the same capabilities, even for the same driver. For this reason it is better to keep them separated. As an example: at the time of writing this patch, the Intel Vulkan driver supports multiview, but no OpenGL multiview extension is supported. Note: we initialize SPIR-V capabilities at brwCreateContext instead of the usual brw_initialize_context_constants because we want to do that only if the extension is enabled. v2: * Rebase update (SpirVCapabilities not a pointer anymore) * Fill spirv capabilities for OpenGL >= 3.3 (Ian Romanick) v3: * Drop multiview support, as i965 doesn't support any multiview GL extension (Jason) * Fill spirv capabilities only if the extension is enabled (Jason) v4: Capabilities are supported only on gen7+. Added comment and assert (Jason)	2018-03-30 09:14:56 +02:00
Nicolai Hähnle	ca5cc78206	mesa: add gl_constants::SpirVCapabilities For drivers to declare which SPIR-V features they support. v2: Don't use a pointer (Ian Romanick) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-30 09:14:56 +02:00
Ian Romanick	19e0dd1ad3	i965: Don't request GLSL IR lowering of gl_VertexID Let the lowering in NIR handle it instead. This hurts one shader that occurs twice in shader-db (SynMark GSCloth) on IVB and HSW. No other shaders or platforms were affected. total cycles in shared programs: 253438422 -> 253438426 (0.00%) cycles in affected programs: 412 -> 416 (0.97%) helped: 0 HURT: 2 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Antia Puentes <apuentes@igalia.com>	2018-03-29 14:16:07 -07:00
Ian Romanick	2765633116	i965: Silence unused parameter warning src/mesa/drivers/dri/i965/brw_draw_upload.c: In function ‘double_types’: src/mesa/drivers/dri/i965/brw_draw_upload.c:225:34: warning: unused parameter ‘brw’ [-Wunused-parameter] double_types(struct brw_context *brw, ^~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-03-29 14:16:04 -07:00
Ian Romanick	042ee4bea2	spirv: Move SPIR-V building to Makefile.spirv.am and spirv/meson.build Future changes will add generated files used only from src/compiler/glsl. These can't be built from Makefile.nir.am, and we can't move all the rules from Makefile.nir.am to Makefile.spirv.am (and it would be silly anyway). v2: Do it for meson too. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (the meson bits) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (the automake bits)	2018-03-29 14:16:01 -07:00
Ian Romanick	2c9621ee5c	compiler: All leaf Makefile.am should use += This slightly simplifies later changes that add more Makefile.*.am files. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2018-03-29 14:09:41 -07:00
Ian Romanick	4925347ec5	util: Include bitscan.h directly Previously bitset.h would include u_math.h to get bitscan.h. u_math.h lives in src/gallium/auxiliary/util while both bitset.h and bitscan.h live in src/util. Having the one file directly include another file that lives in the same directory makes much more sense. As a side-effect, several files need to directly include standard header files that were previously indirectly included. v2: Fix build break in src/amd/common/ac_nir_to_llvm.c. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2018-03-29 14:09:30 -07:00
Ian Romanick	ef7a4c9015	util: Optimize util_is_power_of_two_nonzero Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2018-03-29 14:09:29 -07:00
Ian Romanick	cd18aa1e50	util: Use util_is_power_of_two_nonzero in u_vector Previously size=0, element_size=0 would have been allowed. That combination can only lead to despair. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-03-29 14:09:28 -07:00
Ian Romanick	22fbb5c594	util: Add and use util_is_power_of_two_nonzero Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2018-03-29 14:09:28 -07:00
Ian Romanick	d76c204d05	util: Move util_is_power_of_two to bitscan.h and rename to util_is_power_of_two_or_zero The new name makes the zero-input behavior more obvious. The next patch adds a new function with different zero-input behavior. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-03-29 14:09:23 -07:00
Dylan Baker	a3a16d4aa7	meson: use dep_libdrm version for pkg-config This corrects pkg-config to use the libdrm version (as computed by the previous patch) instead of using a hardcoded value that may or may not (probably not) be right. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-29 10:20:52 -07:00
Dylan Baker	c445b1d56f	meson: Use the same version for all libdrm checks Currently each driver specifies its own version, and core libdrm specifies a version. In the most common case this is fine, since there will be exactly one libdrm installed on a system, but if there is more than one it's possible that mesa will be linked against different versions of libdrm. There is also the possibility that the current approach makes the pkg-config files we generate incorrect, since there could be #defines that use newer features if they're available. This patch corrects all of that. All of the versions are still set by driver (along with a default core version). Then all of the drivers that are enabled have their versions compared and the highest version is selected, then all libdrm checks are made with that version. v2: - Reorder the list to have the name first and whether the dependency is needed second (Eric) Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-29 10:20:52 -07:00
Dylan Baker	acadf06f56	meson: group libdrm dependencies The reason libdrm is after libdrm_* will be made clear in later patches. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-29 10:18:47 -07:00
Brian Paul	e520ca562a	gl.h: remove stale comment, trailing whitespace Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-29 08:46:55 -06:00
Brian Paul	4ff6a7b0de	glapi: add glBlendBarrier(), glPrimitiveBoundingBox() prototypes in glapi_dispatch.c, as we have for many other GLES functions. Fixes a cross-compile issue (missing prototype) when GLES support is disabled. Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2018-03-29 08:45:10 -06:00
Brian Paul	5cd5878a1f	st/mesa: silence unhandled switch case warning And improve the unreachable() error message. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-03-29 08:45:10 -06:00
Henri Verbeet	0b73c86b80	mesa: Inherit texture view multi-sample information from the original texture images. Found running "The Witness" in Wine. Without this patch, texture views created on multi-sample textures would have a GL_TEXTURE_SAMPLES of 0. All things considered such views actually work surprisingly well, but when combined with (plain) multi-sample textures in a framebuffer object, the resulting FBO is incomplete because the sample counts don't match. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Henri Verbeet <hverbeet@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-29 14:38:25 +04:30
Samuel Pitoiset	e45fe0ed66	radv: fix scanning output_usage_mask with structs To fix a regression in: dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.struct And the following regressions (Polaris only): dEQP-VK.glsl.indexing.varying_array.* Fixes: `f3275ca01c` ("ac/nir: only enable used channels when exporting parameters") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-29 10:22:10 +02:00
Karol Herbst	6179a87c1e	nvc0/ir: fix emitting NOTs with predicates Signed-off-by: Karol Herbst <karolherbst@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-03-29 03:06:36 +02:00
Aaron Watry	1dae92f150	broadcom/vc4: Fix out-of-tree build with automake. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-28 17:48:41 -07:00
Eric Anholt	81f82ecc56	broadcom/vc5: Start using nir_opt_move_load_ubo(). In the absence of a general NIR or VIR-level scheduler, this at least avoids spilling in GTF-GLES3.gtf.GL3Tests.uniform_buffer_object.uniform_buffer_object_storage_layouts	2018-03-28 17:48:41 -07:00
Eric Anholt	1fe4c748f7	broadcom/vc5: Fix setup of integer surface clear values. I'm disappointed that the compiler didn't warn me about use of uninitialized uc in these paths. Just use the incoming clear color instead of the packing temporary if we're doing our own packing. Fixes GTF-GLES3.gtf.GL3Tests.color_buffer_float.color_buffer_float_clamp_*	2018-03-28 17:48:41 -07:00
Eric Anholt	123ee37627	broadcom/vc5: Stop trying to swizzle around RGBA4 clear color. We always want A in the A slot in the tile buffer, and any other swapping should happen elsewhere. Fixes RGBA4-using cases in fbo-clear-formats and GTF-GLES3.gtf.GL3Tests.color_buffer_float.color_buffer_float_clamp_fixed.	2018-03-28 17:48:41 -07:00
Eric Anholt	2f4c4e10c2	broadcom/vc5: Work around scissor w/h==0 bug same as rasterizer discard. The 7268 HW apparently lets some rendering through in this case. Fixes GTF-GLES2.gtf.GL2FixedTests.scissor.scissor	2018-03-28 17:48:41 -07:00
Eric Anholt	0349c79bdc	st: Don't try to finalize the texture in st_render_texture(). We can't necessarily finalize the texture at this point if we're rendering to a texture image whose format is different from the baselevel's format. This was introduced as a fix for fbo-incomplete-texture-03 in `de414f4915`, but the later fix for vmware on that testcase in `95d5c48f68` made it unnecessary. Fixes assertion failures in util_resource_copy_region() in KHR-GLES3.copy_tex_image_conversions.forbidden.* when trying to finalize an R8 texture image to the RG8 texture object's pt. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-28 17:48:41 -07:00
Marek Olšák	e159d46fc7	drirc: whitelist glthread for Medieval II: TW, Carnivores: DHR, Far Cry 2	2018-03-28 20:00:48 -04:00
Daniel Schürmann	b91cd5dba4	radv: enable VK_AMD_shader_trinary_minmax extension Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 01:29:39 +02:00
Daniel Schürmann	d00fb7ce54	ac: add support for trinary_minmax instructions v2: Add missing break (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 01:29:35 +02:00
Dave Airlie	fe5d5d19b0	spirv: add support for SPV_AMD_shader_trinary_minmax Co-authored-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 01:29:29 +02:00
Dave Airlie	3e830a1af2	nir: add support for min/max/median of 3 srcs These are needed for SPV_AMD_shader_trinary_minmax, the AMD HW supports these. Co-authored-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 01:28:58 +02:00
Marek Olšák	025105453a	radeonsi: simplify DCC format categories Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-28 18:45:52 -04:00
Marek Olšák	3fea237c85	radeonsi: don't use the SPI barrier management bug workaround Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-28 18:45:52 -04:00
Marek Olšák	3045c5f274	radeonsi: use maximum OFFCHIP_BUFFERING on Vega12 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-28 18:45:52 -04:00
Bas Nieuwenhuizen	4503ff760c	ac/nir: Add workaround for GFX9 buffer views. On GFX9 whether the buffer size is interpreted as elements or bytes depends on whether IDXEN is enabled in the instruction. If the index is a constant zero, LLVM optimizes IDXEN to 0. Now the size in elements is interpreted in bytes which of course results in out of bounds accesses. The correct fix is most likely to disable the LLVM optimization, but we need something to work with LLVM <= 6.0. radeonsi does the max between stride and element count on the CPU but that results in the size intrinsics returning the wrong size for the buffer. This would cause CTS errors for radv. v2: Also include the store changes. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-29 00:03:03 +02:00
Marek Olšák	4f96747530	ac/surface: set AddrSurfInfoIn.format = ADDR_FMT_8 for stencil, add assertions Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105738 Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 17:23:41 -04:00
Samuel Pitoiset	1c4fdcf444	radv: enable VK_EXT_sampler_filter_minmax Only enable for CIK+ because it's buggy on SI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 22:55:48 +02:00
Samuel Pitoiset	413d77e7f9	radv: add support for VK_EXT_sampler_filter_minmax The driver only supports the required formats for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 22:55:48 +02:00
Samuel Pitoiset	99b52aa1da	radv: rename VEGA10 device name Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 20:15:17 +02:00
Samuel Pitoiset	4d2c46dda3	radv: add support for Vega12 Based on RadeonSI. Untested. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-28 20:15:14 +02:00
Matt Turner	3e6326deb9	build: Fix up nir_intrinsics.Plo nir_intrinsics.c existed as a static file until commit `76dfed8ae2` began generating it as part of the build process. autotools is incapable of coping, and so a build-tree from before this commit would then fail with it: [4]: *** No rule to make target '../../../mesa/src/compiler/nir/nir_intrinsics.c', needed by 'nir/nir_intrinsics.lo'. Stop. Add a few lines to configure.ac to update the broken build files. Fixes: `76dfed8ae2` ("nir: mako all the intrinsics")	2018-03-28 11:09:23 -07:00
Dylan Baker	2cfc68d984	autotools: Include intel/dev/meson.build in tarball Fixes: `272bef0601` ("intel: Split gen_device_info out into libintel_dev") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-28 10:19:05 -07:00
Dylan Baker	bc2fdb9759	autotools: include meson_get_version Otherwise meson won't read the VERSION file and won't set a version. That means that pkg-config files will have version unset as well. Fixes: `3e9533d9b8` ("meson: Add script to use VERSION file for getting version") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-28 10:13:23 -07:00
Eric Engestrom	d77844a529	docs: fix 18.0 release note version Fixes: `839fb3a696` "docs: Update 18.0.0 release notes" Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-28 16:52:56 +01:00
Marek Olšák	20eb44ad65	radeonsi: add support for Vega12 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-28 11:37:43 -04:00
Marek Olšák	5425d32fcf	amd/addrlib: update to the latest version for Vega12 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-28 11:37:43 -04:00
Eric Engestrom	431a1d12cc	gbm: remove never-implemented function I assume this was implemented in a previous version of that commit, but was removed in the version that actually landed. Fixes: `8430af5ebe` "Add support for swrast to the DRM EGL platform" Cc: Giovanni Campagna <gcampagna@src.gnome.org> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-28 16:25:52 +01:00
Stefan Schake	77ade10c86	android: Use new nir intrinsics python scripts Fixes: `76dfed8ae2` ("nir: mako all the intrinsics") Signed-off-by: Stefan Schake <stschake@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-03-28 14:48:47 +03:00
Eric Anholt	a691fa4a1b	broadcom/vc5: Fix padding of NPOT miplevels >= 2. The power-of-two padded size that gets minified is based on level 1's dimensions, not level 0's, which starts to differ at a width of 9. Fixes all failures on texelFetch fs sampler2D 1x1x1-64x64x1	2018-03-27 21:16:23 -07:00
Timothy Arceri	92fa89a08d	ac/radeonsi: pass bindless bool to load_sampler_desc() We also fix the base_index for bindless by using the driver location. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 12:56:16 +11:00
Timothy Arceri	5411b98d52	st/glsl_to_nir: set driver location for bindless images and samplers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 12:56:15 +11:00
Timothy Arceri	f94b6b79be	radeonsi/nir: set uses_bindless_samplers for samplers Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 12:56:15 +11:00
Timothy Arceri	5c810a2c05	nir: add bindless to nir data Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-28 12:56:15 +11:00
Kenneth Graunke	fb18d0dbe4	i965: Drop unnecessary bo->align field. bo->align is always 0; there's no need to waste 8 bytes storing it. Thanks to C99 initializers zeroing fields, we can completely drop the only read of the field altogether. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-27 18:41:44 -07:00
Kenneth Graunke	037d738a23	i965: Drop unused alignment parameter from brw_bo_alloc(). brw_bo_alloc no longer uses this parameter, so there's no point. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-27 18:41:44 -07:00
Kenneth Graunke	07ec3a2e0f	i965: Drop alignment parameter from bo_alloc_internal(). Buffers are always page aligned on 965+ hardware; I believe this extra parameter is a vestige from the Gen2-3 era. All callers pass 0, and in fact we assert that the alignment is 0 unless BO_ALLOC_BUSY is set (for some reason). We can just drop the parameter and set the value to 0 explicitly. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-27 18:41:44 -07:00
Kenneth Graunke	b9a54b18f6	i965: Drop BO_ALLOC_BUSY in intel_miptree_create_for_bo(). intel_miptree_create_for_bo does not actually allocate a BO, so specifying allocation flags accomplishes nothing and is confusing. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-27 18:41:44 -07:00
Kenneth Graunke	2c01215c1b	i965: Drop PIPE_CONTROL_NO_WRITE from various calls. This is just zero - passing nothing already gives us a post-sync operation of "nothing". Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-27 18:41:44 -07:00
Jason Ekstrand	5f21a7afe0	nir/intrinsics: Don't report negative dest_components I have no idea why but having dest_components == -1 was causing a memory leak somewhere. Without this, you can't get through a full shader-db run without running out of memory. Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-03-27 18:18:26 -07:00
Jason Ekstrand	7e38f49a8f	intel/fs: Don't emit a dest copy for image ops with has_dest == false This was causing us to walk dest_components times over a thing with no destination. This happened to work because all of the image intrinsics without a destination also happened to have dest_components == 0. We shouldn't be reading dest_components if has_dest == false. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-03-27 18:18:21 -07:00
Ilia Mirkin	776e6af879	nvc0/ir: fix INTERP_* with indirect inputs There were two problems, both of which are fixed now: - The indirect address was not being shifted by 4 - The indirect address was being placed as an argument in the offset case This fixes some of the new interpolateAt* piglits which now test for these situations. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-03-27 20:41:11 -04:00
Timothy Arceri	629ee690ad	nir: fix crash in loop unroll corner case When an if nested inside another if is optimised away we can end up with a loop terminator and following block that looks like this: if ssa_596 { block block_5: /* preds: block_4 */ vec1 32 ssa_601 = load_const (0xffffffff /* -nan */) break /* succs: block_8 */ } else { block block_6: /* preds: block_4 */ /* succs: block_7 */ } block block_7: /* preds: block_6 */ vec1 32 ssa_602 = phi block_6: ssa_552 vec1 32 ssa_603 = phi block_6: ssa_553 vec1 32 ssa_604 = iadd ssa_551, ssa_66 The problem is the phis. Loop unrolling expects the last block in the loop to be empty once we splice the instructions in the last block into the continue branch. The problem is we can't move phis, so here we lower the phis to regs when preparing the loop for unrolling. As it could be possible to have multiple additional blocks/ifs following the terminator we just convert all phis at the top level of the loop body for simplicity. We also add some comments to loop_prepare_for_unroll() while we are here. Fixes: `51daccb289` "nir: add a loop unrolling pass" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670	2018-03-28 09:59:38 +11:00
Timothy Arceri	48f6014903	st/glsl_to_nir: correctly handle arrays packed across multiple vars Fixes piglit test: tests/spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-interleave-range.shader_test Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 09:59:38 +11:00
Timothy Arceri	b260efbd5e	radeonsi/nir: fix input processing for packed varyings The location was only being incremented the first time we processed a location. This meant we would incorrectly skip some elements of an array if the first element was packed and processed previously but other elements were not. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 09:59:38 +11:00
Timothy Arceri	51f175028d	ac/nir_to_llvm: fix component packing for double outputs We need to wait until after the writemask is widened before we adjust it for component packing. Together with the previous patch this fixes a number of arb_enhanced_layouts component layout piglit tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 09:59:37 +11:00
Timothy Arceri	fc51fdbcde	st/glsl_to_nir: fix driver location for dual-slot packed doubles Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 09:59:37 +11:00
Timothy Arceri	47eee04556	radeonsi/nir: fix scanning of multi-slot output varyings This fixes tcs/tes varying arrays where we don't lower indirects and therefore don't split arrays. Here we also fix usagemask for dual slot doubles. Fixes a number of arb_tessellation_shader piglit tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-28 09:59:37 +11:00
Eric Anholt	9f1b4f6204	broadcom/vc5: Fix RG16I/UI texture sampling. How many times did I look at this table without noticing the missing 'G' in the texture column? Fixes KHR-GLES3.copy_tex_image_conversions.required.* on 7268.	2018-03-27 15:49:58 -07:00
Rob Clark	16581904b0	nir: fix generated nir_intrinsics.c for MSVC Apparently it is not happy about things like: .foo = {} So skip over initializers for empty lists. Fixes: `76dfed8ae2` Reported-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-27 15:01:11 -04:00
Emil Velikov	eda2f58d15	docs: update calendar 18.0.0 is out Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-27 19:11:45 +01:00
Emil Velikov	02f89b62fe	docs: add news item and link release notes for 18.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-27 19:08:48 +01:00
Emil Velikov	62eb721ed8	docs: add sha256 checksums for 18.0.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `fb64913d19`)	2018-03-27 19:06:27 +01:00
Emil Velikov	839fb3a696	docs: Update 18.0.0 release notes Note: the file was originally 17.4.0, yet git struggles to detect the move :-\ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `dceb1ce807`)	2018-03-27 19:06:19 +01:00
Rob Clark	76dfed8ae2	nir: mako all the intrinsics I threatened to do this a long time ago.. I probably should have done it a long time ago when there were many fewer intrinsics. But the system of macro/#include magic for dealing with intrinsics is a bit annoying, and python has the nice property of optional fxn params, making it possible to define new intrinsics while ignoring parameters that are not applicable (and naming optional params). And not having to specify various array lengths explicitly is nice too. I think the end result makes it easier to add new intrinsics. v2: couple small fixes found with a test program to compare the old and new tables v3: misc comments, don't rely on capture=true for meson.build, get rid of system_values table to avoid return value of intrinsic() and mostly remove side-effects, add autotools build support v4: scons build Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-27 08:36:37 -04:00
Rob Clark	cc3a88e81d	nir: fix per_vertex_output intrinsic This is supposed to have both BASE and COMPONENT but num_indices was inadvertently set to 1. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-27 08:20:40 -04:00
Rob Clark	1e0a06000b	glsl_types: fix build break with intel/msvc compiler The VECN() macro was taking advantage of a GCC specific feature that is not available on lesser compilers, mostly for the purposes of avoiding a macro that encoded a return statement. But as suggested by Ian, we could just have the macro produce the entire method body and avoid the need for this. So let's do that instead. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105740 Fixes: `f407edf340` Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Timothy Arceri <tarceri@itsqueeze.com> Cc: Roland Scheidegger <sroland@vmware.com> Cc: Ian Romanick <idr@freedesktop.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-03-27 08:17:11 -04:00
Lin Johnson	41cf30b8bc	mesa: add GL_HALF_FLOAT as supported type to readpixels EXT_color_buffer_float spec states: "An INVALID_OPERATION error is generated ... if the color buffer is a floating-point format and type is not FLOAT, HALF FLOAT, or UNSIGNED_INT_10F_11F_11F_REV." This means that GL_HALF_FLOAT type should be supported when color buffer has floating-point format. Fixes Android CTS test android.view.cts.PixelCopyTest. v2: remove comments of EXT_color_buffer_half_float as EXT_color_buffer_float can use type GL_HALF_FLOAT Signed-off-by: Lin Johnson <johnson.lin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-03-27 09:04:52 +03:00
Eric Anholt	0024b77e87	broadcom/vc5: Fix swizzling of RGB10_A2UI render targets. This is the actual hardware layout, and we were only swizzling R/B back around in texturing. Fixes part of KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx in simulation.	2018-03-26 17:46:23 -07:00
Eric Anholt	c2b13627d9	broadcom/vc5: Fix extraneous register index in QIR dumping of TLBU writes. Just like TLB without a config uniform, we don't have a register index.	2018-03-26 17:46:23 -07:00
Eric Anholt	494da6c2dd	broadcom/vc5: Implement workaround for GFXH-1431. This should fix some blending errors, but doesn't impact any testcases in the CTS.	2018-03-26 17:46:19 -07:00
Eric Anholt	1bf466270d	broadcom/vc5: Fix EZ disabling and allow using GT/GE direction as well. Once we've disabled EZ for some draws, we need to not use EZ on future draws. Implementing that made implementing the GT/GE direction trivial. Fixes KHR-GLES3.shaders.fragdepth.compare.no_write on V3D 4.1 simulation.	2018-03-26 17:46:19 -07:00
Eric Anholt	262208eb3c	broadcom/vc5: Disable TF on V3D 4.x when drawing with queries disabled. On 3.x, we just don't flag the primitive as needing TF, but those primitive bits are now allocated to the new primitive types. Now we need to actually update the enable flag at draw time.	2018-03-26 17:46:19 -07:00
Eric Anholt	ef2cf9cc3c	broadcom/vc5: Disable transform feedback on V3D 4.x at the end of the job. The next job from this client will turn it back on unless TF gets disabled, but we don't want the state to leak from this client to another (which causes GPU hangs).	2018-03-26 17:46:19 -07:00
Eric Anholt	1fa820cef8	broadcom/vc5: Move the BCL epilogue code to a per-version compile. I need to do some new packets for transform feedback on 4.1.	2018-03-26 17:46:19 -07:00
Eric Anholt	3387864130	broadcom/vc5: Fix transform feedback in the presence of point size. I had this note to myself, and it turns out that a lot of CTS tests use XFB with points to get data out without using a fragment shader. Keep track of two sets of precomputed TF specs (point size in VPM prologue or not), and switch between them when we enable/disable point size.	2018-03-26 17:46:19 -07:00
Eric Anholt	09ac5ade8f	broadcom/vc5: Split transform feedback specs update from buffers. The specs update will be changing based on additional state flags in the next commit, and this unindents the buffer update code.	2018-03-26 17:46:18 -07:00
Eric Anholt	9e62aec9cd	broadcom/vc5: Limit each transform feedback data spec to 16 dwords. The length-1 field only has 4 bits, so we need to generate separate specs when there's too much TF output per buffer. Fixes GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_builtin_type and transform_feedback_max_interleaved.	2018-03-26 17:33:37 -07:00
Eric Anholt	0356db022d	gallium/u_vbuf: Protect against overflow with large instance divisors. GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor uses -1 as a divisor, so we would overflow to count=0 and upload no data, triggering the assert below. We want to upload 1 element in this case, fixing the test on VC5. v2: Use some more obvious logic, and explain why we don't use the normal round_up(). Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-26 17:33:37 -07:00
Eric Anholt	d491ad1d36	st: Allow accelerated CopyTexImage from RGBA to RGB. There's nothing to worry about here -- the A channel just gets dropped by the blit. This avoids a segfault in the fallback path when copying from a RGBA16_SINT renderbuffer to a RGB16_SINT destination represented by an RGBA16_SINT texture (the fallback path tries to get/fetch to float buffers, but the float pack/unpack functions are NULL for SINT/UINT). Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba16i on VC5. v2: Extract the logic to a helper function and explain what's going on better. v3: const-qualify args Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-26 17:33:37 -07:00
Marek Olšák	7d2079908d	winsys/amdgpu: always allow GTT placements on APUs Reviewed-by: Christian König <christian.koenig@amd.com>	2018-03-26 19:23:30 -04:00
Marek Olšák	769603564e	radeonsi: don't reallocate on DMABUF export if local BOs are disabled	2018-03-26 19:22:12 -04:00
Timothy Arceri	56b867395d	glsl: fix infinite loop caused by bug in loop unrolling pass Just checking for 2 jumps is not enough to be sure we can do a complex loop unroll. We need to make sure we have also found 2 loop terminators. Without this we were attempting to unroll a loop where the second jump was nested inside multiple ifs which loop analysis is unable to detect as a terminator. We ended up splicing out the first terminator but failed to actually unroll the loop; this resulted in the creation of a possible infinite loop. Fixes: `646621c66d` "glsl: make loop unrolling more like the nir unrolling path" Tested-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670	2018-03-27 09:15:02 +11:00
Vinson Lee	dc94a0506f	gallium: Do not add -Wframe-address option for gcc <= 4.4. This patch fixes these build errors with GCC 4.4. Compiling src/gallium/auxiliary/util/u_debug_stack.c ... src/gallium/auxiliary/util/u_debug_stack.c: In function ‘debug_backtrace_capture’: src/gallium/auxiliary/util/u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions src/gallium/auxiliary/util/u_debug_stack.c:269: error: #pragma GCC diagnostic not allowed inside functions src/gallium/auxiliary/util/u_debug_stack.c:271: error: #pragma GCC diagnostic not allowed inside functions Fixes: `370e356eba` ("gallium: silence __builtin_frame_address nonzero argument is unsafe warning") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105529 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-03-26 11:23:51 -07:00
Alyssa Rosenzweig	029f1a2d61	gallium: Correct minor typo in header comments Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-26 10:15:04 -07:00
Rafael Antognolli	27581d18bc	intel/aubinator_error_decode: Decode more registers. Decode SC_INSTDONE, ROW_INSTDONE and SAMPLER_INSTDONE. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-26 09:25:57 -07:00
Rafael Antognolli	70d7c70e8d	intel/genxml: Add SAMPLER_INSTDONE register. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-26 09:25:57 -07:00
Rafael Antognolli	227edf05f3	intel/genxml: Add ROW_INSTDONE register. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-26 09:25:57 -07:00
Rafael Antognolli	4c0ae36143	intel/genxml: Add SC_INSTDONE register. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-26 09:25:57 -07:00
Ian Romanick	91225cb33f	i965/vec4: Fix null destination register in 3-source instructions A recent commit (see below) triggered some cases where conditional modifier propagation and dead code elimination would cause a MAD instruction like the following to be generated: mad.l.f0 null, ... Matt pointed out that fs_visitor::fixup_3src_null_dest() fixes cases like this in the scalar backend. This commit basically ports that code to the vec4 backend. NOTE: I have sent a couple tests to the piglit list that reproduce this bug without the commit mentioned below. This commit fixes those tests. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Fixes: `ee63933a7` ("nir: Distribute binary operations with constants into bcsel") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105704	2018-03-26 08:50:44 -07:00
Ian Romanick	2c643fd978	nir: Don't condition 'a-b < 0' -> 'a < b' on is_not_used_by_conditional Now that i965 recognizes that a-b generates the same conditions as 'a < b', there is no reason to condition this transformation on 'is not used by conditional.' Since this was the only user of the is_not_used_by_conditional function, delete it. All Gen6+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14400775 -> 14400595 (<.01%) instructions in affected programs: 36712 -> 36532 (-0.49%) helped: 182 HURT: 26 helped stats (abs) min: 1 max: 2 x̄: 1.13 x̃: 1 helped stats (rel) min: 0.15% max: 1.82% x̄: 0.70% x̃: 0.62% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.24% max: 1.02% x̄: 0.82% x̃: 0.90% 95% mean confidence interval for instructions value: -0.97 -0.76 95% mean confidence interval for instructions %-change: -0.59% -0.43% Instructions are helped. total cycles in shared programs: 532929592 -> 532926345 (<.01%) cycles in affected programs: 478660 -> 475413 (-0.68%) helped: 187 HURT: 22 helped stats (abs) min: 2 max: 200 x̄: 20.99 x̃: 18 helped stats (rel) min: 0.23% max: 24.10% x̄: 1.48% x̃: 1.03% HURT stats (abs) min: 1 max: 214 x̄: 30.86 x̃: 11 HURT stats (rel) min: 0.01% max: 23.06% x̄: 3.12% x̃: 0.86% 95% mean confidence interval for cycles value: -19.50 -11.57 95% mean confidence interval for cycles %-change: -1.42% -0.58% Cycles are helped. GM45 and Iron Lake had similar results. 
(Iron Lake shown) total cycles in shared programs: 177851578 -> 177851810 (<.01%) cycles in affected programs: 24408 -> 24640 (0.95%) helped: 2 HURT: 4 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 0.42% max: 0.47% x̄: 0.44% x̃: 0.44% HURT stats (abs) min: 24 max: 108 x̄: 60.00 x̃: 54 HURT stats (rel) min: 0.52% max: 1.62% x̄: 1.04% x̃: 1.02% 95% mean confidence interval for cycles value: -7.75 85.08 95% mean confidence interval for cycles %-change: -0.39% 1.49% Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-26 08:50:43 -07:00
Ian Romanick	cd635d149b	i965/vec4: Propagate conditional modifiers from compares to adds No changes on Broadwell or later as those platforms do not use the vec4 backend. Ivy Bridge and Haswell had similar results. (Ivy Bridge shown) total instructions in shared programs: 11682119 -> 11681056 (<.01%) instructions in affected programs: 150403 -> 149340 (-0.71%) helped: 950 HURT: 0 helped stats (abs) min: 1 max: 16 x̄: 1.12 x̃: 1 helped stats (rel) min: 0.23% max: 2.78% x̄: 0.82% x̃: 0.71% 95% mean confidence interval for instructions value: -1.19 -1.04 95% mean confidence interval for instructions %-change: -0.84% -0.79% Instructions are helped. total cycles in shared programs: 257495842 -> 257495238 (<.01%) cycles in affected programs: 270302 -> 269698 (-0.22%) helped: 271 HURT: 13 helped stats (abs) min: 2 max: 14 x̄: 2.42 x̃: 2 helped stats (rel) min: 0.06% max: 1.13% x̄: 0.32% x̃: 0.28% HURT stats (abs) min: 2 max: 12 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.15% max: 1.18% x̄: 0.30% x̃: 0.26% 95% mean confidence interval for cycles value: -2.41 -1.84 95% mean confidence interval for cycles %-change: -0.31% -0.26% Cycles are helped. Sandy Bridge total instructions in shared programs: 10430493 -> 10429727 (<.01%) instructions in affected programs: 120860 -> 120094 (-0.63%) helped: 766 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.30% max: 2.70% x̄: 0.78% x̃: 0.73% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.80% -0.75% Instructions are helped. total cycles in shared programs: 146138718 -> 146138446 (<.01%) cycles in affected programs: 244114 -> 243842 (-0.11%) helped: 132 HURT: 0 helped stats (abs) min: 2 max: 4 x̄: 2.06 x̃: 2 helped stats (rel) min: 0.03% max: 0.43% x̄: 0.16% x̃: 0.19% 95% mean confidence interval for cycles value: -2.12 -2.00 95% mean confidence interval for cycles %-change: -0.18% -0.15% Cycles are helped. 
GM45 and Iron Lake had identical results. (Iron Lake shown) total instructions in shared programs: 7780251 -> 7780248 (<.01%) instructions in affected programs: 175 -> 172 (-1.71%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.49% max: 2.44% x̄: 1.81% x̃: 1.49% total cycles in shared programs: 177851584 -> 177851578 (<.01%) cycles in affected programs: 9796 -> 9790 (-0.06%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.05% max: 0.08% x̄: 0.06% x̃: 0.05% Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-26 08:50:43 -07:00
Ian Romanick	780f307ba8	i965/vec4: Allow cmod propagation when src0 is a uniform or shader input No shader-db changes. This source must have been written by a previous instruction, so it cannot be a uniform or a shader input. However, this change allows the next commit to help more shaders. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-26 08:50:43 -07:00
Ian Romanick	020b0055e7	i965/fs: Propagate conditional modifiers from compares to adds The math inside the add and the cmp in this instruction sequence is the same. We can utilize this to eliminate the compare. add(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; cmp.z.f0(8) null<1>F g2<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q }; This is reduced to: add.z.f0(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; (-f0) sel(8) g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q }; This optimization pass could do even better. The nature of converting vectorized code from the GLSL front end to scalar code in NIR results in sequences like: add(8) g7<1>F g4<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; add(8) g6<1>F g3<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; add(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; cmp.z.f0(8) null<1>F g2<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q }; cmp.z.f0(8) null<1>F g3<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g10<1>F (abs)g6<8,8,1>F 3e-37F { align1 1Q }; cmp.z.f0(8) null<1>F g4<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g12<1>F (abs)g7<8,8,1>F 3e-37F { align1 1Q }; In this sequence, only the first cmp.z is removed. With different scheduling, all 3 could get removed. Skylake total instructions in shared programs: 14407009 -> 14400173 (-0.05%) instructions in affected programs: 1307274 -> 1300438 (-0.52%) helped: 4880 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.40 x̃: 1 helped stats (rel) min: 0.03% max: 8.70% x̄: 0.70% x̃: 0.52% 95% mean confidence interval for instructions value: -1.45 -1.35 95% mean confidence interval for instructions %-change: -0.72% -0.69% Instructions are helped. 
total cycles in shared programs: 532943169 -> 532923528 (<.01%) cycles in affected programs: 14065798 -> 14046157 (-0.14%) helped: 2703 HURT: 339 helped stats (abs) min: 1 max: 1062 x̄: 12.27 x̃: 2 helped stats (rel) min: <.01% max: 28.72% x̄: 0.38% x̃: 0.21% HURT stats (abs) min: 1 max: 739 x̄: 39.86 x̃: 12 HURT stats (rel) min: 0.02% max: 27.69% x̄: 1.38% x̃: 0.41% 95% mean confidence interval for cycles value: -8.66 -4.26 95% mean confidence interval for cycles %-change: -0.24% -0.14% Cycles are helped. LOST: 0 GAINED: 1 Broadwell total instructions in shared programs: 14719636 -> 14712949 (-0.05%) instructions in affected programs: 1288188 -> 1281501 (-0.52%) helped: 4845 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.38 x̃: 1 helped stats (rel) min: 0.03% max: 8.00% x̄: 0.70% x̃: 0.52% 95% mean confidence interval for instructions value: -1.43 -1.33 95% mean confidence interval for instructions %-change: -0.72% -0.68% Instructions are helped. total cycles in shared programs: 559599253 -> 559581699 (<.01%) cycles in affected programs: 13315565 -> 13298011 (-0.13%) helped: 2600 HURT: 269 helped stats (abs) min: 1 max: 2128 x̄: 12.24 x̃: 2 helped stats (rel) min: <.01% max: 23.95% x̄: 0.41% x̃: 0.20% HURT stats (abs) min: 1 max: 790 x̄: 53.07 x̃: 20 HURT stats (rel) min: 0.02% max: 15.96% x̄: 1.55% x̃: 0.75% 95% mean confidence interval for cycles value: -8.47 -3.77 95% mean confidence interval for cycles %-change: -0.27% -0.18% Cycles are helped. LOST: 0 GAINED: 8 Haswell total instructions in shared programs: 12978609 -> 12973483 (-0.04%) instructions in affected programs: 932921 -> 927795 (-0.55%) helped: 3480 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.47 x̃: 1 helped stats (rel) min: 0.03% max: 7.84% x̄: 0.78% x̃: 0.58% 95% mean confidence interval for instructions value: -1.53 -1.42 95% mean confidence interval for instructions %-change: -0.80% -0.75% Instructions are helped. 
total cycles in shared programs: 410270788 -> 410250531 (<.01%) cycles in affected programs: 10986161 -> 10965904 (-0.18%) helped: 2087 HURT: 254 helped stats (abs) min: 1 max: 2672 x̄: 14.63 x̃: 4 helped stats (rel) min: <.01% max: 39.61% x̄: 0.42% x̃: 0.21% HURT stats (abs) min: 1 max: 519 x̄: 40.49 x̃: 16 HURT stats (rel) min: 0.01% max: 12.83% x̄: 1.20% x̃: 0.47% 95% mean confidence interval for cycles value: -12.82 -4.49 95% mean confidence interval for cycles %-change: -0.31% -0.18% Cycles are helped. LOST: 0 GAINED: 5 Ivy Bridge total instructions in shared programs: 11686082 -> 11681548 (-0.04%) instructions in affected programs: 937696 -> 933162 (-0.48%) helped: 3150 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.44 x̃: 1 helped stats (rel) min: 0.03% max: 7.84% x̄: 0.69% x̃: 0.49% 95% mean confidence interval for instructions value: -1.49 -1.38 95% mean confidence interval for instructions %-change: -0.71% -0.67% Instructions are helped. total cycles in shared programs: 257514962 -> 257492471 (<.01%) cycles in affected programs: 11524149 -> 11501658 (-0.20%) helped: 1970 HURT: 239 helped stats (abs) min: 1 max: 3525 x̄: 17.48 x̃: 3 helped stats (rel) min: <.01% max: 49.60% x̄: 0.46% x̃: 0.17% HURT stats (abs) min: 1 max: 1358 x̄: 50.00 x̃: 15 HURT stats (rel) min: 0.02% max: 59.88% x̄: 1.84% x̃: 0.65% 95% mean confidence interval for cycles value: -17.01 -3.35 95% mean confidence interval for cycles %-change: -0.33% -0.08% Cycles are helped. LOST: 9 GAINED: 1 Sandy Bridge total instructions in shared programs: 10432841 -> 10429893 (-0.03%) instructions in affected programs: 685071 -> 682123 (-0.43%) helped: 2453 HURT: 0 helped stats (abs) min: 1 max: 9 x̄: 1.20 x̃: 1 helped stats (rel) min: 0.02% max: 7.55% x̄: 0.64% x̃: 0.46% 95% mean confidence interval for instructions value: -1.23 -1.17 95% mean confidence interval for instructions %-change: -0.67% -0.62% Instructions are helped. 
total cycles in shared programs: 146133660 -> 146134195 (<.01%) cycles in affected programs: 3991634 -> 3992169 (0.01%) helped: 1237 HURT: 153 helped stats (abs) min: 1 max: 2853 x̄: 6.93 x̃: 2 helped stats (rel) min: <.01% max: 29.00% x̄: 0.24% x̃: 0.14% HURT stats (abs) min: 1 max: 1740 x̄: 59.56 x̃: 12 HURT stats (rel) min: 0.03% max: 78.98% x̄: 1.96% x̃: 0.42% 95% mean confidence interval for cycles value: -5.13 5.90 95% mean confidence interval for cycles %-change: -0.17% 0.16% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 1 GM45 and Iron Lake had similar results (GM45 shown): total instructions in shared programs: 4800332 -> 4798380 (-0.04%) instructions in affected programs: 565995 -> 564043 (-0.34%) helped: 1451 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 1.35 x̃: 1 helped stats (rel) min: 0.05% max: 5.26% x̄: 0.47% x̃: 0.31% 95% mean confidence interval for instructions value: -1.40 -1.29 95% mean confidence interval for instructions %-change: -0.50% -0.45% Instructions are helped. total cycles in shared programs: 122032318 -> 122027798 (<.01%) cycles in affected programs: 8334868 -> 8330348 (-0.05%) helped: 1029 HURT: 1 helped stats (abs) min: 2 max: 40 x̄: 4.43 x̃: 2 helped stats (rel) min: <.01% max: 1.83% x̄: 0.09% x̃: 0.04% HURT stats (abs) min: 38 max: 38 x̄: 38.00 x̃: 38 HURT stats (rel) min: 0.25% max: 0.25% x̄: 0.25% x̃: 0.25% 95% mean confidence interval for cycles value: -4.70 -4.08 95% mean confidence interval for cycles %-change: -0.09% -0.08% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-26 08:50:43 -07:00
Ian Romanick	5bbb3d60d3	i965/fs: Allow cmod propagation when src0 is a uniform or shader input No shader-db changes. This source must have been written by a previous instruction, so it cannot be a uniform or a shader input. However, this change allows the next commit to help about 900 more shaders. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-26 08:50:43 -07:00
Ian Romanick	8f83eea71e	i965: Add negative_equals methods This method is similar to the existing ::equals methods. Instead of testing that two src_regs are equal to each other, it tests that one is the negation of the other. v2: Simplify various checks based on suggestions from Matt. Use src_reg::type instead of fixed_hw_reg.type in a check. Also suggested by Matt. v3: Rebase on 3 years. Fix some problems with negative_equals with VF constants. Add fs_reg::negative_equals. v4: Replace the existing default case with BRW_REGISTER_TYPE_UB, BRW_REGISTER_TYPE_B, and BRW_REGISTER_TYPE_NF. Suggested by Matt. Expand the FINISHME comment to better explain why it isn't already finished. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v3] Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-26 08:50:43 -07:00
Gert Wollny	a21da49e5c	mesa/st/tests: Use tgsi opcode enum also in the test classes Fixes: ec478cf9c31K ("st/mesa,tgsi: use enum tgsi_opcode") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105737 Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-26 09:04:53 -06:00
Eric Engestrom	1e36fe5dc4	meson: fix header check message before: Checking if "endian.h works" compiles: YES after: Checking if "endian.h" compiles: YES Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>	2018-03-26 09:59:32 +01:00
Rob Clark	2f181c8c18	glsl_types: vec8/vec16 support Not used in GL but 8 and 16 component vectors exist in OpenCL. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-25 10:42:54 -04:00
Rob Clark	f407edf340	glsl_types: refactor/prep for vec8/vec16 Refactor things so there isn't so much typing involved to add new things. Also drops a pointless conditional (out of bounds rows or columns already returns error_type in all paths.. might as well drop it rather than make the check more convoluted in the next patch by adding the vec8/vec16 case). Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-25 10:42:54 -04:00
Jordan Justen	d60eaf7b1f	anv: Set genX_table for gen11 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-23 17:23:59 -07:00
Jordan Justen	af8535d02f	anv: Add gen11 to anv_genX_call Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-23 17:23:59 -07:00
Mathias Fröhlich	4a8ef1f5d4	vbo: Make sure the internal VAO's stay within limits. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-23 19:59:02 +01:00
Mathias Fröhlich	1a131aaf4b	mesa: Flag early if we modify a SharedAndImmutable VAO. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-23 19:58:59 +01:00
Mathias Fröhlich	19526a57f5	mesa: When copying a VAO also copy the vertex attribute mode. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-23 19:58:54 +01:00
Emil Velikov	5a75019ad0	configure: use AC_CHECK_HEADERS to check for endian.h Currently we use the singular CHECK_HEADER combined with an explicit append to the DEFINES variable. That is a legacy misnomer, since it requires us to add $DEFINES to every piece that we build. Using the plural version of the helper sets the HAVE_ macro for us and ensures it is passed to the compiler: through config.h if that is available (not the case for mesa), otherwise on the command line. In hindsight, we should replace all the AC_CHECK_{FUNC,HEADER} instances with the plural version (or even the _ONCE suffixed version) and drop the DEFINES hacks. Fixes: `cbee1bfb34` ("meson/configure: detect endian.h instead of trying to guess when it's available") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105717 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2018-03-23 18:12:52 +00:00
Kenneth Graunke	90f556f0b1	android: Use local i915_drm.h rather than the system one. Fixes: `2d26c99933` (intel: devinfo: meson: include drm uapi) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2018-03-23 10:05:02 -07:00
Brian Paul	e31d5bd2f9	st/mesa: s/unsigned/enum pipe_shader_type/ for st_bind_ubos() Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-23 09:03:26 -06:00
Brian Paul	6a93deedf5	st/mesa: whitespace/formatting fixes in st_atom_constbuf.c Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-23 09:03:26 -06:00
Brian Paul	aad23f91ee	st/mesa: s/unsigned/enum pipe_shader_type/ Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-23 09:03:26 -06:00
Brian Paul	93581c2ca0	svga: simplify uses_flat_interp expression in emit_input_declarations() Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-23 09:03:26 -06:00
Brian Paul	c99f46c2ac	svga: replace unsigned with proper enum names Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-23 09:03:26 -06:00
Brian Paul	7181a9fa0e	tgsi,softpipe: use enum tgsi_opcode Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-23 09:03:26 -06:00
Brian Paul	ec478cf9c3	st/mesa,tgsi: use enum tgsi_opcode Need to update the tgsi code and st_glsl_to_tgsi code at the same time to prevent compile break since C++ is much pickier about implicit enum/unsigned casting. Bump size of glsl_to_tgsi_instruction::op to 10 bits to be sure to avoid MSVC signed enum overflow issue. No change in class size. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-23 09:03:26 -06:00
Brian Paul	ccecb2bbd3	tgsi/nir: use enum tgsi_opcode Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-23 09:03:26 -06:00
Brian Paul	22a3190c85	tgsi: use enum tgsi_opcode Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-23 09:03:26 -06:00
Brian Paul	9413d1c0fe	gallivm: use enum tgsi_opcode Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-23 09:03:26 -06:00
Brian Paul	7df96826f8	svga: use enum tgsi_opcode Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-23 09:03:26 -06:00
Brian Paul	4e0f967f6d	tgsi: convert opcode macros to enums Enums are nicer in gdb. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-03-23 09:03:26 -06:00
Lionel Landwerlin	412fae46c0	compiler: glsl: silence valgrind warning on write cache I don't think it actually fixes anything, but that's nice not to have valgrind warnings. It manifests itself when running the piglit test : glsl-fs-raytrace-bug27060 ==2058== Uninitialised byte(s) found during client check request ==2058== at 0xC5BB040: blob_write_bytes (blob.c:152) ==2058== by 0xC595359: write_variable (nir_serialize.c:144) ==2058== by 0xC59560C: write_var_list (nir_serialize.c:192) ==2058== by 0xC5982E4: nir_serialize (nir_serialize.c:1124) ==2058== by 0xC0B729D: brw_program_serialize_nir (brw_program.c:835) ==2058== by 0xC0AB2D6: brw_link_shader (brw_link.cpp:358) ==2058== by 0xC32FE3F: _mesa_glsl_link_shader (ir_to_mesa.cpp:3169) ==2058== by 0xC36C7ED: create_new_program(gl_context, state_key) (ff_fragment_shader.cpp:1127) ==2058== by 0xC36C8A6: _mesa_get_fixed_func_fragment_program (ff_fragment_shader.cpp:1157) ==2058== by 0xC1B50AF: update_program (state.c:134) ==2058== by 0xC1B56DF: _mesa_update_state_locked (state.c:352) ==2058== by 0xC1B579A: _mesa_update_state (state.c:386) ==2058== Address 0xf1eab8a is 58 bytes inside a block of size 96 alloc'd ==2058== at 0x4C2CB8F: malloc (vg_replace_malloc.c:299) ==2058== by 0xC0FD306: ralloc_size (ralloc.c:121) ==2058== by 0xC0FD5B1: ralloc_array_size (ralloc.c:208) ==2058== by 0xC452B3B: (anonymous namespace)::nir_visitor::visit(ir_variable) (glsl_to_nir.cpp:448) ==2058== by 0xC45CE8B: ir_variable::accept(ir_visitor) (ir.h:428) ==2058== by 0xC46D0B5: visit_exec_list(exec_list, ir_visitor) (ir.cpp:1898) ==2058== by 0xC451D2F: glsl_to_nir (glsl_to_nir.cpp:162) ==2058== by 0xC0B5223: brw_create_nir (brw_program.c:79) ==2058== by 0xC0AAB67: brw_link_shader (brw_link.cpp:257) ==2058== by 0xC32FE3F: _mesa_glsl_link_shader (ir_to_mesa.cpp:3169) ==2058== by 0xC36C7ED: create_new_program(gl_context, state_key) (ff_fragment_shader.cpp:1127) ==2058== by 0xC36C8A6: _mesa_get_fixed_func_fragment_program 
(ff_fragment_shader.cpp:1157) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-03-23 13:05:12 +00:00
Eric Engestrom	cbee1bfb34	meson/configure: detect endian.h instead of trying to guess when it's available Cc: Maxin B. John <maxin.john@gmail.com> Cc: Khem Raj <raj.khem@gmail.com> Cc: Rob Herring <robh@kernel.org> Suggested-by: Jon Turney <jon.turney@dronecode.org.uk> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Cc: <mesa-stable@lists.freedesktop.org>	2018-03-23 11:44:21 +00:00
Juan A. Suarez Romero	ee2b943fa8	wayland-drm: do not distribute generated sources Instead we will re-generate them again on building. v2: get rid of BUILT_SOURCES (Daniel, Emil) v3: keep BUILT_SOURCES for egl/Makefile.am (Emil) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-23 11:27:12 +01:00
Samuel Pitoiset	ccc64f3133	radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8 The hardware only supports 32-bit depth surfaces, but we can enable TC-compat HTILE for 16-bit depth surfaces if no Z planes are compressed. The main benefit is to reduce the number of depth decompression passes. Also, we don't need to implement DB->CB copies which is fine. This improves Serious Sam 2017 by +4%. Talos and F12017 are also affected but I don't see a performance difference. This also improves the shadowmapping Vulkan demo by 10-15% (FPS is now similar to AMDVLK). No CTS regressions on Polaris10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-23 10:05:57 +01:00
Samuel Pitoiset	5ae9772245	radv: add radv_calc_decompress_on_z_planes() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-23 10:05:55 +01:00
Samuel Pitoiset	9b8e75bee3	radv: add radv_image_is_tc_compat_htile() helper Instead of that huge conditional that's going to be crazy. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-23 10:05:54 +01:00
Jason Ekstrand	884d27bcf6	nir: Rename image intrinsics to image_var Generated with: git grep -l nir_intrinsic_image | xargs sed -i 's/nir_intrinsic_image/nir_intrinsic_image_var/g' and some manual fixing in nir_intrinsics.h Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-23 13:48:11 +11:00
Dave Airlie	fa683385de	virgl: add ARB_cull_distance support. This just allows the properties through to the host if we have cull dist support. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-23 10:21:10 +10:00
Eric Anholt	d7a015cbc6	broadcom/vc5: Account for InstanceID/VertexID in VPM segment size. Fixes failure in GTF-GLES3.gtf.GL3Tests.draw_instanced.draw_instanced_attrib_size	2018-03-22 15:12:21 -07:00
Eric Anholt	b8387dbc49	broadcom/vc5: Allow FBOs with mixed color formats. This is required by GLES3, fixing GTF-GLES3.gtf.GL3Tests.framebuffer_srgb.framebuffer_srgb_draw	2018-03-22 15:12:21 -07:00
Eric Anholt	4f62679be5	broadcom/vc5: Add missing support for 2101010_REV vertex attributes. Fixes GTF-GLES3.gtf.GL3Tests.vertex_type_2_10_10_10_rev.vertex_type_2_10_10_10_rev_invalid2, where we hadn't thrown a GL error as needed in the extension-disabled case. We want to be exposing the extension anyway.	2018-03-22 15:12:21 -07:00
Eric Anholt	ba29b89dc7	broadcom/vc5: Set up a vertex position if the shader doesn't. Our backend needs some sort of vertex position value to emit the scaled viewport values and such. Fixes potential segfaults in KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx	2018-03-22 15:12:21 -07:00
Lionel Landwerlin	903e9952fb	i965: add performance query support on CNL v2: Add brw_oa_cnl.xml to EXTRA_DIST (Emil) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	e7f6d1e5f8	i965: perf: add support for new equation operators Some equations of the CNL metrics started to use operators we haven't defined yet, just add those. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	57a11550bc	i965: perf: query topology With the introduction of asymmetric slices in CNL, we cannot rely on the previous SUBSLICE_MASK getparam to tell userspace what subslices are available. We introduce a new uAPI in the kernel driver to report exactly which parts of the GPU are fused and require this to be available on Gen10+. Prior generations can continue to rely on GETPARAM on older kernels. This patch is quite a lot of code because we have to support lots of different kernel versions, ranging from not providing any information (for Haswell on 4.13 through 4.17), to being able to query through GETPARAM (for gen8/9 on 4.13 through 4.17), to finally requiring 4.17 for Gen10+. This change stores topology information in a unified way on brw_context.topology from the various kernel APIs, and then generates the appropriate values for the equations from that unified topology. v2: Move slice/subslice masks fields to gen_device_info (Rafael) v3: Add a gen_device_info_subslice_available() helper (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	c1900f5b0f	intel: devinfo: add helper functions to fill fusing masks values There are a couple of ways we can get the fusing information from the kernel: - Through DRM_I915_GETPARAM with the SLICE_MASK/SUBSLICE_MASK parameters - Through the new DRM_IOCTL_I915_QUERY by requesting the DRM_I915_QUERY_TOPOLOGY_INFO The second method is more accurate and also gives us the EUs fusing masks. It's also a requirement for CNL as this platform has asymmetric subslices and the first method's SUBSLICE_MASK value is assumed uniform across slices. v2: Change gen_device_info_update_from_masks() to generate topology and call into gen_device_info_update_from_topology (Lionel/Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	2d26c99933	intel: devinfo: meson: include drm uapi Already available with the autotools build. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	5d3e74a5a5	drm-uapi: bump headers Required updates from drm-next for changes in i965. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	c471716574	intel: devinfo: store slice/subslice/eu masks We want to store values coming from the kernel but as a first step, we can generate mask values out of the numbers already stored in the gen_device_info masks. v2: Add a helper to set EU masks (Lionel/Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	7e2c6147da	intel: devinfo: store number of EUs per subslice This will be reused to store values reported by the kernel. The main use case will be for use as the input values of the metric sets equations for the INTEL_performance_queries extension. By storing this information in the gen_device_info we make this non GL specific so this can be reused by Vulkan if we ever have an equivalent extension. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Dylan Baker	8e5988eb35	Revert "meson: merge C and C++ compiler arguments check" This reverts commit `cb2ddcefa5`. This causes clang to error out building C++ code. The plan is to fix the build to work with clang, but in the meantime we'll just revert this. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric@engestrom.ch>	2018-03-22 11:35:08 -07:00
Lionel Landwerlin	1603ce1921	i965/perf: fix config registration when uploading to kernel When registering configurations to the kernel for the first time, we run into an issue where the id number is not properly set (we're using the wrong variable). As a result, when trying to use that id later on, we get an error. This issue manifests itself the first time you use frameretrace after reboot; subsequent runs are fine. Fixes: `27ee83eaf7` ("i965: perf: add support for userspace configurations") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 18:21:57 +00:00
Lepton Wu	a8b846bccd	gallium/winsys/kms: Add support for multi-planes Add a new struct kms_sw_plane which delegate a plane and use it in place of sw_displaytarget. Multiple planes share same underlying kms_sw_displaytarget. v2: - add more check for plane size (Tomasz) v3: - split from larger patch (Emil) v4: - no change from v3 v5: - remove mapped field (Tomasz) v6: - remove change-id in commit message (Tomasz) v7: - add revision history in commit message (Emil) Reviewed-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2018-03-22 18:10:44 +00:00
Lepton Wu	d891f28df9	gallium/winsys/kms: Fix possible leak in map/unmap. If user calls map twice for kms_sw_displaytarget, the first mapped buffer could get leaked. Instead of calling mmap every time, just reuse previous mapping. Since user could map same displaytarget with different flags, we have to keep two different pointers, one for rw mapping and one for ro mapping. Also introduce reference count for mapped buffer so we can unmap them at right time. v2: - avoid duplicated mapping and leaked mapping (Tomasz) v3: - split from larger patch (Emil) v4: - remove munmap from dt_destory (Emil) v5: - introduce reference count for mapping (Tomasz) - add back munmap in dt_destory v6: - remove change-id in commit message (Tomasz) v7: - remove munmap from dt_destory again (Emil) - add revision history in commit message (Emil) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Lepton Wu <lepton@chromium.org>	2018-03-22 18:10:42 +00:00
Juan A. Suarez Romero	4db269f30c	broadcom/vc4: add path to nir_builder.h As the other VC4 files do. Otherwise, it won't find nir_builder.h v2: add path in source code rather changing autotools (Emil) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	d39e828c82	autotools: add tegra header files Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	40ecee89b7	swr/rast: autotools: add events_private.proto in dist tarball. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	0bf1274883	radv: autotools: add radv_extensions.h in the generated VULKAN list Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	13459c637a	anv/radv: autotools: include vulkan_*.h headers Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	f8b749b7c0	nir: autotools, meson: add GLSL.ext.AMD.h in the files list Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Matt Turner	724586a266	intel/compiler: Readd ICL to test_eu_validate.cpp Now that the PCI IDs are upstream, this can be readded.	2018-03-22 09:56:09 -07:00
Matt Turner	65b060d9cb	intel/compiler: Skip 64-bit type tests when types not available Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Anuj Phogat	ad7ed86bf7	intel: Add Ice Lake PCI IDs Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-22 09:56:09 -07:00
Anuj Phogat	1065acfb69	intel: Disable fast color clear on icl Disabling fast color clear makes fbo-clearmipmap test render correct texture in base miplevel. Fast color clear is anyway disabled for non-base miplevels. Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Jason Ekstrand	d2eecf0b0b	intel/compiler/icl: Clear "null render target" bit in extended message descriptor Otherwise all our render target writes go nowhere. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Anuj Phogat	1484876ef7	intel/compiler/icl: Update the assert in brw_stage_has_packed_dispatch() Rafael ran piglit with the test code enabled and saw no additional GPU hangs. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Anuj Phogat	f05e0d9c2a	intel/common/icl: Disable hiz surface sampling On gen11+ AUX_HIZ is not a supported value for surfaces being sampled by the 3D sampler. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Anuj Phogat	370af9dcc0	intel/common/icl: Add L3 config ICL uses the same L3 configs as CNL, just leaving the SLM configs out. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Matt Turner	f56693af4b	intel/tools/aubinator: Drop platform list from print_help() We all know the platform names, and I don't want to update this list continually. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Derek Foreman	aa18a63512	egl/wayland: Make swrast display_sync the correct queue commit `03dd9a88b0` introduced per surface queues, but the display_sync for swrast_commit_backbuffer remained on the old queue. This is likely to break when dispatching the correct queue at the top of function (which can't dispatch the sync callback we're waiting for). The easiest known reproduction case is running weston-subsurfaces under weston --use-pixman Signed-off-by: Derek Foreman <derekf@osg.samsung.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-22 15:27:35 +00:00
Samuel Pitoiset	52fba3f45d	radv: remove unused radv_pipeline::needs_data_cache variable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-22 14:30:37 +01:00
Eric Engestrom	cb2ddcefa5	meson: merge C and C++ compiler arguments check Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-22 11:59:12 +00:00
Eric Engestrom	880c1718b6	omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA} We're trying to be -Wundef clean so that we can turn it on (and eventually make it an error). Note that the OMX code already used `#if ENABLE_ST_OMX_BELLAGIO` instead of #ifdef; I could've changed these, but the point of -Wundef is to catch typos, so we might as well make the change the right way. Fixes: `83d4a5d5ae` "st/omx/tizonia: Add H.264 decoder" Fixes: `b2f2236dc5` "st/omx/tizonia: Add H.264 encoder" Fixes: `c62cf1f165` "st/omx/tizonia/h264d: Add EGLImage support" Cc: Gurkirpal Singh <gurkirpal204@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-22 11:39:28 +00:00
Eric Engestrom	795b465c50	meson: simplify omx logic and let's make sure `with_gallium_omx` is never 'auto' and can only be one of [bellagio, tizonia, disabled]. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-22 10:08:10 +00:00
Mathias Fröhlich	862c872c48	vbo: Remove now duplicate _DrawVAO notification. The DriverFlags.NewArray bit is already set to NewDriverState in _mesa_set_draw_vao since we have actually just above changed the VAOs content. So this can be removed. The _vbo_update_inputs is called by the vbo...recalculate_inputs being set through the same mechanism as described above. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:53 +01:00
Mathias Fröhlich	006b5e798a	vbo: Remove now duplicate _vbo_update_inputs from dlist draw. At the current state, _vbo_update_inputs is called from the draw callback if vbo...recalculate_inputs is set. But that is now set if the _DrawVAO or its content or the vertex program mode is changed. So remove _vbo_update_inputs from the direct dlist draw path. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:53 +01:00
Mathias Fröhlich	2887c98140	vbo: Remove redundant set of DriverFlags.NewArray in vbo_bind_arrays. Now that setting vbo...recalculate_inputs also sets the DriverFlags.NewArray bits into the NewDriverState, setting that from vbo_bind_arrays is redundant. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	9f5b6ef2ef	vbo: Remove vbo...recalculate_inputs from vbo_exec_invalidate_state. This flag is now set when the actual Array._DrawVAO changes. So setting this flag is redundant here. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	bf328359a7	mesa: A change of gl_vertex_processing_mode needs an array update. Since arrays also handle the mapping of current values into the disabled array slots, we need to tell the array update code that this mapping has changed. Also mark only dirty if it has changed. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	5b91786225	mesa: Set DriverFlags.NewArray together with vbo...recalculate_inputs. Both mean something very similar and are set at the same time now. For that vbo module to be set from core mesa, implement a public vbo module method to set that flag. In the longer term the flag should vanish in favor of a driver flag of the appropriate driver. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	d3c604e12e	mesa: Update VAO internal state when setting the _DrawVAO. Update the VAO internal state on Array._DrawVAO instead of Array.VAO. Also the VAO internal state update gets triggered now by a change of Array._DrawVAO instead of the _NEW_ARRAY state flag. Also no driver looks at any VAO's NewArrays value from within the Driver.UpdateState callback. So it should be safe to move this update into the _mesa_set_draw_vao method. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	c4c56ff303	vbo: Move vbo_bind_arrays into a dd_driver_functions draw callback. Factor out that common call into the almost single place. Remove the _mesa_set_drawing_arrays call from vbo_{exec,save}_draw code paths as the function is now called through vbo_bind_arrays. Prepare updating the list of struct gl_vertex_array entries via calling _vbo_update_inputs for being pushed into those drivers that finally work on that long list of gl_vertex_array pointers. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	6307d1be0a	mesa: Move vbo draw functions into dd_function_table. Move vbo draw functions into struct dd_function_table. For now just wrap the underlying vbo functions. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Aaron Watry	23100acc8f	clover/llvm: Fix build against LLVM/Clang 4.0 The opencl 1.0 langstandard was renamed in 5.0+ v2: Move preprocessor check into compat.hpp Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-03-21 21:03:23 -05:00
Timothy Arceri	c135316555	ac/nir_to_llvm: add frexp support Fixes CTS tests: KHR-GL40.gpu_shader_fp64.builtin.frexp_double KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec2 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec3 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec4 And piglit test: tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-22 12:42:34 +11:00
Timothy Arceri	cca2141745	nir: add frexp_exp and frexp_sig opcodes Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-22 12:42:34 +11:00
Caio Marcelo de Oliveira Filho	12c22b897a	anv/pipeline: don't pass constant view index in multiview If view mask has only one bit set, view index is effectively a constant, so doesn't need to be passed to the next stages, just always set it. Part of this was in the original patch that added anv_nir_lower_multiview.c but disabled. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-21 14:49:50 -07:00
Caio Marcelo de Oliveira Filho	5e7c1d05d4	anv/pipeline: use less instructions for multiview The view_index is encoded in the remainder of dividing instance id by the number of views in the view mask (n). In the general case (handled by the else clause), there is a need to map from 0..n-1 into the number of the view being masked. For that a map is encoded. In the case only the first n bits in the mask are set, the mapping is trivial, 0..n-1 already represent what view is being referred to. That case was in the original patch that added anv_nir_lower_multiview.c but disabled. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-21 14:49:50 -07:00
Eric Anholt	baeb6a4b4a	broadcom/vc5: Fix up the NIR types of FS outputs generated by NIR-to-TGSI. Unfortunately TGSI doesn't record the type of the FS output like GLSL does, but VC5's TLB writes depend on the output's base type. Just record the type in the key at variant compile time when we've got a TGSI input and then fix it up. Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba32i/ui and apparently a GPU hang that breaks most tests that come after it.	2018-03-21 14:02:34 -07:00
Neil Roberts	61603f0e42	spirv: Add a 64-bit implementation of Frexp The implementation is inspired by lower_instructions_visitor::dfrexp_sig_to_arith. This has been tested against the arb_gpu_shader_fp64/fs-frexp-dvec4 test using the ARB_gl_spirv branch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-21 20:18:44 +01:00
Rafael Antognolli	5297a17571	aubinator_error_decode: Compare only the class_name of the ring. ring_name is "<class_name> + <instance_id>" (e.g. rcs0). So we need to first compare the class name only, then get the instance id. Without this, INSTDONE is not being decoded. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-03-21 11:35:15 -07:00
Thomas Helland	8d5cd91ca0	nir: Migrate nir_dce to instr worklist Shader-db runtime change average of five runs: Before 125,77 seconds (+/- 0,09%) After 124,48 seconds (+/- 0,07%) Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de> Reviewed-by: Eric Anholt <eric at anholt.net>	2018-03-21 19:26:40 +01:00
Thomas Helland	edb18564c7	nir: Initial implementation of a nir_instr_worklist Make a simple worklist by basically just wrapping u_vector. This is intended to be used in nir_opt_dce to reduce the number of calls to ralloc, as we are currently spamming ralloc quite badly. It should also give better cache locality and much lower memory usage. Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de> Reviewed-by: Eric Anholt <eric at anholt.net>	2018-03-21 19:26:27 +01:00
Scott D Phillips	cab8df1e3e	intel/tools: aubinator: Catch gen11 "enhanced execlist" submission Different registers are used for execlist submission in gen11, so also watch those. This code only watches element zero of the submit queue, which is all aubdump currently writes. Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-21 11:07:15 -07:00
Marek Olšák	a8d55374dc	radeonsi: fix a snprintf warning on gcc 7.3.0	2018-03-21 13:43:09 -04:00
Marek Olšák	cf0a95afac	radeonsi/gfx9: print the swizzle mode for testdma Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-03-21 13:40:06 -04:00
Marek Olšák	f7ffa504a0	ac/surface: compute tile swizzle for GFX9 Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-03-21 13:40:06 -04:00
Eric Anholt	9f0c9c6d18	broadcom/vc5: Don't skip job submit just because everything is scissored. The coordinate shaders may now have side effects in the form of transform feedback. Part of fixing GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_misc	2018-03-21 10:04:21 -07:00
Eric Anholt	024e814dee	broadcom/vc5: Handle sparsely populated SO target array. Fixes GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_state_variables	2018-03-21 10:04:21 -07:00
Eric Anholt	f735ac6b1c	broadcom/vc5: Fix 3D miplevel limit to match other texture targets. Fixes segfault in GTF-GLES3.gtf.GL3Tests.texture_storage.texture_storage_texture_levels on level 13.	2018-03-21 10:04:21 -07:00
Eric Anholt	ba87d85b04	broadcom/vc5: Clamp the instance divisor to 16 bits. Fixes debug assert on GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor Signed-off-by: Eric Anholt <eric@anholt.net>	2018-03-21 10:04:21 -07:00
Lionel Landwerlin	3dd92184d5	i965: fix android build This is the equivalent of commit `5770e1d89e` for android. v2: fix xml files path and file given to --header Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Fixes: `2d2b15fbca` ("i965: fix autotools/android build") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105634 Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-21 18:56:47 +02:00
Juan A. Suarez Romero	e5cd376c2f	docs: fix typo in 17.3.6 release notes Title is about 17.3.5, when it must be about 17.3.6. CC: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-21 16:37:49 +00:00
Caio Marcelo de Oliveira Filho	8571c577aa	nir/dead_cf: also remove useless ifs Generalize the code for removing dead loops to also remove dead if nodes. The conditions are the same in both cases, if the node (and its children) don't have side-effects AND the nodes after it don't use the values produced by the node. The only difference is when evaluating side effects: loops consider only return jumps as a side-effect -- they can stop execution of nodes after it; 'if' nodes outside loops should consider all kinds of jumps (return, break, continue) since all of them can cause execution of nodes after it to be skipped. After this patch, empty ifs (those in which both then and else blocks are empty) will be removed by nir_opt_dead_cf. It caused no change to shader-db, in part because the removal of empty ifs is currently covered by nir_opt_peephole_select. v2: Improve the identification of cases where break/continue can cause side-effects. (Jason) v3: Move code comment changes to a different patch. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-21 09:36:09 -07:00
Caio Marcelo de Oliveira Filho	470056d37b	nir/dead_cf: rephrase definition of a dead loop node Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-21 09:35:57 -07:00
Juan A. Suarez Romero	e1f8c23e18	docs: update calendar, add news and link release notes to 17.3.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-03-21 16:02:37 +00:00
Juan A. Suarez Romero	543e7c8382	docs: add sha256 checksums for 17.3.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `13dd6016d7`)	2018-03-21 15:58:55 +00:00
Juan A. Suarez Romero	09448940ed	docs: add release notes for 17.3.7 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `8a51f3857c`)	2018-03-21 15:58:52 +00:00
Leo Liu	c4de2f0880	radeon/vce: move feedback command inside of destroy function On the CI family, firmware requires the destroy command to be the last command in the IB, so moving the feedback command after destroy causes issues on CI cards, and we have to keep the previous logic that moves destroy back to the last command. But as with the original issue fixed previously, on newer families like Vega10 the feedback command has to be included inside of the task info command along with the destroy command. Fixes: 6d74cb25("radeon/vce: move destroy command before feedback command") Signed-off-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Cc: mesa-stable@lists.freedesktop.org	2018-03-21 11:24:35 -04:00
Eric Engestrom	1346a36162	egl: pull update from Khronos and drop local define Added in Khronos in 2b6bb4ee45cc46c89d4a "EGL_MESA_drm_image: add EGL_DRM_BUFFER_USE_CURSOR_MESA to egl.xml" [1] as part of PR #36 [2]. [1] `2b6bb4ee45` [2] https://github.com/KhronosGroup/EGL-Registry/pull/36 Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-21 14:28:05 +00:00
Eric Engestrom	f744c6c1e2	egl: align the formatting of Haiku section of eglplatform.h with Khronos' Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-21 14:28:05 +00:00
Eric Engestrom	ac698ae4a0	egl: add Ozone section to eglplatform.h This pulls in commit a93f559e9c11fa53fb5f1cc255b8f75433f85d2a "Add Ozone section to eglplatform.h" from Khronos [1] added by Brian Anderson [2] a few months ago. [1] `a93f559e9c` [2] https://github.com/KhronosGroup/EGL-Registry/pull/26 Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-21 14:28:05 +00:00
Aaron Watry	c95d953b18	clover: Dynamically calculate __OPENCL_VERSION__ and CLC language version Use get_language_version to calculate default cl standard based on device capabilities and -cl-std specified in build options. v5: move dev_clc_version declaration from an earlier patch v4: Squash the __OPENCL_VERSION__ and CLC language version patches v3: (Jan) Allow device_version up to 2.2 while device_clc_version only goes to 2.0 Use get_cl_version to calculate version instead v2: Split out from the previous patch (Pierre) Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> CC: Jan Vesely <jan.vesely@rutgers.edu>	2018-03-21 06:59:46 -05:00
Aaron Watry	29b4090d18	clover/llvm: Add get_[cl|language]_version, validation and some helpers Used to calculate the default CLC language version based on the -cl-std in build args and the device capabilities. According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by: 1) If you have -cl-std=CL1.1+ use the version specified 2) If not, use the highest 1.x version that the device supports Curiously, there is no valid value for -cl-std=CL1.0 Validates requested cl-std against device_clc_version Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> v7: (Pierre) Split cl/clc versions into separate lists and make more references const. v6: (Pierre) Add more const and fix some whitespace v5: (Aaron) Use a collection of cl versions instead of switch cases Consolidates the string, numeric version, and clc langstandard::kind v4: (Pierre) Split get_language_version addition and use into separate patches Squash patches that add the helpers and validate the language standard v3: Change device_version to device_clc_version v2: (Pierre) Move create_compiler_instance changes to correct patch to prevent temporary build breakage. Convert version_str into unsigned and use it to find language version Add build_error for unknown language version string Whitespace fixes	2018-03-21 06:59:37 -05:00
Juan A. Suarez Romero	14fffefc60	docs: add 17.3.{8,9} in the release calendar Mesa 18.0 series has not been released yet, so let's extend 17.3 lifetime. v2: add 17.3.9 in the calendar (Andres Gomez) CC: Andres Gomez <agomez@igalia.com> CC: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-21 11:57:44 +01:00
Eric Anholt	4d8b476fa9	intel/blorp: Fix compiler warning about num_layers. The compiler doesn't notice that the condition for num_layers to be undefined already defined it above (as our assert checked in a debug build). v2: Move the pair of assignments to one outside of the block. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-20 14:06:46 -07:00
Samuel Pitoiset	f0211155f1	radv: add support for VK_EXT_depth_range_unrestricted This extension removes the restrictions on minDepth/maxDepth, minDepthBounds/maxDepthBounds and VkClearDepthStencilValue::depth. The following CTS tests now pass: dEQP-VK.glsl.builtin_var.fragdepth.line_list_d32_sfloat_large_depth dEQP-VK.glsl.builtin_var.fragdepth.point_list_d32_sfloat_large_depth dEQP-VK.glsl.builtin_var.fragdepth.triangle_list_d32_sfloat_large_depth dEQP-VK.draw.inverted_depth_ranges.nodepthclamp_depth_range_unrestricted dEQP-VK.draw.inverted_depth_ranges.depthclamp_depth_range_unrestricted Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-20 21:55:41 +01:00
Samuel Pitoiset	4e9b0b39b5	radv: only enable one channel when exporting prim id It's a 32-bit integer like the layer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-20 21:54:48 +01:00
Lionel Landwerlin	5770e1d89e	i965: fix out of tree autotools build Fixes: `2d2b15fbca` ("i965: fix autotools/android build") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-03-20 19:48:56 +00:00
Stéphane Marchesin	1117edc60d	virgl: Implement seamless cube maps This was previously ignored. Along with the virglrenderer patch, this fixes ~100 dEQP tests: dEQP-GLES3.functional.texture.filtering.cube.* Signed-off-by: Stéphane Marchesin <marcheu@chromium.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-21 05:44:52 +10:00
Emil Velikov	c43715d30b	i965: annotate brw_oa.py's --header and --code as required As of earlier commit, the --header was made a hard requirement when using --code. Hence - annotate both as required and drop a few no longer needed checks. Fixes: `035cc7a12d` ("i965: perf: reduce i965 binary size") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-20 17:21:49 +00:00
Lionel Landwerlin	d3e5d3955c	i965: pipecontrol: add LRI write immediate flag Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-20 16:58:30 +00:00
Lionel Landwerlin	7f977d51b3	intel: genxml: add INSTPM/CS_DEBUG_MODE2 registers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-20 16:58:30 +00:00
Lionel Landwerlin	2d2b15fbca	i965: fix autotools/android build Autotools/android builds generate the header & code files in 2 steps, but the code generation requires the name of the header file to include it. This change generates both files in one command. Fixes: `035cc7a12d` ("i965: perf: reduce i965 binary size") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-20 16:58:29 +00:00
Daniel Stone	9f3509665d	dri3: Fix typo in version check The have-new-DRI3 codepaths would never actually properly trigger, since there was a typo in configure.ac which broke the version check. This went unnoticed but for an error in config.log if you looked closely enough. Signed-off-by: Daniel Stone <daniels@collabora.com> Reported-by: Lukas F. Hartmann <lukas@mntmn.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Fixes: `7aeef2d4ef` ("dri3: allow building against older xcb (v3)") Cc: Dave Airlie <airlied@redhat.com>	2018-03-20 16:38:08 +00:00
Daniel Stone	bc5e59119e	meson: Don't build svga by default on ARM/AArch64 VMware has no (published) support for Arm-architecture guests. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reported-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-20 16:18:37 +00:00
Daniel Stone	d7603cb518	meson: Add default DRI drivers for ARM/AArch64 On all Arm architectures (ARMv7 and below as 'arm', ARMv8 and above as 'aarch64'), only build swrast for DRI drivers. The only classic drivers which could be used are r200 and NV20 cards, which seems unlikely enough that it shouldn't be the default. Signed-off-by: Daniel Stone <daniels@collabora.com> Reported-by: Javier Jardón <jjardon@gnome.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-20 16:18:37 +00:00
Emil Velikov	28780c5028	st/mesa: add compiler/nir/ prefix for nir includes Stay consistent with the rest of the codebase, effectively fixing the autotools build. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105621 Fixes: `ffa4bbe466` ("st/nir/radeonsi: move nir_lower_uniforms_to_ubo() to the state tracker") Cc: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-20 16:11:19 +00:00
Scott D Phillips	d849d36c6c	anv: off-by-one in GetDescriptorSetLayoutSupport Loop was accessing one more than bindingCount elements from pBindings, accessing uninitialized memory. Fixes: `ddc4069122` ("anv: Implement VK_KHR_maintenance3") Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-20 07:58:10 -07:00
Lionel Landwerlin	035cc7a12d	i965: perf: reduce i965 binary size Performance metric numbers are calculated the following way : - out of the 256 bytes long OA reports, we accumulate the deltas into an array of uint64_t - the equations' generated code reads the accumulated uint64_t deltas and normalizes them for a particular platform Our hardware is such that a number of counters in the OA reports always return the same values (i.e. they're not programmable), and they return the same values even across generations, and as a result a number of equations are identical in different metric sets across different generations. Up to now we've kept the generated code of the equations separated in different files (per generation/GT), and didn't apply any factorization of the common equations. We could have made some improvement by reusing equations within a given metrics file, but we can go even further and reuse across generations (i.e. all files). This change changes the code generation to emit a single file in which we reuse equations' emitted code based on the hash of equations' strings. 
Here are the savings in a meson build : Before(.old)/after : $ du -h ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old 43M ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so 47M ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old $ size build/src/mesa/drivers/dri/libmesa_dri_drivers.so build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old text data bss dec hex filename 13054002 409424 671856 14135282 d7aff2 build/src/mesa/drivers/dri/libmesa_dri_drivers.so 14550386 409552 671856 15631794 ee85b2 build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old As a side comment here is the size of the drivers if we remove all of the metrics from the build : $ du -sh build/src/mesa/drivers/dri/libmesa_dri_drivers.so 40M build/src/mesa/drivers/dri/libmesa_dri_drivers.so v2: Fix an issue with hashing of counter equations (Lionel) Build system rework (Emil) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (build system part) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-20 13:56:07 +00:00
Lionel Landwerlin	e9a9e85948	i965: perf: fix a counter return type on hsw The equation code computes a float (percentage) yet the return type was a uint64_t. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-20 11:36:13 +00:00
Tapani Pälli	604cac9f73	mesa: fix leaking ParameterValueOffset ==15115== 48 bytes in 1 blocks are definitely lost in loss record 16 of 66 ==15115== at 0x4C2EC15: realloc (vg_replace_malloc.c:785) ==15115== by 0x8602C3E: _mesa_reserve_parameter_storage (prog_parameter.c:212) ==15115== by 0x8602D1E: _mesa_add_parameter (prog_parameter.c:252) ==15115== by 0x86032C4: _mesa_add_sized_state_reference (prog_parameter.c:384) ==15115== by 0x8603324: _mesa_add_state_reference (prog_parameter.c:409) Fixes: `edded12376` "mesa: rework ParameterList to allow packing" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-20 13:25:07 +02:00
Daniel Stone	478fc2d2a1	dri3: Don't fail on version mismatch The previous commit to make DRI3 modifier support optional, breaks with an updated server and old client. Make sure we never set multibuffers_available unless we also support it locally. Make sure we don't call stubs of new-DRI3 functions (or empty branches) which will never succeed. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Fixes: `7aeef2d4ef` ("dri3: allow building against older xcb (v3)")	2018-03-20 08:52:59 +00:00
Timothy Arceri	9a243eccae	radv: don't lower indirects until after opts have run Noticed while passing by. Not sure if it impacts anything, but likely to impact GFX9 more than anything else since we lower inputs, outputs and locals there. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-20 15:01:44 +11:00
Timothy Arceri	dfe2f19855	st/nir: fix atomic lowering for gallium drivers i965 and gallium handle the atomic buffer index differently. It was just by luck that the single piglit test for this was passing. For gallium we use the atomic binding so that we match the handling in st_bind_atomics(). On radeonsi this fixes the CTS test: KHR-GL43.shader_storage_buffer_object.advanced-write-fragment It also fixes tressfx hair rendering in Tomb Raider. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:29:53 +11:00
Timothy Arceri	632d5e97ef	st/radeonsi: enable uniform packing in NIR backend Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:19:35 +11:00
Timothy Arceri	231333a20d	st: add uniform packing support to lower_uniforms_to_ubo() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:34 +11:00
Timothy Arceri	9c51a7ea29	gallium: add packed uniform CAP Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:34 +11:00
Timothy Arceri	ffa4bbe466	st/nir/radeonsi: move nir_lower_uniforms_to_ubo() to the state tracker This will only ever be used by gallium drivers so it probably doesn't belong in the nir toolkit. Also we want to pass it some non NIR things in the following patch. To avoid regressions we wrap the lowering calls that have been moved to st_glsl_to_nir with a quick hack so that they are only called for radeonsi, we will replace the hack with a check for uniform packing in a following patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:34 +11:00
Timothy Arceri	a80cf442d9	st: add st_glsl_type_dword_size() helper This will be used to support uniform packing. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:34 +11:00
Timothy Arceri	5488166730	st/glsl_to_nir: add support for packed builtin uniforms Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:34 +11:00
Timothy Arceri	57ebab64c0	mesa: add _mesa_add_sized_state_reference() helper This will be used for adding packed builtin uniforms. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:34 +11:00
Timothy Arceri	2377754329	mesa: add propagate uniform support for packed uniforms Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:34 +11:00
Timothy Arceri	40711a7a60	mesa: allow for uniform packing when adding uniforms to param list Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-03-20 14:17:33 +11:00
Timothy Arceri	a2198d4fdb	mesa: add packing support for setting uniform handles Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-03-20 14:17:33 +11:00
Timothy Arceri	6cfa15b803	mesa: add packing support for setting uniforms Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-03-20 14:17:33 +11:00
Timothy Arceri	4a7c5c079b	mesa: create copy uniform to storage helpers These will be used in the following patch to allow copying directly to the param list when packing is enabled. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:33 +11:00
Timothy Arceri	edded12376	mesa: rework ParameterList to allow packing Currently everything is padded to 4 components. Making the list more flexible will allow us to do uniform packing. V2 (suggestions from Nicolai): - always pass existing calls to _mesa_add_parameter() true for padd_and_align - fix bindless param value offsets - remove left over wip logic from pad and align code - zero out param value padding - whitespace fix Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:33 +11:00
Timothy Arceri	b13b9eb432	mesa: add PackedDriverUniformStorage const Will be used to determine whether to take packing code paths or not. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-20 14:17:33 +11:00
Eric Anholt	00910e3057	broadcom/vc5: Don't annotate dumps with stale live intervals. As you're debugging register allocation, you may have changed the intervals and not recomputed yet. Just skip the dump in that case.	2018-03-19 16:44:20 -07:00
Eric Anholt	facc3c6f58	broadcom/vc5: Add support for register spilling. Our register spilling support is nice to have since vc4 couldn't at all, but we're still very restricted due to needing to not spill during a TMU operation, or during the last segment of the program (which would be nice to spill a value of, when there's a long-lived value being passed through with little modification from the start to the end). We could do better by emitting unspills for the last-segment values just before the last thrsw, since the last segment is probably not the maximum interference area. Fixes GTF uniform_buffer_object_arrays_of_all_valid_basic_types and 3 others.	2018-03-19 16:44:06 -07:00
Eric Anholt	271fc58ba1	broadcom/vc5: Remove redundant last_inst lookup. The point was to get the MOV, which the MOV_dest already returned.	2018-03-19 16:42:59 -07:00
Eric Anholt	34dc64f627	broadcom/vc5: On QPU pack error, dump the instruction and return cleanly. This is nice for debugging when you've made a bad instruction.	2018-03-19 16:42:59 -07:00
Eric Anholt	d721348dcd	broadcom/vc5: Add cursors to the compiler infrastructure, like NIR's. This will let me do lowering late in compilation using the same instruction builder as we use in nir_to_vir.	2018-03-19 16:42:59 -07:00
Eric Anholt	c81d681742	broadcom/vc5: Move the umul macro to a header. Anywhere we want to multiply, we probably want this.	2018-03-19 16:42:59 -07:00
Eric Anholt	9e28c18cd1	broadcom/vc5: Correct the arg count of TIDX/EIDX.	2018-03-19 16:42:59 -07:00
Eric Anholt	55bf298333	broadcom/vc5: Re-do live variables after removing thrsws. Otherwise our start/ends ips won't line up with the actual instructions.	2018-03-19 16:42:59 -07:00
Eric Anholt	c3a504f470	broadcom/vc5: Add a QPU helper for instructions using the TLB. This will be used for detecting last thread segment in register spilling.	2018-03-19 16:42:59 -07:00
Eric Anholt	09c4dd1971	broadcom/vc5: Introduce v3d_qpu_reads_vpm()/v3d_qpu_writes_vpm(). These helpers will be used in register spilling to determine where to add a last thrsw if needed, and might help refactor QPU scheduling.	2018-03-19 16:42:59 -07:00
Eric Anholt	407f21ef1b	broadcom/vc5: The ldvpm signal is also a case of using the VPM. The QPU scheduling code calling this function already separately checked this signal.	2018-03-19 16:42:59 -07:00
Eric Anholt	4760040c09	broadcom/vc5: Extract v3d_qpu_writes_tmu() helper. This will be reused in register spilling.	2018-03-19 16:42:59 -07:00
Dave Airlie	32791a0502	radv: don't export NULL layer. We have some cases where in subpass we want the layer but having it be 0 and loaded in the frag shader without the vertex shader exporting it is fine. So don't export the layer if we don't have a value to put in it. Fixes: `d4c74aed7a` (radv/multiview: mark layer_input if we have input attachments.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 21:36:48 +00:00
Marek Olšák	f674b50d0e	mesa: adjust incorrect comment in texture_buffer_range	2018-03-19 16:56:17 -04:00
Ian Romanick	6aeaa7d363	nir: Don't compare b2f or b2i with zero All of the shaders that had loops changed were in Tomb Raider. The one shader that lost SIMD16 is one of those. Skylake total instructions in shared programs: 14391653 -> 14390468 (<.01%) instructions in affected programs: 111891 -> 110706 (-1.06%) helped: 501 HURT: 0 helped stats (abs) min: 1 max: 155 x̄: 2.37 x̃: 1 helped stats (rel) min: 0.05% max: 21.54% x̄: 1.61% x̃: 1.01% 95% mean confidence interval for instructions value: -3.23 -1.50 95% mean confidence interval for instructions %-change: -1.77% -1.45% Instructions are helped. total cycles in shared programs: 532793024 -> 532776598 (<.01%) cycles in affected programs: 987682 -> 971256 (-1.66%) helped: 348 HURT: 41 helped stats (abs) min: 1 max: 3074 x̄: 54.91 x̃: 18 helped stats (rel) min: 0.05% max: 32.24% x̄: 3.36% x̃: 1.68% HURT stats (abs) min: 1 max: 422 x̄: 65.39 x̃: 24 HURT stats (rel) min: 0.09% max: 39.29% x̄: 9.50% x̃: 2.02% 95% mean confidence interval for cycles value: -64.08 -20.38 95% mean confidence interval for cycles %-change: -2.78% -1.23% Cycles are helped. total loops in shared programs: 4854 -> 4829 (-0.52%) loops in affected programs: 27 -> 2 (-92.59%) helped: 18 HURT: 0 LOST: 1 GAINED: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-19 13:52:35 -07:00
Dave Airlie	e8d9b7ab02	radv: lower constant initializers on output variables earlier If a shader only writes to an output via a constant initializer we need to lower it before we call nir_remove_dead_variables so that this pass sees the stores from the initializer and doesn't kill the output. Fixes test failures in new work-in-progress CTS tests: dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.float This is ported from anv: `99b57daf4a` anv/pipeline: lower constant initializers on output variables earlier from Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:40 +00:00
Dave Airlie	032014ac01	radv/query: handle multiview timestamp queries. For each view bit we need to emit a timestamp query. Fixes: dEQP-VK.multiview.queries* Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:14 +00:00
Dave Airlie	32b4f3c38d	radv/query: handle multiview queries properly. (v3) For multiview we need to emit a number of sequential queries depending on the view mask. This avoids dEQP-VK.multiview.queries.15 waiting forever on the CPU for query results that are never coming. We only really want to emit one query, and the rest should be blank (amdvlk does the same), so we emit begin/end pairs for all the others except the first query. v2: fix tests v3: split out patch. Fixes: dEQP-VK.multiview.queries* Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:09 +00:00
Dave Airlie	4034dc5c72	radv/query: split out begin/end query emission This just splits out the begin/end query hw emissions, it makes it easier to add multiview support for queries. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:05 +00:00
Dave Airlie	d4c74aed7a	radv/multiview: mark layer_input if we have input attachments. This fixes: dEQP-VK.multiview.input_attachments* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:26:39 +00:00
Caio Marcelo de Oliveira Filho	f6338c3b85	anv/pipeline: set active_stages early Since the intermediate states of active_stages are not used, i.e. active_stages is read only after all stages were set into it, just set its value before compiling the shaders. This will allow certain passes to be run conditionally based on what other shaders are being used, e.g. a certain pass might only be applicable to the vertex shader if there's no geometry or tessellation shader being used. v2: Use vk_to_mesa_shader_stage. (Lionel) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-19 18:00:49 +00:00
Caio Marcelo de Oliveira Filho	318073ce66	anv/pipeline: fail if TCS/TES compile fails v2: Add Fixes tag. (Lionel) Fixes: `e50d4807a3` ("anv: Compile TCS/TES shaders.") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-19 18:00:49 +00:00
Jordan Justen	2ed288363f	main/program_binary: In ProgramBinary set link status as LINKING_SKIPPED This change allows the disk shader cache to work with programs loaded with ProgramBinary. Drivers check for LINKING_SKIPPED, and if set, then they try to use the shader cache. Since the program loaded by ProgramBinary is similar to loading the shader from the disk cache, this is probably more appropriate. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-19 09:57:09 -07:00
Jordan Justen	d2b74ca2b5	i965: Allow disk shader cache usage with LINKING_SUCCESS status Currently, we only look in the disk shader cache if we see that the shader program is in the cache during the link step. If the shader cache entry isn't found during the program link, there are still some (fairly unlikely) scenarios where later it might be useful to search the cache for gen binary programs. 1. If the cache evicts the serialized glsl cache, there might still be valid gen program entries in the disk cache. 2. If two applications are running in parallel, then it is possible that one may write out the cached gen program item which the other application can then make use of. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-19 09:57:09 -07:00
Jordan Justen	b5baaee0d6	glsl/serialize: Save shader program metadata sha1 When the shader cache is used, this can be generated. In fact, the shader cache uses this sha1 to lookup the serialized GL shader program. If a GL shader program is restored with ProgramBinary, the shaders are not available, and therefore the correct sha1 cannot be generated. If this is restored, then we can use the shader cache to restore the binary programs to the program that was loaded with ProgramBinary. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-03-19 09:57:09 -07:00
Jordan Justen	9b473f9e3c	glsl: Remove api_enabled tracking for transform feedback We used this to prevent usage of the disk shader cache when transform feedback was enabled via the GL API. This is no longer used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-19 09:57:09 -07:00
Jordan Justen	fc4a7aaa82	i965: Allow disk shader cache usage with transform feedback Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-19 09:57:09 -07:00
Jordan Justen	6d830940f7	glsl/shader_cache: Allow shader cache usage with transform feedback Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444 Suggested-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-19 09:57:09 -07:00
Jose Fonseca	e10dc12f6f	scons: need to split CC or things might fail We've seen this fail internally. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-03-19 16:41:57 +01:00
Jordan Justen	d07a49fb18	i965: Add INTEL_DEBUG stages support for disk shader cache Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-19 00:07:29 -07:00
Dave Airlie	8f052a3e25	radv: handle exporting view index to fragment shader. (v1.1) The fragment shader was trying to read this, but nothing was exporting it from the vertex shader. This handles it like the prim id export. Fixes: dEQP-VK.multiview.secondary_cmd_buffer.* dEQP-VK.multiview.index.fragment_shader.* v1.1: updated to use 0x1 (Samuel) Fixes: `e3265c10c8` (radv: Implement multiview draws.) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-19 01:20:00 +00:00
Axel Davy	dbc24835d7	st/nine: Fix non-invertible matrix check There was a missing absolute value when checking if the determinant was big enough. Fixes: https://github.com/iXit/Mesa-3D/issues/292 Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Patrick Rudolph <siro@das-labor.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>	2018-03-18 22:53:46 +01:00
Axel Davy	f61e9a958b	st/nine: Fixes warning about implicit conversion Makes the conversion explicit. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102542 Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Patrick Rudolph <siro@das-labor.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>	2018-03-18 22:53:42 +01:00
Axel Davy	71eae7940e	st/nine: Fix bad tracking of vs textures for NINESBT_ALL Stateblocks with NINESBT_ALL should track all textures. For better performance they have a faster path which copies all the required. This path was only tracking ps textures. Fixes: https://github.com/iXit/Mesa-3D/issues/303 Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Patrick Rudolph <siro@das-labor.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> CC: "17.3 18.0" <mesa-stable@lists.freedesktop.org>	2018-03-18 22:53:36 +01:00
Axel Davy	76fa1f730b	st/nine: Fix bad tracking of bound vs textures An incorrect formula was used to compute bound_samplers_mask_vs. Since s is always above 8 for vs and the variable is encoded on 8 bits, it was always 0. This resulted in committing the samplers every call when there was at least one texture read in the vs shader. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Patrick Rudolph <siro@das-labor.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-03-18 22:53:32 +01:00
Grazvydas Ignotas	e1b2e5667c	radv: make vk_format_description structures static No need to bother the linker about them. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-17 18:53:21 +02:00
Grazvydas Ignotas	331141e87e	radv: fix stale comment in generated vk_format_table.c It seems to be a leftover from u_format_table.py. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-17 18:53:21 +02:00
Eric Anholt	7db1c09d12	anv: Silence warning about heap_size. We only get VK_SUCCESS if it was initialized, but apparently my compiler doesn't track that far. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-16 15:10:05 -07:00
Eric Anholt	d25640c3a3	i965: Silence compiler warning about promoted_constants. We only have a cfg != NULL if we went through one of the paths that set it, but my compiler doesn't figure that out. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `6411defdcd` ("intel/cs: Re-run final NIR optimizations for each SIMD size")	2018-03-16 15:09:55 -07:00
Eric Anholt	9f89452ea3	anv: Silence compiler warnings about uninitialized bind_offset. This is a legitimate warning: if anv's blorp_alloc_binding_table() throws an error from anv_cmd_buffer_alloc_blorp_binding_table(), we silently continue to use this undefined value. The rest of this code doesn't seem very allocation-error-proof, though, either. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-16 15:09:47 -07:00
Matt Turner	f3833f1ca7	intel/compiler: Use gen_get_device_info() in test_eu_validate Previously the unit test filled out a minimal devinfo struct. A previous patch caused the test to begin assert failing because the devinfo was not complete. Avoid this by using the real mechanism to create devinfo. Note that we have to drop icl from the table, since we now rely on the name -> PCI ID translation done by gen_device_name_to_pci_device_id(), and ICL's PCI IDs are not upstream yet. Fixes: `f89e735719` ("intel/compiler: Check for unsupported register sizes.") Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-16 13:20:21 -07:00
Matt Turner	54db78b196	intel: Add cfl to gen_device_name_to_pci_device_id() Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-16 13:20:21 -07:00
Rob Clark	bc5001325b	meson+dri3: allow building against older xcb (v3) Similar to previous patch, make xcb 1.13 optional. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-16 16:18:42 -04:00
Dave Airlie	7aeef2d4ef	dri3: allow building against older xcb (v3) I'm not sure everyone wants to be updating their dri3 in a forced march setting, this allows a nicer approach, esp when you want to build on distros that aren't brand new. I'm sure there are plenty of ways this patch could be cleaner, and I've also not built it against an updated dri3. For meson I've just left it alone, since if you are using meson you probably don't mind xcb updates, and if you are using meson you can fix this better than me. v3: just don't put a version in for dri3/present without modifiers, should allow building with 1.11 as well (feel free to supply meson followups) Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-03-16 13:19:45 -04:00
Marek Olšák	f099c3aef1	r600: consolidate PIPE_BIND_SHARED/SCANOUT handling (Ported from radeonsi commit `f70f6baaa3`) Allows cached BOs to be reused in more cases. Bugzilla: https://bugs.freedesktop.org/105171 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>	2018-03-16 17:31:28 +01:00
Rafael Antognolli	f89e735719	intel/compiler: Check for unsupported register sizes. Make sure we don't emit 64 bit types if the hardware doesn't support them. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-16 09:27:16 -07:00
Jason Ekstrand	315ee5faec	loader: Include include/drm-uapi in the autotools build We're already including it in the meson build. This fixes build issues on systems which have a drm_fourcc.h that doesn't have modifiers. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-16 08:50:07 -07:00
Wu, Zhongmin	5fc21c6044	egl/android: Implement the eglSwapInterval for Android. Implement the eglSwapInterval for Android platform to enable the async mode for some GFX benchmarks such as Daimler C217, CityBench. Results of the dEQP-EGL.*swap_interval tests 'dEQP-EGL.functional.query_config.get_config_attrib.max_swap_interval'.. 'dEQP-EGL.functional.query_config.get_config_attrib.min_swap_interval'.. 'dEQP-EGL.functional.choose_config.simple.selection_only.max_swap_interval'.. 'dEQP-EGL.functional.choose_config.simple.selection_only.min_swap_interval'.. 'dEQP-EGL.functional.choose_config.simple.selection_and_sort.max_swap_interval'.. 'dEQP-EGL.functional.choose_config.simple.selection_and_sort.min_swap_interval'.. 'dEQP-EGL.functional.negative_api.swap_interval'.. Test run totals: Passed: 7/7 (100.0%) Failed: 0/7 (0.0%) Not supported: 0/7 (0.0%) Warnings: 0/7 (0.0%) Signed-off-by: Zhongmin Wu <zhongmin.wu@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> [Emil Velikov: polish inline comment, add dEQP stats, s/dpy/disp/] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-16 13:58:56 +00:00
Emil Velikov	3a9fb4f7ad	st/mesa: simplify st_init_limits() via tgsi_processor_to_shader_stage Reuse the tgsi helper and remove a bunch of duplicated code. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-16 13:49:16 +00:00
Emil Velikov	f7f95310f0	tgsi: move tgsi_processor_to_shader_stage() to a header This way we can utilise it with later patches. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-16 13:48:46 +00:00
Emil Velikov	9fa1d822bf	egl/dri2: move wayland header inclusion where applicable Instead of indirectly pulling the wayland headers everywhere, use forward declarations and #include only as needed. Should effectively fix build errors like the following: make[5]: Entering directory '/.../src/gallium/state_trackers/omx/tizonia' CC h264dprc.lo In file included from h264dprc.c:45:0: .../src/egl/drivers/dri2/egl_dri2.h:47:10: fatal error: wayland/wayland-egl/wayland-egl-backend.h: No such file or directory #include "wayland/wayland-egl/wayland-egl-backend.h" Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Andy Furniss <adf.lists@gmail.com>	2018-03-16 13:47:59 +00:00
Emil Velikov	d091c9c4cf	vulkan/wsi/x11: correct DRI3 version in comment During development the version was bumped, yet the comment did not get an update. Fixes: `c80c08e226` ("vulkan/wsi/x11: Add support for DRI3 v1.2") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-16 13:47:52 +00:00
Emil Velikov	19ec817756	vulkan/wsi/x11: use ARRAY_SIZE where applicable Use the handy macro instead of hard coded numbers. Fixes: `c80c08e226` ("vulkan/wsi/x11: Add support for DRI3 v1.2") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-16 13:45:47 +00:00
Juan A. Suarez Romero	705a6446b4	mesa: RGB9_E5 invalid for CopyTexSubImage* in GLES According to OpenGL ES 3.2, section 8.6, CopyTexSubImage* should return an INVALID_OPERATION if the internalformat of the texture is RGB9_E5. This fixes dEQP-GLES31.functional.debug.negative_coverage.*.copytexsubimage2d_texture_internalformat. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-03-16 12:49:16 +00:00
Christian Gmeiner	5e51f72374	etnaviv: remove superfluous \n from DBG(..) callers The DBG(..) macro appends a \n already so there is no need to do it twice. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-03-16 11:41:27 +01:00
Samuel Pitoiset	e96a1d27dc	radv: run nir_opt_move_load_ubo Polaris10: SGPRS: 108560 -> 107856 (-0.65 %) VGPRS: 74576 -> 74520 (-0.08 %) Spilled SGPRs: 7375 -> 7113 (-3.55 %) Code Size: 4273464 -> 4274364 (0.02 %) bytes Max Waves: 9434 -> 9446 (0.13 %) Vega10: Totals from affected shaders: SGPRS: 108264 -> 107576 (-0.64 %) VGPRS: 69068 -> 69000 (-0.10 %) Spilled SGPRs: 7221 -> 6959 (-3.63 %) Code Size: 3800796 -> 3801496 (0.02 %) bytes Max Waves: 10687 -> 10709 (0.21 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-16 09:58:19 +01:00
Samuel Pitoiset	af355aaa07	nir: add nir_opt_move_load_ubo() optimization pass This pass moves load UBO operations just before their first use, loosely based on nir_opt_move_comparisons. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-16 09:50:31 +01:00
Dave Airlie	9d0d806332	radv: drop geometry stride user sgpr. This removes the other geometry specific user sgpr. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:21 +00:00
Dave Airlie	6f051549c3	radv: get rid of geometry user sgpr for num entries. This drops one of the geometry specific user sgprs, we can work this out at compile time. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:17 +00:00
Dave Airlie	9188bd78d7	radv: migrate lds size calculations to shader gen. This moves the lds_size calcs into the shader so we have all the size stuff in one file. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:12 +00:00
Dave Airlie	384aced65e	radv: drop scanning the tess shader in the nir code. This drops the now unneeded scanning and results in favour of the ones in the info. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:08 +00:00
Dave Airlie	f50d520acf	radv: use num_patches output from tcs shader. Instead of recalculating the value, use the shader calculated value. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:05 +00:00
Dave Airlie	bf9a0ea853	radv/tess: remove last chunk of tess sgprs This removes the last TES-specific user sgpr. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:23:01 +00:00
Dave Airlie	6db44d6a8c	radv: pass num_patches to tes from tcs TES needs num_patches to do some of the calculations. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:58 +00:00
Dave Airlie	010d055aae	radv: drop tess offchip layout for tcs. This removes the last TCS specific user sgpr. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:54 +00:00
Dave Airlie	ee31cff856	radv: drop tcs_out_offsets Move all calculations to shader generation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:47 +00:00
Dave Airlie	b0460bbf1c	radv: drop tcs_out_layout Move all calculations to shader generation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:43 +00:00
Dave Airlie	6adf99165c	radv/tess: drop tcs_in_layout setting completely. Inline all calcs at shader creation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:37 +00:00
Dave Airlie	f343d11ae7	radv: drop ls_out_layout const. We can precalculate input_vertex_size at compile time. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:32 +00:00
Dave Airlie	d89b16b7b9	radv/shader_info: start gathering tess output info (v2) This gathers the ls outputs written by the vertex shader, and the tcs outputs, these are needed to calculate certain tcs parameters. These have to be separate for combined gfx9 shaders. This is a bit pessimistic compared to the nir pass, as we don't work out the individual slots for tcs outputs, but I actually think it should be fine to just mark the whole thing used here. v2: move to radv, handle clip dist (Samuel), handle compacts and patches properly. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:23 +00:00
Dave Airlie	2012dae19a	radv: migrate unique index info shader info (v2) This just moves this function to an inline so the shader_info pass can use it. v2: use inline (Samuel) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-16 05:22:19 +00:00
Samuel Pitoiset	f02f1ad13f	Revert "mesa: do not trigger _NEW_TEXTURE_STATE in glActiveTexture()" This reverts commit `f314a532fd`. This appears to introduce some blinking textures in UT2004. Not sure exactly what's the root cause because we don't have much information about the issue. Anyway, this was just a micro optimization that actually breaks, at least, one app almost one year later. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105436 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-15 21:32:52 +01:00
Lionel Landwerlin	51783f3e7d	anv: silence unused variable warning Fixes: `59b0ea0c74` ("anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-15 18:56:26 +00:00
Lionel Landwerlin	b5b56f91f5	i965: silence unused function warning [123/227] Compiling C object 'src/mesa/drivers/dri/i965/libi965_gen110@sta/genX_blorp_exec.c.o'. ../src/mesa/drivers/dri/i965/genX_blorp_exec.c:99:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function] blorp_get_surface_base_address(struct blorp_batch *batch) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-15 18:56:23 +00:00
Lionel Landwerlin	0f544a3c51	anv: silence unused function warning on gen11 [84/227] Compiling C object 'src/intel/vulkan/libanv_gen110@sta/genX_blorp_exec.c.o'. ../src/intel/vulkan/genX_blorp_exec.c:68:1: warning: ‘blorp_get_surface_base_address’ defined but not used [-Wunused-function] blorp_get_surface_base_address(struct blorp_batch *batch) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-15 18:55:42 +00:00
Dylan Baker	2a7027f79a	meson: fix pipe-loaders after omx changes with_gallium_omx used to be a boolean, but now it's a string. That means it needs to be compared to 'disabled' instead of false. CC: Rob Clark <robdclark@gmail.com> Fixes: `34e852d5b5` ("meson: Re-add auto option for omx") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Tested-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-15 10:02:32 -07:00
Dylan Baker	9bd7a6f6f0	meson: require amdgpu >= 2.4.91 the meson equivalent of `f8773edb0a` Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-15 10:00:02 -07:00
Marek Olšák	f8773edb0a	configure.ac: require libdrm_amdgpu 2.4.91 Since 2.4.90 is problematic, just ask for the next version. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-15 12:44:40 -04:00
Marek Olšák	5d0acff39e	configure.ac: blacklist libdrm 2.4.90 Cc: 18.0 17.3 17.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-15 12:44:37 -04:00
Samuel Pitoiset	16ecf037f9	radv: dump LLVM IR when a hang is detected Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:20:07 +01:00
Samuel Pitoiset	81818662a5	radv: record LLVM IR when debugging shaders If AMD_shader_info or RADV_TRACE_FILE is used we might need to keep trace of LLVM IR. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:20:03 +01:00
Samuel Pitoiset	d07edf5fdf	radv: add dump_shader to the NIR compiler options Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:20:00 +01:00
Samuel Pitoiset	50fcca328c	radv: pass the NIR compiler options to ac_compile_llvm_module() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:19:58 +01:00
Samuel Pitoiset	14c27c2511	radv: print some information when RADV_TRACE_FILE is set Just to be sure all options are enabled when trying to generate a hang report. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:19:54 +01:00
Samuel Pitoiset	5be2757c35	radv: only display options that are enabled Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 17:19:52 +01:00
Eric Engestrom	6332893594	mailmap: Use Eric Engestrom's personal email address Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-15 12:03:41 +00:00
Alejandro Piñeiro	50767214a7	spirv/radv: add AMD_gcn_shader capability, remove current extensions So now, during spirv_to_nir, it uses the capability instead of the extension. Note that what we are really doing here is treating SPV_AMD_gcn_shader like other supported extensions. SPV_AMD_gcn_shader is not the first SPV extension supported. For example, the capability draw_parameters infers if the extension SPV_KHR_shader_draw_parameters is supported or not. This could be seen as counter-intuitive, and it might seem easier to define which extensions are supported and base our checks on that, but we need to take into account that some capabilities are optional from core, and others came from new extensions. Also this commit would make the implementation of ARB_spirv_extensions easier. v2: AMD_gcn_shader capability renamed to gcn_shader (Daniel Schürmann) Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-15 12:08:25 +01:00
Samuel Iglesias Gonsálvez	adf58e59d3	spirv: update arguments for vtn_nir_alu_op_for_spirv_opcode() We don't need anymore the source and destination's data type, just their bitsize. v2: - Use glsl_get_bit_size () instead (Jason). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-15 08:56:15 +01:00
Samuel Iglesias Gonsálvez	ce2fd87056	spirv: fix the translation of SPIR-V conversion opcodes to NIR Some SPIR-V opcodes (like UConvert and SConvert) have expectations of the output that don't depend on the operands' data type. Generalize the solution for all of them. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-15 08:51:01 +01:00
Mathias Fröhlich	98f35ad63c	vbo: Correctly handle source arrays in vbo_split_copy. The original approach optimized away a few too many fields. Reestablish the pointer into the original array and correctly feed that one. Reviewed-by: Brian Paul <brianp@vmware.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105471 Fixes: `64d2a20480` mesa: Make gl_vertex_array contain pointers to first order VAO members. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-15 06:11:57 +01:00
Apple SWE	361f79c97f	sched.h needs to be imported on Darwin/OSX targets. sched_yield is used but the include reference on Darwin is missing. This patch conditionally guards on Darwin/OSX to import sched.h first. Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-14 22:08:34 -07:00
Apple SWE	67f27b1e18	Add processor topology calculation implementation for Darwin/OSX targets. The implementation for bootstrapping SWR on Darwin targets is based on the Linux version. Instead of reading the output of /proc/cpuinfo, sysctlbyname is used to determine the physical identifiers, processor identifiers, core counts and thread-processor affinities. With this patch, it is possible to use SWR as an alternate renderer on OSX to softpipe and llvmpipe. Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-14 22:08:34 -07:00
Dave Airlie	4b15b5e803	virgl: resize resource bo allocation if we need to. This fixes an illegal command buffer on the host seen with piglit arb_internalformat_query2-max-dimensions Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-15 12:26:39 +10:00
Mario Kleiner	c1e47a3c1f	nv50,nvc0: Support BGRX1010102 and RGBX1010102 for sampling. Add them as usable for textures, so they can be used by Wayland drm in 10 bpc mode and for X11 compositing under GLX and EGL. We need these formats to be supported at least for sampling, otherwise GLX_texture_from_pixmap and the equivalent EGL image extension won't work with X11 drawables of depth 30 and just display an all black window. Do not expose these formats as renderable, and thereby not as a fbconfig/EGLConfig/Visual, as NVidia hw does not support 10 bpc unorm formats without alpha channel. Tested under X11 + GLX/EGL + DRI2/DRI3 for compositing, and under Wayland+Weston drm backend with a Tesla and Pascal gpu. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-03-14 21:41:27 -04:00
Thomas Helland	03e37ec6d7	util: Use set_foreach instead of rolling our own This follows the same pattern as in the hash_table. Reviewed-by: Jason Ekstrand <jason.ekstrand at intel.com>	2018-03-14 20:03:57 +01:00
Thomas Helland	5f129c05e6	glsl: Use hash table cloning in copy propagation Walking the whole hash table, inserting entries by hashing them first is just a really bad idea. We can simply memcpy the whole thing. V2: Remove leftover creation of acp in two places Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-14 19:52:02 +01:00
Thomas Helland	6baaf4291b	util: Implement a hash table cloning function V2: Don't rzalloc; we are about to rewrite the whole thing (Vladislav) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-14 19:52:01 +01:00
Guillaume Charifi	388ed47081	st/mesa: Factorize duplicate code in st_BlitFramebuffer() Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-03-14 14:46:51 -04:00
Dylan Baker	7dd261ac50	autotools: add -I/src/egl to tizonia This fixes the following build breakage: make[5]: Entering directory '/mnt/sdc1/Gits/mesa/src/gallium/state_trackers/omx/tizonia' CC h264dprc.lo In file included from h264dprc.c:45:0: ../../../../../src/egl/drivers/dri2/egl_dri2.h:47:10: fatal error: wayland/wayland-egl/wayland-egl-backend.h: No such file or directory #include "wayland/wayland-egl/wayland-egl-backend.h" ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ compilation terminated. meson got the same fix in `7598dedfde`. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-14 11:23:19 -07:00
Dylan Baker	848f2b6e31	Revert "Add processor topology calculation implementation for Darwin/OSX targets." This reverts commit `de0d10db93`. This breaks the build on at least Linux, probably other non-apple platforms. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-14 09:30:17 -07:00
Dylan Baker	0f30c80932	Revert "sched.h needs to be imported on Darwin/OSX targets." This reverts commit `9dc5063262`. This breaks the build on at least Linux, probably other non-apple platforms. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-14 09:28:58 -07:00
Karol Herbst	b617bfcccf	compiler: int8/uint8 support OpenCL kernels also have int8/uint8. v2: remove changes in nir_search as Jason posted a patch for that Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-03-14 10:08:42 -04:00
Alex Smith	fcf267ba08	radv: Fix CmdCopyImage between uncompressed and compressed images From the spec: "When copying between compressed and uncompressed formats the extent members represent the texel dimensions of the source image and not the destination." However, as per `7b890a36`, we must still use the destination image type when clamping the extent so that we copy the correct number of layers for 2D to 3D copies. Fixes: `7b890a36` "radv: Fix vkCmdCopyImage for 2d slices into 3d Images" Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-14 09:59:21 +00:00
Samuel Pitoiset	38f34117dd	radv: fix vkGetDeviceQueue2() when create flags don't match This fixes CTS: dEQP-VK.api.device_init.create_device_queue2_unmatched_flags Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@gmail.com>	2018-03-14 09:53:42 +01:00
Neil Roberts	25a966a23d	spirv: Handle doubles when multiplying a mat by a scalar The code to handle mat multiplication by a scalar tries to pick either imul or fmul depending on whether the matrix is float or integer. However it was doing this by checking whether the base type is float. This was making it choose the int path for doubles (and presumably float16s). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-14 08:43:33 +01:00
Iago Toral Quiroga	1a0aba7216	anv/entrypoints: VkGetDeviceProcAddr returns NULL for core instance commands `af5f2322d0` addressed this for extension commands, but the spec mandates this behavior also for core API commands. From the Vulkan spec, Table 2. vkGetDeviceProcAddr behavior: device pname return ---------------------------------------------------------- (..) device core device-level command fp (...) See that it specifically states "device-level". Since the vk.xml file doesn't state if core commands are instance or device level, we identify device level commands as the ones that take a VkDevice, VkQueue or VkCommandBuffer as their first parameter. Fixes test failures in new work-in-progress CTS tests. Also see the public issue: https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/issues/2323 v2: - Include reference to github issue (Emil) - Rebased on top of Vulkan 1.1 changes. v3: - Remove the not in the condition and switch the then/else cases (Jason) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-14 08:09:15 +01:00
Iago Toral Quiroga	a631575ff4	anv/entrypoints: dispatches to VkQueue are device-level v2: - Add trampoline functions (Jason) - Add an assertion for unhandled trampoline cases Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-14 08:09:15 +01:00
Dave Airlie	3b0f2081b5	radv: drop assert on bindingDescriptorCount > 0 The spec is pretty clear that this can be 0, and that it operates as a reserved binding. Fixes: dEQP-VK.binding_model.descriptor_update.empty_descriptor.uniform_buffer Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-14 16:54:52 +10:00
Apple SWE	9dc5063262	sched.h needs to be imported on Darwin/OSX targets. sched_yield is used but the include reference on Darwin is missing. This patch conditionally guards on Darwin/OSX to import sched.h first. Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2018-03-13 22:50:56 -07:00
Apple SWE	de0d10db93	Add processor topology calculation implementation for Darwin/OSX targets. The implementation for bootstrapping SWR on Darwin targets is based on the Linux version. Instead of reading the output of /proc/cpuinfo, sysctlbyname is used to determine the physical identifiers, processor identifiers, core counts and thread-processor affinities. With this patch, it is possible to use SWR as an alternate renderer on OSX to softpipe and llvmpipe. Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2018-03-13 22:50:27 -07:00
Roland Scheidegger	274f8bf05e	r600: fix abs for op3 sources If a src was referencing the same temp as the dst, the per-component copy code didn't work. e.g. cndge r0.xy, r0.xx, |r2|, r3 got expanded into mov r12.x, |r2| cndge r0.x, r0.x, r12, r3 mov r12.y, |r2| cndge r0.y, r0.x, r12, r3 hence for the second cndge r0.x was mistakenly the previous cndge result. Fix this by doing all the movs first, so there's no bogus alu.last in between. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102905 Tested-by: <iive@yahoo.com> Reviewed-by: Dave Airlie <airlied@gmail.com>	2018-03-14 04:54:45 +01:00
Dave Airlie	27a5e5366e	radv: mark all tess output for an indirect access. If a shader does a tcs store with an indirect access, we were only marking the first spot as used. For indirect access we always now mark all slots used by the variable. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 Fixes: `94f9591995` (radv/ac: add support for TCS/TES inputs/outputs.) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-14 11:18:54 +10:00
Dave Airlie	4f0c89d66c	ac/nir: pass the nir variable through tcs loading. I was going to have to add another parameter to this monster, so we should just pass the nir_variable in; I can't find any reason this would be a bad idea. This is needed for the next fix. Fixes: `94f9591995` (radv/ac: add support for TCS/TES inputs/outputs.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-14 11:18:54 +10:00
Dave Airlie	f9de2d409b	radv: get correct offset into LDS for indexed vars. This seems more correct to me, since if we have an array of floats they'll be vec4 aligned, and if we do af[2], we want the const index to increase by 2 slots in the non compact case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 Fixes: `94f9591995` (radv/ac: add support for TCS/TES inputs/outputs.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-14 11:18:54 +10:00
Rob Clark	4e4428482e	nir: lower_load_const_to_scalar fix for 8/16b types Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-13 20:17:04 -04:00
Dylan Baker	2aad12b2af	Update the documentation for meson Meson is pretty well tested and works in most configurations now, so we can remove the warning about it being unsuited for actual use. It's also worth documenting that meson 0.42.0 or greater is required. v2: - Minor rewording of supported platforms as suggested by Emil - Add two missing tags as reported by xmllint --html Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)	2018-03-13 14:54:47 -07:00
Jason Ekstrand	85000b812d	ac/nir: Use lower_vote_eq_to_ballot instead of ac_nir_lower_subgroups Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 13:25:27 -07:00
Jason Ekstrand	3d1d7e8561	nir/subgroups: Add lowering for vote_ieq/vote_feq to a ballot This is based heavily on `97f10934ed`, "ac/nir: Add vote_ieq/vote_feq lowering pass." from Bas Nieuwenhuizen. This version is a bit more general since it's in common code. It also properly handles NaN due to not flipping the comparison for floats. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 13:25:15 -07:00
Dylan Baker	8247a30838	meson: don't use compiler.has_header Meson's compiler.has_header is completely useless, it only checks that a header exists, not whether it's usable. This creates problems if a header contains a conditional #error declaration, like so: > #if __x86_64__ > # error "Doesn't work with x86_64!" > #endif Compiler.has_header will return true in this case, even when compiling for x86_64. This is useless. Instead, we'll do a compile check so that any #error declarations will be treated as errors, and compilation will work. Fixes compilation on x32 architecture. Gentoo Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=649746 meson bug: https://github.com/mesonbuild/meson/issues/2246 Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-13 11:41:10 -07:00
Jason Ekstrand	8379bff6c4	i965: Emit texture cache invalidates around blorp_copy This is a terrible hack but it fixes CTS regressions. It's still incredibly unclear exactly what is going wrong in the hardware to cause this to be an issue so this isn't a good fix by any means. However, it does fix tests so there is that. Fixes: `fb0e9b5197` "i965: Track the depth and render caches separately" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103746 Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-13 11:24:40 -07:00
Eric Anholt	a326eedc75	broadcom/vc4: Fix simulator since the perfmon change. It would be nice to support perfmon with simulator, and might be a useful tool for regression testing performance (since the simulator would be deterministic).	2018-03-13 10:32:58 -07:00
Eric Anholt	191bc7ce61	spirv: Silence compiler warning about undefined srcs[0] v2: Use assume() at the srcs[] definition instead. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-03-13 10:32:55 -07:00
Samuel Pitoiset	7c83430672	ac/nir: rename radeon_llvm_reg_index_soa() to ac_llvm_reg_index_soa() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:28 +01:00
Samuel Pitoiset	b128fd773f	ac/nir: remove some unnecessary includes and declarations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:27 +01:00
Samuel Pitoiset	cd4e823341	ac/nir: drop radv prefix from radv_lower_gather4_integer() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:25 +01:00
Samuel Pitoiset	fbe694562b	ac/nir: move ac_nir_compiler_options and friends to radv folder Also replace ac_ by radv_. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:23 +01:00
Samuel Pitoiset	237229430f	ac: move ac_shader_info to radv folder This is RADV specific code. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:21 +01:00
Samuel Pitoiset	2cfba40eea	ac/nir: move ac_shader_variant_info and friends to radv folder Also replace ac_ by radv_. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 16:54:16 +01:00
Samuel Pitoiset	b2653007b9	ac/nir: move all RADV related code to radv_nir_to_llvm.c Now the "ac/nir" prefix will really be the shared code between RadeonSI and RADV, that might avoid confusions in the future. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	8e15824b9d	ac/nir: make emit_barrier() non-static Required in order to move all RADV specific code outside of ac/nir. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	4e3117b718	ac/nir: move radeon_llvm_reg_index_soa() to ac_nir_to_llvm.h Required in order to move all RADV specific code outside of ac/nir. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	3a30b89353	ac/nir: make handle_shader_output_decl() non-static Required in order to move all RADV specific code outside of ac/nir. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	3fe47b1290	ac/nir: change prototype of handle_shader_output_decl() This allows to remove the ac_nir_context dependency. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	61a91ca3f5	ac/nir: move unpack_param() to ac_llvm_build.c Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	28bb6873ec	ac/nir: move trim_vector to ac_llvm_build.c Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	895632baef	ac/nir: move cast_ptr() to ac_llvm_build.c Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Samuel Pitoiset	bf6368297b	ac/nir: move ac_build_alloca() to ac_llvm_build.c As well as si_build_alloca_undef() and drop the si prefix. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-13 14:05:06 +01:00
Timothy Arceri	370e356eba	gallium: silence __builtin_frame_address nonzero argument is unsafe warning Calling __builtin_frame_address with a nonzero argument is unsafe but is sometimes done for debugging purposes. Since this code is part of some debug util code I'm assuming that is the case here and using GCC pragma to silence the warning. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-03-13 09:38:10 +11:00
Dylan Baker	b7c6870f87	meson: Add moduledir to d3d.pc This is required to build wine with the nine patchset Fixes: `6b4c7047d5` ("meson: build gallium nine state_tracker") Reported-by: Mike Lothian <mike@fireburn.co.uk> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-12 13:52:38 -07:00
Mathias Fröhlich	a2f08dd574	gallium: Use struct gl_array_attributes* as st_pipe_vertex_format argument. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-12 18:24:31 +01:00
Ian Romanick	def0030e64	mesa: Don't write to user buffer in glGetTexParameterIuiv on error With some sets of optimization flags, GCC will generate warnings like this: src/mesa/main/texparam.c:2327:27: warning: ‘*((void *)&ip+12)’ may be used uninitialized in this function [-Wmaybe-uninitialized] params[3] = ip[3]; ~~^~~ src/mesa/main/texparam.c:2320:16: note: ‘*((void *)&ip+12)’ was declared here GLint ip[4]; ^~ ip is not initialized in cases where a GL error is generated. In these cases, we should not write to the user's buffer, so this is actually a bug. I wrote a new piglit test gl-3.0-texparameteri to show this bug. I suspect that Coverity also detected this, but the scan site is currently down. Fixes: `c2c507786` "main: Added entry points for glGetTextureParameteriv, Iiv, and Iuiv." Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-12 10:13:30 -07:00
Roman Gilg	f94597f554	gallium: work around libtool relink issue for libdrm This is similar to commit `90633079`. libtool links first to system directories instead of custom locations of libdrm on relinking. Since a more recent libdrm version than the one provided by the system is often needed when compiling mesa, make sure this works by putting libdrm in front. See also: https://bugs.freedesktop.org/show_bug.cgi?id=100259 Signed-off-by: Roman Gilg <subdiff@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-12 14:49:07 +00:00
Emil Velikov	678ba53240	vulkan: autotools: do not redirect stdin/stdout for wayland-scanner The tool accepts the input and output files as arguments. There's no need for the redirection. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	8151f5cad9	wayland-drm: autotools: do not redirect stdin/stdout for wayland-scanner The tool accepts the input and output files as arguments. There's no need for the redirection. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	1178e0cf49	egl: autotools: do not redirect stdin/stdout for wayland-scanner The tool accepts the input and output files as arguments. There's no need for the redirection. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	08189731a4	docs: document removal of GLX_SGIX_swap_{barrier,group} stubs Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	5ef608fab7	glx: remove empty GLX_SGIX_swap_group stubs The extension was never implemented. Quick search suggests: - no actual users (on my Arch setup) - the Nvidia driver does not implement the extension Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	2c765b0d9a	gallium/x11: remove empty GLX_SGIX_swap_group stubs The extension was never implemented. Quick search suggests: - no actual users (on my Arch setup) - the Nvidia driver does not implement the extension Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	afab516f5f	x11: remove empty GLX_SGIX_swap_group stubs The extension was never implemented. Quick search suggests: - no actual users (on my Arch setup) - the Nvidia driver does not implement the extension Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	742b8e3301	glx: remove empty GLX_SGIX_swap_barrier stubs The extension was never implemented. Quick search suggests: - no actual users (on my Arch setup) - the Nvidia driver does not implement the extension Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-03-12 14:48:52 +00:00
Emil Velikov	447731348e	gallium/x11: remove empty GLX_SGIX_swap_barrier stubs The extension was never implemented. Quick search suggests: - no actual users (on my Arch setup) - the Nvidia driver does not implement the extension Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-03-12 14:48:51 +00:00
Emil Velikov	1d2d519d78	x11: remove empty GLX_SGIX_swap_barrier stubs The extension was never implemented. Quick search suggests: - no actual users (on my Arch setup) - the Nvidia driver does not implement the extension Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-03-12 14:48:51 +00:00
Emil Velikov	f197f02e50	configure: remove unused AM_CONDITIONAL Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-12 14:48:51 +00:00
Bas Nieuwenhuizen	997306c031	radv: Increase the number of dynamic uniform buffers. The Vulkan API is not ideal as it does not allow us to have a shared limit. Feral needs 15+6 for one of their games, and I'm not a fan of overcommitting the limits, so increase the number of dynamic uniform buffers to 16. CC: <mesa-stable@lists.freedesktop.org> CC: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-12 09:46:22 +01:00
Dave Airlie	e76cf1ff12	u_vbuf/translate: pass max_index into the set_buffer. This fixes a memory trashing crash (not the test) seen with dEQP-GLES3.stress.draw.unaligned_data.random.203 on virgl. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-12 11:57:13 +10:00
Dave Airlie	5d4fbc2b54	r600: implement callstack workaround for evergreen. This is ported from the sb backend, there are some issues with evergreen stacks on the boundary between entries and ALU_PUSH_BEFORE instructions. Whenever we are going to use a push before, we check the stack usage and if we have to use the workaround, then we switch to a separate push. I noticed this problem dealing with some of the soft fp64 shaders, in nosb mode, they are quite stack happy. This fixes all the glitches and inconsistencies I've seen with them Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Elie Tournier <elie.tournier@collabora.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-12 11:11:44 +10:00
Marek Olšák	163a29099a	gallium/util: add helper util_wait_for_idle This is an old patch that I had.	2018-03-11 13:14:27 -04:00
Roland Scheidegger	0f0a6fa21d	u_blit: (trivial) u_blit.h needs to include p_defines.h (For the pipe_tex_filter enum) Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-03-10 20:09:04 +01:00
Christian Gmeiner	c9b153fea7	travis: bump libxcb version to 1.13 Fixes following dependency problem: Native dependency xcb-dri3 found: NO found '1.11' but need: '>= 1.13' Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Fixes: `c80c08e226` ("vulkan/wsi/x11: Add support for DRI3 v1.2")	2018-03-10 16:55:36 +01:00
Mathias Fröhlich	64d2a20480	mesa: Make gl_vertex_array contain pointers to first order VAO members. Instead of keeping a copy of the vertex array content in struct gl_vertex_array only keep pointers to the first order information originally in the VAO. For that represent the current values by struct gl_array_attributes and struct gl_vertex_buffer_binding. v2: Change comments. Remove gl... prefix from variables except in the i965 directory where it was like that before. Reindent because of that. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-10 07:33:51 +01:00
Roland Scheidegger	d62f0df354	draw: fix alpha value for very short aa lines The logic would not work correctly for line lengths smaller than 1.0; even a degenerate line with length 0 would still produce a fragment with alpha anywhere between 0.0 and 0.5. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-10 02:11:50 +01:00
Jordan Justen	24b415270f	intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview Ken suggested that we might be underallocating scratch space on HD 400. Allocating scratch space as though there was actually 8 EUs seems to help with a GPU hang seen on synmark CSDof. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-09 16:15:58 -08:00
Jordan Justen	06e3bd02c0	i965: Hard code CS scratch_ids_per_subslice for Cherryview Ken suggested that we might be underallocating scratch space on HD 400. Allocating scratch space as though there was actually 8 EUs seems to help with a GPU hang seen on synmark CSDof. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104636 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105290 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>	2018-03-09 16:15:34 -08:00
Marek Olšák	db495b8962	st/dri: fix OpenGL-OpenCL interop for GL_TEXTURE_BUFFER Tested by our OpenCL team. Fixes: `9c499e6759` "st/mesa: don't invoke st_finalize_texture & st_convert_sampler for TBOs" Acked-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-09 16:33:31 -05:00
Marek Olšák	2bdb54bce7	radeonsi: add a workaround for GFX9 hang with init_config alignment Fixes: `75c5d25f0f` "radeonsi: align command buffer starting address to fix some Raven hangs" Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>	2018-03-09 16:28:29 -05:00
Marek Olšák	e99212e970	ac/gpu_info: print ib_start_alignment, add assertion	2018-03-09 16:28:29 -05:00
Greg V	e30a165be2	meson: Use system_has_kms_drm in default driver selection Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-09 10:02:44 -08:00
Eric Anholt	c57d5ea3bb	broadcom/vc4: Add an accelerated path to turn raster R8/RG88 into tiled. Drawing a 1080p YV12 video stream generated by MMAL goes from 10.5 FPS to 36.	2018-03-09 09:59:54 -08:00
Eric Anholt	cf170616da	gallium: Add a util_blitter path for using a custom VS and FS. Like the r600 paths to use other custom states, we pass in a couple of parameters to customize the innards of the blitter. It's up to the caller to wrap other state necessary for its shaders (for example, constant buffers for the uniforms the shader uses). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-09 09:59:54 -08:00
Eric Anholt	46a32e3d2e	broadcom/vc4: Allow binding non-zero constant buffers. We're going to use UBO loads for implementing YUV linear-to-T-format blits.	2018-03-09 09:59:54 -08:00
Eric Anholt	2725ab2b12	broadcom: Remove our defines of DRM_FORMAT_MOD_INVALID. The imported drm_fourcc.h handles it now.	2018-03-09 09:59:54 -08:00
Eric Anholt	a3a4c23dec	broadcom: Suppress compiler warnings about enum pipe_tex_filter.	2018-03-09 09:59:54 -08:00
Louis-Francis Ratté-Boulianne	3160cb86aa	egl/x11: Re-allocate buffers if format is suboptimal If PresentCompleteNotify event says the pixmap was presented with mode PresentCompleteModeSuboptimalCopy, it means the pixmap could possibly have been flipped instead if allocated with a different format/modifier. Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-09 17:47:14 +00:00
Louis-Francis Ratté-Boulianne	069fdd5f9f	egl/x11: Support DRI3 v1.1 Add support for DRI3 v1.1, which allows pixmaps to be backed by multi-planar buffers, or those with format modifiers. This is both for allocating render buffers, as well as EGLImage imports from a native pixmap (EGL_NATIVE_PIXMAP_KHR). Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-09 17:47:14 +00:00
Louis-Francis Ratté-Boulianne	61309c2a72	vulkan/wsi/x11: Return VK_SUBOPTIMAL_KHR for X11 When it is detected that a window could have been flipped but has been copied because of suboptimal format/modifier. The Vulkan client should then re-create the swapchain. Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-09 17:47:13 +00:00
Daniel Stone	c80c08e226	vulkan/wsi/x11: Add support for DRI3 v1.2 Adds support for multiple planes and buffer modifiers. v4: Rename "has_dri3_v1_1" to "has_dri3_modifiers" v12: Multi-planar/modifier support is now DRI3 v1.2; also update release versions	2018-03-09 17:47:13 +00:00
Dylan Baker	7258be91c5	autotools: include all meson.build files Otherwise SWR cannot be built with meson from an autotools generated tarball, such as the 18.0.0-rc4 tarball. Fixes: `16bf813830` ("meson/swr: re-shuffle generated files") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: George Kyriazis <george.kyriazis@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-09 08:15:04 -08:00
Michel Dänzer	2a4596a2f0	st/mesa: gl_program::info.system_values_read is a 64-bit-field We were dropping the upper 32 bits, which caused assertion failures in some compute shader piglit tests with radeonsi since the commit below. Fixes: `752e969703` ("compiler: Add two new system values for subgroups") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-09 16:52:11 +01:00
George Kyriazis	379e00dc27	swr/rast: Refactor memory gather operations Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:36:42 -06:00
George Kyriazis	3f7ce10b3e	swr/rast: Add KNOB_DISABLE_SPLIT_DRAW This is useful for archrast data collection. This greatly speeds up the post processing script since there are significantly fewer events generated. Finally, this is a simpler option to communicate to users than having them directly adjust MAX_PRIMS_PER_DRAW and MAX_TESS_PRIMS_PER_DRAW. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:36:30 -06:00
George Kyriazis	e0a4a25829	swr/rast: Add VPOPCNT Supports popcnt on vector masks (e.g. <8 x i1>) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:36:23 -06:00
George Kyriazis	b56afe1a4f	swr/rast: Add tracking for stream out topology Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:36:14 -06:00
George Kyriazis	2f6ae8cfcd	swr/rast: Add split draw and other state information to DrawInfoEvent. Removed specific split draw events. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:36:07 -06:00
George Kyriazis	714093203e	swr/rast: Refactor api and worker event handlers. In the API event handler we want to share information between the core layer and the API. Specifically, around associating various ids with different kinds of events. For example, associate render pass id with draw ids, or command buffer ids with draw ids. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:35:59 -06:00
George Kyriazis	cfdd35beaf	swr/rast: Add support for generalized late and early z/stencil stats Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:35:52 -06:00
George Kyriazis	9e25f298eb	swr/rast: Rasterized Subspans stats support Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:35:47 -06:00
George Kyriazis	d78b28fc33	swr/rast: Added comment Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-03-09 09:34:55 -06:00
Eric Engestrom	e903a7b0bb	vulkan/wsi: clean up cleanup path Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Keith Packard <keithp@keithp.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-09 13:25:44 +00:00
Bas Nieuwenhuizen	a793e7899f	radv: Fix the autotools build take 2. Forgot to remove a word.... Fixes: `04ffabf17a` "radv: Fix autotools build."	2018-03-09 14:10:24 +01:00
Lucas Stach	1f55d06783	etnaviv: allow mixing different bit depths for color and depth surfaces Vivante hardware supports this just fine. There is no reason why this shouldn't be advertised as a valid combination. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-03-09 12:06:07 +01:00
Thierry Reding	6d4d46bca9	autotools: Add tegra to AM_DISTCHECK_CONFIGURE_FLAGS This allows the driver to be built on a make distcheck and makes sure that it properly builds when a distribution tarball is made. Suggested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-09 11:48:22 +01:00
Thierry Reding	1755f608f5	tegra: Initial support Tegra K1 and later use a GPU that can be driven by the Nouveau driver. But the GPU is a pure render node and has no display engine, hence the scanout needs to happen on the Tegra display hardware. The GPU and the display engine each have a separate DRM device node exposed by the kernel. To make the setup appear as a single device, this driver instantiates a Nouveau screen with each instance of a Tegra screen and forwards GPU requests to the Nouveau screen. For purposes of scanout it will import buffers created on the GPU into the display driver. Handles that userspace requests are those of the display driver so that they can be used to create framebuffers. This has been tested with some GBM test programs, as well as kmscube and weston. All of those run without modifications, but I'm sure there is a lot that can be improved. Some fixes contributed by Hector Martin <marcan@marcan.st>. Changes in v2: - duplicate file descriptor in winsys to avoid potential issues - require nouveau when building the tegra driver - check for nouveau driver name on render node - remove unneeded dependency on libdrm_tegra - remove zombie references to libudev - add missing headers to C_SOURCES variable - drop unneeded tegra/ prefix for includes - open device files with O_CLOEXEC - update copyrights Changes in v3: - properly unwrap resources in ->resource_copy_region() - support vertex buffers passed by user pointer - allocate custom stream and const uploader - silence error message on pre-Tegra124 - support X without explicit PRIME Changes in v4: - ship Meson build files in distribution tarball - drop duplicate driver_tegra dependency Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-09 11:48:22 +01:00
Thierry Reding	2052dbdae3	nouveau: Add framebuffer modifier support This adds support for framebuffer modifiers to Nouveau. This will be used by the Tegra driver to share metadata about the format of buffers (such as the tiling mode or compression). Changes in v2: - remove unused parameters to nouveau_buffer_create() - move format modifier query code to nvc0 backend - restrict format modifiers to 2D textures - implement ->query_dmabuf_modifiers() Changes in v4: - add UAPI include path on meson builds Changes in v5: - remove unnecessary includes Acked-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-09 11:48:08 +01:00
Thierry Reding	b964cab80a	nouveau/nvc0: Extract common tile mode macro Add a new macro that can be used to extract the tiling mode from a tile_mode value. This will be used to determine the number of GOBs used in block linear mode. Acked-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-09 11:47:54 +01:00
Thierry Reding	75bf489628	drm/tegra: Sanitize format modifiers The existing format modifier definitions were merged prematurely, and recent work has unveiled that the definitions are suboptimal in several ways: - The format specifiers, except for one, are not Tegra specific, but the names don't reflect that. - The number space is split into two, reserving 32 bits for some "parameter" which most of the modifiers are not going to have. - Symbolic names for the modifiers are not using the standard DRM_FORMAT_MOD_* prefix, which makes them awkward to use. - The vendor prefix NV is somewhat ambiguous. Fortunately, nobody's started using these modifiers, so we can still fix the above issues. Do so by using the standard prefix. Also, remove TEGRA from the name of those modifiers that exist on NVIDIA GPUs as well. In case of the block linear modifiers, make the "parameter" smaller (4 bits, though only 6 values are valid) and don't let that leak into any of the other modifiers. Finally, also use the more canonical NVIDIA instead of the ambiguous NV prefix. This is based on commit 268892cb63a822315921a8dab48ac3e4abf7dd03 from Linux v4.16-rc1. Acked-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-09 11:44:35 +01:00
Thierry Reding	ffc85cfac0	drm/fourcc: Fix fourcc_mod_code() definition Avoid compiler warnings when the val parameter is an expression. This is based on commit 5843f4e02fbe86a59981e35adc6cabebee46fdc0 from Linux v4.16-rc1. Acked-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Andre Heider <a.heider@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-09 11:44:35 +01:00
Bas Nieuwenhuizen	04ffabf17a	radv: Fix autotools build. Forgot it again .... Fixes: `b6347807a9` "radv: Generate icd files." Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-09 09:36:19 +01:00
Samuel Pitoiset	365850fd68	ac/nir: set number of channels for packed mrt exports Bit 0 enables VSRC0 (R in low bits, G high) and bit 2 enables VSRC1 (B in low bits, A high). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-09 09:28:20 +01:00
Bas Nieuwenhuizen	68201ab2da	radv: Update version to 1.1.70. Turns out they did not reset the patch number on release. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-09 07:53:39 +01:00
Bas Nieuwenhuizen	b6347807a9	radv: Generate icd files. If the api version is too low, the loader clamps the application requested version to the advertized version, which messes with which extensions are enabled. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-09 07:53:39 +01:00
Ian Romanick	6878c9aabc	nir: Don't i2b a value that is already Boolean A bunch of shaders have sequences like: i2b(u2i(floatBitsToUint(intBitsToFloat(x == y ? -1 : 0)))) Other optimizations (and NIR's typeless nature) reduce this to i2b(x == y) which is silly. Skylake total instructions in shared programs: 14498698 -> 14497948 (<.01%) instructions in affected programs: 74480 -> 73730 (-1.01%) helped: 277 HURT: 0 helped stats (abs) min: 1 max: 32 x̄: 2.71 x̃: 2 helped stats (rel) min: 0.04% max: 13.79% x̄: 1.45% x̃: 0.68% 95% mean confidence interval for instructions value: -3.35 -2.06 95% mean confidence interval for instructions %-change: -1.74% -1.16% Instructions are helped. total cycles in shared programs: 532015500 -> 531999238 (<.01%) cycles in affected programs: 5943878 -> 5927616 (-0.27%) helped: 251 HURT: 74 helped stats (abs) min: 1 max: 13149 x̄: 127.89 x̃: 14 helped stats (rel) min: 0.01% max: 17.31% x̄: 1.55% x̃: 0.53% HURT stats (abs) min: 1 max: 4550 x̄: 214.04 x̃: 15 HURT stats (rel) min: <.01% max: 44.43% x̄: 2.81% x̃: 0.33% 95% mean confidence interval for cycles value: -158.51 58.43 95% mean confidence interval for cycles %-change: -1.07% -0.04% Inconclusive result (value mean confidence interval includes 0). total loops in shared programs: 4753 -> 4735 (-0.38%) loops in affected programs: 18 -> 0 helped: 18 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -100.00% -100.00% Loops are helped. Haswell and Broadwell had similar results. 
(Broadwell shown) total instructions in shared programs: 14791877 -> 14791127 (<.01%) instructions in affected programs: 77326 -> 76576 (-0.97%) helped: 278 HURT: 1 helped stats (abs) min: 1 max: 32 x̄: 2.70 x̃: 2 helped stats (rel) min: 0.04% max: 13.79% x̄: 1.42% x̃: 0.68% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.49% max: 0.49% x̄: 0.49% x̃: 0.49% 95% mean confidence interval for instructions value: -3.33 -2.05 95% mean confidence interval for instructions %-change: -1.70% -1.13% Instructions are helped. total cycles in shared programs: 558250067 -> 558252872 (<.01%) cycles in affected programs: 5806328 -> 5809133 (0.05%) helped: 235 HURT: 83 helped stats (abs) min: 1 max: 10630 x̄: 81.73 x̃: 16 helped stats (rel) min: 0.03% max: 18.58% x̄: 1.60% x̃: 0.51% HURT stats (abs) min: 1 max: 10590 x̄: 265.19 x̃: 20 HURT stats (rel) min: <.01% max: 15.28% x̄: 1.89% x̃: 0.54% 95% mean confidence interval for cycles value: -89.87 107.51 95% mean confidence interval for cycles %-change: -1.06% -0.32% Inconclusive result (value mean confidence interval includes 0). total loops in shared programs: 4735 -> 4717 (-0.38%) loops in affected programs: 18 -> 0 helped: 18 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -100.00% -100.00% Loops are helped. 
total fills in shared programs: 83111 -> 83110 (<.01%) fills in affected programs: 28 -> 27 (-3.57%) helped: 1 HURT: 0 Ivy Bridge total instructions in shared programs: 11774173 -> 11773436 (<.01%) instructions in affected programs: 70819 -> 70082 (-1.04%) helped: 267 HURT: 0 helped stats (abs) min: 1 max: 48 x̄: 2.76 x̃: 2 helped stats (rel) min: 0.21% max: 19.51% x̄: 1.57% x̃: 0.63% 95% mean confidence interval for instructions value: -3.51 -2.01 95% mean confidence interval for instructions %-change: -1.94% -1.21% Instructions are helped. total cycles in shared programs: 257153833 -> 257148932 (<.01%) cycles in affected programs: 585341 -> 580440 (-0.84%) helped: 167 HURT: 100 helped stats (abs) min: 1 max: 1327 x̄: 44.89 x̃: 16 helped stats (rel) min: 0.04% max: 26.54% x̄: 2.41% x̃: 0.88% HURT stats (abs) min: 1 max: 200 x̄: 25.95 x̃: 16 HURT stats (rel) min: 0.04% max: 9.81% x̄: 1.34% x̃: 0.65% 95% mean confidence interval for cycles value: -33.25 -3.46 95% mean confidence interval for cycles %-change: -1.47% -0.54% Cycles are helped. total loops in shared programs: 3416 -> 3398 (-0.53%) loops in affected programs: 18 -> 0 helped: 18 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -100.00% -100.00% Loops are helped. LOST: 2 GAINED: 0 Sandy Bridge total instructions in shared programs: 10499306 -> 10499094 (<.01%) instructions in affected programs: 6051 -> 5839 (-3.50%) helped: 43 HURT: 0 helped stats (abs) min: 1 max: 32 x̄: 4.93 x̃: 2 helped stats (rel) min: 0.39% max: 12.90% x̄: 4.29% x̃: 2.45% 95% mean confidence interval for instructions value: -7.66 -2.20 95% mean confidence interval for instructions %-change: -5.47% -3.12% Instructions are helped. 
total cycles in shared programs: 145862568 -> 145861370 (<.01%) cycles in affected programs: 61733 -> 60535 (-1.94%) helped: 36 HURT: 2 helped stats (abs) min: 16 max: 66 x̄: 36.61 x̃: 35 helped stats (rel) min: 0.45% max: 17.31% x̄: 4.92% x̃: 2.81% HURT stats (abs) min: 18 max: 102 x̄: 60.00 x̃: 60 HURT stats (rel) min: 1.10% max: 1.85% x̄: 1.48% x̃: 1.48% 95% mean confidence interval for cycles value: -41.28 -21.77 95% mean confidence interval for cycles %-change: -6.16% -3.00% Cycles are helped. total loops in shared programs: 1803 -> 1785 (-1.00%) loops in affected programs: 18 -> 0 helped: 18 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -100.00% -100.00% Loops are helped. LOST: 4 GAINED: 0 No changes on Iron Lake or GM45. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-08 15:26:26 -08:00
Ian Romanick	1583f49eaa	i965/vec4: Allow CSE on subset VF constant loads v2: Rewrite the code that generates the VF mask. Suggested by Ken. No changes on other platforms. Haswell, Ivy Bridge, and Sandy Bridge had similar results. (Haswell shown) total instructions in shared programs: 13059891 -> 13059884 (<.01%) instructions in affected programs: 431 -> 424 (-1.62%) helped: 7 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.19% max: 5.26% x̄: 2.05% x̃: 1.49% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -3.39% -0.71% Instructions are helped. total cycles in shared programs: 409260032 -> 409260018 (<.01%) cycles in affected programs: 4228 -> 4214 (-0.33%) helped: 7 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.28% max: 2.04% x̄: 0.54% x̃: 0.28% 95% mean confidence interval for cycles value: -2.00 -2.00 95% mean confidence interval for cycles %-change: -1.15% 0.07% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-08 15:26:26 -08:00
Ian Romanick	360899d457	i965/vec4: Relax writemask condition in CSE If the previously seen instruction generates more fields than the new instruction, still allow CSE to happen. This doesn't do much, but it also enables a couple more shaders in the next patch. It helped quite a bit in another change series that I have (at least for now) abandoned. v2: Add some extra commentary about the parameters to instructions_match. Suggested by Ken. No changes on Skylake, Broadwell, Iron Lake or GM45. Ivy Bridge and Haswell had similar results. (Ivy Bridge shown) total instructions in shared programs: 11780295 -> 11780294 (<.01%) instructions in affected programs: 302 -> 301 (-0.33%) helped: 1 HURT: 0 total cycles in shared programs: 257308315 -> 257308313 (<.01%) cycles in affected programs: 2074 -> 2072 (-0.10%) helped: 1 HURT: 0 Sandy Bridge total instructions in shared programs: 10506687 -> 10506686 (<.01%) instructions in affected programs: 335 -> 334 (-0.30%) helped: 1 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-08 15:26:26 -08:00
Ian Romanick	52c7df1643	i965/fs: Merge CMP and SEL into CSEL on Gen8+ v2: Fix several problems handling inverted predicates. Add a much bigger comment around the BRW_CONDITIONAL_NZ case. v3: Allow uniforms and shader inputs as sources for the original SEL and CMP instructions. This enables a LOT more shaders to receive CSEL merging (5816 vs 8564 on SKL). v4: Report progress. Broadwell and Skylake had similar results. (Broadwell shown) helped: 8527 HURT: 0 helped stats (abs) min: 1 max: 27 x̄: 2.44 x̃: 1 helped stats (rel) min: 0.03% max: 17.80% x̄: 1.12% x̃: 0.70% 95% mean confidence interval for instructions value: -2.51 -2.36 95% mean confidence interval for instructions %-change: -1.15% -1.10% Instructions are helped. total cycles in shared programs: 559442317 -> 558288357 (-0.21%) cycles in affected programs: 372699860 -> 371545900 (-0.31%) helped: 6748 HURT: 1450 helped stats (abs) min: 1 max: 32000 x̄: 182.41 x̃: 12 helped stats (rel) min: <.01% max: 66.08% x̄: 3.42% x̃: 0.70% HURT stats (abs) min: 1 max: 2538 x̄: 53.08 x̃: 14 HURT stats (rel) min: <.01% max: 96.72% x̄: 3.32% x̃: 0.90% 95% mean confidence interval for cycles value: -179.01 -102.51 95% mean confidence interval for cycles %-change: -2.37% -2.08% Cycles are helped. LOST: 0 GAINED: 6 No changes on earlier platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3] Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-08 15:26:26 -08:00
Kenneth Graunke	70de61594d	i965/fs: Add infrastructure for generating CSEL instructions. v2 (idr): Don't allow CSEL with a non-float src2. v3 (idr): Add CSEL to fs_inst::flags_written. Suggested by Matt. v4 (idr): Only set BRW_ALIGN_16 on Gen < 10 (suggested by Matt). Don't reset the access mode afterwards (suggested by Samuel and Matt). Add support for CSEL not modifying the flags to more places (requested by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v3] Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-08 15:26:26 -08:00
Ian Romanick	54e8d2268d	nir: Narrow some dot product operations On vector platforms, this helps elide some constant loads. v2: Reorder the transformations. No changes on Broadwell or Skylake. Haswell total instructions in shared programs: 13093793 -> 13060163 (-0.26%) instructions in affected programs: 1277532 -> 1243902 (-2.63%) helped: 13216 HURT: 95 helped stats (abs) min: 1 max: 18 x̄: 2.56 x̃: 2 helped stats (rel) min: 0.21% max: 20.00% x̄: 3.63% x̃: 2.78% HURT stats (abs) min: 1 max: 6 x̄: 1.77 x̃: 1 HURT stats (rel) min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19% 95% mean confidence interval for instructions value: -2.57 -2.49 95% mean confidence interval for instructions %-change: -3.65% -3.54% Instructions are helped. total cycles in shared programs: 409580819 -> 409268463 (-0.08%) cycles in affected programs: 71730652 -> 71418296 (-0.44%) helped: 9898 HURT: 2352 helped stats (abs) min: 2 max: 16014 x̄: 37.08 x̃: 16 helped stats (rel) min: <.01% max: 35.55% x̄: 6.26% x̃: 4.50% HURT stats (abs) min: 2 max: 276 x̄: 23.25 x̃: 6 HURT stats (rel) min: <.01% max: 40.00% x̄: 3.54% x̃: 1.97% 95% mean confidence interval for cycles value: -33.19 -17.80 95% mean confidence interval for cycles %-change: -4.50% -4.26% Cycles are helped. total fills in shared programs: 82059 -> 82052 (<.01%) fills in affected programs: 21 -> 14 (-33.33%) helped: 7 HURT: 0 Sandy Bridge and Ivy Bridge had similar results (Ivy Bridge shown) total instructions in shared programs: 11811851 -> 11780605 (-0.26%) instructions in affected programs: 1155007 -> 1123761 (-2.71%) helped: 12304 HURT: 95 helped stats (abs) min: 1 max: 18 x̄: 2.55 x̃: 2 helped stats (rel) min: 0.21% max: 20.00% x̄: 3.69% x̃: 2.86% HURT stats (abs) min: 1 max: 6 x̄: 1.77 x̃: 1 HURT stats (rel) min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19% 95% mean confidence interval for instructions value: -2.56 -2.48 95% mean confidence interval for instructions %-change: -3.71% -3.59% Instructions are helped. 
total cycles in shared programs: 257618409 -> 257316805 (-0.12%) cycles in affected programs: 71999580 -> 71697976 (-0.42%) helped: 9155 HURT: 2380 helped stats (abs) min: 2 max: 16014 x̄: 38.44 x̃: 16 helped stats (rel) min: <.01% max: 35.75% x̄: 6.39% x̃: 4.62% HURT stats (abs) min: 2 max: 290 x̄: 21.14 x̃: 4 HURT stats (rel) min: <.01% max: 41.55% x̄: 3.14% x̃: 1.33% 95% mean confidence interval for cycles value: -34.32 -17.97 95% mean confidence interval for cycles %-change: -4.55% -4.29% Cycles are helped. GM45 and Iron Lake had nearly identical results (Iron Lake shown) total instructions in shared programs: 7886750 -> 7879944 (-0.09%) instructions in affected programs: 373781 -> 366975 (-1.82%) helped: 3715 HURT: 47 helped stats (abs) min: 1 max: 8 x̄: 1.86 x̃: 1 helped stats (rel) min: 0.22% max: 16.67% x̄: 2.88% x̃: 2.06% HURT stats (abs) min: 1 max: 6 x̄: 2.55 x̃: 2 HURT stats (rel) min: 1.09% max: 5.00% x̄: 1.93% x̃: 2.35% 95% mean confidence interval for instructions value: -1.85 -1.77 95% mean confidence interval for instructions %-change: -2.91% -2.73% Instructions are helped. total cycles in shared programs: 178114636 -> 178095452 (-0.01%) cycles in affected programs: 7227666 -> 7208482 (-0.27%) helped: 3349 HURT: 301 helped stats (abs) min: 2 max: 90 x̄: 6.55 x̃: 4 helped stats (rel) min: <.01% max: 14.18% x̄: 0.95% x̃: 0.63% HURT stats (abs) min: 2 max: 42 x̄: 9.13 x̃: 10 HURT stats (rel) min: 0.01% max: 11.19% x̄: 1.22% x̃: 1.50% 95% mean confidence interval for cycles value: -5.52 -4.99 95% mean confidence interval for cycles %-change: -0.81% -0.73% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]	2018-03-08 15:26:26 -08:00
Lionel Landwerlin	d10a39ebe0	i965: perf: consolidate unmapping oa perf bo outside accumulation Do this in one place outside the only caller of the accumulation function. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-08 23:05:29 +00:00
Lionel Landwerlin	fb921a2870	i965: perf: count number of accumulated reports This will be reused later. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-08 23:05:26 +00:00
Lionel Landwerlin	e4387faafb	i965: perf: reuse timescale base function from query We already have the same function in brw_queryobj.c Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-08 23:05:23 +00:00
Lionel Landwerlin	b71da26496	i965: perf: store sysfs device entry into context We want to reuse it later on. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-08 23:05:21 +00:00
Lionel Landwerlin	5742b17da1	i965: perf: store the hw_id of the context in the query Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-08 23:05:18 +00:00
Lionel Landwerlin	80cd669a32	i965: perf: default case for unknown query types Just some extra safety before further changes. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-08 23:05:00 +00:00
Marek Olšák	9b7db12815	radeonsi: remove chip_class parameter from si_lower_nir We can get it from si_screen. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-08 14:58:16 -05:00
Marek Olšák	78ef16e2f9	winsys/amdgpu: query GDS info Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-08 14:58:16 -05:00
Marek Olšák	a4a113b5bc	winsys/amdgpu: pad compute IBs v2: pad with PKT2 NOPs on SI Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-08 14:58:16 -05:00
Marek Olšák	35cd86d4e9	radeonsi: expand constbuf 0 address correctly to fix Vega10 hangs This is only required with the latest libdrm. This fixes 32-bit support with high addresses. (and possibly 64-bit support too because the high bits need to be masked out) Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-08 14:58:16 -05:00
Marek Olšák	75c5d25f0f	radeonsi: align command buffer starting address to fix some Raven hangs Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-08 14:58:16 -05:00
Christian Gmeiner	5b68a7297d	etnaviv: add get_driver_query_group_info(..) This enables AMD_performance_monitor extension. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2018-03-08 20:44:04 +01:00
Christian Gmeiner	3d912bd742	etnaviv: add query_group_info for sw counters Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2018-03-08 20:43:55 +01:00
Dylan Baker	1e9d779331	meson: Fix building gallium media libs without egl v2: - rebase on omx fix Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> (v1)	2018-03-08 10:14:02 -08:00
Dylan Baker	f74cf04d3e	meson: Allow building dri based EGL without GLX It should be possible to build EGL without GLX, but the meson build currently doesn't allow that because it too tightly couples glx and dri. This patch eases dri and glx apart, so that EGL without GLX can be built. CC: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-03-08 09:12:24 -08:00
Thierry Reding	d41ee9ba5d	glx/apple: Ship meson build file in tarball The meson build file for Apple GLX is not listed in the EXTRA_DIST make variable and therefore isn't shipped as part of the release tarball, so meson builds from the tarball will fail. Add the file to EXTRA_DIST to ensure it is included in the tarball. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-08 12:11:32 +01:00
Samuel Pitoiset	4e3c1ace65	ac/nir: do not emit unnecessary null exports in fragment shaders Null exports should only be needed when no other exports are emitted. This removes a bunch of 'exp null off, off, off, off done vm'. Affected games are Dota 2 and Wolfenstein 2, not sure if that really helps, but code size is decreasing there. Polaris10: Totals from affected shaders: SGPRS: 8216 -> 8216 (0.00 %) VGPRS: 7072 -> 7072 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 454968 -> 453896 (-0.24 %) bytes Max Waves: 772 -> 772 (0.00 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-08 11:56:05 +01:00
Eric Engestrom	19dd7f007e	drirc: whitespace fix Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-08 09:53:34 +00:00
Thomas Hellstrom	93e58d5e17	drirc: Disable the GLX_SGI_video_sync extension for gnome-shell on vmware With this extension enabled and a server GLX implementation that actually honors it, Window movement lags considerably on gnome-shell/vmware, so disable it by default. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Deepak Rawat <drawat@vmware.com>	2018-03-08 07:26:29 +01:00
Thomas Hellstrom	4ca9ad2bb2	gallium/st_dri: Honor the glx_disable_sgi_video_sync config option This option is disabled by default. Primarily intended for drivers on virtual hardware. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Deepak Rawat <drawat@vmware.com>	2018-03-08 07:26:29 +01:00
Thomas Hellstrom	f4070956d4	glx/dri: Add a driconf option to disable GLX_SGI_video_sync Drivers on virtual hardware don't want to expose this extension to GLX compositors, similarly to GLX_OML_sync_control, since that significantly increases latency. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Deepak Rawat <drawat@vmware.com>	2018-03-08 07:26:29 +01:00
Timothy Arceri	0c90264da4	ac/radeonsi: add emit_kill to the abi This should fix a regression with Rocket League grass rendering on the NIR backend. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104717	2018-03-08 11:28:37 +11:00
Timothy Arceri	50cc97d98a	radeonsi: add si_llvm_emit_kill() helper Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-08 11:28:37 +11:00
Timothy Arceri	f4b877631e	spirv: fix autotools builds Fixes: `68a6a3b51a` "spirv: handle AMD_gcn_shader extended instructions" Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-08 10:45:56 +11:00
Timothy Arceri	99cdc019bf	ac: make use of if/loop build helpers These helpers insert the basic blocks in the same order as they appear in NIR, making it easier to follow LLVM IR dumps. The helpers also insert more useful labels onto the blocks. TGSI uses the line number of the corresponding opcode in the TGSI dump as the label id; here we use the corresponding block index from NIR. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-08 10:12:34 +11:00
Timothy Arceri	6e1a142863	radeonsi: make use of if/loop build helpers in ac Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-08 10:12:34 +11:00
Timothy Arceri	42627dabb4	ac: add if/loop build helpers These have been ported over from radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-08 10:12:34 +11:00
Daniel Schürmann	ffbf75cde4	radv: enable AMD_gcn_shader extension Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-07 23:09:58 +01:00
Daniel Schürmann	18c7f1e041	ac: implement AMD_gcn_shader extended instructions Co-authored-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-07 23:09:58 +01:00
Daniel Schürmann	68a6a3b51a	spirv: handle AMD_gcn_shader extended instructions Co-authored-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-07 23:09:58 +01:00
Daniel Schürmann	a1a2a8dfda	nir: add AMD_gcn_shader extended instructions Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-07 23:09:58 +01:00
Daniel Schürmann	39437025de	spirv: import AMD extensions header from glslang Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-07 23:09:58 +01:00
Dylan Baker	cba104ebe3	meson: Fix indent in omx meson.build Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Julien Isorce <julien.isorce@gmail.com> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-03-07 13:30:54 -08:00
Dylan Baker	6f628951af	meson: Use include directory variables instead of traversing Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Julien Isorce <julien.isorce@gmail.com> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-03-07 13:30:53 -08:00
Dylan Baker	34e852d5b5	meson: Re-add auto option for omx This re-adds the auto option for omx; without it we default to tizonia and the build fails almost immediately, which is especially obnoxious for those building a driver that doesn't support the OMX state tracker to begin with. v2: - Only define OMX_FOO for auto cases if the dependencies are found. This fixes building tizonia with auto (Julien, Eric) CC: Gurkirpal Singh <gurkirpal204@gmail.com> Fixes: `bb5e27fab6` ("st/omx/bellagio: Rename st and target directories") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Julien Isorce <julien.isorce@gmail.com> Tested-by: Karol Herbst <kherbst@redhat.com> (v1)	2018-03-07 13:30:53 -08:00
Dylan Baker	7598dedfde	meson: fix tizonia compilation It needs to have src/egl in its includes as well. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Julien Isorce <julien.isorce@gmail.com> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-03-07 13:30:53 -08:00
Dylan Baker	2d3004ef1c	meson: combine state trackers and target if blocks This is needed later since tizonia requires dri Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Julien Isorce <julien.isorce@gmail.com> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-03-07 13:30:53 -08:00
Marek Olšák	55376cb31e	st/mesa: expose 0 shader binary formats for compat profiles for Qt Bugzilla: https://bugreports.qt.io/browse/QTBUG-66420 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105065 Cc: "18.0" <mesa-stable@lists.freedesktop.org> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>	2018-03-07 15:36:31 -05:00
Roland Scheidegger	8ba3750d3d	draw: fix line stippling with aa lines In contrast to non-aa, where stippling is based on either dx or dy (depending on if it's a x or y major line), stippling is based on actual distance with smooth lines, so adjust for this. (It looks like there's some minor artifacts with mesa demos line-sample and stippling, it looks like the line endpoints aren't quite right with aa + stippling - maybe due to the integer math in the stipple stage, but I can't quite pinpoint it.) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-03-07 21:29:00 +01:00
Roland Scheidegger	dbb2cf388b	draw: simplify (and correct) aaline fallback (v2) The motivation actually was to get rid of the additional tex instruction, since that requires the draw fallback code to intercept all sampler / view calls (even if the fallback is never hit). Basically, the idea is to use coverage of the pixel to calculate the alpha value, and coverage is simply based on the distance to the center of the line (in both line direction, which is useful for wide lines, as well as perpendicular to the line). This is much closer to what hw supporting this natively actually does. It also fixes an issue with line width not quite being correct, as well as endpoints getting stretched too far (in line direction) with wide lines, which is apparent with mesa demo line-sample. (For llvmpipe, it would probably make sense to do something like this directly when drawing lines, since rendering two tris is twice as expensive as a line, but it would need some changes with state management.) Since we're no longer relying on mipmapping to get the alpha value, we also don't need to draw 3 rects (6 tris), one is sufficient. There's still issues (as before): - quite sure it's not correct without half_pixel_center, but can't test this with GL. - aaline + line stipple is incorrect (evident with line-sample demo). Looking at the spec the stipple pattern should actually be based on distance (not just dx or dy for x/y major lines as without aa). - outputs (other than pos + the one used for line aa) should be reinterpolated since we actually increase line length by half a pixel (but there's no tests which would care). v2: simplify the math (should be equivalent), don't need immediate v3: use float versions of atan2,cos,sin, minor cleanups Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-03-07 21:28:31 +01:00
Bas Nieuwenhuizen	034cce96b4	radv: Don't emit a warning on VI-GFX9. We are conformant: https://www.khronos.org/conformance/adopters/conformant-products#submission_308 v2: Actually not emit it on gfx9. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	04d65d2b76	radv: Enable vulkan 1.1.0 for configurations that can support it. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	0168eaaa42	radv: Disable sampler ycbcr conversion. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	cce62f4065	radv: Expose that we don't support any VK_KHR_16_bit_storage parts. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	b99b9cc864	radv: Implement vkEnumerateInstanceVersion. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	5240fddb9d	radv: Add trivial device group implementation. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	84e877aa77	radv: Implement vkCmdDispatchBase. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	de5e25898c	radv: Implement VkGetDeviceQueue2. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	b137e25277	radv: Support VkPhysicalDeviceProtectedMemoryFeatures. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	4bcf4d1678	radv: Support VkPhysicalDeviceShaderDrawParameterFeatures. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	41d958d073	radv: Implement VK_KHR_maintenance3. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	8f9af587a2	radv: Add minimal subgroup support. Deliberately not implementing workgroup scopes as that is not needed for core vulkan. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:35 +01:00
Bas Nieuwenhuizen	89651fba9b	radv: Change client version check. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:34 +01:00
Bas Nieuwenhuizen	5b3979704d	radv: Update MAX_API_VERSION to 1.1.0 v2: Don't bump supported version. v3: Update json files. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:34 +01:00
Bas Nieuwenhuizen	97f10934ed	ac/nir: Add vote_ieq/vote_feq lowering pass. The old vote_eq implementation supported only booleans, but now we have to support arbitrary values, so use the read_first_invocation intrinsic + ballot. I took this as an opportunity to figure out how easy it was to do this in nir instead of in the nir_to_llvm pass, and it actually turned out pretty okay IMO. Only creating the pass is some extra code. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 21:18:32 +01:00
Jason Ekstrand	c217607b65	anv: Support version overrides While always sketchy to do, this is useful for debugging. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	a1ee51309e	vulkan/util: Add a helper to get a version override Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	d6b65222df	anv: Enable Vulkan 1.1 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	03c07ac548	anv: Add support for SPIR-V 1.3 subgroup operations This requires us to bump the subgroup size to 32 for all shader stages because Vulkan requires that to be a physical device query. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	8b4a5e641b	intel/fs: Add support for subgroup quad operations NIR has code to lower these away for us but we can do significantly better in many cases with register regioning and SIMD4x2. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	2292b20b29	intel/fs: Implement reduce and scan operations Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	4150920b95	intel/fs: Add a helper for emitting scan operations This commit adds a helper to the builder for emitting "scan" operations. Given a binary operation #, a scan takes the vector [a0, a1, ..., aN] and returns the vector [a0, a0 # a1, ..., a0 # a1 # ... # aN] where each channel contains the combination of all previous channels. The sequence of instructions to perform the scan is fairly optimal; a 16-wide scan on a 32-bit type is only 6 instructions. The subgroup scan and reduction operations will be implemented in terms of this. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	b0858c1cc6	intel/fs: Add a couple of simple helper opcodes Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	57bff0a546	spirv: Add support for subgroup arithmetic Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	789221dcfa	nir: Add a helper for getting binop identities Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	82d493a939	nir: Add subgroup arithmetic reduction intrinsics Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	b3a5b0f3fc	spirv: Add subgroup quad support Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	493a165544	nir: Add quad operations and lowering Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	90c9f29518	i965/fs: Add support for nir_intrinsic_shuffle Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	8256ee3fa3	spirv: Add subgroup shuffle support Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	149b92ccf2	nir: Add subgroup shuffle intrinsics and lowering Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	7cfece820d	i965/fs: Support nir_intrinsic_vote_feq Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	0e893356fe	nir/lower_subgroups: Add scalarizing for vote_eq Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	d792f3d4cd	spirv: Add subgroup vote support Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	44681e4795	nir: Generalize nir_intrinsic_vote_eq The SPIR-V extension wants us to be able to do an AllEqual on any vector or scalar type. This has two implications: 1) We need to be able to handle vectors so we switch the vote_eq intrinsics to be vectorized intrinsics. 2) We need to handle floats which have different behavior with respect to +-0, NaN, etc. than the integer variant so we need two variants. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	9812fce60b	spirv: Add subgroup ballot support Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	974daec495	i965/fs: Implement basic SPIR-V subgroup intrinsics Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	adc077797a	spirv: Add initial subgroup support Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	5162a1d884	nir: Add new SPIR-V ballot intrinsics and lowering Someone can make the lowering optional later if they want something different for their hardware. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	752e969703	compiler: Add two new system values for subgroups This will be required for SPIR-V subgroup support Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	34c60ea02b	nir: Add new SPIR-V ballot ALU intrinsics and lowering Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	cc587ee9a7	spirv: Handle the new OpModuleProcessed instruction Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	59b0ea0c74	anv: Stop returning VK_ERROR_INCOMPATIBLE_DRIVER From the Vulkan 1.1 spec: "Vulkan 1.0 implementations were required to return VK_ERROR_INCOMPATIBLE_DRIVER if apiVersion was larger than 1.0. Implementations that support Vulkan 1.1 or later must not return VK_ERROR_INCOMPATIBLE_DRIVER for any value of apiVersion." Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	cbab2d1da5	anv: Implement vkEnumerateInstanceVersion Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Iago Toral Quiroga	605fd7c0da	anv/device: fail to initialize device if we have queues with unsupported flags This is not strictly necessary, since users should not be requesting any flags that are not valid for the list of enabled features requested and we already fail if they attempt to use an unsupported feature. However, it is an easy-to-implement sanity check that would help developers realize that they are doing things wrong, so we might as well do it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-07 12:13:47 -08:00
Iago Toral Quiroga	b262f17b15	anv/device: GetDeviceQueue2 should only return queues with matching flags From the Vulkan 1.1 spec, VkDeviceQueueInfo2 structure: "The queue returned by vkGetDeviceQueue2 must have the same flags value from this structure as that used at device creation time in a VkDeviceQueueCreateInfo instance. If no matching flags were specified at device creation time then pQueue will return VK_NULL_HANDLE." For us this means no flags at all since we don't support any. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	9c8b40001d	anv: Support querying for protected memory Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	773a51e772	anv: Implement GetDeviceQueue2 This belongs to the protected memory feature but there's nothing about it that's specific to protected memory. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	68df93ecbc	anv: Trivially implement VK_KHR_device_group Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	dfe18be09e	anv: Implement vkCmdDispatchBase This is part of the device groups extension/feature but it's a decent chunk of work in its own right so it's worth breaking into its own patch. The mechanism we use is fairly straightforward: we just push the base work group id into the shader and add it to the work group id we get from dispatch. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	ff9db1a4cc	nir/spirv: Add support for device groups Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	ddc4069122	anv: Implement VK_KHR_maintenance3 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	1deb7967c8	anv: Support VkPhysicalDeviceShaderDrawParameterFeatures This advertises the VK_KHR_shader_draw_parameters functionality as a "core optional feature" in Vulkan 1.1. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	06719f9d4b	anv/entrypoints: Drop support for protect attributes Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	bd1279bd9f	Get rid of a bunch of KHR suffixes Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	af461986db	anv: Add version 1.1.0 but leave it disabled This requires us to rename any Vulkan API entrypoints which became core in 1.1 to no longer have the KHR suffix. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	0128187335	spirv: Update the SPIR-V headers and json to 1.3.1 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	205c271562	vulkan: Update the XML and headers to 1.1.70 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	7fb86fb511	vulkan/enum_to_str: Add support for aliases and new Vulkan versions Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	539a0aec45	vulkan/enum_to_str: Add a add_value_from_xml helper to VkEnum Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	eb23ca069f	anv/entrypoints: Generate #ifdef guards from platform attributes Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	05fc377f2e	anv/extensions: Add support for multiple API versions Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	8efa173ed2	anv/entrypoints_gen: Add support for aliases in the XML Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	39d9fcea13	anv/entrypoints: Allow an entrypoint to require multiple extensions In this case, we say an entrypoint is supported if ANY of the extensions is supported. This is because, in the XML, entrypoints don't require extensions so much as extensions require entrypoints. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	8e8f167c72	anv/entrypoints: Add an is_device_entrypoint helper Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	54b3493fc0	anv/entrypoints_gen: Allow the string map to grow Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	d91da06df5	anv/entrypoints_gen: A bit of refactoring Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	a4ca4c99ba	anv/entrypoints: Generalize the string map a bit The original string map assumed that the mapping from strings to entrypoints was a bijection. This will not be true the moment we add entrypoint aliasing. This reworks things to be an arbitrary map from strings to non-negative signed integers. The old one also had a potential bug if we ever had a hash collision because it didn't do the strcmp inside the lookup loop. While we're at it, we break things out into a helpful class. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	3960d0e332	vulkan: Rename multiview from KHX to KHR Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	68af9f04a4	spirv: Rework barriers Our previous handling of barriers always used the big hammer and didn't correctly emit memory barriers when specified along with a control barrier. This commit completely reworks the way we emit barriers to make things both more precise and more correct. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Jason Ekstrand	de518f38e5	spirv: Add a vtn_constant_value helper Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 12:13:47 -08:00
Marek Olšák	9779f34326	radeonsi: remove si_llvm_add_attribute	2018-03-07 13:55:49 -05:00
Marek Olšák	2c3f3651c4	radeonsi: fix passing address32_hi to LLVM for high values The old function treats high values as negative, which LLVM interprets as 0.	2018-03-07 13:55:49 -05:00
Marek Olšák	b3b6b00ac8	radeonsi: assume has_virtual_memory == true	2018-03-07 13:55:48 -05:00
Marek Olšák	53db2790c0	radeonsi: add/update assertions for 32-bit address space	2018-03-07 13:55:47 -05:00
Marek Olšák	16856a1ee8	radeonsi: prevent a negative buffer offset in si_upload_descriptors	2018-03-07 13:55:42 -05:00
Marek Olšák	9b55498059	radeonsi: properly extract a buffer address from a descriptor	2018-03-07 13:55:40 -05:00
Marek Olšák	2a47660754	radeonsi: fix vertex buffer address computation with full 64-bit addresses	2018-03-07 13:55:38 -05:00
Marek Olšák	2e30268877	radeonsi: mask out high VM address bits in registers where needed	2018-03-07 13:55:35 -05:00
Bas Nieuwenhuizen	94c9096c83	radv: Add entrypoints generation with the new vk.xml A lot of it is based on intel again. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-07 15:50:19 +01:00
Simon Hausmann	fb5825e7ce	glsl: Fix memory leak with known glsl_type instances When looking up known glsl_type instances in the various hash tables, we end up leaking the key instances used for the lookup, as the glsl_type constructor allocates memory on the global mem_ctx. This patch changes glsl_type to manage its own memory, which fixes the leak and also allows getting rid of the global mem_ctx and its mutex. v2: remove lambda usage (Tapani) (+keep ASSERT_BITFIELD_SIZE, modify dummy ctor to initialize mem_ctx) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104884 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Simon Hausmann <simon.hausmann@qt.io> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-07 14:33:34 +02:00
Caio Marcelo de Oliveira Filho	c17808562e	spirv: Add SpvCapabilityShaderViewportIndexLayerEXT This capability allows gl_ViewportIndex and gl_Layer to also be used as outputs in Vertex and Tessellation shaders. v2: Make conditional to the capability, add gl_Layer, add tessellation shaders. (Iago) v3: Don't export to tessellation control shader. v4: Add Reviewed-by tag. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-07 07:04:20 +01:00
Mauro Rossi	487f8d48c9	android: anv: add libmesa_intel_dev static dependency Fixes the following building errors: external/mesa/src/intel/vulkan/anv_device.c:300: error: undefined reference to 'gen_get_pci_device_id_override' external/mesa/src/intel/vulkan/anv_device.c:312: error: undefined reference to 'gen_get_device_name' external/mesa/src/intel/vulkan/anv_device.c:313: error: undefined reference to 'gen_get_device_info' clang.real: error: linker command failed with exit code 1 (use -v to see invocation) Fixes: `272bef0601` "intel: Split gen_device_info out into libintel_dev" Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-03-07 07:55:34 +02:00
Timothy Arceri	1fdb21541e	Revert "nir: bump loop unroll limit to 96." This reverts commit `2d36efdb7f`. This raised limit turns out to be harmful for more complex shaders; it causes excessive spilling in some Bioshock Infinite shaders. The fps for the ssao demo on radv remains unchanged when reverting this. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-07 15:10:05 +11:00
Dave Airlie	fb077b0728	ac/nir: don't put lod into args if it's zero. If it's zero but we put it in args, we still end up consuming a register for it. This fixes some spilling in the NIR paths in Dirt Rally that isn't seen with TGSI. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-07 03:34:59 +00:00
Christian Gmeiner	38e91e2b81	freedreno: bump required libdrm version Fixes: `26a9321d0a` "freedreno: add global_bindings state" Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-06 21:52:59 +01:00
Ian Romanick	e3ea166a2c	nir: Simplify some comparisons like a+b < a All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14514555 -> 14514547 (<.01%) instructions in affected programs: 1972 -> 1964 (-0.41%) helped: 8 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.39% max: 0.42% x̄: 0.41% x̃: 0.41% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.41% -0.40% Instructions are helped. total cycles in shared programs: 533141444 -> 533136780 (<.01%) cycles in affected programs: 164728 -> 160064 (-2.83%) helped: 181 HURT: 3 helped stats (abs) min: 2 max: 94 x̄: 26.17 x̃: 30 helped stats (rel) min: 0.12% max: 5.33% x̄: 3.42% x̃: 3.80% HURT stats (abs) min: 4 max: 54 x̄: 24.00 x̃: 14 HURT stats (rel) min: 0.20% max: 2.39% x̄: 1.09% x̃: 0.68% 95% mean confidence interval for cycles value: -27.12 -23.58 95% mean confidence interval for cycles %-change: -3.54% -3.16% Cycles are helped. Sandy Bridge total instructions in shared programs: 10533667 -> 10533539 (<.01%) instructions in affected programs: 10148 -> 10020 (-1.26%) helped: 124 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.03 x̃: 1 helped stats (rel) min: 0.39% max: 4.35% x̄: 2.20% x̃: 2.04% 95% mean confidence interval for instructions value: -1.06 -1.00 95% mean confidence interval for instructions %-change: -2.46% -1.95% Instructions are helped. total cycles in shared programs: 146136887 -> 146132122 (<.01%) cycles in affected programs: 206382 -> 201617 (-2.31%) helped: 171 HURT: 0 helped stats (abs) min: 2 max: 40 x̄: 27.87 x̃: 30 helped stats (rel) min: 0.08% max: 5.73% x̄: 2.98% x̃: 2.67% 95% mean confidence interval for cycles value: -29.19 -26.54 95% mean confidence interval for cycles %-change: -3.20% -2.76% Cycles are helped. 
Iron Lake total instructions in shared programs: 7886515 -> 7886507 (<.01%) instructions in affected programs: 3016 -> 3008 (-0.27%) helped: 8 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.25% max: 0.28% x̄: 0.27% x̃: 0.27% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.27% -0.26% Instructions are helped. total cycles in shared programs: 178100396 -> 178100388 (<.01%) cycles in affected programs: 156128 -> 156120 (<.01%) helped: 4 HURT: 4 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for cycles value: -3.68 1.68 95% mean confidence interval for cycles %-change: -0.03% <.01% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857872 -> 4857868 (<.01%) instructions in affected programs: 1544 -> 1540 (-0.26%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.25% max: 0.27% x̄: 0.26% x̃: 0.26% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.28% -0.24% Instructions are helped. total cycles in shared programs: 122167654 -> 122167662 (<.01%) cycles in affected programs: 96248 -> 96256 (<.01%) helped: 0 HURT: 4 HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01% 95% mean confidence interval for cycles value: 2.00 2.00 95% mean confidence interval for cycles %-change: <.01% 0.02% Cycles are HURT. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-06 11:17:30 -08:00
Ian Romanick	d1ed4ffe0b	nir: Use De Morgan's Law on logic compounded comparisons The replacement of the comparison operators must happen during this step. If it does not, the next pass of nir_opt_algebraic will reapply De Morgan's Law in the "opposite direction" before performing dead code elimination. The resulting infinite loop will eventually get OOM killed. Haswell, Broadwell, and Skylake had similar results. (Broadwell shown) total instructions in shared programs: 14808185 -> 14808036 (<.01%) instructions in affected programs: 13758 -> 13609 (-1.08%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 10 x̄: 3.82 x̃: 3 helped stats (rel) min: 0.44% max: 1.55% x̄: 0.98% x̃: 1.01% 95% mean confidence interval for instructions value: -4.67 -2.97 95% mean confidence interval for instructions %-change: -1.09% -0.88% Instructions are helped. total cycles in shared programs: 559438333 -> 559435832 (<.01%) cycles in affected programs: 199160 -> 196659 (-1.26%) helped: 42 HURT: 3 helped stats (abs) min: 2 max: 184 x̄: 61.50 x̃: 51 helped stats (rel) min: 0.02% max: 6.94% x̄: 1.41% x̃: 1.40% HURT stats (abs) min: 2 max: 40 x̄: 27.33 x̃: 40 HURT stats (rel) min: 0.05% max: 0.74% x̄: 0.51% x̃: 0.74% 95% mean confidence interval for cycles value: -71.47 -39.69 95% mean confidence interval for cycles %-change: -1.64% -0.93% Cycles are helped. Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown) total instructions in shared programs: 11811776 -> 11811553 (<.01%) instructions in affected programs: 15201 -> 14978 (-1.47%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 5.72 x̃: 6 helped stats (rel) min: 0.44% max: 2.53% x̄: 1.30% x̃: 1.26% 95% mean confidence interval for instructions value: -7.21 -4.23 95% mean confidence interval for instructions %-change: -1.48% -1.12% Instructions are helped. 
total cycles in shared programs: 257617270 -> 257614589 (<.01%) cycles in affected programs: 212107 -> 209426 (-1.26%) helped: 45 HURT: 0 helped stats (abs) min: 2 max: 180 x̄: 59.58 x̃: 54 helped stats (rel) min: 0.02% max: 6.02% x̄: 1.30% x̃: 1.32% 95% mean confidence interval for cycles value: -74.02 -45.14 95% mean confidence interval for cycles %-change: -1.59% -1.01% Cycles are helped. Iron Lake total instructions in shared programs: 7886648 -> 7886515 (<.01%) instructions in affected programs: 14106 -> 13973 (-0.94%) helped: 29 HURT: 0 helped stats (abs) min: 1 max: 10 x̄: 4.59 x̃: 4 helped stats (rel) min: 0.35% max: 1.83% x̄: 0.90% x̃: 0.81% 95% mean confidence interval for instructions value: -5.65 -3.52 95% mean confidence interval for instructions %-change: -1.03% -0.76% Instructions are helped. total cycles in shared programs: 178100812 -> 178100396 (<.01%) cycles in affected programs: 67970 -> 67554 (-0.61%) helped: 29 HURT: 0 helped stats (abs) min: 2 max: 40 x̄: 14.34 x̃: 12 helped stats (rel) min: 0.15% max: 1.69% x̄: 0.58% x̃: 0.54% 95% mean confidence interval for cycles value: -18.30 -10.39 95% mean confidence interval for cycles %-change: -0.71% -0.45% Cycles are helped. GM45 total instructions in shared programs: 4857939 -> 4857872 (<.01%) instructions in affected programs: 7426 -> 7359 (-0.90%) helped: 15 HURT: 0 helped stats (abs) min: 1 max: 10 x̄: 4.47 x̃: 4 helped stats (rel) min: 0.33% max: 1.80% x̄: 0.87% x̃: 0.77% 95% mean confidence interval for instructions value: -6.06 -2.87 95% mean confidence interval for instructions %-change: -1.06% -0.67% Instructions are helped. 
total cycles in shared programs: 122167930 -> 122167654 (<.01%) cycles in affected programs: 43118 -> 42842 (-0.64%) helped: 15 HURT: 0 helped stats (abs) min: 4 max: 40 x̄: 18.40 x̃: 16 helped stats (rel) min: 0.15% max: 1.69% x̄: 0.62% x̃: 0.54% 95% mean confidence interval for cycles value: -25.03 -11.77 95% mean confidence interval for cycles %-change: -0.82% -0.41% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-06 11:17:29 -08:00
Ian Romanick	52607658ff	nir: Replace fmin(b2f(a), b) with a bcsel All of the affected shaders are HDR mappers from Serious Sam 3. All Gen7+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14516285 -> 14516273 (<.01%) instructions in affected programs: 348 -> 336 (-3.45%) helped: 12 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 2.08% max: 6.67% x̄: 4.31% x̃: 4.17% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -5.55% -3.06% Instructions are helped. total cycles in shared programs: 533163876 -> 533163808 (<.01%) cycles in affected programs: 1144 -> 1076 (-5.94%) helped: 4 HURT: 0 helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17 helped stats (rel) min: 5.80% max: 6.08% x̄: 5.94% x̃: 5.94% 95% mean confidence interval for cycles value: -18.84 -15.16 95% mean confidence interval for cycles %-change: -6.20% -5.68% Cycles are helped. Sandy Bridge total instructions in shared programs: 10533321 -> 10533309 (<.01%) instructions in affected programs: 372 -> 360 (-3.23%) helped: 12 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 2.00% max: 5.88% x̄: 3.91% x̃: 3.85% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -4.96% -2.86% Instructions are helped. total cycles in shared programs: 146136632 -> 146136428 (<.01%) cycles in affected programs: 11668 -> 11464 (-1.75%) helped: 12 HURT: 0 helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17 helped stats (rel) min: 0.99% max: 3.44% x̄: 2.20% x̃: 2.29% 95% mean confidence interval for cycles value: -17.66 -16.34 95% mean confidence interval for cycles %-change: -2.82% -1.58% Cycles are helped. 
Iron Lake total instructions in shared programs: 7886301 -> 7886277 (<.01%) instructions in affected programs: 576 -> 552 (-4.17%) helped: 12 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 2.94% max: 6.06% x̄: 4.51% x̃: 4.65% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -5.30% -3.72% Instructions are helped. total cycles in shared programs: 178113176 -> 178113176 (0.00%) cycles in affected programs: 2116 -> 2116 (0.00%) helped: 2 HURT: 4 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 1.14% max: 1.14% x̄: 1.14% x̃: 1.14% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.50% max: 0.65% x̄: 0.58% x̃: 0.58% 95% mean confidence interval for cycles value: -3.25 3.25 95% mean confidence interval for cycles %-change: -0.93% 0.94% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857756 -> 4857744 (<.01%) instructions in affected programs: 294 -> 282 (-4.08%) helped: 6 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 2.94% max: 5.71% x̄: 4.40% x̃: 4.55% 95% mean confidence interval for instructions value: -2.00 -2.00 95% mean confidence interval for instructions %-change: -5.71% -3.09% Instructions are helped. total cycles in shared programs: 122178730 -> 122178722 (<.01%) cycles in affected programs: 700 -> 692 (-1.14%) helped: 2 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-06 11:17:29 -08:00
Ian Romanick	b974dfee11	nir: Pull b2f out of bcsel All platforms had similar results. (Skylake shown) total instructions in shared programs: 14516592 -> 14516586 (<.01%) instructions in affected programs: 500 -> 494 (-1.20%) helped: 2 HURT: 0 total cycles in shared programs: 533167044 -> 533166998 (<.01%) cycles in affected programs: 6988 -> 6942 (-0.66%) helped: 2 HURT: 0 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-06 11:17:29 -08:00
Ian Romanick	f50400cc80	nir: Replace an odd comparison involving fmin of -b2f I noticed the fge version while looking at a shader for an unrelated reason. The feq version prevents a regression in a later change that performs strength reduction of some compares. Broadwell and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14514808 -> 14514796 (<.01%) instructions in affected programs: 750 -> 738 (-1.60%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.83% max: 1.96% x̄: 1.40% x̃: 1.40% 95% mean confidence interval for instructions value: -6.67 0.67 95% mean confidence interval for instructions %-change: -2.43% -0.36% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533144939 -> 533144853 (<.01%) cycles in affected programs: 8911 -> 8825 (-0.97%) helped: 4 HURT: 0 helped stats (abs) min: 16 max: 32 x̄: 21.50 x̃: 19 helped stats (rel) min: 0.60% max: 1.89% x̄: 1.28% x̃: 1.31% 95% mean confidence interval for cycles value: -32.94 -10.06 95% mean confidence interval for cycles %-change: -2.30% -0.26% Cycles are helped. Haswell total instructions in shared programs: 13093785 -> 13093775 (<.01%) instructions in affected programs: 924 -> 914 (-1.08%) helped: 4 HURT: 2 helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.82% max: 1.95% x̄: 1.39% x̃: 1.39% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19% 95% mean confidence interval for instructions value: -4.53 1.20 95% mean confidence interval for instructions %-change: -2.02% 0.97% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 409580553 -> 409580118 (<.01%) cycles in affected programs: 10909 -> 10474 (-3.99%) helped: 5 HURT: 1 helped stats (abs) min: 6 max: 222 x̄: 89.60 x̃: 18 helped stats (rel) min: 0.16% max: 24.72% x̄: 9.54% x̃: 1.78% HURT stats (abs) min: 13 max: 13 x̄: 13.00 x̃: 13 HURT stats (rel) min: 0.39% max: 0.39% x̄: 0.39% x̃: 0.39% 95% mean confidence interval for cycles value: -180.68 35.68 95% mean confidence interval for cycles %-change: -19.55% 3.79% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 11811851 -> 11811840 (<.01%) instructions in affected programs: 1032 -> 1021 (-1.07%) helped: 5 HURT: 1 helped stats (abs) min: 1 max: 5 x̄: 2.40 x̃: 1 helped stats (rel) min: 0.63% max: 1.95% x̄: 1.13% x̃: 0.97% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 1.19% max: 1.19% x̄: 1.19% x̃: 1.19% 95% mean confidence interval for instructions value: -4.17 0.51 95% mean confidence interval for instructions %-change: -1.86% 0.36% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 257618403 -> 257618168 (<.01%) cycles in affected programs: 10784 -> 10549 (-2.18%) helped: 4 HURT: 2 helped stats (abs) min: 4 max: 220 x̄: 64.50 x̃: 17 helped stats (rel) min: 0.50% max: 24.34% x̄: 7.07% x̃: 1.72% HURT stats (abs) min: 9 max: 14 x̄: 11.50 x̃: 11 HURT stats (rel) min: 0.24% max: 0.42% x̄: 0.33% x̃: 0.33% 95% mean confidence interval for cycles value: -133.11 54.78 95% mean confidence interval for cycles %-change: -14.79% 5.59% Inconclusive result (value mean confidence interval includes 0). GM45, Iron Lake, and Sandy Bridge had similar results. 
(Sandy Bridge shown) total instructions in shared programs: 10533871 -> 10533859 (<.01%) instructions in affected programs: 865 -> 853 (-1.39%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.63% max: 1.83% x̄: 1.22% x̃: 1.21% 95% mean confidence interval for instructions value: -6.67 0.67 95% mean confidence interval for instructions %-change: -2.16% -0.29% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 146139904 -> 146139852 (<.01%) cycles in affected programs: 15213 -> 15161 (-0.34%) helped: 4 HURT: 0 helped stats (abs) min: 3 max: 18 x̄: 13.00 x̃: 15 helped stats (rel) min: 0.15% max: 0.84% x̄: 0.39% x̃: 0.29% 95% mean confidence interval for cycles value: -23.79 -2.21 95% mean confidence interval for cycles %-change: -0.88% 0.09% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-06 11:17:29 -08:00
Ian Romanick	380136e998	nir: Mark bcsel-to-fmin (or fmax) transformations as inexact These transformations are inexact because section 4.7.1 (Range and Precision) says: Operations and built-in functions that operate on a NaN are not required to return a NaN as the result. The fmin or fmax might not return NaN in cases where the original expression would be required to return NaN. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-06 11:17:14 -08:00
Ian Romanick	4addd34b04	nir: Recognize some more open-coded fmin / fmax This transformation is inexact because section 4.7.1 (Range and Precision) says: Operations and built-in functions that operate on a NaN are not required to return a NaN as the result. The fmin or fmax might not return NaN in cases where the original expression would be required to return NaN. v2: Reorder operands and mark as inexact. The latter suggested by Jason. shader-db results: Haswell, Broadwell, and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14514817 -> 14514808 (<.01%) instructions in affected programs: 229 -> 220 (-3.93%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4 helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12% total cycles in shared programs: 533145211 -> 533144939 (<.01%) cycles in affected programs: 37268 -> 36996 (-0.73%) helped: 8 HURT: 0 helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2 helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05% Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown) total cycles in shared programs: 257618409 -> 257618403 (<.01%) cycles in affected programs: 12582 -> 12576 (-0.05%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05% No changes on Iron Lake or GM45. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-06 11:17:14 -08:00
Gurkirpal Singh	c62cf1f165	st/omx/tizonia/h264d: Add EGLImage support Example Gstreamer pipeline : MESA_ENABLE_OMX_EGLIMAGE=1 GST_GL_API=gles2 GST_GL_PLATFORM=egl gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! glimagesink Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Julien Isorce <julien.isorce@gmail.com>	2018-03-06 17:21:11 +00:00
Gurkirpal Singh	b2f2236dc5	st/omx/tizonia: Add H.264 encoder v2: Refactor out screen functions to st/omx Example Gstreamer pipeline : gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! avdec_h264 ! videoconvert ! omxh264enc ! h264parse ! avdec_h264 ! videoconvert ! ximagesink Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Julien Isorce <julien.isorce@gmail.com>	2018-03-06 17:20:08 +00:00
Gurkirpal Singh	83d4a5d5ae	st/omx/tizonia: Add H.264 decoder v2: Refactor out screen functions to st/omx Example Gstreamer pipeline : gst-launch-1.0 filesrc location=movie.mp4 ! qtdemux ! h264parse ! omxh264dec ! videoconvert ! ximagesink Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Julien Isorce <julien.isorce@gmail.com>	2018-03-06 14:29:42 +00:00
Gurkirpal Singh	430ccdbcb9	st/omx/tizonia: Add entrypoint Adds base files for adding components Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Julien Isorce <julien.isorce@gmail.com>	2018-03-06 14:29:42 +00:00
Gurkirpal Singh	e2afa154e9	st/omx/tizonia: Add --enable-omx-tizonia flag and build files Allow only bellagio or tizonia to be used at the same time. Detect tizonia package config file Generate libomx_mesa.so and install it to libtizcore.pc::pluginsdir Only compile empty source (target.c) for now. GSoC Project link: https://summerofcode.withgoogle.com/projects/#4737166321123328 Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Julien Isorce <julien.isorce@gmail.com>	2018-03-06 14:29:42 +00:00
Gurkirpal Singh	bb5e27fab6	st/omx/bellagio: Rename st and target directories v2: Refactor out screen functions to st/omx Allows to keep all the code under st/omx (st/omx/tizonia and st/omx/bellagio). Reverts targets/omx_bellagio to omx as additions to existing files is enough to compile for both bellagio and tizonia. * autotools changes: --enable-omx -> --enable-omx-bellagio * meson changes: -Dgallium-omx=false -> -Dgallium-omx=disabled -Dgallium-omx=true -> -Dgallium-omx=bellagio Acked-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Julien Isorce <julien.isorce@gmail.com>	2018-03-06 13:07:03 +00:00
Samuel Pitoiset	e96e6f60f7	radv: report the scratch private memory size with shader stats Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:38:42 +01:00
Samuel Pitoiset	7f6b91c9c3	ac/nir: count the scratch private memory size Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:38:40 +01:00
Samuel Pitoiset	3b8e7459f2	ac: add ac_count_scratch_private_memory() Imported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:38:38 +01:00
Samuel Pitoiset	f3275ca01c	ac/nir: only enable used channels when exporting parameters This allows us to generate, for example, "exp param0 v0, off, off, off" if only the first channel is needed. Not sure if this improves performance but it's worth trying. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:38:35 +01:00
Samuel Pitoiset	675dde13b2	ac: update enabled channels mask when optimizing PARAM exports When the mask is not 0xf we need to update the number of enabled channels, otherwise the hardware won't emit the components that are combined. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:37:52 +01:00
Samuel Pitoiset	c24abae9dc	ac/nir: pass the number of enabled channels to si_llvm_init_export_args() Currently, it's always 0xf but an upcoming patch will reduce the number of channels for parameters export. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:37:50 +01:00
Samuel Pitoiset	5cd34f03c0	ac/shader: scan output usage mask for VS and TES Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 10:37:47 +01:00
Clayton Craft	d1fa30e0f8	intel: Add missing includes for building on Android This adds a missing library to the i965/Android.mk file, and updates intel/Android.mk to include the new library. Without this, mesa does not build on Android. Fixes: `272bef0601` "intel: Split gen_device_info out into libintel_dev" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-06 00:14:22 -08:00
Tapani Pälli	237c9caa78	vulkan: do not expose surface/swapchain extensions on Android On Android surface/swapchain extensions are implemented by the loader. Patch modifies both anv and radv extension scripts disabling currently exposed ones. See also earlier commit `9f763c1f9b`. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-06 08:02:59 +02:00
Tapani Pälli	85518657a9	anv: Don't expose VK_KHX_multiview on android. Just like commit `2ffe395` does for radv. Fixes following dEQP test on i965: dEQP-VK.api.info.android.no_unknown_extensions v2: make it !ANDROID since this extension is not about surfaces/swapchain Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-06 08:01:20 +02:00
Roland Scheidegger	cf4a92fda2	gallium: increase PIPE_MAX_SHADER_SAMPLER_VIEWS to 128 Some state trackers require 128. (There are no plans to increase PIPE_MAX_SAMPLERS too, since with gl state tracker it's unlikely more than 32 will be needed, if you need more use bindless.)	2018-03-06 05:18:17 +01:00
Roland Scheidegger	06e724c7b4	tgsi/scan: use wrap-around shift behavior explicitly for file_mask The comment said it will only represent the lowest 32 regs. This was not entirely true in practice, since at least on x86 you'll get masked shifts (unless the compiler could recognize it already and toss it out). It turns out this actually works out alright (presumably no one uses it for temp regs) when increasing max sampler views, so make that behavior explicit. Albeit it feels a bit hacky (but in any case, explicit behavior there is better than undefined behavior). Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-06 05:18:17 +01:00
Aaron Watry	95ae6c0355	clover: Allow overriding platform/device version numbers Useful for testing API, builtin library, and device completeness of not-yet-supported versions. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Francisco Jerez <currojerez@riseup.net> (v3) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Cc: Jan Vesely <jan.vesely@rutgers.edu> v4: Remove redundant std::string wrapper around debug_get_option calls v3: mark CL version overrides as static and const v2: Make version_string in platform const in case	2018-03-05 20:09:46 -06:00
Aaron Watry	106020712f	clover/llvm: Pass device down to compile We'll need to be able to detect device version to define the appropriate __OPENCL_VERSION__ header. v2: Rebase after removing the previous patch (Pierre) - Removed "clover: Add device_clc_version to llvm::create_compiler_instance" Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-03-05 20:09:46 -06:00
Aaron Watry	fc629e3594	clover: Pass device to llvm::create_compiler_instance We'll be using dev.device_clc_version to select the default language version soon along with the existing ir_target field. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net> v4: Pass the device down instead of device_clc_version as a separate field v3: Revise to acknowledge that we now have the device in compile/link_program instead of the string values. v2: (Pierre) Move changes to create_compiler_instance invocation to correct patch to prevent temporary build breakage. (Jan) Use device_clc_version instead of device_version for compile/link	2018-03-05 20:09:46 -06:00
Aaron Watry	dd81ca3883	clover/llvm: Use device in llvm compilation instead of copying fields Copying the individual fields from the device when compiling/linking will lead to an unnecessarily large number of fields getting passed around. v3: Rebase on current master v2: Use device in function args before making additional changes in following patches Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-03-05 20:09:46 -06:00
Timothy Arceri	71b3d681d8	radeonsi/nir: fix handling of doubles for gs inputs Fixes piglit test: tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 11:44:06 +11:00
Timothy Arceri	20bd0f6a2b	ac: pass the unmodified number of components to load gs inputs Currently both users of this would overflow an array when the input was a dual slot double as they expected the number of components to be a max of 4. Since we pass the type we can just let the functions handle doubles in a way they choose. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 11:44:06 +11:00
Timothy Arceri	2a68c6c6c8	radeonsi: move si_nir_load_input_gs() to si_shader.c All the tess shader and tgsi equivalents are here and it allows us to use llvm_type_is_64bit() in the following patch without exposing it externally. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-06 11:44:06 +11:00
Boris Brezillon	9ea90ffb98	broadcom/vc4: Add support for HW perfmon The V3D engine provides several perf counters. Implement ->get_driver_query_[group_]info() so that these counters are exposed through the GL_AMD_performance_monitor extension. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2018-03-05 15:54:04 -08:00
Boris Brezillon	5924379a58	drm-uapi: Update vc4 header with perfmon related definitions v2: Update to the final version with the documentation. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2018-03-05 15:53:48 -08:00
Roland Scheidegger	434523cf2a	r600: fix color export mask The r600 code (not the eg one) forgot to copy the ps_color_export_mask in commit `5b14e06d8b` when updating the pixel state, leading to misrenderings (probably with MRT). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105262 Tested-by: LoneVVolf <lonewolf@xs4all.nl> Tested-by: Pavel Vinogradov <public@sourcemage.org>	2018-03-05 20:15:05 +01:00
Andres Gomez	72552012c7	travis: keep meson version below 0.45.0 Recently Meson upgraded to 0.45.0 and it needs python 3.5+, which is not available in Trusty. Cc: Eric Engestrom <eric.engestrom@imgtec.com> Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Jon Turney <jon.turney@dronecode.org.uk> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-03-05 21:12:37 +02:00
Kenneth Graunke	0472aa3efe	intel: Drop SURFACE_FORMAT enum from genxml. We want people to be using ISL_FORMAT_*, rather than the genxml format enumerations. This patch drops 10 separate copies, and drops a bunch of ugly casting. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [jordan.l.justen@intel.com: Minor changes for rebase] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-05 09:51:08 -08:00
Jordan Justen	755e7e6c20	intel/common: Use isl for decoder surface formats Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-05 09:51:04 -08:00
Jordan Justen	bd3392423d	intel/isl: Add isl_format_is_valid Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-05 09:51:01 -08:00
Jordan Justen	272bef0601	intel: Split gen_device_info out into libintel_dev Split out the device info so isl doesn't depend on intel/common. Now it will depend on the new intel/dev device info lib. This will allow the decoder in intel/common to use isl, allowing us to apply Ken's patch that removes the genxml duplication of surface formats. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-05 09:47:37 -08:00
Gert Wollny	9a0d7bb48c	gallium/aux/hud: Avoid possible buffer overflow Limit the length of acceptable cpu names for use in hud_get_num_cpufreq in order to avoid a buffer overflow later in add_object when this name is copied into cpufreq_info::name. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105274 Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-03-05 11:38:28 -05:00
Eric Engestrom	b98c905a46	gbm: give a name to rgba fields Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-05 15:14:36 +00:00
Andres Gomez	40abffb295	egl: remove duplicated initialization Found by inspection. The line removed is a duplicate of the line literally just above the 3 lines of context usually printed in a commit log. v2: enhance the commit log (Emil). Cc: Ian Romanick <ian.d.romanick@intel.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-05 15:55:53 +02:00
Rob Clark	5a5a43078c	freedreno/ir3: start dealing with half-precision Some instructions assume src and/or dst is half-precision based on a type field (ie. f32/s32/u32 are full precision but others are half precision). So add some code to sanity check the src/dst registers to catch mixups. Also propagate half-precision flag for SSA sources. The instruction consuming a SSA value needs to be of the same type as the one producing it. This is probably not complete half-precision support, but a useful first step. We do still need to add support for nir alu instructions for converting between half/full precision. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	175d1b4372	freedreno/ir3: fix fixing-up register footprint It isn't just vertex shaders that need to fixup reg footprint for inputs populated before shader starts. This problem showed up with compute shaders. If you have (for example) a localregid sysval, but only the .x component is used, the hw still writes the .yz components, which could overflow into other threads causing corruption. Showed up in cl cts 'basic/test_basic intmath_int'. But in theory the same problem could crop up elsewhere. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	9a62536108	freedreno: surfaces can be PIPE_BUFFER At least for clover. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	d7af35a7f3	freedreno/a5xx: handle compute resources Not entirely sure why this is a different BIND bit, but it is. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	82c71b09d5	freedreno/ir3: ignore return jump I think this should also always only occur at the end of a BB (by definition), and the BB successor should be the end block. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	c9b1cc33df	freedreno: add some more compute caps Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	9630f4df3b	freedreno/a5xx: don't expose 64b pointers yet Temporary hack, but since we can't do 64b math yet in ir3, pretend that we don't support 64b pointers. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	54988f1e6b	freedreno: steal handy macro for compute caps from nouveau Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	26a9321d0a	freedreno: add global_bindings state Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	8c42f63151	freedreno/ir3: small cleanup Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	76687b0c0a	freedreno: add pctx->memory_barrier() Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Rob Clark	9e4f5966e8	freedreno/ir3: cmdline compiler updates for spv shaders Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-03-05 08:05:33 -05:00
Samuel Pitoiset	322a51b549	ac: add ac_build_fsign() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-05 11:04:36 +01:00
Samuel Pitoiset	e8bdde2289	ac: add ac_build_isign() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-05 11:04:32 +01:00
Samuel Pitoiset	459e33900f	ac: add ac_build_fract() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-05 11:04:30 +01:00
gurchetansingh@chromium.org	fe0647df5a	virgl: add offset alignment values to v2 caps struct glBindBufferRange(..) in vrend_draw_bind_ubo is failing with more than one uniform block. This is due to improper alignment of the start of the second block. Let's query the proper alignment from the driver and pass it back to Mesa. Let's query for the texture alignment too, even though the Virgl renderer doesn't call glTexBufferRange yet. The default values are the widest workable range possible (for example, GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT on Nvidia is 256). Fixes: dEQP-GLES3.functional.ubo.* on Nvidia Example test: dEQP-GLES3.functional.ubo.multi_basic_types.single_buffer.shared_vertex Note: This is based on "virgl: reduce some default capset limits.", which hasn't landed in Mesa yet but should relatively soon. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-05 13:29:39 +10:00
Dave Airlie	9283cf2ad1	virgl: reduce some default capset limits. Since v2 might take a while to rollout, we should reduce these inside some gathered minimums and then v2 can increase them using host values. Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-05 13:29:38 +10:00
Dave Airlie	cd32258ec1	virgl: handle getting new capsets. This checks the kernel api is new enough and asks for the larger caps size since the kernel won't mess it up now. Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-05 13:29:38 +10:00
Timothy Arceri	70190a6567	radeonsi/nir: call ac_lower_indirect_derefs() Fixes piglit tests: tests/spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec3-index-rd.shader_test tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-05 14:09:23 +11:00
Timothy Arceri	561503e3bd	radeonsi: add chip class to compiler_ctx_state This will be used in the following patch. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-05 14:09:23 +11:00
Timothy Arceri	0f2c7341e8	ac/radv: move lower_indirect_derefs() to ac_nir_to_llvm.c Until llvm handles indirects better we will need to use these workarounds in the radeonsi backend also. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-05 14:09:23 +11:00
Bas Nieuwenhuizen	eea20d59ab	radv: Fix copying from 3D images starting at non-zero depth. Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-05 01:04:54 +01:00
Vinson Lee	bb742b6ebf	swr/rast: Fix macOS macro. Fixes: `a25093de71` ("swr/rast: Implement JIT shader caching to disk") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2018-03-04 13:23:57 -08:00
Mathias Fröhlich	411aa8c322	vbo: Try to reuse the same VAO more often for successive dlists. The change tries to catch more opportunities to reuse the same set of VAO's when building up display lists. Instead of checking the offset with respect to the beginning of the vertex buffer object the change tries to apply this same optimization with respect to the previous display list node. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-03 05:56:35 +01:00
Ian Romanick	a9eb455e29	mesa: Silence unused parameter warnings from TEXSTORE_PARAMS Reduces my build from 1717 warnings to 1547 warnings by silencing 170 instances of things like In file included from ../../SOURCE/master/src/mesa/main/texcompress_bptc.h:30:0, from ../../SOURCE/master/src/mesa/main/texcompress_bptc.c:31: ../../SOURCE/master/src/mesa/main/texcompress_bptc.c: In function ‘_mesa_texstore_bptc_rgba_unorm’: ../../SOURCE/master/src/mesa/main/texstore.h:60:14: warning: unused parameter ‘dstFormat’ [-Wunused-parameter] mesa_format dstFormat, \ ^ ../../SOURCE/master/src/mesa/main/texcompress_bptc.c:1276:32: note: in expansion of macro ‘TEXSTORE_PARAMS’ _mesa_texstore_bptc_rgba_unorm(TEXSTORE_PARAMS) ^~~~~~~~~~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	1049b57bf2	i965: Silence unused parameter warnings in genX_state_upload Reduces my build from 1772 warnings to 1717 warnings by silencing 55 instances of things like ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_vertex_buffer_state’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:313:41: warning: unused parameter ‘end_offset’ [-Wunused-parameter] unsigned end_offset, ^~~~~~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_sampler_state_pointers_xs’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4689:58: warning: unused parameter ‘brw’ [-Wunused-parameter] genX(emit_sampler_state_pointers_xs)(struct brw_context *brw, ^~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4690:62: warning: unused parameter ‘stage_state’ [-Wunused-parameter] struct brw_stage_state *stage_state) ^~~~~~~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_upload_default_color’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4730:40: warning: unused parameter ‘format’ [-Wunused-parameter] mesa_format format, GLenum base_format, ^~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘translate_wrap_mode’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4906:41: warning: unused parameter ‘brw’ [-Wunused-parameter] translate_wrap_mode(struct brw_context *brw, GLenum wrap, bool using_nearest) ^~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_update_sampler_state’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4972:37: warning: unused parameter ‘batch_offset_for_sampler_state’ [-Wunused-parameter] uint32_t batch_offset_for_sampler_state) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	50bf186829	isl: Silence unused parameter warnings in __gen_combine_address implementations Reduces my build from 1808 warnings to 1772 warnings by silencing 36 instances of things like ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c: In function ‘__gen_combine_address’: ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:29: warning: unused parameter ‘data’ [-Wunused-parameter] __gen_combine_address(void *data, void *loc, uint64_t addr, uint32_t delta) ^~~~ ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:41: warning: unused parameter ‘loc’ [-Wunused-parameter] __gen_combine_address(void *data, void *loc, uint64_t addr, uint32_t delta) ^~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	492a472b28	genxml: Silence unused parameter warnings in generated pack code Reduces my build from 1960 warnings to 1808 warnings by silencing 152 instances of things like In file included from ../../SOURCE/master/src/intel/genxml/genX_pack.h:32:0, from ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:36: src/intel/genxml/gen4_pack.h: In function ‘__gen_uint’: src/intel/genxml/gen4_pack.h:58:49: warning: unused parameter ‘end’ [-Wunused-parameter] __gen_uint(uint64_t v, uint32_t start, uint32_t end) ^~~ src/intel/genxml/gen4_pack.h: In function ‘__gen_offset’: src/intel/genxml/gen4_pack.h:94:35: warning: unused parameter ‘start’ [-Wunused-parameter] __gen_offset(uint64_t v, uint32_t start, uint32_t end) ^~~~~ src/intel/genxml/gen4_pack.h:94:51: warning: unused parameter ‘end’ [-Wunused-parameter] __gen_offset(uint64_t v, uint32_t start, uint32_t end) ^~~ src/intel/genxml/gen4_pack.h: In function ‘__gen_ufixed’: src/intel/genxml/gen4_pack.h:133:48: warning: unused parameter ‘end’ [-Wunused-parameter] __gen_ufixed(float v, uint32_t start, uint32_t end, uint32_t fract_bits) ^~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	f726695cce	i965: Silence unused parameter warnings in blorp Reduces my build from 2023 warnings to 1960 warnings by silencing 63 instances of things like In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:33:0: ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_cc_viewport’: ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:500:51: warning: unused parameter ‘params’ [-Wunused-parameter] const struct blorp_params *params) ^~~~~~ ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_sampler_state’: ../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:524:53: warning: unused parameter ‘params’ [-Wunused-parameter] const struct blorp_params *params) ^~~~~~ In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:36:0: ../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h: In function ‘blorp_emit_vs_state’: ../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h:50:48: warning: unused parameter ‘params’ [-Wunused-parameter] const struct blorp_params *params) ^~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c: In function ‘blorp_flush_range’: ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:39: warning: unused parameter ‘batch’ [-Wunused-parameter] blorp_flush_range(struct blorp_batch *batch, void *start, size_t size) ^~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:52: warning: unused parameter ‘start’ [-Wunused-parameter] blorp_flush_range(struct blorp_batch *batch, void *start, size_t size) ^~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:66: warning: unused parameter ‘size’ [-Wunused-parameter] blorp_flush_range(struct blorp_batch *batch, void *start, size_t size) ^~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	3a944316c4	nir: Silence unused parameter warnings in generated nir_constant_expressions code Reduces my build from 2075 warnings to 2023 warnings by silencing 52 instances of things like src/compiler/nir/nir_constant_expressions.c: In function ‘evaluate_bfi’: src/compiler/nir/nir_constant_expressions.c:1812:61: warning: unused parameter ‘bit_size’ [-Wunused-parameter] evaluate_bfi(MAYBE_UNUSED unsigned num_components, unsigned bit_size, ^~~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	ab8f2e30b8	i965: Silence unused parameter warnings in generated OA code Reduces my build from 6301 warnings to 2075 warnings by silencing 4226 instances of things like src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c: In function ‘hsw__render_basic__gpu_core_clocks__read’: src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c:41:62: warning: unused parameter ‘brw’ [-Wunused-parameter] hsw__render_basic__gpu_core_clocks__read(struct brw_context *brw, ^~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	a55dae6ea2	i965: Silence warnings about mixing enum and non-enum in conditional Reduces my build from 6451 warnings to 6301 warnings by silencing 150 instances of ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_reg_type brw_inst_src1_type(const gen_device_info*, const brw_inst*)’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:802:55: warning: enumeral and non-enumeral type in conditional expression [-Wextra] unsigned file = __builtin_strcmp("dst", #reg) == 0 ? \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~ BRW_GENERAL_REGISTER_FILE : \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ brw_inst_##reg##_reg_file(devinfo, inst); \ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h:811:1: note: in expansion of macro ‘REG_TYPE’ REG_TYPE(src1) ^~~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	feefb7810e	intel/compiler: Silence unused parameter warnings in release builds Reduces my build from 7005 warnings to 6451 warnings by silencing 554 instances of In file included from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:28:0: ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src0_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:346:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_3src_a1_src0_imm(const struct gen_device_info *devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src2_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:354:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_3src_a1_src2_imm(const struct gen_device_info *devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src0_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:362:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_set_3src_a1_src0_imm(const struct gen_device_info *devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src2_imm’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:370:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_set_3src_a1_src2_imm(const struct gen_device_info *devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_imm_uq’: ../../SOURCE/master/src/intel/compiler/brw_inst.h:703:47: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_inst_imm_uq(const struct gen_device_info *devinfo, const brw_inst *insn) ^~~~~~~ In file included from ../../SOURCE/master/src/intel/compiler/brw_shader.h:29:0, from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:29: ../../SOURCE/master/src/intel/compiler/brw_compiler.h: In function ‘brw_stage_has_packed_dispatch’: ../../SOURCE/master/src/intel/compiler/brw_compiler.h:1277:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_stage_has_packed_dispatch(const struct gen_device_info *devinfo, ^~~~~~~ ../../SOURCE/master/src/intel/compiler/brw_disasm.c: In function ‘src_ia1’: ../../SOURCE/master/src/intel/compiler/brw_disasm.c:849:18: warning: unused parameter ‘_reg_file’ [-Wunused-parameter] unsigned _reg_file, ^~~~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Ian Romanick	c8a03ab453	i965: Silence unused parameter warnings Reduces my build from 7119 warnings to 7005 warnings by silencing 114 instances of In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/brw_context.h:46:0, from ../../SOURCE/master/src/mesa/drivers/dri/i965/intel_pixel_read.c:38: ../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h: In function ‘brw_bo_unmap’: ../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h:258:47: warning: unused parameter ‘bo’ [-Wunused-parameter] static inline int brw_bo_unmap(struct brw_bo *bo) { return 0; } ^~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-02 16:10:44 -08:00
Kenneth Graunke	9fa95359df	intel: Drop program size pointer from vec4/fs assembly getters. These days, we're just passing a pointer to a prog_data field, which we already have access to. We can just use it directly. (In the past, it was a pointer to a separate value.) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-03-02 14:20:22 -08:00
Kenneth Graunke	b04cf529f2	i965: Mark upload buffers with MAP_ASYNC and MAP_PERSISTENT. This should have no practical impact. For the default uploader, we don't really care, but for others, we may want to append more data as the GPU is reading existing data, which means we need async and persistent flags. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-03-02 14:19:33 -08:00
Kenneth Graunke	eb99bf8abe	i965: Generalize intel_upload.c to support multiple uploaders. I'd like to reuse the upload logic for a new program cache, but the buffers will need to have a different lifetime than the default uploader, and also some address space restrictions. So, we can't use a single uploader for both situations - we'll need two of them. This creates a public 'uploader' structure, and adjusts the interface to take an uploader rather than always using brw->upload. It should have no functional change at the moment. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-03-02 14:19:33 -08:00
Anuj Phogat	56dc9f9f49	intel/compiler: Memory fence commit must always be enabled for gen10+ Commit bit in the message descriptor (Bit 13) must be always set to true in CNL+ for memory fence messages. It also fixes a piglit GPU hang on cnl+ in simulation environment. Piglit test: arb_shader_image_load_store-shader-mem-barrier See HSD ES # 1404612949 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-03-02 11:45:21 -08:00
Francisco Jerez	4b4838b1ae	Revert "i965/fs: Predicate byte scattered writes if needed" This reverts commit `a4031bdfa9`. It's redundant with the sample mask predication done at this point by the common logical send lowering infrastructure, and rather buggy because it wasn't applying the correct sample mask in shaders using discard, since the dispatch mask returned by FS_OPCODE_MOV_DISPATCH_TO_FLAGS doesn't reflect samples discarded by the shader, so it could have led to data corruption in fragment shader invocations that execute discard based on a non-dynamically uniform condition. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-02 11:28:56 -08:00
Francisco Jerez	c063e88909	intel/fs: Handle surface opcode sample masks via predication. The main motivation is to enable HDC surface opcodes on ICL which no longer allows the sample mask to be provided in a message header, but this is enabled all the way back to IVB when possible because it decreases the instruction count of some shaders using HDC messages significantly, e.g. one of the SynMark2 CSDof compute shaders decreases instruction count by about 40% due to the removal of header setup boilerplate which in turn makes a number of send message payloads more easily CSE-able. Shader-db results on SKL: total instructions in shared programs: 15325319 -> 15314384 (-0.07%) instructions in affected programs: 311532 -> 300597 (-3.51%) helped: 491 HURT: 1 Shader-db results on BDW where the optimization needs to be disabled in some cases due to hardware restrictions: total instructions in shared programs: 15604794 -> 15598028 (-0.04%) instructions in affected programs: 220863 -> 214097 (-3.06%) helped: 351 HURT: 0 The FPS of SynMark2 CSDof improves by 5.09% ±0.36% (n=10) on my SKL laptop with this change. According to Eero this improves performance of the same test by 9% on BYT and by 7-8% on BXT J4205 and on SKL GT2 desktop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>	2018-03-02 11:28:56 -08:00
Francisco Jerez	e7c9adca57	intel/eu: Plumb header present bit to codegen helpers for HDC messages. This makes sure that the header-present bit of the message descriptor is in sync with the IR instruction fields, which gives the optimizer more control to avoid the overhead of setting up a message header when it's possible to do so. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-02 11:28:56 -08:00
Francisco Jerez	6edb332b44	intel/ir: Allow arbitrary scratch flag registers for SHADER_OPCODE_FIND_LIVE_CHANNEL. This shouldn't cause any functional change at this point, it changes SHADER_OPCODE_FIND_LIVE_CHANNEL to use the flag register specified at the IR level instead of the hard-coded f1.0, now that it can be represented in backend_instruction::flag_subreg. This will be necessary for scheduling to behave correctly once more things start making use of f1.0. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-02 11:28:56 -08:00
Francisco Jerez	cc0fc8b8ac	intel/ir: Allow representing additional flag subregisters in the IR. This allows representing conditional mods and predicates on f1.0-f1.1 at the IR level by adding an extra bit to the flag_subreg backend_instruction field. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-02 11:28:56 -08:00
Francisco Jerez	9ec3362e0b	intel/l3: Don't allocate SLM partition on ICL+. SLM has a chunk of special-purpose memory separate from L3 on ICL+, we shouldn't allocate a partition for it on L3 anymore. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-02 11:28:56 -08:00
Charmaine Lee	af8877af3b	svga: add SVGA_NEW_PRESCALE to the tracked dirty mask for gs Since geometry shader also consumes prescale constants, the geometry shader constant buffer will need to be updated when prescale factor is changed. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-02 12:23:50 -07:00
Brian Paul	dc79b88402	svga: fix blending regression The earlier Mesa commit `3d06c8afb5` ("st/mesa: don't translate blend state when it's disabled for a colorbuffer") subtly changed the details of gallium's per-RT blend state. In particular, when pipe_rt_blend_state[i].blend_enabled is true, we have to get the src/dst blend terms from pipe_rt_blend_state[i], not [0] as before. We now have to scan the blend targets to find the first one that's enabled (if any). We have to use the index of that target for getting the src/dst blend terms. And note that we have to set identical blend terms for all targets. This fixes the Piglit fbo-drawbuffers2-blend test. VMware bug 2063493. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-03-02 12:23:50 -07:00
Brian Paul	b871a77316	svga: check svga_have_vgpu10() in svga_delete_blend_state() We were calling SVGA3D_vgpu10_DestroyBlendState() when vgpu10 was not enabled (bs->id==0 by default), resulting in lots of device errors. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-02 12:23:50 -07:00
Brian Paul	72df3a7a39	svga: if svga_update_state() fails, skip the draw call If svga_update_state() fails, we flush the command buffer and retry. If it fails again, it likely means we were unable to translate a shader for some reason (uses too many resources, for example). In that case, let's just skip the draw call. The alternative, just disabling the shader stage in question, would certainly lead to bad rendering anyway, and probably device errors. Fixes failed assertion running Piglit glsl-1.50/execution/variable-indexing/gs-output-array-vec4-index-wr.shader_test since it uses too many GS output registers (though the test still fails). VMware bug 2063492. v2: also call pipe_debug_message() so apps or apitrace can be notified when this issue occurs. v3: use svga_update_state_retry(). Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-02 12:23:50 -07:00
Brian Paul	0a7deaa0d6	svga: let svga_update_state_retry() return a bool This will allow minor simplifications elsewhere. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-03-02 12:23:50 -07:00
Brian Paul	35c5cf8959	svga: s/unsigned/boolean/ for a few local vars Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-03-02 12:23:50 -07:00
Dylan Baker	e23192022a	meson: install vulkan_intel.h header Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-02 11:11:20 -08:00
Boyuan Zhang	1ad89fa138	st/omx_bellagio: add picture profile and entry point Profile and entry point were missing in the picture structure. Therefore, add them back. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2018-03-02 12:04:36 -05:00
Boyuan Zhang	6a62e455f2	radeonsi: fix radeon create encoder return Previous patch missed a "return" when trying to modify the create encoder function, which made the whole logic fail. Therefore, add the return back. Fixes: `b38b208ff8` "radeonsi:create uvd hevc enc entry" Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-02 12:04:36 -05:00
Thierry Reding	f9bc48d41d	loader: Add support for platform and host1x busses ARM SoCs usually have their DRM/KMS devices on the platform bus, so add support for this bus in order to allow use of the DRI_PRIME environment variable with those devices. While at it, also support the host1x bus, which is effectively the same but uses an additional layer in the bus hierarchy. Note that it isn't enough to support the bus that has the rendering GPU because the loader code will also try to construct an ID path tag for a scanout-only device if it is the default that is being opened. The ID path tag for a device can be obtained by running udevadm info on the device node, as shown in this example on NVIDIA Tegra: $ udevadm info /dev/dri/card0 \| grep ID_PATH_TAG E: ID_PATH_TAG=platform-50000000_host1x The corresponding OF_FULLNAME property, from which the ID_PATH_TAG is constructed, can be found in the sysfs "uevent" attribute for the card0 device's parent: $ grep OF_FULLNAME /sys/devices/platform/50000000.host1x/drm/uevent OF_FULLNAME=/host1x@50000000 Similarly, /dev/dri/card1 corresponds to the GPU: $ udevadm info /dev/dri/card1 \| grep ID_PATH_TAG E: ID_PATH_TAG=platform-57000000_gpu and: $ grep OF_FULLNAME /sys/devices/platform/57000000.gpu/uevent OF_FULLNAME=/gpu@57000000 Changes in v2: - avoid confusing pre-increment in strdup() - add examples of tags to commit message Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-02 14:40:29 +01:00
Thierry Reding	498faea103	disk cache: Link with -latomic if necessary The disk cache implementation uses 64-bit atomic operations. For some architectures, such as 32-bit ARM, GCC will not be able to translate these operations into atomic, lock-free instructions and will instead rely on the external atomics library to provide these operations. Check at configuration time whether or not linking against libatomic is necessary and if so, create a dependency that can be used while linking the mesautil library. This is the meson equivalent of `2ef7f23820` ("configure: check if -latomic is needed for __atomic_*"). For some background information on this, see: https://gcc.gnu.org/wiki/Atomic/GCCMM Changes in v2: - clarify meaning of lock-free in commit message - fix build if -latomic is not necessary Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-03-02 11:31:59 +01:00
Samuel Pitoiset	c133a3411b	radv: do not set pending_reset_query in BeginCommandBuffer() This is just useless for two reasons: 1) flush_bits is not set accordingly, so nothing will be flushed in BeginQuery(). 2) we always flush caches in EndCommandBuffer(), so if a reset is done in a previous command buffer we are safe. Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-02 09:44:12 +01:00
Dave Airlie	bf2af063c3	r600/cayman: fix fragcoord loading recip generation. This fixes some hangs seen where the recip_ieee opcodes would end up split across the wrong slots. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-02 00:33:18 +00:00
Kenneth Graunke	cee9f38903	i965: Allow 48-bit addressing on Gen8+. This allows most GPU objects to use the full 48-bit address space offered by Gen8+ platforms, rather than being stuck with 32-bit. This expands the available GPU memory from 4G to 256TB or so. A few objects - instruction, scratch, and vertex buffers - need to remain pinned in the low 4GB of the address space for various reasons. We default everything to 48-bit but disable it in those cases. Thanks to Jason Ekstrand for blazing this trail in anv first and finding the nasty undocumented hardware issues. This patch simply rips off all of his findings. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-01 15:46:11 -08:00
Kenneth Graunke	6712611735	i965: Shorten the name of the workaround BO. This makes the name shorter in debug printouts. If "workaround_bo" is good enough for the code, it's probably good enough for debugging.	2018-03-01 15:46:11 -08:00
Kenneth Graunke	b04c5cece7	i965: Add debugging code to dump the validation list. When anything goes wrong with this code, dumping the validation list is a useful way to figure out what's happening.	2018-03-01 15:46:11 -08:00
Jason Ekstrand	ff4726077d	intel/fs: Set up sampler message headers in the visitor on gen7+ This gives the scheduler visibility into the headers which should improve scheduling. More importantly, however, it lets the scheduler know that the header gets written. As-is, the scheduler thinks that a texture instruction only reads its payload and is unaware that it may write to the first register so it may reorder it with respect to a read from that register. This is causing issues in a couple of Dota 2 vertex shaders. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104923 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-03-01 15:11:01 -08:00
Timothy Arceri	f5305c1b44	ac: fix nir_intrinsic_shared_atomic_comp_swap handling Following on from `49879f3778` this makes sure we use the correct src index. Fixes cts test: KHR-GL46.compute_shader.atomic-case3 Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-02 09:11:20 +11:00
Timothy Arceri	13cdf4e590	st/glsl_to_nir: simplify st_nir_assign_var_locations() and fix for fs outputs We only need to check for previously processed location on user defined varyings as they are the only ones that support component packing. Therefore a single instance of processed_locs can be shared by regular varyings and patches. For simplicity we make processed_locs an array in order to handle dual source blending. Fixes the following piglit test on radeonsi: tests/spec/arb_enhanced_layouts/execution/component-layout/fs-output.shader_test Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-02 09:11:20 +11:00
Jason Ekstrand	89f78cf333	anv: Enable MSAA fast-clears This speeds up the Sascha Willems multisampling demo by around 25% when using 8x or 16x MSAA. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	00da139477	anv/cmd_buffer: Add support for MCS fast-clears and resolves Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	1805c483b1	anv/cmd_buffer: Add helpers for computing resolve predicates We'll want to re-use the complex resolve predicate computations for MCS resolves so it's nice to have them as helper functions. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	a0a319f16e	anv/cmd_buffer: Handle MCS identical to CCS_E in compute_aux_usage This doesn't actually do anything because att_state->fast_clear is determined based on the return value of anv_layout_to_fast_clear_type which currently returns NONE for multisampled images. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	d0f701d2f1	anv/blorp: Pass the clear address to blorp for subpass MSAA resolves Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	f4f95496cb	anv/blorp: Allow indirect clear colors on blorp sources on gen7 Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	d85f05bd6f	anv/blorp: Add partial clear support to anv_image_mcs_op Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	c34feaea52	intel/blorp: Add indirect clear color support to mcs_partial_resolve This is a bit complicated because we have to get the indirect clear color in there somehow. In order to not do any more work in the shader than needed, we set it up as its own vertex binding which points directly at the clear color address specified by the client. Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-03-01 14:07:58 -08:00
Jason Ekstrand	ca7ab1a6a5	intel/blorp: Add a helper for filling out VERTEX_BUFFER_STATE There are enough #ifs in there that it's kind-of pointless to duplicate it for each buffer. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-03-01 14:07:58 -08:00
Andriy Khulap	7859701920	i965: Fix RELOC_WRITE typo in brw_store_data_imm64() Fixes: `6c530ad116` ("i965: Reduce passing 2x32b of reloc_domains to 2 bits") Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com> Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-01 11:20:04 -08:00
Jonathan Gray	034bbaa6c0	gallium/util: use sockets on PIPE_OS_UNIX in u_network Instead of listing all the UNIX PIPE_OS platforms just use PIPE_OS_UNIX. Makes BSD sockets available on PIPE_OS_BSD. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-01 18:44:39 +00:00
Jonathan Gray	7bea40e566	util: use clock_gettime() on PIPE_OS_BSD OpenBSD, FreeBSD, NetBSD and DragonFlyBSD all have clock_gettime() so use it when PIPE_OS_BSD is defined. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-03-01 18:44:38 +00:00
Jose Maria Casanova Crespo	4420d8866c	nir/search: Include 8 and 16-bit support in construct_value Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-01 09:16:03 -08:00
Jason Ekstrand	99ee40fb54	nir/search: Support 8 and 16-bit constants in match_value Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-03-01 09:15:01 -08:00
Andres Gomez	b5b912dfee	travis: make Meson find the proper llvm-config Travis CI has moved to LLVM 5.0, and meson is detecting automatically the available version in /usr/local/bin based on the PATH env variable order preference. As for 0.44.x, Meson cannot receive the path to the llvm-config binary as a configuration parameter. See https://github.com/mesonbuild/meson/issues/2887 and `7c8b6ee3fa` We want to use the custom (APT) installed version. Therefore, let's make Meson find our wanted version sooner than the one at /usr/local/bin Once this is corrected, we would still need a patch similar to: https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html v2: Create the link only to the specifically wanted LLVM version (Gert). Cc: Eric Engestrom <eric.engestrom@imgtec.com> Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: Gert Wollny <gw.fossdev@gmail.com> Cc: Jon Turney <jon.turney@dronecode.org.uk> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-By: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-01 12:21:30 +02:00
Andres Gomez	98f7650add	meson: fix LLVM version detection when <= 3.4 3 digits versions in LLVM only started from 3.4.1 on. Hence, even if you can perfectly build with an old LLVM (< 3.4.1) in the system while not needing LLVM at all (auto), when passing through the LLVM version detection code, meson will fail when accessing "_llvm_version[2]" due to: "Index 2 out of bounds of array of size 2." v2: Properly compare LLVM version and set patch version to 0 if < 3.4.1 (Eric). v3: Improve the commit log explanation (Eric). Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-03-01 12:16:23 +02:00
Iago Toral Quiroga	bc73016703	i965/sbe: fix number of inputs for active components In `16631ca30e` we fixed gen9 active components to account for padded inputs in the URB, which we can have with SSO programs. To do that, instead of going through the bitfield of inputs (which doesn't include padding information), we compute the number of inputs from the size of the URB entry. Unfortunately, there are some special inputs that are not stored in the URB and that we also need to account for. These special inputs are identified and handled during calculate_attr_overrides(). Instead of keeping track of the exact number of inputs, we just program active components for all possible inputs like we do in anvil. This fixes a regression in a WebGL program that uses Point Sprite functionality (specifically, VARYING_SLOT_PNTC). v2: - Add 'Fixes' tag (Mark Janes) - make no_vue_inputs int instead of uint32_t, and add const qualifier to num_inputs variable (Ian) v3: - Do not try to count inputs correctly, just program all input slots like we do in anvil (Ken) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105224 Fixes: `16631ca30e` (i965/sbe: fix active components for SSO programs with over 16 inputs) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-01 10:55:12 +01:00
Samuel Pitoiset	c27f5419f6	radv: only emit cache flushes when the pool size is large enough This is an optimization which reduces the number of flushes for small pool buffers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:53:40 +01:00
Samuel Pitoiset	2fe07933bd	radv: keep track of the query pool size Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:53:39 +01:00
Samuel Pitoiset	c956d0f406	radv: make sure to emit cache flushes before starting a query If the query pool has been previously reset using the compute shader path. Fixes: `a41e2e9cf5` ("radv: allow to use a compute shader for resetting the query pool") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105292 Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:14:49 +01:00
Alejandro Piñeiro	e72fb4e611	nir/serialize: handle var->name being NULL var->name could be NULL under ARB_gl_spirv for example. And in any case, the code is already handling var name being NULL when reading a variable, so it is consistent to do it writing a variable too. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-03-01 08:23:33 +01:00
Jose Maria Casanova Crespo	ba642ee3ee	anv: Enable VK_KHR_16bit_storage for PushConstant Enables storagePushConstant16 features of VK_KHR_16bit_storage for Gen8+. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	02266f9ba1	spirv/i965/anv: Relax push constant offset assertions being 32-bit aligned The introduction of 16-bit types with VK_KHR_16bit_storage implies that push constant offsets could be multiple of 2-bytes. Some assertions are updated so offsets should be just multiple of size of the base type but in some cases we can not assume it as doubles aren't aligned to 8 bytes in some cases. For 16-bit types, the push constant offset takes into account the internal offset in the 32-bit uniform bucket adding 2-bytes when we access not 32-bit aligned elements. In all 32-bit aligned cases it just becomes 0. v2: Assert offsets to be aligned to the dest type size. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	23ffb7c2d1	spirv: Calculate properly 16-bit vector sizes Range in 16-bit push constants load was being calculated wrongly using 4-bytes per element instead of 2-bytes as it should be. v2: Use glsl_get_bit_size instead of if statement (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	994d210429	anv: Enable VK_KHR_16bit_storage for SSBO and UBO Enables storageBuffer16BitAccess and uniformAndStorageBuffer16BitAccess features of VK_KHR_16bit_storage for Gen8+. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	69be3a82ca	i965/fs: Support 16-bit store_ssbo with VK_KHR_relaxed_block_layout Restrict the use of untyped_surface_write with 16-bit pairs in ssbo to the cases where we can guarantee that offset is multiple of 4. Taking into account that VK_KHR_relaxed_block_layout is available in ANV we can only guarantee that when we have a constant offset that is multiple of 4. For non constant offsets we will always use byte_scattered_write. v2: (Jason Ekstrand) - Assert offset_reg to be multiple of 4 if it is immediate. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	8dd8be0323	i965/fs: Support 16-bit do_read_vector with VK_KHR_relaxed_block_layout 16-bit load_ubo/ssbo operations that call do_untyped_read_vector don't guarantee that offsets are multiple of 4-bytes as required by untyped_read message. This happens for example in the case of f16mat3x3 when VK_KHR_relaxed_block_layout is enabled. Vector reads when we have non-constant offsets are implemented with multiple byte_scattered_read messages that do not require 32-bit aligned offsets. Now for all constant offsets we can use the untyped_read_surface message. In the case of constant offsets not aligned to 32-bits, we calculate a start offset 32-bit aligned and use the shuffle_32bit_load_result_to_16bit_data function and the first_component parameter to skip the copy of the unneeded component. v2: (Jason Ekstrand) Use untyped_read_surface messages whenever we have constant offsets. v3: (Jason Ekstrand) Simplify loop for reads with non constant offsets. Use end - start to calculate the number of 32-bit components to read with constant offsets. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	2dd94f462b	i965/fs: shuffle_32bit_load_result_to_16bit_data now skips components This helper used to load 16bit components from 32-bits read now allows skipping components with the new parameter first_component. The semantics now skip components until we reach the first_component, and then reads the number of components passed to the function. All previous uses of the helper are updated to use 0 as first_component. This will allow reading 16-bit components when the first one is not 32-bit aligned, enabling more usages of untyped_reads with 16-bit types. v2: (Jason Ekstrand) Change parameters order to first_component, num_components Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Jose Maria Casanova Crespo	67d7dd594e	isl/i965/fs: SSBO/UBO buffers need size padding if not multiple of 32-bit The surfaces that back the GPU buffers have a boundary check that treats accesses to partial dwords as out-of-bounds. For example, buffers with 1 or 3 16-bit elements have size 2 or 6, and the last two bytes would always be read as 0 or writes to them ignored. The introduction of 16-bit types implies that we need to align the size to 4-byte multiples so that partial dwords can be read/written. Adding an unconditional +2 size to buffers not being multiple of 2 solves this issue for the general cases of UBO or SSBO. But, when unsized arrays of 16-bit elements are used it is not possible to know if the size was padded or not. To solve this issue the implementation calculates the needed size of the buffer surfaces, as suggested by Jason: surface_size = isl_align(buffer_size, 4) + (isl_align(buffer_size, 4) - buffer_size) So when we calculate backwards the buffer_size in the backend we update the resinfo return value with: buffer_size = (surface_size & ~3) - (surface_size & 3) These buffer requirements are also exposed when robust buffer access is enabled, so these buffer sizes are recommended to be multiples of 4. v2: (Jason Ekstrand) Move padding logic from anv to isl_surface_state. Move calculation of original size from spirv to driver backend. v3: (Jason Ekstrand) Rename some variables and use a similar expression when calculating padding as when obtaining the original buffer size. Avoid use of unnecessary component call at brw_fs_nir. v4: (Jason Ekstrand) Complete comment with buffer size calculation explanation in brw_fs_nir. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 21:37:40 -08:00
Mathias Fröhlich	4c232dc721	vbo: Remove vbo_save_vertex_list::vertex_size. Like before use local variables from compile_vertex_list instead. Remove vertex_size from struct vbo_save_vertex_list. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	478a9bc7bb	vbo: Remove vbo_save_vertex_list::buffer_offset. The buffer_offset is used in aligned_vertex_buffer_offset. But now that most of these decisions are done in compile_vertex_list we can work on local variables instead of struct members in the display list code. Clean that up and remove buffer_offset. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	bfa8d8e5bf	vbo: Remove vbo_save_vertex_list::start_vertex. Replace last use on replay with _vbo_save_get_{min,max}_index. Apart from that it is not used anymore. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	6dd3e98c21	vbo: Remove vbo_save_vertex_list::attrsz. Is not used anymore on replay, move the last use in display list compilation to the original array in the display list compiler. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	95b4be4f29	vbo: Remove vbo_save_vertex_list::attrtype. Is not used anymore on replay, move the last use in display list compilation to the original array in the display list compiler. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	77df52cc4f	vbo: Remove vbo_save_vertex_list::enabled. Is not used anymore on replay. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	19a0f27a49	vbo: Remove reference to the vertex_store from the dlist node. Since we now store a set of VAOs in the display list, use these object to get the reference to the VBO in several places. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	6e410270ee	vbo: Implement current values update in terms of the VAO. Use the information already present in the VAO to update the current values after display list replay. Set GL_OUT_OF_MEMORY on allocation failure for the current value update storage. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	08aa0d9bf4	vbo: Implement vbo_loopback_vertex_list in terms of the VAO. Use the information already present in the VAO to replay a display list node using immediate mode draw commands. Use a handful of helper methods that will be useful for the next patches also. v2: Insert asserts, constify local variables. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	f7178d677c	vbo: Use a local variable for the dlist offsets. The master value is now stored inside the VAO already present in struct vbo_save_vertex_list. Remove the unneeded copy from dlist storage. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	1cc3516a11	vbo: Remove unused vbo_save_context::wrap_count. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Mathias Fröhlich	07915020f0	vbo: Remove unused vbo_save_vertex_list::dangling_attr_ref. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-01 04:06:23 +01:00
Jason Ekstrand	6d3edbea16	anv: Always set has_context_priority We don't zalloc the physical device so we need to unconditionally set everything. Crucible helpfully initializes all allocations to 139 so it was getting true regardless of whether or not the kernel actually supports context priorities. Fixes: `6d8ab53303` "anv: implement VK_EXT_global_priority extension" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 17:31:20 -08:00
Mark Janes	0fc009b8c7	Revert "i965: Only emit 3DSTATE_DRAWING_RECTANGLE once on gen8+" This reverts commit `a2c1e48f15`. On BDWGT3e and KBLGT3e systems, this commit regressed the following tests: piglit.spec.ext_framebuffer_multisample.accuracy 2 stencil_resolve small depthstencil piglit.spec.ext_framebuffer_multisample.accuracy 4 stencil_resolve small depthstencil piglit.spec.ext_framebuffer_multisample.accuracy 6 stencil_resolve small depthstencil piglit.spec.ext_framebuffer_multisample.accuracy 8 stencil_resolve small depthstencil piglit.spec.ext_framebuffer_multisample.accuracy all_samples stencil_resolve small depthstencil	2018-02-28 17:26:08 -08:00
Dave Airlie	6c1b5a40fd	radeonsi/nir: increase values to 8 for gs fetch. This stops a crash when running (still fails): tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-01 10:35:09 +10:00
Bas Nieuwenhuizen	f9898b211e	radv: Use the syncobj wait ioctl to wait on fences if possible. Handles the !waitAll and signal after the start of the wait cases correctly. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen	34bd5e2e2e	radv: Implement more efficient !waitAll fence waiting. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen	6968d782d3	radv: Implement waiting on non-submitted fences. Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-01 01:07:18 +01:00
Bas Nieuwenhuizen	2a404c6f92	radv: Implement WaitForFences with !waitAll. Nothing to do except using a busy wait loop. At least for old kernels. A better implementation for newer kernels to come later. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105255 Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-03-01 01:07:18 +01:00
Dave Airlie	49879f3778	ac/nir: fix shared atomic operations. The nir->llvm conversion was using the wrong srcs. Fixes: tests/spec/arb_compute_shader/execution/shared-atomics.shader_test Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-01 10:06:06 +10:00
Dave Airlie	69495b30a3	ac/nir: don't apply slice rounding on txf_ms This matches the tgsi code. Fixes arb_texture_multisample texelFetch piglit tests. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `f4e499ec79` (radv: add initial non-conformant radv vulkan driver) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-03-01 10:04:34 +10:00
Timothy Arceri	f383fec903	radeonsi: set some context vars for nir path Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-01 10:51:56 +11:00
Timothy Arceri	7e46214f87	gallium: remove llvm from ir struct This was added in `425dc4c4b3` but never used. Also since `100796c15c` native has superseded llvm. Acked-by: Dave Airlie <airlied@redhat.com>	2018-03-01 10:51:56 +11:00
Kenneth Graunke	e51b0664e0	i965: Don't emit MOVs with undefined registers for Gen4 point clipping. Gen4 point clipping calls brw_clip_tri_alloc_regs with nr_verts == 0, which means that c->reg.vertex[] isn't initialized. It then emits MOVs to stomp components of those uninitialized registers to 0. This started causing assertions after Matt's recent series, when those uninitialized registers started getting BRW_REGISTER_TYPE_NF, which definitely doesn't exist on Gen4-5. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-02-28 15:03:51 -08:00
Eric Anholt	e4e79a02da	broadcom/vc5: Fix regression in the page-cache slice size alignment. We need to align the size of the slice, not the offset of the next slice. Fixes KHR-GLES3.texture_repeat_mode.rgba32ui_11x131_2_clamp_to_edge. Fixes: `b4b4ada761` ("broadcom/vc5: Fix layout of 3D textures.")	2018-02-28 13:59:50 -08:00
Jason Ekstrand	a2c1e48f15	i965: Only emit 3DSTATE_DRAWING_RECTANGLE once on gen8+ Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 13:31:42 -08:00
Jason Ekstrand	67da59e320	i965: Be more clever about setting up our viewport clip Before, we were trusting in the hardware to take the intersection of the viewport clip with the drawing rectangle. Unfortunately, 3DSTATE_DRAWING_RECTANGLE is fairly expensive because it implicitly does a full pipeline stall. If we're a bit more careful with our viewport clipping, we can just re-emit it once at context creation time. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 13:31:42 -08:00
Matt Turner	debaa822ef	intel/compiler: Re-add .vs_inputs_dual_locations = true Looks like a rebase mistake. Fixes: `89fe5190a2` ("intel/compiler: Lower flrp32 on Gen11+")	2018-02-28 13:25:21 -08:00
Dave Airlie	7cb9353de3	r600/shader: when using images always load thread id gpr at start (v2) The delayed loading code would fail if we had control flow. This fixes: tests/spec/arb_shader_image_load_store/execution/image_checkerboard.shader_test v2: don't use temp_reg before setting temp_reg up. Tested-by: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-28 20:16:19 +00:00
Dave Airlie	8369fdee8b	r600: fix whitespace in recent 1d texture commit. trivial fix.	2018-02-28 20:16:19 +00:00
Matt Turner	6f00bf519d	intel/compiler: Add ICL to test_eu_validate.cpp With the Align16 tests now disabled, we can run the rest of the tests in ICL mode (and see them pass!) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	ff4b41dd1d	intel/compiler: Disable Align16 tests on Gen11+ Align16 is no more. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	c31d77ac22	intel/compiler: Add instruction compaction support on Gen11 Gen11 only differs from SKL+ in that it uses a new datatype index table. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	d5bf093cf9	intel/compiler: Mark line, pln, and lrp as removed on Gen11+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	89fe5190a2	intel/compiler: Lower flrp32 on Gen11+ The LRP instruction is no more. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	2134ea3800	intel/compiler/fs: Implement ddy without using align16 for Gen11+ Align16 is no more. We previously generated an align16 ADD instruction to calculate DDY: add(16) g25<1>F -g23<4>.xyxyF g23<4>.zwzwF { align16 1H }; Without align16, we now implement it as: add(4) g25<1>F -g23<0,2,1>F g23.2<0,2,1>F { align1 1N }; add(4) g25.4<1>F -g23.4<0,2,1>F g23.6<0,2,1>F { align1 1N }; add(4) g26<1>F -g24<0,2,1>F g24.2<0,2,1>F { align1 1N }; add(4) g26.4<1>F -g24.4<0,2,1>F g24.6<0,2,1>F { align1 1N }; where only the first two instructions are needed in SIMD8 mode. Note: an earlier version of the patch implemented this in two instructions in SIMD16: add(8) g25<2>F -g23<4,2,0>F g23.2<4,2,0>F { align1 1N }; add(8) g25.1<2>F -g23.1<4,2,0>F g23.3<4,2,0>F { align1 1N }; but I realized that the channel enable bits will not be correct. If we knew we were under uniform control flow, we could emit only those two instructions however. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	62cfd4c656	intel/compiler/fs: Simplify ddx/ddy code generation The brw_reg() constructor just obfuscates things here, in my opinion. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	bed0267ff6	intel/compiler/fs: Pass fs_inst to generate_ddx/ddy instead of opcode In a future patch, generate_ddy will want to inspect inst->exec_size. Change generate_ddx as well for consistency. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	3a584a15c0	intel/compiler/fs: Don't generate integer DWord multiply on Gen11 Like CHV et al., Gen11 does not support 32x32 -> 32/64-bit integer multiplies. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	432674ce93	intel/compiler/fs: Implement FS_OPCODE_LINTERP with MADs on Gen11+ The PLN instruction is no more. Its functionality is now implemented using two MAD instructions with the new native-float type. Instead of pln(16) r20.0<1>:F r10.4<0;1,0>:F r4.0<8;8,1>:F we now have mad(8) acc0<1>:NF r10.7<0;1,0>:F r4.0<8;8,1>:F r10.4<0;1,0>:F mad(8) r20.0<1>:F acc0<8;8,1>:NF r5.0<8;8,1>:F r10.5<0;1,0>:F mad(8) acc0<1>:NF r10.7<0;1,0>:F r6.0<8;8,1>:F r10.4<0;1,0>:F mad(8) r21.0<1>:F acc0<8;8,1>:NF r7.0<8;8,1>:F r10.5<0;1,0>:F ... and in the case of SIMD8 only the first pair of MAD instructions is used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	b5d8781e19	intel/compiler/fs: Return multiple_instructions_emitted from generate_linterp If multiple instructions are emitted, special handling of things like conditional mod and NoDDClr/NoDDChk need to be performed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	b1afdf9fc1	intel/compiler/fs: Fix application of cmod and saturate to LINE/MAC pair This isn't technically broken, but the next patch will make this function report whether it generated multiple instructions, and that information will be used to disable the application of conditional mod by the generic code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	2cff324210	intel/compiler: Add Gen11+ native float type This new type exposes the additional precision offered by the accumulator register and will be used in the next patch to implement the functionality of the PLN instruction using a pair of MAD instructions. One weird thing to note: align1 ternary instructions may only have an accumulator in the dst or src1 normally, but when src0's type is :NF the accumulator is read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	58611ff913	intel/compiler: Add Gen11 register types The hardware register types' encodings have changed on Gen11. Good thing we have that superfluous looking brw_reg_type abstraction lying around! Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:15:47 -08:00
Matt Turner	bb428454a9	intel: Disable 64-bit extensions on platforms without 64-bit types Gen11 does not support DF, Q, UQ types in hardware. As a result, we have to disable some GL extensions until they can be reimplemented. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-02-28 11:15:47 -08:00
Anuj Phogat	5e42103f3b	intel: Add icl pci id for INTEL_DEVID_OVERRIDE Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2018-02-28 11:15:47 -08:00
Matt Turner	35bfe20995	i965: Warn about preliminary support for Gen11 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-28 11:14:03 -08:00
Anuj Phogat	5ac804bd9a	intel: Add a preliminary device for Ice Lake Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Anuj Phogat <anuj.phogat@intel.com>	2018-02-28 11:14:03 -08:00
Tapani Pälli	0c983b9094	anv: remove anv_gem_set_context_priority helper anv_gem_set_context_param is to be used directly instead! Fixes: `6d8ab53303` "anv: implement VK_EXT_global_priority extension" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 19:50:54 +02:00
George Kyriazis	a01d5e3712	swr/rast: revert clip distance precision Fixes piglit tests that broke with `8a64593bde` Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-28 11:42:50 -06:00
George Kyriazis	7e813f6214	swr/rast: Faster frustum prim culling Fix clipper validMask setting. We don't need to run frustum rejected primitives through the clipper. Perform frustum culling with only frustum clip codes. Guardband clip codes cannot be used because they overlap frustum codes. Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-28 11:42:46 -06:00
George Kyriazis	1c73f42e6e	swr/rast: Consolidate TRANSLATE_ADDRESS Translate is now part of an overloaded LOAD call which required a change to the code gen to skip the load functions in order to handle them manually to make them virtual. Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-28 11:42:41 -06:00
George Kyriazis	e2a4fd0761	swr/rast: Code generation cleanup Generate more compact code from gen_llvm.hpp. Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-28 11:42:37 -06:00
George Kyriazis	190ead3d79	swr/rast: Remove draw type from event definitions - Have the draw type sent to DrawInfoEvent in handlers created in archrast.cpp. The draw type no longer needs to be sent during the AR_API_EVENT() call in api.cpp. - Remove draw type from event definitions in events_private.proto, no longer needed Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-28 11:42:32 -06:00
George Kyriazis	90e3e23f63	swr/rast: whitespace change Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-28 11:42:28 -06:00
George Kyriazis	539de78633	swr/rast: Fix index buffer overfetch issue for non-indexed draws Populate pLastIndex, even for the non-indexed case. A zero pLastIndex can cause the index offsets inside the fetcher to have nonsensical values that can be either very large positive or very large negative numbers. Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-28 11:42:19 -06:00
Roland Scheidegger	26103487b5	softpipe: don't iterate through PIPE_MAX_SHADER_SAMPLER_VIEWS We were setting view to NULL if the iteration was larger than i. But in fact if the view is NULL the code did nothing anyway... Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-28 18:22:28 +01:00
Roland Scheidegger	b923f21eaa	cso: don't cycle through PIPE_MAX_SHADER_SAMPLER_VIEWS on context destroy There's no point, we know the highest non-null one. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-28 18:22:28 +01:00
Roland Scheidegger	89ae5def8c	draw: don't needlessly iterate through all sampler view slots We already stored the highest (potentially) used number. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-28 18:22:28 +01:00
Tapani Pälli	6d8ab53303	anv: implement VK_EXT_global_priority extension v2: add ANV_CONTEXT_REALTIME_PRIORITY (Chris) use unreachable with unknown priority (Samuel) v3: add stubs in gem_stubs.c (Emil) use priority defines from gen_defines.h v4: cleanup, add anv_gem_set_context_param (Jason) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> (v2) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v3) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 14:36:57 +02:00
Tapani Pälli	5960023cf4	i965: use context priority definitions from gen_defines.h Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-28 14:36:57 +02:00
Tapani Pälli	4449a1f80d	intel: add new common header gen_defines.h Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-28 14:36:57 +02:00
Christian König	33633690aa	winsys/amdgpu: request high addresses We now have hopefully fixed all bugs regarding high addresses on Vega10 and Raven. Start to use the high range to make room for SVM in the low range. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 13:30:32 +01:00
Samuel Pitoiset	639c4f2b54	ac/shader: move scanning some info about input PS declarations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-28 10:14:26 +01:00
Samuel Iglesias Gonsálvez	e207b2e2c8	glsl/linker: fix bug when checking precision qualifier According to GLSL ES 3.2 spec, see table in 9.2.1 "Linked Shaders" section, the precision qualifier should match for uniform variables. This also applies to previous GLSL ES 3.x specs. This 'if' checks the condition for uniform variables, while for UBOs it is checked in link_interface_blocks.cpp. Fixes: `b50b82b8a5` ("glsl/es31: precision qualifier doesn't need to match in shader interface block members") Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-28 07:04:13 +01:00
Samuel Iglesias Gonsálvez	c757c9dc03	anv: set maxResourceSize to the respective value for each generation v2: - Add the proper values to gen9+ (Jason) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-28 06:54:48 +01:00
Dave Airlie	a5853a3333	r600: partly revert disabling tiling for 1d texture. Previously we had a check for 1d or narrow 2D textures, however narrow 2d textures caused gpu hangs, but it was correct for 1d textures. This fixes a bunch of 1D image piglits for me. Fixes: `7b8e1c089d` (r600/texture: drop lowering 1d/2d images to linear.) Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-28 04:59:37 +00:00
Timothy Arceri	0c1f37cc2d	nir: fix integer divide by zero crash during constant folding From the GLSL 4.60 spec Section 5.9 (Expressions): "Dividing by zero does not cause an exception but does result in an unspecified value." Fixes: `89285e4d47` "nir: add new constant folding infrastructure" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105271	2018-02-28 15:55:39 +11:00
Ilia Mirkin	086c88551d	st/mesa: ensure that images don't try to reference non-existent levels Ideally the st_finalize_texture call would take care of that, but it doesn't seem to with KHR-GL45.shader_image_size.advanced-nonMS-*. This assertion makes sure that no such values are passed to the driver. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-27 22:38:33 -05:00
Dave Airlie	c7b25005a1	ac/radv: move load base vertex abi setup to vertex shader. This was segfaulting: dEQP-VK.memory.pipeline_barrier.host_write_index_buffer.1024 Fixes: `8de6f79707` (ac/radeonsi: add load_base_vertex() to the abi) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-28 09:58:12 +10:00
Dave Airlie	3401b028df	ac/shader: fix vertex input with components. This fixes: dEQP-VK.glsl.440.linkage.varying.component.* Fixes: `1c57a6da5e` (ac/shader: scan vertex inputs usage mask) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-28 09:04:46 +10:00
Dave Airlie	6bafd4f4dd	radv: remove device pointer from buffer. This is never used. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-28 09:03:26 +10:00
Timothy Arceri	a050ea60ee	nir: add lower_ldexp to nir compiler options Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	08fa84bb9a	ac: implement nir_op_ldexp Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	9790921ff5	ac: fix nir_op_fdd{x,y} handling radeonsi, i965 and anv all treat fdd{x,y} opcodes the same as fdd{x,y}_coarse by default. The SPIR-V spec lets the implementation decide how it should be handled and radv was previously going for the higher quality option. Here we change the shared amd code to match how nir_op_fdd{x,y} is expected to be handled by the other NIR drivers. Fixes piglit test: ./bin/arb_shader_texture_lod-texgrad -auto Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	8de6f79707	ac/radeonsi: add load_base_vertex() to the abi Fixes the following piglit tests: ./bin/arb_shader_draw_parameters-basevertex basevertex -auto -fbo ./bin/arb_shader_draw_parameters-basevertex basevertex-baseinstance -auto -fbo Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	7f91473414	radeonsi: create get_base_vertex() helper Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	ae47af50d6	radeonsi/nir: disable vertex_id_zero_based lowering The lowering is incompatible with how the radeonsi backend works. Fixes piglit test: ./bin/arb_shader_draw_parameters-basevertex vertexid-zerobased -auto Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	5504bebfc4	ac: add support for handling nir_intrinsic_load_vertex_id This will be used by radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-28 09:23:49 +11:00
Timothy Arceri	3a0b4187dd	ac: fix f2b and i2b for doubles Without this llvm was asserting in debug builds. V2: use LLVMConstNull() Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-28 09:23:49 +11:00
Francisco Jerez	cb309d27c5	intel/ir: Fix invalid type aliasing with undefined behavior in test_eu_compact. test_fuzz_compact_instruction() was attempting to modify the uint64_t data array of a brw_inst through a pointer to uint32_t, which has undefined behavior. This was causing the test_eu_compact unit test to fail mysteriously for me on GCC 7 with some additional harmless-looking changes I had applied to my tree, which happened to affect the order instructions are emitted by GCC causing the bit twiddling to be done after the clear_pad_bits() call which is supposed to overwrite the same data through a pointer of different type, leading to data corruption. A similar failure has been reported by Vinson Lee on the master branch built with GCC 8. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105052 Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-02-27 11:42:39 -08:00
Francisco Jerez	69b4a9d21d	util/bitset: Make C++ wrapper trivially constructible. In order to fix a build failure on compilers not implementing unrestricted unions, which is a C++11 feature. v2: Provide signed integer comparison and assignment operators instead of BITSET_WORD ones to avoid spurious ambiguity warnings on comparisons with a signed integer literal. Fixes: `ba79a90fb5` "glsl: Switch ast_type_qualifier to a 128-bit bitset." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105238 Tested-by: Roland Scheidegger <sroland@vmware.com> Tested-By: George Kyriazis <george.kyriazis@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-27 11:38:18 -08:00
Jordan Justen	9f223d860b	intel/tools: Use gen_device_name_to_pci_device_id in aubinator Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-02-27 11:15:10 -08:00
Jordan Justen	8ff89250ff	intel/common: Add gen_device_name_to_pci_device_id Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-02-27 11:15:10 -08:00
Jordan Justen	c2134f94c8	intel/vulkan: Support INTEL_DEVID_OVERRIDE environment variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-02-27 11:15:10 -08:00
Jordan Justen	843f6d187a	i965: Use gen_get_pci_device_id_override Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-02-27 11:15:10 -08:00
Jordan Justen	e560bb9dc2	intel/common: Add gen_get_pci_device_id_override Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-02-27 11:15:10 -08:00
Jordan Justen	6b274d5cc6	intel/vulkan: Support INTEL_NO_HW environment variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-02-27 11:15:10 -08:00
Harish Krupo	b9af043716	android: fix source files path for libmesa_anv_gen11 Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-27 14:16:08 +02:00
Eric Engestrom	248c593132	meson: avoid changing types for the dri3 option Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-27 11:21:20 +00:00
Eric Engestrom	76e8d61999	meson: simplify the gbm option code, and avoid changing types v2: drop gallium comment (Dylan) Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-27 11:21:20 +00:00
Samuel Pitoiset	a549da877b	ac/nir: clean up a hack about rounding 2nd coord component It's basically just the opposite, and it only makes sense to round the layer for 2D texture arrays. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-27 10:09:27 +01:00
Ilia Mirkin	e683a797c6	nvc0: collapse output slots to have adjacent registers The hardware skips over unallocated slots, so we have to make sure those registers are packed together. Fixes KHR-GL45.enhanced_layouts.fragment_data_location_api Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-02-27 00:10:39 -05:00
Dave Airlie	250468f6b7	radv: expose async compute on SI It looks like we had all the pieces in place for this, just never tested it and turned it on. I don't see any CTS regressions and the computeshader demo runs. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-27 00:54:59 +00:00
Dave Airlie	1fc19a0f27	radv: merge tess rings into a single bo Inspired by a passing commit to radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-27 00:54:59 +00:00
Emil Velikov	784d81e97e	docs: update calendar, add news and link release notes to 17.3.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-27 00:32:14 +00:00
Emil Velikov	d9391014de	docs: add sha256 checksums for 17.3.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `b00880973e`)	2018-02-27 00:29:44 +00:00
Emil Velikov	676c58fbdb	docs: add release notes for 17.3.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `b3e5a3f35b`)	2018-02-27 00:29:43 +00:00
Dylan Baker	b9636fe38a	meson: fix building without GL libgl will be undefined _glx, so move that check inside the `if with_glx != 'disabled'` block. v2: - Simplify commit message (Eric, Emil) Fixes: `5c460337fd` ("meson: Fix GL and EGL pkg-config files with glvnd") Reported-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> CC: Daniel Stone <daniels@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Untested-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-26 09:32:14 -08:00
Lionel Landwerlin	fca9f5b585	intel: aubinator_error_decode: fix segfault on missing register Some register might be missing in our genxmls. Don't try to decode them. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-26 16:54:48 +00:00
Eric Engestrom	11d45304fd	*-symbol-check: use correct `nm` path when cross-compiling Inspired-by: a similar patch for libdrm by Heiko Becker Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-26 13:50:59 +00:00
Karol Herbst	ef308d4007	nvir/gm107: consider FILE_FLAGS dependencies in SchedDataCalculatorGM107 Currently, while inserting barriers, writes and reads to FILE_FLAGS aren't considered. This can lead to WaR hazards in some situations. Together with the previous commit, this fixes shaders with instructions like this: mad u32 $r2 $r4 $r11 $r2 mad u32 { $r5 $c0 } $r4 $r10 $r6 mad (SUBOP:1) u32 $r3 $r4 $r10 $r2 $c0 Affects OpenCL CTS tests on Maxwell+: basic/test_basic intmath_long basic/test_basic intmath_long2 basic/test_basic intmath_long4 v2: only put barriers on instructions which actually read flags Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-02-26 14:41:58 +01:00
Karol Herbst	2f07f823c9	nvir/gm107: iterate over all defs in SchedDataCalculatorGM107::findFirstUse In the sched data calculator we have to track first use of defs by iterating over all defs of an instruction, not just the first one. v2: fix minGRP and maxGRP values Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-02-26 14:41:58 +01:00
Samuel Pitoiset	e05507a427	ac/nir: use ordered float comparisons except for not equal Original patch from Timothy Arceri, I have just fixed the not equal case locally. This fixes one important rendering issue in Wolfenstein 2 (the cutscene transition issue). RadeonSI uses the same ordered comparisons, so I guess that's what we should do as well. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104302 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2018-02-26 13:59:04 +01:00
Mauro Rossi	6451b0703f	android: vulkan/util: add dependency on libnativewindow for O and later Similar to `90dd6e5` ("Android: egl: add dependency on libnativewindow") Fixes the following building error: In file included from out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_vulkan_util_intermediates/util/vk_enum_to_str.c:26: external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal error: 'system/window.h' file not found ^~~~~~~~~~~~~~~~~ 1 error generated. Cc: "18.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2018-02-26 14:50:24 +02:00
Mauro Rossi	d448954228	android: anv: add dependency on libnativewindow for O and later Similar to `90dd6e5` ("Android: egl: add dependency on libnativewindow") Fixes the following building errors: In file included from external/mesa/src/intel/vulkan/gen7_cmd_buffer.c:30: In file included from external/mesa/src/intel/vulkan/anv_private.h:72: external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal error: 'system/window.h' file not found ^~~~~~~~~~~~~~~~~ 1 error generated. ... In file included from external/mesa/src/intel/vulkan/anv_gem.c:32: In file included from external/mesa/src/intel/vulkan/anv_private.h:72: external/mesa/include/vulkan/vk_android_native_buffer.h:22:10: fatal error: 'system/window.h' file not found ^~~~~~~~~~~~~~~~~ 1 error generated. Cc: "18.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2018-02-26 14:49:06 +02:00
Mauro Rossi	9a508b719b	android: anv/extensions: fix generated sources build Building rules are aligned to automake ones The correct script to build anv_extensions.{c,h} is anv_extensions_gen.py Generation rules for anv_extensions.c requires --out-c option Generation rules for anv_extensions.h were missing Necessary include paths are added to avoid following build errors: cp: cannot stat '.../gen/STATIC_LIBRARIES/libmesa_vulkan_common_intermediates/vulkan/anv_extensions.c': No such file or directory In file included from external/mesa/src/intel/vulkan/anv_gem.c:32: external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found ^~~~~~~~~~~~~~~~~~ 1 error generated. In file included from external/mesa/src/intel/vulkan/anv_batch_chain.c:30: external/mesa/src/intel/vulkan/anv_private.h:75:10: fatal error: 'anv_extensions.h' file not found ^~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: `dd088d4bec` ("anv/extensions: Generate a header file with extension tables") Cc: "18.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-26 14:37:33 +02:00
Marek Olšák	8799eaed99	radeonsi: remove 2 unused user SGPRs from merged TES-GS with 32-bit pointers The effect of the last 13 commits on user SGPR counts: Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:19 +01:00
Marek Olšák	3fa7a59d69	radeonsi: make SI_SGPR_VERTEX_BUFFERS the last user SGPR input so that it can be removed and replaced with inline VBO descriptors, and the pointer can be packed in unused bits of VBO descriptors. This also removes the pointer from merged TES-GS where it's useless. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:08 +01:00
Marek Olšák	c78640ce31	radeonsi: set correct num_input_sgprs for VS prolog in merged shaders We need to take num_input_sgprs from VS, not the second shader. No apps suffered from this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:05 +01:00
Marek Olšák	f852b24ce0	radeonsi: allow fewer input SGPRs in 2nd shader of merged shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:03 +01:00
Marek Olšák	8d6e6b1d7c	radeonsi: don't use struct si_descriptors for vertex buffer descriptors VBO descriptor code will change a lot one day. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:00 +01:00
Daniel Stone	61d6ff3ba3	build: Move wayland-scanner check into platform Also only check for wayland-scanner if building for the Wayland platform. Signed-off-by: Daniel Stone <daniels@collabora.com> Fixes: `bfa22266cd` ("vulkan/wsi/wayland: Add support for zwp_dmabuf") Cc: Emil Velikov <emil.velikov@collabora.co.uk> Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211	2018-02-26 10:43:19 +00:00
Daniel Stone	d33cd875e8	build: Move wayland-protocols check into platform In line with wayland-client and wayland-server, move the check for wayland-protocols into the wayland platform branch. Signed-off-by: Daniel Stone <daniels@collabora.com> Fixes: `bfa22266cd` ("vulkan/wsi/wayland: Add support for zwp_dmabuf") Cc: Emil Velikov <emil.velikov@collabora.co.uk> Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211	2018-02-26 10:43:16 +00:00
Daniel Stone	d8f19d9aa0	vulkan/wsi/wayland: Move Wayland protocol from BUILT_SOURCES autotools wants to have the BUILT_SOURCES ready as soon as it enters the directory, even if they are not used. This meant the build failed if wayland-protocols was not available on the system, even if it was not enabled. As BUILT_SOURCES cannot be used in a conditional (cf. `166852ee95`), do the same thing as EGL and manually encode the dependencies in the Makefile. Signed-off-by: Daniel Stone <daniels@collabora.com> Fixes: `bfa22266cd` ("vulkan/wsi/wayland: Add support for zwp_dmabuf") Cc: Emil Velikov <emil.velikov@collabora.co.uk> Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105211	2018-02-26 10:43:12 +00:00
Dave Airlie	0cc5be7741	r600: fix tgsi clock last setting On cayman this was hitting an assert later, which probably wasn't seen on non-cayman due to having the t slot. Fixes: `9041730d1` (r600: add support for ARB_shader_clock.)	2018-02-26 11:05:45 +10:00
Dave Airlie	4d72a1efea	r600: add time lo/hi debugging output. This just adds these to the debug prints.	2018-02-26 11:05:26 +10:00
Timothy Arceri	22430224fe	radeonsi/nir: enable lowering of fpow Lowering fpow in NIR rather than LLVM can be beneficial. Polaris results: Totals from affected shaders: SGPRS: 124928 -> 124896 (-0.03 %) VGPRS: 68616 -> 68332 (-0.41 %) Spilled SGPRs: 394 -> 413 (4.82 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 3668912 -> 3658368 (-0.29 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 18575 -> 18593 (0.10 %) Wait states: 0 -> 0 (0.00 %) Fixes: `d6b7539206` "ac/nir: remove emission of nir_op_fpow" Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-26 11:43:47 +11:00
Timothy Arceri	9873bd9dcd	ac: make use of ac_get_llvm_num_components() helper Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-26 11:43:47 +11:00
Timothy Arceri	1a757c9c97	gallium/tgsi: remove is_msaa_sampler array from tgsi_shader_info Seems to have not been used since `16be87c904` Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-26 11:43:47 +11:00
Timothy Arceri	9f7c940840	radeonsi/nir: fix loading of doubles for tess varyings Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-26 11:43:47 +11:00
Timothy Arceri	81f9d03807	radeonsi/nir: fix lds store in tcs outputs handling We were ignoring the channel offset. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-26 11:43:47 +11:00
Gert Wollny	c7cadcbda4	r600: Take ALU_EXTENDED into account when evaluating jump offsets ALU_EXTENDED needs 4 DWORDS instead of the usual 2, hence if the last ALU clause within an IF-JUMP or ELSE branch is ALU_EXTENDED the target jump offset needs to be adjusted accordingly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104654 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-26 10:29:48 +10:00
Francisco Jerez	51562ea7a0	mesa: Expose EXT_shader_framebuffer_fetch(_non_coherent) on desktop and embedded GL. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	c6c64d4d6a	glsl: Silence warnings when reading from a framebuffer fetch output. Framebuffer fetch outputs are implicitly initialized upon entry to the fragment shader. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	537bb1da98	glsl: Specify framebuffer fetch coherency mode in lower_blend_equation_advanced(). This requires passing an extra argument to the lowering pass because the KHR_blend_equation_advanced specification doesn't seem to define any mechanism for the implementation to determine at compile-time whether coherent blending can ever be used (not even an "#extension KHR_blend_equation_advanced_coherent" directive seems to be required in the shader source AFAICT). In the long run we'll probably want to do state-dependent recompiles based on the value of ctx->Color.BlendCoherent, but right now there would be no benefit from that because the only driver that supports coherent framebuffer fetch is i965 on SKL+ hardware, which are unable to support the non-coherent path for the moment because of texture layout issues, so framebuffer fetch coherency is always enabled for them. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	ef9e3f63ca	glsl: Add support for the framebuffer fetch layout(noncoherent) qualifier. This allows the application to request framebuffer fetch coherency with per-fragment output granularity. Coherent framebuffer fetch outputs (which is the default if no qualifier is present for compatibility with older versions of the EXT_shader_framebuffer_fetch extension) will have ir_variable_data::memory_coherent set to true. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	0aeec504b4	glsl: Allow layout token for EXT_shader_framebuffer_fetch_non_coherent. EXT_shader_framebuffer_fetch_non_coherent requires layout qualifiers even on GL(ES) 2. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	1bc01db95f	glsl: Initialize ir_variable_data::fb_fetch_output earlier for GL(ES) 2. At the same point where it is initialized on GL(ES) 3.0+ so we can implement some common layout qualifier handling in a future commit. Until now the fb_fetch_output flag would be inherited from the original implicit gl_LastFragData declaration at a later point in the AST to GLSL IR translation. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	6ebefb0fd5	glsl: Replace MESA_shader_framebuffer_fetch extension flags with EXT ones. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	ba79a90fb5	glsl: Switch ast_type_qualifier to a 128-bit bitset. This should end the drought of bits in the ast_type_qualifier object. The bitset_t type works pretty much as a drop-in replacement for the current uint64_t bitset. The only catch is that the bitset_t type as defined in the previous commit doesn't have a trivial constructor (because it has a user-defined constructor), so it cannot be used as union member without providing a user-defined constructor for the union (which causes it in turn to be non-trivially constructible). This annoyance could be easily addressed in C++11 by declaring the default constructor of bitset_t to be the implicitly defined one -- IMO one more reason to drop support for GCC 4.2-4.3. The other minor change was required because glsl_parser_extras.cpp was hard-coding the type of bitset temporaries as uint64_t, which (unlike would have been the case if the uint64_t had been replaced with e.g. an __int128) would otherwise have caused a build failure, because the boolean conversion operator of bitset_t is marked explicit (if C++11 is available), so the bitset won't be silently truncated down to 1 bit in order to use it to initialize the uint64_t temporaries (yikes). Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	bdbc2ffa42	util/bitset: Add C++ wrapper for static-size bitsets. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	8d1f1ce412	util: Add EXPLICIT_CONVERSION macro. This can be used to specify that a C++ conversion operator is not meant to be used for implicit conversions, which can lead to unintended loss of information in some cases. Implemented as a macro in order to keep old GCC versions happy. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	378e918e28	mesa: Implement glFramebufferFetchBarrierEXT entry point. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	e4124f9bc1	glapi: Update XML for last revision of EXT_shader_framebuffer_fetch. Desktop GL is now supported, and there is an additional entry-point for EXT_shader_framebuffer_fetch_non_coherent. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	6a8ec78c2a	mesa: Rename MESA_shader_framebuffer_fetch gl_extensions bits to EXT. The changes I had originally planned for the MESA_shader_framebuffer_fetch extension have been merged into the EXT spec, there's no point in keeping MESA_shader_framebuffer_fetch extension enables. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	d0bef79f12	mesa: Rename dd_function_table::BlendBarrier to match latest EXT spec. This GL entry point was renamed to glFramebufferFetchBarrier() in the EXT extension on request from Khronos members. Update the Mesa codebase to match the latest spec. Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Francisco Jerez	27c829da28	i965: Fix KHR_blend_equation_advanced with some render targets. This reverts two bogus and seemingly useless changes from the commits referenced below, which broke KHR_blend_equation_advanced (and EXT_shader_framebuffer_fetch_non_coherent which wasn't exposed yet) for any kind of render target surface that would cause the get_isl_surf() call in brw_emit_surface_state() to do anything useful (notice how the result of get_isl_surf() is completely ignored by the caller right now), as was the case while using those extensions with 1D array or 3D framebuffers in particular. Fixes: `f5859b45b1` "i965/miptree: Switch remaining surfaces to isl" Fixes: `bf24c3539e` "i965/miptree: Clean-up unused" Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2018-02-24 15:28:36 -08:00
Marek Olšák	fb410ae392	radeonsi: remove si_descriptors parameter from emit_shader_pointer functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	63ea0a00a3	radeonsi: preload the tess offchip ring in TES so that it's not done multiple times in branches Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	2d03c4cac8	radeonsi: move tess ring address into TCS_OUT_LAYOUT, removes 2 TCS user SGPRs TCS_OUT_LAYOUT has 13 unused bits. That's enough for a 32-bit address aligned to 512KB. Hey, it's a 13-bit pointer! Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	190e064e63	radeonsi: move 2nd-shader descriptor pointers into s[0:1] If 32-bit pointers are supported, both pointers can be moved into s[0:1] and then ESGS has exactly the same user data SGPR declarations as VS. If 32-bit pointers are not supported, only one pointer can be moved into s[0:1]. In that case, the 2nd pointer is moved before TCS constants, so that the location is the same in HS and GS. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	1d1df76d2b	radeonsi: change si_descriptors::shader_userdata_offset type to short We will want to use SH registers outside of user data SGPRs, like the GFX9 special SGPRs. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Marek Olšák	fca7dee9c6	radeonsi: put both tessellation rings into 1 buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Marek Olšák	d2963d8b5f	radeonsi: move tessellation ring info into si_screen Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Marek Olšák	41895c26d3	radeonsi: move TCS_OUT_LAYOUT.PatchVerticesIn to lower bits For a later patch. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Karol Herbst	f0b39779a0	nvir: don't optimize mad with subops to shladd Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-24 18:48:13 +01:00
James Legg	afd8fd0656	radv: Really use correct HTILE expanded words. When transitioning to an htile compressed depth format, set the full depth range, so later rasterization can pass HiZ. Previously, for depth only formats, the depth range was set to 0 to 0. This caused unwanted HiZ rejections with a VK_FORMAT_D16_UNORM depth buffer (VK_FORMAT_D32_SFLOAT was not affected somehow). These values are derived from PAL [0], since I can't find the specification describing the htile values. [0] `5cba4ecbda/src/core/hw/gfxip/gfx9/gfx9MaskRam.cpp (L1500)` CC: Dave Airlie <airlied@redhat.com> CC: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> CC: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Fixes: `5158603182` "radv: Use correct HTILE expanded words."	2018-02-24 02:16:22 +01:00
Mauro Rossi	8eed942136	radv/extensions: fix c_vk_version for patch == None Similar to `cb0d1ba156` ("anv/extensions: Fix VkVersion::c_vk_version for patch == None") fixes the following building errors: out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/radv_entrypoints.c:1161:48: error: use of undeclared identifier 'None'; did you mean 'long'? return instance && VK_MAKE_VERSION(1, 0, None) <= core_version; ^~~~ long external/mesa/include/vulkan/vulkan.h:34:43: note: expanded from macro 'VK_MAKE_VERSION' (((major) << 22) \| ((minor) << 12) \| (patch)) ^ ... fatal error: too many errors emitted, stopping now [-ferror-limit=] 20 errors generated. Fixes: `e72ad05c1d` ("radv: Return NULL for entrypoints when not supported.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-24 00:31:31 +01:00
Eric Anholt	b4b4ada761	broadcom/vc5: Fix layout of 3D textures. Cube maps are entire miptrees repeated, while 3D textures have each level have all of its layers next to each other. Fixes tex3d and tex-miplevel-selection GL2:texture() 3D.	2018-02-23 15:07:26 -08:00
Eric Anholt	97dc077303	broadcom/vc5: Ignore unused usage flags in is_format_supported. Like for vc4, the new DISPLAY_TARGET flag ended up causing no formats to match. Just drop the whole retval == usage thing and return early when we hit a known unsupported case. Fixes: `f7604d8af5` ("st/dri: only expose config formats that are display targets")	2018-02-23 15:07:18 -08:00
Eric Anholt	880573e737	gbm: Fix the alpha masks in the GBM format table. Once GBM started looking at the values of the alpha masks, ARGB/ABGR wouldn't match any more because we had both A and R in the low bits. Fixes: `2ed344645d` ("gbm/dri: Add RGBA masks to GBM format table") Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-23 15:03:36 -08:00
Mathias Fröhlich	b54bf0e3e3	mesa: Update vertex processing mode on _mesa_UseProgram. The change is a bug fix for `92d76a169`: mesa: Provide an alternative to get_vp_mode() that actually got exposed through `4562a7b0`: vbo: Make use of _DrawVAO from the dlist code. Fixes: KHR-GLES31.core.shader_image_load_store.advanced-sso-simple Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105229 Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 21:08:35 +01:00
Marek Olšák	d169438d8e	mesa: rename has_core_gs -> has_gs in get_programiv This is also true for GLES. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:23 +01:00
Marek Olšák	1881f41b6c	mesa: replace some API_OPENGL_CORE checks with _mesa_is_desktop_gl This is more accurate with respect to the compatibility profile. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:22 +01:00
Marek Olšák	1defc973db	mesa: add some of the missing compatibility support for ARB_bindless_texture The extension is exposed in the compatibility profile. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:20 +01:00
Marek Olšák	b8e2e9e1a1	mesa: expose ARB_enhanced_layouts in the compatibility profile GLSL 1.40 is required. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:19 +01:00
Marek Olšák	a0c8b49284	mesa: enable OpenGL 3.1 with ARB_compatibility Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:17 +01:00
Marek Olšák	605a7f6db5	mesa: implement ARB_compatibility Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 20:50:15 +01:00
Emil Velikov	14a2c87c41	swr: remove dead LLVM code paths LLVM requirement was bumped to 4.0.0 with earlier commit. Hence any code tailored for older versions is now unreachable. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-By: George Kyriazis <george.kyriazis@intel.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2018-02-23 19:17:31 +00:00
Eric Anholt	5980a41c0f	broadcom/vc4: Remove the retval==usage check in is_format_supported(). This got us into trouble recently, so just remove it entirely.	2018-02-23 08:42:13 -08:00
Eric Anholt	bc3d16e633	broadcom/vc4: Add support for YUV textures using unaccelerated blits. Previously we would assertion fail about having no hardware format. This is enough to get kmscube -M nv12-2img working.	2018-02-23 08:42:13 -08:00
Eric Anholt	c824a045ea	broadcom/vc4: Fix double-unrefcounting of prsc->next with shadows. When we set up the shadow resource we were copying the original resource as the template, including its prsc->next field. When we shadowed the first YUV plane's resource for linear-to-tiled conversion, we would end up unbalancing the refcount on the shadow resource's destruction.	2018-02-23 08:42:13 -08:00
Eric Anholt	6deb158ec1	broadcom/vc4: Add pipe_reference debugging for vc4_bos. Trying to track down the YUV EGLImage use-after-free, it helps to see what the mystery objects are that are being refcounted.	2018-02-23 08:42:13 -08:00
Eric Anholt	34ea1aca92	broadcom/vc4: Remove dead vc4_bo_set_reference(). It would be broken if NULL was passed to it anyway, since it wouldn't participate in screen->bo_handles management.	2018-02-23 08:42:13 -08:00
Eric Anholt	a49738290c	broadcom/vc4: Use pipe_resource_reference in sampler views. Improves u_debug_refcount output.	2018-02-23 08:42:13 -08:00
Eric Anholt	0c1dd9dee0	broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride. This is part of supporting YUV textures -- MMAL will be handing us a single GEM BO with the planes at offsets within it, and MMAL-decided stride.	2018-02-23 08:42:13 -08:00
Eric Anholt	978b884afc	broadcom/vc4: Ignore PIPE_BIND_DISPLAY_TARGET in is_format_supported(). We were failing the retval == usage check at the end. Fixes: `f7604d8af5` ("st/dri: only expose config formats that are display targets")	2018-02-23 08:42:13 -08:00
Lucas Stach	8df11f3fad	etnaviv: fix in-place resolve tile count TS tiles map to a fixed amount of bytes in the color/depth surface, so the blocksize of the format needs to be taken into account when calculating the number of tiles to fill. The simplest fix is to just use the layer stride, which is the surface size in bytes. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2018-02-23 15:34:39 +01:00
Lucas Stach	add23b59c9	etnaviv: switch magic single buffer state to "3" Some of the 16bit formats misrender with missing tiles with the current "2" state. As all the previously working formats also work with the "3" state, just always use that one. Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2018-02-23 15:34:39 +01:00
Lucas Stach	8befc11186	etnaviv: add debug switch to disable single buffer feature This feature has caused some trouble already. Add a debug switch to allow users to quickly check if a specific issue is caused by this feature. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-02-23 15:34:31 +01:00
Dylan Baker	5c460337fd	meson: Fix GL and EGL pkg-config files with glvnd Currently meson will generate a pkg-config that links to EGL_mesa (or GLX_mesa), but this isn't correct, it should always link to EGL or GL. Probably the "right" solution is to have glvnd itself provide the pkg config files for GL and EGL, but that also means that glvnd needs to provide many of the header files, which makes it a more involved job. Fixes: `a47c525f32` ("meson: build glx") Fixes: `035ec7a2bb` ("meson: Add support for EGL glvnd") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-23 13:30:28 +00:00
Frank Binns	6160bf97db	egl/dri2: fix segfault when display initialisation fails dri2_display_destroy() is called when platform specific display initialisation fails. However, this would typically lead to a segfault due to the dri2_egl_display vtbl not having been set up. Fixes: `2db9548296` ("loader_dri3/glx/egl: Optionally use a blit context for blitting operations") Signed-off-by: Frank Binns <francisbinns@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-23 11:13:22 +00:00
Juan A. Suarez Romero	e1623b303c	mesa: add missing RGB9_E5 format in _mesa_base_fbo_format RGB9_E5 should be accepted by RenderbufferStorage if the EXT_texture_shared_exponent is exposed. It is left to the implementations to return GL_FRAMEBUFFER_UNSUPPORTED_EXT when checking the framebuffer completeness if they do not support rendering in this format. Discussed in: https://github.com/KhronosGroup/OpenGL-API/issues/32 This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5 v2: Added more info to the commit message (Antia) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Antia Puentes <apuentes@igalia.com>	2018-02-23 10:12:06 +01:00
Christian Gmeiner	e72062b66d	etnaviv: npot_tex_any_wrap needs one bit only Reduces size of struct etna_specs from 100 to 94 bytes. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>	2018-02-23 09:38:16 +01:00
Mathias Fröhlich	4562a7b0e8	vbo: Make use of _DrawVAO from the dlist code. Finally use an internal VAO to execute display list draws. Avoid duplicate state validation for display list draws. Remove client arrays previously used exclusively for display lists. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:34:14 +01:00
Mathias Fröhlich	2f35140846	mesa: Use atomics for shared VAO reference counts. VAOs will be used in the next change as immutable object across multiple contexts. Only reference counting may write concurrently on the VAO. So, make the reference count thread safe for those and only those VAO objects. v3: Use bool/true/false for gl_vertex_array_object::SharedAndImmutable. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:34:11 +01:00
Mathias Fröhlich	8a3a4b6fae	vbo: Make use of _DrawVAO from immediate mode draw Finally use an internal VAO to execute immediate mode draws. Avoid duplicate state validation for immediate mode draws. Remove client arrays previously used exclusively for immediate mode draws. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:34:07 +01:00
Mathias Fröhlich	c757e416ce	vbo: Implement tool functions for vbo specific VAO setup. Correct VBO_MATERIAL_SHIFT value. The functions will be used next in this series. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:34:04 +01:00
Mathias Fröhlich	ef8028017d	mesa: Add flush_vertices to _mesa_bind_vertex_buffer. We will need the flush_vertices argument later in this series. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:34:01 +01:00
Mathias Fröhlich	354b76ad20	mesa: Make _mesa_vertex_attrib_binding public. Change vertex_attrib_binding() to _mesa_vertex_attrib_binding(), add a flush_vertices argument, and make it publicly available. The function will be needed later in the series. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:58 +01:00
Mathias Fröhlich	4331969ac4	mesa: Add flush_vertices to _mesa_{enable,disable}_vertex_array_attrib. We will need the flush_vertices argument later in this series. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:55 +01:00
Mathias Fröhlich	195bb990ed	vbo: Use _DrawVAO for array type draw commands. Switch over to use the _DrawVAO for all the array type draws. The _DrawVAO needs to be set before we enter _mesa_update_state, so move setting the draw method in front of the first call to _mesa_update_state which is in turn called from the validateDraw* calls. Using the gl_vertex_array_object::_Enabled bitmask, gl_vertex_program_state::_VPMode and gl_vertex_array_object::_AttributeMapMode we can already set varying_vp_inputs before we call _mesa_update_state the first time. Thus remove duplicate state validation. v2: Update comments. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:50 +01:00
Mathias Fröhlich	6002ab564b	vbo: Implement method to track the inputs array. Provided the _DrawVAO and the derived state that is maintained if we have the _DrawVAO set, implement a method to incrementally update the array of gl_vertex_array input pointers. v2: Add some more comments. Rename _vbo_array_init to _vbo_init_inputs. Rename vbo_context::arrays to vbo_context::draw_arrays. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:46 +01:00
Mathias Fröhlich	08c7474189	mesa: Introduce a yet unused _DrawVAO. During the patch series this VAO gets populated with either the currently bound VAO or an internal VAO that will be used for immediate mode and dlist rendering. v2: More comments about the _DrawVAO, filter and enabled mask. Rename _DrawVAOEnabled to _DrawVAOEnabledAttribs. v3: Fix and move comment. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:43 +01:00
Mathias Fröhlich	ce3d2421a0	vbo: Remove get_vp_mode() and enum vp_mode. Is now unused. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:40 +01:00
Mathias Fröhlich	60c3ca1b23	vbo: Use _VPMode instead of get_vp_mode(). At those places where we used get_vp_mode() use gl_vertex_program_state::_VPMode instead. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:36 +01:00
Mathias Fröhlich	92d76a1691	mesa: Provide an alternative to get_vp_mode() To get information equivalent to get_vp_mode(), track the vertex processing mode in a per context variable at gl_vertex_program_state::_VPMode. This aims to replace get_vp_mode() as seen in the vbo module. But unlike the get_vp_mode() implementation, which only gives correct answers after calling _mesa_update_state(), this context variable is tracked immediately when the vertex processing state is modified. The correctness of this value is asserted on state validation. With this in place we should be able to untangle the dependency with varying_vp_inputs and state invalidation. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-23 05:33:30 +01:00
Ilia Mirkin	d73f1f2ad8	nv50,nvc0: fix integer MS resolves using 2d engine We don't want filtering for integer textures, same as depth/stencil. Fixes: KHR-GL45.direct_state_access.renderbuffers_storage_multisample Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-02-22 20:47:48 -05:00
Ilia Mirkin	33ce3569c5	nvc0: fix writing query results into buffer We need to mark the range as valid, and validate the resource using a helper to ensure that the buffer status is marked properly. Fixes some CTS pipeline stats query tests, and KHR-GL45.direct_state_access.queries_functional Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-02-22 20:47:48 -05:00
Ilia Mirkin	f6e4f95668	nv50,nvc0: fix clear buffer acceleration Two things were off: - valid range was not updated, which could affect waiting for future maps - fencing was done manually instead of using the *_resource_validate helper, which resulted in a missed dirty buffer flag being set Fixes: KHR-GL45.direct_state_access.buffers_clear Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-02-22 20:47:48 -05:00
Lionel Landwerlin	bd9672695b	i965: perf: ensure reading config IDs from sysfs isn't interrupted Fixes: `458468c136` "i965: Expose OA counters via INTEL_performance_query" Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-23 01:44:07 +00:00
Bas Nieuwenhuizen	032870beda	radv: Fix autotools build. Somewhere along the way the Makefile changes got lost ... Fixes: `4db78f3a6b` "radv: Put supported extensions in a struct." Acked-by: Dave Airlie <airlied@redhat.com>	2018-02-23 01:54:12 +01:00
Bas Nieuwenhuizen	e72ad05c1d	radv: Return NULL for entrypoints when not supported. This implements strict checking for the entrypoint ProcAddr functions. - InstanceProcAddr with instance = NULL, only returns the 3 allowed entrypoints. - DeviceProcAddr does not return any instance entrypoints. - InstanceProcAddr does not return non-supported or disabled instance entrypoints. - DeviceProcAddr does not return non-supported or disabled device entrypoints. - InstanceProcAddr still returns non-supported device entrypoints. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	414f5e0e14	radv: Reword radv_entrypoints_gen.py With a big inspiration from anv as always ... Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	076f7cfc6b	radv: Track enabled extensions. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Bas Nieuwenhuizen	4db78f3a6b	radv: Put supported extensions in a struct. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-23 00:39:02 +01:00
Jose Fonseca	1f5618e81c	appveyor: Build with MSVC 2015. The MSVC version we (at VMware) primarily care about from now on is 2015. See https://ci.appveyor.com/project/jrfonseca/mesa/build/46 We can drop support for building with 2013 in a future commit. I'm not aware of significant changes in C99/C11 support from MSVC 2013 to 2015, but there's no point in continuing supporting old MSVC versions when nobody cares. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-22 21:10:20 +00:00
Samuel Pitoiset	d6b7539206	ac/nir: remove emission of nir_op_fpow fpow is now lowered at NIR level. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:44:46 +01:00
Samuel Pitoiset	7aa008d1d7	radv: enable lowering of fpow to fexp2 and flog2 There is no fpow in hardware, so it's always lowered somewhere, but it appears that lowering at NIR level is better. Figured while comparing compute shaders between RadeonSI and RADV. Polaris10: Totals from affected shaders: SGPRS: 18936 -> 18904 (-0.17 %) VGPRS: 12240 -> 12220 (-0.16 %) Spilled SGPRs: 2809 -> 2809 (0.00 %) Code Size: 718116 -> 719848 (0.24 %) bytes Max Waves: 1409 -> 1410 (0.07 %) Vega10: Totals from affected shaders: SGPRS: 18392 -> 18392 (0.00 %) VGPRS: 12008 -> 11920 (-0.73 %) Spilled SGPRs: 3001 -> 2981 (-0.67 %) Code Size: 777444 -> 778788 (0.17 %) bytes Max Waves: 1503 -> 1504 (0.07 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:40:47 +01:00
Samuel Pitoiset	63fb30c674	nir: lower fexp2(fmul(flog2(a), 2)) to fmul(a, a) Similar for the 4 case. Suggested by Bas. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:40:45 +01:00
Samuel Pitoiset	b18997876f	nir: add is_used_once for fmul(fexp2(a), fexp2(b)) to fexp2(fadd(a, b)) Otherwise the code size increases because the original fexp2() instructions can't be deleted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:40:43 +01:00
Samuel Pitoiset	a01e9996b5	ac/nir: set GLC=1 for load/store of coherent/volatile images This disables persistence across wavefronts. F1 2017 and Wolfenstein 2 appear to use some coherent images but this patch doesn't seem to change anything. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:39:55 +01:00
Samuel Pitoiset	3c40be126f	spirv: apply memory qualifiers to images Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-22 20:39:53 +01:00
Chuck Atkins	540e49e105	glx: Properly handle cases where screen creation fails This fixes a segfault exposed by `a29d63ecf7` which occurs when swr is used on an unsupported architecture. v2: re-work to place logic in xmesa_init_display Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Cc: mesa-stable@lists.freedesktop.org Cc: George Kyriazis <george.kyriazis@intel.com> Cc: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-22 10:20:32 -05:00
Iago Toral Quiroga	7668b594e6	anv/blorp: multisample resolve all attachment layers We were only resolving the first. v2: - Do not require that the number of layers on dst and src are an exact match, it is okay if the dst has more layers so long as it has at least the same that we are going to resolve. - Do not always resolve array_len layers, we should resolve only from base_array_layer to array_len. v3: - v2 was assuming that array_len represented the total number of layers in the image, but it represents the number of layers starting at the base array layer. v4: - The number of layers to resolve should be taken from the framebuffer (Nanley). Fixes new CTS tests for multisampled layered rendering: dEQP-VK.renderpass.multisample_resolve.layers_* Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-22 08:23:39 +01:00
Jason Ekstrand	2dce4ac6ac	intel/isl: Improve the documentation on get_default_aux_state Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Jason Ekstrand	24952160fd	i965: Use finish_external instead of make_shareable in setTexBuffer2 The setTexBuffer2 hook from GLX is used to implement glXBindTexImageEXT which has tighter restrictions than just "it's shared". In particular, it says that any rendering to the image while it is bound causes the contents to become undefined. The GLX_EXT_texture_from_pixmap extension provides us with an acquire and release in the form of glXBindTexImageEXT and glXReleaseTexImageEXT. The extension spec says, "Rendering to the drawable while it is bound to a texture will leave the contents of the texture in an undefined state. However, no synchronization between rendering and texturing is done by GLX. It is the application's responsibility to implement any synchronization required." From the EGL 1.4 spec for eglBindTexImage: "After eglBindTexImage is called, the specified surface is no longer available for reading or writing. Any read operation, such as glReadPixels or eglCopyBuffers, which reads values from any of the surface’s color buffers or ancillary buffers will produce indeterminate results. In addition, draw operations that are done to the surface before its color buffer is released from the texture produce indeterminate results." In other words, between the bind and release calls, we effectively own those pixels and can assume, so long as we don't crash, that no one else is reading from/writing to the surface. The GLX and EGL implementations call the setTexBuffer2 and releaseTexBuffer function pointers that the driver can hook. In theory, this means that, between BindTexImage and ReleaseTexImage, we own the pixels and it should be safe to track aux usage so we can avoid redundant resolves so long as we start off with the right assumption at the start of the bind/release pair. In practice, however, X11 has slightly different expectations. It's expected that the server may be drawing to the image at the same time as the compositor is texturing from it. In that case, the worst expected outcome should be tearing or partial rendering and not random corruption like we see when rendering races with scanout with CCS. Fortunately, the GEM rules about texture/render dependencies save us here. If X11 submits work to write to a pixmap after the compositor has submitted work to texture from it, GEM inserts a dependency between the compositor and X11. If X11 is using a high-priority context, this will cause the compositor to get a temporarily boosted priority while the batch from X11 is waiting on it. This means that we will never have an actual race between X11 and the compositor so no corruption can happen. Unfortunately, however, this means that X11 will likely be rendering to it between the compositor's BindTexImage and ReleaseTexImage calls. If we want to avoid strange issues, we need to be a bit careful about resolves because we can't really transition it away from the "default" aux usage. The only case where this would practically be a problem is with image_load_store where we have to do a full resolve in order to use the image via the data port. Even there it would only be a problem if batches were split such that X11's rendering happens between the resolve and the use of it as a storage image. However, the chances of this happening are very slim so we just emit a warning and hope for the best. This commit adds a new helper intel_miptree_finish_external which resets all aux state to whatever ISL says is the right worst-case "default" for the given modifier. It feels a little awkward to call it "finish" because it's actually an acquire from the perspective of the driver, but it matches the semantics of the other prepare/finish functions. This new helper gets called in intelSetTexBuffer2 instead of make_shareable. We also add an intelReleaseTexBuffer (we passed NULL to releaseTexBuffer before) and call intel_miptree_prepare_external in it. This probably does nothing most of the time but it means that the prepare/finish calls are properly matched. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Jason Ekstrand	00926a2730	i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2 The old code made a new miptree that referenced the same BO as the renderbuffer and just trusted in the memory aliasing to work. There are only two ways in which the new miptree is liable to differ from the one in the renderbuffer and neither of them matter: 1) It may have a different target. The only targets that we can ever see in intelSetTexBuffer2 are GL_TEXTURE_2D and GL_TEXTURE_RECTANGLE and the difference between the two doesn't matter as far as the miptree is concerned; genX(update_sampler_state) only looks at the gl_texture_object and not the miptree when determining whether or not to use normalized coordinates. 2) It may have a very slightly different format. Again, this doesn't matter because we've supported texture views for quite some time so we always look at the gl_texture_object format instead of the miptree format for hardware setup anyway. On the other hand, because we were recreating the miptree, we were using intel_miptree_create_for_bo which doesn't understand modifiers. We really want this function to work without doing a resolve so long as you have modifiers so we need to fix that. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Jason Ekstrand	41d45eb21e	i965/tex_image: Pull the tex format from the renderbuffer in intelSetTexBuffer2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Jason Ekstrand	344b57b10b	i965/miptree: Loosen the format check in miptree_match_image This function is used to determine when we need to re-allocate a miptree. Since we do nothing different in miptree allocation for sRGB vs. linear, loosening this should be safe and may lead to less copying and reallocating in some odd cases. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Jason Ekstrand	5b1b710e6f	i965/state: Ignore intel_obj->_Format for depth/stencil and ETC2 We're about to start letting the intel_obj->_Format be the "real" texture format. For depth/stencil textures, this may be a combined depth stencil format. For ETC2 on gen7 and earlier, this will be the actual ETC2 format. This makes a bit more GL sense but means we have to be careful in state upload. Reviewed-by: Chad Versace <chadversary@chromium.org>	2018-02-21 18:18:16 -08:00
Kenneth Graunke	183ce5e629	glsl: Parse 'layout' as a token with advanced blending or bindless Both KHR_blend_equation_advanced and ARB_bindless_texture provide layout qualifiers, and are exposed in compatibility contexts. We need to parse the layout qualifier as a token in order for those to work, but forgot to extend this check. ARB_shader_image_load_store would need a similar treatment, but we don't expose that in legacy OpenGL contexts. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105161 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-02-21 17:50:57 -08:00
Daniel Stone	c7e22483fe	vulkan/wsi/x11: Consistently update and return swapchain status Use a helper function for updating the swapchain status. This will be used later to handle VK_SUBOPTIMAL_KHR, where we need to make a non-error status stick to the swapchain until recreation. Instead of direct comparisons to VK_SUCCESS to check for error, test for negative numbers meaning an error status, and positive numbers indicating non-error statuses. v2 (Jason Ekstrand): - Use a pattern of "return x11_swapchain_result(chain, VK_WHATEVER)" - Handle wsi_queue_pull returning VK_TIMEOUT - Call x11_swapchain_result in x11_present_to_x11 Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-21 22:37:10 +00:00
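The status convention described in c7e22483fe (negative values are errors, positive values are non-error statuses, and a status can stick to the swapchain until recreation) can be sketched roughly as follows. This is a minimal illustration and not the code from the commit: `demo_result`, `demo_swapchain`, and `demo_swapchain_result` are hypothetical stand-ins, and the enum values only mirror the sign convention of `VkResult`, not its real numeric codes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for VkResult codes: errors are negative,
 * success is zero, non-error statuses are positive. */
typedef enum {
    DEMO_ERROR_OUT_OF_DATE = -2,
    DEMO_SUCCESS = 0,
    DEMO_TIMEOUT = 1,
    DEMO_SUBOPTIMAL = 2,
} demo_result;

struct demo_swapchain {
    demo_result status; /* sticky status, reset only on recreation */
};

/* Fold a new result into the swapchain's sticky status and return
 * the value the caller should report. */
static demo_result
demo_swapchain_result(struct demo_swapchain *chain, demo_result result)
{
    assert(chain != NULL);

    /* An error already recorded on the chain wins over everything. */
    if (chain->status < 0)
        return chain->status;

    /* A new error becomes the chain's permanent status. */
    if (result < 0) {
        chain->status = result;
        return result;
    }

    /* A timeout is transient: report it without recording it. */
    if (result == DEMO_TIMEOUT)
        return result;

    /* A suboptimal status is not an error, but it sticks. */
    if (result == DEMO_SUBOPTIMAL) {
        chain->status = result;
        return result;
    }

    /* Plain success: report whatever is stuck to the chain. */
    return chain->status;
}
```

Every exit path in the caller can then use the pattern `return demo_swapchain_result(chain, DEMO_WHATEVER)`, so the sticky status is applied in one place rather than at each call site.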
Jason Ekstrand	6937c61324	vulkan/wsi/x11: Set OUT_OF_DATE if wait_for_special_event fails This most likely means we lost our connection to the X server so OUT_OF_DATE is reasonable. This was also the one case where we pushed a UINT32_MAX into the queue without setting an error condition. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-21 22:37:10 +00:00
Daniel Stone	bfa22266cd	vulkan/wsi/wayland: Add support for zwp_dmabuf zwp_linux_dmabuf_v1 lets us use multi-planar images and buffer modifiers. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-21 22:37:10 +00:00
Jason Ekstrand	c757fd2852	anv/image: Add support for modifiers for WSI This adds support for the modifiers portion of the WSI "extension". Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-21 22:37:10 +00:00
Jason Ekstrand	adca1e4a92	anv/image: Separate modifiers from legacy scanout For a bit there, we had a bug in i965 where it ignored the tiling of the modifier and used the one from the BO instead. At one point, we thought this was best fixed by setting a tiling from Vulkan. However, we've decided that i965 was just doing the wrong thing and have fixed it as of `5048572352`. The old assumptions also affected the solution we used for legacy scanout in Vulkan. Instead of treating it specially, we just treated it like a modifier like we do in GL. This commit goes back to making it its own thing so that it's clear in the driver when we're using modifiers and when we're using legacy paths. v2 (Jason Ekstrand): - Rename legacy_scanout to needs_set_tiling Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-21 22:37:10 +00:00
Jason Ekstrand	f5433e4d6c	vulkan/wsi: Add modifiers support to wsi_create_native_image This involves extending our fake extension a bit to allow for additional querying and passing of modifier information. The added bits are intended to look a lot like the draft of VK_EXT_image_drm_format_modifier. Once the extension gets finalized, we'll simply transition all of the structs used in wsi_common to the real extension structs. Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-21 22:37:10 +00:00
Daniel Stone	55b27e1e5f	vulkan/wsi: Add drm_modifier member to wsi_image Not yet used anywhere. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-21 22:37:10 +00:00
Daniel Stone	61c3feb38d	vulkan/wsi: Add multiple planes to wsi_image Not currently used. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-21 22:37:10 +00:00
Timothy Arceri	cdeac00267	nir: remove old assert This was originally intended to make sure the remap location was not -1. However the code has changed a lot since then, the location is now never set to -1 and we also handle components meaning this old assert has been doing comparisons with the pointer to the array of component data. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105183	2018-02-22 09:31:00 +11:00
Timothy Arceri	86098696fc	radeonsi/nir: collect more accurate output_usagemask Fixes assert in the glsl-1.50-gs-max-output-components piglit test. Note that the double handling will only work for doubles that don't take up multiple slots i.e. double and dvec2. However dual slot double handling is an existing bug which is made no worse by this patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Timothy Arceri	79dc94828a	radeonsi/nir: disable GLSL IR loop unrolling Delaying unrolling and allowing NIR to do it instead has been shown to result in better code in drivers such as i965. shader-db results appear to show the same is true for radeonsi. The other advantage is that using NIR unrolling improves compile times significantly. Totals from affected shaders: SGPRS: 9624 -> 10016 (4.07 %) VGPRS: 6800 -> 6464 (-4.94 %) Spilled SGPRs: 0 -> 2 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 359176 -> 332264 (-7.49 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 1355 -> 1432 (5.68 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Timothy Arceri	e6269ffc2e	radeonsi/nir: fix tess varying loads for doubles Fixes the following piglit tests: tests/spec/arb_tessellation_shader/execution/double-array-vs-tcs-tes.shader_test tests/spec/arb_tessellation_shader/execution/double-vs-tcs-tes.shader_test Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Timothy Arceri	6d338d757f	ac/radeonsi: pass type to load_tess_varyings() We need this to be able to load 64bit varyings. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-22 09:31:00 +11:00
Daniel Stone	eef890b7b1	x11/dri3: Store raw present completion mode The DRI3 drawable info struct currently stores a boolean for whether the last completed operation was a flip or not. As we need to track the full completion mode for handling suboptimal returns, change the 'flipping' field to the raw present completion mode from the server. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-21 21:57:38 +00:00
Daniel Stone	a6f1952814	x11/dri3: Don't open-code ARRAY_SIZE Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-21 21:57:38 +00:00
Jason Ekstrand	52056206e1	anv: Don't assert that stencil HiZ clears are single-slice It's true for depth HiZ clears because we only have HiZ on single-slice images right now. However, for stencil-only clears there is no such restriction. Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-21 13:54:11 -08:00
Jason Ekstrand	7dd0f73fe1	anv: Only copy clear dwords if we're rendering to the first slice Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-02-21 12:47:17 -08:00
Marek Olšák	b494ed168c	radeonsi: don't flush when si_eliminate_fast_color_clear is no-op	2018-02-21 20:03:11 +01:00
Marek Olšák	5f55f4c59f	radeonsi: make texture_discard_cmask/eliminate functions non-static	2018-02-21 20:03:11 +01:00
James Zhu	81dd4a7637	radeonsi: enable uvd encode for HEVC main Enable UVD encode for HEVC main profile Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	b38b208ff8	radeonsi:create uvd hevc enc entry Add UVD hevc encode pipe video codec creation entry Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	e7d51e27ed	radeon/uvd:add uvd hevc enc functions Implement UVD hevc encode functions Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	2b86f5fa0b	radeon/uvd:add uvd hevc enc hw ib implementation Implement required IBs for UVD HEVC encode. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	461508c15c	radeon/uvd:add uvd hevc enc hw interface header Add hevc encode hardware interface for UVD Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	c6acae22c8	winsys/amdgpu:add uvd hevc enc support in amdgpu cs Support UVD HEVC encode in amdgpu cs Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>	2018-02-21 13:53:38 -05:00
James Zhu	f0ad908e79	amd/common:add uvd hevc enc support check in hw query Based on amdgpu hardware query information to check if UVD hevc enc support Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-21 13:53:38 -05:00
Karol Herbst	7319311a50	nvir/nvc0: fix legalizing of ld unlock c0[0x10000] We have to increase the file index also for 0x10000 not just for values greater than 0x10000. Fixes: `37b67db6ae` Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-21 11:12:45 +01:00
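The boundary condition fixed in 7319311a50 is a classic off-by-one: an offset exactly equal to the buffer size already belongs to the next file index. A minimal sketch of the idea, assuming a 0x10000-byte constant buffer window; `demo_cb_ref`, `demo_legalize`, and `DEMO_CB_SIZE` are hypothetical names and do not come from the nouveau compiler:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed window size: offsets 0x0000..0xffff fit in one buffer. */
#define DEMO_CB_SIZE 0x10000u

struct demo_cb_ref {
    uint32_t file_index; /* which constant buffer is addressed */
    uint32_t offset;     /* byte offset inside that buffer */
};

/* Fold an out-of-range offset into the file index. */
static struct demo_cb_ref
demo_legalize(struct demo_cb_ref ref)
{
    /* The comparison must be ">=", not ">": an offset of exactly
     * 0x10000 is the first byte of the next buffer. */
    while (ref.offset >= DEMO_CB_SIZE) {
        ref.offset -= DEMO_CB_SIZE;
        ref.file_index++;
    }

    assert(ref.offset < DEMO_CB_SIZE);
    return ref;
}
```

With a strict `>` comparison, `c0[0x10000]` would be left untouched and would address one byte past the end of buffer 0 instead of the start of buffer 1.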
Samuel Pitoiset	a6accad68f	ac/nir: add glsl_is_array_image() helper For consistency. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-21 09:41:51 +01:00
Samuel Pitoiset	ff83dfb364	ac/nir: set the DA field when performing atomics on 3D images This doesn't fix anything known but it should definitely be set. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-21 09:41:49 +01:00
Eric Anholt	afa7b2f199	i965: Fix compiler warning about write being undefined. This looks like it should be protected by the assume() about nr_color_regions, but my compiler warns anyway. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-02-20 20:23:57 -08:00
Eric Anholt	4636ce362d	glsl/tests: Fix a compiler warning about signed/unsigned loop comparison. Fixes: `d32956935e` ("glsl: Walk a list of ir_dereference_array to mark array elements as accessed") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-20 20:23:57 -08:00
Eric Anholt	7075c084fc	loader: Fix compiler warnings about truncating the PCI ID path. My build was producing: ../src/loader/loader.c:121:67: warning: ‘%1u’ directive output may be truncated writing between 1 and 3 bytes into a region of size 2 [-Wformat-truncation=] and we can avoid this careful calculation by just using asprintf (as we do elsewhere in the file). Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-20 20:23:57 -08:00
Eric Anholt	1b313eedb5	glsl: Silence warnings in the uniform initializer test about 16-bit types They should probably get unit tests implemented, but this cleans up a bunch of warnings in my build for now. Fixes: `59f458cd87` ("glsl: Add 16-bit types") Cc: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-20 20:23:57 -08:00
Jordan Justen	96fe36f7ac	i965: Enable disk shader cache by default Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-20 18:49:43 -08:00
Dave Airlie	baa0feb73d	radv: don't send num_tcs_input_cp to sgprs. We never use it in the shaders. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:36 +00:00
Dave Airlie	952222ddd4	radv/tess: don't need to look in constant for vertices_per_patch This just avoids passing this value via user sgprs. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:28 +00:00
Dave Airlie	77fd1b9187	ac/radv: cleanup some tcs output values access Just consolidates some code to make it easier to change. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:23 +00:00
Dave Airlie	0e6f0d400b	ac/radv: remove total_vertices variable This just removes an unneeded variable. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:19 +00:00
Dave Airlie	e9b9fb3616	ac/radv: don't mark tess inner as used if we don't use it. This just avoids marking it as a used output if we don't actually use it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-21 00:01:15 +00:00
Dave Airlie	d5b2d7ed67	ac/nir: to integer the args to bcsel. dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw was hitting an llvm assert due to one value being an int and the other a float. This just casts both values to integer and fixes the test. Fixes: dEQP-VK.tessellation.invariance.outer_edge_symmetry.triangles_equal_spacing_ccw Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-20 23:15:18 +00:00
Jason Ekstrand	c66fb12117	anv/blorp: Use layout_to_aux_usage when a layout is provided Instead of having aux usage and ANV_AUX_USAGE_DEFAULT to mean "give me something reasonable" we now use anv_layout_to_aux_usage whenever a layout is available. If a layout is available, we ignore the aux_usage parameter. For the cases where we have an explicit aux usage such as clears and aux ops, we have a new ANV_IMAGE_LAYOUT_EXPLICIT_AUX layout. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-20 13:57:17 -08:00
Jason Ekstrand	0fa040e6f5	anv/cmd_buffer: Delete some assert-only variables Checking the sample count is almost as good as aux usage in this case. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-20 13:57:16 -08:00
Jason Ekstrand	e10a62662b	anv/cmd_buffer: Use layout_to_* helpers in compute_aux_usage Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-20 13:57:14 -08:00
Jason Ekstrand	7ea8131aa0	anv/cmd_buffer: Simplify transition_depth_buffer If we don't have HiZ, then anv_layout_to_aux_usage will return NONE for both layouts. If the two layouts are the same, they will get the aux usage. In either case, the code below will give us ISL_AUX_OP_NONE and we'll return without doing anything. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-20 13:57:09 -08:00
Jason Ekstrand	87e86ee2e6	anv/cmd_buffer: Do subpass image transitions in begin/end_subpass Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	7d5f6b6088	anv/cmd_buffer: Mark depth/stencil surfaces written in begin_subpass Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	8a3f086a42	anv/cmd_buffer: Sync clear values in begin_subpass This is quite a bit cleaner because we now sync the clear values at the same time as we do the fast clear. For loading the clear values into the surface state, we now do it once when we handle the LOAD_OP_LOAD instead of every subpass. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	a4136b8c1a	anv/pass: Store usage in each subpass attachment This requires us to ditch the VkAttachmentReference struct in favor of an anv-specific struct. However, we can now easily identify from just the subpass attachment what kind of an attachment it is. This will make iteration over anv_subpass::attachments a little easier in some cases. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	bd356e1bcf	anv/cmd_buffer: Add a concept of pending load aspects These are the same as pending clear aspects only for the "load" operation. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	e526d49edd	anv/cmd_buffer: Iterate all subpass attachments when clearing This unifies things a bit because we now handle depth and stencil at the same time. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:25 -08:00
Jason Ekstrand	2cc3445eb2	anv/cmd_buffer: Decide whether or not to HiZ clear up-front This moves the decision out of begin_subpass and into BeginRenderPass like the decision for color clears. We use a similar name for the function for depth/stencil as for color even though no aux usage is really getting computed. v2 (Jason Ekstrand): - Don't always disable HiZ clears by accident - Use the initial layout to decide whether to do fast clears Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	6fc8555610	anv/cmd_buffer: Move the rest of clear_subpass into begin_subpass Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	7991838973	intel/blorp: Add a blorp_hiz_clear_depth_stencil helper This is similar to blorp_gen8_hiz_clear_attachments except that it takes actual images instead of trusting in the already set depth state. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	1900dd76d0	anv/cmd_buffer: Move the color portion of clear_subpass into begin_subpass This doesn't really change much now but it will give us more/better control over clears in the future. The one interesting functional change here is that we are now re-emitting 3DSTATE_DEPTH_BUFFERS and friends for each clear. However, this only happens at begin_subpass time so it shouldn't be substantially more expensive. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	6fb9d6c6f5	anv/cmd_buffer: Pass a subpass id into begin_subpass This is a bit less awkward than passing in the subpass because it means we don't have to extract the subpass id from the subpass. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	01223b8199	anv/cmd_buffer: Add begin/end_subpass helpers Having begin/end_subpass is a bit nicer than the begin/next/end hooks that Vulkan gives us. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	b5bd3fb4e4	anv/cmd_buffer: Apply subpass flushes before set_subpass This seems slightly more correct because it means that the flushes happen before any clears or resolves implied by the subpass transition. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	869448a8ab	anv: Use framebuffer layers for implicit subpass transitions Fixes: `de3be61801` "anv/cmd_buffer: Rework aux tracking" Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	85d0bec961	anv: Be more careful about fast-clear colors Previously, we just used all the channels regardless of the format. This is less than ideal because some channels may have undefined values and this should be ok from the client's perspective. Even though the driver should do the correct thing regardless of what is in the undefined value, it makes things less deterministic. In particular, the driver may choose to fast-clear or not based on undefined values. This level of nondeterminism is bad. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	4796025ba5	intel/isl: Add an isl_color_value_is_zero helper Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:24 -08:00
Jason Ekstrand	116e818ef1	anv/gpu_memcpy: CS Stall before a MI memcpy on gen7 This fixes a pile of hangs caused by the recent shuffling of resolves and transitions. The particularly problematic case is when you have at least three attachments with load ops of CLEAR, LOAD, CLEAR. In this case, we execute the first CLEAR followed by a MI memcpy to copy the clear values over for the LOAD followed by a second CLEAR. The MI commands cause the first CLEAR to hang which causes us to get stuck on the 3DSTATE_MULTISAMPLE in the second CLEAR. We also add guards for BLORP to fix the same issue. These shouldn't actually do anything right now because the only use of indirect clears in BLORP today is for resolves which are already guarded by a render cache flush and CS stall. However, this will guard us against potential issues in the future. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-20 13:49:19 -08:00
Guillaume Charifi	a572ec2efe	st/mesa: Factorize duplicate code for atomic buffer binding Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 20:54:49 +01:00
Guillaume Charifi	56bfcd50f7	st/mesa: Factorize duplicate code in st_update_framebuffer_state() Signed-off-by: Guillaume Charifi <guillaume.charifi@sfr.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 20:54:49 +01:00
Rob Clark	4c4e6232ee	freedreno/ir3: fix use_count refcnt'ing issue Was hitting an assert with vs-varying-array-mat4-index-col-row-wr.shader_test When eliminating a copy, we were dropping the use_count of the mov that is skipped, but not increasing the use_count of its src instruction. Fixes: `76440fcca9` freedreno/ir3: clean up dangling false-dep's Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-20 13:43:42 -05:00
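The reference-count bookkeeping described in 4c4e6232ee can be illustrated with a small sketch. It assumes a simplified instruction with a single source; `demo_instr` and `demo_eliminate_copy` are hypothetical and are not the ir3 data structures:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified instruction: one optional source plus a count of how
 * many other instructions read this one's result. */
struct demo_instr {
    int use_count;
    struct demo_instr *src;
};

/* Rewrite an operand that points at a mov so that it reads the
 * mov's source directly, keeping both use counts balanced. */
static void
demo_eliminate_copy(struct demo_instr **operand)
{
    struct demo_instr *mov = *operand;
    assert(mov != NULL && mov->src != NULL);

    struct demo_instr *src = mov->src;

    mov->use_count--; /* the operand no longer reads the mov */
    src->use_count++; /* it reads the mov's source instead */
    *operand = src;
}
```

Skipping the increment leaves the source's count one too low, so a later pass that trusts `use_count` may treat a still-referenced instruction as dead, which is the kind of inconsistency that trips an assert.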
Eric Engestrom	ac731531a1	docs: fix patent url Reported-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-20 15:14:34 +00:00
Brian Paul	e7d1a93723	svga: replaced 'unsigned' with proper enum types in shader code Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-02-20 08:11:06 -07:00
Jonathan Gray	9401d90a53	configure.ac: pthread-stubs not present on OpenBSD pthread-stubs is no longer required on OpenBSD and has been removed. libpthread parts involved moved to libc. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-20 15:08:47 +00:00
Andres Gomez	36ac485bd1	swr: bump minimum supported LLVM version to 4.0 Since radv and radeonsi removed support for LLVM 3.9 the distcheck target got broken because SWR distribution needed 3.9.x. After checking with George Kyriazis, SWR is OK with moving to LLVM 4.0 and above, which will solve this problem. Fixes: `3bf1e036e8` ("amd: remove support for LLVM 3.9") Cc: George Kyriazis <george.kyriazis@intel.com> Cc: Tim Rowley <timothy.o.rowley@intel.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: George Kyriazis <george.kyriazis@intel.com>	2018-02-20 17:03:06 +02:00
Andres Gomez	b39f6d5fc7	travis: radeonsi and radv need LLVM 4.0 Fixes: `3bf1e036e8` ("amd: remove support for LLVM 3.9") Cc: Marek Olšák <marek.olsak@amd.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-20 16:58:30 +02:00
Samuel Pitoiset	1ac741d690	ac/nir: move ac_declare_lds_as_pointer() outside of the switch Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-20 10:44:59 +01:00
Samuel Pitoiset	b5d111ae76	radv: allow to force family using RADV_FORCE_FAMILY Useful for pipeline-db. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-20 10:44:47 +01:00
Thomas Hellstrom	f386776ea5	loader_dri3/glx/egl: Reinstate the loader_dri3_vtable get_dri_screen callback Removing this callback caused rendering corruption in some multi-screen cases, so it is reinstated but without the drawable argument which was never used by implementations and was confusing since the drawable could have been created with another screen. Cc: "17.3 18.0" mesa-stable@lists.freedesktop.org Fixes: `5198e48a0d` (loader_dri3/glx/egl: Remove the loader_dri3_vtable get_dri_screen callback) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105013 Reported-by: Daniel van Vugt <daniel.van.vugt@canonical.com> Tested-by: Timo Aaltonen <tjaalton@ubuntu.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-20 10:36:53 +01:00
Thomas Hellstrom	80c31f7837	svga: Fix a leftover debug hack Fix what appears to be a leftover debug hack. The hack would force the driver to take a different blit path; possibly, although unverified, reverting to software blits. Tested using piglit tests/quick. No related regressions. Cc: "17.2 17.3 18.0" <mesa-stable@lists.freedesktop.org> Fixes: `9d81ab7376` (svga: Relax the format checks for copy_region_vgpu10 somewhat) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104625 Reported-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-20 10:12:19 +01:00
Iago Toral Quiroga	af5f2322d0	anv/entrypoints: make vkGetDeviceProcAddr return NULL for instance commands Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-20 08:12:32 +01:00
Ilia Mirkin	e1a70aed10	nv50,nvc0: mark ABGR format as displayable instead of ARGB format This matches the hardware's capabilities. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-19 22:33:58 -05:00
Ilia Mirkin	f7604d8af5	st/dri: only expose config formats that are display targets In the case of NVIDIA hardware, ABGR is displayable but ARGB is not. Only advertise the one set in the visuals list. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Daniel Stone <daniels@collabora.com>	2018-02-19 22:33:58 -05:00
Ilia Mirkin	ebdc4c31e2	mesa: add xbgr support adjacent to xrgb Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Daniel Stone <daniels@collabora.com>	2018-02-19 22:33:58 -05:00
Timothy Arceri	d88a2906f8	st/shader_cache: copy nir pointer to gl_program after deserializing This fixes a crash when running the arb_get_program_binary-api-errors piglit test twice. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 13:15:02 +11:00
Timothy Arceri	691c320de0	radeonsi: add nir shader cache support In future we might want to try avoid calling nir_serialize() but this works for now. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 13:15:02 +11:00
Timothy Arceri	2b431808ab	radeonsi: rename variables tgsi_binary -> ir_binary This better represents that the ir could be either tgsi or nir. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 13:15:02 +11:00
Emil Velikov	1270990438	docs: update calendar, add news and link release notes to 17.3.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-19 22:10:18 +00:00
Emil Velikov	be5a996039	docs: add sha256 checksums for 17.3.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `164a993112`)	2018-02-19 22:08:14 +00:00
Emil Velikov	ca614d40cd	docs: add release notes for 17.3.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `2529d77179`)	2018-02-19 22:08:12 +00:00
Marek Olšák	f78fe98fff	radeonsi: fix regression from 32-bit pointers on CI Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2018-02-19 17:56:23 +01:00
Samuel Pitoiset	549c7f3724	radv: compact varyings after removing unused ones It makes no sense to compact before, and the description of nir_compact_varyings() confirms that. Polaris10: Totals from affected shaders: SGPRS: 108528 -> 108128 (-0.37 %) VGPRS: 74548 -> 74500 (-0.06 %) Spilled SGPRs: 844 -> 814 (-3.55 %) Code Size: 3007328 -> 2992932 (-0.48 %) bytes Max Waves: 16019 -> 16009 (-0.06 %) Vega10: Totals from affected shaders: SGPRS: 106088 -> 106232 (0.14 %) VGPRS: 74652 -> 74700 (0.06 %) Spilled SGPRs: 692 -> 658 (-4.91 %) Code Size: 2967708 -> 2953028 (-0.49 %) bytes Max Waves: 18178 -> 18162 (-0.09 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-19 12:19:17 +01:00
Timothy Arceri	51e745cf77	radeonsi/nir: fix gl_FragCoord for pixel_center_integer Fixes piglit test glsl-arb-fragment-coord-conventions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-19 08:47:48 +11:00
Timothy Arceri	347038baa9	glsl/nir: add pixel_center_integer to shader info Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-19 08:47:48 +11:00
Ilia Mirkin	fe76fc11b1	gm107/ir: avoid using kepler instruction capabilities Split up the op properties table into generation-specific bits, and only use the kepler ones on kepler. Fixes some CTS images tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-02-17 23:41:21 -05:00
Ilia Mirkin	f08fd676bf	nvc0: add support for bindless on maxwell+ Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-17 23:41:21 -05:00
Ilia Mirkin	0255550eb1	gm107/ir: change how SUQ works in preparation for bindless All this information can be retrieved from the TIC directly. Avoid having to dip into the constbuf information about the image. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-17 23:41:21 -05:00
Kenneth Graunke	fa8a764b62	i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+. By default, 3DSTATE_CONSTANT_* Constant Buffer 0 is relative to dynamic state base address. This makes it unusable for pushing UBOs. There is a bit in the INSTPM register (or CS_DEBUG_MODE2 on Skylake) which controls whether buffer 0 is relative to dynamic state base address, or simply a normal pointer. Setting that gives us full flexibility. This lets us push up to 4 UBO ranges. We can't currently write this on Haswell and earlier, and will need to update the kernel command parser, and then do the whole version checking song and dance. We also need a brand new kernel that supports context isolation - on older kernels, newly created contexts inherit register state from whatever happened to be running. So, setting this would have catastrophic impact on other drivers such as libva, Beignet, or older Mesa. See commit `8ec5a4e4a4` where we did this once before, but had to revert it in commit `013d331220`. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-02-17 11:26:31 -08:00
Kenneth Graunke	a63c74be85	i965: Stop restoring the default L3 configuration on Kernel 4.16+. Kernel 4.16 has proper context isolation, which means we can change the L3 configuration without worrying about that leaking to other newly created contexts, breaking the assumptions of other userspace. So, disable our workaround to reprogram it back to the default. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-02-17 11:26:18 -08:00
Mikko Perttunen	5a1606c51f	nvc0: Use GP100_COMPUTE_CLASS on GP10B GP10B requires the use of GP100_COMPUTE_CLASS instead of GP104_COMPUTE_CLASS as is used for other non-GP100 chips. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-17 14:16:10 -05:00
Daniel Stone	9d21dbeb88	i965: Fix aux-surface size check The previous commit reworked the checks intel_from_planar() to check the right individual cases for regular/planar/aux buffers, and do size checks in all cases. Unfortunately, the aux size check was broken, and required the aux surface to be allocated with the correct aux stride, but full image height (!). As the ISL aux surface is not recorded in the DRIimage, we cannot easily access it to check. Instead, store the aux size from when we do have the ISL surface to hand, and check against that later when we go to access the aux surface. Signed-off-by: Daniel Stone <daniels@collabora.com> Fixes: `c2c4e5bae3` ("i965: Fix bugs in intel_from_planar") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-17 10:22:35 +00:00
Marek Olšák	931ec80eeb	radeonsi: implement 32-bit pointers in user data SGPRs (v2) User SGPRs changes: VS: 14 -> 9 TCS: 14 -> 10 TES: 10 -> 6 GS: 8 -> 4 GSCOPY: 2 -> 1 PS: 9 -> 5 Merged VS-TCS: 24 -> 16 Merged VS-GS: 18 -> 11 Merged TES-GS: 18 -> 11 SGPRS: 2170102 -> 2158430 (-0.54 %) VGPRS: 1645656 -> 1641516 (-0.25 %) Spilled SGPRs: 9078 -> 8810 (-2.95 %) Spilled VGPRs: 130 -> 114 (-12.31 %) Scratch size: 1508 -> 1492 (-1.06 %) dwords per thread Code Size: 52094872 -> 52692540 (1.15 %) bytes Max Waves: 371848 -> 372723 (0.24 %) v2: - the shader cache needs to take address32_hi into account - set amdgpu-32bit-address-high-bits Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)	2018-02-17 04:52:17 +01:00
Marek Olšák	5722cd4084	radeonsi: disallow constant buffers with a 64-bit address in slot 0 State trackers must use a user buffer or const_uploader, or set pipe_resource::flags same as const_uploader->flags. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	d790b6cece	radeonsi: move const_uploader allocations to 32-bit address space Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	50581549b7	winsys/radeon: implement and enable 32-bit VM allocations Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	1104d1e9d3	winsys/radeon: add struct radeon_vm_heap Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	48ecacfefa	winsys/amdgpu: enable 32-bit VM allocations Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	c2da45be86	gallium/radeon: add 32-bit address space heaps Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-17 04:52:17 +01:00
Marek Olšák	0977b7f7b3	ac: query high bits of 32-bit address space	2018-02-17 04:51:58 +01:00
Marek Olšák	16be55da94	gallium: use PIPE_CAP_CONSTBUF0_FLAGS	2018-02-17 04:20:55 +01:00
Marek Olšák	8e7222f4e5	gallium: allow drivers to impose BO flags restrictions on constant buffer 0 Required by radeonsi for optimal behavior.	2018-02-17 04:20:55 +01:00
Alexander von Gluck IV	834d221512	meson: Add Haiku platform support v4 Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-16 16:56:34 -06:00
Anuj Phogat	7b283544dc	anv/icl: Add render target flush after uploading binding table The PIPE_CONTROL command description says: "Whenever a Binding Table Index (BTI) used by a Render Target Message points to a different RENDER_SURFACE_STATE, SW must issue a Render Target Cache Flush by enabling this bit. When render target flush is set due to new association of BTI, PS Scoreboard Stall bit must be set in this packet." Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	136f583a24	anv/icl: Enable float blend optimization Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	cd7102972f	anv/icl: Use gen11 functions Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	9673c21d4f	anv/icl: Build anv libs for gen11 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	1f108b436b	anv/icl: Generate gen11 entry point functions Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	a86c0a08df	anv/icl: Don't use DISPATCH_MODE_SIMD4X2 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	cd5fc634a8	anv/icl: Don't use SingleVertexDispatch Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	6e3940b3cf	anv/icl: Don't set ResetGatewayTimer Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:32 -08:00
Anuj Phogat	41a4c2c8e8	anv/icl: Add #define genX Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:31 -08:00
Anuj Phogat	413d475b44	anv/icl: Add gen11 mocs defines Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-16 11:10:31 -08:00
Kenneth Graunke	1d6cf433d2	i965: Implement GenerateMipmap directly, rather than using Meta. Meta is awful and we'd like to stop using it. Implementing this using BLORP allows us to stop trashing a bunch of GL state every time. This follows the structure of st_generate_mipmap(). compute_num_levels is lifted directly from there. Improves performance in Gl41HdrBloom by about 11.794% +/- 1.01919% (n=3) on Kabylake GT2 at 1280x720 (the difference seems much smaller at higher resolutions). v2 (idr): Don't try depth or depth-stencil blorp blits on Gen4 or Gen5 because it's not implemented yet. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-02-16 10:48:10 -08:00
Kenneth Graunke	9bcd31ea90	mesa: Move compute_num_levels from st_gen_mipmap.c to mipmap.c. I want to use compute_num_levels inside i965. Rather than duplicating it, move it from mesa/st to core Mesa, and make it non-static. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-16 10:48:10 -08:00
Dylan Baker	03ab40b1f7	meson: freedreno depends on nir This fixes a race condition in building targets that link in freedreno. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105120 Fixes: `0bbecc5a85` ("meson: define driver dependencies") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Mark Janes <mark.a.janes@intel.com>	2018-02-16 10:10:18 -08:00
George Kyriazis	f1fbeb1a53	swr/rast: blend_epi32() should return Integer, not Float fix gcc8 compiler error for KNL. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105029 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:02 -06:00
George Kyriazis	7dd793d10c	swr/rast: Normalize path for debug metadata in template gen_llvm.hpp Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:02 -06:00
George Kyriazis	f979d0bc2f	swr/rast: Consolidate archrast Draw events Consolidate archrast draw events into single draw event with an attribute that represents the type of draw - Add handlers for new private proto versions of DrawInstancedEvent, DrawIndexedInstancedEvent, DrawInstancedSplitEvent, and DrawIndexedInstancedSplitEvent - Convert the draw events to generic DrawInfoEvents - parse_proto_event_fields() replaces 'AR_DRAW_TYPE' as a field type with 'uint32_t'. This draw type is actually an enum, but can be represented as an unsigned integer. - is_draw_or_dispatch() recognizes DrawInfoEvent as a draw event Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:02 -06:00
George Kyriazis	45df1a6520	swr/rast: Add semantics for translating address Added support for another full translation path in fetch jitter. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:02 -06:00
George Kyriazis	c09483cf0a	swr/rast: Convert C Sampler intrinsics Convert portions of the C sampler to the rasty SIMD lib. Also fix SRL call with a non-immediate. Don't count on the compiler automagically converting an srli call to srl if the shift count isn't an immediate. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	37ebf86add	swr/rast: Make SIMDLib templated types easier to use "typename SIMD_T::TypeName" --> "TypeName<SIMD_T>" Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	74e8bb4a22	swr/rast: Be more explicit when fetching next component Use a new function to denote that we want to get offset to next component and hide the fact that GEP is used underneath. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	da77eb55d5	swr/rast: Fix bug related to passing AR handle We were passing a garbage handle. Let's not do that. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	48d62409f8	swr/rast: Fix primitive replication issue in tesselation PA. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	e12db47a7d	swr/rast: Use llvm intrinsic masked gather Use llvm intrinsic masked.gather instead of manual unroll for the cases where we have vector of pointers. Improves llvm IR debug experience by reducing a ton of IR to a single intrinsic call. Also seems to reduce overall stack use considerably. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:01 -06:00
George Kyriazis	9cc9688e49	swr/rast: Misc cleanup Together with correct detection of clipDistance NaNs when no cullDistance is set Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:00 -06:00
George Kyriazis	036c8b6247	swr/rast: Renamed variable in vertexbufferstate Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:00 -06:00
George Kyriazis	b25efa36e6	swr/rast: Fix GATHERPS to avoid assertions. With the pBase type change, LLVM was asserting because of wrong types. Cast appropriately. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:00 -06:00
George Kyriazis	8a64593bde	swr/rast: More precise user clip distance interpolation Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:00 -06:00
George Kyriazis	3e560b7c85	swr/rast: Cull prims when all verts have negative clip distances Performance optimization, and fixes some clipping issues. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:00 -06:00
George Kyriazis	cb4b604ebd	swr/rast: whitespace and comment cleanup Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:54:00 -06:00
George Kyriazis	5df4d98780	swr/rast: Fix invalid number of attributes Fix invalid number of attributes passed into tesselation PA. Needs to take into account any offsets from the shader. Innocuous issue, but removes an assert firing in debug. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:59 -06:00
George Kyriazis	2053472723	swr/rast: Add clipper stats. Clipper event is now: event ClipperEvent { uint32_t drawId; uint32_t trivialRejectCount; uint32_t trivialAcceptCount; uint32_t mustClipCount; }; Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:59 -06:00
George Kyriazis	0420b2be89	swr/rast: Separate event types to public and private Split into two proto files and modify appropriate build rules for configure / scons / meson builds. There are private internal events (proxy) that communicate information from rasterizer to ArchRast. ArchRast can use these events to calculate a final answer and then emit other public events which will be saved to file. Users will use the public proto file and not the private one. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:59 -06:00
George Kyriazis	e48dd2489c	swr/rast: Clean up event types and remove BE events Begin/End events not needed anymore. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:59 -06:00
George Kyriazis	7070027d7b	swr/rast: Removed unused variable Gets rid of zillions of unused variable warnings, made worse by templates. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:59 -06:00
George Kyriazis	e3f92bb7af	swr/rast: Separate RDTSC code from archrast Renamed rdstc defines more appropriately Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:59 -06:00
George Kyriazis	8bce71622e	swr/rast: Cleanup of mpPrivateContext in Builder Provide access functions for mpPrivateContext in Builder. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:58 -06:00
George Kyriazis	5697dc3e23	swr/rast: Remove some JIT debug code Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:58 -06:00
George Kyriazis	2407b8c9b4	swr/rast: Don't include private context in gather args Move mpPrivateContext to compensate Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:58 -06:00
George Kyriazis	a4c23fc25b	swr/rast: Cleanup knob definitions Rename some of the categories and move some options around. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:53:42 -06:00
George Kyriazis	ec34ed73d6	swr/rast: Add missing parameter to a few gather functions We now pass pDrawContext as a default parameter Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-02-16 10:39:42 -06:00
Philipp Zabel	bfe4e24a42	etnaviv: add useful information to BO import errors Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-02-16 17:05:43 +01:00
Daniel Stone	ff5432dc50	egl/wayland: Always use in-tree wayland-egl-backend.h A recent patchset to Wayland[0] migrated Mesa's libwayland-egl backend into Wayland itself, so implementations could provide backends. Mesa still uses its own, and the two have already diverged[1]. The include from egl_dri2.h could pick up either the installed Wayland wayland-egl-backend.h (with a 'driver_private' member), or the Mesa internal wayland-egl-backend.h (with a 'private' member), failing the build in the first instance. Add an explicit directory prefix to the include, so we always get our in-tree version. [0]: https://patchwork.freedesktop.org/series/31663/ [1]: https://cgit.freedesktop.org/wayland/wayland/commit/?id=9fa60983b579 Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105103 Fixes: `198af27c67` ("wayland-egl: rename wayland-egl-{priv,backend}.h")	2018-02-16 14:04:19 +00:00
Daniel Stone	f766e1afa5	meson: Move Wayland dmabuf to wayland-drm As the comment notes: linux-dmabuf has nothing to do with wayland-drm, but we need a single place to build these files we can use from both EGL and Vulkan, which is guaranteed to be included before both EGL and Vulkan WSI. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>	2018-02-16 14:04:19 +00:00
Eric Engestrom	65dda6c9ec	egl/wayland: check for invalid format index v2: just tell the compiler to assume the format will always be found, as it comes from the table itself to begin with. (DanielS) CID: 1429516 Fixes: `d32b23f383` "egl/wayland: Add bpp to visual map" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-16 13:14:29 +00:00
Eric Engestrom	a176b053b6	glsl: fix sizeof(pointer) bug Doesn't really change anything to the test though ¯\_(ツ)_/¯ CID: 1429511 Fixes: `e8495646af` "glsl/tests: changes to test_disk_cache_create test" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-16 12:04:29 +00:00
Timothy Arceri	2f5d3df9fc	radeonsi/nir: set TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL correctly We set this for post_depth_coverage in addition to early_fragment_tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-16 15:53:13 +11:00
Dave Airlie	60c14a0db2	virgl: remap query types to hw support. The gallium query types changed, so we need to remap from the gallium ones to the virgl ones. Fixes: dEQP-GLES3.functional.transform_feedback.basic_types* "This also fixes: dEQP-GLES3.functional.transform_feedback.array.separate* dEQP-GLES3.functional.transform_feedback.array_element* dEQP-GLES3.functional.transform_feedback.interpolation.* Gallium's p_defines.h and virglrenderer's p_defines.h have diverged quite a bit, so not including PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE there makes sense for now." - Gurchetan Singh Fixes: `3f6b3d9db` (gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Tested-by: Gurchetan Singh <gurchetansingh@chromium.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-16 12:42:06 +10:00
Anuj Phogat	8a05b06146	i965/icl: Add render target flush after uploading binding table From PIPE_CONTROL command description in gfxspecs: "Whenever a Binding Table Index (BTI) used by a Render Target Message points to a different RENDER_SURFACE_STATE, SW must issue a Render Target Cache Flush by enabling this bit. When render target flush is set due to new association of BTI, PS Scoreboard Stall bit must be set in this packet." V2: Move the PIPE_CONTROL to update_renderbuffer_surfaces() in brw_wm_surface_state.c (Ken). Fixes a fulsim error and a GPU hang described in the JIRA below. JIRA: MD5-322 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	3f8289164f	i965/icl: Enable float blend optimization and Wa3DStateMode Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	ba3cbee6c5	intel/common/icl: Add has_sample_with_hiz flag in gen_device_info Sampling from hiz is enabled in i965 for GEN9+ but this feature has been removed from gen11. So, this new flag will be useful to turn the feature on/off for different gen h/w. It will be used later in a patch adding device info for gen11. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	9c144dc81e	i965/icl: Add assertions to check dispatch mode is SIMD8 SIMD4x2 dispatch mode has been removed in GEN11. We're not using it anyways in Mesa. Adding few asserts to make it explicit. Use GEN_GEN macro in place of devinfo->gen (Ken) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	02e91b6d62	i965/icl: Update switch statements Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	27d0034938	i965/icl: Update the assert in brw_memory_barrier() Nothing is changed here from gen10 to gen11. So, just update the assert. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	d6b26649a6	i965/icl: Define and use icl mocs settings Gen11 MOCS settings are duplicate of Gen10 MOCS settings. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	e9ad5c9a5d	i965/icl: Update the comment for maximum number of threads per PSD Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	93f601d7ed	i965/icl: Build and use gen11 functions for genxml state-upload and blorp Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-15 16:14:56 -08:00
Anuj Phogat	85f319155f	i965/icl: Don't set ResetGatewayTimer This field is removed in gen11+ Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:56 -08:00
Anuj Phogat	772a75be46	intel/icl: Do StateCacheInvalidation for indirect clear color StateCacheInvalidation is required on all gen7+ platforms. We don't need to update this check for every new gen h/w unless this requirement is changed. So, dropping the check for latest gen h/w. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:55 -08:00
Anuj Phogat	bff24e2173	intel/isl/icl: Build and use gen11 surface state emit functions Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-15 16:14:55 -08:00
Anuj Phogat	0427bd4954	intel/isl/icl: Add the maximum surface size limit Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:55 -08:00
Anuj Phogat	c68ede0be7	intel/genxml/icl: Update genx_bits header Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-15 16:14:55 -08:00
Anuj Phogat	165a68b05a	intel/genxml/icl: Generate packing headers Move build system changes in to one patch (Ken, Emil) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-15 16:14:55 -08:00
Anuj Phogat	7ed27d8cbf	intel/genxml/icl: Add gen11.xml Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 16:14:55 -08:00
Kenneth Graunke	4dee8f0548	i965: Drop EXEC_OBJECT_CAPTURE defines. These only existed to avoid making people update libdrm for new uABI headers. A while ago we imported those headers into the Mesa repo, so the dependency is gone and these are no longer useful. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-15 15:35:52 -08:00
Jan Vesely	78673b614b	clover: Fix build after llvm r325155 and r325160 r325155 ("Pass a reference to a module to the bitcode writer.") and r325160 ("Pass module reference to CloneModule") change function interface from pointer to reference. v2: Fix indentation (tab instead of spaces) Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-02-15 18:18:53 -05:00
Bas Nieuwenhuizen	05d84ed68a	radv: Always lower indirect derefs after nir_lower_global_vars_to_local. Otherwise new local variables can cause hangs on vega. CC: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105098 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-15 23:45:59 +01:00
Dylan Baker	2ab1ce30c4	meson: fix xvmc target linkage This needs to link the state tracker with --whole-archive to expose the right symbols. v4: - Always add libswdri and libswkmsdri to the link_with list Fixes: `22a817af8a` ("meson: build gallium xvmc state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:38:43 -08:00
Dylan Baker	0b73c329bc	meson: Fix xa target linkage This needs to use --whole-archive (link_whole in meson) to properly expose symbols. v4: - Always add libswdri and libswkmsdri to link_with list Fixes: `0ba909f0f1` ("meson: build gallium xa state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:36:31 -08:00
Dylan Baker	91a59b6287	meson: Fix omx-bellagio target linkage This needs to use --whole-archive (link_whole in meson) to properly expose symbols. v4: - Always add libswdri and libswkmsdri to link_with Fixes: `1d36dc674d` ("meson: build gallium omx state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:36:26 -08:00
Dylan Baker	2e4be28fb2	meson: fix va target linkage The state tracker needs to be linked with whole-archive (like autotools). As a result there are symbols from libswdri and libswkmsdri that are needed, so link those as well. v4: - Always add libswdri and libswkmsdri to link_with list Fixes: `5a785d51a6` ("meson: build gallium va state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:36:16 -08:00
Dylan Baker	90d361753c	meson: fix vdpau target linkage The VDPAU state tracker needs to be linked with whole-archive (autotools does this). Because we are linking the whole archive we also need to link with libswdri and libswkmsdri if those have been enabled. v4: - Always add libswdri and libswkmsdri to link_with list Fixes: `68076b8747` ("meson: build gallium vdpau state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:36:09 -08:00
Dylan Baker	3403055768	meson: Actually link xvmc target with libxvmc Unlike vdpau this is required. Fixes: `22a817af8a` ("meson: build gallium xvmc state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:36:04 -08:00
Dylan Baker	7708103857	meson: actually link with libomxil-bellagio This state tracker actually needs to link, unlike vdpau. Fixes: `1d36dc674d` ("meson: build gallium omx state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:35:57 -08:00
Dylan Baker	7023b373ec	meson: link dri3 xcb libs into vlwinsys instead of into each target This makes the dependencies easier to manage, since each media target doesn't need to worry about linking to half a dozen libraries. Fixes: `b1b65397d0` ("meson: Build gallium auxiliary") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:35:51 -08:00
Dylan Baker	424e654cb0	meson: use va-api version reported by pkg-config Fixes: `5a785d51a6` ("meson: build gallium va state tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:35:47 -08:00
Dylan Baker	8eb608df61	meson: add libswdri and libswkmsdri to dri link_with Fixes: `b154b44ae3` ("meson: build radeonsi gallium driver") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:35:42 -08:00
Dylan Baker	be879f9f29	meson: add libswdri and libswkmsdri to d3dadaptor link_with v5: - Fix libswdi -> libswdri typo Fixes: `6b4c7047d5` ("meson: build gallium nine state_tracker") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:35:36 -08:00
Dylan Baker	d672084ba2	meson: define empty variables for libswdri and libswkmsdri This allows these variables to unconditionally included in `link_with` lists, even if they're not used. This allows deleting duplicated logic in nearly every gallium target implemented in meson today. This also removes the now useless `build_by_default` flag from swdri and swkmsdri. v4: - add this patch Fixes: `66c94b9313` ("meson: build gallium winsys for dri, null, and wrapper") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 10:35:23 -08:00
Dylan Baker	7d0e342af2	meson: add convenience variable for anv_extensions.py dependency Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-15 09:46:07 -08:00
Dylan Baker	0e617c04f1	meson: use depend_files for adding extra file dependencies cc: Jason Ekstrand <jason.ekstrand@intel.com> Fixes: `dd088d4bec` ("anv/extensions: Generate a header file with extension tables") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-15 09:46:04 -08:00
Dylan Baker	b03969a5ad	meson: use depend_files to track extra file dependencies cc: Jason Ekstrand <jason.ekstrand@intel.com> Fixes: `f939940809` ("anv: Split anv_extensions.py into two files") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-15 09:45:56 -08:00
Dylan Baker	384bff13e0	Revert "anv/meson: Make anv_entrypoints_gen.py depend on anv_extensions.py" This reverts commit `10d1b0be8e`. This is unnecessary, the depend_files argument is for adding dependencies on files that are not part of the input, which is already done. cc: Jason Ekstrand <jason.ekstrand@intel.com> Fixes: `10d1b0be8e` Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-15 09:45:40 -08:00
Brian Paul	64a1223a80	svga: replace gotos with else clauses Simple clean-up. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-02-15 09:49:06 -07:00
Brian Paul	fa901768a4	svga: s/unsigned/enum pipe_shader_type/ Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-02-15 09:05:09 -07:00
Brian Paul	8b54299c34	svga: move duplicated code for setting fillmode/flatshade state Move the calls to svga_hwtnl_set_fillmode() and svga_hwtnl_set_flatshade() out of the two retry_draw_*() functions to the svga_draw_vbo() function. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-02-15 09:05:09 -07:00
Brian Paul	072df89a79	svga: move svga_update_state() call in draw code This fixes a few Piglit transform feedback regressions caused by commit `7a1401938b`. In that change I moved the svga_update_state() call into the loops, after the calls to svga_hwtnl_set_flatshade(). But svga_hwtnl_set_flatshade() actually depends on some derived shader state. This patch moves the svga_update_state() call into svga_draw_vbo() so it's not duplicated in two places. Fixes: `7a1401938b` ("svga: clean up retry_draw_range_elements(), retry_draw_arrays()") Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-02-15 09:05:08 -07:00
Brian Paul	6f0aec5671	svga: call tgsi_scan_shader() for dummy shaders If we fail to compile the normal VS or FS we fall back to a simple/dummy shader. We need to rescan the shader to update the shader info. Otherwise, this can lead to further translation failures because the shader info doesn't match the actual shader. Found by adding some extra debug assertions in the state-update code while debugging something else. v2: also update shader generic_inputs/outputs, etc. per Charmaine Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-02-15 09:05:01 -07:00
Samuel Pitoiset	579b33c1fd	ac/nir: do not reserve user SGPRs for unused descriptor sets In theory this might lead to corruption if we bind a descriptor set which is unused, because LLVM is smart and it can re-use unused user SGPRs. In practice, this doesn't seem to fix anything. As a side effect, this will reduce the number of emitted SH_REG packets. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-15 14:53:30 +01:00
Samuel Pitoiset	309854148c	ac/shader: fix gathering of desc_set_used_mask This was quite wrong. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-15 14:53:30 +01:00
Samuel Pitoiset	61a4fc3ecc	ac/shader: be a little smarter when scanning vertex buffers Although meta shaders don't use any vertex buffers, there is no behaviour change but I think it's better to do this. Though, this saves two user SGPRs for push constants inlining or something else. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-15 14:53:30 +01:00
Louis-Francis Ratté-Boulianne	a34715ad9c	dri: fromPlanar() can return NULL as a valid result It was assumed that fromPlanar() could return NULL to mean that the planar image is the same as the parent DRI image. That assumption wasn't made everywhere though. Let's fix things and make sure that all callers understand a NULL result Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-15 11:58:17 +00:00
Emil Velikov	f0654dfa65	docs: correct link to the 17.3.3 release notes Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 11:33:27 +00:00
Emil Velikov	dd4734d5c1	docs: update calendar, add news and link release notes to 17.3.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-15 11:33:04 +00:00
Emil Velikov	eadde35f83	docs: add sha256 checksums for 17.3.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `26c84b8af9`)	2018-02-15 11:28:19 +00:00
Emil Velikov	6f4a6e2310	docs: add release notes for 17.3.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `2f9820c553`)	2018-02-15 11:28:18 +00:00
Karol Herbst	7bc15090fc	nvc0: disable MS Images for sample_count == 1 on Maxwell fixes KHR-GL45.multi_bind.dispatch_bind_textures on Maxwell Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-15 11:14:46 +01:00
Gurchetan Singh	c6694793e1	mesa: don't clamp just based on ARB_viewport_array extension The ARB_viewport_array spec says: "Dependencies OpenGL 1.0 is required. OpenGL 3.2 or the EXT_geometry_shader4 or ARB_geometry_shader4 extensions are required. This extension is written against the OpenGL 3.2 (Compatibility) Specification." As such, we should ignore it for GLES2 contexts. Fixes: dEQP-GLES2.functional.state_query.integers.viewport_getinteger dEQP-GLES2.functional.state_query.integers.viewport_getfloat on llvmpipe and virgl. v2: Use _mesa_has_* (Ilia) Signed-off-by: Marek Olšák <marek.olsak@amd.com> Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>	2018-02-15 01:58:50 +01:00
Dylan Baker	5317211fa0	meson: use a custom target instead of a generator for i965 oa Generators really are never the thing you want. The problem in this case is that a generator must create a file that contains any file that the generated target depends on. Since brw_oa.py doesn't generate such a file, the generated sources are not regenerated even if the xml files they should depend on change. While we could change brw_oa.py to write such a file, that's silly, it depends on itself and the xml file. So we'll just use a custom target instead, which will have the correct dependency behavior and doesn't really add that much code. Fixes: `3218056e0e` ("meson: Build i965 and dri stack") CC: Ian Romanick <idr@freedesktop.org> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-02-14 16:45:40 -08:00
Anuj Phogat	0cd37f9178	isl: Don't use surface format R32_FLOAT for typed atomic integer operations From Skylake PRM Surface Formats section: "The surface format for the typed atomic integer operations must be R32_UINT or R32_SINT." Fixes an error and a piglit GPU hang in simulation environment. Piglit test: gl45-imageAtomicExchange-float.shader_test Suggested-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.co Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "18.0 17.3" <mesa-stable@lists.freedesktop.org>	2018-02-14 16:30:05 -08:00
Timothy Arceri	7be5f30bb1	radeonsi/nir: fix si_nir_load_tcs_varyings() for outputs We were incorrectly using the input info for outputs. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Timothy Arceri	9740c8a8aa	ac: implement nir_intrinsic_image_samples Fixes cts test: KHR-GL45.shader_texture_image_samples_tests.image_functional_test Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Timothy Arceri	c6b70a0eae	st: add NIR GL_ARB_get_program_binary support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Timothy Arceri	928be4e97e	st/shader_cache: add st_{de}serialise_nir_program() helpers These will be used for NIR GL_ARB_get_program_binary support. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Timothy Arceri	3ad52501dc	ac/nir_to_llvm: fix image size for arrays of arrays Fixes cts test: KHR-GL44.shader_image_size.advanced-changeSize Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Timothy Arceri	6acab18828	radeonsi/nir: fix shader ballot return value bitsize Fixes cts test: KHR-GL46.shader_ballot_tests.ShaderBallotFunctionBallot Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-15 09:02:41 +11:00
Jason Ekstrand	8534af44e4	intel/aubinator: Correctly decode INTERFACE_DESCRIPTOR_DATA Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-14 13:17:26 -08:00
Jason Ekstrand	5c9d47d9c6	i965: Add gl_state_index casts for PATCH_VERTICES_IN This fixes the build in clang Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105088 Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-14 13:16:47 -08:00
Scott D Phillips	3b4f432d9b	i965/miptree: Initialize mcs with a linear map When initializing mcs, map with MAP_RAW and fill in the linear map. Removes a place where gtt mapping is used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-14 12:38:34 -08:00
Scott D Phillips	d13ab69a78	i965/tiled_memcpy: change linear pointer from (0, 0) to (xt1, yt1) In all current uses, the linear surface is only allocated starting at (xt1, yt1) anyway, so this improves the calling ergonomics. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-14 12:38:34 -08:00
Scott D Phillips	ecaad89525	i965/tiled_memcpy: linear_to_ytiled a cache line at a time TileY's low 6 address bits are: v1 v0 u3 u2 u1 u0 Thus a cache line in the tiled surface is composed of a 2d area of 16x4 bytes of the linear surface. Add a special case where the area being copied is 4-line aligned and a multiple of 4-lines so that entire cache lines will be written at a time. On Apollolake, this increases tiling throughput to wc maps by 84.0103% +/- 0.862818% v2: Split [y0, y1) and [y2, y3) loops apart for clarity (Jason Ekstrand) v3: Don't reset src var (Jason), Ensure y0 <= y1 <= y2 <= y3 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-14 12:38:34 -08:00
Rafael Antognolli	eb2e17e2d1	docs: Add Cannonlake support to 18.0 release notes. 17.4 is actually 18.0. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: "18.0" mesa-stable@lists.freedesktop.org Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-14 10:11:05 -08:00
Rafael Antognolli	fcae3d1a9a	anv/gen10: Remove warning message. Gen10 seems pretty stable so far, remove "alpha support" message. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: "18.0" mesa-stable@lists.freedesktop.org Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-14 10:11:01 -08:00
Rafael Antognolli	bf1577fe09	i965/gen10: Remove warning message. Gen10 seems pretty stable so far, so there's no reason to keep this message. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: "18.0" mesa-stable@lists.freedesktop.org Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-14 10:09:41 -08:00
Louis-Francis Ratté-Boulianne	aad14cf15a	egl/x11: Fix leak in dri3_create_image_khr_pixmap bp_reply wasn't properly free'd Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-14 11:52:06 +00:00
Iago Toral Quiroga	cb9dbd6dec	i965/compiler: clean up nir_intrinsic_load_input for vertex shaders This code to re-set the type of the source and destination is not necessary since we never manipulate the types. Looks like a left over from a time where we had to retype to float temporarily to handle 64-bit inputs. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-14 12:00:14 +01:00
Iago Toral Quiroga	4917d38321	intel/compiler: fix first_component for 64-bit types on vertex inputs Divide it by two as we do for other stages. This is because the component layout qualifier is always in 32-bit units. Fixes issues in a new CTS test (still WIP): KHR-GL45.enhanced_layouts.varying_double_components Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-02-14 12:00:14 +01:00
Samuel Pitoiset	ad4b58ea70	ac/nir: rename nir_to_llvm_context to radv_shader_context There is still more to do in that area, but it's a good start. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:16 +01:00
Samuel Pitoiset	141db61509	ac: remove nir_to_llvm_context from ac_nir_translate() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:14 +01:00
Samuel Pitoiset	a541117ff4	ac/nir: remove nir_to_llvm_context::nir link Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:12 +01:00
Samuel Pitoiset	e9f0205ca2	ac: move the outputs array to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:10 +01:00
Samuel Pitoiset	07e4268f36	ac/shader: scan force_persample Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-14 11:53:08 +01:00
Dave Airlie	b9d2ff05a6	r600: fix regression in gl_FragColor drawing This fixes a regression in the broadcast color to all color bufs case. Fixes: `6c691081a` (r600: fixup sparse color exports.) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-14 14:02:41 +10:00
Dave Airlie	9c9a9bee44	r600: fix array spill if temp[0] is before all arrays I found a shader with DCL TEMP[0], LOCAL DCL TEMP[1..256], ARRAY(1), LOCAL DCL TEMP[257..512], ARRAY(2), LOCAL DCL TEMP[513..768], ARRAY(3), LOCAL DCL TEMP[769], LOCAL This would remap badly, as it would add up all the spilled sizes and subtract it from the temp for 0. If the current temp is less than the array start break out. Fixes: `1d871aa6` (r600g: Implement spilling of temp arrays (v2)) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-14 13:37:59 +10:00
Dave Airlie	8f2656c75b	virgl: add ARB_sample_shading support. This enables ARB_sample_shading if the renderer supports it. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-14 13:06:07 +10:00
Dave Airlie	9b95b70719	virgl: add ARB_draw_indirect support. This relies on the renderer code landing first. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-14 13:06:07 +10:00
Roland Scheidegger	f6718baabc	tgsi: Recognize RET in main for tgsi_transform Shaders coming from dx10 state trackers have a RET before the END. And the epilog needs to be placed before the RET (otherwise it will get ignored). Hence figure out if a RET is in main, in this case we'll place the epilog there rather than before the END. (At a closer look, there actually seem to be problems with control flow in general with output redirection, that would need another look. It's enough however to fix draw's aa line emulation in some internal bug - lines tend to be drawn with trivial shaders, moving either a constant color or a vertex color directly to the output). v2: add assert so buggy handling of RET in main is detected Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-14 02:06:54 +01:00
Bas Nieuwenhuizen	7461bd5b8f	ac: Use the renumbered const address space for LLVM 7. The LLVM AMDGPU backend decided to renumber the constant address space .... Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-14 01:05:03 +01:00
Dave Airlie	9ddacd9af4	gallium: drop all the guard band float caps. Nobody queries these and nobody sets them to anything useful, the docs say TODO. Drop them until a use appears. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-14 08:50:08 +10:00
Vadym Shovkoplias	a553c54abf	mesa: add glsl version query (v4) Add support for GL_NUM_SHADING_LANGUAGE_VERSIONS and glGetStringi for GL_SHADING_LANGUAGE_VERSION v2: - Combine similar functionality into _mesa_get_shading_language_version() function. - Change GLSL version return mechanism. v3: - Add return of empty string for GLSL ver 1.10. - Move _mesa_get_shading_language_version() function to src/mesa/main/version.c. v4: - Add OpenGL version check. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104915 Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com> Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 13:24:31 -07:00
Brian Paul	b08d718703	mesa: add missing switch case for EXTRA_VERSION_40 in check_extra() The EXTRA_VERSION_40 predicate is tested as part of extra_gl40_ARB_sample_shading but there was no switch case for it. Fixes: `77b440e42d` ("mesa: Add new functions and enums required by GL_ARB_sample_shading") Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-13 10:35:55 -07:00
Mark Janes	e5809788d6	mesa: fix compile failure Missing header triggered a failure in i965 CI buildtest project. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067 Fixes: `e149a0253c`	2018-02-13 00:22:05 -08:00
Mark Janes	d9de7aaca3	Partially revert "mesa: use GLenum16 in a few more places" This reverts part of commit `ca721b3d89`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067	2018-02-13 00:22:05 -08:00
Mark Janes	3e5758a70a	Revert "mesa: reduce the size of gl_texture_image" This reverts commit `f4ea2b2a9e`. Several members reduced in size by the offending commit are not large enough to store the data needed by the i965 driver. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067	2018-02-13 00:22:05 -08:00
Dave Airlie	db5f422169	i965: fix tessellation regressions with gl_state_index16 Looks like one conversion was missed. Fixes: `e149a0253` (mesa,glsl,nir: reduce gl_state_index size to 2 bytes) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105067 Signed-off-by: Dave Airlie <airlied@redhat.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2018-02-12 23:05:16 -08:00
Stéphane Marchesin	5e4a2b394e	virgl: Support v2 caps struct (v2) This struct allows us to report: - accurate max point size/line width. - accurate texel and texture gather offsets - vertex/geometry limits. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-13 14:23:54 +10:00
Timothy Arceri	10457712ed	ac/nir: add nir_intrinsic_{load,store}_shared support Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-13 14:43:05 +11:00
Timothy Arceri	c787cbfa33	ac/nir_to_llvm: add support for nir_intrinsic_shared_atomic_* Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-13 14:43:05 +11:00
Timothy Arceri	b6cf898ec2	radeonsi: make si_declare_compute_memory() more generic and call for nir Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-13 14:43:05 +11:00
Timothy Arceri	94fa090fad	st/glsl: set req_local_mem earlier for compute shaders Without this change it will never be set for backends using nir. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-13 14:43:05 +11:00
Marek Olšák	6b1e26e181	mesa: move STATE_LENGTH to shader_enums.h and use it everywhere Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	f4ea2b2a9e	mesa: reduce the size of gl_texture_image 80 -> 40 bytes. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	4794fbc86e	mesa: reduce the size of gl_program_parameter 40 -> 24 bytes, which includes the gl_state_index16 change. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	e149a0253c	mesa,glsl,nir: reduce gl_state_index size to 2 bytes Let's use the new gl_state_index16 type everywhere and remove the typecasts. This helps reduce the size of gl_program_parameter. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	a7882013d3	mesa: reduce the size of gl_viewport_attrib All drivers convert these to float, so there is no reason to use double. The piglit test that expects double precision from glGet will be adjusted not to require it (there is a piglit patch). gl_context::ViewportArray: 512 -> 384 bytes Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	d7550d783a	mesa: reduce the size of gl_texture_object Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	65ed98839b	mesa: reduce the size of gl_program gl_program: 1456 -> 976 bytes Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	78f1decc95	mesa: reduce the size of gl_image_unit (v2) gl_context::ImageUnits: 6144 -> 4608 bytes v2: use ASSERT_BITFIELD_SIZE Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	ca5c5d96d8	mesa: further reduce the size of ctx->Texture Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	78043a75f6	mesa: decrease the array size of ctx->Texture.FixedFuncUnit to 8 GL allows doing glTexEnv on 192 texture units, while in reality, only MaxTextureCoordUnits units are used by fixed-func shaders. There is a piglit patch that adjusts piglits/texunits to check only MaxTextureCoordUnits units. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	07c10cc59c	mesa: separate legacy stuff from gl_texture_unit into gl_fixedfunc_texture_unit Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	79aca14f5f	mesa: inline init_texture_unit because this is going to be changed Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Marek Olšák	ca721b3d89	mesa: use GLenum16 in a few more places Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-13 01:00:45 +01:00
Jason Ekstrand	4c77e21c81	anv: Move setting current_pipeline to cmd_state_init We were setting current_pipeline to UINT32_MAX and then calling cmd_cmd_state_reset which memsets the entire state struct to 0 which implicitly resets current_pipeline to 3D. I have no idea how this hasn't caused everything to explode. Fixes: `cd3feea745` "anv/cmd_buffer: Rework anv_cmd_state_reset" cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-02-12 15:18:23 -08:00
Jason Ekstrand	f37bd726c7	anv: Don't resolve or ambiguate non-existent layers The previous code was trying to avoid non-existent layers by taking a MAX with anv_image_aux_layers. Unfortunately, it wasn't taking into account that layer_count starts at base_layer which may not be zero. Instead, we need to subtract base_layer from anv_image_aux_layers with a guard against roll-over. Fixes: `de3be61801` "anv/cmd_buffer: Rework aux tracking" Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-12 15:14:57 -08:00
Daniel Stone	c2c4e5bae3	i965: Fix bugs in intel_from_planar This commit fixes two bugs in intel_from_planar. First, if the planar format was non-NULL but only had a single plane, we were falling through to the planar case. If we had a CCS modifier and plane == 1, we would return NULL instead of the CCS plane. Second, if we did end up in the planar_format == NULL case and the modifier was DRM_FORMAT_MOD_INVALID, we would end up segfaulting in isl_drm_modifier_has_aux. Cc: mesa-stable@lists.freedesktop.org Fixes: `8f6e54c929` Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-12 15:14:45 -08:00
Eric Anholt	1aed66dc1e	radv: Fix compiler warning about uninitialized 'set' The compiler doesn't figure out that we only get result == VK_SUCCESS if set got initialized. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 20:48:47 +00:00
Eric Anholt	21670f8208	glsl/tests: Fix strict aliasing warning about int64/double. Fixes: `4bf9862747` ("glsl/tests: Add UINT64 and INT64 types") Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>	2018-02-12 20:48:43 +00:00
Eric Anholt	091bff8317	ac/nir: Fix compiler warning about uninitialized dw_addr. Even switching the def's condition to be the same chip revision check as the use, the compiler doesn't figure it out. Just NULL-init it. Fixes: `ec53e52742` ("ac/nir: Add ES output to LDS for GFX9.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 20:48:29 +00:00
Eric Anholt	7a83be4b28	gallium/llvmpipe: Fix compiler warnings about ddx/ddy/ddmax. My gcc doesn't figure out that dims >= 1 (seems reasonable), and doesn't notice that ddmax is used from the same no_rho_opt as its initialization. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-12 20:48:18 +00:00
Kenneth Graunke	bd87bd178c	anv: Drop I915_EXEC_CONSTANTS_REL_GENERAL from execbuf. The kernel used to have execbuf parameters to program the INSTPM bit for whether 3DSTATE_CONSTANT_* should be relative to dynamic state base address or an absolute address. However, they never worked in the presence of hardware contexts, so I deleted them a while back. It doesn't make sense to set this flag, as it doesn't exist anymore. It also never did anything anyway - the flag is zero, so \|'ing it in did nothing. The default is relative anyway. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-12 07:00:41 -08:00
Eric Engestrom	111d4bf1d0	r200: remove left over dead code `0aaa27f291` removed the references to this array without removing the array itself Cc: Ian Romanick <ian.d.romanick@intel.com> Fixes: `0aaa27f291` "mesa: Pass the translated color logic op dd_function_table::LogicOpcode" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-02-12 11:19:44 +00:00
Samuel Pitoiset	f4e85ba93f	ac/nir: remove backlink to nir_to_llvm_context Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:39 +01:00
Samuel Pitoiset	be5f6eb13e	ac/nir: remove nir_to_llvm_context::module Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:36 +01:00
Samuel Pitoiset	90a815ddeb	ac/nir: remove nir_to_llvm_context::builder Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:34 +01:00
Samuel Pitoiset	759acfa180	ac/nir: drop nir_to_llvm_context from glsl_to_llvm_type() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:31 +01:00
Samuel Pitoiset	e7373a6498	ac/nir: drop nir_to_llvm_context from visit_var_atomic() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:29 +01:00
Samuel Pitoiset	485346b05a	ac/nir: drop nir_to_llvm_context from visit_vulkan_resource_reindex() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:27 +01:00
Samuel Pitoiset	cd6dfacda9	ac/nir: drop nir_to_llvm_context from visit_load_push_constant() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:25 +01:00
Samuel Pitoiset	5c9e398c83	ac/nir: drop nir_to_llvm_context from cast_ptr() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:23 +01:00
Samuel Pitoiset	5ef5944848	ac/nir: drop nir_to_llvm_context from visit_load_local_invocation_index() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:21 +01:00
Samuel Pitoiset	da8b0b8264	ac/nir: drop nir_to_llvm_context from emit_f2f16() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:19 +01:00
Samuel Pitoiset	e32f374944	ac: remove unused parameters in abi::load_tess_coord() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:17 +01:00
Samuel Pitoiset	1e69db003d	ac/nir: remove useless bitcast in load_tess_coord() nir_intrinsic_load_tess_coord always returns a v3i32. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:15 +01:00
Samuel Pitoiset	ed179fbdf3	ac: add load_resource() to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:13 +01:00
Samuel Pitoiset	ecf229706f	ac: add load_sample_mask_in() to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:11 +01:00
Samuel Pitoiset	0f48eeea05	ac: move view_index to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:09 +01:00
Samuel Pitoiset	0efbede949	ac: move push_constants to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:07 +01:00
Samuel Pitoiset	460d3ce726	ac: move tg_size to the ABI Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:04 +01:00
Samuel Pitoiset	054c92190c	ac/nir: remove unused nir_to_llvm_context:{defs,phis} Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-12 11:54:02 +01:00
Eric Anholt	0b97eb02b0	egl/gbm: Fix compiler warning about visual matching. The compiler doesn't know that num_visuals > 0. Fixes: `37a8d907cc` ("egl/gbm: Ensure EGLConfigs match GBM surface format") Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-12 09:16:44 +00:00
Rob Clark	831fb29252	freedreno: small fix for flushing dependent batches Flush a resource's previous write_batch synchronously. Because a resource's associated batches are not updated until after the flush thread submits rendering to the kernel, this was causing a bit of confusion in the following loop. This fixes a bug that appeared with recent stk. Perhaps we need to re-work things a bit to clear out dependent patches in the ctx's thread and use a fence to deal with the period between when a flush is queued and when it is submitted to the kernel. But this will do until time permits a larger refactor. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	c57ed8e01c	freedreno/ir3: intra-block scheduling Because of loops, we can't schedule all of a block's predecessors first. Instead just assume that the result consumed in a block was written far enough away in all paths into a block. And do an intra-block scheduling pass to figure out if there are any cases where we need to insert extra nop's. This works out better than always assuming the worst case (ie. that a value live into a block was written in the last instruction in the predecessor block). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	2a2099a875	freedreno/ir3: "boost" the depth of if/else condition Account for the move to predicate register, to try to avoid needing to insert extra NOPs later. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	ffb00f6841	freedreno/ir3: account for arrays in delayslot calc Normally false-deps are not something to consider, since they mostly exist for delay-slot related reasons: * barriers * ordering writes after read * SSBO/image access ordering The exception is a false-dependency on an array store. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	f54d2b4f10	freedreno/ir3: more clever legalize algorithm Previously we didn't handle flow control in legalize, and instead just set (ss)(sy) on the first instruction in every block. Which isn't very clever. Instead, consider output state of all predecessor blocks, so we only set a sync bit if needed for any possible path leading into a block. Because of loops, we can't require that all successor blocks are legalized before a given block, so instead run in a loop until results converge. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	015afb6a38	freedreno/ir3: track block predecessors Useful in the following patches. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	76440fcca9	freedreno/ir3: clean up dangling false-dep's Maybe there is a better way for this.. where it comes useful is "array" loads, which end up as a false-dep for a later array store. If all the uses of an array load are CP'd into their consumer, it still leaves the dangling array load, leading to funny things like: mov.u32u32 r5.y, r0.y mov.u32u32 r5.y, r0.z Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	aea223741f	freedreno/ir3: handle IMMED for mad 2nd src special case Consider also immediates for swapping the first two srcs, because they can be lowered to constant. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	242a8a1957	freedreno/ir3: remove ir3 phi instruction Now that we convert phi webs to ssa, we can drop all this. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	a7b569d60c	freedreno/ir3: remove lower_if_else pass Now that it is unused. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	268ab05484	freedreno/ir3: add experimental GCM pass Generally seems to do worse on instruction count and register usage, according to shader-db. But shader-db also doesn't do a very good job of weighting loop bodies, so that might not be totally valid. So add an env variable to enable GCM pass for easier experimentation. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	4c15c53d91	freedreno/ir3: change opt passes There are more useful nir passes added since initial conversion to nir. But ir3 was never updated to use them. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	ec8bc54ad2	freedreno/ir3: use peephole select pass Aggressively lowering all if/else to selects in some extreme cases results in much higher register pressure. Using peephole select instead with a modest threshold speeds up alu2 4x! 16 seems like a good limit, low enough to help alu2 but not so low that it penalizes everything else. With a bit better scheduling of the instruction that moves a value into a predicate register, we might be able to lower this limit a bit more in the future, but since we need 6 cycles from the move to predicate register to predicated branch, that puts some sort of lower bound on how far we can lower this threshold. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	a7ea2b4eba	freedreno/ir3: lower phi webs to regs nir's from_ssa pass is much better at avoiding inserting extra moves than our logic is. And lowering phi webs to regs just treats anything involved in a phi web as an array of length=1. Which with previous array related fixes in RA/etc ends up working out quite well. This cuts down on extra instructions and also helps with register pressure. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	0a6ddf964f	freedreno/ir3: separate arrays from groups Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	55f14a1ac4	freedreno/ir3: make block/instruction serialno per-shader Makes it easier to compare values seen in-game (where there are many shaders) to cmdline standalone compiler. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	5a7de94392	freedreno/ir3: add spirv support to cmdline compiler Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	942341bcd0	freedreno/ir3: don't lower fsat Instead, if possible fold (sat) flag into src, otherwise use: (sat)max.f rD, rS, rS Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	b2fc94f074	freedreno/ir3: add encoding/decoding for (sat) bit Seems to be there since a3xx, but we always lowered fsat. But we can shave some instructions, especially in shaders that use lots of clamp(foo, 0.0, 1.0) by not lowering fsat. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	1b658533e1	freedreno/ir3: extend liverange of arrays Use livein state of other blocks to extend liverange of arrays when they are still needed by successor blocks. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	ac459a6f7f	freedreno/ir3: avoid extra mov's for "arrays" Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	2bc3fb6992	freedreno/ir3: a couple more array fixes (Plus a couple TODOs) Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	8ea1ef4191	freedreno/ir3: keep array stores Since these are not in SSA form, add to block's keeps so it doesn't appear unused. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	c60f150d56	freedreno/ir3: propagate barrier information When eliminating movs, the instruction that is now directly using the src of the mov has the same scheduling order constraints as the original mov instruction. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	98702c1010	freedreno/ir3: remove pointless statement Function ends after this if/else ladder, so it was pointless. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	930ca0e038	freedreno/ir3: some more debug prints Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	a84e324847	freedreno/ir3: fix printing of relative branch offsets The number of bits depends on generation. But printing negative values with a5xx encoding (largest size) but compiling for a3xx or a4xx, would result in negative values printed as large positive values. I guess in practice huge negative branch offsets aren't likely (and if that is the case, the shader is probably too big to grok by reading the assembly). So just print using smallest bitfield size. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	a5c28fe07b	freedreno/ir3: be more clever with if/else jumps Try to clean up things like: br !p0.x #2 br p0.x #something to eliminate the first branch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	44dd7dcd2f	freedreno/ir3: avoid some spurious sync bits Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	069c0ac625	freedreno/ir3: print # of sync bits for shaderdb When trying to optimize to reduce stalls, it is nice to see this info. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Rob Clark	7d45e2e39f	freedreno: add debug trace for flush Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-02-10 14:54:58 -05:00
Grazvydas Ignotas	9b9a89cd79	intel/compiler: fix 64bit value prints on 32bit Fix the following: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t {aka long long unsigned int}’. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-02-10 17:59:02 +02:00
Timothy Arceri	ff0e3fa1fe	st/glsl_to_nir: remove unused options variable	2018-02-10 11:06:55 +11:00
Timothy Arceri	8f378c116e	st/radeonsi: enable disk cache for nir Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	bc9d9f9b86	st: add nir shader disk cache support v2: include compute shader support Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	97efdc0d57	st/glsl_to_tgsi: move nir detection earlier We move the nir check before the shader cache call so that we can call a nir based caching function in a following patch. Also with this change we simply check if vertex shaders support NIR rather than looping over the stages as mixing of shader types is not supported anyway. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	b5e23887fe	radeonsi: stop returning PIPE_SHADER_IR_NATIVE for PIPE_SHADER_CAP_PREFERRED_IR Clover now checks PIPE_SHADER_CAP_SUPPORTED_IRS for native support instead. This change indirectly enables NIR support for compute shaders on radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	73f1d6f0c1	r600: always return PIPE_SHADER_IR_TGSI for PIPE_SHADER_CAP_PREFERRED_IR We now use PIPE_SHADER_CAP_SUPPORTED_IRS to check for native support in clover. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	51f484bb44	clover: use PIPE_SHADER_CAP_SUPPORTED_IRS to discover IR PIPE_SHADER_CAP_PREFERRED_IR was conflicting with PIPE_SHADER_IR_NIR for compute shaders, so we let clover pick the one it wants to use. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	3af4f34e61	r600: add PIPE_SHADER_IR_NATIVE to supported shaders for cs Acked-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-10 10:59:10 +11:00
Timothy Arceri	ce836487b8	radeonsi/nir: add depth layout to scan pass Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Timothy Arceri	6a8efbe652	radeonsi/nir: add FRAG_RESULT_COLOR to scan pass Fixes a number of draw buffers piglit tests. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Timothy Arceri	ef8082baf8	ac: convert nir_op_f2f32 src to a float Fixes the following piglit test: ./bin/arb_vertex_attrib_64bit-check-explicit-location -auto -fbo Where we would end up with the nir such as: vec1 64 ssa_11 = pack_64_2x32_split ssa_9, ssa_10 vec1 32 ssa_12 = f2f32 ssa_2 And our pack_64_2x32_split nir to llvm code always produces a 64bit integer as output. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Timothy Arceri	1b1e5f8edf	ac: fix some 64bit unpack asserts Previously the asserts did not take swizzles into account. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-10 10:46:28 +11:00
Mark Janes	9a05c66feb	Revert "i965: prevent potentially null pointer access" This reverts commit `712332ed54`, which caused over 90k failures in Mesa i965 CI. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-09 09:46:07 -08:00
Daniel Stone	37a8d907cc	egl/gbm: Ensure EGLConfigs match GBM surface format When we create an EGL window surface on a GBM surface, ensure that the EGLConfig is compatible with the GBM format, notwithstanding XRGB/ARGB interchange. For example, rendering with an XRGB8888 EGLConfig on to an ARGB8888 gbm_surface (and vice-versa) is acceptable, but rendering with an XRGB2101010 EGLConfig on to an XRGB8888 gbm_surface will now be rejected. This was previously allowed through; when 10bpc formats were enabled, clients which picked a completely random EGL config and hoped/assumed they were XRGB8888 would break. If you have bisected a failure to start a GBM/KMS client to this commit, please look at its EGLConfig selection (e.g. through eglChooseConfig), and add an EGL_NATIVE_VISUAL_ID == gbm_surface format match to the attribs for config selection. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	8174e5b49e	egl/gbm: Remove duplicate format table Now that we have mask/channel information in gbm_dri's format conversion table, we can remove the copy in EGL. As this table contains more formats (notably including R8 and RG8, which can be used for BO but not surface allocation), we now compare the masks of all channels when trying to find a suitable config. Without doing this, an XRGB8888 EGLConfig would match on an R8 format. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	314714ac53	gbm/dri: Expose visuals table through gbm_dri_device Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	2ed344645d	gbm/dri: Add RGBA masks to GBM format table Eventually, we can replace the visuals list inside GBM EGL driver with this one. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	4732094cff	egl/wayland: Use an array for modifiers Each Wayland EGLDisplay currently contains a struct with one vector of modifiers per format, hardcoded in the header. To allow easier support for more formats, turn this into an array of u_vectors which is opaque outside of platform_wayland.c. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	5bc49d4cbf	egl/wayland: Remove has_format enum Instead of the has_format enum, use an index into the visual array. This makes adding new formats less typing. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	d32b23f383	egl/wayland: Add bpp to visual map Both the DRI2 GetBuffersWithFormat interface, and SHM buffer allocation, had their own format -> bpp lookup tables. Replace these with a lookup into the visual map. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	4de98a9c07	egl/wayland: Use visual map for DRIImage<->FourCC map When trying to translate between DRIImage format enums and FourCC codes, use our visual map rather than an open-coded subset. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	68a80c11bd	egl/wayland: Use visual map for format advertisement Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	3323ce72ff	egl/wayland: Use visual map for buffer_from_image When creating a wl_buffer on an upstream Wayland display from an existing EGLImage, use the dri2_wl_visual map rather than another hardcoded list of formats. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:16 +00:00
Daniel Stone	a9cc4edb60	egl/wayland: Use visual map for config->format lookup Having hoisted the format -> config map into common code, we now use it for config -> format lookups. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:15 +00:00
Daniel Stone	1dc013f1ee	egl/wayland: Add format enums to visual map Extend the visual map from only containing names and bitmasks, to also carrying the three format enums we need. These are the DRIImage format tokens for internal allocation, FourCC codes for wl_drm and dmabuf protocol, and wl_shm codes for swrast drivers. We will later use these formats to eliminate a bunch of open-coded conversions. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:15 +00:00
Daniel Stone	66912641df	egl/wayland: Use proper enum type in visual definition No semantic change. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:15 +00:00
Daniel Stone	845c2f6156	egl/wayland: Widen channel masks to bpp Widen the channel masks given in the visual table to the full width of the pixel format, i.e. as many leading zeros as required. No functional change. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:15 +00:00
Daniel Stone	19cbca38e4	egl/wayland: Hoist format <-> EGLConfig definition up Pull the mapping between Wayland formats and EGLConfigs up to the top level, so we can reuse it elsewhere. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:15 +00:00
Daniel Stone	4fbd2d50b1	egl/wayland: Fix ARGB/XRGB transposition in config map When `0b2b719121` moved from an if tree to a struct to map between wl_drm formats and EGLConfigs, it transposed the mapping between XRGB and ARGB. Luckily, everyone exposes both formats, so this is harmless. Signed-off-by: Daniel Stone <daniels@collabora.com> Fixes: `0b2b719121` ("egl/wayland: introduce dri2_wl_add_configs_for_visuals() helper") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-02-09 16:17:06 +00:00
Marek Olšák	76085f2048	st/mesa: generate blend state according to the number of enabled color buffers Non-MRT cases always translate blend state for 1 color buffer only. MRT cases only check and translate blend state for enabled color buffers. This also avoids an assertion failure in translate_blend for: dEQP-GLES31.functional.draw_buffers_indexed.overwrite_common.common_advanced_blend_eq_buffer_blend_eq Reviewed-by: Eric Anholt <eric@anholt.net>	2018-02-09 15:52:22 +01:00
Marek Olšák	c446dd7927	st/mesa: don't translate blend state when color writes are disabled Reviewed-by: Eric Anholt <eric@anholt.net>	2018-02-09 15:52:22 +01:00
Marek Olšák	3d06c8afb5	st/mesa: don't translate blend state when it's disabled for a colorbuffer Reviewed-by: Eric Anholt <eric@anholt.net>	2018-02-09 15:52:22 +01:00
Lionel Landwerlin	712332ed54	i965: prevent potentially null pointer access Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> CID: 1418110	2018-02-09 14:02:59 +00:00
Mark Thompson	5db29d62ce	st/va: Make the vendor string more descriptive Include the Mesa version and detail about the platform. Signed-off-by: Mark Thompson <sw@jkqxz.net> Reviewed-by: Christian König <christian.koenig@amd.com>	2018-02-09 13:37:43 +01:00
Mark Thompson	768f1487b0	st/va: Enable vaExportSurfaceHandle() It is present from libva 2.1 (VAAPI 1.1.0 or higher). Signed-off-by: Mark Thompson <sw@jkqxz.net> Reviewed-by: Christian König <christian.koenig@amd.com>	2018-02-09 13:37:36 +01:00
Tapani Pälli	41c5bf3836	disk cache: move path creation back to constructor This patch moves disk cache path and index creation back to the constructor which matches previous behavior. We still allow create to succeed without path so that cache can be used with callback functionality. Fixes: c95d3ed091 "disk cache: create cache even if path creation fails" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-09 11:33:25 +02:00
Samuel Pitoiset	3a2bb4db23	ac/nir: compute correct number of user SGPRs on GFX9 For merged shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 10:16:04 +01:00
Michel Dänzer	171076f082	st/mesa: Initialize tex_target in compile_tgsi_instruction Initialize to TGSI_TEXTURE_BUFFER (== 0), same as was done before the variable type was changed to enum tgsi_texture_type. Fixes a bunch of piglit failures with radeonsi, e.g.: gles-3.0-transform-feedback-uniform-buffer-object: ../../../../src/gallium/auxiliary/tgsi/tgsi_util.c:502: tgsi_util_get_texture_coord_dim: Assertion `!"unknown texture target"' failed. Corresponding compiler warning: CXX state_tracker/st_glsl_to_tgsi.lo ../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp: In function ‘pipe_error st_translate_program(gl_context, uint, ureg_program, glsl_to_tgsi_visitor, const gl_program, GLuint, const ubyte, const ubyte, const ubyte, const ubyte, const ubyte, GLuint, const ubyte, const ubyte, const ubyte)’: ../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5992:23: warning: ‘tex_target’ may be used uninitialized in this function [-Wmaybe-uninitialized] ureg_memory_insn(ureg, inst->op, dst, num_dst, src, num_src, ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ inst->buffer_access, ~~~~~~~~~~~~~~~~~~~~ tex_target, inst->image_format); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5866:27: note: ‘tex_target’ was declared here enum tgsi_texture_type tex_target; ^~~~~~~~~~ Fixes: `9f9ce1625f` ("st/mesa: use TGSI enum types in st_glsl_to_tgsi.cpp") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:26:40 +01:00
Alejandro Piñeiro	f32b01ca43	glsl/linker: remove ubo explicit binding handling This is already handled at link_uniform_blocks, specifically at process_block_array_leaf. Additionally, this code was not correctly handling arrays of arrays. When creating the name of the block to set the binding, it only took into account the first level, so any attempt to set an explicit binding on an array of arrays ubo would trigger an assertion. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-09 08:32:42 +01:00
Mathias Fröhlich	77cb2fc0bd	mesa: Only update enabled VAO gl_vertex_array entries. Instead of updating all modified gl_vertex_array_object::_VertexArray entries just update those that are modified and enabled. Also release buffer object from the _VertexArray that belong to disabled attributes. v2: Also set Ptr and Size to zero. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-09 04:26:23 +01:00
Mathias Fröhlich	437cae411e	gallium: Mute arrays for several meta like callbacks. Set the _DrawArray pointer to NULL when calling into the Drivers Bitmap/CopyPixels/DrawAtlasBitmaps/DrawPixels/DrawTex hooks. This fixes an assert that gets uncovered when the following patch gets applied. v2: Mute from within the state tracker instead of generic mesa. v3: Avoid evaluating _DrawArrays from within st_validate_state. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 04:26:13 +01:00
Mathias Fröhlich	2f9eb0aad5	mesa: Fix VAO buffer object tracking. When changing the attribute binding in the VAO we also need to account for getting rid of non vbo bits from VertexAttribBufferMask. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-09 04:21:36 +01:00
Timothy Arceri	d8bca3809d	radeonsi/nir: gather some missing fs info Fixes some early-z arb_shader_image_load_store piglit tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 12:51:27 +11:00
Timothy Arceri	c77078c942	ac: pass struct ac_llvm_context to emit_membar() Fixes segfault in piglit test: ./bin/arb_shader_image_load_store-shader-mem-barrier --quick -auto -fbo Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 12:51:27 +11:00
Marek Olšák	12fd567c78	radeonsi: copy the NIR enablement debug bit to the shader cache flags When NIR is enabled, TGSI must not be used. When NIR is disabled, TGSI Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-09 02:01:45 +01:00
Jason Ekstrand	8f20cf166e	intel/blorp: Use isl_aux_op instead of blorp_hiz_op Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	1e941a0528	intel/blorp: Use isl_aux_op instead of blorp_fast_clear_op Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	1810f965c8	anv: Allow fast-clearing the first slice of a multi-slice image Now that we're tracking aux properly per-slice, we can enable this for applications which actually care. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	de3be61801	anv/cmd_buffer: Rework aux tracking This commit completely reworks aux tracking. This includes a number of somewhat distinct changes: 1) Since we are no longer fast-clearing multiple slices, we only need to track one fast clear color and one fast clear type. 2) We store two bits for fast clear instead of one to let us distinguish between zero and non-zero fast clear colors. This is needed so that we can do full resolves when transitioning to PRESENT_SRC_KHR with gen9 CCS images where we allow zero clear values in all sorts of places we wouldn't normally. 3) We now track compression state as a boolean separate from fast clear type and this is tracked on a per-slice granularity. The previous scheme had some issues when it came to individual slices of a multi-LOD image. In particular, we only tracked "needs resolve" per-LOD but you could do a vkCmdPipelineBarrier that would only resolve a portion of the image and would set "needs resolve" to false anyway. Also, any transition from an undefined layout would reset the clear color for the entire LOD regardless of whether or not there was some clear color on some other slice. As far as full/partial resolves go, the assumptions of the previous scheme held because the one case where we do need a full resolve when CCS_E is enabled is for window-system images. Since we only ever allowed X-tiled window-system images, CCS was entirely disabled on gen9+ and we never got CCS_E. With the advent of Y-tiled window-system buffers, we now need to properly support doing a full resolve of images marked CCS_E. v2 (Jason Ekstrand): - Fix a bug in the compressed flag offset calculation - Treat 3D images as multi-slice for the purposes of resolve tracking v3 (Jason Ekstrand): - Set the compressed flag whenever we fast-clear - Simplify the resolve predicate computation logic Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	2cbfcb205e	anv/cmd_buffer: Move the mi_alu helper higher up Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	2e69045c4d	anv/image: Simplify some verbose comments Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	f0523f70ef	anv: Use blorp_ccs_ambiguate instead of fast-clears Even though the blorp pass looks a bit on the sketchy side, the end result in the Vulkan driver is very nice. Instead of having this weird case where you do a fast clear and then maybe have to resolve, we just do the ambiguate and are done with it. The ambiguate does exactly what we want of setting all the CCS values to 0 which puts it into the pass-through state. This should also improve performance a bit in certain cases. For instance, if we did a transition from UNDEFINED to GENERAL for a surface that doesn't have CCS enabled all the time, we would end up doing a fast-clear and then a full resolve which ends up touching every byte in the main surface as well as the CCS. With the ambiguate pass, that transition only touches the CCS. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	84fd2ebfbc	anv/cmd_buffer: Re-arrange the logic around UNDEFINED fast-clears Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	3ef8c4b2f5	anv/cmd_buffer: Pull the undefined layout condition into the if Now that this isn't a multi-case if and it's just the one case, it's a bit clearer if the condition is just part of the if instead of being pulled out into a boolean variable. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	857b5b5a7f	intel/blorp: Add a CCS ambiguation pass This pass performs an "ambiguate" operation on a CCS-compressed surface by manually writing zeros into the CCS. On gen8+, ISL gives us a fairly detailed notion of how the CCS is laid out so this is fairly simple to do. On gen7, the CCS tiling is quite crazy but that isn't an issue because we can only do CCS on single-slice images so we can just blast over the entire CCS buffer if we want to. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	13b621d6fd	anv: Only fast clear single-slice images The current strategy we use for managing resolves has an issue where we track clear colors and the need for resolves per-LOD but we still allow resolves of only a subset of the slices in any given LOD and doing so sets the "needs resolve" flag for that LOD to false while leaving the remaining layers unresolved. This patch is only the first step and does not, by itself, fix anything. However, it's fairly self-contained and splitting it out means any performance regressions should bisect to this nice obvious commit rather than to the giant "rework aux tracking" commit. Nanley and I did some testing and none of the applications we tested even tried to fast-clear anything other than the first slice of an image. The test was done by adding a printf right before we call blorp_fast_clear if we were ever going to touch any slice other than the first with a fast-clear. Due to the way the original code was structured, this would not have included applications which only cleared a subset of layers. The applications tested were: * All Sascha Willems demos * Aztec Ruins * Dota 2 * The Talos Principle * Mad Max * Warhammer 40,000: Dawn of War III * Serious Sam Fusion 2017: BFE While not the full list of shipping applications, it's a pretty good spread and covers most of the engines we've seen running on our driver. If this is ever shown to be a performance problem in the future, we can reconsider our strategy. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	571ed588ac	anv/cmd_buffer: Add a mark_image_written helper Currently, this helper does nothing but we call it every place where an image is written through the render pipeline. This will allow us to properly mark the aux state so that we can handle resolves correctly. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	9876d6f0ef	anv/blorp: Add src/dst_level helper variables in CmdCopyImage Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	c180c2c868	anv/cmd_buffer: Add an anv_genX_call macro This is copied and pasted from the similar macro we added to ISL. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	ab7543b13d	anv/cmd_buffer: Generalize transition_color_buffer This moves it to being based on layout_to_aux_usage instead of being hard-coded based on bits of a priori knowledge of how transitions interact with layouts. This conceptually simplifies things because we're now using layout_to_aux_usage and layout_supports_fast_clear to make resolve decisions so changes to those functions will do what one expects. There is a potential bug with window system integration on gen9+ where we wouldn't do a resolve when transitioning to the PRESENT_SRC layout because we just assume that everything that handles CCS_E can handle it all the time. When handing a CCS_E image off to the window system, we may need to do a full resolve if the window system does not support the CCS_E modifier. The only reason why this hasn't been a problem yet is because we don't support modifiers in Vulkan WSI and so we always get X tiling which implies no CCS on gen9+. This patch doesn't actually fix that bug yet but it takes us the first step in that direction by making us actually pick the correct resolve op. In order to handle all of the cases, we need more detailed aux tracking. v2 (Jason Ekstrand): - Make a few more things const - Use the anv_fast_clear_support enum v3 (Jason Ekstrand): - Move an assert and add a better comment Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	151771b390	anv/cmd_buffer: Recurse in transition_color_buffer instead of falling through Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	bea7373c92	anv/image: Support color aspects in layout_to_aux_usage Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	b09464db42	anv/image: Add a helper for determining when fast clears are supported v2 (Jason Ekstrand): - Return an enum instead of a boolean v3 (Jason Ekstrand): - Return ANV_FAST_CLEAR_NONE instead of false (Topi) - Rename ANV_FAST_CLEAR_ANY to ANV_FAST_CLEAR_DEFAULT_VALUE - Add documentation for the enum values v4 (Jason Ekstrand): - Remove a dead comment Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	1f7eee6bc1	anv/image: Update a comment This got lost in all of the aspect vs. plane rebasing of YCBCR. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	5c38ab8f07	anv/blorp: Rework HiZ ops to look like MCS and CCS Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	1d473e26f2	anv/blorp: Support ISL_AUX_USAGE_HIZ in surf_for_anv_image If the function gets passed ANV_AUX_USAGE_DEFAULT, it still has the old behavior of setting ISL_AUX_USAGE_NONE for depth/stencil which is what we want for blits/copies. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	42f1668a54	anv/blorp: Rework image clear/resolve helpers This replaces image_fast_clear and ccs_resolve with two new helpers that simply perform an isl_aux_op whatever that may be on CCS or MCS. This is a bit cleaner as it separates performing the aux operation from which blorp helper we have to call to do it. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Jason Ekstrand	482c24783e	intel/isl: Codify AUX operations in an enum Right now, we have different entrypoints and enums in blorp for these different operations. This provides us a central enum which we can begin to transition to. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2018-02-08 16:35:31 -08:00
Gert Wollny	c36172e387	r600/sb: Check whether optimizations would result in reladdr conflict v2: * Check whether the node src and dst registers are NULL before using them. * fix a typo in the commit message. Two cases are handled with this patch: 1. If copy propagation tries to eliminate a move from a relative array access then it could optimize MOV R1, ARRAY[RELADDR_1] MOV R2, ARRAY[RELADDR_2] OP2 R3, R1 R2 into OP2 R3, ARRAY[RELADDR_1], ARRAY[RELADDR_2] which is forbidden, because there is only one address register available. 2. When MULADD(x,a,MUL(x,c)) is handled MUL TMP, R1, ARRAY[RELADDR_1] MULADD R3, R1, ARRAY[RELADDR_2], TMP by folding this into ADD TMP, ARRAY[RELADDR_2], ARRAY[RELADDR_1] MUL R3, R1, TMP which is also forbidden. Test for these cases and reject the optimization if a forbidden combination of relative access would be created. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103142 Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 10:00:38 +10:00
Glenn Kennard	1d871aa626	r600g: Implement spilling of temp arrays (v2) Pessimistically spills arrays if GPR limit is exceeded. v2: fix r600 support [airlied] Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:26 +10:00
Dave Airlie	22fc5eff80	r600/sb: handle scratch mem reads on r600 On r600 we use the scratch mem with read/read_ind, in that case sb should track the rw_gpr as a dst instead of a src. This stops the whole shader being optimised out. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:21 +10:00
Glenn Kennard	cd34deb585	r600g/sb: Add dependency tracking for scratch ops Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:19 +10:00
Glenn Kennard	a100d906b2	r600g/sb: Support scratch ops Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:16 +10:00
Glenn Kennard	6b4303f358	r600g: Implement scratch buffer state management (v2) v2: add Glenn's fixes Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:12 +10:00
Glenn Kennard	9d31596d7a	r600g: Add pending output function Spills have to happen after the VLIW bundle currently processed, so defer emitting the spill op. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:53:08 +10:00
Glenn Kennard	9c48a139b0	r600g: Support emitting scratch ops Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:52:48 +10:00
Dave Airlie	2a891ed190	r600: fix texture gather swizzling. This fixes: KHR-GL45.texture_gather.swizzle on cayman and redwood. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-09 09:32:20 +10:00
Timothy Arceri	12a2350e6d	ac: add 64bit support to ac_find_lsb() v2: use LLVMBuildTrunc() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:42:59 +11:00
Timothy Arceri	a9f6b392c7	ac: move get_elem_bits() to ac_llvm_build.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:42:59 +11:00
Timothy Arceri	19f9839f0b	ac: add 64bit bitCount support v2: use LLVMBuildTrunc() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-09 09:42:59 +11:00
Samuel Pitoiset	bb750d265c	ac/nir: clean up handle_fs_outputs_post() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:33 +01:00
Samuel Pitoiset	528bc14fa5	ac/nir: add radv_load_output() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:30 +01:00
Samuel Pitoiset	834d9845ca	ac/shader: scan info about output PS declarations NIR->LLVM should only be a translation pass, and all scan stuff should be done before. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:27 +01:00
Samuel Pitoiset	a8e04e91de	ac/nir: add radv_export_param() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:26 +01:00
Samuel Pitoiset	e3cfd6b805	ac/nir: remove set but unused export_mask Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:24 +01:00
Samuel Pitoiset	724136d590	ac/nir: remove dead code in handle_vs_outputs_post() The memcpy can't be reached because the condition is always false. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:22 +01:00
Samuel Pitoiset	c63d8d0284	ac/nir: remove useless check in si_llvm_init_export_args() values can't be NULL because we use ac_build_export_null() now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:14:20 +01:00
Samuel Pitoiset	26ab5a4269	ac/nir: use ac_build_export_null() The number of enabled channels should be 0 when exporting null. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 22:11:44 +01:00
Samuel Pitoiset	bd9f7b7635	ac: add ac_build_export_null() helper Imported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-08 22:11:42 +01:00
Scott D Phillips	1f4d2433e7	meson: Add build option for tools Add a build option to control building some of the misc tools we have. Also set the executables to install, presumably you want that if you're asking for the build. v2: set 'install:' to the with_tools value, not true (Jordan) handle 'all' in the comma list (Dylan) Add freedreno's tools (Dylan) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-08 11:24:42 -08:00
Anuj Phogat	464d057c86	intel: Add Coffee Lake brand strings Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-02-08 10:26:34 -08:00
Brian Paul	11e92889aa	gallium/util: silence clang warning in blitter code Silence "warning: comparison of constant 4294967295 with expression of type 'ubyte'". Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-08 10:27:31 -07:00
Brian Paul	4b0a45da25	tgsi: s/unsigned/enum tgsi_semantic/ in ureg_DECL_output() So the function matches the prototype. Found with clang. v2: fix copy&paste error Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-08 10:27:19 -07:00
Brian Paul	d95c2d86cc	tgsi: use TGSI_INTERPOLATE_x arguments instead of zeros in ureg code TGSI_INTERPOLATE_CONSTANT and TGSI_INTERPOLATE_LOC_CENTER have the value zero so there's no change in behavior. It seems funny to declare these fs input registers with constant interpolation. But it looks like ureg_DECL_input_layout() is not called anywhere and ureg_DECL_input() is only called from util_make_geometry_passthrough_shader(). Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	26948ba761	gallium/util: s/uint/enum tgsi_semantic/ in simple shader code Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	0f40f4ffda	tgsi: s/unsigned/enum pipe_shader_type/ in ureg code And add a default switch case to silence a compiler warning. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	c0dc337ecd	gallium/util: s/uint/enum tgsi_semantic/ in u_blitter.c And put static qualifier on const arrays. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	e55de6e20c	st/mesa: s/unsigned/enum tgsi_semantic/ st_cb_drawpixels.c Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	b9ff185e41	vbo: add a comment on vbo_draw_transform_feedback() Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	93b3d38176	gallium/util: trivial whitespace/formatting fixes in u_blit.c Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	5396f8546a	vbo: improve comments on vbo_draw_func() And rename a parameter name. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	b03ade55b9	cso: add a couple sanity check assertions in cso_draw_vbo() Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Brian Paul	5cf342704d	st/mesa: rename some vars related to indirect draw count 'indirect_params' was a bit vague. Use the names that we use in gallium's pipe_draw_indirect_info. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-08 09:49:03 -07:00
Marek Olšák	d9e6e0bbe3	st/mesa: remove out_num_textures from update_textures Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-08 16:14:11 +01:00
Marek Olšák	08496c5d52	st/mesa: don't store non-fragment sampler states and views in st_context those are unused. st_context: 10120 -> 3704 bytes Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-02-08 16:14:11 +01:00
Lionel Landwerlin	e843667733	i965: perf: cleanup detection of kernel support for loadable configs The initial revision of the patch adding loadable configs was testing the feature's availability by adding a new config successfully and then removing it. A second version tested the availability just by exercising the removal. But some unused code remained. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-08 10:52:14 +00:00
Lionel Landwerlin	bd6c0cab60	i965: perf: use drmIoctl() instead of ioctl() ioctl() might be interrupted, use drmIoctl() instead as it'll retry automatically. Fixes: `27ee83eaf7` "i965: perf: add support for userspace configurations" Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2018-02-08 10:51:40 +00:00
Lionel Landwerlin	0f952b778f	i965: perf: add debug messages for loaded configs This helps figuring out potential problems when metrics don't show up on frameretrace for example. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-08 10:51:01 +00:00
Dave Airlie	3f7a7bd897	r600: implement tg4 integer workaround. (v2) This ports the texture gather integer workaround from radeonsi. This fixes: KHR-GL45.texture_gather.plain-gather-uint/int* v2: add rect support, fix 2d array shadow Reviewed-by: Roland Scheidegger <sroland@vmware.com> (on irc) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-08 16:21:40 +10:00
Glenn Kennard	77b1b33724	r600: clean up initial shader register setup This is taken from Glenn Kennard's scratch series, but separated out as a cleanup by me. Reviewed-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-08 16:21:35 +10:00
Roland Scheidegger	b936f4d1ca	r600: partly fix sampleMaskIn value The hw gives us coverage for pixel, not for individual fragment shader invocations, in case execution isn't per pixel (eg, unlike cm, actually cannot do "real" minSampleShading, it's either per-pixel or per-fragment, but it doesn't really make a difference here). Also, with msaa disabled, the hw still gives us a mask corresponding to the number of samples, where GL requires this to be 1. Fix this up by masking the sampleMaskIn bits with the bit corresponding to the sampleID, if we know this shader is always executed at per-sample granularity. (In case of a per-sample frequency shader and msaa disabled, the sampleID will always be 0, so this works just fine there.) Fixing this for the minSampleShading case will need a shader key (radeonsi uses the prolog part for) (for eg, could get away with a single bit, cm would need more bits depending on sample/invocation ratio, or read the bits from a uniform), unless we'd want to always use a sample mask uniform (which is probably not a good idea, as it would make the ordinary common msaa case slower for no good reason). This fixes some parts of piglit arb_sample_shading-samplemask (with fixed test), in particular those which use a sampleID, still failing others as expected. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-08 04:07:52 +01:00
Roland Scheidegger	07d724326a	r600: clean up fragment shader input scan code For some reason, we were iterating through the code twice (first just for instructions needing barycentrics, then for instructions and input dcls). Move things around slightly so this is no longer necessary. There also was an unneeded enabling of the fixed_pt_position_gpr - this is only needed if the per-sample interpolation comes from an input, not from an instruction (just move the assert where it belongs) (since the sample id to sample from comes from a tgsi src in this case, and isn't sampleID). Otherwise there should be no functional change. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-08 04:07:52 +01:00
Roland Scheidegger	6fd3c39590	mesa: (trivial) remove unused ignore_sample_qualifier_parameter This parameter for _mesa_get_min_invocations_per_fragment() was once used by the intel driver, but it's long gone. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Dave Airlie <airlied@vmware.com>	2018-02-08 04:07:52 +01:00
Roland Scheidegger	becc7faae2	r600/cm: (trivial) code cleanup for emitting msaa state No functional change (compile tested only). Reviewed-by: Dave Airlie <airlied@redhate.com>	2018-02-08 04:07:52 +01:00
Brian Paul	b99cb13002	tgsi: use tgsi_semantic enum type in ureg code Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:43:01 -07:00
Brian Paul	174f3a4ab7	st/mesa: use tgsi_semantic enum type Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:43:01 -07:00
Brian Paul	0f7be4fc16	tgsi: use TGSI enum types in ureg code v2: fix enum tgsi_interpolate_mode/loc typo. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:42:39 -07:00
Brian Paul	9f9ce1625f	st/mesa: use TGSI enum types in st_glsl_to_tgsi.cpp Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:38:04 -07:00
Brian Paul	6321b1bd40	gallium/util: replace uint with tgsi enum types Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:38:04 -07:00
Brian Paul	15874338ff	gallium/util: replace unsigned with tgsi enum types Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-07 18:38:04 -07:00
Fredrik Höglund	5a38d8f103	radv: implement VK_EXT_external_memory_host Ported from the radeonsi GL_AMD_pinned_memory implementation. Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-08 00:46:07 +01:00
Dave Airlie	5dd385f378	r600: fix rendering regression on r6/7 gpus Fixes: `2d5b5d267e` (r600: work out target mask at framebuffer bind.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104989 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-08 09:37:09 +10:00
Grazvydas Ignotas	f91aa68ac6	radeonsi: avoid int-to-pointer-cast warnings on 32bit I hope the actual dropping of MSB is ok, but that's what's already happened before this change. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-08 01:13:58 +02:00
Grazvydas Ignotas	13ada91740	gallium/hud: update some query functions It seems these were missed when struct pipe_context * argument was added to hud_graph::query_new_value. Fixes: `3132afdf4c` "gallium/hud: pass pipe_context explicitly to most functions" Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-08 01:12:07 +02:00
Roland Scheidegger	09f49b9e50	Revert "gallium: build ddebug, noop, rbug, trace as part of auxiliary" This reverts commit `6f82b8d8d0`. This broke scons build, and reportedly clover with autotools/meson too.	2018-02-07 23:47:39 +01:00
Marek Olšák	6f82b8d8d0	gallium: build ddebug, noop, rbug, trace as part of auxiliary Building gallium is faster by 7.5 seconds on a 4core/8thread 3GHz CPU. (gallium build time is reduced by 15% when building only radeonsi) Non-recursive makefiles are great!	2018-02-07 22:08:34 +01:00
Roland Scheidegger	def09f8db0	u_blit: (trivial) fix bogus argument order for set_fragment_shader Amazingly this still worked sometimes, albeit I'm not even sure why... This fixes `d7bec6f7a6`.	2018-02-07 22:03:18 +01:00
Andres Rodriguez	83990dd529	mesa: fix incorrect type when allocating arrays The array members have type 'struct gl_buffer_object *' Found by coverity. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-02-07 14:50:21 -05:00
Roland Scheidegger	d7bec6f7a6	u_blit,u_simple_shaders: add shader to convert from xrbias format We need this to handle some oddball dx10 format (DXGI_FORMAT_R10G10B10_XR_BIAS_A2_UNORM). What you can do with this format is very limited, hence we don't want to add it as a gallium format (we could not express the properties of this format as ordinary format properties either, so like all special formats it would need specific code for handling it in any case). While here, also nuke the array for different shaders for different writemasks, as it was not actually used (always full masks are passed in for generating shaders). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-02-07 17:09:37 +01:00
Roland Scheidegger	afd1e9be17	u_simple_shaders: fix mask handling in util_make_fragment_tex_shader_writemask The writemask handling was busted, since writing defaults to output meant they got overwritten by the tex sampling anyway. Albeit the affected components were undefined, so maybe with some luck it still would have worked with some drivers - if not could as well kill it... (This would have affected u_blitter but not u_blit since the latter always used xyzw mask.) Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-07 17:08:24 +01:00
Bas Nieuwenhuizen	5d754872b5	autotools: Only build libmesa-st-tests-common.a for tests. We don't need the library if we don't build tests, and building it adds a dependency on gtest which adds a dependency on cxxabi.h. Fixes: `6569b33b6e` "mesa/st/tests: unify MockCodeLine* classes" Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2018-02-07 14:04:04 +01:00
Tapani Pälli	9d322fde97	i965: add __DRI2_BLOB support and set cache functions v2: adjust to change that moved cache from ctx to screen Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	ae00ef2702	disk cache: add callback functionality v2: add disk_cache_has_key, disk_cache_put_key support using blob cache (Nicolai, Jordan) v3: rename set_cb as put_cb to match existing naming (Timothy) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	6a651b6b77	disk cache: initialize cache path and index only when used This patch makes disk_cache initialize path and index lazily so that we can utilize disk_cache without a path using callback functionality introduced by next patch. v2: unmap mmap and destroy queue only if index_mmap exists Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	e8495646af	glsl/tests: changes to test_disk_cache_create test Next patch will allow disk_cache instance to be created without path set for it, modify some test cases that assume disk_cache creation to fail with invalid path. Creation should succeed but simple put/get test fail. v2: leave tests as is but check that both cache struct exists and try simple put/get that should fail with invalid path set (Emil) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	83c81b6cce	glsl/tests: move utility functions in cache_test Patch moves functions higher so that we can utilize them from test_disk_cache_create which is modified by next patch. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	6f5b57093b	egl: add support for EGL_ANDROID_blob_cache v2: cleanup, move callbacks to _egl_display struct (Emil Velikov) adapt to earlier ctx->screen changes v3: remove useless checking, add _eglSetFuncName (Emil Velikov) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v2) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Tapani Pälli	cf4569da6b	dri: add interface for EGL_ANDROID_blob_cache extension v2: move from __DRIcontext to __DRIscreen (Emil Velikov) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-07 14:45:34 +02:00
Samuel Pitoiset	757d36ee70	ac/nir: use new pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics Ported from RadeonSI. Only one F1 2017 shader is affected, code size decreased from 532 to 488 on both Polaris10 and Vega10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-07 12:42:13 +01:00
Samuel Pitoiset	2f54d7382d	ac/nir: avoid loading unused VS input components Polaris10: Totals from affected shaders: SGPRS: 122840 -> 120984 (-1.51 %) VGPRS: 78812 -> 78440 (-0.47 %) Spilled SGPRs: 177 -> 129 (-27.12 %) Code Size: 2950028 -> 2941276 (-0.30 %) bytes Max Waves: 17899 -> 17976 (0.43 %) Vega10: Totals from affected shaders: SGPRS: 117144 -> 115776 (-1.17 %) VGPRS: 77580 -> 77532 (-0.06 %) Spilled SGPRs: 0 -> 152 (0.00 %) Code Size: 3352656 -> 3347860 (-0.14 %) bytes Max Waves: 19756 -> 19866 (0.56 %) This increases SGPRs spilling a bit with Talos, but I have some other ideas that might reduce it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-07 12:42:09 +01:00
Samuel Pitoiset	1c57a6da5e	ac/shader: scan vertex inputs usage mask Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-07 12:42:07 +01:00
Iago Toral Quiroga	f474b19875	i965: allocate a SGVS element when VertexID or InstanceID are read Although on gen8+ platforms we can in theory use 3DSTATE_VF_SGVS to put these beyond the last vertex element it seems that we still need to allocate the SGVS element, otherwise we have observed cases where we end up reading garbage. Specifically, the CTS test mentioned below was flaky with a fail rate of ~1% on some gen9+ platforms caused by reading garbage for the gl_InstanceID value. The flakiness goes away as soon as we start allocating the SGVS element. v2: - Do this for gen8+, not just gen9+, and pull the boolean outside the #if block (Jason) Fixes flaky test: KHR-GL45.vertex_attrib_64bit.limits_test Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104335 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-07 11:11:16 +01:00
Dylan Baker	c74719cf4a	glapi: fix check_table test for non-shared glapi with meson v2: - Add glapitable_h generated source to requirements Fixes: `3218056e0e` ("meson: Build i965 and dri stack") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)	2018-02-06 15:00:17 -08:00
Dylan Baker	002fbde71e	glapi: Don't search through subdirs from glapitable.h Because meson won't put it in that folder. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-06 15:00:17 -08:00
Dylan Baker	aac3d01178	state_tracker: Don't build st-renumerate-test without shared glapi Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-06 15:00:17 -08:00
Dylan Baker	0316aa432d	glapi: remove APPLE extensions from test Fixes: `7009955281` ("mesa: Remove GL_APPLE_vertex_array_object stubs") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-06 15:00:17 -08:00
Dylan Baker	a4f1fc5dd1	glapi/check_table: Remove 'extern "C"' block Using 'extern "C"' around includes is always incorrect, as the header may contain C++ symbols (as it does in this case), which means it cannot use C linkage. In this case the header has a template in it, which obviously cannot be linked with C linkage rules. Fixes: `a29ad2b421` ("mesa/tests: Add tests for the generated dispatch table") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-06 15:00:17 -08:00
Dylan Baker	105178db8f	meson: fix test source name for static glapi fixes: `43a6e84927` ("meson: build mesa test.") Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-06 15:00:17 -08:00
Dylan Baker	9be7487f30	glapi: don't walk backwards for includes Instead just set the proper -I flags and include it from a more standard path. In this case we'll add -Isrc/mesa (which is common), and #include main/foo.h. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-06 15:00:17 -08:00
Brian Paul	e7a4536e64	mesa: rename gl_vertex_array_object::_VertexAttrib -> _VertexArray Since the type is gl_vertex_array. Update comment to explain that these arrays are only used by the VBO module. Also rename some local variables in _mesa_update_vao_derived_arrays(). Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-06 15:36:47 -07:00
Brian Paul	d9ab39ea65	mesa: minor whitespace fixes, line wrapping in texcompress.c Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-06 15:23:26 -07:00
Brian Paul	b38196b452	mesa: simplify _mesa_get_compressed_formats() Instead of testing for formats==NULL everywhere, just point formats at a dummy array which will be discarded. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-06 15:23:26 -07:00
Vlad Golovkin	d919ff0f27	util: remove redundant check for the __clang__ macro Clang defines __GNUC__ macro, so one doesn't need to check __clang__ macro in this particular case. v2: added comment as per Brian Paul's suggestion Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-06 15:23:26 -07:00
Brian Paul	77bc74e674	st/mesa: use st_access_flags_to_transfer_flags() helper in more places Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-06 15:23:26 -07:00
Brian Paul	1852a2e1a2	st/mesa: refactor st_bufferobj_map_range() Use a new helper function, st_access_flags_to_transfer_flags(), to convert the GL_MAP_x flags to PIPE_TRANSFER_x flags. We'll be able to use this function in a couple other places. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-06 15:23:26 -07:00
Brian Paul	8a32dd2ec9	st/mesa: refactor bufferobj_data() Split out some of the code into three new helper functions: buffer_target_to_bind_flags(), storage_flags_to_buffer_flags(), buffer_usage() to make the code more manageable. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-06 15:23:26 -07:00
Samuel Pitoiset	3488a3f033	radv: run nir_opt_shrink_load LLVM can't shrink loads. Polaris10: Totals from affected shaders: SGPRS: 62528 -> 59955 (-4.11 %) VGPRS: 44708 -> 44616 (-0.21 %) Spilled SGPRs: 16 -> 8 (-50.00 %) Code Size: 1355504 -> 1355172 (-0.02 %) bytes Max Waves: 11710 -> 11670 (-0.34 %) Vega10: Totals from affected shaders: SGPRS: 51448 -> 50371 (-2.09 %) VGPRS: 39140 -> 39048 (-0.24 %) Spilled SGPRs: 16 -> 16 (0.00 %) Code Size: 1307188 -> 1304296 (-0.22 %) bytes Max Waves: 11312 -> 11292 (-0.18 %) This reduces SGPRs spilling in MadMax, and it also reduces number of SGPRs in DOW3 and F12017. The number of waves slightly decreases in F1 but I don't see any performance changes after benchmarking it. Talos and Serious Sam are not affected because they don't use any push constants. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-06 23:08:44 +01:00
Samuel Pitoiset	e68562b94b	nir: add nir_opt_shrink_load pass This is a very simple pass that just shrinks load_push_constant intrinsics when some components are unused. For now, it can just shrink vec4 to vec3, vec3 to vec2 and so on. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-06 23:08:39 +01:00
Timothy Arceri	e2ea9e1191	radeonsi/nir: add nir support for compiling compute shaders Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	9c52902c76	ac/radeonsi: add num_work_groups to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	f12e2f9c12	ac: implement nir_intrinsic_shader_clock Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	b7b89bbddb	ac/radeonsi: create ac_build_shader_clock() helper Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	d116af383f	ac/radeonsi: add load_local_group_size() to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	f6932d1ef3	radeonsi: add get_block_size() helper This will be reused by the nir backend in a later patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	e3ebffdbb0	ac: don't call emit_outputs() for compute Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	c8066cdfa7	ac/radeonsi: add local_invocation_ids to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	fa5239c153	ac/radeonsi: add workgroup_ids to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	64c10c9737	radeonsi/nir: gather some compute info in si_nir_scan_shader() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:43:08 +11:00
Timothy Arceri	1142b1d3e1	radeonsi/nir: always set input_usage_mask as using all components This fixes a regression for now, in the future we should gather the used components properly. V2: just set for VS and correctly handle doubles Fixes: `be973ed21f` "radeonsi: load the right number of components for VS inputs and TBOs" Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-07 08:38:52 +11:00
Timothy Arceri	ffeebcfa7e	i965: remove unused brw_nir_lower_cs_shared() This has been unused since `8761a04d0d`. Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-02-07 08:38:01 +11:00
Bas Nieuwenhuizen	a3e42e7a69	vulkan/wsi: Fix OOM behavior with prime images. Fixes: `d50937f137` "vulkan/wsi: Implement prime in a completely generic way" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-02-06 21:52:39 +01:00
Bas Nieuwenhuizen	c7d640fbbf	ac/nir: fix GS load input type. Fixes: `df1d5174fc` "ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-06 21:52:38 +01:00
Mathias Fröhlich	e8a9473d32	mesa: Factor out _mesa_disable_vertex_array_attrib. And use it in the enable code path. Move _mesa_update_attribute_map_mode into its only remaining file. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-06 21:20:14 +01:00
Mathias Fröhlich	236657842b	vbo: Move vbo_rebase into its only caller module tnl. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-06 21:20:14 +01:00
Mathias Fröhlich	2313c33e95	mesa: Use atomics for buffer objects reference counts. The mutex is currently used for reference counting and updating the minmax index cache. The change uses atomics directly for reference counting and the mutex for the minmax cache. This is safe since the reference count is not modified anywhere except in _mesa_reference_buffer_object, where atomics are now used. While using the minmax cache, the calling code holds a reference to the buffer object. Thus unreferencing or even referencing the buffer object does not need to be serialized with accessing the minmax cache. The change reduces the time _mesa_reference_buffer_object takes by about a factor of two when looking at perf results for some of my favorite use cases. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-06 21:20:14 +01:00
Dave Airlie	6c691081a1	r600: fixup sparse color exports. If we have gaps in the shader mask we have to have 0x1 in them according to a comment in radeonsi, and this is required to fix the test at least on cayman. We also need to record the highest one written to write to the ps exports reg. This fixes: KHR-GL45.enhanced_layouts.fragment_data_location_api Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:16:59 +10:00
Dave Airlie	2d5b5d267e	r600: work out target mask at framebuffer bind. If we only get 1,2,3,6 framebuffers we want a sparse target mask. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:16:55 +10:00
Dave Airlie	5b14e06d8b	r600: work out shader export mask at shader build time (v1.1) Since enhanced layouts allows setting specific MRT outputs, we can get sparse outputs, so we have to calculate the shader mask earlier. v1.1: update checks for state update (Roland) Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:16:27 +10:00
Dave Airlie	f292eceae1	r600: fix xfb stream check. This fixes: KHR-GL45.enhanced_layouts.xfb_vertex_streams Reviewed-by: Roland Scheidegger <sroland@vmware.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:12 +10:00
Dave Airlie	680cb9898a	r600/compute: add render cond support. Set render cond and emit atom. Fixes: KHR-GL45.compute_shader.conditional-dispatching Reviewed-by: Roland Scheidegger <sorland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:12 +10:00
Dave Airlie	5fd7b282b3	r600: fix not-very indirect compute We need to get the grid sizes earlier to fill in to the const buffer. Fixes: KHR-GL45.compute_shader.built-in-variables and KHR-GL45.compute_shader.dispatch-indirect Reviewed-by: Roland Scheidegger <sorland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:12 +10:00
Dave Airlie	00a112641b	r600: overhaul buffer resource query. This cleans up and fixes the previous fix even more. Buffers from textures start at max const, buffers from buffers/images come in from the 168 offset. This fixes a bunch of: KHR-GL45.shader_storage_buffer_object* Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:12 +10:00
Dave Airlie	736b150768	r600/eg: fix buffer sizing. For buffers we want the size in bytes, For images we want it in elements. This fixes: KHR-GL45.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-pad Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:12 +10:00
Dave Airlie	c9c4f0b722	r600/images: set offset for compute shaders with number of declared samplers for frag shaders we get a value in the key, I expect I need to make compute work better Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:12 +10:00
Dave Airlie	ab5cee4c24	r600/compute: only mark buffer/image state dirty for fragment shaders The compute emission path always emits this currently, and emitting it on the fragment path breaks the blitter. This fixes gpu hangs in KHR-GL45.compute_shader.resource-texture Reviewed-by: Roland Scheidegger <sorland@vmware.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:12 +10:00
Dave Airlie	4e3b43f180	r600/atomic: fix ATOMCAS instruction. This has 4 srcs. This fixes: KHR-GL45.shader_atomic_counter_ops_tests.ShaderAtomicCounterOpsExchangeTestCase Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:11 +10:00
Dave Airlie	8bdad9fa1f	r600/sb/cayman: fix indirect ubo access on cayman With sb enabled on cayman, this was overwriting the proper cf index value with random ones if the dst gpr was 2 or 3, only save the value for a MOVA instruction. Fixes: KHR-GL45.gpu_shader5.uniform_blocks_array_indexing (on cayman with sb) Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:11 +10:00
Dave Airlie	012100b809	r600/eg: use texture target to pick array size not view target (v2) This fixes a few CTS cases in : KHR-GL45.texture_view.view_sampling some multisample cases are still broken, but not sure this is the same problem. v2: fix more cases Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-07 06:08:11 +10:00
Dave Airlie	e7e81f362d	radv: don't support tc-compat on multisample d32s8 at all. RX550 fails dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_2 So increase the range of the workaround. Fixes: `f4c534ef6` (radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-06 19:56:00 +00:00
Michal Navratil	4081e08896	winsys/amdgpu: allow non page-aligned size bo creation from pointer Fix INVALID_OPERATION caused by BufferData with target EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD when the buffer size is not page aligned. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>	2018-02-06 18:51:12 +01:00
Jon Turney	9440599c8e	meson: ensure xmlpool/options.h is generated for libgallium In file included from ../src/gallium/targets/dri/target.c:1: In file included from ../src/gallium/auxiliary/target-helpers/drm_helper.h:8: ../src/util/xmlpool.h:103:10: fatal error: 'xmlpool/options.h' file not found See also `26bde1e3`. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-06 15:56:12 +00:00
Andres Gomez	1ec88755c2	vbo: provide 64bits support to print_draw_arrays Cc: Mathias Fröhlich <mathias.froehlich@web.de> Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-06 15:30:29 +02:00
Andres Gomez	0057ae4038	vbo: take into account the size when printing VAO elements When using print_draw_arrays for debugging, we printed "n" vertices, but depending on the stride used this did not print the full size of each of those "n" vertices. Now we print the whole size of each of the "n" vertices. Cc: Mathias Fröhlich <mathias.froehlich@web.de> Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-06 15:30:23 +02:00
Andres Gomez	c9325b4fa9	vbo: print first element of the VAO when the binding stride is 0 Cc: Mathias Fröhlich <mathias.froehlich@web.de> Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-06 15:30:12 +02:00
Iago Toral Quiroga	a5053ba27e	anv/device: initialize the list of enabled extensions properly The loop goes through the list of enabled extensions marking them as enabled in the list, but this relies on every other extension being initialized to false by default. This bug would make us, for example, advertise certain device extension entry points as available even when the corresponding extensions had not been enabled. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: `abc62282b5` "anv: Add a per-device table of enabled extensions" Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-02-06 07:51:00 +01:00
Iago Toral Quiroga	ef439a4fdc	spirv: split constant initializers on in/out structs The SPIR-V parser splits in/out struct variables and creates a separate variable for each first-level member of the struct. When the struct variable has an initializer this means that we also need to split the initializer. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-02-06 07:50:18 +01:00
Iago Toral Quiroga	1d20001d97	i965/nir: do int64 lowering before optimization Otherwise loop unrolling will fail to see the actual cost of the unrolling operations when the loop body contains 64-bit integer instructions, especially when the divmod64 lowering applies, since its lowering is quite expensive. Without this change, some in-development CTS tests for int64 get stuck forever trying to register allocate a shader with over 50K SSA values. The large number of SSA values is the result of NIR first unrolling multiple seemingly simple loops that involve int64 instructions, only to then lower these instructions to produce a massive pile of code (due to the divmod64 lowering in the unrolled instructions). With this change, loop unrolling will see the loops with the int64 code already lowered and will realize that it is too expensive to unroll. v2: Run nir_algebraic first so we can hopefully get rid of some of the int64 instructions before we even attempt to lower them. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-02-06 07:49:27 +01:00
Ilia Mirkin	02a6d901ee	mesa: add OES_EGL_image_external_essl3 support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-06 07:28:11 +02:00
Vinson Lee	fe32f796f2	r600/fp64: Fix build. CC r600_shader.lo r600_shader.c: In function ‘egcm_int_to_double’: r600_shader.c:4543:12: error: ‘ctx’ is a pointer; did you mean to use ‘->’? if (ctx.bc->chip_class == CAYMAN) ^ -> Fixes: `35b4301577` ("r600/fp64: fix integer->double conversion") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-05 15:32:20 -08:00
Dave Airlie	35b4301577	r600/fp64: fix integer->double conversion Doing a straight uint/int->fp32->fp64 conversion causes some precision issues, Roland suggested splitting the integer into two portions and doing two separate int->fp32->fp64 conversions then adding the results. This passes the tests in CTS and piglit. [airlied: fix cypress conversion opcodes] Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-06 08:21:48 +10:00
Samuel Pitoiset	0170ae1e23	ac/nir: remove emission of nir_op_fdiv RadeonSI and RADV lower fdiv. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-05 23:09:34 +01:00
Jon Turney	b5af199f92	travis: add macOS meson build v2: Simplify set of options now we have better defaults Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-05 19:42:01 +00:00
Jon Turney	80bc41b2ec	meson: osx ld doesn't support --build-id Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-05 19:40:43 +00:00
Jon Turney	ea8730024f	meson: build src/glx/apple Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-05 19:40:43 +00:00
Dylan Baker	569628dd24	meson: set apple glx defines Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-05 19:40:43 +00:00
Jon Turney	4772909447	meson: better defaults for osx, windows and cygwin set suitable defaults for 'dri-drivers', 'gallium-drivers', 'vulkan-drivers' and 'platforms' options for osx, windows and cygwin, adding cygwin where appropriate. v2: error() for unknown OS Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-05 19:34:37 +00:00
Matt Turner	e2b31e9acf	i965: Move mistakenly placed line Ken called this out in review, but it seems I forgot to make the change. I noticed that the control flow annotations in the fragment shader disassembly of tests/shaders/glsl-fs-loop-continue.shader_test were not correct, and moving this line to the correct place fixes it.	2018-02-05 09:50:56 -08:00
Juan A. Suarez Romero	4195eed961	glsl/linker: check same name is not used in block and outside According to the OpenGL GLSL 3.20 spec, section 4.3.9: "It is a link-time error if any particular shader interface contains: - two different blocks, each having no instance name, and each having a member of the same name, or - a variable outside a block, and a block with no instance name, where the variable has the same name as a member in the block." This fixes a previous commit `9b894c8` ("glsl/linker: link-error using the same name in unnamed block and outside") that covered this case, but did not take into account that precision qualifiers are ignored when comparing blocks with no instance name. With this commit, the original tests KHR-GL*.shaders.uniform_block.common.name_matching remain fixed, and the dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision regression, which was introduced by the previous commit, is also fixed. v2: use helper variables (Matteo Bruni) Fixes: `9b894c8` ("glsl/linker: link-error using the same name in unnamed block and outside") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104668 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104777 CC: Mark Janes <mark.a.janes@intel.com> CC: "18.0" <mesa-stable@lists.freedesktop.org> Tested-by: Matteo Bruni <matteo.mystral@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-02-05 18:10:43 +01:00
Juan A. Suarez Romero	3d14e72057	mesa: enable ASTC format for CompressedTexSubImage3D If extensions GL_KHR_texture_compression_astc_hdr or GL_KHR_texture_compression_astc_sliced_3d are implemented then ASTC formats are supported in CompressedTexImage3D. Fixes KHR-GLES2.texture_3d. with this format. CC: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-02-05 17:00:19 +01:00
Stephan Gerhold	02e2009b92	util/build-id: Fix address comparison for binaries with LOAD vaddr > 0 build_id_find_nhdr_for_addr() fails to find the build-id if the first LOAD segment has a virtual address other than 0x0. For most shared libraries, the first LOAD segment has vaddr=0x0: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0x000000 0x00000000 0x00000000 0x2d2e26 0x2d2e26 R E 0x1000 LOAD 0x2d2e54 0x002d3e54 0x002d3e54 0x2e248 0x2f148 RW 0x1000 However, compiling the Intel Vulkan driver as 32-bit binary on Android produces the following ELF header with vaddr=0x8000 instead: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align PHDR 0x000034 0x00008034 0x00008034 0x00100 0x00100 R 0x4 LOAD 0x000000 0x00008000 0x00008000 0x224a04 0x224a04 R E 0x1000 LOAD 0x225710 0x0022e710 0x0022e710 0x25988 0x27364 RW 0x1000 build_id_find_nhdr_callback() compares the address of dli_fbase from dladdr() and dlpi_addr from dl_iterate_phdr(). With vaddr > 0, these point to a different memory address, e.g.: dli_fbase=0xd8395000 (offset 0x8000) dlpi_addr=0xd838d000 At least on glibc and bionic (Android) dli_fbase refers to the address where the shared object is mapped into the process space, whereas dlpi_addr is just the base address for the vaddrs declared in the ELF header. To compare them correctly, we need to calculate the start of the mapping by adding the vaddr of the first LOAD segment to the base address. Note: musl users will need the following patch. https://git.musl-libc.org/cgit/musl/commit/?id=b3ae7beabb9f0c219bb8a8b63567a01c6530c1ac Cc: Chad Versace <chadversary@chromium.org> Cc: <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104642 Fixes: `5c98d38` "util: Query build-id by symbol address, not library name" Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-05 14:26:33 +00:00
Boyuan Zhang	d645b0850a	radeonsi: enable vcn encode for HEVC main Enable vcn encode for HEVC main profile on Raven. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	5534a2791f	st/va: implement HEVC encode functions Implement HEVC encode functions based on VAAPI HEVC encode interface. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	9ac50a2e0c	st/va: add HEVC encode functions Add a separate file for HEVC encode functions. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	66087d8a2d	st/va: enable dual instances encode only for H264 Logic related to dual instances encode should only be applied for H264, not other codecs. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	a9c0861c6c	st/va: add entrypoint check for HEVC Add entrypoint check for HEVC to differentiate decode and encode jobs. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	ecc3944344	st/va: add HEVC picture desc Add HEVC picture desc, and add codec check when creating and destroying context. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	9393b53c29	st/va: move H264 enc functions into separate file Move all H264 encode related functions into separate file. Similar to VAAPI decode side, there will be separate file for each codec on encode side as well. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	b391d34916	radeon/vcn: add header implementations for HEVC Implement encoding of sps, pps, vps, aud, and slice headers for HEVC based on HEVC specs. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	fdc952b320	radeon/vcn: add ib implementations for HEVC Implement required ibs for vcn HEVC encode. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	5ab73edddb	radeon/vcn: support picture parameters for HEVC Pass pipe_picture_desc instead of pipe_h264_enc_picture_desc so that it can be used for different codecs. Add functions to handle picture parameters that will be used for HEVC encode. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	db67d04df3	radeon/vcn: add vcn encode interface for HEVC Add vcn encode interface for HEVC, and rename radeon_enc_h264_enc_pic to radeon_enc_pic since radeon_enc_pic is used by both H264 and HEVC. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Boyuan Zhang	f410936439	vl: add parameters for HEVC encode Add HEVC encode interface Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-02-05 09:16:18 -05:00
Eric Anholt	aa2f609f70	broadcom/vc5: Ignore samplers for finding uniform offsets. Fixes: KHR-GLES3.shaders.struct.uniform.sampler_array_fragment KHR-GLES3.shaders.struct.uniform.sampler_array_vertex KHR-GLES3.shaders.struct.uniform.sampler_nested_fragment KHR-GLES3.shaders.struct.uniform.sampler_nested_vertex	2018-02-05 13:56:02 +00:00
Eric Anholt	63a8a0f3c0	broadcom/vc5: Fix non-mipfiltered sampling. We need to clamp the LOD to 0 if mip filtering is disabled. This is part of fixing KHR-GLES3.shaders.struct.uniform.sampler_array_fragment.	2018-02-05 13:53:38 +00:00
Eric Anholt	e29988c908	broadcom/vc5: Fix "hardwrae" typo in a field name in XML.	2018-02-05 13:53:38 +00:00
Samuel Pitoiset	a1d568c830	ac/nir: fix a crash in load_gs_input() on pre-GFX9 chips Fixes: `df1d5174fc` ("ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-05 11:05:52 +01:00
Eric Anholt	8bb000f460	broadcom/vc5: Try to merge more than 2 QPU instructions together. Obviously it would be good to have an ADD and a MUL and a signal together, but we can even potentially have multiple signals merged, as well. total instructions in shared programs: 100423 -> 97874 (-2.54%) instructions in affected programs: 78812 -> 76263 (-3.23%)	2018-02-05 09:29:37 +00:00
Eric Anholt	dc78643ace	broadcom/vc5: Remove no-op MOVs after register allocation. We emit some MOVs to track lifetimes of payload registers, but we don't need there to be actual MOV instructions for them. total instructions in shared programs: 101045 -> 100423 (-0.62%) instructions in affected programs: 37083 -> 36461 (-1.68%)	2018-02-05 09:29:37 +00:00
Eric Anholt	f3978a7380	broadcom/vc5: Add missing shader-db instruction counting. I must have misplaced it in the instruction packing rework.	2018-02-05 09:29:37 +00:00
Dave Airlie	7801425028	r600: fix resq for buffer images. If this is an image buffer, we need to calculate the correct resource id. Fixes: KHR-GL45.shader_image_size.* Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-05 05:15:41 +10:00
Dave Airlie	6c1432f0be	r600/eg: fix cube map array buffer images. This fixes a crash in: KHR-GL45.texture_cube_map_array.texture_size_compute_sh. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-05 05:14:56 +10:00
Marek Olšák	af3685d149	mesa: change ctx->Color.ColorMask into a 32-bit bitmask 4 bits per draw buffer, 8 draw buffers in total --> 32 bits. This is easier to work with. Reviewed-by: Eric Anholt <eric@anholt.net>	2018-02-04 01:50:10 +01:00
Jordan Justen	83e60ce927	i965: Create new program cache bo when clearing the program cache When the disk shader cache CI testing was enabled, we started noticing occasional failures on deqp test runs. (Mainly SNB, rarely HSW) Before this change, when we cleared the (in memory) program cache we reused the same bo. Since the disk shader cache quickly restores programs, it appears that this would lead to overwrites of the older program binaries in the in memory program cache that apparently were still executing in some cases. If these programs were still executing, this could cause a GPU hang. This issue is probably not disk shader cache specific, but may have been hidden due to the compiler taking time to recompile programs after the cache was cleared. v2: * Don't add `copy` param to brw_cache_new_bo (Ken) * Call from brw_program_cache_check_size (Ken) Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-02-03 12:16:58 -08:00
Jason Ekstrand	589e9db23f	aubinator: Multiply count by 4 to compute buffer sizes The count field is in terms of dwords and not bytes. In `7d4007d58a`, I fixed one instance of this but missed another.	2018-02-02 22:30:56 -08:00
Eric Anholt	2e746bc63d	broadcom/vc5: Enable UIF XOR on textures. This should increase performance by reducing SDRAM bank conflicts when crossing between UIF columns (particularly on power-of-two height textures). The uif_xor_disable setup is dropped, since we need to allow XOR on lower miplevels even when level 0 is XOR. The level 0 force UIF and level 0 XOR flags should handle setting XOR properly on imported buffers.	2018-02-02 16:50:02 -08:00
Eric Anholt	6a862b0de7	broadcom/vc5: Fix alignment of miplevel 1 with UIF. The alignment here means that we can't get back the padded height from the size/stride any more, so it's now a field in the slice as well. Fixes piglit fbo-generatemipmap-formats RGBA16 NPOT.	2018-02-02 16:27:49 -08:00
Eric Anholt	5c57e0a549	broadcom/vc5: Switch our RGBA4 support to the new gallium format. Fixes fbo-generatemipmap-formats, fbo-alphatest-formats, etc. tests for GL_RGBA4, GL_RGB4, GL_RGBA2, etc.	2018-02-02 16:27:49 -08:00
Eric Anholt	2a97f1d3ef	gallium: Add a new A4B4G4R4 pipe format for Broadcom. The VC5 HW puts A in the low bits and R in the high bits. We can't just swizzle in the shaders because the blending HW can't pick what channel A is in, so make a new format to match it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-02 16:27:49 -08:00
Eric Anholt	1429cd74c2	mesa: Drop incorrect A4B4G4R4 _mesa_format_matches_format_and_type() cases. swapBytes operates on bytes, not 4-bit channels, so you can't just take non-swapBytes cases and flip the REV flag. Avoids piglit texture-packed-formats regressions when enabling the ABGR4444 format. Fixes: `c5a5c9a7db` ("mesa/formats: add new mesa formats and their pack/unpack functions.") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-02 16:27:49 -08:00
George Kyriazis	bbef9474fa	meson/swr: Updated copyright dates cc: mesa-stable@lists.freedesktop.org cc: dylan@pnwbakers.com Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-02 17:43:07 -06:00
George Kyriazis	16bf813830	meson/swr: re-shuffle generated files Move generated files from codegen/meson.build to other directories, in order to satisfy generated include file dependencies Add correct file lists for architecture-specific libraries. cc: mesa-stable@lists.freedesktop.org cc: dylan@pnwbakers.com Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-02-02 17:43:00 -06:00
Marek Olšák	3bf1e036e8	amd: remove support for LLVM 3.9 Only these are supported: - LLVM 4.0 - LLVM 5.0 - LLVM 6.0 - master (7.0) Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-02 23:47:40 +01:00
Dylan Baker	c75a4e5b46	meson: Check for actual LLVM required versions Currently we always check for 3.9.0, which is pretty safe since everything except radv work with >= 3.9 and 3.9 is pretty old at this point. However, radv actually requires 4.0, and there is a patch for radeonsi to do the same. Fixes: `673dda8330` ("meson: build "radv" vulkan driver for radeon hardware") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-02 13:22:58 -08:00
Dylan Baker	d7235ef83b	meson: Don't confuse the install and search paths for dri drivers Currently there is not a separate option for setting the search path of DRI drivers in meson, like there is in scons and autotools. This is an oversight and needs to be fixed. This adds an extra option `dri-search-path`, which will default to the value of `dri-drivers-path`, like autotools does. v2: - Split input list before joining. v3: - use : instead of ; as the delimiter. The autotools help string incorrectly says ; but the code uses : v4: - Take list in pre : delimited form (Ilia) - Ensure that the dri-search-path is absolute when using dri_drivers_path Fixes: `db9788420d` ("meson: Add support for configuring dri drivers directory.") Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> (v2) Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v3)	2018-02-02 11:01:42 -08:00
Marek Olšák	847d0a393d	radeonsi: use pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-02 16:46:22 +01:00
Jon Turney	b3a1d9588e	travis: add osx autotools build Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-02 15:28:52 +00:00
Jon Turney	4701379d96	travis: pip -> pip2 On travis, for OSX, python2 from homebrew is pre-installed. per [1]: python points to the macOS system Python (with no manual PATH modification) python2 points to Homebrew’s Python 2.7.x (if installed) python3 points to Homebrew’s Python 3.x (if installed) pip doesn't exist pip2 points to Homebrew’s Python 2.7.x’s pip (if installed) pip3 points to Homebrew’s Python 3.x’s pip (if installed) We will end up using 'python2' for building mesa. Just use 'pip2' instead of 'pip', as that seems to work for all platforms on travis. [1] https://docs.brew.sh/Homebrew-and-Python.html Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-02 15:28:52 +00:00
Jon Turney	7d1ec6d6a9	travis: conditionalize building of prerequisites on if OS=linux Use a '|' YAML literal block to avoid the convoluted syntax needed to put the entire conditional on a single line. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-02 15:28:52 +00:00
Jon Turney	63041ba613	glx/test: fix building for osx An additional stub for applegl_create_context() is needed Cannot test indirect API as it's not built on osx, currently Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-02 15:28:52 +00:00
Andres Gomez	4761a8fea6	i965: check if upload is 0 explicitly, when downsizing a format downsize_format_if_needed takes an integer as its number-of-uploads parameter. Hence, let's do an integer comparison instead of a boolean check, since the latter is confusing. Since we are at it, fix a couple of wrongly tabbed indents. Cc: Alejandro Piñeiro <apinheiro@igalia.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-02-02 16:32:30 +02:00
Marek Olšák	51d36f5e02	mesa: don't flag _NEW_COLOR for KHR adv.blend if prog constant doesn't change This only affects drivers that set DriverFlags.NewBlend. v2: - fix typo advanded -> advanced - return "enum gl_advanced_blend_mode" from _mesa_get_advanced_blend_sh_constant - don't call FLUSH_VERTICES twice Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-02-02 15:06:47 +01:00
Samuel Pitoiset	df1d5174fc	ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load The old one generates useless instructions in there, found while comparing geometry shaders between RadeonSI and RADV. This improves all Vulkan demos that use geometry shaders, +4% for deferredshadows, +9% for viewportarray, +7% for geometryshader on Polaris10. This seems to also improve DOW3 a little bit (+1%). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-02 12:32:21 +01:00
Dave Airlie	f9c121c420	r600/eg: add crap indirect compute support. I think the cp packets can be made work, but I think it might need a kernel change, so for now just do the worst thing. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-02 16:50:18 +10:00
Jason Ekstrand	2f7205be47	i965: Call prepare_external after implicit window-system MSAA resolves This fixes some rendering corruption in a couple of Android apps that use window-system MSAA. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104741 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-02-01 21:45:25 -08:00
Roland Scheidegger	c2f0e08857	r600: don't do stack workarounds for hemlock By the looks of it it seems hemlock is treated separately to cypress, but certainly it won't need the stack workarounds cedar/redwood (and seemingly every other eg chip except cypress/juniper) need. (Discovered by accident.) Acked-by: Alex Deucher <alexander.deucher@amd.com>	2018-02-02 01:46:43 +01:00
Dave Airlie	8fa5aade43	r600: initial attempt at gl_HelperInvocation (v3) This passes the CTS and piglit tests. This also disable sb for helper invocations until it doesn't mess up the VPM flags. Thanks to Ilia and Glenn for advice, and Roland for working out the working evergreen path. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-02 09:46:05 +10:00
Bas Nieuwenhuizen	2ffe395cba	radv: Don't expose VK_KHX_multiview on android. deqp does not allow any KHX extensions, and since deqp is included in android-cts, android does not allow any khx extensions. So disable VK_KHX_multiview on android. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> CC: 18.0 <mesa-stable@lists.freedesktop.org>	2018-02-01 23:32:48 +01:00
Mathias Fröhlich	5b3d58520f	vbo: Simplify input array distribution for dlist type draws. Using the newly introduced VAO array maps, we can simplify vbo_bind_vertex_list. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:08 +01:00
Mathias Fröhlich	fb10a7b7b0	vbo: Simplify input array distribution for imm type draws. Using the newly introduced VAO array maps, we can simplify vbo_exec_bind_arrays. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:08 +01:00
Mathias Fröhlich	44b1454b96	vbo: Simplify input array distribution for array type draws. Using the newly introduced VAO state variable, we can simplify recalculate_input_bindings. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:07 +01:00
Mathias Fröhlich	3d4fb879dd	vbo: Use static const VERT_ATTRIB->VBO_ATTRIB maps. Instead of each context having its own map instance for this purpose, use a global static const map. v2: s,unsigned char,GLubyte,g s,_VP_MODE_MAX,VP_MODE_MAX,g Change comment style. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:07 +01:00
Mathias Fröhlich	b4fd63015a	mesa: Track position/generic0 aliasing in the VAO. Since the first material attribute no longer aliases with the generic0 attribute, only aliasing between generic0 and position is left and entirely dependent on the enabled state of the VAO. So introduce a gl_attribute_map_mode in the VAO that is used to track how the position and the generic 0 attribute alias. Provide a static const array that can be used to map from vertex program input indices to VERT_ATTRIB_* indices. The outer dimension of the array is meant to be indexed directly by the new VAO member variable. Also provide methods on the VAO to convert bitmasks of VERT_BIT's from the VAO numbering to the vertex processing inputs numbering. v2: s,unsigned char,GLubyte,g s,_ATTRIBUTE_MAP_MODE_MAX,ATTRIBUTE_MAP_MODE_MAX,g Change comment style, add comments. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:06 +01:00
Mathias Fröhlich	186f03cfb0	mesa: Put materials at the end of the generic block. The materials are now moved to the end of the generic attributes block to the range 4-15. Before, the way the position and generic 0 attribute is handled was dependent on the presence and kind of the currently attached vertex program. With this change the way the position attribute and the generic 0 attribute is treated only depends on the enabled flag of those two arrays. This will later help to untangle the update dependencies between enabled arrays and shader inputs. v2: s,VERT_ATTRIB_MAT_OFFSET,VERT_ATTRIB_MAT0,g Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:06 +01:00
Mathias Fröhlich	38b41fd718	mesa: Use defines for the aliased material array attributes. Instead of just assuming that the material attributes overlap with the generic attributes 0-12, give them symbolic defines so that we can more easily move them to another range. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:06 +01:00
Mathias Fröhlich	f37e29ac22	vbo: Correctly handle attribute offsets in dlist draw. When executing a display list draw, for the offset list to be correct, the offset computation needs to accumulate all attribute size values in order. Specifically, if we are shuffling around the position and generic0 attributes, we may violate the order or if we do not walk the generic vbo attributes we may skip some of the attributes. Even if this is an unlikely usecase we can fix this use case by precomputing the offsets on the full attribute list and store the full offset list in the display list node. v2: Formatting fix v3: Rebase Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 22:39:05 +01:00
Brian Paul	7a044ef68b	gallivm/llvmpipe: add const qualifiers on sampler variables Once a lp_build_sampler_soa or lp_build_sampler_aos object is created, it should never be modified. Found by inspection. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-02-01 14:19:58 -07:00
Brian Paul	1bdbeae17c	vbo: change an argument in vbo_draw_indirect_prims() In vbo_draw_indirect_prims() pass the 'indirect_data' argument to vbo->draw_prims(). All the callers are passing ctx->DrawIndirectBuffer so this should be no functional change. Add a (temporary) assertion to be sure. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-01 12:17:59 -07:00
Brian Paul	1b7ad3ae97	vbo: add comments on the VBO draw function typedefs And rename indirect_params -> indirect_draw_count_buffer and indirect_params_offset -> indirect_draw_count_offset to be more specific. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-01 12:17:59 -07:00
Brian Paul	c7bf05c833	vbo: s/drawcount/drawcount_offset This parameter (from the glMultiDrawArraysIndirectCountARB function) is poorly named. It's an offset into the buffer which contains the number of primitives to draw. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-01 12:17:59 -07:00
Brian Paul	b0a2f38db9	vbo: use vbo local var for draw call in vbo_save_playback_vertex_list() Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-02-01 12:17:59 -07:00
Brian Paul	84c3641864	svga: remove unneeded #includes in svga_pipe_draw.c Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-02-01 12:17:59 -07:00
Brian Paul	fa98730bf3	svga: whitespace/formatting fixes in svga_pipe_draw.c Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-02-01 12:17:59 -07:00
Brian Paul	7a1401938b	svga: clean up retry_draw_range_elements(), retry_draw_arrays() Get rid of a bunch of goto spaghetti. Remove unneeded do_retry parameter. No Piglit changes. Also tested w/ Google Earth and other apps. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-02-01 12:17:59 -07:00
Brian Paul	c744289552	svga: remove unused min/max_index params to draw_vgpu10() Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-02-01 12:17:59 -07:00
Eric Anholt	06858c7348	broadcom/vc5: Fix image_h setup for both loads and stores. The image_h for the tiling algorithm needs to be the padded-to-a-uifblock height of the level, not the unpadded height or the height of level 0. Fixes some cases of KHR-GLES3.texture_repeat_mode.* and depthstencil-render-miplevels.	2018-02-01 11:02:29 -08:00
Eric Anholt	5329f35ea1	broadcom/vc5: Add appropriate height padding for bank conflicts. I thought I didn't need this because I was doing level-0-always-UIF and that the pad there would propagate down, but it turns out that for level 1 the padding ends up being chosen by the HW. This brings us closer to being able to turn on UIF XOR for increased performance, as well.	2018-02-01 11:02:29 -08:00
Eric Anholt	dea902c933	broadcom/vc5: Simplify separate stencil surface setup. If we just make another gallium surface for the separate stencil, it's a lot easier to keep track of which set of fields we're using in RCL setup. This also incidentally fixes a little bug in setting up the surface's padded height for separate stencil when the UIF-ness changes at different levels of Z versus stencil.	2018-02-01 11:02:29 -08:00
Eric Anholt	7239b3edbe	broadcom/vc5: Rename the UIFCFG register in the UAPI. This matches the naming of the other hub regs we get, and I don't know for sure if UIFCFG will be the same register between the hub and the cores on all versions.	2018-02-01 11:02:29 -08:00
Eric Anholt	353b42ccc7	broadcom/vc5: Fix a segfault on mix of booleans. We don't have a src1 to look up if the compare instruction is "i2b".	2018-02-01 11:02:29 -08:00
Eric Anholt	eb765394c2	broadcom/vc5: Skip over missing color buffers for a couple of checks. Fixes crashes in piglit alpha-to-coverage-no-draw-buffer-zero 2	2018-02-01 11:02:29 -08:00
Eric Anholt	aec066c7aa	broadcom/vc5: Add the missing PIPE_CAP_FENCE_SIGNAL.	2018-02-01 11:02:29 -08:00
Baldur Karlsson	030821a873	mesa: fix query of GL_TEXTURE_COMPRESSION_HINT_ARB Fixes: `f96a69f916` ("mesa: replace GLenum with GLenum16 in common structures (v4)") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104908 Reviewed-by: Brian Paul <brianp@vmware.com>	2018-02-01 11:58:02 -07:00
Lucas Stach	0c71a19fe4	renderonly: fix dumb BO allocation for non 32bpp formats Take into account the resource format, instead of applying a hardcoded 32bpp. This not only over-allocates 16bpp formats, but also results in a wrong stride being filled into the handle. Fixes: `848b49b288` ("gallium: add renderonly library") CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-02-01 19:36:17 +01:00
Kenneth Graunke	85ec7abc3f	intel/decoder: Fix control / evaluation label mixup. Trivial. DS is TES, HS is TCS.	2018-02-01 09:44:15 -08:00
Kenneth Graunke	c3cd2aac27	i965: Bump official kernel requirement to Linux v3.9. In commit `3f353342a6` (present in 17.3.0) we started unconditionally using I915_EXEC_NO_RELOC, which was introduced in Linux v3.9. ChromeOS kernel 3.8 has backported this, so it should work too. Running on older kernels would likely result in every single batch being rejected by the kernel, which is pretty catastrophic. Yet, it appears that nobody noticed. So, let's just bump the official requirement and move forward ever so slowly. Fixes: `3f353342a6` ("i965: Use I915_EXEC_NO_RELOC") Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-01 07:58:58 -08:00
Marc Dietrich	4c5f0b4fd4	meson: don't install windows headers on non-windows platforms Only dive into the windows subdir if windows platform is selected. Signed-off-by: Marc Dietrich <marvin24@gmx.de> Fixes: `5ef75cb02b` "meson: build src/glx/windows" Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-02-01 15:33:02 +00:00
Marek Olšák	71c6f64e54	radeonsi: use ac_build_buffer_load_format for image buffer loads Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-01 16:20:19 +01:00
Marek Olšák	b0a6053a99	ac/nir: use ac_build_buffer_load_format for image buffer loads Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-01 16:20:19 +01:00
Marek Olšák	bac9fa9f17	ac: add glc parameter to ac_build_buffer_load_format Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-01 16:20:19 +01:00
Marek Olšák	be973ed21f	radeonsi: load the right number of components for VS inputs and TBOs The supported counts are 1, 2, 4. (3=4) The following snippet loads float, vec2, vec3, and vec4: Before: buffer_load_format_x v9, v4, s[0:3], 0 idxen ; E0002000 80000904 buffer_load_format_xyzw v[0:3], v5, s[8:11], 0 idxen ; E00C2000 80020005 s_waitcnt vmcnt(0) ; BF8C0F70 buffer_load_format_xyzw v[2:5], v6, s[12:15], 0 idxen ; E00C2000 80030206 s_waitcnt vmcnt(0) ; BF8C0F70 buffer_load_format_xyzw v[5:8], v7, s[4:7], 0 idxen ; E00C2000 80010507 After: buffer_load_format_x v10, v4, s[0:3], 0 idxen ; E0002000 80000A04 buffer_load_format_xy v[8:9], v5, s[8:11], 0 idxen ; E0042000 80020805 buffer_load_format_xyzw v[0:3], v6, s[12:15], 0 idxen ; E00C2000 80030006 s_waitcnt vmcnt(0) ; BF8C0F70 buffer_load_format_xyzw v[3:6], v7, s[4:7], 0 idxen ; E00C2000 80010307 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-01 16:20:19 +01:00
Marek Olšák	472361dd7e	radeonsi: remove unused si_shader_context members Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-02-01 16:20:19 +01:00
Jon Turney	d3540b405b	glx/apple: locate dispatch table functions to wrap by name Avoid reaching into the dispatch table internals (and thus having to deal with the complexities of remap etc.) by identifying functions to wrap by name. See: https://lists.freedesktop.org/archives/mesa-dev/2015-June/086721.html et seq. https://bugs.freedesktop.org/show_bug.cgi?id=90311 Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-01 15:14:08 +00:00
Jon Turney	b37b7b42dc	glx/apple: include util/debug.h for env_var_as_boolean prototype mesa/src/glx/glxcmds.c:1295:21: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration] mesa/src/glx/apple/apple_visual.c:85:28: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration] Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-01 15:14:02 +00:00
Jon Turney	f8ed9f24d5	osx: ld doesn't support --build-id Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-01 15:13:56 +00:00
Jon Turney	7ad7a07c88	configure: Default to gbm=no on osx Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-02-01 15:13:00 +00:00
Andres Rodriguez	bbd00844a2	mesa: remove usage of alloca in externalobjects.c v4 Don't want an overly large numBufferBarriers/numTextureBarriers to blow up the stack. v2: handle malloc errors v3: fix patch v4: initialize texObjs/bufObjs Suggested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com>	2018-02-01 09:48:04 -05:00
Samuel Pitoiset	2ef5ce1198	radv: do not insert shaders in cache when it's disabled When the application doesn't provide its own pipeline cache, the driver uses an in-memory cache but it shouldn't insert any entries when the cache is explicitly disabled by the user. Found while running my experimental pipeline-db tool with a ton of shaders, the memory footprint was just huge, and sometimes the process was even killed... Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-01 09:40:11 +01:00
Samuel Pitoiset	4922e7f25c	radv: use separate bindings for graphics and compute descriptors The Vulkan spec says: "pipelineBindPoint is a VkPipelineBindPoint indicating whether the descriptors will be used by graphics pipelines or compute pipelines. There is a separate set of bind points for each of graphics and compute, so binding one does not disturb the other." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104732 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-01 09:37:09 +01:00
Samuel Pitoiset	cf224014dd	radv: store the bind point when creating descriptors with templates Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-02-01 09:37:07 +01:00
Dave Airlie	7ea15a36fb	r600/eg: make sure we allow vpm bit on other CF ops. the vpm bit wasn't being applied to the push/pop instructions. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-01 13:41:32 +10:00
Timothy Arceri	4d982ae2c7	gallium/st/clover: remove unused PIPE_SHADER_IR_LLVM This has been unused since `100796c15c`. Acked-by: Marek Olšák <marek.olsak@amd.com>	2018-02-01 13:56:34 +11:00
Dave Airlie	0491d5425f	r600/sb: just add some missing debug bits Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-01 12:06:40 +10:00
Dave Airlie	df155a73f4	r600: fix buffer resinfo opcode translation. The vtx operations never got translated, so things worked by 0 being equal to 0, translate them so we can use the proper buffer resinfo code. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-02-01 11:59:55 +10:00
Timothy Arceri	679e4e7a46	st/glsl_to_nir: add more nir opts to st_nir_opts() All of the current gallium nir drivers use these optimisations but they do so in their backends. Having these called in the backend only can cause a number of problems: - Shader compile times are greater because the opts need to do significant passes over all shader variants. - The shader cache is partially defeated due to the significant optimisation passes over variants. - We might miss out on nir linking optimisation opportunities. Adding these passes to st_nir_opts() alleviates these problems. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-02-01 09:42:57 +11:00
Andres Gomez	5a7aba2e0a	i965: perform 2 uploads with dual slot 64PASSTHRU formats on gen<8 The emission of vertex attributes corresponding to dvec3 and dvec4 vertex shader input variables was not correct when the <size> passed to the VertexAttribL* commands was <= 2. In `61a8a55f55` ("i965/gen8: Fix vertex attrib upload for dvec3/4 shader inputs"), for gen8+ we needed to determine if the attrib was dual slot to emit 128 or 256-bit, independently of the VAO size. Similarly, for gen < 8 we also need to determine whether the attrib is dual slot to force the emission of 256-bits through 2 uploads. Additionally, we make use of the ISL_FORMAT_R32_FLOAT format in this second upload to fill these unspecified components with zeros, as we also do for gen8+. Fixes the following test on Haswell: KHR-GL46.vertex_attrib_binding.basic-inputL-case1 v2: Added more inline comments to explain why we are using ISL_FORMAT_R32_FLOAT and its consequences, as requested by Alejandro and Antía. Fixes: `75968a668e` ("i965/gen7: expose OpenGL 4.2 on Haswell when supported") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103006 Cc: Alejandro Piñeiro <apinheiro@igalia.com> Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: Antia Puentes <apuentes@igalia.com> Cc: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Antia Puentes <apuentes@igalia.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-31 22:50:06 +02:00
Kenneth Graunke	ab1f2e6bc4	i965: Make texture validation code use texture objects, not units. This requires moving the _MaxLevel handling up to the callers. Another user of intel_finalize_mipmap_tree will be added later that depends on _MaxLevel not being modified. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-01-31 11:33:52 -08:00
Kenneth Graunke	0a2e878c69	i965: Pass tObj into intel_update_max_level instead of intel_obj. We want both anyway, but this will simplify things a tiny bit in an upcoming patch. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-01-31 11:33:52 -08:00
Kenneth Graunke	876f1537e9	i965: Delete more misleading comments. brw_bo_wait_rendering used to take a brw_context pointer for perf_debug messages about stalls. Chris eliminated that in `833108ac14`. This message about passing NULL to avoid those warnings is no longer relevant, and just adds confusion. So, drop it.	2018-01-31 11:33:52 -08:00
Andres Rodriguez	8996610acb	docs/features: mark EXT_semaphore(_fd) as DONE v2 Support for these extensions is available in radeonsi. v2: also updated relnotes Signed-off-by: Andres Rodriguez <andresx7@gmail.com>	2018-01-31 12:31:40 -05:00
Brian Paul	d32c22a13f	st/mesa: whitespace, formatting fixes in st_glsl_to_tgsi.cpp Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-31 08:17:25 -07:00
Brian Paul	3b3d8275d8	st/mesa: s/int/GLenum/ in st_glsl_to_tgsi.cpp Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-31 08:17:25 -07:00
Brian Paul	1882ec4ff7	svga: use opcode local var to simplify some code Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-31 08:17:25 -07:00
Brian Paul	338c35c427	svga: s/unsigned/VGPU10_OPCODE_TYPE/ Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-31 08:17:25 -07:00
Samuel Pitoiset	a097a6f519	radv: do not dump meta shader stats That's quite useless and that pollutes the output. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-31 14:10:26 +01:00
Samuel Pitoiset	26cc3e74b9	ac/nir: fix emission of ffract for 64-bit Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-31 14:10:24 +01:00
Eric Engestrom	2f0db33527	meson: dedup gallium-xa logic Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-31 11:17:03 +00:00
Eric Engestrom	fa5d616bf9	meson: dedup gallium-va logic Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-31 11:17:03 +00:00
Eric Engestrom	86168ed31c	meson: dedup gallium-omx logic Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-31 11:17:03 +00:00
Eric Engestrom	724916c8a8	meson: dedup gallium-xvmc logic Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-31 11:17:03 +00:00
Eric Engestrom	992af0a4b8	meson: dedup gallium-vdpau logic Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-31 11:17:03 +00:00
Antia Puentes	0da434fb47	Revert "mesa: add missing RGB9_E5 format in _mesa_base_fbo_format" This reverts commit `513c2263cb`. _mesa_base_fbo_format_ is used to validate the internalformat passed to RenderbufferStorage, about which the OpenGL 4.6 specification says: "An INVALID_ENUM error is generated if internalformat is not one of the color-renderable, depth-renderable, or stencil-renderable formats defined in section 9.4." RGB9_E5 format is not renderable, as stated in the same specification (Bug 9338). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104794 Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>	2018-01-31 12:06:00 +01:00
Michel Dänzer	1cf1bf32ef	winsys/radeon: Compute is_displayable in surf_drm_to_winsys It was always 0, breaking (at least) DRI3 with Xwayland. Bugzilla: https://bugs.freedesktop.org/104306 Fixes: `5f2073be32` ("ac/surface: add ac_surface::is_displayable") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:53:58 +01:00
Matthew Nicholls	ef272b161e	radv: remove predication on cache flushes This can lead to a situation where cache flushes could get conditionally disabled while still clearing the flush_bits, and thus flushes due to application pipeline barriers may never get executed. Fixes: `a6c2001ace` (radv: add support for cmd predication.) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-31 13:37:18 +10:00
Brian Paul	1ea9efd2f8	mesa: fix broken glGet*(GL_POLYGON_MODE) query This reverts part of the patch which introduced the GLenum16 change. Fixes a conform regression found by Roland. Fixes: `f96a69f916` ("mesa: replace GLenum with GLenum16 in common structures (v4)") Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-30 20:32:37 -07:00
Dave Airlie	49c61d8b84	virgl: also remove dimension on indirect. This fixes some dEQP tests that generated bad shaders. Fixes: `b6f6ead19` (virgl: drop const dimensions on first block.) Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> Tested-by: Gurchetan Singh <gurchetansingh@chromium.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-31 12:24:11 +10:00
Marek Olšák	fdf01d0244	radeonsi: remove DBG_PRECOMPILE it's useless and shader-db stats only report the main shader part. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-31 03:21:20 +01:00
Marek Olšák	148b48646b	radeonsi: print shader-db stats for main parts, not final binaries This is needed to get shader-db stats for LS,HS,ES,GS stages on gfx9. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-31 03:21:20 +01:00
Marek Olšák	c02c9ee550	radeonsi: move max_simd_waves computation into a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-31 03:21:20 +01:00
Marek Olšák	a7311cd7ee	mesa: fix glGet MAX_VERTEX_ATTRIB queries Broken by `f96a69f916` Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-31 03:21:20 +01:00
Jason Ekstrand	97938dac36	anv/cmd_buffer: Re-emit the pipeline at every subpass If we ever hit this edge-case, it can theoretically cause problems for CNL because we could end up changing render targets without re-emitting 3DSTATE_MULTISAMPLE which is part of the pipeline. Just get rid of the edge case. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-01-30 17:16:33 -08:00
Ian Romanick	ee63933a73	nir: Distribute binary operations with constants into bcsel This was specifically designed to simplify 1+mix(0, a-1, condition) to mix(1, a, condition) by pushing the 1+ inside. Skylake, Broadwell, and Haswell had similar results. Skylake shown. total instructions in shared programs: 14521753 -> 14521716 (<.01%) instructions in affected programs: 10619 -> 10582 (-0.35%) helped: 51 HURT: 14 helped stats (abs) min: 1 max: 12 x̄: 1.43 x̃: 1 helped stats (rel) min: 0.20% max: 3.58% x̄: 1.01% x̃: 0.95% HURT stats (abs) min: 1 max: 11 x̄: 2.57 x̃: 1 HURT stats (rel) min: 0.22% max: 1.75% x̄: 1.20% x̃: 1.32% 95% mean confidence interval for instructions value: -1.31 0.17 95% mean confidence interval for instructions %-change: -0.80% -0.27% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533000205 -> 533003533 (<.01%) cycles in affected programs: 110610 -> 113938 (3.01%) helped: 43 HURT: 28 helped stats (abs) min: 6 max: 440 x̄: 27.12 x̃: 16 helped stats (rel) min: 0.39% max: 4.84% x̄: 1.60% x̃: 1.67% HURT stats (abs) min: 2 max: 3066 x̄: 160.50 x̃: 14 HURT stats (rel) min: 0.08% max: 77.78% x̄: 5.16% x̃: 0.62% 95% mean confidence interval for cycles value: -43.81 137.56 95% mean confidence interval for cycles %-change: -1.47% 3.60% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 10018840 -> 10018713 (<.01%) instructions in affected programs: 9431 -> 9304 (-1.35%) helped: 51 HURT: 3 helped stats (abs) min: 1 max: 80 x̄: 2.76 x̃: 1 helped stats (rel) min: 0.20% max: 16.43% x̄: 1.16% x̃: 0.81% HURT stats (abs) min: 1 max: 12 x̄: 4.67 x̃: 1 HURT stats (rel) min: 0.22% max: 1.33% x̄: 0.59% x̃: 0.22% 95% mean confidence interval for instructions value: -5.36 0.66 95% mean confidence interval for instructions %-change: -1.66% -0.46% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 87571944 -> 87572785 (<.01%) cycles in affected programs: 117234 -> 118075 (0.72%) helped: 42 HURT: 23 helped stats (abs) min: 2 max: 114 x̄: 51.90 x̃: 30 helped stats (rel) min: 0.11% max: 11.01% x̄: 4.45% x̃: 2.74% HURT stats (abs) min: 1 max: 2341 x̄: 131.35 x̃: 10 HURT stats (rel) min: 0.06% max: 37.11% x̄: 2.75% x̃: 0.61% 95% mean confidence interval for cycles value: -61.05 86.93 95% mean confidence interval for cycles %-change: -3.47% -0.33% Inconclusive result (value mean confidence interval includes 0). Sandy Bridge total instructions in shared programs: 10542933 -> 10542844 (<.01%) instructions in affected programs: 11487 -> 11398 (-0.77%) helped: 52 HURT: 3 helped stats (abs) min: 1 max: 40 x̄: 1.96 x̃: 1 helped stats (rel) min: 0.08% max: 8.16% x̄: 0.90% x̃: 0.72% HURT stats (abs) min: 1 max: 11 x̄: 4.33 x̃: 1 HURT stats (rel) min: 0.22% max: 1.22% x̄: 0.55% x̃: 0.22% 95% mean confidence interval for instructions value: -3.17 -0.07 95% mean confidence interval for instructions %-change: -1.13% -0.52% Instructions are helped. total cycles in shared programs: 146098397 -> 146097094 (<.01%) cycles in affected programs: 128140 -> 126837 (-1.02%) helped: 47 HURT: 8 helped stats (abs) min: 2 max: 333 x̄: 29.21 x̃: 18 helped stats (rel) min: 0.13% max: 5.04% x̄: 1.18% x̃: 0.95% HURT stats (abs) min: 1 max: 16 x̄: 8.75 x̃: 9 HURT stats (rel) min: 0.08% max: 0.43% x̄: 0.30% x̃: 0.34% 95% mean confidence interval for cycles value: -37.49 -9.90 95% mean confidence interval for cycles %-change: -1.22% -0.71% Cycles are helped. 
Iron Lake total instructions in shared programs: 7886711 -> 7886509 (<.01%) instructions in affected programs: 10425 -> 10223 (-1.94%) helped: 50 HURT: 2 helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1 helped stats (rel) min: 0.34% max: 15.38% x̄: 1.12% x̃: 0.54% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.86% max: 0.91% x̄: 0.89% x̃: 0.89% 95% mean confidence interval for instructions value: -8.05 0.28 95% mean confidence interval for instructions %-change: -1.83% -0.26% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 178115324 -> 178114612 (<.01%) cycles in affected programs: 765726 -> 765014 (-0.09%) helped: 39 HURT: 1 helped stats (abs) min: 2 max: 276 x̄: 18.31 x̃: 8 helped stats (rel) min: <.01% max: 8.47% x̄: 0.39% x̃: 0.04% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -32.07 -3.53 95% mean confidence interval for cycles %-change: -0.86% 0.10% Inconclusive result (%-change mean confidence interval includes 0). GM45 total instructions in shared programs: 4857762 -> 4857661 (<.01%) instructions in affected programs: 5523 -> 5422 (-1.83%) helped: 25 HURT: 1 helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1 helped stats (rel) min: 0.34% max: 13.61% x̄: 1.04% x̃: 0.52% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.86% max: 0.86% x̄: 0.86% x̃: 0.86% 95% mean confidence interval for instructions value: -9.99 2.22 95% mean confidence interval for instructions %-change: -2.01% 0.08% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 122179674 -> 122179194 (<.01%) cycles in affected programs: 530162 -> 529682 (-0.09%) helped: 22 HURT: 1 helped stats (abs) min: 2 max: 292 x̄: 21.91 x̃: 7 helped stats (rel) min: <.01% max: 8.65% x̄: 0.44% x̃: 0.04% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -46.56 4.82 95% mean confidence interval for cycles %-change: -1.20% 0.36% Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:15 -08:00
Ian Romanick	03fb13f646	nir: Rearrange logic op-compounded integer compares Skylake and Broadwell had similar results. Skylake shown. total instructions in shared programs: 14521769 -> 14521753 (<.01%) instructions in affected programs: 8782 -> 8766 (-0.18%) helped: 16 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.12% max: 0.40% x̄: 0.20% x̃: 0.18% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.23% -0.16% Instructions are helped. total cycles in shared programs: 533000376 -> 533000205 (<.01%) cycles in affected programs: 447035 -> 446864 (-0.04%) helped: 9 HURT: 9 helped stats (abs) min: 2 max: 40 x̄: 35.78 x̃: 40 helped stats (rel) min: 0.02% max: 0.18% x̄: 0.10% x̃: 0.09% HURT stats (abs) min: 1 max: 52 x̄: 16.78 x̃: 10 HURT stats (rel) min: <.01% max: 1.11% x̄: 0.29% x̃: 0.12% 95% mean confidence interval for cycles value: -25.07 6.07 95% mean confidence interval for cycles %-change: -0.08% 0.27% Inconclusive result (value mean confidence interval includes 0). No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Ian Romanick	053be9f020	nir: Rearrange and-compounded float compares If both comparisons are used as sources for instructions other than the iand, this transformation is detrimental. If the non-identical value in both compares is constant, the fmin or fmax will be constant-folded away, so the transformation is always a win. It is interesting to me that on Iron Lake only 81 shaders have instruction counts changed, but 726 shaders have cycle counts changed. shader-db results: Skylake total instructions in shared programs: 14525728 -> 14521017 (-0.03%) instructions in affected programs: 1164726 -> 1160015 (-0.40%) helped: 1692 HURT: 5 helped stats (abs) min: 1 max: 637 x̄: 2.79 x̃: 2 helped stats (rel) min: 0.07% max: 16.36% x̄: 0.81% x̃: 0.33% HURT stats (abs) min: 1 max: 12 x̄: 3.20 x̃: 1 HURT stats (rel) min: 0.38% max: 2.86% x̄: 2.36% x̃: 2.86% 95% mean confidence interval for instructions value: -3.52 -2.03 95% mean confidence interval for instructions %-change: -0.86% -0.74% Instructions are helped. total cycles in shared programs: 533115449 -> 532991404 (-0.02%) cycles in affected programs: 119401803 -> 119277758 (-0.10%) helped: 1145 HURT: 467 helped stats (abs) min: 1 max: 34644 x̄: 145.92 x̃: 18 helped stats (rel) min: <.01% max: 45.33% x̄: 1.58% x̃: 0.42% HURT stats (abs) min: 1 max: 1590 x̄: 92.15 x̃: 15 HURT stats (rel) min: <.01% max: 13.48% x̄: 1.26% x̃: 0.39% 95% mean confidence interval for cycles value: -122.16 -31.74 95% mean confidence interval for cycles %-change: -0.94% -0.57% Cycles are helped. 
total spills in shared programs: 9597 -> 9534 (-0.66%) spills in affected programs: 403 -> 340 (-15.63%) helped: 1 HURT: 1 total fills in shared programs: 13904 -> 13790 (-0.82%) fills in affected programs: 1627 -> 1513 (-7.01%) helped: 2 HURT: 1 LOST: 0 GAINED: 2 Broadwell total instructions in shared programs: 14816966 -> 14812590 (-0.03%) instructions in affected programs: 1499885 -> 1495509 (-0.29%) helped: 1672 HURT: 15 helped stats (abs) min: 1 max: 455 x̄: 2.70 x̃: 2 helped stats (rel) min: 0.05% max: 16.36% x̄: 0.81% x̃: 0.33% HURT stats (abs) min: 1 max: 21 x̄: 9.20 x̃: 8 HURT stats (rel) min: 0.08% max: 2.86% x̄: 1.06% x̃: 0.53% 95% mean confidence interval for instructions value: -3.14 -2.05 95% mean confidence interval for instructions %-change: -0.85% -0.73% Instructions are helped. total cycles in shared programs: 559353622 -> 559345595 (<.01%) cycles in affected programs: 139893703 -> 139885676 (<.01%) helped: 921 HURT: 697 helped stats (abs) min: 1 max: 42424 x̄: 143.45 x̃: 18 helped stats (rel) min: <.01% max: 36.23% x̄: 2.02% x̃: 0.87% HURT stats (abs) min: 1 max: 2370 x̄: 178.03 x̃: 38 HURT stats (rel) min: <.01% max: 17.35% x̄: 0.71% x̃: 0.14% 95% mean confidence interval for cycles value: -59.64 49.72 95% mean confidence interval for cycles %-change: -1.02% -0.66% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 78902 -> 78861 (-0.05%) spills in affected programs: 2418 -> 2377 (-1.70%) helped: 1 HURT: 11 total fills in shared programs: 83782 -> 83678 (-0.12%) fills in affected programs: 3515 -> 3411 (-2.96%) helped: 2 HURT: 11 LOST: 0 GAINED: 5 Haswell and Ivy Bridge had similar results. Haswell shown. 
total instructions in shared programs: 9033898 -> 9032010 (-0.02%) instructions in affected programs: 308064 -> 306176 (-0.61%) helped: 921 HURT: 4 helped stats (abs) min: 1 max: 20 x̄: 2.05 x̃: 1 helped stats (rel) min: 0.17% max: 17.54% x̄: 0.80% x̃: 0.35% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23% 95% mean confidence interval for instructions value: -2.21 -1.87 95% mean confidence interval for instructions %-change: -0.88% -0.68% Instructions are helped. total cycles in shared programs: 84628949 -> 84620520 (<.01%) cycles in affected programs: 2164913 -> 2156484 (-0.39%) helped: 518 HURT: 359 helped stats (abs) min: 1 max: 440 x̄: 41.52 x̃: 20 helped stats (rel) min: <.01% max: 17.17% x̄: 1.95% x̃: 1.01% HURT stats (abs) min: 1 max: 586 x̄: 36.43 x̃: 8 HURT stats (rel) min: 0.04% max: 18.65% x̄: 1.47% x̃: 0.40% 95% mean confidence interval for cycles value: -15.17 -4.05 95% mean confidence interval for cycles %-change: -0.77% -0.32% Cycles are helped. LOST: 0 GAINED: 4 Sandy Bridge total instructions in shared programs: 10544860 -> 10542933 (-0.02%) instructions in affected programs: 360019 -> 358092 (-0.54%) helped: 931 HURT: 4 helped stats (abs) min: 1 max: 20 x̄: 2.07 x̃: 1 helped stats (rel) min: 0.11% max: 15.52% x̄: 0.68% x̃: 0.30% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 3.33% max: 3.33% x̄: 3.33% x̃: 3.33% 95% mean confidence interval for instructions value: -2.23 -1.89 95% mean confidence interval for instructions %-change: -0.76% -0.58% Instructions are helped. 
total cycles in shared programs: 146106820 -> 146098397 (<.01%) cycles in affected programs: 3435047 -> 3426624 (-0.25%) helped: 572 HURT: 329 helped stats (abs) min: 1 max: 1289 x̄: 32.52 x̃: 15 helped stats (rel) min: <.01% max: 26.29% x̄: 0.97% x̃: 0.33% HURT stats (abs) min: 1 max: 1714 x̄: 30.93 x̃: 6 HURT stats (rel) min: 0.02% max: 41.31% x̄: 1.13% x̃: 0.19% 95% mean confidence interval for cycles value: -16.85 -1.85 95% mean confidence interval for cycles %-change: -0.39% -0.01% Cycles are helped. LOST: 1 GAINED: 0 Iron Lake total instructions in shared programs: 7886925 -> 7886711 (<.01%) instructions in affected programs: 25763 -> 25549 (-0.83%) helped: 75 HURT: 6 helped stats (abs) min: 1 max: 13 x̄: 3.33 x̃: 1 helped stats (rel) min: 0.35% max: 17.57% x̄: 1.96% x̃: 0.53% HURT stats (abs) min: 1 max: 16 x̄: 6.00 x̃: 1 HURT stats (rel) min: 2.86% max: 4.79% x̄: 3.49% x̃: 2.86% 95% mean confidence interval for instructions value: -3.69 -1.60 95% mean confidence interval for instructions %-change: -2.54% -0.57% Instructions are helped. total cycles in shared programs: 178116888 -> 178115324 (<.01%) cycles in affected programs: 5858790 -> 5857226 (-0.03%) helped: 484 HURT: 242 helped stats (abs) min: 2 max: 76 x̄: 5.27 x̃: 6 helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06% HURT stats (abs) min: 2 max: 76 x̄: 4.07 x̃: 2 HURT stats (rel) min: 0.01% max: 3.99% x̄: 0.19% x̃: 0.03% 95% mean confidence interval for cycles value: -2.76 -1.55 95% mean confidence interval for cycles %-change: -0.12% 0.01% Inconclusive result (%-change mean confidence interval includes 0). 
GM45 total instructions in shared programs: 4857870 -> 4857762 (<.01%) instructions in affected programs: 13994 -> 13886 (-0.77%) helped: 39 HURT: 5 helped stats (abs) min: 1 max: 13 x̄: 3.28 x̃: 2 helped stats (rel) min: 0.33% max: 17.11% x̄: 1.86% x̃: 0.48% HURT stats (abs) min: 1 max: 16 x̄: 4.00 x̃: 1 HURT stats (rel) min: 2.86% max: 4.71% x̄: 3.23% x̃: 2.86% 95% mean confidence interval for instructions value: -3.86 -1.05 95% mean confidence interval for instructions %-change: -2.61% 0.04% Inconclusive result (%-change mean confidence interval includes 0). total cycles in shared programs: 122180744 -> 122179674 (<.01%) cycles in affected programs: 3686646 -> 3685576 (-0.03%) helped: 273 HURT: 141 helped stats (abs) min: 2 max: 76 x̄: 5.81 x̃: 6 helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06% HURT stats (abs) min: 2 max: 76 x̄: 3.66 x̃: 2 HURT stats (rel) min: 0.01% max: 3.99% x̄: 0.16% x̃: 0.02% 95% mean confidence interval for cycles value: -3.42 -1.75 95% mean confidence interval for cycles %-change: -0.15% 0.03% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Ian Romanick	821e7a4d32	nir: Separate a weird compare with zero to two compares with zero min(a+b, c+d) >= 0 becomes (a+b >= 0 && c+d >= 0). No shader-db changes, but it does prevent 6 to 12 instruction regressions in the next patch on all measured Intel platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Ian Romanick	68420d8322	nir: Simplify min and max of b2f v2: Rebase on almost 2 years. Require that one of the arguments to fmin or fmax be used only once. This prevents some regressions. shader-db results: Skylake and Broadwell had similar results. Skylake shown. total instructions in shared programs: 14526021 -> 14525913 (<.01%) instructions in affected programs: 4613 -> 4505 (-2.34%) helped: 31 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 3.48 x̃: 4 helped stats (rel) min: 0.62% max: 6.67% x̄: 3.31% x̃: 2.42% total cycles in shared programs: 533118710 -> 533118403 (<.01%) cycles in affected programs: 34334 -> 34027 (-0.89%) helped: 24 HURT: 0 helped stats (abs) min: 4 max: 24 x̄: 12.79 x̃: 14 helped stats (rel) min: 0.25% max: 2.40% x̄: 1.08% x̃: 1.03% No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Ian Romanick	d8d18516b0	nir: Undo possible damage caused by rearranging or-compounded float compares shader-db results: Skylake and Broadwell had similar results (Skylake shown) total instructions in shared programs: 14525898 -> 14525836 (<.01%) instructions in affected programs: 1964 -> 1902 (-3.16%) helped: 14 HURT: 0 helped stats (abs) min: 1 max: 25 x̄: 4.43 x̃: 1 helped stats (rel) min: 0.68% max: 9.77% x̄: 2.10% x̃: 0.86% 95% mean confidence interval for instructions value: -9.46 0.60 95% mean confidence interval for instructions %-change: -3.97% -0.24% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 533119892 -> 533115756 (<.01%) cycles in affected programs: 96061 -> 91925 (-4.31%) helped: 13 HURT: 1 helped stats (abs) min: 60 max: 596 x̄: 318.77 x̃: 300 helped stats (rel) min: 1.15% max: 5.49% x̄: 4.27% x̃: 4.42% HURT stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8 HURT stats (rel) min: 0.46% max: 0.46% x̄: 0.46% x̃: 0.46% 95% mean confidence interval for cycles value: -379.43 -211.43 95% mean confidence interval for cycles %-change: -4.84% -3.01% Cycles are helped. Haswell, Ivy Bridge and Sandy Bridge had similar results (Haswell shown). total instructions in shared programs: 9033948 -> 9033898 (<.01%) instructions in affected programs: 535 -> 485 (-9.35%) helped: 2 HURT: 0 total cycles in shared programs: 84631402 -> 84628949 (<.01%) cycles in affected programs: 63197 -> 60744 (-3.88%) helped: 13 HURT: 2 helped stats (abs) min: 1 max: 594 x̄: 189.62 x̃: 140 helped stats (rel) min: 0.07% max: 5.04% x̄: 3.79% x̃: 4.01% HURT stats (abs) min: 4 max: 8 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.17% max: 0.45% x̄: 0.31% x̃: 0.31% 95% mean confidence interval for cycles value: -253.40 -73.67 95% mean confidence interval for cycles %-change: -4.24% -2.25% Cycles are helped. No changes on GM45 or Iron Lake. v2: Add a couple more tautological compares. Suggested by Elie. 
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Ian Romanick	3941cba0f7	nir: Be more conservative about rearranging or-compounded compares If both comparisons are used as sources for instructions other than the ior, this transformation is detrimental. If the non-identical value in both compares is constant, the fmin or fmax will be constant-folded away, so the transformation is always a win. shader-db results: Skylake total instructions in shared programs: 14526147 -> 14525898 (<.01%) instructions in affected programs: 70239 -> 69990 (-0.35%) helped: 102 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.44 x̃: 1 helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20% 95% mean confidence interval for instructions value: -2.86 -2.02 95% mean confidence interval for instructions %-change: -0.46% -0.31% Instructions are helped. total cycles in shared programs: 533120531 -> 533119892 (<.01%) cycles in affected programs: 994875 -> 994236 (-0.06%) helped: 76 HURT: 26 helped stats (abs) min: 1 max: 324 x̄: 27.09 x̃: 13 helped stats (rel) min: <.01% max: 4.21% x̄: 0.45% x̃: 0.18% HURT stats (abs) min: 1 max: 167 x̄: 54.62 x̃: 26 HURT stats (rel) min: <.01% max: 4.36% x̄: 1.01% x̃: 0.39% 95% mean confidence interval for cycles value: -19.44 6.91 95% mean confidence interval for cycles %-change: -0.30% 0.15% Inconclusive result (value mean confidence interval includes 0). Broadwell total instructions in shared programs: 14816005 -> 14815787 (<.01%) instructions in affected programs: 64658 -> 64440 (-0.34%) helped: 97 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 2.25 x̃: 1 helped stats (rel) min: 0.07% max: 2.30% x̄: 0.38% x̃: 0.20% 95% mean confidence interval for instructions value: -2.62 -1.87 95% mean confidence interval for instructions %-change: -0.45% -0.30% Instructions are helped. 
total cycles in shared programs: 559340386 -> 559339907 (<.01%) cycles in affected programs: 1090491 -> 1090012 (-0.04%) helped: 66 HURT: 28 helped stats (abs) min: 2 max: 198 x̄: 23.83 x̃: 16 helped stats (rel) min: 0.01% max: 4.21% x̄: 0.47% x̃: 0.27% HURT stats (abs) min: 2 max: 226 x̄: 39.07 x̃: 11 HURT stats (rel) min: <.01% max: 4.61% x̄: 0.64% x̃: 0.20% 95% mean confidence interval for cycles value: -15.94 5.75 95% mean confidence interval for cycles %-change: -0.35% 0.07% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 1 Haswell total instructions in shared programs: 9034106 -> 9033948 (<.01%) instructions in affected programs: 24096 -> 23938 (-0.66%) helped: 38 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 4.16 x̃: 4 helped stats (rel) min: 0.42% max: 2.29% x̄: 0.71% x̃: 0.64% 95% mean confidence interval for instructions value: -4.71 -3.60 95% mean confidence interval for instructions %-change: -0.84% -0.58% Instructions are helped. total cycles in shared programs: 84631628 -> 84631402 (<.01%) cycles in affected programs: 148674 -> 148448 (-0.15%) helped: 14 HURT: 14 helped stats (abs) min: 1 max: 114 x̄: 22.14 x̃: 12 helped stats (rel) min: 0.02% max: 2.98% x̄: 0.66% x̃: 0.21% HURT stats (abs) min: 1 max: 10 x̄: 6.00 x̃: 5 HURT stats (rel) min: 0.01% max: 0.20% x̄: 0.12% x̃: 0.11% 95% mean confidence interval for cycles value: -19.42 3.28 95% mean confidence interval for cycles %-change: -0.59% 0.05% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total instructions in shared programs: 10015456 -> 10015293 (<.01%) instructions in affected programs: 27701 -> 27538 (-0.59%) helped: 38 HURT: 0 helped stats (abs) min: 1 max: 9 x̄: 4.29 x̃: 4 helped stats (rel) min: 0.33% max: 2.79% x̄: 0.66% x̃: 0.52% 95% mean confidence interval for instructions value: -4.87 -3.71 95% mean confidence interval for instructions %-change: -0.82% -0.51% Instructions are helped. 
total cycles in shared programs: 87524771 -> 87524569 (<.01%) cycles in affected programs: 112324 -> 112122 (-0.18%) helped: 6 HURT: 12 helped stats (abs) min: 2 max: 111 x̄: 44.67 x̃: 20 helped stats (rel) min: 0.02% max: 2.94% x̄: 1.45% x̃: 1.26% HURT stats (abs) min: 1 max: 16 x̄: 5.50 x̃: 5 HURT stats (rel) min: <.01% max: 0.16% x̄: 0.08% x̃: 0.08% 95% mean confidence interval for cycles value: -29.14 6.69 95% mean confidence interval for cycles %-change: -0.93% 0.08% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 2 Sandy Bridge total instructions in shared programs: 10545655 -> 10545465 (<.01%) instructions in affected programs: 37198 -> 37008 (-0.51%) helped: 42 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 4.52 x̃: 4 helped stats (rel) min: 0.31% max: 2.15% x̄: 0.58% x̃: 0.49% 95% mean confidence interval for instructions value: -5.14 -3.91 95% mean confidence interval for instructions %-change: -0.68% -0.47% Instructions are helped. total cycles in shared programs: 146113059 -> 146112427 (<.01%) cycles in affected programs: 423514 -> 422882 (-0.15%) helped: 32 HURT: 10 helped stats (abs) min: 4 max: 162 x̄: 24.34 x̃: 12 helped stats (rel) min: 0.06% max: 2.74% x̄: 0.37% x̃: 0.11% HURT stats (abs) min: 12 max: 19 x̄: 14.70 x̃: 14 HURT stats (rel) min: 0.10% max: 0.18% x̄: 0.16% x̃: 0.14% 95% mean confidence interval for cycles value: -26.03 -4.07 95% mean confidence interval for cycles %-change: -0.43% -0.05% Cycles are helped. Iron Lake total instructions in shared programs: 7886959 -> 7886925 (<.01%) instructions in affected programs: 1340 -> 1306 (-2.54%) helped: 4 HURT: 0 helped stats (abs) min: 2 max: 15 x̄: 8.50 x̃: 8 helped stats (rel) min: 0.63% max: 4.30% x̄: 2.45% x̃: 2.43% 95% mean confidence interval for instructions value: -20.44 3.44 95% mean confidence interval for instructions %-change: -5.78% 0.89% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 178116996 -> 178116888 (<.01%) cycles in affected programs: 6262 -> 6154 (-1.72%) helped: 2 HURT: 2 helped stats (abs) min: 44 max: 78 x̄: 61.00 x̃: 61 helped stats (rel) min: 3.31% max: 3.94% x̄: 3.62% x̃: 3.62% HURT stats (abs) min: 6 max: 8 x̄: 7.00 x̃: 7 HURT stats (rel) min: 0.34% max: 0.68% x̄: 0.51% x̃: 0.51% 95% mean confidence interval for cycles value: -93.27 39.27 95% mean confidence interval for cycles %-change: -5.38% 2.27% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4857887 -> 4857870 (<.01%) instructions in affected programs: 674 -> 657 (-2.52%) helped: 2 HURT: 0 total cycles in shared programs: 122180816 -> 122180744 (<.01%) cycles in affected programs: 3764 -> 3692 (-1.91%) helped: 1 HURT: 1 helped stats (abs) min: 78 max: 78 x̄: 78.00 x̃: 78 helped stats (rel) min: 3.94% max: 3.94% x̄: 3.94% x̃: 3.94% HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.34% max: 0.34% x̄: 0.34% x̃: 0.34% Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Ian Romanick	cfc0d34802	nir: See through an fneg to apply existing optimizations Doing the same for the existing feq and fne transformations didn't help anything in shader-db. shader-db results: Broadwell and Skylake (Skylake shown) total instructions in shared programs: 14529463 -> 14526147 (-0.02%) instructions in affected programs: 402420 -> 399104 (-0.82%) helped: 2136 HURT: 131 helped stats (abs) min: 1 max: 10 x̄: 1.61 x̃: 1 helped stats (rel) min: 0.03% max: 16.22% x̄: 3.14% x̃: 1.12% HURT stats (abs) min: 1 max: 2 x̄: 1.01 x̃: 1 HURT stats (rel) min: 0.13% max: 7.69% x̄: 0.75% x̃: 0.57% 95% mean confidence interval for instructions value: -1.51 -1.41 95% mean confidence interval for instructions %-change: -3.06% -2.78% Instructions are helped. total cycles in shared programs: 533146915 -> 533120531 (<.01%) cycles in affected programs: 10356261 -> 10329877 (-0.25%) helped: 1933 HURT: 844 helped stats (abs) min: 1 max: 490 x̄: 29.44 x̃: 16 helped stats (rel) min: <.01% max: 28.57% x̄: 3.43% x̃: 1.88% HURT stats (abs) min: 1 max: 423 x̄: 36.17 x̃: 12 HURT stats (rel) min: <.01% max: 23.75% x̄: 1.90% x̃: 0.59% 95% mean confidence interval for cycles value: -11.78 -7.22 95% mean confidence interval for cycles %-change: -1.98% -1.65% Cycles are helped. Haswell total instructions in shared programs: 9037416 -> 9034106 (-0.04%) instructions in affected programs: 389831 -> 386521 (-0.85%) helped: 2184 HURT: 120 helped stats (abs) min: 1 max: 11 x̄: 1.57 x̃: 1 helped stats (rel) min: 0.03% max: 25.00% x̄: 2.73% x̃: 1.02% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.19% max: 7.69% x̄: 0.81% x̃: 0.57% 95% mean confidence interval for instructions value: -1.49 -1.39 95% mean confidence interval for instructions %-change: -2.68% -2.41% Instructions are helped. 
total cycles in shared programs: 84636243 -> 84631628 (<.01%) cycles in affected programs: 4745058 -> 4740443 (-0.10%) helped: 1904 HURT: 960 helped stats (abs) min: 1 max: 466 x̄: 30.21 x̃: 18 helped stats (rel) min: 0.02% max: 36.36% x̄: 3.57% x̃: 2.38% HURT stats (abs) min: 1 max: 1080 x̄: 55.11 x̃: 14 HURT stats (rel) min: 0.02% max: 51.33% x̄: 2.77% x̃: 0.81% 95% mean confidence interval for cycles value: -4.51 1.29 95% mean confidence interval for cycles %-change: -1.64% -1.25% Inconclusive result (value mean confidence interval includes 0). LOST: 1 GAINED: 0 Sandy Bridge and Ivy Bridge (Ivy Bridge shown) total instructions in shared programs: 10018873 -> 10015456 (-0.03%) instructions in affected programs: 512820 -> 509403 (-0.67%) helped: 2268 HURT: 162 helped stats (abs) min: 1 max: 11 x̄: 1.62 x̃: 1 helped stats (rel) min: 0.03% max: 25.00% x̄: 2.47% x̃: 0.88% HURT stats (abs) min: 1 max: 4 x̄: 1.59 x̃: 1 HURT stats (rel) min: 0.09% max: 7.69% x̄: 0.86% x̃: 0.50% 95% mean confidence interval for instructions value: -1.46 -1.35 95% mean confidence interval for instructions %-change: -2.38% -2.12% Instructions are helped. total cycles in shared programs: 87538223 -> 87524771 (-0.02%) cycles in affected programs: 5435520 -> 5422068 (-0.25%) helped: 1916 HURT: 946 helped stats (abs) min: 1 max: 1392 x̄: 29.44 x̃: 18 helped stats (rel) min: <.01% max: 34.51% x̄: 3.34% x̃: 1.97% HURT stats (abs) min: 1 max: 633 x̄: 45.41 x̃: 11 HURT stats (rel) min: 0.02% max: 25.95% x̄: 2.41% x̃: 0.62% 95% mean confidence interval for cycles value: -7.34 -2.06 95% mean confidence interval for cycles %-change: -1.62% -1.26% Cycles are helped. 
LOST: 1 GAINED: 0 Iron Lake total instructions in shared programs: 7888446 -> 7886959 (-0.02%) instructions in affected programs: 331581 -> 330094 (-0.45%) helped: 1160 HURT: 97 helped stats (abs) min: 1 max: 10 x̄: 1.37 x̃: 1 helped stats (rel) min: 0.02% max: 9.68% x̄: 0.93% x̃: 0.43% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.17% max: 4.17% x̄: 0.37% x̃: 0.25% 95% mean confidence interval for instructions value: -1.25 -1.12 95% mean confidence interval for instructions %-change: -0.91% -0.75% Instructions are helped. total cycles in shared programs: 178130766 -> 178116996 (<.01%) cycles in affected programs: 12534564 -> 12520794 (-0.11%) helped: 1856 HURT: 187 helped stats (abs) min: 2 max: 202 x̄: 7.78 x̃: 4 helped stats (rel) min: <.01% max: 6.47% x̄: 0.28% x̃: 0.11% HURT stats (abs) min: 2 max: 26 x̄: 3.55 x̃: 2 HURT stats (rel) min: 0.01% max: 2.14% x̄: 0.08% x̃: 0.02% 95% mean confidence interval for cycles value: -7.41 -6.07 95% mean confidence interval for cycles %-change: -0.28% -0.22% Cycles are helped. GM45 total instructions in shared programs: 4858912 -> 4857887 (-0.02%) instructions in affected programs: 237565 -> 236540 (-0.43%) helped: 867 HURT: 57 helped stats (abs) min: 1 max: 10 x̄: 1.25 x̃: 1 helped stats (rel) min: 0.02% max: 9.38% x̄: 0.87% x̃: 0.43% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.16% max: 3.85% x̄: 0.34% x̃: 0.22% 95% mean confidence interval for instructions value: -1.18 -1.04 95% mean confidence interval for instructions %-change: -0.88% -0.71% Instructions are helped. 
total cycles in shared programs: 122189118 -> 122180816 (<.01%) cycles in affected programs: 8776418 -> 8768116 (-0.09%) helped: 1213 HURT: 166 helped stats (abs) min: 2 max: 202 x̄: 7.30 x̃: 4 helped stats (rel) min: <.01% max: 6.43% x̄: 0.25% x̃: 0.11% HURT stats (abs) min: 2 max: 26 x̄: 3.35 x̃: 2 HURT stats (rel) min: 0.01% max: 2.14% x̄: 0.06% x̃: 0.02% 95% mean confidence interval for cycles value: -6.78 -5.26 95% mean confidence interval for cycles %-change: -0.24% -0.18% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Elie Tournier <elie.tournier@collabora.com>	2018-01-30 15:40:14 -08:00
Timothy Arceri	283e25102b	st/glsl_to_nir: disable io lowering and array splitting of fs inputs We need this to be able to support the interpolateAt builtins in a sane way. It also leads to the generation of more optimal code. The lowering and splitting is made conditional on lower_all_io_to_temps because vc4 and freedreno both expect these passes to be enabled and neither supports glsl 400, so they don't need to deal with the interpolateAt builtins. We leave the other stages alone for now to avoid regressions. Ideally we could remove the stage checks and just set the nir options correctly for each stage. However all gallium drivers currently just return the same nir compiler options for all stages, and it's probably more trouble than it's worth to change this. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:08 +11:00
Timothy Arceri	9a2e085680	nir: add lower_all_io_to_temps flag This will be used for freedreno and vc4 which require all inputs and outputs to be copied to temps. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:08 +11:00
Timothy Arceri	3218756262	nir/st_glsl_to_nir: add param to disable splitting of inputs We need this because we will always copy fs outputs to temps and split the arrays, but do not want to do either of these with fs inputs as it is unnecessary and makes handling interpolateAt builtins difficult. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:08 +11:00
Timothy Arceri	93e213f91f	st/glsl_to_nir: copy nir compiler options to context Various nir passes may expect this to be here as does the nir serialisation pass. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:08 +11:00
Timothy Arceri	dd6d6c63a7	radeonsi/nir: add input support for arrays that have not been copied to temps and split We need this to be able to support the interpolateAt builtins in a sane way. It also leads to the generation of more optimal code. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Timothy Arceri	d185190222	ac/radeonsi: add lookup_interp_param and load_sample_position to the abi This will enable the interpolateAt builtins to work on the radeonsi nir backend. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Timothy Arceri	97058168a4	radeonsi/nir: add prim_mask to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Timothy Arceri	3ff012f142	radeonsi/nir: adjust load_sample_position() to be shared between backends With this interface change it can be shared between the tgsi and nir backends. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Timothy Arceri	3a47b138e3	radeonsi/nir: add si_nir_lookup_interp_param() helper Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Timothy Arceri	b8808848ce	ac/nir_to_llvm: move some interp defines to the header These will be used in the following patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Timothy Arceri	fea6da9aaa	radeonsi/nir: move the interpolation qualifier scanning We need to collect this when scanning over the instruction rather than when scanning over the inputs otherwise we might get conflicting values for inputs that are used by the interpolateAt* builtins. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Timothy Arceri	580f1aa247	radeonsi/nir: add interpolate at intrinsics to scan_instruction() V2: use the uses__opcode_interp_ flags Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 09:14:07 +11:00
Bas Nieuwenhuizen	882eff4d20	radv: Merge raster state with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:02:05 +01:00
Bas Nieuwenhuizen	69364f1c34	radv: Move gs state out of pipeline. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:02:01 +01:00
Bas Nieuwenhuizen	e4e060d135	radv: Split out cliprect rule generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:56 +01:00
Bas Nieuwenhuizen	acbaef3005	radv: Merge VGT_GS_MODE computation with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:52 +01:00
Bas Nieuwenhuizen	4ae6a8b0cd	radv: Split out processing the vertex input state. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:41 +01:00
Bas Nieuwenhuizen	9062b1c241	radv: Move tessellation state out of pipeline. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:38 +01:00
Bas Nieuwenhuizen	4aa1cb4e90	radv: Move blend state out of pipeline. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:34 +01:00
Bas Nieuwenhuizen	0f72f0eacb	radv: Split out generating VGT_SHADER_STAGES_EN. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:30 +01:00
Bas Nieuwenhuizen	694c34314b	radv: Split out the ia_multi_vgt_param precomputation. Also moved everything in a struct and then return the struct from the helper function, so it is clear in the caller what part of the pipeline gets modified. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:26 +01:00
Bas Nieuwenhuizen	0bea0851aa	radv: Split out db_shader_control computation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:18 +01:00
Bas Nieuwenhuizen	5dce47ae6d	radv: Compute shader_z_format when emitting it. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:13 +01:00
Bas Nieuwenhuizen	df2e7ab0db	radv: Merge depth stencil state with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:06 +01:00
Bas Nieuwenhuizen	d5a0af84ec	radv: Merge ps_input_cntl computation with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:01:01 +01:00
Bas Nieuwenhuizen	e2bf18030d	radv: Merge vtx_reuse_depth computation with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:55 +01:00
Bas Nieuwenhuizen	c80747b32c	radv: Merge vs state computation with PM4 generation. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:50 +01:00
Bas Nieuwenhuizen	c4191cf944	radv: Merge binning state generation with pm4 emission. We don't need the pipeline state struct anymore. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:45 +01:00
Bas Nieuwenhuizen	6f1a3f081e	radv: Constify some pipeline helpers. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:40 +01:00
Bas Nieuwenhuizen	f0c9ef410a	radv: Add PM4 pregeneration for compute pipelines. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:34 +01:00
Bas Nieuwenhuizen	beeab44190	radv: Record a PM4 sequence for graphics pipeline switches. This gives about 2% performance improvement on dota2 for me. This is mostly a mechanical copy and replacement, but at bind time we still do: 1) Some stuff that is only based on num_samples changes. 2) Some command buffer state setting. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:22 +01:00
Bas Nieuwenhuizen	7c366bc152	radv: Determine unneeded dynamic states. Which avoids setting or emitting them. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-30 22:00:17 +01:00
Andres Rodriguez	0a89784bcc	mesa: check for invalid index on UUID glGet queries This fixes the piglit test: spec/ext_semaphore/api-errors/usigned-byte-i-v-bad-value Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	566ed727a4	mesa: fix glGet for ext_external_objects parameters This allows the client to actually query the enums specified in the ext_external_objects spec. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	0ebd3cc863	mesa: fix error codes for importing memory/semaphore FDs This fixes the following piglit tests: spec/ext_semaphore_fd/api-errors/import-semaphore-fd-bad-enum spec/ext_memory_object_fd/api-errors/import-memory-fd-bad-enum Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	50b06cbc10	radeonsi: fix fence_server_sync() holding up extra work v2 When calling si_fence_server_sync(), the wait operation is associated with the next kernel submission. Therefore, any unflushed work submitted previous to fence_server_sync() will also be affected by the wait. To avoid adding the dependency to the unflushed work, we flush before emitting the fence dependency. v2: s/semaphore/fence Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	e0f16ee666	radeonsi: implement semaphore_server_signal v2 Syncobj based waits or signals only happen at submission boundaries. In order to guarantee that the requested signal event will occur when the state tracker requested it, we must issue a flush. v2: s/fence/semaphore for pipe objects Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	5b07b06d6b	radeonsi: add support for importing PIPE_FD_TYPE_SYNCOBJ semaphores Hook up importing semaphores of type PIPE_FD_TYPE_SYNCOBJ Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	cc9762d74d	winsys/amdgpu: add support for syncobj signaling v3 Add the ability to signal a syncobj when a cs completes execution. v2: corresponding changes for gallium fence->semaphore rename v3: s/semaphore/fence for pipe objects Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	29b9bd0539	mesa/st: add support for semaphore object signal/wait v4 Bits to implement ServerWaitSemaphoreObject/ServerSignalSemaphoreObject v2: - corresponding changes for gallium fence->semaphore rename - flushing moved to mesa/main v3: s/semaphore/fence for pipe objects v4: add bitmap flushing Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	89b52891fd	mesa: add support for semaphore object signal/wait v3 Memory synchronization is left for a future patch. v2: flush vertices/bitmaps moved to mesa/main v3: removed spaces before/after braces Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	260f7fcc46	mesa: add semaphore parameter stub v2 EXT_semaphore and EXT_semaphore_fd define no pnames. Therefore there isn't much to do besides determining the correct error code. v2: removed useless return Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	382067f065	mesa/st: add support for semaphore object create/import/delete v3 Add basic semaphore object operations. v2: s/semaphore/fence for pipe objects v3: added missing license headers Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	67d5d08682	mesa: add support for semaphore object creation/import/delete v3 Used by EXT_semaphore and EXT_semaphore_fd v2: Removed unnecessary dummy callback initialization v3: Fixed attempting to free the DummySemaphoreObject Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	8e635f7d65	mesa/st: introduce EXT_semaphore and EXT_semaphore_fd v2 Guarded by PIPE_CAP_SEMAPHORE_SIGNAL v2: corresponding changes for PIPE_CAP_SEMAPHORE_SIGNAL rename Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	fde1afc495	u_threaded_context: add support for fence_server_signal v2 v2: s/semaphore/fence Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	d34c2cf3e6	gallium: add fence_server_signal() v2 Calling this function will emit a fence signal operation into the GPU's command stream. v2: documentation typos Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	458f89be78	gallium: introduce PIPE_FD_TYPE_SYNCOBJ Denotes that a fd is backed by a syncobj. For example, radv shared semaphores. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	2ab405d254	gallium: introduce PIPE_CAP_FENCE_SIGNAL v2 Protects semaphore signaling functionality required by GL_EXT_semaphore. v2: s/semaphore/fence Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Andres Rodriguez	585daa2378	gallium: add type parameter to create_fence_fd An fd can potentially have different types of objects backing it. Specifying the type helps us make sure we treat the FD correctly. This is in preparation to allow importing syncobj fence FDs in addition to native sync FDs. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 15:13:49 -05:00
Dave Airlie	16dd0eb517	ac/llvm: bump the number of results to 8. This function can get access for a 64-bit dvec4, which means we have to load 8 components. This fixes: R600_DEBUG=nir ./bin/shader_runner generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-abs-dvec4.shader_test -auto Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-31 05:37:16 +10:00
Dave Airlie	8d633f067b	r600/sb: insert the else clause when we might depart from a loop If there is a break inside the else clause and this means we are breaking from a loop, the loop finalise will want to insert the LOOP_BREAK/CONTINUE instruction, however if we don't emit the else there is nowhere for these to end up, so they will end up in the wrong place. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101442 Tested-By: Gert Wollny <gw.fossdev@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-31 04:47:29 +10:00
Brian Paul	1a9aa69ae8	mesa: remove invalid assertion in _mesa_enable_vertex_array_attrib() The meta module passes some 0-based attrib values. Should fix Piglit regressions reported by Mark Janes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104863 Fixes: `4ab7e03e1f` ("mesa: add an assertion in _mesa_enable_vertex_array_attrib()") Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-30 11:02:43 -07:00
Brian Paul	efa0993eaf	mesa: use gl_vert_attrib enum type in more places Slightly better readability. Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-30 11:02:43 -07:00
Brian Paul	f892e332a8	mesa: rename some 'client' array functions A long time ago gl_vertex_array was gl_client_array. Update some function names to be consistent. Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-30 09:07:59 -07:00
Brian Paul	d2d9d090e5	mesa: s/src/attribs/ in _mesa_update_client_array() Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-30 09:07:59 -07:00
Brian Paul	e863541e43	mesa: check/assert array index in _mesa_bind_vertex_buffer() Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-30 09:07:59 -07:00
Brian Paul	fcee2cc711	mesa: trivial comment typo fix in arrayobj.c Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-30 09:07:59 -07:00
Brian Paul	4ab7e03e1f	mesa: add an assertion in _mesa_enable_vertex_array_attrib() Some of the enable/disable vertex array functions take a zero-based generic index, while others take a VERT_ATTRIB_GENERIC0-based value. Add an assertion to clarify that in one place. Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-30 09:07:59 -07:00
Brian Paul	7f12791cc6	mesa: rename some vars in client_state() Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-30 09:07:59 -07:00
Mathias Fröhlich	06621e8a0d	mesa: Care for differences in fog mode only if fog is consumed. In creating fixed function vertex shader hash keys do only care for producing the varying output if fog is enabled and the varying is consumed in the fragment stage. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:59 -07:00
Mathias Fröhlich	6395a0ecf2	mesa: Reduce ffvertex_prog state_key to 36 bytes. Using lower alignment restrictions for the state key fields finally yields to a smaller hashing state key. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:59 -07:00
Mathias Fröhlich	b4216b588e	mesa: Remove unused ffvertex_prog texunit_really_enabled. Remove set but not read field from the state key used for hashing fixed function vertex shaders. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:59 -07:00
Mathias Fröhlich	1169791c18	mesa: Remove unused bit in ffvertex_prog state_key. Remove set but not read field from the state key used for hashing fixed function vertex shaders. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:59 -07:00
Mathias Fröhlich	6726d16098	mesa: texgen_enabled is only 1 bit. For the state key for hashing fixed function vertex shaders, the texgen_enabled field requires only a single bit. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:59 -07:00
Mathias Fröhlich	d6b0ad51ec	mesa: Encode fog modes in a 2 bit field. For the state key for hashing fixed function vertex shaders, encode the different fog modes, including if fog is generally enabled or not, into a 2 bit field. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:59 -07:00
Mathias Fröhlich	63e845d3cc	mesa: Move seperate_specular into the lighting section. For the state key for hashing fixed function vertex shaders, the information is only evaluated if lighting is generally switched on. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:58 -07:00
Mathias Fröhlich	11e665d434	mesa: Get the point size array state from varying_vp_inputs. For the state key for hashing fixed function vertex shaders, The varying_vp_inputs bitmask already contains the point size array enabled information. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:58 -07:00
Mathias Fröhlich	bc5c54cadf	mesa: Remove unused gl_fog_attrib::_Scale. The patch removes a variable that is only written to. Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-30 09:07:58 -07:00
Iago Toral Quiroga	99b57daf4a	anv/pipeline: lower constant initializers on output variables earlier If a shader only writes to an output via a constant initializer we need to lower it before we call nir_remove_dead_variables so that this pass sees the stores from the initializer and doesn't kill the output. Fixes test failures in new work-in-progress CTS tests: dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output_vert dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output_frag Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-30 08:10:29 +01:00
Tapani Pälli	6316c2ecbd	i965: move disk cache from brw_context to intel_screen Now every context refers to same disk_cache instance in screen. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Suggested-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-01-30 08:42:51 +02:00
Elie Tournier	6f8518e068	mesa: Correctly print glTexImage dimensions texture_format_error_check_gles() displays error like "glTexImage%dD". This patch just replace the %d by the correct dimension. Signed-off-by: Elie Tournier <elie.tournier@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-30 07:48:56 +02:00
Brian Paul	d5f42f96e1	mesa: shrink size of gl_array_attributes (v2) Inspired by Marek's earlier patch, but even smaller. Sort fields from largest to smallest. Use bitfields for more fields (sometimes with an extra bit for MSVC). Reduce Stride field to GLshort. Note that some fields cannot be bitfields because they're accessed via pointers (such as for glEnableClientState(GL_VERTEX_ARRAY) to set the Enabled field). Reduces size from 48 to 24 bytes. Also reduces size of gl_vertex_array_object from 3632 to 2864 bytes. And add some assertions in init_array(). v2: use s/GLuint/unsigned/, improve commit comments. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-29 21:16:50 -07:00
Brian Paul	79cafa0df3	mesa: shrink gl_vertex_array Inspired by Marek's earlier patch, but goes a little further. Sort fields from largest to smallest. Use bitfields. Reduced from 48 bytes to 32. Also reduces size of gl_vertex_array_object from 4144 to 3632 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-29 21:15:52 -07:00
Marek Olšák	f96a69f916	mesa: replace GLenum with GLenum16 in common structures (v4) v2: - fix glGet* - also use GLenum16 for DrawBuffers v3: - rebase to top of tree (BrianP) and incorporate Ian's suggestions v4: - fix a GLenum16 bug in VBO/save code, add some STATIC_ASSERT()s gl_context = 152432 -> 136840 bytes vbo_context = 22096 -> 20608 bytes Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-29 21:15:52 -07:00
Brian Paul	94843e6056	mesa: fix incorrect size/error test in _mesa_GetUnsignedBytevEXT() get_value_size() returns -1 for an error. The similar check in _mesa_GetUnsignedBytei_vEXT() is correct. Found by chance. There are apparently no Piglit tests which exercise glGetUnsignedBytei_vEXT() or glGetUnsignedBytevEXT(). Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-29 21:15:52 -07:00
Neha Bhende	e4ca1d6456	svga: Check rasterization state object before checking poly_stipple_enable Sometimes rasterization state object could be empty. This is causing segfault on hw8,9,10 for some traces. This patch fixes enemy_territory_quake_wars_high, enemy_territory_quake_wars_low, etqw-demo, lightsmark2008, quake1 glretrace crashes on hw 8,9,10. Tested with mtt-glretrace and mtt-piglit. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-29 21:04:49 -07:00
Neha Bhende	d4a5e14fae	svga: Adjust alpha for S3TC_DXT1_EXT RGB formats According to spec, S3TC_DXT1_EXT RGB formats are supposed to be opaque. Corresponding svga formats are not handling it so explicitly setting it to 1.0. This fixes piglit test spec@ext_texture_compression_s3tc@s3tc-targeted Note: This test is testcase for freedesktop bug 100925 Tested with mtt-piglit and mtt-glretrace on 8,9,10,11 and 15 Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-29 21:04:49 -07:00
Gert Wollny	6a7d1ca2c4	mesa/st/glsl_to_tgsi: Mark first write as unconditional when appropriate In the register lifetime estimation if the first write is unconditional or conditional but not within a loop then this is an unconditional dominant write in the sense of register life time estimation. Add a test case and record the write accordingly. Fixes: `807e2539e5` ("mesa/st/glsl_to_tgsi: Add tracking of ifelse writes in register merging") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104803 Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-29 21:04:49 -07:00
Roland Scheidegger	3c7aa242f5	mesa: skip validation of legality of size/type queries for format queries The size/type query is always legal (if we made it that far). Removing this causes a difference for GL_TEXTURE_BUFFER - the reason is that these parameters are valid only with GetTexLevelParameter() if gl 3.1 is supported, but not if only ARB_texture_buffer_object is supported. However, while the spec says that these queries return "the same information as querying GetTexLevelParameter" I believe we're not expected to return just zeros here. By definition, these pnames are always valid (unlike for the GetTexLevelParameter() function which would return an error without GL 3.1). The spec is a bit inconsistent there and open to interpretation - while mentioning the "same information as querying GetTexLevelParameter" is returned, it also mentions that 0 is returned for size/type if the target/format is not supported - implying correct results to be returned if it is supported, regardless that GetTexLevelParameter would return an error. (Also, the bit about this returning the same as GetTexLevelParameter also includes querying stencil type, which isn't even possible with GetTexLevelParameter.) This breaks some piglit arb_internalformat_query2 tests (which I believe to be wrong). Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-30 01:28:47 +01:00
Roland Scheidegger	21fe02d1d3	mesa: restrict formats being supported by target type for formatquery The code just considered all formats as being supported if they were either a valid fbo or texture format. This was quite awkward since then the query would return "supported" for e.g. GL_RGB9E5 or compressed formats and target RENDERBUFFER (albeit the driver could still refuse it in theory). However, when then querying for instance the internalformat sizes, it would just return 0 (due to the checks being more strict there). It was also a problem for texture buffer targets, which have a more restricted list of formats which are allowed (and again, it would return supported but then querying sizes would return 0). So only take validation of formats into account which make sense for a given target. Can also toss out some special checks for rgb9e5 later, since we'd never get there if it wasn't supported in the first place. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-30 01:28:47 +01:00
Roland Scheidegger	272e7e1bd5	mesa: (trivial) add TODO comment for default results for internal queries	2018-01-30 01:28:47 +01:00
Roland Scheidegger	09dc4f9012	mesa: remove misleading gles checks for formatquery Testing for gles there is just confusing - this is about target being supported, if it was valid at all was already determined earlier (in _legal_parameters). It didn't make sense at all in any case, since it would only have said false there for gles for 2d but not 2d arrays etc. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-30 01:28:47 +01:00
Rafael Antognolli	e7ecc5e160	i965: Emit PIPE_CONTROL with ISP bit on older platforms. Emit it on all platforms since gen7. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-29 14:52:07 -08:00
Rafael Antognolli	fa21ddf7b1	anv/cmd_buffer: Emit PIPE_CONTROL with ISP bit on older platforms. Emit it on all platforms since gen7. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-29 14:52:07 -08:00
Timothy Arceri	2b4afaef1c	st/glsl_to_nir: remove dead io after conversion to nir This fixes an assert in nir_lower_var_copies() for some bioshock shaders where an unused clipdistance array has no size. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 09:14:36 +11:00
Timothy Arceri	327c1a7fb3	radeonsi/nir: add support vs double inputs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	44067d6f0d	radeonsi: pass input_idx to declare_nir_input_vs() This makes it consistent with declare_nir_input_fs() and will allow us to support doubles. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	cf75ee3ab1	radeonsi: add bitcast_inputs() helper Will be used in a following patch to help support doubles. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	96cfd4bd7e	radeonsi/nir: fix num_inputs for doubles in vs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	09cd484d61	nir: partially revert `c2acf97fcc` `c2acf97fcc` changed the use of double_inputs_read to be inconsistent with its previous meaning. Here we re-enable the gather info code that was removed as the modified code from `c2acf97fcc` now uses the double_inputs member rather than double_inputs_read. This change allows us to use double_inputs_read with gallium drivers without impacting double_inputs which is used by i965. We also make use of the compiler option vs_inputs_dual_locations to allow for the difference in behaviour between drivers that handle vs inputs as taking up two locations for doubles, versus those that treat them as taking a single location. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	5b8de4bdff	nir: add vs_inputs_dual_locations compiler option Allows nir drivers to either use a single or dual locations for vs double inputs. i965 uses dual locations for both OpenGL and Vulkan drivers, for now gallium OpenGL drivers only use a single location. The following patch will also make use of this option when calling nir_shader_gather_info(). Reviewed-by: Karol Herbst <kherbst@redhat.com>	2018-01-30 09:08:47 +11:00
Timothy Arceri	f63e05ae9e	compiler: tidy up double_inputs_read uses First we move double_inputs_read into a vs struct in the union, double_inputs_read is only used for vs inputs so this will save space and also allows us to add a new double_inputs field. We add the new field because `c2acf97fcc` changed the behaviour of double_inputs_read, and while it's no longer used to track actual reads in i965 we do still want to track this for gallium drivers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-30 09:08:47 +11:00
Dave Airlie	f6cc15dccd	radv/gfx9: fix block compression texture views. (v2) This ports a fix from amdvlk, to fix the sizing for mip levels when block compressed images are viewed using uncompressed views. My original fix didn't power the clamping, but it looks like the clamping is required to stop the sizing going too large. Fixes: dEQP-VK.image.texel_view_compatible.graphic.extendedbc Doesn't crash DOW3 anymore. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-30 07:39:13 +10:00
Bas Nieuwenhuizen	0347a83bbf	radv: Signal fence correctly after sparse binding. It did not signal syncobjs in the fence, and also signalled too early if there was work on the queue already, as we have to wait till that work is done. Fixes: `d27aaae4d2` "radv: Add external fence support." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-29 17:22:58 +01:00
Brian Paul	0d044f7d61	mesa/vbo: replace vbo_draw_method() with _mesa_set_drawing_arrays() The arrays specified by ctx->Array._DrawArrays are used for all vertex drawing via vbo_context::draw_prims(). Different arrays are used for immediate mode, vertex arrays, display lists, etc. Changing from one to another requires updating derived/driver array state. Before, we indirectly specified the arrays with the gl_draw_method values. Now we just directly specify the arrays instead. This is simpler and will allow a subsequent display list optimization. In the future, it might make sense to get rid of ctx->Array._DrawArrays entirely and just pass the arrays as another parameter to vbo_context::draw_prims(). Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	d9894ede02	vbo: s/[0]/[VERT_ATTRIB_POS]/ in recalculate_input_bindings() Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	48a6ab472a	vbo: add new VBO_ATTRIBS_ masks to vbo_attrib.h These will be used in a later patch. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	41cd3ee5a2	vbo: s/VBO_ATTRIB_INDEX/VBO_ATTRIB_COLOR_INDEX/ To match the VERT_ATTRIB_COLOR_INDEX name. Give a name to the previously anonymous enum of VBO_ATTRIB_x values. Update the comment on the enum. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	425da3bbfc	vbo: minor clean-ups in vbo_exec.h Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	d631ea3a23	vbo: s/_API_NOOP_H/VBO_NOOP_H/ in vbo_noop.h Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	094a80db4c	vbo: whitespace/formatting fixes in vbo_exec.h Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	b080fc6199	vbo: move, rename vp_mode enums, get_program_mode() function Instead of NONE/ARB use FF/SHADER. Move the enum declaration to vbo_private.h where it's used. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Brian Paul	35e0ff5bd5	vbo: s/cl/array/ in vbo_context.c I think 'cl' used to mean client array. Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>	2018-01-29 08:35:14 -07:00
Tapani Pälli	d0343bef66	nir: mark unused space in packed_tex_data This change cleans following scary warnings in valgrind output when disk cache is being written: ==6532== Uninitialised byte(s) found during client check request ==6532== at 0x14423FAD: blob_write_bytes (blob.c:152) ==6532== by 0x144240FB: blob_write_uint32 (blob.c:194) ==6532== by 0x144001A5: write_tex (nir_serialize.c:613) and later (loads of): ==6532== Use of uninitialised value of size 8 ==6532== at 0x62FCD9E: crc32_z (in /usr/lib64/libz.so.1.2.11) ==6532== by 0x13F65014: util_hash_crc32 (crc32.c:127) ==6532== by 0x13F5DABA: cache_put (disk_cache.c:947) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-29 08:11:22 +02:00
Tapani Pälli	b99c88037b	i965: fix disk_cache leak when destroying context ==2780== 1,024 bytes in 1 blocks are possibly lost in loss record 180 of 205 ==2780== at 0x4C31A1E: calloc (vg_replace_malloc.c:711) ==2780== by 0x13F6467E: util_queue_init (u_queue.c:309) ==2780== by 0x13F5C9F6: disk_cache_create (disk_cache.c:369) ==2780== by 0x13F05406: brw_disk_cache_init (brw_disk_cache.c:428) ==2780== by 0x13F01E78: brwCreateContext (brw_context.c:1068) Fixes: `1a61a8b9a7` ("i965: Initialize disk shader cache if MESA_GLSL_CACHE_DISABLE is false") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-01-29 08:11:14 +02:00
Tapani Pälli	28db950b51	i965: fix prog_data leak in brw_disk_cache ==25481== 576 bytes in 1 blocks are definitely lost in loss record 179 of 208 ==25481== at 0x4C2FB6B: malloc (vg_replace_malloc.c:299) ==25481== by 0x1404E2CC: ralloc_size (ralloc.c:121) ==25481== by 0x14119F82: read_and_upload (brw_disk_cache.c:176) ==25481== by 0x1411A5C9: brw_disk_cache_upload_program (brw_disk_cache.c:271) ==25481== by 0x1412FCA4: brw_upload_wm_prog (brw_wm.c:597) Fixes: `516d50db31` ("i965: add initial implementation of on disk shader cache") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-29 08:11:03 +02:00
Timothy Arceri	9afc38c799	ac: fix indentation Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-29 11:14:23 +11:00
Timothy Arceri	03086f86ae	ac: remove unused nir2llvmtype() The last use of this was removed in the previous patch. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-29 11:14:23 +11:00
Timothy Arceri	fa29a9625e	ac: fix gs load inputs type This fixes the scenario where the input is a struct. With this the Unreal Engine's Elemental demo now works on radeonsi. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-29 11:14:23 +11:00
Kai Wasserbäch	0aba967328	ac/nir: call glsl_get_sampler_dim() only once where possible Changes since v1: * Rebased on top of `e68150de26` and `82adf53308`. Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-01-29 10:47:31 +11:00
Dave Airlie	2af66ba7e7	docs/features: add r600 ARB_query_buffer_object support Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-29 05:42:34 +10:00
Dave Airlie	1c9ea24a19	r600: add ARB_query_buffer_object support This uses a different shader than radeonsi, as we can't address non-256 aligned ssbos, which the radeonsi code does. This passes some extra offsets into the shader. It also contains a set of u64 instruction implementation that may or may not be complete (at least the u64div is definitely not something that works outside this use-case). If r600 grows 64-bit integers, it will use the GLSL lowering for divmod. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-29 05:42:28 +10:00
Dave Airlie	a7ec366e50	r600/shader: refactor mul hi/lo instruction emission This just makes it a bit simpler for cayman vs eg Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-29 05:42:17 +10:00
Dave Airlie	e0e23ea69c	r600/eg: construct proper rat mask for image/buffers. If the images/buffer bindings had a gap, this produced the wrong values, this should fix that to generate the correct rat mask for mixes of images/buffers/cbs. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-29 05:41:58 +10:00
Jon Turney	4a0bab1d7f	meson: libdrm shouldn't appear in Requires.private: if it wasn't found Otherwise, using pkg-config to retrieve flags will fail, e.g. $ pkg-config gl --cflags Package libdrm was not found in the pkg-config search path. Perhaps you should add the directory containing `libdrm.pc' to the PKG_CONFIG_PATH environment variable Package 'libdrm', required by 'gl', not found Fixes: `3218056e0e` ("meson: Build i965 and dri stack") Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>	2018-01-27 18:13:18 +00:00
Eric Anholt	e5a81ac704	broadcom/vc5: Don't forget to get the BO offset when opening a dmabuf. Fixes black display in DRI due to storing to 0x00000000.	2018-01-27 19:40:14 +11:00
Eric Anholt	314e9ee6c4	broadcom/vc5: Enable the driver on V3D 4.2. The changes in 4.2 haven't impacted any of our CL or state struct entries that I can see, so I haven't enabled custom compile for doing 4.2 instead of 4.1.	2018-01-27 19:39:56 +11:00
Eric Anholt	71c7e9bea1	broadcom/vc5: Enable CLIF dumping of V3D 4.2.	2018-01-27 19:04:21 +11:00
Eric Anholt	91f899cbc1	broadcom/vc5: Update the compiler for V3D 4.2.	2018-01-27 19:04:21 +11:00
Eric Anholt	f2e41daac5	broadcom/vc5: Update QPU instruction pack/unpack for v4.2. After the 4.1 spec, 4.2 retroactively renamed patchid to barrierid because it's used for other barriers in compute.	2018-01-27 19:03:55 +11:00
Eric Anholt	96d3e8f134	broadcom/vc5: Add XML for V3D 4.2.	2018-01-27 18:57:58 +11:00
Eric Anholt	b026063b16	broadcom/vc5: Fix a race between XML codegen build and CLIF build.	2018-01-27 18:57:58 +11:00
Eric Anholt	de60ea4432	Android: Attempt to fix broadcom build after vc5 changes.	2018-01-27 18:03:58 +11:00
Marek Olšák	b633999a4e	ac: rename and move si_const_array into common code Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-27 02:09:09 +01:00
Marek Olšák	e17eb8800f	ac: move address space definitions to common code Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-27 02:09:09 +01:00
Marek Olšák	0d62370bbb	ac: don't use byval LLVM qualifier in shaders shader-db doesn't show any regression and 32-bit pointers with byval are declared as VGPRs for some reason. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-27 02:09:09 +01:00
Marek Olšák	0e40c6a7b7	gallium/radeon: set number of pb_cache buckets = number of heaps Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-27 02:09:09 +01:00
Marek Olšák	175549e0e9	pb_cache: let drivers choose the number of buckets Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-27 02:09:09 +01:00
Marek Olšák	ecfd521502	pb_cache: call os_time_get outside of the loop Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-27 02:09:09 +01:00
Marek Olšák	e553cb5a68	gallium/radeon: simplify radeon_flags_from_heap Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-27 02:09:09 +01:00
Timothy Arceri	041b18cf23	st/shader_cache: restore num_tgsi_tokens when loading from cache Without this we will fail to correctly serialise programs when using glGetProgramBinary() if the program was retrieved from the disk cache rather than freshly compiled. Fixes: `c69b0dd681` "st/glsl_to_tgsi: store num_tgsi_tokens in st_*_program" Reviewed-by: Gert Wollny <gw.fossdev@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104762	2018-01-27 10:06:16 +11:00
Marek Olšák	17423c993d	winsys/amdgpu: fix assertion failure with UVD and VCE rings Cc: 18.0 <mesa-stable@lists.freedesktop.org>	2018-01-26 23:12:11 +01:00
Brian Paul	ac0e9e343c	mesa: remove MESA_FUNCTION Just use __func__ in the two macros where it was used. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-01-26 13:52:48 -07:00
Brian Paul	bacf72a18d	mesa: change gl_link_status enums to uppercase To follow the convention of other enums. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-01-26 13:52:48 -07:00
Brian Paul	aff5d9c256	mesa: change gl_compile_status enums to uppercase To follow the convention of other enums. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-01-26 13:52:48 -07:00
Brian Paul	d9832f1fc4	mesa: minor comment reformatting, whitespace fixes in mtypes.h Trivial.	2018-01-26 13:52:42 -07:00
Rafael Antognolli	131e871385	i965/gen10: Use CS Stall instead of WriteImmediate. Fixes: `ca19ee33d7` Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 12:02:34 -08:00
Rafael Antognolli	20578f81a6	anv/gen10: Emit CS stall and mark push constants dirty. I got reviews and fixed the patches locally, but ended up merging the ones that I sent originally to the list. This patch fixes those mistakes. Fixes: `78c125af39` Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 11:59:17 -08:00
Rafael Antognolli	bcfd78e448	i965/gen10: Re-enable push constants. The GPU hang caused by push constants is apparently fixed, so let's enable them again. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 10:07:44 -08:00
Rafael Antognolli	78c125af39	anv/gen10: Ignore push constant packets during context restore. Similar to the GL driver, ignore 3DSTATE_CONSTANT_* packets when doing a context restore. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: "18.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 10:07:40 -08:00
Rafael Antognolli	ca19ee33d7	i965/gen10: Ignore push constant packets during context restore. These packets were causing GPU hangs when the context was restored, possibly because they were pointing to BO's that were already unreferenced. So we tell the hardware to ignore such packets after the batch buffer ends, since we know those BO's are not around anymore. This change fixes GPU hangs on CNL. The (partial) solution to this problem so far was to entirely disable push constants on this platform. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: "18.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 10:07:35 -08:00
Brian Paul	acaec6cdd9	mesa: silence MinGW 'may be unused uninitialized' warning in get.c The warning happens on line 2114 for the memcpy(data, p, size) call. I'm not sure why that generates the warning but not the earlier use of p in the code. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-01-26 10:44:05 -07:00
Eleni Maria Stea	8096b558a7	mesa: Fix function pointers initialization in state tracker We assigned the function that gets the device uuid to the GetDriverUuid function pointer and the function that gets the driver uuid to the GetDeviceUuid function pointer inside the state tracker. Exchanged the pointers. cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-26 08:17:55 -07:00
Iago Toral Quiroga	d3ce493b34	anv/pipeline: remove the pipeline layout field from anv_pipeline It no longer has any users. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 14:06:47 +01:00
Iago Toral Quiroga	75a4802060	anv/cmd_buffer: add the pipeline layout to the pipeline state We need to access the pipeline layout to compute correct dynamic offsets for dynamic UBO/SSBO descriptors when we emit draw commands. Instead of taking it from the pipeline object, store the layout in the command buffer pipeline state. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 14:06:47 +01:00
Iago Toral Quiroga	e1a49f974b	anv/pipeline: don't take the layout from the pipeline to compile shaders The Vulkan spec states that VkPipelineLayout objects must not be destroyed while any command buffer that uses them is in the recording state, but it permits them to be destroyed otherwise. This means that applications are allowed to free pipeline layouts after command recording is finished even if there are pipeline objects that still exist and were created with these layouts. There are two solutions to this, one is to use reference counting on pipeline layout objects. The other is to avoid holding references to pipeline layouts where they are not really needed. This patch takes a step towards the second option by making the pipeline shader compile code take pipeline layout from the VkGraphicsPipelineCreateInfo provided rather than the pipeline object. A follow-up patch will remove any remaining uses of the layout field so we can remove it from the pipeline object and avoid the need for reference counting. v2: Use ANV_FROM_HANDLE, remove unnecessary braces (Jason) Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 14:06:46 +01:00
Iago Toral Quiroga	14f6275c92	anv/descriptor_set: add reference counting for descriptor set layouts The spec states that descriptor set layouts can be destroyed almost at any time: "VkDescriptorSetLayout objects may be accessed by commands that operate on descriptor sets allocated using that layout, and those descriptor sets must not be updated with vkUpdateDescriptorSets after the descriptor set layout has been destroyed. Otherwise, descriptor set layouts can be destroyed any time they are not in use by an API command." v2: allocate off the device allocator with DEVICE scope (Jason) Fixes the following work-in-progress CTS tests: dEQP-VK.api.descriptor_set.descriptor_set_layout_lifetime.graphics dEQP-VK.api.descriptor_set.descriptor_set_layout_lifetime.compute Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 14:06:46 +01:00
Samuel Pitoiset	e28233a527	ac/nir: set amdgpu.uniform and invariant.load for SSBOs For descriptors. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-26 12:14:28 +01:00
Samuel Pitoiset	49b0a140a7	ac/nir: set amdgpu.uniform and invariant.load for UBOs UBOs are constants buffers. Cc: "18.0" <mesa-stable@lists.freedesktop.org> Fixes: `41c36c45` ("amd/common: use ac_build_buffer_load() for emitting UBO loads") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-26 12:14:28 +01:00
Samuel Pitoiset	b453f38a47	ac/nir: set the noalias attribute on input pointers This attribute is similar to the definition of restrict in C99 and it might help LLVM. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-26 12:14:28 +01:00
Samuel Pitoiset	310d17fcf1	ac: only load used channels when sampling buffer views This reduces the number of dwords that are loaded with buffer_load_format_xyzw. For example, when the only used channel is 1, the driver will emit buffer_load_format_x instead. Shader stats for DOW3 (with some local hacky scripts for SPIRV): 143 shaders in 143 tests Totals: SGPRS: 5344 -> 5352 (0.15 %) VGPRS: 3476 -> 3452 (-0.69 %) Spilled SGPRs: 30 -> 29 (-3.33 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 269860 -> 269808 (-0.02 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 1267 -> 1272 (0.39 %) Wait states: 0 -> 0 (0.00 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-26 12:14:27 +01:00
Samuel Pitoiset	51e14bc3c0	ac: pass the number of channels to ac_build_buffer_load_format() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-26 12:14:27 +01:00
Samuel Pitoiset	d7c93b558a	ac: add ac_build_buffer_load_common() helper For both versions of llvm.amdgcn.buffer.load.{format}.*. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-26 12:14:27 +01:00
Samuel Pitoiset	6d07e443ba	radv: fix RADV_DEBUG=syncshaders on GFX9 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-26 12:14:27 +01:00
Samuel Pitoiset	5391de1262	radv: fix a GPU hang with RADV_DEBUG=syncshaders The GPU hangs when the driver forces a PS_PARTIAL_FLUSH after a dispatch call (and vice versa for graphics). Something has changed in the kernel driver because it used to work. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-26 12:14:27 +01:00
Samuel Pitoiset	b358e0e67f	ac/shader: scan if fragment shaders write memory It's better to do that in ac_shader_info. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-26 12:14:27 +01:00
Samuel Pitoiset	b9e2f78d6e	ac/nir: only canonicalize 32-bit float min/max outputs on pre-GFX9 According to LLVM, only pre-GFX9 targets do not flush denorms for fmin/fmax. All dEQP-VK.glsl.builtin.precision.* still pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-26 12:14:27 +01:00
Jason Ekstrand	c8949e2498	anv/pipeline: Don't look at blend state unless we have an attachment Without this, we may end up dereferencing blend before we check for binding->index != UINT32_MAX. However, Vulkan allows the blend state to be NULL so long as you don't have any color attachments. This fixes a segfault when running The Talos Principle. Fixes: `12f4e00b69` Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-26 01:44:45 -08:00
Maxin B. John	8116b9170b	anv_icd.py: improve reproducible builds Sort the output to ensure build reproducibility Signed-off-by: Maxin B. John <maxin.john@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Fixes: `0ab04ba979` ("anv: Use python to generate ICD json files") Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-26 01:37:45 -08:00
Ian Romanick	c7deeb71a8	nouveau: Remove no-op nvgl_logicop_func function The values that this function returned were always the values passed in. The only thing that happened was either an assertion or undefined results when an unknown value was passed in. This doesn't seem that useful. Most of nouveau_gldefs.h could be removed in this manner. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-01-26 11:21:46 +08:00
Ian Romanick	f5b9c2a6e3	i915: Silence unused parameter warnings ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_alloc_window_storage’: ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:290:48: warning: unused parameter ‘ctx’ [-Wunused-parameter] intel_alloc_window_storage(struct gl_context * ctx, struct gl_renderbuffer rb, ^~~ ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_nop_alloc_storage’: ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:303:74: warning: unused parameter ‘rb’ [-Wunused-parameter] intel_nop_alloc_storage(struct gl_context ctx, struct gl_renderbuffer rb, ^~ ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:304:32: warning: unused parameter ‘internalFormat’ [-Wunused-parameter] GLenum internalFormat, GLuint width, GLuint height) ^~~~~~~~~~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:304:55: warning: unused parameter ‘width’ [-Wunused-parameter] GLenum internalFormat, GLuint width, GLuint height) ^~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:304:69: warning: unused parameter ‘height’ [-Wunused-parameter] GLenum internalFormat, GLuint width, GLuint height) ^~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_bind_framebuffer’: ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:396:47: warning: unused parameter ‘fb’ [-Wunused-parameter] struct gl_framebuffer fb, struct gl_framebuffer fbread) ^~ ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:396:74: warning: unused parameter ‘fbread’ [-Wunused-parameter] struct gl_framebuffer fb, struct gl_framebuffer fbread) ^~~~~~ ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_renderbuffer_update_wrapper’: ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:422:57: warning: unused parameter ‘intel’ [-Wunused-parameter] intel_renderbuffer_update_wrapper(struct intel_context intel, ^~~~~ 
../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c: In function ‘intel_blit_framebuffer_with_blitter’: ../../SOURCE/master/src/mesa/drivers/dri/i915/intel_fbo.c:644:61: warning: unused parameter ‘filter’ [-Wunused-parameter] GLbitfield mask, GLenum filter) ^~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-26 11:21:46 +08:00
Ian Romanick	39f875a6b7	i915: Make intelEmitCopyBlit static And rename to emit_copy_blit. v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr color_logic_ops src/) suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]	2018-01-26 11:21:46 +08:00
Ian Romanick	9eed6bea6b	i965: Make intelEmitCopyBlit static And rename to emit_copy_blit. v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr color_logic_ops src/) suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]	2018-01-26 11:21:46 +08:00
Ian Romanick	4e9e964de6	i915: Use enum color_logic_ops for blits v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr color_logic_ops src/) suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]	2018-01-26 11:21:46 +08:00
Ian Romanick	21be331401	i965: Use enum color_logic_ops for blits v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr color_logic_ops src/) suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]	2018-01-26 11:21:46 +08:00
Ian Romanick	0aaa27f291	mesa: Pass the translated color logic op dd_function_table::LogicOpcode And delete the resulting dead code. This has only been compile-tested. v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr color_logic_ops src/) suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-26 11:21:46 +08:00
Ian Romanick	cf0b26ec12	st/mesa: Use the translated color logic op from the context And delete the resulting dead code. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-26 11:21:46 +08:00
Ian Romanick	0c69db895f	i965: Use the translated color logic op from the context And delete the resulting dead code. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-26 11:21:46 +08:00
Ian Romanick	9c1f010f34	mesa: Also track a remapped version of the color logic op With the exception of NVIDIA hardware, these are the values that all hardware and Gallium want. The remapping is currently implemented in at least 6 places. This starts the process of consolidating to a single place. v2: sed --in-place -e 's/color_logic_ops/gl_logicop_mode/g' $(grep -lr color_logic_ops src/) suggested by Brian. Added some comments about the selection of bit patterns for gl_logicop_mode and the GLenums. Suggested by Nicolai. Folded the GLenum_to_color_logicop macro into its only users. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> [v1] Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-26 11:21:46 +08:00
Bas Nieuwenhuizen	5a3404d443	radeonsi: Export signalled sync file instead of -1. -1 is considered an error for EGL_ANDROID_native_fence_sync, so we need to actually create a sync file. Fixes: `f536f45250` "radeonsi: implement sync_file import/export" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-26 01:26:53 +01:00
Jason Ekstrand	db682b8f0e	i965/fs: Reset the register file to VGRF in lower_integer_multiplication `18fde36ced` changed the way temporary registers were allocated in lower_integer_multiplication so that we allocate regs_written(inst) space and keep the stride of the original destination register. This was to ensure that any MUL which originally followed the CHV/BXT integer multiply regioning restrictions would continue to follow those restrictions even after lowering. This works fine except that I forgot to reset the register file to VGRF so, even though they were assigned a number from alloc.allocate(), they had the wrong register file. This caused some GLES 3.0 CTS tests to start failing on Sandy Bridge due to attempted reads from the MRF: ES3-CTS.functional.shaders.precision.int.highp_mul_fragment.snbm64 ES3-CTS.functional.shaders.precision.int.mediump_mul_fragment.snbm64 ES3-CTS.functional.shaders.precision.int.lowp_mul_fragment.snbm64 ES3-CTS.functional.shaders.precision.uint.highp_mul_fragment.snbm64 ES3-CTS.functional.shaders.precision.uint.mediump_mul_fragment.snbm64 ES3-CTS.functional.shaders.precision.uint.lowp_mul_fragment.snbm64 This commit remedies this problem by, instead of copying inst->dst and overwriting nr, just make a new register and set the region to match inst->dst. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103626 Fixes: `18fde36ced` Cc: "17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-01-25 13:58:55 -08:00
Jason Ekstrand	af9d4ce480	vulkan: Update the XML and headers to 1.0.68 Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Chad Versace <chadversary@chromium.org>	2018-01-25 13:30:05 -08:00
Dave Airlie	f4c534ef68	radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1) This seems to be broken, at least the cts tests fail. This fixes: dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_4 dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_8 2 samples seems to pass fine, amdvlk doesn't appear to enable TC for possibly some other reasons here. This is most likely a hack. v1.1: add a bit of explanation text. (Samuel) Fixes: `ad3d98da9` (radv: enable tc compatible htile for d32s8 also.) Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-26 06:55:09 +10:00
Chuck Atkins	6ac5e851f1	configure.ac: add missing llvm dependencies to .pc files v2: Only add as dependencies for gallium-osmesa and gallium-xlib CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-25 14:54:08 -05:00
George Kyriazis	5d8f270d10	swr/rast: Optimize DumpToFile output size Modify DumpToFile to only dump the function, not the entire module. Reduces file sizes and speeds up the dumping. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-25 13:26:49 -06:00
George Kyriazis	dfe4dd48ec	swr/rast: Updated copyright dates on knob-related files. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-25 13:26:49 -06:00
George Kyriazis	36dbbf11a0	swr/rast: Move memory-related JIT functions Move them to their own file (builder_mem.{h\|cpp}). Add builder_mem.cpp to the build system. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-25 13:26:49 -06:00
George Kyriazis	94922dbe4b	swr/rast: Add extra (optional) parameter in GATHERPS Now also takes in an additional parameter (draw context) for future expansion. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-25 13:26:49 -06:00
George Kyriazis	0b46c7b3b0	swr/rast: Better ExecCmd (i.e. system()) implementation Hides console window creation during JIT linker execution in apps that don't have a console. Remove hooking of CreateProcessInternalA - the MSFT implementation just turns around and calls CreateProcessInternalW, which we do hook. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-25 13:26:49 -06:00
George Kyriazis	2d16b61bff	swr/rast: Support USE_SIMD16_FRONTEND=0 for EarlyRast Early Rasterization did not initially work with USE_SIMD16_FRONTEND=0. Fix it so it works there, too. Please note that the default setting is USE_SIMD16_FRONTEND=1. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-25 13:26:49 -06:00
Brian Paul	123798eb44	mesa: whitespace fixes in attrib.c Trivial.	2018-01-25 12:17:26 -07:00
Brian Paul	0e7aaaf5a5	mesa: whitespace fixes in varray.h Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-25 12:17:26 -07:00
Brian Paul	ba01589c0c	mesa: include mtypes.h in varray.h We actually use some of the types from mtypes.h so include it directly instead of relying on indirectly including it via bufferobj.h Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-25 12:17:26 -07:00
Brian Paul	e4504be6fc	mesa: s/gl_vertex_attrib_array/gl_array_attributes/ in comments The structure type was renamed some time ago, but some comments were not updated. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-25 12:17:26 -07:00
Brian Paul	6c724fb7c1	mesa: simplify _mesa_delete_list() a bit, add some assertions All but two cases of the switch did the same n += InstSize[n[0].opcode] instruction. Just move it after the switch. Add some sanity check assertions. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-25 12:17:26 -07:00
Brian Paul	c860171c63	st/mesa: expand glDrawPixels cache to handle multiple images The newest version of WSI Fusion makes several glDrawPixels calls per frame. By caching more than one image, we get better performance when panning/zooming the map. v2: move pixel unpack param checking out of cache search loop, per Roland v3: also move unpack->BufferObj check out of loop, per Roland.	2018-01-25 12:17:26 -07:00
Brian Paul	5092610f29	st/mesa: add some debug code in st_choose_format() To aid in debugging gallium surface format selection issues. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-25 12:17:26 -07:00
Brian Paul	94610758a3	svga: s/Bool/SVGA3dBool/ in SVGA3dDevCapResult And fix whitespace. To sync up with in-house code. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-25 11:56:33 -07:00
Emil Velikov	6aeef54644	configure.ac: correct driglx-direct help text The default was toggled a while back, but the text wasn't updated. Fixes: `bd526ec9e1` ("configure: Always default to --enable-driglx-direct") Cc: Jon TURNEY <jon.turney@dronecode.org.uk> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-01-25 17:44:35 +00:00
Emil Velikov	7b744a494d	swrast: remove non-applicable GLX_SWAP_COPY_OML comment Noticed while skimming for GLX_ instances in the dri codebase. Comment is completely off and was in such a state since day 1. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-25 17:42:57 +00:00
Emil Velikov	3e3956d6ae	mapi: remove duplicate GL typedefs Remove the instances already available in gl.h or glext.h. Sadly GLclampx is only available in GLES(1) so we need to keep that one. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-25 17:42:50 +00:00
Emil Velikov	647f40298a	mapi: remove non applicable HAVE_DIX_CONFIG_H hunk Seeming artefact from when the xserver build was diving directly into mesa's tree. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-25 17:42:48 +00:00
Emil Velikov	48e7bc6833	mapi: autotools: remove unused MAPI_FILES file list The sole user was OpenVG, which was removed a couple of years ago. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-25 17:42:46 +00:00
Emil Velikov	785d9a4ed8	automake: st/mesa/tests: add st_tests_common.h to the tarball Fixes: `6569b33b6e` ("mesa/st/tests: unify MockCodeLine* classes") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-25 17:06:29 +00:00
Emil Velikov	0beaf7ad3e	automake: mesa: include vbo_private.h in the tarball Fixes: `a7cfec3be0` ("vbo: move VBO-private types, prototypes, etc. into new vbo_private.h header") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-25 17:06:29 +00:00
Emil Velikov	ac4437b20b	automake: small cleanup after the meson.build inclusion Namely extend the EXTRA_DIST list, instead of re-assigning it and bring back a file dropped by mistake. Fixes: `436ed65d38` ("autotools: include meson build files in tarball") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-01-25 17:06:29 +00:00
Emil Velikov	50265cd9ee	automake: anv: ship anv_extensions_gen.py in the tarball Fixes: `dd088d4bec` ("anv/extensions: Generate a header file with extension tables") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-25 17:06:29 +00:00
Emil Velikov	265d36c890	automake: vc5: remove non-applicable v3dx_simulator.h Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-25 17:06:28 +00:00
Roland Scheidegger	4fe662c58f	gallivm: fix crash with seamless cube filtering with different min/mag filter We are not allowed to modify the incoming coords values, or things may crash (as we may be inside a llvm conditional and the values may be used in another branch). I recently broke this when fixing an issue with NaNs and seamless cube map filtering, and it causes crashes when doing cubemap filtering if the min and mag filters are different. Add const to the pointers passed in to prevent this mishap in the future. Fixes: `a485ad0bcd` ("gallivm: fix an issue with NaNs with seamless cube filtering") Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-01-25 18:03:38 +01:00
Eric Engestrom	57223fb07a	egl: keep extension list sorted, per comment at the top Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-01-25 16:38:11 +00:00
George Kyriazis	0e879aad2f	swr/rast: support llvm 3.9 type declarations LLVM 3.9 was not taken into account in initial check-in. Fixes: `01ab218bbc` ("swr/rast: Initial work for debugging support.") cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104749 Acked-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-25 08:22:52 -06:00
Samuel Pitoiset	e1331c9d61	ac/nir: add break statements in needs_view_index_sgpr() Previous code is correct but as the first case statement uses a break, keep it consistent. CID: `1428579` Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-25 13:59:52 +01:00
Eric Engestrom	0663ae0aa1	loader: let compiler figure out the length of the string Basically, turn comment into code Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-25 11:40:25 +00:00
Eric Engestrom	57b0ccd178	meson: simplify dri3 logic Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-25 10:10:04 +00:00
Juan A. Suarez Romero	513c2263cb	mesa: add missing RGB9_E5 format in _mesa_base_fbo_format This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-25 09:54:31 +01:00
Jason Ekstrand	df13588d21	i965: Stop disabling aux during texture preparation Previously, we were handling self-dependencies by marking the render buffer and then passing disable_aux=true to prepare_texture so that it would do a resolve. This works but ends up doing too much resolving in some cases. Specifically, if we're doing something such as mipmap generation, this would cause us to resolve all levels of the texture if even one of them is overlapping. Instead, this commit makes us wait until we process the framebuffer to do these resolves and we only resolve the slices needed for rendering. Doing this resolve puts them into the pass-through state so, even if we do texture using CCS_E, the CCS data will effectively be ignored and the real surface contents read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-24 19:05:36 -08:00
Jason Ekstrand	20f70ae385	i965/draw: Set NEW_AUX_STATE when draw aux changes Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104411 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104383 Fixes: `ea0d2e98ec` Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-24 19:05:36 -08:00
Jason Ekstrand	e52a9f18d6	i965: Replace draw_aux_buffer_disabled with draw_aux_usage Instead of keeping an array of booleans, we now hang onto an array of isl_aux_usage enums. This means that the thing we are passing from brw_draw.c to surface state setup is the thing that surface state setup actually needs instead of an input to compute what it needs. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-24 19:05:36 -08:00
Jason Ekstrand	468ea3cc45	i965/surface_state: Drop brw_aux_surface_disabled The only purpose of this function is to disable aux on texture surfaces when the corresponding renderbuffer has aux disabled. However, the act of disabling aux on the renderbuffer will cause it to be resolved and intel_miptree_texture_aux_usage will already check the resolved status of a texture and return ISL_AUX_USAGE_NONE for it. Even if we used CCS for it, that wouldn't really be a problem because the CCS will be in the pass-through state and so it would effectively be ignored. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-24 19:05:36 -08:00
Jason Ekstrand	d38ec24f53	i965/miptree: Add an aux_disabled parameter to render_aux_usage Only one of the callers of intel_miptree_render_aux_usage actually took brw->draw_aux_buffer_disabled into account. This was causing us to ignore draw_aux_buffer_disabled for the intel_miptree_prepare_render. This isn't a problem because the draw_aux_buffer_disabled entry was set during texture preparation and we already did the resolve at that time. However, this also meant that the aux_usage we were passing to brw_cache_flush_for_render and brw_render_cache_add_bo was wrong so our automatic cache flushing around aux_usage changes wasn't happening. This was causing GPU hangs in Oxenfree. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104711 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104411 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104383 Fixes: `ea0d2e98ec` Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-24 19:05:36 -08:00
Jason Ekstrand	dfe0217905	i965/miptree: Take an aux_usage in prepare/finish_render Both callers of intel_miptree_prepare/finish_render have to call intel_miptree_render_aux_usage anyway for other reasons. They may as well pass the result in instead of us calling it again. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-24 19:05:36 -08:00
Jason Ekstrand	7d4007d58a	aubinator: Multiply count by 4 to compute buffer sizes The count field is in terms of dwords and not bytes.	2018-01-24 19:05:36 -08:00
Timothy Arceri	e776791432	st/glsl_to_nir: remove reallocation of sampler/image location As far as I can tell this always just reassigns the same value. Also, as we don't currently store UniformHash in the shader cache, removing this will help with adding a shader cache to gallium nir drivers. Reviewed-by: Rob Clark <robdclark@gmail.com>	2018-01-25 13:27:22 +11:00
Jordan Justen	62b68d05e7	docs: add 18.1.0-devel release notes template Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2018-01-24 17:10:58 -08:00
Jordan Justen	65c18b02fc	mesa: bump version to 18.1.0-devel Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2018-01-24 17:10:58 -08:00
Greg V	8fae5eddd9	meson: handle LLVM 'x.x.xgit-revision' versions When LLVM is built inside of a git repo (even way below, e.g. /usr/ports/.git exists, and LLVM is built in /usr/ports/devel/llvm50/work), its version becomes something like 5.0.0git-f8ab206b2176. New meson versions already handle this, but we support older versions too. Fixes: `673dda8330` ("meson: build "radv" vulkan driver for radeon hardware") Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-24 15:25:54 -08:00
Greg V	53f9131205	meson: fix getting cflags from pkg-config get_pkgconfig_variable('cflags') always returns an empty list, it's a function for getting custom variables. Meson does not yet support asking for cflags, so explicitly invoke pkg-config for now. Fixes: `68076b8747` ("meson: build gallium vdpau state tracker") Fixes: a817af8a89eb ("meson: build gallium xvmc state tracker") Fixes: `1d36dc674d` ("meson: build gallium omx state tracker") Fixes: `5a785d51a6` ("meson: build gallium va state tracker") Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-24 15:25:54 -08:00
Greg V	c38c60a63c	meson: fix BSD build CC: 18.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-01-24 15:25:54 -08:00
Greg V	7c8cfe2d59	meson: fix missing dependencies Fixes: `66f97f6640` ("meson: build radeonsi") Reviewed-by: Emil Velikov <emil.velikov@colalbora.com> Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-24 15:25:54 -08:00
Grazvydas Ignotas	0cc7370733	anv: correct a duplicate check in an assert Looks like checking both sources was intended, instead of the first one twice. Found with Coccinelle, coccinellery/xand/xand.cocci semantic patch. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-25 01:10:45 +02:00
Marc Dietrich	a2a1b0e75e	meson: fix HAVE_LLVM version define in meson build LLVM patch level is not included in HAVE_LLVM. Fixes: `e6418ab156` ("meson: build "radv" vulkan driver for radeon hardware") Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan.c.baker@intel.com> Signed-off-by: Marc Dietrich <marvin24@gmx.de>	2018-01-24 14:04:20 -08:00
Dylan Baker	5781c3d1db	meson: correctly set SYSCONFDIR for loading drirc Fixes: `d1992255bb` ("meson: Add build Intel "anv" vulkan driver") Reported-by: Marc Dietrich <marvin24@gmx.de> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-01-24 13:10:32 -08:00
Dave Airlie	d2414e64e4	radv: add multisample Z optimisation from amdvlk This was just found while reading for other stuff, src/core/hw/gfxip/gfx6/gfx6DepthStencilView.cpp. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-25 06:48:11 +10:00
Dave Airlie	298554541d	radv: move spi_baryc_cntl to pipeline We need to enable the pos float location 2 mode anytime we have persample not just when forced by the frag shader. This fixes: dEQP-VK.pipeline.multisample.min_sample_shading* Fixes: `58c97a079` (radv: enable location at sample when persample is forced.) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-25 06:47:28 +10:00
Marek Olšák	125c0529f3	gallium/u_tests: add texture_barrier and FBFETCH tests Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-24 21:08:45 +01:00
Marek Olšák	022c5b22fe	radeonsi: don't ignore pitch for imported textures Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-24 21:08:45 +01:00
Scott D Phillips	0b8d38bd48	meson: Fix define for USE_SSE41 Before we were adding -DHAVE_SSE41 which isn't what the code is looking for, so some uses of the sse4.1 code were always being skipped. v2: Don't add any compile check for the quite old -msse4.1 option (Dylan) Fixes: `84486f6462` ("meson: Enable SSE4.1 optimizations") Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-24 11:32:34 -08:00
Gert Wollny	8172b9ff48	mesa/st/glsl_to_tgsi: remove now unneeded assert. With the implementation of the tracking of the registers used in reladdr asserting that a driver calling merge_register() uses the address register is no longer needed. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:34:05 -07:00
Gert Wollny	f2040fbe48	mesa/st/tests: Add tests for lifetime tracking with indirect addressing Add a code line type that accepts one layer of indirect addressing and add tests to check that temporary register access used for indirect addressing is accounted for in the lifetime estimation. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:34:00 -07:00
Gert Wollny	51c0cee267	mesa/st/glsl_to_tgsi: Add tracking of indirect addressing registers So far indirect addressing was not tracked to estimate the temporary life time, and it was not needed, because code to load the address registers was always emitted eliminating the reladdr* handles in the past glsl-to.tgsi stages. Now, with Marek's patch allowing any 1D register to be used for addressing on some hardware this changed, and the tracking becomes necessary. Because the registers have no direct indication on whether the reladdr* was already loaded into an address register, the temporaries in reladdr* are always tracked as reads. This may result in a slight over-estimation of the lifetime in the cases when the load to the address register was emitted. v2: no changes v3: Use debug_log variable instead of directly writing to std::cerr in debugging output. v6: fix indentation and typos Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1) Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Gert Wollny	517e34c62f	mesa/st/tests: Add tests for improved tracking of temporaries Additional tests are added that check the tracking of access to temporaries in if-else branches. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Gert Wollny	807e2539e5	mesa/st/glsl_to_tgsi: Add tracking of ifelse writes in register merging Improve the life-time evaluation of temporary registers by also tracking writes in both if and else branches and in up to 32 nested scopes. As a result the estimated required register life-times can be further reduced enabling more registers to be merged. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Gert Wollny	8dda01ef5a	mesa/st/tests: cleanup whitespace usage and correct some comments Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Gert Wollny	6569b33b6e	mesa/st/tests: unify MockCodeLine* classes * Merge the classes MockCodeLine and MockCodelineWithSwizzle into one, and refactor tests accordingly. * Change memory allocations to use ralloc* interface. v2: * move the test classes into a convenience library * rename the Mock* classes to Fake* since they are not really Mocks * Base assertion of correct number of src and dst registers in tests on what the operation actually expects * Fix number of destinations in one test v6: * fix local includes using "..." instead of <...> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Gert Wollny	ad1990629e	mesa/st/tests: Fix zero-byte allocation leaks Don't allocate a zero-sized array, when no texture offsets are given. v5: correct spaces and empty lines Reviewed-by: Brian Paul <brianp@vmware.com>(v4) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1) Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Gert Wollny	ee48e3acb8	mesa/st/glsl_to_tgsi: Add some operators for glsl_to_tgsi related classes Add the equal operator and the "<<" stream write operator for the st_*_reg classes and the "<<" operator to the instruction class, and make use of these operators in the debugging output. v5: Fix empty lines Reviewed-by: Brian Paul <brianp@vmware.com> (v4) Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Gert Wollny	6a3421078a	mesa/program: Add missing file types to printout Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2018-01-24 10:23:00 -07:00
Brian Paul	365a48abdd	vbo: fix incorrect min/max_index values in display list draw call This fixes another regression from commit `8e4efdc895` ("vbo: optimize some display list drawing"). The problem was the min_index, max_index values passed to the vbo drawing function were not computed to compensate for the biased prim::start values. https://bugs.freedesktop.org/show_bug.cgi?id=104746 https://bugs.freedesktop.org/show_bug.cgi?id=104742 https://bugs.freedesktop.org/show_bug.cgi?id=104690 Tested-by: Clayton Craft <clayton.a.craft@intel.com> Fixes: `8e4efdc895` ("vbo: optimize some display list drawing") Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>	2018-01-24 10:12:49 -07:00
Brian Paul	2123bd2805	vbo: whitespace/formatting fixes in vbo_split_inplace.c Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	6b0109cf39	vbo: whitespace/formatting fixes in vbo.h Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	b9280031a8	vbo/i965: move vbo_all_varyings_in_vbos() to brw_draw.c It's only used in brw_draw_prims(). s/GLboolean/bool/, etc. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	a83f7e119c	vbo: remove unused vbo_any_varyings_in_vbos() function Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	718f4251c5	vbo: remove unneeded #includes Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	f4376a0c2b	vbo: remove vbo_context.h and change includes to use vbo.h instead Now vbo.h is the public interface to the VBO module. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	aafb56a148	vbo: move remaining items from vbo_context.h to vbo.h Non-VBO sources files sometimes included vbo.h while others included vbo_context.h. We're moving all public types, functions to the former. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	a7cfec3be0	vbo: move VBO-private types, prototypes, etc. into new vbo_private.h header Things which should not be used outside the VBO module. More public/private clean-ups coming. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	d40fa42292	mesa: use new _vbo_install_exec_vtxfmt() function Instead of reaching into the vbo_context object in vtxfmt.c Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	04a17ec327	nouveau: remove vbo_context() call _vbo_DestroyContext() can be safely called even if there's no VBO module. Removes a dependency on the vbo_context() function. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	7b0ae96711	i965: use vbo_set_[indirect]_draw_func() Instead of poking into the vbo_context object. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	3bbf8d9042	vbo: move vbo_sizeof_ib_type() into vbo_exec_array.c It's only used in this one file. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	a152cb7492	mesa: move vbo_count_tessellated_primitives() to api_validate.c It's only used in this file and has nothing VBO-specific about it. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	5d3e10fd27	mesa: update comment on gl_display_list Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	cffa82327d	mesa: whitespace clean-ups in mtypes.h Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	b3a1aa94d9	mesa: remove unused MAT_INDEX_AMBIENT/DIFFUSE/SPECULAR constants Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	67dc551ba9	vbo: move DLIST_DANGLING_REFS from mtypes.h to vbo_save_api.c It's only used in this file. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	cb7ef0df00	vbo: replace assert(0) with unreachable() Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	8b3cb7c651	vbo: fix, add comment in vbo_save.h Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Brian Paul	67ebde19d4	vbo: whitespace, formatting fixes in vbo_split.[ch] Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-24 10:12:49 -07:00
Topi Pohjolainen	ec4bb693a0	i965: Don't try to disable render aux buffers for compute Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104546 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2018-01-24 10:54:08 +02:00
Jason Ekstrand	4064fe59e7	anv/cmd_buffer: Move gen7 index buffer state to graphics state Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:46 -08:00
Jason Ekstrand	38ec78049f	anv/cmd_buffer: Move num_workgroups to compute state While we're here, make it an anv_address. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:44 -08:00
Jason Ekstrand	95ff232294	anv/cmd_buffer: Move dynamic state to graphics state Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:43 -08:00
Jason Ekstrand	24caee8975	anv/cmd_buffer: Use a temporary variable for dynamic state We were already doing this for some packets to keep the lines shorter. We may as well just do it for all of them. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:40 -08:00
Jason Ekstrand	8bd5ec5b86	anv/cmd_buffer: Move vb_dirty bits into anv_cmd_graphics_state Vertex buffers are entirely a graphics pipeline thing. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:39 -08:00
Jason Ekstrand	e85aaec148	anv/cmd_buffer: Move dirty bits into anv_cmd_*_state Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:36 -08:00
Jason Ekstrand	97f96610c8	anv: Separate compute and graphics descriptor sets The Vulkan spec says: "pipelineBindPoint is a VkPipelineBindPoint indicating whether the descriptors will be used by graphics pipelines or compute pipelines. There is a separate set of bind points for each of graphics and compute, so binding one does not disturb the other." Up until now, we've been ignoring the pipeline bind point and had just one bind point for everything. This commit separates things out into separate bind points. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102897 Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:33 -08:00
Jason Ekstrand	31b2144c83	anv/cmd_buffer: Use anv_descriptor_for_binding for samplers Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:31 -08:00
Jason Ekstrand	b9e1ca16f8	anv/cmd_buffer: Add a helper for binding descriptor sets This lets us unify some code between push descriptors and regular descriptors. It doesn't do much for us yet but it will. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:30 -08:00
Jason Ekstrand	90cceaa9dd	anv/cmd_buffer: Refactor ensure_push_descriptor_set It's now a function which returns the push descriptor set. Since we set the error on the command buffer, returning the error is a little redundant. Returning the descriptor set (or NULL on error) is more convenient. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:28 -08:00
Jason Ekstrand	d5592e2fda	anv: Remove semicolons from vk_error[f] definitions With the semicolons, they can't be used in a function argument without throwing syntax errors. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:27 -08:00
Jason Ekstrand	9af5379228	anv/cmd_buffer: Add substructs to anv_cmd_state for graphics and compute Initially, these just contain the pipeline in a base struct. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:25 -08:00
Jason Ekstrand	ddc2d28548	anv/cmd_buffer: Use some pre-existing pipeline temporaries There are several places where we'd already saved the pipeline off to a temporary variable but, due to an artifact of history, weren't actually using that temporary everywhere. No functional change. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:24 -08:00
Jason Ekstrand	cd3feea745	anv/cmd_buffer: Rework anv_cmd_state_reset This splits anv_cmd_state_reset into separate init and finish functions. This lets us share init code with cmd_buffer_create. This potentially fixes subtle bugs where we may have missed some bit of state that needs to get initialized on command buffer creation. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:22 -08:00
Jason Ekstrand	d6c9a89d13	anv/cmd_buffer: Get rid of the meta query workaround Meta has been gone for a long time. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:20 -08:00
Jason Ekstrand	bc0a21e348	anv/cmd_state: Drop the scratch_size field This is a legacy left-over from the mechanism we used to use to handle scratch. The new (and better) mechanism doesn't use this. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:19 -08:00
Jason Ekstrand	4b69ba3817	anv/pipeline: Don't assert on more than 32 samplers This prevents an assert when running one unreleased Vulkan game. Tested-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "18.0" <mesa-stable@lists.freedesktop.org>	2018-01-23 21:10:08 -08:00
Dave Airlie	766589d89a	radv: fix sample_mask_in loading. (v3.1) This is ported from radeonsi and fixes: dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.bit_* v2: don't call this path for radeonsi, it does it in the epilog. use the radeonsi code path. v3: handle NULL pCreateInfo->pMultisampleState properly (Samuel) v3.1: set ps_iter_samples default to 1 (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `bdcbe7c76` (radv: add sample mask input support) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-24 14:25:11 +10:00
Dave Airlie	c727ea9370	radv: don't use hw resolves for r16g16 norm formats. radeonsi has a workaround for this, but it uses a R16A16 format, which vulkan doesn't have, we could probably come up with a work around but for now just avoid hw resolves. Fixes: dEQP-VK.renderpass.suballocation.multisample.r16g16_norm Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `2a04f5481d` (radv/meta: select resolve paths) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-24 09:01:12 +10:00
Dave Airlie	4df414bbd2	radv: don't use hw resolve for integer image formats From reading AMDVLK it currently never uses hw resolve paths. This patch takes from radeonsi which doesn't use hw resolve for integer formats, and does the same for radv. This fixes: dEQP-VK.renderpass.suballocation.multisample*uint tests. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `2a04f5481d` (radv/meta: select resolve paths) Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-24 08:53:18 +10:00
Dave Airlie	316d762186	radv: add fs_key meta format support to resolve passes. Some of the hw resolve passes need the SPI color format setup correctly. This fixes lots of 16-bit and 32-bit format tests in dEQP-VK.renderpass.suballocation.multisample* Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-24 08:50:51 +10:00
Grazvydas Ignotas	224fd17e1e	winsys/svga: check correct member after create .mob_fenced was already checked, probably a copy-paste bug. Found by Coccinelle. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-23 11:04:07 -07:00
Grazvydas Ignotas	08085df313	svga: fix context alloc error handling 'cleanup' path is dereferencing 'svga' a lot, 'done' is a better choice. Found by Coccinelle. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-23 11:04:07 -07:00
Christoph Haag	4b4d929c27	meson: remove lib prefix from libd3dadapter9.so Fixes: `6b4c7047d5` ("meson: build gallium nine state_tracker") Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-23 09:30:30 -08:00
Emil Velikov	3b6d232a5c	docs: update calendar 18.0.0-rc1 is out Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-23 17:02:17 +00:00
Eric Engestrom	eee8dd7c33	radeon: remove left over dead code Fixes: `4e0d99a635` "r100: Use shared debug code" Cc: Pauli Nieminen <suokkos@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-01-23 15:39:57 +00:00
Eric Engestrom	10f5e0dce2	docs: ask for backport nominations to cc: the author Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-23 15:39:57 +00:00
Marc Dietrich	911ca587f8	meson: fix some defines misspelled errors in meson.build Defines - HAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL - HAVE_FUNC_ATTRIBUTE_VISIBILITY were misspelled. Signed-off-by: Marc Dietrich <marvin24@gmx.de> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-01-23 15:39:57 +00:00
Bas Nieuwenhuizen	5a4dc28500	ac/nir: Use instance_rate_inputs per attribute, not per variable. This did the wrong thing if we had e.g. an array for which only some of the attributes use the instance index. Tripped up some new CTS tests. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-23 12:58:48 +01:00
Jason Ekstrand	de00e8227b	anv: Return trampoline entrypoints from GetInstanceProcAddr Technically, the Vulkan spec requires that we return valid entrypoints for all core functionality and any available device extensions. This means that, for gen-specific functions, we need to return a trampoline which looks at the device and calls the right device function. In 99% of cases, the loader will do this for us but, apparently, we're supposed to do it too. It's a tiny increase in binary size for us to carry this around but really not bad. Before: text data bss dec hex filename 3541775 204112 6136 3752023 394057 libvulkan_intel.so After: text data bss dec hex filename 3551463 205632 6136 3763231 396c1f libvulkan_intel.so Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	eac29f3a6d	anv/entrypoints: Use a named tuple for params This allows us to store a bit more detailed data per-param Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	1f79d986af	anv: Only advertise enabled entrypoints The Vulkan spec annoyingly requires us to track what core version and what all extensions are enabled and only advertise those entrypoints. Any call to vkGet*ProcAddr for an entrypoint for an extension the client has not explicitly enabled is supposed to return NULL. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	e3d27542ae	anv: Add a per-device dispatch table We also switch GetDeviceProcAddr over to use it. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	0c399dca51	anv: Add a per-instance dispatch table We also switch GetInstanceProcAddr over to use it. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	a372b9247d	anv: Properly return NULL for GetInstanceProcAddr with a null instance Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	cb0d1ba156	anv/extensions: Fix VkVersion::c_vk_version for patch == None Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	93e789a266	anv/entrypoints: Parse entrypoints before extensions/features Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	2f493121ae	anv/entrypoints: Expose the different dispatch tables Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	083e126694	anv/entrypoints: Split entrypoint index lookup into its own function Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	7039308d7c	anv/entrypoints: Add a LAYERS helper variable Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	f54227856f	anv/entrypoints: Add an Entrypoint class Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	abc62282b5	anv: Add a per-device table of enabled extensions Nothing uses this at the moment, but we will need it soon. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	01b9701a5c	anv: Use tables for device extension wrangling Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	920bd2c0bc	anv: Add a per-instance table of enabled extensions Nothing needs this yet but we will want it later. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	ff5f3e2b21	anv: Use tables for instance extension wrangling This lets us move a bunch of stuff out of codegen and back into anv_device.c which is a bit nicer. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	dd088d4bec	anv/extensions: Generate a header file with extension tables This allows us better introspection into extensions. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	ffb10bfd8e	anv/meson: Simplify some dependency and flag tracking This removes some redundant code between libanv_common, libvulkan_intel, and libvulkan_intel_test. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	f939940809	anv: Split anv_extensions.py into two files The new anv_extensions_gen.py is the code generator while the old anv_extensions.py file is purely declarative. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Jason Ekstrand	10d1b0be8e	anv/meson: Make anv_entrypoints_gen.py depend on anv_extensions.py Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2018-01-23 00:15:40 -08:00
Timothy Arceri	e68150de26	ac: fix image load store for GLSL_SAMPLER_DIM_3D Fixes the following piglit tests: arb_shader_image_load_store/layer/image3d/layered binding test arb_shader_image_load_store/max-size/image3d max size test/2048x8x8x1 arb_shader_image_load_store/max-size/image3d max size test/8x2048x8x1 arb_shader_image_load_store/max-size/image3d max size test/8x8x2048x1 arb_shader_image_load_store/semantics/imageload/vertex shader/rgba32f/image3d test Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 18:05:13 +11:00
Timothy Arceri	82adf53308	ac: image size builtin for GLSL_SAMPLER_DIM_3D This is what radeonsi does. Fixes remaining piglit subtest in: ./bin/arb_shader_image_size-builtin --quick -auto -fbo Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 18:05:13 +11:00
Chuck Atkins	a29d63ecf7	swr: refactor swr_create_screen to allow for proper cleanup on error This makes the following changes to address cleanup issues: - Error conditions now return NULL instead of calling exit() - swr_screen is now freed upon error, rather than leaked. - Library handle from dlopen is now closed upon swr_screen destruction v2: Added additional context in commit msg and remove unnecessary "PUBLIC" v3: Fix typo in commit message. Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: Bruce Cherniak <bruce.cherniak@intel.com> Cc: Tim Rowley <timothy.o.rowley@intel.com> cc: mesa-stable@lists.freedesktop.org	2018-01-22 17:56:44 -06:00
Anuj Phogat	56b9060381	intel: Add Geminilake brand strings Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-22 15:40:04 -08:00
Timothy Arceri	5b9362c248	ac: fix ac_build_varying_gather_values() for packed layouts This fixes a segfault for varyings not starting at component 0. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 10:00:52 +11:00
Timothy Arceri	209b14c2cb	ac: remove arrays when querying sampler info Fixes the following ARB_arrays_of_arrays piglit tests: basic-imagestore-const-uniform-index basic-imagestore-mixed-const-non-const-uniform-index basic-imagestore-mixed-const-non-const-uniform-index2 basic-imagestore-non-const-uniform-index Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 09:50:47 +11:00
Timothy Arceri	0cbc62a4dd	glsl: add image and sampler (un)packing support to glsl to nir This is needed for ARB_bindless_texture support. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 09:44:37 +11:00
Timothy Arceri	549ccbb435	nir: add image and sampler type to glsl_get_bit_size() These are needed for ARB_bindless_texture support. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 09:44:37 +11:00
Timothy Arceri	324d2fe6a7	ac: fix emit vertex stream parameter Fixes the following piglit test on radeonsi: ./bin/arb_enhanced_layouts-gs-stream-location-aliasing Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 09:30:00 +11:00
Timothy Arceri	271067967a	ac: add support for gl_HelperInvocation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 09:23:26 +11:00
Timothy Arceri	3bc5fa69f5	ac/radeonsi: add emit primitive to the abi Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 09:18:37 +11:00
Timothy Arceri	dd4591b794	radeonsi: add generic emit primitive helper This will be shared by the tgsi and nir backends. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 09:18:36 +11:00
Timothy Arceri	fdc2fb4d88	ac: add stream handling to visit_end_primitive() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 09:18:36 +11:00
Timothy Arceri	6bf1c93fe0	radeonsi/nir: fix fs output index Fixes the following piglit tests: arb_blend_func_extended-fbo-extended-blend arb_blend_func_extended-fbo-extended-blend-explicit arb_blend_func_extended-fbo-extended-blend-explicit_gles3 arb_blend_func_extended-fbo-extended-blend-pattern arb_blend_func_extended-fbo-extended-blend-pattern_gles2 arb_blend_func_extended-fbo-extended-blend-pattern_gles3 arb_blend_func_extended-fbo-extended-blend_gles3 ext_framebuffer_multisample/alpha-to-coverage-dual-src-blend ext_framebuffer_multisample/alpha-to-one-dual-src-blend Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 09:11:22 +11:00
Timothy Arceri	882af004d8	ac/nir/radeonsi: add ARB_shader_ballot support Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 09:11:22 +11:00
Timothy Arceri	4a9643413f	ac/nir: add ARB_shader_group_vote support Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 09:11:22 +11:00
Timothy Arceri	dabff1cf7a	radeonsi/nir: add primitive id to inputs scan Fixes the following piglit tests: arb_tessellation_shader/fs-primitiveid-instanced glsl-1.50/primitive-id-no-gs glsl-1.50/primitive-id-no-gs-first-vertex glsl-1.50/primitive-id-no-gs-instanced glsl-1.50/primitive-id-no-gs-strip glsl-1.50/primitive-id-no-gs-strip-first-vertex Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 09:11:21 +11:00
Timothy Arceri	c6a0ce7e54	radeonsi/nir: add nir_intrinsic_load_sample_mask_in to ir scan Fixes a bunch of ARB_sample_shading piglit tests. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-23 09:11:21 +11:00
Samuel Thibault	9131e6d3c2	u_thread: Use pthread_setname_np on linux only. pthread_setname_np was added in glibc 2.12 for the Linux port only, other ports do not necessarily have it. Signed-off-by: Jose Fonseca <jfonseca@vmware.com>	2018-01-22 21:12:41 +00:00
Jose Fonseca	dcbb224c68	svga: Prevent use after free. Courtesy of clang static analyzer. I was hunting for potential sources of memory corruption using Mesa with a GL trace, and happened to find this (unrelated) issue. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2018-01-22 21:12:41 +00:00
Kenneth Graunke	60f15477da	i965: Drop render_target_start from binding table struct. We have to start render targets at binding table index 0 in order to use headerless FB write messages, and in fact already assume this in a bunch of places in the code. Let's finish that off, and not bother storing 0 in a struct to pretend to add it in a few places. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-01-22 10:03:52 -08:00
Emil Velikov	a9bb067e27	i965: make brw_context::num_samples unsigned int It is never a negative number. Variable is compared against unsigned values and passed into functions that expect unsigned int. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-22 16:31:57 +00:00
Emil Velikov	ef1df63046	st/mesa: provide static inline st_init_vdpau_functions The ifdef spaghetti in st_vdpau.c is rather confusing and misleading. Simplify it by introducing a static inline helper noop (when HAVE_ST_VDPAU is not defined) in the header. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-01-22 16:31:15 +00:00
Samuel Pitoiset	33e6e5e6a4	radv: add an option that allows to dump pre-optimization ir With RADV_DEBUG=preoptir. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-01-22 12:28:33 +01:00
Chris Wilson	525b4f7548	i965: Accept CONTEXT_ATTRIB_PRIORITY for brwCreateContext The forward port of commit `6d87500fe1` ("dri: Change __DriverApiRec::CreateContext to take a struct for attribs") failed to adapt the set of allowed attributes for the earlier introduction of context priorities (commit `1617fca6d1` "i965: Pass the EGL/DRI context priority through to the kernel"). Fixes: `6d87500fe1` ("dri: Change __DriverApiRec::CreateContext to take a struct for attribs") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Adam Jackson <ajax@redhat.com> Cc: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org	2018-01-22 10:24:20 +00:00
Matthew Nicholls	005375717b	radv: restore previous stencil reference after depth-stencil clear Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alex Smith <asmith@feralinteractive.com>	2018-01-22 08:57:42 +00:00
Jason Ekstrand	5048572352	i965: Set tiling on BOs imported with modifiers We need this to ensure that GTT maps work on buffers we get from Vulkan on the off chance that someone does a readpixels or something. Soon, we will be removing GTT maps from i965 entirely and this can be reverted. None the less, it's needed for stable. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2018-01-21 23:07:18 -08:00
Jason Ekstrand	b9e7b29705	i965/bufmgr: Add a create_from_prime_tiled function This new function is an import and a set tiling in one go. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2018-01-21 23:07:18 -08:00
Jason Ekstrand	ad424b2243	i965/miptree: Use the tiling from the modifier instead of the BO This fixes a bug where we were taking the tiling from the BO regardless of what the modifier said. When we got images in from Vulkan where it doesn't set the tiling on the BO, we would treat them as linear even though the modifier expressly said to treat it as Y-tiled. Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2018-01-21 23:07:18 -08:00
Jason Ekstrand	0465dd13d2	i965/miptree: Add an explicit tiling parameter to create_for_bo Otherwise, create_for_bo will just grab the tiling from the BO which is not what we want when using modifiers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2018-01-21 23:07:16 -08:00
Bas Nieuwenhuizen	4584c4ef04	radv: Don't allow 3d or 1d depth/stencil textures. addrlib asserts when that happens, and supporting it is not required so let's not allow this for now. It also asserts on fmask, but we don't have the number of samples here. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-22 00:07:43 +01:00
Bas Nieuwenhuizen	8b98929074	radv: Init variant entry with memset. This gets memcpy'd and written directly, and due to alignment, this resulted in uninitialized gaps. This makes those gaps go away. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-22 00:07:39 +01:00
Bas Nieuwenhuizen	fb0992e967	radv: Fix bufimage failure deallocation. The individual init parts don't clean up their own stuff on failure. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-22 00:07:32 +01:00
Bas Nieuwenhuizen	2c802ca66c	radv: Fix fragment resolve init memory allocation failure paths. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-22 00:07:29 +01:00
Bas Nieuwenhuizen	c685076ab0	radv: Fix freeing meta state if the device pipeline cache fails to allocate. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-22 00:07:24 +01:00
Bas Nieuwenhuizen	71f0315a88	radv: Fix memory allocation failure path in compute resolve init. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-22 00:07:19 +01:00
Bas Nieuwenhuizen	d956e0bdf5	radv: Fix ordering issue in meta memory allocation failure path. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-22 00:07:03 +01:00
Lucas Stach	29a0ea699a	etnaviv: dirty TS state when framebuffer has changed When switching between framebuffers with and without TS, the TS state needs to be flushed to the command stream even if the derived state isn't changed. Fixes: `4ee7c2c284` ("etnaviv: enable TS, but disable autodisable") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-01-21 12:58:02 +01:00
Vinson Lee	e03c880971	broadcom/vc5: Fix source file name. Fixes: `c9b2cb7897` ("vc5: add missing files to the tarball") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-01-21 11:13:16 +08:00
Vinson Lee	14abbe604b	broadcom/vc5: Add missing include paths. Fixes: `954a704da3` ("broadcom/vc5: Port the RCL setup to V3D4.1.") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-01-21 11:05:33 +08:00
Eric Anholt	f398aa6aef	mesa: Only require independent blending for GLES 3.2. We've been requiring this since GLES 3.0 was introduced, but the GLES 3.2 spec is the one that has "Supporting blending on a per-draw-buffer basis" in the new features. V3D 3.3 would require lowering blending to shader code to implement independent blending. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-21 10:48:00 +08:00
Kenneth Graunke	60d8fe9216	i965: Delete completely bogus comment This hasn't been true in 6+ years, if it was even true then. Before we rewrote the compiler and introduced GLSL IR in 2010-2011, i965 used to have two compiler backends for WM programs, based on Mesa IR. One handled flow control and was SIMD8-only, while the other was SIMD16 only and didn't handle flow control. Or something like that. Even then, this certainly didn't handle vertex shaders, so "all ... code generation" is a bit strong.	2018-01-20 01:31:25 -08:00
Dylan Baker	436ed65d38	autotools: include meson build files in tarball This adds the meson.build, meson_options.txt, and a few scripts that are used exclusively by the meson build. v2: - Remove accidentally included changes needed to test make dist with LLVM > 3.9 Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-19 16:30:51 -08:00
George Kyriazis	9d80ed0862	swr/rast: Fix llvm5 behavior For some reason llvm5 is picky about accepting a void * type in the case of building an argument list. Since we don't care about the type (we ignore the argument for now), pick another pointer type Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 17:08:30 -06:00
George Kyriazis	d335b32baf	swr/rast: Enable early rasterization Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:43 -06:00
George Kyriazis	bacfbe5a32	swr/rast: Implement Early Rasterization optimization Early Rasterization is an optimization for small triangles. Scientific workloads often contain very small triangles that have non-zero area and cannot be trivially rejected as falling between pixel centers, but do not cover any pixel center. Those triangles can be initially rasterized as early as in the binner and rejected if they cover no pixels. The optimization can be disabled at compile time using the KNOB_ENABLE_EARLY_RAST option in knobs.h. Early Rast is disabled by default. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:43 -06:00
George Kyriazis	be3cd7add1	swr/rast: Enable simd16 vertex shaders Flip the switch(es) to enable simd16 vertex shaders: USE_SIMD16_SHADERS and USE_SIMD16_VS Both have to be enabled at the same time. Currently, just setting USE_SIMD16_SHADERS does not work correctly. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:42 -06:00
George Kyriazis	8c83d2d371	swr: Support simd16 vertex shaders Supporting simd16 vertex shaders involves packing the output of the fetch shader appropriately, especially the vertexID buffers that have to be formatted in one simd16 register, needed by the VS. As part of this support, we needed to remove the 2nd JitManager, since it was not accounting for vector width correctly. USE_SIMD16_SHADERS is also split into two defines. The additional one (USE_SIMD16_VS) controls the width of the vertex shader (VS), while the original one (USE_SIMD16_SHADERS) controls overall front end width. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:42 -06:00
George Kyriazis	1874d95a8e	swr/rast: changed jit debug magic number Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:41 -06:00
George Kyriazis	c719f62621	swr/rast: Added ICLAMP builder function Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:41 -06:00
George Kyriazis	f192502001	swr/rast: Jit debug work Properly validate DLL matches OBJ for jitted function Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:41 -06:00
George Kyriazis	3c405e32b0	swr/rast: silence generated file warnings Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:40 -06:00
George Kyriazis	fe107e3c17	swr/rast: jit shader lib debug work Create shader_lib during build, link with shaders at DLL generation time Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:40 -06:00
George Kyriazis	0cd9ad98a3	swr/rast: AVX-512 changes to enable 16-wide VS Add a new define (USE_SIMD16_VS), to denote calling a 16-wide vertex shader. This is needed because the mesa driver can do 16-wide shaders, but rasty cannot yet, so we need to distinguish. Create a new VertexID entry (VertexID16) for the USE_SIMD16_VS case, since we need to format the vertex id in a way that is digestible by the 16-wide VS Disabled for now. To be enabled in a future checkin when driver work is complete. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:40 -06:00
George Kyriazis	3140e714d2	swr/rast: x86 autogenerated macro work Add name argument to x86 autogenerated macros. Add useful variable names for DCL_inputVec implementation. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:39 -06:00
George Kyriazis	4cd6e2ebfd	swr/rast: Shorten some filenames in shader and fetch dump files Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:39 -06:00
George Kyriazis	3936044d07	swr/rast: work supporting optimizations in Debug builds. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:38 -06:00
George Kyriazis	c4a42f5add	swr/rast: Add debugging type support for function types. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:38 -06:00
George Kyriazis	e9e7f3ce0a	swr/rast: Shader debugging work - Move debug .ll files to JIT_CACHE_DIR - Don't link against jitter SRGBLut table, add global data to shader that needs it. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:34 -06:00
George Kyriazis	34bbcb5052	swr/rast: Debug Symbols work Added support for Fetch / Sample / LD functions Added DLL link to JitCache implementation Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:30 -06:00
George Kyriazis	01ab218bbc	swr/rast: Initial work for debugging support. Adds ability to step into jitted llvm IR in Visual Studio. - Updated llvm type generation script to also generate corresponding debug types. - New module pass inserts debug metadata into the IR for each function Disabled by default. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:52:22 -06:00
George Kyriazis	4660e13152	swr/rast: Add private state parameter in fetcher Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:48:41 -06:00
George Kyriazis	079ae3c48d	swr/rast: Added missing define for Linux/gcc + ZeroMemory() macro definition for non win32-compilation in common/os.h Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:48:41 -06:00
George Kyriazis	70f8eac603	swr/rast: Fix one more invalid object format for windows. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:48:41 -06:00
Bas Nieuwenhuizen	61a790409e	radv: Always re-emit the sample position offset user SGPR. The user SGPR location can change between pipelines, so we need to emit it again to the potentially changed SGPR index. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-19 23:35:12 +01:00
Bas Nieuwenhuizen	dbf1e918cd	radv: emit pa_sc_mode_cntl_0 with multisample state. We don't have the meta kludge with 0 viewports anymore, so we can always enable them. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-19 23:35:12 +01:00
Kenneth Graunke	c7dcee58b5	i965: Avoid problems from referencing orphaned BOs after growing. Growing the batch/state buffer is a lot more dangerous than I thought. A number of places emit multiple state buffer sections, and then write data to the returned pointer, or save a pointer to brw->batch.state.bo and then use it in relocations. If each call can grow, this can result in stale map references or stale BO pointers. Furthermore, fences refer to the old batch BO, and that reference needs to continue working. To avoid these woes, we avoid ever swapping the brw->batch.*.bo pointer, instead exchanging the brw_bo structures in place. That way, stale BO references are fine - the GEM handle changes, but the brw_bo pointer doesn't. We also defer the memcpy until a quiescent point, so callers can write to the returned pointer - which may be in either BO - and we'll sort it out and combine the two properly in the end. v2/v3: - Handle stale pointers in the shadow copy case, where realloc may or may not move our shadow copy to a new address. - Track the partial map explicitly, to avoid problems with buffer reuse where multiple map modes exist (caught by Chris Wilson). v4: - Don't use realloc in the CPU shadow case, it isn't safe. Fixes: `2dfc119f22` "i965: Grow the batch/state buffers if we need space and can't flush." Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v3] Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-01-19 11:30:10 -08:00
Kenneth Graunke	8a5bc304ff	i965: Rename 'aux' to 'prog_data' in program cache. 'aux' is a very generic name, suggesting it can be a bunch of things. However, it's always the brw_*_prog_data structure. So, call it that. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-01-19 11:29:47 -08:00
Chuck Atkins	a4be2bcee2	swr: allow a single swr architecture to be builtin Part 2 of 2 (part 1 is autoconf changes, part 2 is C++ changes) When only a single SWR architecture is being used, this allows that architecture to be builtin rather than as a separate libswrARCH.so that gets loaded via dlopen. Since there are now several different code paths for each detected CPU architecture, the log output is also adjusted to convey where the backend is getting loaded from. This allows SWR to be used for static mesa builds which are still important for large HPC environments where shared libraries can impose unacceptable application startup times as hundreds of thousands of copies of the libs are loaded from a shared parallel filesystem. Based on an initial implementation by Tim Rowley. v2: Refactor repetitive preprocessor checks to reduce code duplication v3: Formatting changes per Bruce C. Also delay screen creation until end to avoid leaks when failure conditions are hit. Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> CC: Tim Rowley <timothy.o.rowley@intel.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 13:16:00 -06:00
Chuck Atkins	2ed8b6f827	swr: (autoconf) allow a single swr architecture to be builtin Part 1 of 2 (part 1 is autoconf changes, part 2 is C++ changes) When only a single SWR architecture is being used, this allows that architecture to be builtin rather than as a separate libswrARCH.so that gets loaded via dlopen. Since there are now several different code paths for each detected CPU architecture, the log output is also adjusted to convey where the backend is getting loaded from. This allows SWR to be used for static mesa builds which are still important for large HPC environments where shared libraries can impose unacceptable application startup times as hundreds of thousands of copies of the libs are loaded from a shared parallel filesystem. Based on an initial implementation by Tim Rowley. v2: Fix comment placement pointed out by Bruce C. Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> CC: Tim Rowley <timothy.o.rowley@intel.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 13:15:54 -06:00
Greg V	8ff8c82630	swr: fix clang 5 null cast warning Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-19 16:15:56 +00:00
Gert Wollny	ea89843b3d	mesa/program: Fix -Wunused-param warning v2: Don't annotate, but remove the unused ctx parameter Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-19 15:45:57 +00:00
Gert Wollny	81d8a0f4a4	mesa/program/prog_execute.c: Silence -Wunused-param v2: Don't annotate, but remove the unused ctx parameter Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-19 15:45:57 +00:00
Gert Wollny	fef4b16523	mesa: Make numSamples an unsigned int As a followup to the previous patch propagate the change of numSamples from int to unsigned to gl_config::samples and consequently fix some -Wsign-compare warnings. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-19 15:45:57 +00:00
Gert Wollny	d0e37599ab	gallium: Make (num_)samples an unsigned int According to the ARB_multisample num_samples is a non-negative integer. Consequently define it as such, fail in glx/choose_visual if a negative number is given. v2: split patch into gallium and mesa part Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-19 15:45:57 +00:00
Andres Gomez	7a2c87177a	docs: correct a typo in releasing instructions Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-19 15:25:53 +02:00
Andres Gomez	7760566ab7	docs: move untar line in basic testing instructions for coherence For scons, windows/mingw dealing with LLVM_CONFIG is done before untarring. This is also more convenient for copy and paste. Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-19 15:25:39 +02:00
Andres Gomez	bd8537fa71	docs: add a notice whenever a release is the final in a series Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-19 15:25:27 +02:00
Andres Gomez	b910eec489	docs: add final release note for 17.2.8 Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-19 15:25:20 +02:00
Andres Gomez	4b50cfef44	docs: add final release note for 17.1.10 Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-19 15:20:57 +02:00
Grazvydas Ignotas	e6abc613e2	st/vdpau: release held lock in error path Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: mesa-stable@lists.freedesktop.org	2018-01-19 13:30:22 +02:00
Juan A. Suarez Romero	302ff82434	docs: update calendar, add news and link release notes to 17.3.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-01-19 10:46:18 +01:00
Juan A. Suarez Romero	059db12097	docs: add sha256 checksums for 17.3.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `bc1503b13f`)	2018-01-19 10:46:18 +01:00
Juan A. Suarez Romero	3205a45fc3	docs: add release notes for 17.3.3 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `80f5f279b3`)	2018-01-19 10:46:18 +01:00
Samuel Iglesias Gonsálvez	7109a1fe13	anv: avoid segmentation fault due to vk_error() vk_error() is a macro that calls __vk_errorf() with instance == NULL. Then, __vk_errorf() passes a pointer to instance->debug_report_callbacks to vk_debug_error(), which segfaults as this pointer is invalid but not NULL. Fixes: `e5b1bd6ab8` "vulkan: move anv VK_EXT_debug_report implementation to common code." Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-19 09:39:05 +01:00
Bas Nieuwenhuizen	32170d87e3	ac/nir: Fix vector extraction if source vector has >4 elements. v2: Add forgotten argument and start offset. Fixes: `91074bb11b` "radv/ac: Implement Float64 SSBO stores." Tested-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-01-19 02:00:28 +01:00
Bas Nieuwenhuizen	f4211e6f93	ac/nir: Use correct 32-bit component writemask for 64-bit SSBO stores. Fixes: `91074bb11b` "radv/ac: Implement Float64 SSBO stores." Tested-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-01-19 02:00:14 +01:00
Bas Nieuwenhuizen	4a9fd90e1e	ac/nir: Fix TCS output LDS offsets. When a channel was not set we also did not increase the LDS address, while that obviously should happen. The output loading code was inadvertently fixed which resulted in a mismatch causing the SaschaWillems tessellation demo to result in corrupt rendering. Fixes: `7898eb9a60` "ac: rework load_tcs_{inputs,outputs}" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-19 01:54:59 +01:00
Bas Nieuwenhuizen	bd5c942cef	radv: Use correct bindings for inputRate in key generation. The bindings also have an index field. Fixes: `49d035122e` "radv: Add single pipeline cache key." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104677 Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-19 01:54:59 +01:00
Bas Nieuwenhuizen	b1444c9ccb	radv: Implement VK_ANDROID_native_buffer. Passes dEQP-VK.api.smoke.* dEQP-VK.wsi.android.* with android-cts-7.1_r12. Unlike the initial anv implementation this does use syncobjs instead of waiting on the CPU. This is missing meson build coverage for now. One possible todo is that linux 4.15 now has a syscall that allows us to export amdgpu fence to a sync_file, which allows us not to force all fences and semaphores to use syncobjs. However, I had trouble with my kernel crashing regularly with NULL pointers, and I'm not sure how beneficial it is in the first place given that intel uses syncobjs for all fences if available. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-19 01:43:55 +01:00
Bas Nieuwenhuizen	a3e241ed07	radv: Add create image flag to not use DCC/CMASK. If we import an image, we might not have space in the buffer for CMASK, even though it is compatible. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-19 01:43:55 +01:00
Bas Nieuwenhuizen	e344cd8178	radv: Generate VK_ANDROID_native_buffer. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-19 01:43:55 +01:00
Bas Nieuwenhuizen	0f89f9b8eb	radv: Replace an assert with unreachable. Otherwise we get uninitialized variable warnings for es_vgpr_comp_cnt. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-19 00:38:45 +01:00
Bas Nieuwenhuizen	e417ab212b	radv: Remove DCC check on CS resolve dst image. Gives a warning when the assert is disabled, and not even necessarily true. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-19 00:38:45 +01:00
George Kyriazis	f76ca91ae0	gallivm: support avx512 (16x32) in interleave2_half lp_build_interleave2_half was not doing the right thing for avx512-style 16-wide loads. This path is hit in the swr driver with a 16-wide vertex shader. It is called from lp_build_transpose_aos, when doing texel fetches and the fetched data needs to be transposed to one component per output register. Special-case the post-load swizzle operations for avx512 16x32 (16-wide 32-bit values) so that we move the xyzw components correctly to the outputs. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-18 17:07:06 -06:00
Brian Paul	9e6efdd177	vbo: fix VBO optimization regression The optimization in change `8e4efdc895` ("vbo: optimize some display list drawing") missed the loopback case. This is used when the glBegin/End primitive doesn't have a uniform set of vertex attributes. The new Piglit gl-1.0-dlist-materials test hits this. So check the aligned_vertex_buffer_offset(list) value and adjust the buffer offset accordingly. We also need to remove the 'start == 0' assertion in the loopback code since it no longer applies. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-18 15:07:17 -07:00
Dylan Baker	26bde1e354	meson: ensure that xmlpool_options.h is generated for targets that need it Currently a couple of gallium targets race with xmlpool_options.h being generated, don't do that. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2018-01-18 13:31:47 -08:00
Timothy Arceri	3bccb5dba9	ac: fix visit_ssa_undef() for doubles V2: use LLVMIntTypeInContext() Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-19 08:09:04 +11:00
Dave Airlie	3153d74207	ac/nir: account for view index in the user sgpr allocation. The view index user sgpr wasn't being accounted for properly, this refactors out the code to decide if it's required and then uses that info to account for it. Fixes: `180c1b924e` (ac/nir: Add shader support for multiviews.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 19:47:40 +00:00
Dave Airlie	5758a8c402	r600: enable ARB_enhanced_layouts Only one piglit test fails, sso-vs-gs-fs-array-interleave There are 3 tests using ssbo without checking sizes failing also but those are test bugs. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-19 05:33:44 +10:00
Chris Wilson	34499e8ddc	intel: Future-proof ring names for aubinator_error_decode The kernel is moving to a $class$instance naming scheme in preparation for accommodating more rings in the future in a consistent manner. It is already using the naming scheme internally, and now we are looking at updating some soft-ABI such as the error state to use the new naming scheme. This of course means we need to teach aubinator_error_decode how to map both sets of ring names onto its register maps. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Michel Thierry <michel.thierry@intel.com>	2018-01-18 17:35:21 +00:00
Kenneth Graunke	3e18c53e59	i965: Bind null render targets for shadow sampling + color. Portal 2 appears to bind RGBA8888_UNORM textures to a sampler2DShadow, and calls shadow2D() on it. This causes undefined behavior in OpenGL. Unfortunately, our sampler appears to hang in this scenario, which is not acceptable. Just give them a null surface instead, which returns all zeroes. Fixes GPU hangs in Portal 2 on Kabylake. Huge thanks to Jason Ekstrand for noticing this crazy behavior while sifting through crash dumps. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104487 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-18 09:32:28 -08:00
Iago Toral Quiroga	7ec6e4e689	anv/query: implement multiview interactions From the Vulkan spec with KHX extensions: "If queries are used while executing a render pass instance that has multiview enabled, the query uses N consecutive query indices in the query pool (starting at query) where N is the number of bits set in the view mask in the subpass the query is used in. How the numerical results of the query are distributed among the queries is implementation-dependent. For example, some implementations may write each view's results to a distinct query, while other implementations may write the total result to the first query and write zero to the other queries. However, the sum of the results in all the queries must accurately reflect the total result of the query summed over all views. Applications can sum the results from all the queries to compute the total result." In our case we only really emit a single query (in the first query index) that stores the aggregated result for all views, but we still need to manage availability for all the other query indices involved, even if we don't actually use them. This is relevant when clients call vkGetQueryPoolResults and pass all N queries to retrieve the results. In that scenario, without this patch, we will never see queries other than the first being available since we never emit them. v2: we need the same treatment for timestamp queries. v3 (Jason): - Better an if instead of an early return. - We can't write to this memory in the CPU, we should use MI_STORE_DATA_IMM and emit_query_availability (Jason). v4 (Jason): - No need to take the value to write as parameter, just hard code it to 0. Fixes test failures in some work-in-progress CTS multiview+query tests. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-18 16:37:06 +01:00
Emil Velikov	c9b2cb7897	vc5: add missing files to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-18 11:36:36 +00:00
Emil Velikov	393cf04fa4	broadcom: add missing headers to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-18 11:21:35 +00:00
Mario Kleiner	d67ef48580	i965/screen: Allow drirc to set 'allow_rgb10_configs' again. Since setup of ALLOW_RGB10_CONFIGS was moved to i965's own brw_config_options.xml, this was hard-coded to false and could not be overridden by drirc. Add some parsing into i965's private screen->optionCache to enable drirc again. Fixes: `b391fb26df` ("dri_util: remove ALLOW_RGB10_CONFIGS option (v2)") Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Marek Olšák <marek.olsak@amd.com> Cc: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-18 08:18:47 +02:00
Samuel Iglesias Gonsálvez	eac629deb6	anv: return VK_ERROR_OUT_OF_DEVICE_MEMORY when surface size is out of HW limits Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-18 06:48:47 +01:00
Timothy Arceri	9248f72c4e	ac: tidy up array indexing logic Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-18 15:59:27 +11:00
Rob Clark	4c69961daf	mesa/st: translate SO info in glsl_to_nir() case This was handled for VS, but not for GS. Fixes for gallium drivers using nir: spec@arb_gpu_shader5@arb_gpu_shader5-xfb-streams-without-invocations spec@arb_gpu_shader5@arb_gpu_shader5-xfb-streams* spec@arb_transform_feedback3@arb_transform_feedback3-ext_interleaved_two_bufs_gs* spec@ext_transform_feedback@geometry-shaders-basic spec@ext_transform_feedback@* use_gs spec@glsl-1.50@execution@geometry@primitive-id* spec@glsl-1.50@execution@geometry@tri-strip-ordering-with-prim-restart gl_triangle_strip * spec@glsl-1.50@transform-feedback-builtins spec@glsl-1.50@transform-feedback-type-and-size v2: don't call st_translate_program_stream_output() for TCS v3: drop scanning patch outputs as TCS can't output xfb Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Karol Herbst <kherbst@redhat.com>	2018-01-18 15:35:58 +11:00
Dave Airlie	44a27cdcec	r600/sb: add lds related peepholes. if no destination: a) convert _RET instructions to non _RET variants if no dst b) set src0 to undefined if it's a READ, this should get DCE then. Acked-By: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:38:17 +00:00
Dave Airlie	3bb2b2cc45	r600/sb: use different stacks for tracking lds and queue usage. The normal ssa renumbering isn't sufficient for LDS queue access, this uses two stacks, one for the lds queue, and one for the lds r/w ordering. The LDS oq values are incremented in their use in a linear fashion. The LDS rw values are incremented in their definitions and used in the next lds operation to ensure reordering doesn't occur. Acked-By: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:38:09 +00:00
Dave Airlie	8cfec333c0	r600/sb: schedule LDS ops in appropriate places. So LDS ops have to be SLOT_X, and LDS OQ reads have read port restrictions so we try and force those into only having one per slot and avoiding bank swizzles. Acked-By: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:38:05 +00:00
Dave Airlie	71a50de4fc	r600/sb: hit the scheduler with a big hammer to avoid lds splits. This tries to avoid an lds queue read getting scheduled separately from an lds ret read, the non-sb code uses the same style of hammer, this isn't foolproof. We can do better, but it's a bit tricky, as you have to scan ahead and either schedule more lds oq moves and more lds reads and that could lead to you running out of space anyways. Acked-By: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:37:56 +00:00
Dave Airlie	46549bd6b6	r600/sb: adding lds oq tracking to the scheduler This adds support for tracking the lds oq read/writes so we can avoid scheduling other things in between. This patch just adds the tracking and assert to show problems. Acked-By: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:37:52 +00:00
Dave Airlie	5002dd4052	r600/sb: add gcm support to avoid clause between lds read/queue read You have to schedule LDS_READ_RET _, x and MOV reg, LDS_OQ_A_POP in the same basic block/clause. This makes sure once we've issued a MOV we don't add another block until we balance it with an LDS read. Acked-By: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:37:42 +00:00
Dave Airlie	046cf68cad	r600/sb: handle lds special dest registers. This adds lds to the geom emit handling Acked-By: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:37:39 +00:00
Dave Airlie	d72590032f	r600/sb: handle LDS operations in folding. Don't try and fold LDS using expressions. Acked-By: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:37:35 +00:00
Dave Airlie	c314b0a27a	r600/sb: add finalising for lds output queue special values. We need to convert these to the hw special registers. Acked-By: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:37:27 +00:00
Dave Airlie	9f3a1e9b0c	r600/sb: add initial support for parsing lds operations. This handles parsing the LDS ops and queue accesses. Acked-By: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:37:13 +00:00
Dave Airlie	795512b235	r600/sb: disable if conversion for hs This fixes bad interactions with the LDS special values. Acked-By: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:37:01 +00:00
Dave Airlie	1ca2eb3bf3	r600/sb: lds ops have no dst register. Although these are op3s they don't have a dst reg. Acked-By: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:36:52 +00:00
Dave Airlie	09c1c13c44	r600/sb: introduce special register values for lds support. For LDS read/write ordering we use the LDS_RW value, reads will wait on previous writes. For LDS read/read from LDS queue ordering we use the LDS_OQ values, we define two for now, though initially we'll just support OQA. Also add the check for the lds oq values Acked-By: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:36:47 +00:00
Dave Airlie	2f2cef385f	r600/sb: update last_cf if alu is the last clause It's rare to have a final alu clause on normal shaders (exports) but tess shaders write to LDS as their output, so we see some alu clauses, and the CF_END gets put in the wrong place. This makes sure to update last_cf correctly. Acked-By: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:36:41 +00:00
Dave Airlie	da977ad907	r600/sb: start adding GDS support This adds support for GDS ops to sb backend. This seems to work for atomics and tess factor writes. Acked-By: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:35:37 +00:00
Dave Airlie	05f5282d63	r600/sb: add tess/compute initial state registers. This stops them being optimised out. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:35:12 +00:00
Dave Airlie	68b976bd91	r600/sb: fix a bug emitting ar load from a constant. Some tess shaders were doing MOVA_INT _, c0.x on cayman, and then hitting an assert in sb_bc_finalize.cpp:translate_kcache. This makes sure the toplevel kcache tracker gets updated, and the clause gets fixed up. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:34:46 +00:00
Dave Airlie	7efcafce7c	r600/shader: only emit add instruction if param has a value. Just saves a pointless a = a + 0; Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:34:43 +00:00
Dave Airlie	2bd01adf14	r600: emit 0 gds_op for tf write. This field is ignored for tf writes so should be 0. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 03:34:36 +00:00
Dave Airlie	9041730d1c	r600: add support for ARB_shader_clock. Reviewed-by: Gert Wollny <gw.fossedev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 13:25:59 +10:00
Dave Airlie	6785034a70	radv/ws: get rid of useless return value This also used boolean, so nice to kill that. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-18 01:57:53 +00:00
Bas Nieuwenhuizen	2ce11ac11f	radv: Initialize DCC on transition from preinitialized. Looks like the decompress does not handle invalid encodings well, which happens with random memory. Of course apps should not use it with random memory, but they are allowed to .... Fixes: `44fcf58744` "radv: Disable DCC for GENERAL layout and compute transfer dest." Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-18 01:57:52 +01:00
Timothy Arceri	e2b9296146	ac: fix buffer overflow bug in 64bit SSBO loads Fixes: `441ee1e65b` "radv/ac: Implement Float64 SSBO loads" Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-18 10:26:58 +11:00
Timothy Arceri	409e15f26f	ac: fix nir_intrinsic_get_buffer_size for radeonsi Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-18 10:25:20 +11:00
Kenneth Graunke	d139b5e4cc	i965: Pass brw_growing_bo to grow_buffer(). Cleaner. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-01-17 13:13:26 -08:00
Kenneth Graunke	81ca8e69e3	i965: Make a helper for recreating growing buffers. Now that we have two of these, we're duplicating a bunch of this logic. The next commit will add more logic, which would make the duplication seem worse. This ends up setting EXEC_OBJECT_CAPTURE on the batch, which isn't necessary (it's already captured), but it should be harmless. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-01-17 13:13:26 -08:00
Kenneth Graunke	02c1c25b1a	i965: Replace cpu_map pointers with a "use_shadow_copy" boolean. Having a boolean for "we're using malloc'd shadow copies for all buffers" is cleaner than having a cpu_map pointer for each. It was okay when we had one buffer, but this is more obvious. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2018-01-17 13:13:26 -08:00
Francisco Jerez	11674dad8a	intel/fs: Optimize and simplify the copy propagation dataflow logic. Previously the dataflow propagation algorithm would calculate the ACP live-in and -out sets in a two-pass fixed-point algorithm. The first pass would update the live-out sets of all basic blocks of the program based on their live-in sets, while the second pass would update the live-in sets based on the live-out sets. This is incredibly inefficient in the typical case where the CFG of the program is approximately acyclic, because it can take up to 2*n passes for an ACP entry introduced at the top of the program to reach the bottom (where n is the number of basic blocks in the program), until which point the algorithm won't be able to reach a fixed point. The same effect can be achieved in a single pass by computing the live-in and -out sets in lock-step, because that makes sure that processing of any basic block will pick up the updated live-out sets of the lexically preceding blocks. This gives the dataflow propagation algorithm effectively O(n) run-time instead of O(n^2) in the acyclic case. The time spent in dataflow propagation is reduced by 30x in the GLES31.functional.ssbo.layout.random.all_shared_buffer.5 dEQP test-case on my CHV system (the improvement is likely to be of the same order of magnitude on other platforms). This more than reverses an apparent run-time regression in this test-case from my previous copy-propagation undefined-value handling patch, which was ultimately caused by the additional work introduced in that commit to account for undefined values being multiplied by a huge quadratic factor. According to Chad this test was failing on CHV due to a 30s time-out imposed by the Android CTS (this was the case regardless of my undefined-value handling patch, even though my patch substantially exacerbated the issue). 
On my CHV system this patch reduces the overall run-time of the test by approximately 12x, getting us to around 13s, well below the time-out. v2: Initialize live-out set to the universal set to avoid rather pessimistic dataflow estimation in shaders with cycles (Addresses performance regression reported by Eero in GpuTest Piano). Performance numbers given above still apply. No shader-db changes with respect to master. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104271 Reported-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:56:08 -08:00
Marek Olšák	63b231309e	gallium: remove PIPE_CAP_USER_CONSTANT_BUFFERS Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-01-17 20:18:00 +01:00
Marek Olšák	85bbcdda34	st/mesa: assume that user constant buffers are always supported Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-01-17 20:17:59 +01:00
Marek Olšák	5981a5226e	nine: assume that user constant buffers are always supported Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-01-17 20:17:59 +01:00
Marek Olšák	e871abe452	gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-01-17 20:17:59 +01:00
Marek Olšák	e411d2572b	st/mesa: expose ARB_sync unconditionally All drivers support it. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-01-17 20:17:59 +01:00
Marek Olšák	3778a0a533	gallium: remove PIPE_CAP_TWO_SIDED_STENCIL Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-01-17 20:17:59 +01:00
Brian Paul	f631a6ae8c	glsl: remove unneeded extern "C" {} bracketing around Mesa includes The two headers already have the right extern "C" annotations. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 11:17:56 -07:00
Brian Paul	7421e34dd6	mesa: move gl_external_samplers() to program.[ch] The function is only called from a couple places. It doesn't make sense to have it in mtypes.h Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 11:17:56 -07:00
Brian Paul	6dc8896726	st/mesa: include util/bitscan.h in st_glsl_to_tgsi_temprename.cpp And use "" instead of <> for including Mesa headers, as we do elsewhere. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 11:17:56 -07:00
Brian Paul	741d423478	glsl: include util/bitscan.h in serialize.cpp Instead of relying on indirect inclusion of the header. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 11:17:56 -07:00
Brian Paul	2f14146200	util: include string.h in u_dynarray.h To get memset() prototype. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 11:17:56 -07:00
Brian Paul	02c0734adc	mesa: remove unneeded #includes of main/compiler.h Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 11:17:56 -07:00
Brian Paul	7845397183	st/mesa: remove unneeded #includes of main/compiler.h Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 11:17:56 -07:00
Brian Paul	bafa33befd	st/mesa: include main/compiler.h in st_cb_queryobj.c To get CPU_TO_LE32() macro. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 11:17:56 -07:00
Brian Paul	7de0262f8f	mesa: include util/macros.h in format_fallback.c To get definition of unreachable() macro. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 11:17:56 -07:00
Brian Paul	484ac243f6	mesa: include compiler.h in disk_cache.c Instead of indirect inclusion to get CPU_TO_LE32() macro. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 11:17:56 -07:00
Brian Paul	ad00a78993	mesa/program: change validate_inputs() local var 'inputs' to GLbitfield64 Both state->prog->info.inputs_read and state->InputsBound are GLbitfield64 so it seems that the OR of those values should be of the same type. I'm not sure this fixes any actual issues though. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2018-01-17 11:17:56 -07:00
Brian Paul	d8306de4ac	vbo: reindent vbo_attrib.h to use 3 spaces Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-17 11:17:56 -07:00
Brian Paul	ef01d911ee	vbo: whitespace, formatting fixes in vbo_exec_api.c Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-17 11:17:56 -07:00
Brian Paul	18f4241b89	vbo: add assertions, comments in vbo_exec_api.c Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-17 11:17:56 -07:00
Brian Paul	5e8962b58f	vbo: whitespace, formatting fixes in vbo_exec_draw.c Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-17 11:17:56 -07:00
Brian Paul	f5140c1a6b	vbo: use inputs_read var to simplify code v2: add some const qualifiers, per Ian. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-17 11:17:56 -07:00
Brian Paul	e6e78d98a2	vbo: whitespace, formatting fixes in vbo_split_copy.c Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-17 11:17:56 -07:00
Brian Paul	380165d399	vbo: use a new local 'array' variable in bind_vertex_list() loop Make the code a bit more concise. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-17 11:17:56 -07:00
Brian Paul	37863c38f3	vbo: remove unneeded #includes in vbo_context.c Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-17 11:17:56 -07:00
Brian Paul	76a08eeeec	vbo: whitespace, formatting fixes in vbo_context.c Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-17 11:17:56 -07:00
Brian Paul	95dec9097f	vbo: change vbo_context attribute map arrays to GLubyte The values will never be larger than VBO_ATTRIB_MAX (currently 44). v2: add STATIC_ASSERT to be sure VBO_ATTRIB_MAX can fit in ubyte, per Emil. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-17 11:17:56 -07:00
Brian Paul	5d78440d58	vbo: lift common code out of switch cases Both switch cases began with the same code. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-17 11:17:56 -07:00
Brian Paul	8e4efdc895	vbo: optimize some display list drawing (v2) The vbo_save_vertex_list structure records one or more glBegin/End primitives which all have the same vertex format. To draw these primitives, we setup the vertex array state, then issue the drawing command. Before, the 'start' vertex was typically zero and we used the vertex array pointer to indicate where the vertex data starts. This patch checks if the vertex buffer offset is an exact multiple of the vertex size. If so, that means we can use zero-based vertex array pointers and use the draw's start value to indicate where the vertex data starts. This means a series of display list drawing commands may have identical vertex array state. This will get filtered out by the Gallium CSO module so we can issue a tight series of drawing commands without state changes to the device. Note that this also works for a series of glCallList commands (not just one list that contains multiple glBegin/End pairs). No Piglit or conform changes. v2: minor fixes suggested by Ian. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	4edc8fdcdc	vbo: rewrite some code in playback_copy_to_current() I think this is a little easier to understand. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	ad3b162b81	vbo: add some comments in vbo_save_api.c Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	ee86c34664	vbo: rename some functions in vbo_save_api.c Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	efb5ce1c5e	vbo: rename some functions in vbo_save_draw.c Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	2064c4e814	vbo: add comment that vbo_save_vertex_list::buffer_offset is in bytes Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	8b06acf642	vbo: minor code simplification in _save_compile_vertex_list() Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	4acc8a0de3	vbo: rename prim to prims Using a plural name makes it easier to see that this is an array and not a pointer to a single object. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	841d1839e2	vbo: removed unused ctx parameter for alloc_prim_store() Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	3442672df4	vbo: rename vbo_save_context::buffer to buffer_map And move the field and improve comments. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	62fd12e5bc	vbo: remove unused vbo_save_context::count field Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	1863b18ce6	vbo: s/GLuint/GLbitfield/ for vbo_save_context::replay_flags Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	65df0d41cf	vbo: rename vbo_save_vertex_list::count to vertex_count Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	b04195b959	vbo: rename vbo_save_vertex_store::buffer to buffer_map To match other parts of the VBO code and make things easier to understand. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	213bef62c3	vbo: rename vbo_save_primitive_store::buffer to prims A little easier to understand. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	1e9ca36058	vbo: whitespace fixes in vbo_save.h Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	c40e9034be	vbo: whitespace fixes in vbo_save_draw.c Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-17 11:17:56 -07:00
Brian Paul	7027d9c1fd	svga: add num-commands-per-draw HUD query This query shows the ratio of total commands vs. drawing commands sent to the vgpu device. This gives some idea of how many state changes are sent per draw call. The closer the ratio is to 1.0, the better. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Neha Bhende <bhenden@vmware.com>	2018-01-17 11:17:56 -07:00
Brian Paul	92840bd276	gallium/hud: Fix support for PIPE_DRIVER_QUERY_TYPE_FLOAT Evidently, nobody has used PIPE_DRIVER_QUERY_TYPE_FLOAT up to this point. Adding a driver query of this type which returns the query value in pipe_query_result::f resulted in garbage output in the HUD. The problem is the pipe_query_result::f field was being accessed through the u64 field and being added to the query_info::results_cumulative field. This patch checks for PIPE_DRIVER_QUERY_TYPE_FLOAT in a few places and scales the float by 1000 before converting to uint64_t. Also, add some comments to explain the query_info::result_index field. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-17 11:17:56 -07:00
Brian Paul	23a5fae317	gallium/hud: remove uint64_t casts in sensor query_sti_load() function The hud_graph_add_value() function takes a double value, so just pass the current/critical values as-is since they're doubles. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-17 11:17:56 -07:00
Brian Paul	f774f2b52f	gallium/hud: compute cpu load, percent with doubles The hud_graph_add_value() function takes a double precision value, so compute it that way. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-17 11:17:56 -07:00
Brian Paul	541f569a19	gallium/hud: s/unsigned/enum pipe_query_type/ Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-17 11:17:56 -07:00
Jon Turney	3c35dad1df	meson: Set with_dri from with_gallium when DRI glx is explicitly configured Set with_dri from with_gallium when DRI GLX is explicitly configured, as well as when DRI GLX is chosen automatically. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-17 18:05:18 +00:00
George Kyriazis	22a2027dd7	meson: add llvm dependency for swr build Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-01-17 11:38:57 -06:00
Leo Liu	d1833b8cd8	st/va: add break for MPEG4 data buffer handling case Signed-off-by: Leo Liu <leo.liu@amd.com>	2018-01-17 08:31:38 -05:00
Leo Liu	3d0b561f34	st/va: remove TODO line for JPEG data buffer handling Nothing to do Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2018-01-17 08:31:38 -05:00
Rhys Kidd	996a397719	i915: No longer rely on compatibility define in intel_bufmgr.h Symbol rename from dri_* to drm_intel_* introduced a number of compatibility defines within intel_bufmgr.h. Replace the old function with the new function, consistent with the balance of this file. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-17 08:08:04 -05:00
Timothy Arceri	1256ab18c1	radeonsi: bump glsl version to 450 for nir backend We still have more work to do but piglit results are looking pretty good. At GLSL 1.50 we have 30647/31118 piglit tests passing. At GLSL 4.50 we have 37927/38551 piglit tests passing. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-18 00:03:33 +11:00
Timothy Arceri	b282207c32	radeonsi/nir: add some missing tcs bits to the nir scan pass Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-18 00:03:33 +11:00
Timothy Arceri	7898eb9a60	ac: rework load_tcs_{inputs,outputs} This shares more code and calls the new shared load_tess_varyings() abi so that the radeonsi nir path now supports tcs output loads. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-18 00:03:33 +11:00
Timothy Arceri	9622b445c8	ac/radeonsi: add tcs load outputs support The code to load outputs is essentially the same as load inputs so we make the interface more generic to maximise code sharing. We will make use of the new support in the following patch. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-18 00:03:33 +11:00
Timothy Arceri	a20016d827	st/glsl_to_tgsi: add ARB_get_program_binary support using TGSI This resolves a game bug in Dead Island. The game doesn't properly handle ARB_get_program_binary with 0 supported formats, and ends up crashing. This will enable ARB_get_program_binary binary support for any driver that currently enables the on-disk shader cache. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85564	2018-01-17 23:43:28 +11:00
Timothy Arceri	a34262aed7	st/glsl_to_tgsi: add st_get_program_binary_driver_sha1() helper This will be used by ARB_get_program_binary. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 23:43:28 +11:00
Timothy Arceri	9eebc55cc2	st/glsl_to_tgsi: add (de)serialise program helpers These will be shared between the on-disk shader cache and ARB_get_program_binary. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 23:43:28 +11:00
Timothy Arceri	dbf7e483b4	st/glsl_to_tgsi: stop passing pipe_shader_state to st_store_tgsi_in_disk_cache() We can instead just get this from st_*_program. V2: store tokens to st_compute_program before attempting to write to cache (fixes crash). Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 23:43:28 +11:00
Timothy Arceri	c69b0dd681	st/glsl_to_tgsi: store num_tgsi_tokens in st_*_program We will need this for ARB_get_program_binary binary support. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-17 23:43:28 +11:00
Michel Dänzer	7b0e8264dd	loader/dri3: Try to make sure we only process our own NotifyMSC events We were using a sequence counter value to wait for a specific NotifyMSC event. However, we can receive events from other clients as well, which may already be using higher sequence numbers than us. In that case, we could stop processing after an event from another client, which could have been received significantly earlier. This would have multiple undesirable effects: * The computed MSC and UST values would be lower than they should be * We could leave a growing number of NotifyMSC events from ourselves and other clients in XCB's special event queue I ran into this with Firefox and Thunderbird, whose VSync threads both seem to use the same window. The result was sluggish screen updates and growing memory consumption in one of them. Fix this by checking the XCB sequence number and MSC value of NotifyMSC events, instead of using our own sequence number. v2: * Use the Present event ID for the sequence parameter of the PresentNotifyMSC request, as another safeguard against processing events from other clients * Rebase on drawable mutex changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> # v1	2018-01-17 11:40:22 +01:00
Bas Nieuwenhuizen	0b8991c0b6	radv: Implement VK_EXT_debug_report. This is not hooked up to any messages yet, but useful for e.g. renderdoc if you add some messages during development. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-17 11:29:04 +01:00
Bas Nieuwenhuizen	e5b1bd6ab8	vulkan: move anv VK_EXT_debug_report implementation to common code. For also using it in radv. I moved the remaining stubs back to anv_device.c as they were just trivial. This does not move the vk_errorf/anv_perf_warn or the object type macros, as those depend on anv types and logging. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-17 11:27:52 +01:00
Timothy Arceri	f69cbb2b53	st/glsl_to_nir: disable io lowering to temps for tess Lowering these to temps makes a big mess, and results in some piglit test failures. Also the radeonsi backend (the only backend to support tess) has support for indirects so there is no need to lower them anyway. Fixes the following piglit tests on radeonsi: tests/spec/arb_tessellation_shader/execution/variable-indexing/tes-input-array-vec3-index-rd.shader_test tests/spec/arb_tessellation_shader/execution/variable-indexing/tes-input-array-vec4-index-rd.shader_test Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-17 17:14:14 +11:00
Jason Ekstrand	af10ce21ff	i965: Enable CCS_E sampling of sRGB textures as UNORM Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-16 21:41:32 -08:00
Jason Ekstrand	96aa558715	i965/draw: Do resolves properly for textures used by TXF Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-16 21:41:32 -08:00
Jason Ekstrand	361e1df1ed	i965/miptree: Refactor CCS_E and CCS_D cases in render_aux_usage This commit unifies the CCS_E and CCS_D cases. This should fix a couple of subtle issues. One is that when you use INTEL_DEBUG=norbc to disable CCS_E, we don't get the sRGB blending workaround. By unifying the code, we give CCS_D that workaround as well. The second issue fixed by this refactor is that the blending workaround appears to be enabled on all gens but really only applies on gen9. Due to a happy accident in the way code was laid out, it was only getting enabled on gen9: gen8 and earlier don't support non-zero-one clear colors, and gen10 supports sRGB for CCS_E so it got caught in the format_ccs_e_compat_with_miptree case. This refactor moves it above the format_ccs_e_compat_with_miptree case so it's an explicit early exit and makes it explicitly only on gen9. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "17.3" <mesa-stable@lists.freedesktop.org>	2018-01-16 21:41:32 -08:00
Jason Ekstrand	f79bb2e651	Re-enable regular fast-clears (CCS_D) on gen9+ This reverts commit `ee57b15ec7`, "i965: Disable regular fast-clears (CCS_D) on gen9+". Now that we've fixed the issue with too many different aux usages in the render cache, it should be safe to re-enable CCS_D for sRGB. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104163 Tested-by: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "17.3" <mesa-stable@lists.freedesktop.org>	2018-01-16 21:41:32 -08:00
Jason Ekstrand	d84275b884	i965: Track format and aux usage in the render cache This lets us perform render cache flushes whenever a surface goes from being used with one aux+format to a different aux+format. This is the "proper" fix for https://bugs.freedesktop.org/102435. `ee57b15ec7` which was really just a partial revert of `3e57e9494c` was just a hack to get rid of a hang in a bunch of Valve games. This solves the actual problem responsible for the hang and lets us enable CCS_E once again. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102435 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "17.3" <mesa-stable@lists.freedesktop.org>	2018-01-16 21:41:32 -08:00
Jason Ekstrand	622786c20c	i965: Call brw_cache_flush_for_render in predraw_resolve_framebuffer This makes sure we flush things out of other caches prior to using a surface through the render cache. Currently, this is a no-op because GL won't let you bind anything other than a color surface as color so it should never end up in the depth cache. However, this does complete the flush/add_bo pair for regular drawing which will be required for the next commit. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "17.3" <mesa-stable@lists.freedesktop.org>	2018-01-16 21:41:32 -08:00
Francisco Jerez	53d8508f1d	i965/gen6-7/sol: Bump primitive counter BO size. Improves performance of SynMark2 OglGSCloth by a further 9.65%±0.59% due to the reduction in overwraps of the primitive count buffer that lead to a CPU stall on previous rendering. Cumulative performance improvement from the series 81.50% ±0.96% (data gathered on VLV). Tested-By: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-16 16:03:56 -08:00
Francisco Jerez	f476b3f6e7	i965/gen6-7/sol: Keep independent counters for the current and previous begin/end block. This allows us to aggregate the primitive counts of a completed transform feedback begin/end block lazily, which in the most typical case (where glDrawTransformFeedback is not used) will allow us to avoid aggregating the primitive counters on the CPU altogether, preventing a stall on previous rendering during glBeginTransformFeedback(), which dramatically improves performance of applications that rely heavily on transform feedback. Improves performance of SynMark2 OglGSCloth by 65.52% ±0.25% (data gathered on VLV). Tested-By: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-16 16:03:48 -08:00
Francisco Jerez	b0c8d61281	i965/gen6-7/sol: Restructure primitive counter into a separate type. A primitive counter encapsulates a scalar aggregating counter for each vertex stream along with a section within the primitive tally buffer which hasn't been read out yet. Defining this as a separate type will allow us to keep multiple counter objects around for the same transform feedback object without any code duplication. Tested-By: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-01-16 16:03:42 -08:00
Timothy Arceri	dc520dafdc	st/mesa: enable ARB_enhanced_layouts on nir drivers I'm guessing this may have been disabled because of missing component packing support. However recent nir linking changes required nir based gallium drivers to support component packing so this should now be ok to enable. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-17 10:42:55 +11:00
Roland Scheidegger	b0413cfd8b	draw: remove VSPLIT_CREATE_IDX macro Just inline the little bit of code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-17 00:01:19 +01:00
Roland Scheidegger	1f462eaf39	draw: fix vsplit code when the (post-bias) index value is -1 vsplit_add_cache uses the post-bias index for hashing, but the vsplit_add_cache_uint/ushort/ubyte ones used the pre-bias index, therefore the code for handling the special case (because -1 matches the initialization value of the cache) wasn't actually working. Commit `78a997f728` actually simplified the cache logic somewhat, but it looks like this particular problem carried over (and duplicated to the ushort/ubyte cases, since before only uint needed it). This could lead to the vsplit cache doing the wrong thing, in particular later fetch_info might indicate there are 0 values to fetch. This only really affected edge cases which were bogus to begin with, but it could lead to a crash with the jit vertex shader, since it cannot handle this case correctly (the count loop is always executed at least once and we would not allocate any memory for the shader outputs), so add another assert to catch it there. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-01-17 00:01:19 +01:00
Grazvydas Ignotas	0ad73031ec	st/va: release held locks in error paths Found with the help of following Coccinelle semantic patch: // <smpl> @@ expression E; @@ $pthread_mutex_lock\\|mtx_lock\\|simple_mtx_lock$(E) ... ( $pthread_mutex_unlock\\|mtx_unlock\\|simple_mtx_unlock$(E); ... return ...; \| + maybe need_unlock(E); return ...; ) // </smpl> Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: mesa-stable@lists.freedesktop.org	2018-01-17 00:39:55 +02:00
Grazvydas Ignotas	cce982a70b	mesa: remove unneeded semicolons Trivial. Found by Coccinelle. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-01-17 00:39:55 +02:00
Grazvydas Ignotas	e3adb1abaf	radeon: remove unneeded semicolons Trivial. Found by Coccinelle. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-01-17 00:39:55 +02:00
Grazvydas Ignotas	6129c03cc7	osmesa: don't check SmoothFlag twice Trivial. Found by Coccinelle. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-01-17 00:39:55 +02:00
Samuel Pitoiset	05f73b9672	ac: set no-signed-zeros-fp-math when RADV_DEBUG="unsafemath" is used This is an optimisation that is recommended by Matt Arsenault, and used by RadeonSI, but it's not compatible with Vulkan. Note that AC_FLOAT_MODE_UNSAFE_FP_MATH includes the no signed zeros flag in LLVM. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-16 21:39:57 +01:00
Samuel Pitoiset	4f5318df2c	ac: set fast math flags when RADV_DEBUG="unsafemath" is used When that debug option is not used, we use the default float mode because the no signed zeros optimisation is not Vulkan compatible. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-16 21:39:55 +01:00
Samuel Pitoiset	2091206ad3	ac: import lp_create_builder() from gallivm Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-16 21:39:53 +01:00
Samuel Pitoiset	ad2b3b2a9c	ac: replace llvm.AMDGPU.kilp by llvm.amdgcn.kill with LLVM 6 This also replaces llvm.AMDGPU.kilp by llvm.AMDGPU.kill with LLVM < 6. Similar to RadeonSI codepath. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-16 21:39:51 +01:00
Juan A. Suarez Romero	9b894c88a6	glsl/linker: link-error using the same name in unnamed block and outside According to OpenGL GLSL 4.20 spec, section 4.3.9, page 57: "It is a link-time error if any particular shader interface contains: - two different blocks, each having no instance name, and each having a member of the same name, or - a variable outside a block, and a block with no instance name, where the variable has the same name as a member in the block." This means that it is a link error if for example we have a vertex shader with the following definition. "layout(location=0) uniform Data { float a; float b; };" and a fragment shader with: "uniform float a;" As in both cases we refer to both uniforms as "a", and thus using glGetUniformLocation() wouldn't know which one we mean. This fixes KHR-GL*.shaders.uniform_block.common.name_matching. v2: add fixed tests (Tapani) Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-16 19:42:35 +01:00
Samuel Thibault	47ac11bcf8	glx: fix non-dri build glXGetDriverConfig parameters do not provide a context to dynamically check for the presence of the function, so the dispatcher directly calls glXGetDriverConfig, but in non-dri builds dri_glx.c didn't provide glXGetDriverConfig. This change makes it just return NULL in that case. Fixes: `84f764a759` "glxglvnddispatch: Add missing dispatch for GetDriverConfig" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-16 16:27:19 +00:00
Indrajit Das	338638a8af	st/va: clear pointers for mpeg2 quantiser matrices This is to fix VA-API issues with GStreamer and MPEG2. Since gstreamer does not pass quantiser matrices with each frame, invalid pointers were being passed to the driver. This patch addresses the same. Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2018-01-16 10:51:15 +01:00
Indrajit Das	f5277e8492	radeon/vcn: update quantiser matrices only when requested Only update them when the pointers are valid. Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2018-01-16 10:51:00 +01:00
Indrajit Das	38dee62c9a	radeon/uvd: update quantiser matrices only when requested Only upload them when the pointers are valid. Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2018-01-16 10:50:08 +01:00
Adam Jackson	1cbcd70b64	Revert "docs: Mark GLX_ARB_context_flush_control done" This reverts commit `d547e18184`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104490 Signed-off-by: Adam Jackson <ajax@redhat.com>	2018-01-15 13:48:41 -05:00
Adam Jackson	138f4e3805	Revert "gallium/dri2: Enable {GLX_ARB,EGL_KHR}_context_flush_control" This reverts commit `0d044351b7`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104490 Signed-off-by: Adam Jackson <ajax@redhat.com>	2018-01-15 13:48:39 -05:00
Adam Jackson	f2a5d27ce2	Revert "i965: Enable flush control" This reverts commit `6ce9006d76`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104490 Signed-off-by: Adam Jackson <ajax@redhat.com>	2018-01-15 13:48:16 -05:00
Samuel Pitoiset	8045f01e2a	Revert "ac/shader: gather If TES reads TESSINNER or TESSOUTER" This can't work for two reasons: - TESSINNER/TESSOUTER are shader input values, so never translated to the intrinsic ops - the shader info pass scans the current stage but we want to know in TCS, if TES reads the tess factors. This fixes 6 regressions related to deqp-vk/tessellation/shader_input_output/tess_level_{inner,outer}_XXX_tes This reverts commit `5ba1a61648`.	2018-01-15 13:47:18 +01:00
Samuel Pitoiset	5842cb0df1	amd/common: fix loading InstanceID for tess on < GFX9 InstanceID is in VGPR2, not 1. One more failure that CTS didn't catch up... Reported-by: Alex Smith <asmith@feralinteractive.com> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-15 11:59:16 +01:00
Samuel Pitoiset	5ba1a61648	ac/shader: gather If TES reads TESSINNER or TESSOUTER This shouldn't be scanned in the pipeline. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-01-15 11:51:47 +01:00
Samuel Pitoiset	aebde47840	ac: remove ac_shader_variant_info::fs::output_mask Unused. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-01-15 11:48:42 +01:00
Gert Wollny	5d6470d26b	r600/shader: Initialize max_driver_temp_used correctly for the first time Without this initialization the temp registers used in tgsi_declaration may use random indices, and this may result in failing translation from TGSI with an error message "GPR limit exceeded", because the random index is greater than the allowed limit implying that the shader uses more temporary registers than available. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-15 09:14:22 +10:00
Rob Clark	f10bd0a0e1	freedreno/ir3: "soft" depth scheduling for SFU instructions First try with a "soft" depth, to try to schedule sfu instructions further from their consumers, but fall back to hard depth (which might result in stalling) if nothing else is avail to schedule. Previously the consumer of a sfu instruction could end up scheduled immediately after (since "hard" depth from sfu to consumer would be 0). This works because legalize pass would insert a (ss) sync bit, but it is sub-optimal since it would cause a stall. Instead prioritize other instructions for 4 cycles if they would not cause a nop to be inserted. This minimizes the stalling. There is a slight penalty in general to overall # of instructions in shader (since we could end up needing nop's later due to scheduling the "deeper" sfu consumer later), but ends up being a wash on register pressure. Overall this seems to be worth a 10+% gain in fps. Increasing the "soft" depth of sfu consumer beyond 4 helps a bit in some cases, but 4 seems to be a good trade-off between getting 99% of the gain and not increasing instruction count of shaders too much. It's possible a similar approach could help for tex/mem instructions, but the (sy) sync bit seems to trigger a switch to a different thread-group to hide memory latency (possibly with some limits depending on number of registers used?). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-01-14 16:14:19 -05:00
Rob Clark	50f9a9aa96	freedreno/a5xx: work around SWAP vs TILE_MODE constraint If the blit isn't changing format, but is changing tiling, just lie and call things ARGB (since the exact component order doesn't matter for a tiling blit). Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-01-14 16:14:19 -05:00
Rob Clark	39b63c18f1	freedreno/a5xx: texture tiling Overall a nice 5-10% gain for most games. And more for things like glmark2 texture benchmark. There are some rough edges. In particular, the hardware seems to only support tiling or component swap. (Ie. from hw PoV, ARGB/ABGR/RGBA/BGRA are all the same format but with different component swap.) For tiled formats, only ARGB is possible. This isn't a big problem for sampling since we also have swizzle state there (and since util_format_compose_swizzles() already takes into account the component order, we didn't use COLOR_SWAP for sampling). But it is a problem if you try to render to a tiled BGRA (for example) surface. The next patch introduces a workaround for blitter, so we can generate tiled textures in ABGR/RGBA/BGRA, but that doesn't help the render-target case. To handle that, I think we'd need to keep track that the tiled format is different from the linear format, which seems like it would get extra fun with sampler views/etc. So for now, disabled by default, enable with FD_MESA_DEBUG=ttile. In practice it works fine for all the games I've tried, but makes piglit grumpy. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-01-14 16:13:39 -05:00
Rob Clark	868b02cfb4	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-01-14 16:10:06 -05:00
Rob Clark	16b91c2254	freedreno: add screen->setup_slices() for tex layout The rules are sufficiently different for a5xx with tiled textures, so split this out into something that can be implemented per-generation. The a5xx specific implementation will come in a later patch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2018-01-14 16:10:06 -05:00
Grazvydas Ignotas	047c6fe2c5	r300g: remove double assignment Trivial. Found by Coccinelle.	2018-01-14 23:04:49 +02:00
Grazvydas Ignotas	6acf22a179	util: use faster zlib's CRC32 implementation zlib provides a faster slice-by-4 CRC32 implementation than the traditional single byte lookup one used by mesa. As most supported platforms now link zlib unconditionally, we can easily use it. Improvement for a 1MB buffer (avg MB/s, n=100, zlib 1.2.8): i5-6600K: mesa 443, zlib 1443 (225% +/- 2.1%); C2D E4500: mesa 403, zlib 1175 (191% +/- 0.9%). It has been verified the calculation results stay the same after this change. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-14 19:10:33 +02:00
Grazvydas Ignotas	87f723408b	android,configure,meson: define HAVE_ZLIB The next change wants to use some optional zlib functionality, however not all platforms currently use zlib. Based on earlier Jordan Justen's patches and their review feedback. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-01-14 18:52:23 +02:00
Grazvydas Ignotas	b7347cc313	util/crc32: don't drop the const qualifier Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-14 18:47:50 +02:00
Timothy Arceri	e6378962ce	ac: add doubles support to isign Fixes a number of int64 piglit tests, for example: generated_tests/spec/arb_gpu_shader_int64/execution/built-in-functions/fs-sign-i64vec2.shader_test Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-14 11:40:03 +11:00
Timothy Arceri	38876c88d1	ac: add i64_0 and i64_1 to llvm build context These will be used in the following patch. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-14 11:40:03 +11:00
Timothy Arceri	741b21b713	ac/nir: fix translation of nir_op_b2i for doubles V2: just zero-extend the 32-bit value. Fixes a number of int64 piglit tests, for example: generated_tests/spec/arb_gpu_shader_int64/execution/conversion/frag-conversion-explicit-bool-int64_t.shader_test Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-14 11:40:03 +11:00
Mauro Rossi	4d61eb8018	ac: fix build error in si_shader assert() is replaced by unreachable(), to avoid the following build error: external/mesa/src/gallium/drivers/radeonsi/si_shader.c:1967:1: error: control may reach end of non-void function [-Werror,-Wreturn-type] } ^ 1 error generated. Fixes: `c797cd6` ("ac: add load_patch_vertices_in() to the abi") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-01-13 18:13:07 +11:00
Timothy Arceri	f0d74ecce8	radv/radeonsi/nir: lower 64bit flrp Fixes a bunch of arb_gpu_shader_fp64 piglit tests for example: generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-mix-double-double-double.shader_test Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-13 18:04:40 +11:00
Eric Anholt	5bc0b63799	broadcom/vc5: Use MSF to ignore discards/non-dispatched channels in loops. Prevents potential infinite loops when a non-dispatched or discarded channel never triggers the loop break condition.	2018-01-12 21:58:24 -08:00
Eric Anholt	762dd52951	broadcom/vc5: Use XOR instead of SUB for execute flags comparisons. I think this should be equivalent other than power, and it's the kind of comparison we use for nir_op_ieq.	2018-01-12 21:58:18 -08:00
Eric Anholt	8e4cba9d92	broadcom/vc5: Also check the update flags for avoiding DCE. I was trying to do a NULL-destination UF, and it got removed.	2018-01-12 21:58:11 -08:00
Eric Anholt	ff77ca8a3b	broadcom/vc5: Fix up channel swizzling for textures on 4.x. I had 3.x putting swizzling in the texture state only for 16-bit texture returns, and in the shader for 32-bit. This may be due to having mixed up the return channel setup on 3.x back before I had moved it into the compiler. On 4.x, the non-border-color texwrap tests are passing nicely with both 16 and 32-bit returns with swizzling in the texture state.	2018-01-12 21:58:04 -08:00
Eric Anholt	e41e33fdb8	broadcom/vc5: Port the draw-time state emission to V3D 4.1.	2018-01-12 21:57:52 -08:00
Eric Anholt	aa77a9cf5a	broadcom/vc5: Rename V3D 3.x Flat Shade Action to match v4.x naming. Now that the actions are reused for centroid and nonperspective, give them a more generic name.	2018-01-12 21:57:45 -08:00
Eric Anholt	8e4c705515	broadcom/vc5: Update pixel center setup for V3D 4.x. The fxcd/fycd instructions now return half-integer pixel centers when not doing sample-rate shading.	2018-01-12 21:57:37 -08:00
Eric Anholt	95873a184e	broadcom/vc5: Print the buffer name in simulator overflow checks. Revealed that I was writing past the TSDA, not the Z buffer as I expected.	2018-01-12 21:57:30 -08:00
Eric Anholt	368bab43fd	broadcom/vc5: Add support for loading varyings in V3D 4.1. The LDVARY signal now writes an arbitrary register, so I took out the magic src register file and replaced it with an instruction with LDVARY set so we have somewhere to hang a QFILE_TEMP destination for register allocation.	2018-01-12 21:57:21 -08:00
Eric Anholt	af9753e246	broadcom/vc5: Update state setup for V3D 4.1.	2018-01-12 21:57:09 -08:00
Eric Anholt	5aaea3c4a0	broadcom/vc5: Add compiler support for V3D 4.x texturing.	2018-01-12 21:56:57 -08:00
Eric Anholt	028f6b327c	broadcom/vc5: Add the new TMU write addresses for V3D 4.x (and r5rep). The V3D 3.x series of TMU writes with meaning depending on the texture type is replaced with writes to specific registers for each texture argument semantic.	2018-01-12 21:56:48 -08:00
Eric Anholt	42a35da96d	broadcom/vc5: Move V3D 3.3 texturing to a separate file. V3D 4.x texturing changes enough that #ifdefs would just make a mess of it.	2018-01-12 21:56:37 -08:00
Eric Anholt	acf30e4916	broadcom/vc5: Move V3D 3.3 VPM write setup to a separate file. For V4.1 texturing, I need the V4.1 XML, so the main compiler needs to stop including V3.3 XML.	2018-01-12 21:56:24 -08:00
Eric Anholt	725d73981a	broadcom/vc5: Set up depth formats for V3D 4.x. We no longer have the small depth-specific output format enum, and instead depth is just at the end of the output image format enum.	2018-01-12 21:56:17 -08:00
Eric Anholt	66f2f3ed97	broadcom/vc5: Always use the RGBA8 formats for RGBX8. The RGBX8 formats were dropped from V3D 4.x, but we don't really need them anyway (we already handle other non-alpha formats by forcing A to 1).	2018-01-12 21:56:10 -08:00
Eric Anholt	469bbd8387	broadcom/vc5: Move the formats table to per-V3D-version compile.	2018-01-12 21:56:00 -08:00
Eric Anholt	34898c8c45	broadcom/vc5: Add support for V3D 4.1 CLIF dumping.	2018-01-12 21:55:49 -08:00
Eric Anholt	409696b76e	broadcom/vc5: Move the body of CLIF dumping to a per-version file. I want the library's entrypoints to still be unversioned, but the actual packet dumping needs to be per-version.	2018-01-12 21:55:38 -08:00
Eric Anholt	90269ba353	broadcom/vc5: Use THRSW to enable multi-threaded shaders. This is a major performance boost on all of V3D, but is required on V3D 4.x where shaders are always either 2- or 4-threaded.	2018-01-12 21:55:30 -08:00
Eric Anholt	86a12b4d5a	broadcom/vc5: Properly schedule the thread-end THRSW. This fills in the delay slots of thread end as much as we can (other than being cautious about potential TLBZ writes). In the process, I moved the thread end THRSW instruction creation to the scheduler. Once we start emitting THRSWs in the shader, we need to schedule the thread-end one differently from other THRSWs, so having it in there makes that easy.	2018-01-12 21:55:23 -08:00
Eric Anholt	a075bb6726	broadcom/vc5: Implement GFXH-1684 workaround. Apparently the VPM writes need to be flushed out before we end the shader.	2018-01-12 21:55:15 -08:00
Eric Anholt	57965755e2	broadcom/vc5: Port drawing commands to V3D 4.x. This required extending the CL submit ioctl, because the tile alloc/state buffer setup has moved from the BCL to register writes.	2018-01-12 21:55:04 -08:00
Eric Anholt	f50d39ab49	broadcom/vc5: Add a test for .ifb in ADD ops. I had a .ifb being decoded weird in sampid, so this is to check that .ifb is fine.	2018-01-12 21:54:57 -08:00
Eric Anholt	267f13dbee	broadcom/vc5: Add the new tessellation opcodes in V3D 4.1.	2018-01-12 21:54:50 -08:00
Eric Anholt	edbd817c30	broadcom/vc5: Use a physical-reg-only register class for LDVPM. This is needed for LDVPM on V3D 4.x, but will also be needed for keeping values out of the accumulators across THRSW.	2018-01-12 21:54:42 -08:00
Eric Anholt	22a02f3e34	broadcom/vc5: Use the new LDVPM/STVPM opcodes on V3D 4.1. Now, instead of a magic write register for VPM stores we have an instruction to do them (which means no packing of other ALU ops into it), with the ability to reorder the VPM stores due to the offset being baked into the instruction. VPM loads also gain the ability to be reordered by packing the row into the A argument. They also no longer write to the r3 accumulator, and instead must be stored to a physical register.	2018-01-12 21:54:33 -08:00
Eric Anholt	55f8a01aca	broadcom/vc5: Drop dead VC5_QPU_* defines from qpu_instr.c. I had all the packing code in this file at one point, but these defines now live in qpu_pack.c.	2018-01-12 21:54:27 -08:00
Eric Anholt	2bd378647b	broadcom/vc5: Add support for QPU pack/unpack/disasm of small immediates.	2018-01-12 21:54:18 -08:00
Eric Anholt	5f227ac210	broadcom/vc5: Enable the driver on V3D 4.1	2018-01-12 21:54:12 -08:00
Eric Anholt	39ce1ab7ba	broadcom/vc5: Port the simulator to support V3D 4.1 This required moving the register accesses to a separate v3dx file, since the register definitions for each V3D version collide. It seems that initializing the v3d_hw from a file dictating 3.3 (v3d_simulator_wrapper.cpp) is safe, though.	2018-01-12 21:54:00 -08:00
Eric Anholt	c81cc767e4	broadcom/vc5: Drop signal bit #defines. Signals are more complicated than that, and tables ended up being better.	2018-01-12 21:53:53 -08:00
Eric Anholt	dfee62eed3	broadcom/vc5: Add support for V3Dv4 signal bits. The WRTMUC replaces the implicit uniform loads in the first two texture instructions. LDVPM disappears in favor of an ALU op. LDVARY, LDTMU, LDTLB, and LDUNIF*RF now write to arbitrary registers, which required passing the devinfo through to a few more functions.	2018-01-12 21:53:45 -08:00
Eric Anholt	81ec2ba229	broadcom/vc5: Fix pack/unpack of vfmul input unpack flags.	2018-01-12 21:53:38 -08:00
Eric Anholt	954a704da3	broadcom/vc5: Port the RCL setup to V3D4.1. The TLB load/store path is rebuilt in this version. There is no longer a single-byte resolved store or the 3-byte extended store. Instead, you get to always use general loads/stores (which, honestly, was tempting even in previous versions).	2018-01-12 21:53:26 -08:00
Eric Anholt	80c84241af	broadcom/vc5: Fix per-tile extra clear packet. I accidentally emitted this into the RCL instead of the per-tile generic list, so we wouldn't get tiles after the first cleared.	2018-01-12 21:52:02 -08:00
Eric Anholt	f13fe510d1	broadcom/vc5: Move the TLB loads and stores to helper functions. This is going to get more complicated with V3D 4.1 support, which redoes all the TLB packets.	2018-01-12 21:51:54 -08:00
Eric Anholt	2c48ce74f7	broadcom/vc5: Convert vc5_cl.h to use the V3DX() macros. To conditionally compile cl_emit() macros per V3D version, we need it to expand to whatever V3D we're building for. This required emitting #define V3D_VERSION 33 in all our currently 3.3-only code.	2018-01-12 21:51:47 -08:00
Eric Anholt	fb4face86a	broadcom/vc5: Introduce v3dx_macros.h and v3dx_pack.h headers. This will be used by vc5 for prefixing functions and including the pack header in v3d-version-dependent code, following the model of anv.	2018-01-12 21:51:40 -08:00
Eric Anholt	7dedfd9660	broadcom/cle: Fix error path of missing a "type" in the XML. We try to emit a #error and continue so that you can debug the missing type at C compile time, but were missing a couple of definitions in that path (sigh, python).	2018-01-12 21:51:34 -08:00
Eric Anholt	3d8ad50370	broadcom/vc5: Add XML for V3D v4.1 (BCM7278)	2018-01-12 21:48:07 -08:00
Samuel Pitoiset	0eb30d81c4	ac: add 'const' qualifiers to the shader info pass For clarification purposes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-12 12:25:21 +01:00
Samuel Pitoiset	20f7f9a328	ac: remove unused ac_nir_compiler_options from gather_info_input_decl() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-12 12:25:19 +01:00
Samuel Pitoiset	d5e369ff8a	nir: add a 'const' qualifier to nir_ssa_def_components_read() To avoid compilation warnings and because this helper shouldn't update anything. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-12 12:25:17 +01:00
Thomas Hellstrom	897c54d522	loader/dri3: Avoid freeing renderbuffers in use Upon reception of an event that lowered the number of active back buffers, the code would immediately try to free all back buffers with an id equal to or higher than the new number of active back buffers. However, that could lead to an active or to-be-active back buffer being freed, since the old number of back buffers was used when obtaining an idle back buffer for use. This led to crashes when lowering the number of active back buffers by transitioning from page-flipping to non-page-flipping presents. Fix this by computing the number of active back buffers only when trying to obtain a new back buffer. Fixes: `15e208c4cc` ("loader/dri3: Don't accidently free buffer holding new back content") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104214 Cc: "17.3" <mesa-stable@lists.freedesktop.org> Tested-by: Andriy.Khulap <andriy.khulap@globallogic.com> Tested-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>	2018-01-12 09:17:35 +01:00
Samuel Iglesias Gonsálvez	e63adf8b1e	anv: VkDescriptorSetLayoutBinding can have descriptorCount == 0 From Vulkan spec: "descriptorCount is the number of descriptors contained in the binding, accessed in a shader as an array. If descriptorCount is zero this binding entry is reserved and the resource must not be accessed from any stage via this binding within any pipeline using the set layout." Fixes: dEQP-VK.binding_model.descriptor_update.empty_descriptor.uniform_buffer Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable@lists.freedesktop.org	2018-01-12 07:08:51 +01:00
Roland Scheidegger	734bef372d	mesa: require at least 14 UBOs for GL 4.3 ARB_ubo requires 12 UBOs (per stage) at least, but this limit has been raised by GL 4.3 to 14, so don't advertize GL 4.3 without it (only checking the vertex stage since all drivers probably have the same limit anyway for other stages). (piglit has minmax tests for that kind of thing, but they go only up to 3.3, so this won't really be noticed.) I think this currently should not affect any driver - r600 until very recently only supported 12 but now advertizes 14 too. Reviewed-by: Brian Paul <brianp@vmware.com>	2018-01-12 02:52:10 +01:00
Roland Scheidegger	85377dc55c	util: fix NORETURN for msvc, add HAVE_FUNC_ATTRIBUTE_NORETURN to c99_compat.h We've seen some problems internally due to macro redefinition. Fix this by adding HAVE_FUNC_ATTRIBUTE_NORETURN to c99_compat.h, and defining it for msvc. And avoid redefinition just in case. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2018-01-12 02:52:10 +01:00
Dave Airlie	ad11fc3571	radv: don't emit unneeded vertex state. If the number of instances hasn't changed and we've already emitted it, don't emit it again. If the vertex shader is the same and the first_instance, vertex_offset haven't changed don't emit them again. This increases the fps in GL_vs_VK -t 1 -m -api vk from around 40 to around 60 here, it may not impact anything else. Dieter also reported smoketest going from 1060->1200 fps. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-12 00:43:07 +00:00
Dave Airlie	e37db93246	radv: trim buffer load result (fixes dota2) Running dota2 since the below commit crashes with an llvm assert. Trim the vector like the other user. This could possibly also be avoided by not padding inside the load vec3->vec4. Fixes: `41c36c4549` (amd/common: use ac_build_buffer_load() for emitting UBO loads) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-12 00:41:55 +00:00
Dylan Baker	aca3b647be	meson: add variable for including include/GL/internal Signed-off-by: <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-01-11 15:40:02 -08:00
Dylan Baker	5fcadaec80	meson: define inc_gbm as empty if not otherwise assigned Otherwise this could be undefined in the egl directory. Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-01-11 15:40:02 -08:00
Dylan Baker	a0a764cde5	meson: move libsensors dependency to libgallium This simplifies the build by removing the need to link targets against libsensors. Suggested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-01-11 15:40:02 -08:00
Dylan Baker	2083a14179	meson: Use dependencies for nir This creates two new internal dependencies, idep_nir_headers and idep_nir. The former encapsulates the generation of nir_opcodes.h and nir_builder_opcodes.h and adding src/compiler/nir as an include path. This ensures that any target that needs nir headers will have the includes and that the generated headers will be generated before the target is built. The second, idep_nir, includes the first and additionally links to libnir. This is intended to make it easier to avoid race conditions in the build when using nir, since the number of consumers for libnir and its headers is quite high. Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Dylan Baker	60856a7b49	meson: don't use intermediate variables that are immediately discarded For things like: loop x = func() list += x end just do: loop list += func() end Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Dylan Baker	4ccb981673	meson: Use consistent style for tests Don't use intermediate variables, use consistent whitespace. Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Dylan Baker	8e981eb2b7	meson: Use include variables These were added after adderlib was mesonified, but it is still good to use them instead of open coding them. Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Dylan Baker	fbf192a67e	meson: Use consistent style Currently the meson build has a mix of two styles: arg : [foo, ... bar], and arg : [ foo, ..., bar, ] For consistency let's pick one. I've picked the latter style, which I think is more readable, and is more common in the mesa code base. v2: - fix commit message Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-11 15:40:02 -08:00
Jason Ekstrand	c3d802d68e	i965: Use UD types for gl_SampleID setup We already had to switch all of the W types to UW to prevent issues with vector immediates on gen10. We may as well use unsigned types everywhere. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-01-11 14:31:47 -08:00
Jason Ekstrand	3d2b157e23	i965/fs: Use UW types when using V immediates Gen 10 has a strange hardware bug involving V immediates with W types. It appears that a mov(8) g2<1>W 0x76543210V will actually result in g2 getting the value {3, 2, 1, 0, 3, 2, 1, 0}. In particular, the bottom four nibbles are repeated instead of the top four being taken. (A mov of 0x00003210V yields the same result.) This bug does not appear in any hardware documentation as far as we can tell and the simulator does not implement the bug either. Commit `6132992cdb` was mostly a no-op except that it changed the type of the subgroup invocation from UW to W and caused us to tickle this bug with basically every compute shader that uses any sort of invocation ID (which is most of them). This is also potentially an issue for geometry shader input pulls and SampleID setup. The easy solution is just to change the few places where we use a vector integer immediate with a W type to use a UW type. Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org Fixes: `6132992cdb`	2018-01-11 14:31:38 -08:00
Timothy Arceri	30c1a93f6d	ac/nir: fix translation of nir_op_fsign for doubles Without this we end up with the llvm error message: "Both operands to a binary operator are not of the same type!" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-12 09:29:18 +11:00
Timothy Arceri	d7b6b8ba52	ac: add f64_0 to the llvm build context Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-12 09:29:18 +11:00
Timothy Arceri	7b971c828a	ac/nir: fix translation of nir_op_frcp for doubles Without this we end up with the llvm error message: "Both operands to a binary operator are not of the same type!" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-12 09:29:18 +11:00
Timothy Arceri	24575c815c	ac/nir: fix translation of nir_op_frsq for doubles Without this we end up with the llvm error message: "Both operands to a binary operator are not of the same type!" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-12 09:29:17 +11:00
Timothy Arceri	c0eb304acd	ac: add f64_1 to the llvm build context Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-12 09:29:17 +11:00
Bas Nieuwenhuizen	b9f4c615f8	radv: reset semaphores & fences on sync_file export. Per spec: "Additionally, exporting a fence payload to a handle with copy transference has the same side effects on the source fence’s payload as executing a fence reset operation. If the fence was using a temporarily imported payload, the fence’s prior permanent payload will be restored." And similar for semaphores: "Additionally, exporting a semaphore payload to a handle with copy transference has the same side effects on the source semaphore’s payload as executing a semaphore wait operation. If the semaphore was using a temporarily imported payload, the semaphore’s prior permanent payload will be restored." Fixes: `42bc25a79c` "radv: Advertise sync fd import and export." Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-11 21:56:13 +01:00
Anuj Phogat	fe668b5c15	intel: Add more Coffee Lake PCI IDs More Coffee Lake PCI IDs have been added to the spec. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2018-01-11 10:16:54 -08:00
Matt Turner	c0ef14f5b1	Revert "Revert "i965/fs: Use align1 mode on ternary instructions on Gen10+"" This reverts commit `2d04572038`. Acked-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-01-11 10:11:59 -08:00
Matt Turner	01ebfbb67a	i965/fs: Add/use functions to convert to 3src_align1 vstride/hstride Some cases weren't handled, such as stride 4 which is needed for 64-bit operations. Presumably fixes the assertion failure mentioned in commit `2d04572038` (Revert "i965/fs: Use align1 mode on ternary instructions on Gen10+") but who can really say since the commit neglected to list any of them! Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2018-01-11 10:11:59 -08:00
Alex Smith	4fd85617c1	anv: Make sure state on primary is correct after CmdExecuteCommands After executing a secondary command buffer, we need to update certain state on the primary command buffer to reflect changes by the secondary. Otherwise subsequent commands may not have the correct state set. This fixes various issues (rendering errors, GPU hangs) seen after executing secondary command buffers in some cases. v2 (Jason Ekstrand): - Reset to invalid values instead of pulling from the secondary - Change the comment to be more descriptive Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org	2018-01-11 18:11:08 +00:00
Brian Paul	bb951d45f2	svga: simplify failure code in emit_rss_vgpu9() No need for a goto. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-11 08:06:38 -07:00
Brian Paul	8f884b83d4	svga: remove unused fail parameter to EMIT_RS(), EMIT_RS_FLOAT() Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-11 08:06:38 -07:00
Brian Paul	879ea9432d	svga: add assertion in svga_queue_rs() Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-11 08:06:38 -07:00
Brian Paul	66c6cec612	svga: whitespace/formatting fixes in svga_state_rss.c Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2018-01-11 08:06:38 -07:00
Andres Gomez	a1901d092c	anv: Import mako templates only during execution of anv_extensions anv_extensions usage from anv_icd was bringing the unwanted dependency of mako templates for the latter. We don't want that since it will force the dependency even for distributable tarballs which was not needed until now. Jason suggested this approach. v2: Patch simplification (Jason). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104551 Fixes: `0ab04ba979` ("anv: Use python to generate ICD json files") Cc: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-11 14:44:03 +02:00
Tapani Pälli	f2c0e47d9c	glsl: cleanup shader_cache header guard Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-01-11 08:04:56 +02:00
Samuel Iglesias Gonsálvez	c0816389c2	anv: fix maxDescriptorSet* limits "The maxDescriptorSet* limit is n times the corresponding maxPerStageDescriptor* limit, where n is the number of shader stages supported by the VkPhysicalDevice. If all shader stages are supported, n = 6 (vertex, tessellation control, tessellation evaluation, geometry, fragment, compute)." Fixes: dEQP-VK.api.info.device.properties Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-11 07:00:42 +01:00
Timothy Arceri	c797cd605a	ac: add load_patch_vertices_in() to the abi Fixes the following test for radeonsi nir: tests/spec/arb_tessellation_shader/execution/quads.shader_test Also stops 8 other tests from crashing, they now just fail e.g. tcs-output-array-float-index-rd-after-barrier.shader_test Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-11 14:28:37 +11:00
Bas Nieuwenhuizen	67e09c8b45	ac/nir: Sanitize location_frac for local variables. If they were promoted from inputs/outputs, they could have a non-zero value left over, which messed with our store handling. Fixes: `06f05040eb` "radv: Link shaders." Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-01-11 00:56:52 +01:00
Rob Herring	af8fd38996	tgsi: include struct definitions for tgsi_build declarations Many of the functions declared in tgsi_build.h return structs (not struct pointers). Therefore the full struct definitions are needed to avoid warnings or errors: In file included from src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp:23: external/mesa3d/src/gallium/auxiliary/tgsi/tgsi_build.h:47:1: error: 'tgsi_build_header' has C-linkage specified, but returns incomplete type 'struct tgsi_header' which could be incompatible with C [-Werror,-Wreturn-type-c-linkage] This error shows up on Android builds using clang and -Werror. Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Herring <robh@kernel.org>	2018-01-10 14:56:09 -06:00
George Kyriazis	5c4081d66d	swr: Handle indirect indices in GS BuilderSWR::swr_gs_llvm_fetch_input() (and consequently swr_gs_llvm_fetch_input()) did not handle the case where is_vindex_indirect or is_aindex_indirect is set. Implement it, using the code in draw_llvm.c as a guideline. Fixes the following piglit tests: dynamic_input_array_index (crash) gs-input-array-vec4-index-rd vs-output-array-vec4-index-wr-before-gs Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-10 14:02:17 -06:00
Samuel Pitoiset	41c36c4549	amd/common: use ac_build_buffer_load() for emitting UBO loads Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-10 19:02:27 +01:00
Samuel Pitoiset	7239e265eb	amd/common: import get_{load,store}_intr_attribs() from RadeonSI v2: move those helpers to the header and use static inline Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)	2018-01-10 19:02:23 +01:00
Marek Olšák	b391fb26df	dri_util: remove ALLOW_RGB10_CONFIGS option (v2) This is unused because it's for libGL/libEGL, not drivers. v2: i965 was wrong, because it used dri_util instead of its own config. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-10 17:52:56 +01:00
Tim Rowley	c259888c52	swr/rast: switch win32 jit format to COFF Allows for call-stack and exception handling for jitted functions. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-10 09:44:07 -06:00
Tim Rowley	3d4d34e380	swr/rast: don't use 32-bit gathers for elements < 32-bits in size Using a gather for elements less than 32-bits in size can cause pagefaults when loading the last elements in a page-aligned-sized buffer. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-10 09:44:07 -06:00
Tim Rowley	f5f1bbcb5c	swr/rast: autogenerate named structs instead of literal structs Results in far smaller and useful IR output. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-10 09:44:07 -06:00
Tim Rowley	04d0bfde39	swr/rast: SIMD16 fetch shader jitter cleanup Bake in USE_SIMD16_BUILDER code paths (for USE_SIMD16_SHADER defined), remove USE_SIMD16_BUILDER define, remove deprecated pseudo-SIMD16 code paths. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-10 09:44:07 -06:00
Tim Rowley	d3a4c8057d	swr/rast: shuffle header files for msvc pre-compiled header usage Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-10 09:44:07 -06:00
Tim Rowley	e14b48e00e	swr/rast: SIMD16 builder - cleanup naming (simd2 -> simd16) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-10 09:44:07 -06:00
Ian Romanick	336afe7d7a	glsl/linker: Safely generate mask of possible locations If MaxAttribs were ever raised to 32, undefined behavior would occur. We had already gone to the effort (albeit incorrectly) to handle this in one case, so fix them all. CID: 1369628 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-10 07:21:12 -08:00
Ian Romanick	0c9df36157	glsl/linker: Mark no locations as invalid instead of marking all locations If max_index were ever 32, the linker would have marked all 32 locations as invalid instead of marking none of them as invalid. It's a good thing the maximum value actually set by any driver for MaxAttribs is 16. Found by inspection while investigating CID 1369628. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-10 07:21:11 -08:00
Ian Romanick	702dc43f7e	glsl: Don't handle visit_stop in several ::accept methods All cases where the result could be non-visit_continue would have already returned. CID: 401351, 1224465, 1224466 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-10 07:21:11 -08:00
Ian Romanick	a170f27958	glsl: Remove unnecessary assignments to type None of these are necessary because result->type is the only thing used outside the giant switch-statement. CID: 1230983, 1230984 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-10 07:21:11 -08:00
Ian Romanick	fd2f4f507f	nir: Silence unused parameter warnings In file included from src/compiler/nir/nir_opt_algebraic.c:4:0: src/compiler/nir/nir_search_helpers.h: In function ‘is_not_const’: src/compiler/nir/nir_search_helpers.h:118:59: warning: unused parameter ‘num_components’ [-Wunused-parameter] is_not_const(nir_alu_instr instr, unsigned src, unsigned num_components, ^~~~~~~~~~~~~~ src/compiler/nir/nir_search_helpers.h:119:29: warning: unused parameter ‘swizzle’ [-Wunused-parameter] const uint8_t swizzle) ^~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2018-01-10 07:21:11 -08:00
Bas Nieuwenhuizen	d0ef3d4bb0	radv: Remove some typos. Trivial.	2018-01-10 13:26:27 +01:00
Bas Nieuwenhuizen	5db0bf9994	radv: Implement VK_EXT_discard_rectangles. Tested with a modified deferred demo and no regressions in a 1.0.2 mustpass run. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-10 13:26:22 +01:00
Bas Nieuwenhuizen	11b9cdd2d7	radv: Add mapping between dynamic state mask and external enum. The EXT values are really large, e.g. VK_DYNAMIC_STATE_DISCARD_RECTANGLE_EXT = 1000099000, so 1 << value is not going to fit into a 32-bit mask. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-10 13:24:31 +01:00
Samuel Pitoiset	7145b20afb	amd/common: bump the number of available user SGPRS to 32 on GFX9 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-10 12:35:08 +01:00
Samuel Pitoiset	a1f1f708c0	radv: remove radv_pipeline_layout::push_constant_stages field Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-10 12:31:57 +01:00
Samuel Pitoiset	d43f50c00b	amd/common: do not rely on the pipeline for the push constants logic It makes more sense to rely on nir_intrinsic_load_push_constant instead of the pipeline layout. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-10 12:31:54 +01:00
Samuel Pitoiset	4e701cf75c	radv/gfx9: calculate the number of ES VGPRs for merged shaders Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-10 12:31:53 +01:00
Samuel Pitoiset	232c418af5	radv/gfx9: enable LDS for GS only if the ES type is TES Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-10 12:31:51 +01:00
Samuel Pitoiset	9e2395faf5	amd/common: determine the ES type (VS or TES) for the GS on GFX9 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-10 12:31:49 +01:00
Iago Toral Quiroga	9ef5b3d517	i965/nir: lower TES PatchVerticesIn to a constant when a TCS is present When a TCS is present at link time we know the number of vertices in the patch and we can lower gl_PatchVerticesIn in the TesEval stage directly to a constant. We already have a pass for this that we use in the Vulkan pipeline, so we just reuse that. Notice that the GLSL linker also implements this optimization, which we are not removing because other drivers may still depend on it, so this should only be useful for OpenGL SPIR-V shaders for now. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-10 08:21:02 +01:00
Iago Toral Quiroga	7e5c81235f	glsl: remove Lower{TCS,TES}PatchVerticesIn Intel was the only user and now NIR can do the lowering. v2: do not try to handle it as a system value directly for the SPIR-V path. In GL we rather handle it as a uniform like we do for the GLSL path (Jason). v3: drop LowerTESPatchVerticesIn as well (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-10 08:21:02 +01:00
Iago Toral Quiroga	dae856eced	i965: lower gl_PatchVerticesIn to a uniform We want this here instead of nir_lower_system_values because for Vulkan we don't want this lowering to take place. v2: do not try to handle it as a system value directly for the SPIR-V path. In GL we rather handle it as a uniform like we do for the GLSL path (Jason). v3: do this also for the TessEval stage (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-10 08:21:02 +01:00
Iago Toral Quiroga	4317c848b9	i965/nir: add a helper to lower gl_PatchVerticesIn to a uniform v2: do not try to handle it as a system value directly for the SPIR-V path. In GL we rather handle it as a uniform like we do for the GLSL path (Jason). v3: - Remove the uniform variable, it is always -1 now (Jason) - Also do the lowering for the TessEval stage (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-10 08:21:02 +01:00
Roland Scheidegger	ea227f4322	r600: don't emit tes samplers/views when tes isn't active Similar to const buffers. The driver must not emit any tes-related state if tes is disabled, since the hw slots are all shared by VS, therefore it would overwrite them (the mesa state tracker might not do this, but it would be perfectly legal to do so). Nevertheless I think the dirty state tracking logic in the driver is fundamentally flawed when tes is disabled/enabled, since it looks to me like the VS (and TES) state would not get reemitted to the correct slots (if it's not dirty anyway). Unless I'm missing something... Theoretically, the overwrite problem could be solved by using non-overlapping resource slots for TES and VS (since we're not even close to using half the resource slots), but it wouldn't work for constant buffers nor samplers, and for VS would still need to propagate changes to both LS and VS, so probably not a useful idea. Unfortunately there's zero coverage of this with piglit, since all tessellation shader tests are just shader_runner tests, which are unsuitable for testing any kind of state dependency tracking issues (so I can't even quickly hack something up to prove it and fix it...). TCS otoh is just fine - like GS it has its own hw slots. Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-10 04:59:00 +01:00
Roland Scheidegger	523b6c8704	r600: increase number of UBOs to 15 With the exception of the default tess levels only ever accessed by the default tcs shader, the LDS_INFO const buffer was only accessed by vtx instructions, and not through kcache. No idea why really, but use this to our advantage by not using a constant buffer slot for it. This just requires us to throw the default tess levels into the "normal" driver const buffer instead. Alternatively, could access those constants via vtx instructions too, but then we couldn't use an ordinary ureg prog accessing them as constants and would have to generate that directly when compiling the default tcs shader. (Another alternative would be to put all lds info into the ordinary driver const buffer, albeit we'd maybe need to increase the fixed size as it can't fit alongside the ucp since vs needs access to the lds info too.) Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru> Dave Airlie <airlied@redhat.com>	2018-01-10 04:59:00 +01:00
Roland Scheidegger	c5162fd3c4	r600: use GET_BUFFER_RESINFO vtx fetch on eg instead of setting up consts Contrary to what the comment said, this appears to work just fine on my rv770 (tested with piglit textureSize 140 fs/vs samplerBuffer). Dave Airlie confirmed it working on cayman too. I have no clue though if it's actually preferable to use it (unfortunately we cannot get rid of the tex constants completely, as we still require them for cube map txq). Albeit filling in the format (1 channels or 4?) and the stuff related to mega- or mini-fetch (what the hell is this...) is just a guess based on other usage of vtx fetch instructions... v2: it really needs to be done through texture cache (I botched the testing because sb optimizations turned it automatically into tc, but can't rely on it and isn't happening on tes). Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-10 04:59:00 +01:00
Roland Scheidegger	0be1dc25cf	r600: increase number of ubos by one to 14 Ideally we'd support 16 (d3d11 requires 15, and mesa subtracts one for non-ubo constants), but that's kind of impossible (it would be only doable if either we'd somehow merge the mesa non-ubo constants with the driver constants, or only use the driver constants with vtx fetch instead of through the kcache mechanism - the latter probably wouldn't be too bad). For now just do as the comment already said, place the gs ring (not really a const buffer in any case) which is only ever referred to through vc fetch clauses at index 16. Throw in a couple asserts for good measure to make sure the hw limit isn't exceeded. Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-10 04:59:00 +01:00
Roland Scheidegger	43292c78b7	r600: set up constants needed for txq for buffers and cube maps with tes We only did this for the other stages, but obviously tess eval/ctrl need it too. This fixes the (newly modified) piglit texturing/textureSize test when run with tes stage and bufferSampler. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-10 04:59:00 +01:00
Roland Scheidegger	22ba4ebb18	r600: don't emit reloc for ring buffer out into the blue It looks like this reloc belongs to setting the constant reg, which is skipped for gs ring. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-10 04:59:00 +01:00
Roland Scheidegger	76baf99737	r600: hack up num_render_backends on Juniper to 8 Juniper really has a maximum of 4 RBEs (16 pixels). However, predication always locks up on my HD 5750, and through experiments it looks like if we're pretending it has a maximum of 8, with 4 disabled, it works correctly. My conclusion would be that there's a bug (likely firmware, not hw) which causes the predication logic to try to read 8 results out of the query buffer instead of just 4, and since of course no one ever writes the upper 4, the status bit is never set and hence it will wait for it forever. Ideally this would be fixed in firmware, but I'd guess chances of that happening are slim. This will double the size of (occlusion) query result buffers, write the status bit for the disabled rbs in these buffers, and will also add 8 results together instead of just 4 when reading them back. The latter is unnecessary, but it's probably not worth bothering - luckily num_render_backends isn't used outside of occlusion queries, so don't need separate value for the "real" maximum. Also print out the enabled_rb_mask if it changed from the pre-fixed value (which is already printed out), just in case there's some more problems with chips which have some rbs disabled... This fixes all the lockups with piglit nv_conditional_render tests on my HD 5750 (all pass). Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-10 04:59:00 +01:00
Roland Scheidegger	f0dd1b3612	winsys/radeon: fix up default enabled_rb_mask for r600 The logic had two fatal flaws which completely killed the default value. 1) drm will overwrite the value anyway even if the chip can't be handled 2) the default value logic is relying on num_render_backends, which was filled in later. Luckily no one is relying on it, but it's a bit confusing seeing the chip clock printed out there (as hex) with R600_DEBUG=info... (Albeit radeonsi does not appear to fix up the value. If kernels which don't handle this query are still supported, radeonsi will still end up with a broken enabled_rb_mask, I have no idea of the potential results of this there.) Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-10 04:59:00 +01:00
Roland Scheidegger	7c0bc495f1	r600: fix enabled_rb_mask on eg/cm For eg/cm, the r600_gb_backend_map will always be 0. This is a bug in the drm kernel driver, as it just never fills the information in (it is now being fixed - the history shows it was being filled in when the query was brand new but got lost shortly thereafter with backend_map fixes). This causes r600_query_hw_prepare_buffer to write the "status bit" (just the highest bit of the occlusion query result) even for active rbes (all but the first). This doesn't make much sense, albeit I suppose it's mostly safe. According to the commit history, it's necessary to set these bits for inactive rbes since otherwise predication will lock up - presumably the hw just is waiting for the status bit to appear, which will never happen with inactive rbes. I'd guess potentially predication could be wrong (due to not waiting for the actual result if the status bit is already there) if this is set for active rbes. Discovered while trying to fix predication lockups on Juniper (needs another patch). Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-10 04:59:00 +01:00
Roland Scheidegger	762ccf483a	r600: fix sampler indexing with texture buffers sampling This fixes the new piglit test. While here also fix up the logic for early exit of setting up driver consts. Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-10 04:59:00 +01:00
Roland Scheidegger	6c8d6ce982	r600: don't use vtx offset for load_sample_position The offset looks bogus to me. Albeit in the end it doesn't matter, by the looks of it offsets smaller than 4 get ignored there (not sure of the rules, I suppose either non-dword aligned offsets never work there or the offset must be at least aligned to the size of a single element). Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-10 04:59:00 +01:00
Dave Airlie	f4b1ec2972	r600: drop l2 related queries radeonsi only. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-10 00:56:09 +00:00
Dave Airlie	e836fb2002	r600/shader: only read back the necessary tess factor components. This just reduces the lds reads for the tess factor emission. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-01-10 00:54:32 +00:00
Jon Turney	adfb9c5c7b	Fix use of alloca() without #include <c99_alloca.h> ../../../src/mesa/main/shaderapi.c: In function ‘_mesa_ShaderBinary’: ../../../src/mesa/main/shaderapi.c:2188:9: error: implicit declaration of function ‘alloca’ [-Werror=implicit-function-declaration]	2018-01-09 22:07:52 +00:00
Kenneth Graunke	28c2d0d80b	genxml: Add missing INSTDONE_1 bits on Gen7.5+. This will make aubinator_error_decode decode them properly. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-09 10:13:53 -08:00
Kenneth Graunke	8eadc2fb8f	intel: Apply Geminilake "Barrier Mode" workaround. Apparently, Geminilake requires you to whack a chicken bit to select either compute or tessellation mode for barriers. The recommendation is to switch between them at PIPELINE_SELECT time. We may not need to do this all the time, but I don't know that it hurts either. PIPELINE_SELECT is already a pretty giant stall. This appears to fix hangs in tessellation control shaders with barriers on Geminilake. Note that this requires a corresponding kernel change, drm/i915: Whitelist SLICE_COMMON_ECO_CHICKEN1 on Geminilake. in order for the register write to actually happen. Without an updated kernel, this register write will be noop'd and the fix will not work. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-01-09 10:13:33 -08:00
Emil Velikov	5e7d06fcb0	docs: update calendar, add news and link release notes for 17.3.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2018-01-09 16:13:31 +00:00
Emil Velikov	ea9c548494	docs: add sha256 checksums for 17.3.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `3a67ca681b`)	2018-01-09 16:10:12 +00:00
Emil Velikov	6c73767596	docs: add release notes for 17.3.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `0f27052e32`)	2018-01-09 16:10:10 +00:00
Indrajit Das	e05d5b0cf3	st/omx_bellagio: Update default intra matrix per MPEG2 spec Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2018-01-09 09:10:24 -05:00
Scott D Phillips	42f421cbbf	aubinator: add support for aubinating memtrace aubs Memtrace aubs are similar to classic aubs, with the major difference being how command submission is serialized (as register writes instead of a high-level submit message). Some internal tools generate or consume only memtrace aubs. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-01-08 21:11:11 -08:00
Scott D Phillips	8cdf5bd292	aubinator: extract aubinator_init() out of the header handler function A later patch will use the aubinator_init() function from the memtrace aub header handler. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-01-08 21:11:11 -08:00
Scott D Phillips	4f0a2ff4c1	aubinator: honor --color option when printing the header Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-01-08 21:11:11 -08:00
Scott D Phillips	161a97c3d5	.gitignore: Ignore new generated files New generated files from: `bb1e6ff161` ("spirv: Add a prepass to set types on vtn_values") `65fc16c974` ("autotools: set XA versions in configure.ac and configure header file") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2018-01-08 21:11:11 -08:00
Dylan Baker	73ce7cb474	Meson: ensure variable defined A gallium driver is undefined if passing -Dgallium-drivers='' Fixes: `e0b037d697` ("meson: Build SWR driver") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Acked-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2018-01-08 17:43:45 -08:00
Dylan Baker	21bca27349	meson: Fix typo in clover build The leading space breaks things. fixes: `42ea0631f1` ("meson: build clover") Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-08 17:31:55 -08:00
Dylan Baker	eab0316d10	meson: set opencl flags for r600 Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-08 16:39:48 -08:00
Dylan Baker	42ea0631f1	meson: build clover This has only been compile tested. v2: - Have a single option for opencl (Eric E) - fix typo "tgis" -> "tgsi" (Curro) - Don't add "lib" to pipe loader libraries, which matches the autotools behavior v3: - Remove trailing whitespace - Make PIPE_SEARCH_DIR an absolute path v4: - add trailing / to LIBCLC defines Acked-by: Curro Jerez <currojerez@riseup.net> Tested-by: Jan Vesely <jan.vesely@rutgers.edu> cc: Aaron Watry <awatry@gmail.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-08 16:39:42 -08:00
Dylan Baker	425fcbde3f	meson: Turn on swr for relevant targets Currently that's dri, libgl-xlib, and osmesa. v2: - put drivers on a separate line from normal dependencies (Eric E) cc: George Kyriazis <george.kyriazis@intel.com> cc: Tim Rowley <timothy.o.rowley@intel.com> cc: Bruce Cherniak <bruce.cherniak@intel.com> Signed-off-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2018-01-08 16:39:37 -08:00
Dylan Baker	e0b037d697	meson: Build SWR driver This enables the SWR driver, but doesn't actually hook it up to any of the targets yet. I felt like this patch was big and complicated enough without adding that. v2: - Fix typo 'delemeited' -> 'delimited' (Eric E) - Fix typo 'errror' -> 'error' (Eric E) - Use variables to hold files instead of looking above the current meson build (Eric E) - Use foreach loops to reduce the number of unique generators - Add comment about why some generators have names and some are just added to a list v3: - Remove trailing whitespace Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-01-08 16:39:30 -08:00
Timothy Arceri	f04d2ca0d9	ac: rework emit_barrier() to not segfault on radeonsi nir_to_llvm_context will always be NULL for radeonsi so we need to work around this. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-09 10:21:32 +11:00
Timothy Arceri	19f3141e6a	ac: add load_tess_level() to the abi Fixes the following piglit tests in radeonsi: vs-tcs-tes-tessinner-tessouter-inputs-quads.shader_test vs-tcs-tes-tessinner-tessouter-inputs-tris.shader_test vs-tes-tessinner-tessouter-inputs-quads.shader_test vs-tes-tessinner-tessouter-inputs-tris.shader_test v2: make use of si_shader_io_get_unique_index_patch() via the helper in the previous patch rather than shader_io_get_unique_index() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-09 10:21:32 +11:00
Timothy Arceri	2bd7ab32cf	radeonsi: add load_tess_level() helper This will be shared by the tgsi and nir backends. v2: move si_shader_io_get_unique_index_patch() call inside the helper. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-09 10:21:32 +11:00
Jason Ekstrand	9e5aaa93cb	spirv: Do implicit conversions of uint to bool in OpStore Technically, the GLSLang bug related to this can also affect SSBO writes where the bool -> uint conversion is missing. However, the only known shipping application with an old enough version of GLSLang to cause issues with this is the new DOOM game so we keep the workaround as small as possible. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	154668e79c	spirv: Loosen the validation for load/store type matching Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104338 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424 Tested-by: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	986303cb92	spirv: Require a storage type for OpStore destinations This rules out things such as trying to store a pointer to a local variable. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	70f588778c	spirv: Add a vtn_types_compatible helper Tested-by: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	8bad7f33c6	spirv: Store the id of the type in vtn_type Previously, we were storing a pointer to the vtn_value because we use it to look up decorations when we create input/output variables. This works, but it also may be useful to have the id itself so we may as well store that instead. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	53265c8798	spirv: Add a mechanism for dumping failing shaders Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	819adfdfb4	spirv: Rework asserts in var_decoration_cb Now that higher levels are enforcing decoration sanity, we don't need the vtn_asserts here. This function should be safe but we still want a few well-placed regular asserts in case something goes awry. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	71ea4dded5	spirv: Rework error checking for decorations This reworks the error checking on our generic handling of decorations. The objective is to validate all of the SPIR-V assumptions we make up-front and convert redundant checks to compiled-out asserts. The most important part of this is to ensure that member decorations only occur on OpTypeStruct and that the member is never out-of-bounds. This way later code can assume that the member is sane and not have to worry about OOB array access due to a misplaced OpMemberDecorate. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	d6a4099303	spirv: Add better type validation to OpTypeImage Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	03c543d041	spirv: Switch on vtn_base_type in OpComposite(Extract\|Insert) This is a bit simpler since we have fewer enum values in the case. It's also a bit more efficient because we're making fewer glsl_get_* calls. While we're at it, add better type validation. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	936f49268e	spirv: Refactor Op[Spec]ConstantComposite and add better validation Now that vtn_base_type is a real and full base type, we can switch on that instead of the GLSL base type which is a lot fewer cases in our switch. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	dabce5061d	spirv: Add better validation to Op[Spec]Constant Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	6cf965751a	spirv: Remove a pointless assignment in SpvOpSpecConstant We re-assign later inside the bit_size switch Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	f13a5cff72	spirv: Unify boolean constants and add better validation Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	0bb18858fb	spirv/info: Add spirv_op_to_string Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	ab85fd02d5	spirv: Make 'info' a local array spirv_info_c.py Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Jason Ekstrand	296046556a	spirv: Add better error messages in vtn_value helpers Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-08 14:57:44 -08:00
Caio Marcelo de Oliveira Filho	22980f941e	spirv: Import 1.2 rev 3 headers and grammar from Khronos Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-08 13:22:17 -08:00
Samuel Pitoiset	08a5f4412a	radv: get InstanceID from VGPR1 (or VGPR2 for tess) instead of VGPR3 VGPR1 = InstanceID / StepRate0; // StepRate0 can be set to 1 Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-08 21:30:01 +01:00
Samuel Pitoiset	be16bbe1d3	radv: avoid PS partial flushes when viewports/scissors don't change For Vega10 and Raven that need a special workaround for the scissor bug. This seems to give a minor boost for Talos and Dota 2, at least. To reduce the cost of memcmp, the driver checks if it's really useful to do the comparison. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-08 21:24:58 +01:00
Samuel Pitoiset	b09b3f8834	radv: add has_scissor_bug for Vega10 and Raven Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-08 21:24:56 +01:00
Samuel Pitoiset	b462ceb482	radv/gfx9: do not load VGPR1 when GS uses points or lines VGPR1 is only needed for topology that needs 3 offsets like triangles or quads. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-08 21:24:53 +01:00
Samuel Pitoiset	a3c2a86757	radv: make shader BOs read-only for the GPU Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-08 21:24:51 +01:00
Samuel Pitoiset	6e3459eaf4	radv: make descriptor BOs read-only for the GPU Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-08 21:24:49 +01:00
Samuel Pitoiset	e4f2ad403f	radv: make the indirect GFX config BO read-only for the GPU Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-08 21:24:47 +01:00
Samuel Pitoiset	0e84fc2e2b	radv/winsys: make IBs read-only for the GPU Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-08 21:24:45 +01:00
Samuel Pitoiset	a3aaa03624	radv/winsys: add RADEON_FLAG_READ_ONLY Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-08 21:24:43 +01:00
Samuel Pitoiset	2dab5e96ec	radv/winsys: rework radv_amdgpu_bo_va_op() Needed for the following commit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-08 21:24:41 +01:00
Igor Gnatenko	23ce168048	link mesautil with pthreads ../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_setname': /builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:66: undefined reference to `pthread_setname_np' ../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `thrd_join': /builddir/build/BUILD/mesa-17.3.1/src/util/../../include/c11/threads_posix.h:336: undefined reference to `pthread_join' ../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_create': /builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:48: undefined reference to `pthread_sigmask' ../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `thrd_create': /builddir/build/BUILD/mesa-17.3.1/src/util/../../include/c11/threads_posix.h:296: undefined reference to `pthread_create' ../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_create': /builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:50: undefined reference to `pthread_sigmask' /builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:50: undefined reference to `pthread_sigmask' ../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `call_once': /builddir/build/BUILD/mesa-17.3.1/src/util/../../include/c11/threads_posix.h:96: undefined reference to `pthread_once' ../../src/util/.libs/libmesautil.a(libmesautil_la-u_queue.o): In function `u_thread_get_time_nano': /builddir/build/BUILD/mesa-17.3.1/src/util/../../src/util/u_thread.h:84: undefined reference to `pthread_getcpuclockid' collect2: error: ld returned 1 exit status Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Igor Gnatenko <ignatenko@redhat.com>	2018-01-08 11:40:02 -05:00
Alex Smith	0d8b9c529c	anv: Allow PMA optimization to be enabled in secondary command buffers This was never enabled in secondary buffers because hiz_enabled was never set to true for those. If the app provides a framebuffer in the inheritance info when beginning a secondary buffer, we can determine if HiZ is enabled and therefore allow the PMA optimization to be enabled within the command buffer. This improves performance by ~13% on an internal benchmark on Skylake. v2: Use anv_cmd_buffer_get_depth_stencil_view(). Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-08 09:31:17 +00:00
Florian Will	7e025def6d	glsl: Respect std430 layout in lower_buffer_access Respect the std430 rules for determining offset and size of struct members when using a std430 buffer. std140 rules lead to wrong buffer offsets in that case. Fixes my test case attached in Bugzilla. No piglit changes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104492 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-01-08 13:42:09 +11:00
Karol Herbst	efd2169c1a	nir: fix st_nir_assign_var_locations for patch variables Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-01-08 10:12:53 +11:00
Ilia Mirkin	6f4ac7b418	nvc0: enable bindless on kepler All the functionality is in. Maxwell will take a little bit more enablement work. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-01-07 11:40:35 -05:00
Ilia Mirkin	23a6e8d8ff	nvc0: add bindless image support for kepler A part of the driver constbuf area is allocated for bindless images. Any update requires uploading to all driver constbufs. This also extends the driver constbuf to 64KB, up from 2KB. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-01-07 11:40:35 -05:00
Ilia Mirkin	8eb1214755	nvc0: add support for bindless textures on kepler+ This keeps a list of resident textures (per context), and dumps that list into the active buffer list when submitting. We also treat bindless texture fetches slightly differently, wrt the meaning of indirect, and not requiring the SAMPLER file to be used. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-01-07 11:15:23 -05:00
Ilia Mirkin	7061333653	nv50/ir: use the image info in the instruction rather than decl In preparation for bindless images, we have to retrieve the target/format info from the instruction directly, as there will be no declaration. Furthermore, for bound images, this information is still available in the instruction, so we can drop the declaration-based mechanism entirely. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-01-07 11:15:23 -05:00
Ilia Mirkin	7f92c8ee37	nvc0/ir: safen up lowering logic against overwriting reused values I'm fairly sure both of the changed sites are OK as-is, but they're fragile, so this is just safening them up. Since this is happening pre-ssa, we don't want to be overwriting values that may potentially get used later on. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-01-07 11:15:23 -05:00
Ilia Mirkin	bdf300e09d	nvc0: update tic in-place when buffer address changes This is helpful for bindless, where changing TIC id's is undesirable. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-01-07 11:15:23 -05:00
Ilia Mirkin	adcd241b56	nvc0: ensure that pushbuf keeps ref to old text/tls bos If we free the bo, then the PTE may get deallocated immediately. We have to make sure that the submission includes a ref to the old bo so that it remains mapped for the duration of the command execution. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-07 11:14:51 -05:00
Kenneth Graunke	be144e251c	i965: Torch public intel_batchbuffer_emit_dword/float helpers. intel_batchbuffer_emit_float is dead code, it should go. intel_batchbuffer_emit_dword only had one user, which had bungled using them by forgetting to call intel_batchbuffer_require_space first. So it seems wise to delete these unsafe helpers. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-06 20:10:32 -08:00
Kenneth Graunke	1c9f1a28c0	i965: Require space for MI_BATCHBUFFER_END. intel_batchbuffer_emit_dword doesn't reserve space for the DWord it emits. In the past, we had some reserved batch space to ensure this worked. With the switch to growing batches, we need to actually request space so that we grow if necessary. Fixes: `2c46a67b41` (i965: Delete BATCH_RESERVED handling.) Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-06 20:10:32 -08:00
Kenneth Graunke	a693a6f51a	i965: Shut up a few unused variable warnings. If asserts are disabled, you get pointless warnings about devinfo being unused (it's only used to assert on devinfo->gen).	2018-01-06 17:34:54 -08:00
Marek Olšák	a140aeb619	ac: add ac_build_fmin/fmax helpers Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-06 09:51:43 +01:00
Marek Olšák	581507f10a	mesa: remove dd_function_table::GetCompressedTexSubImage and clean it up Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-06 09:51:43 +01:00
Neil Roberts	2971f688e6	mesa: Tidy up the 4.6 section of GL4x.xml The enums are moved to the top and indented like the rest of the file. Comments are added to split up the function aliases by corresponding extension. This should make no functional difference. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-06 09:28:45 +01:00
Samuel Pitoiset	87efa71001	radv: remove unused radv_color_buffer_info::cb_clear_valueX Found by inspection. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-05 17:26:51 +01:00
Alex Smith	12f4e00b69	anv: Take write mask into account in has_color_buffer_write_enabled If we have a color attachment, but its writes are masked, this would have still returned true. This is inconsistent with how HasWriteableRT in 3DSTATE_PS_BLEND is set, which does take the mask into account. This could lead to PixelShaderHasUAV not being set in 3DSTATE_PS_EXTRA if the fragment shader does use UAVs, meaning the fragment shader may not be invoked because HasWriteableRT is false. Specifically, this was seen to occur when the shader also enables early fragment tests: the fragment shader was not invoked despite passing depth/stencil. Fix by taking the color write mask into account in this function. This is consistent with how things are done on i965. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-01-05 15:36:22 +00:00
Neil Roberts	0bd1c4676d	mesa: Add GL4.6 aliases of functions from GL_ARB_indirect_parameters Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-05 11:41:10 +01:00
Samuel Pitoiset	ec63ab39be	radv: enable denorms for 64-bit and 16-bit floats Similar to RadeonSI. This fixes: dEQP-VK.image.texel_view_compatible.graphic.basic.attachment_read.bc*r16g16b16a16_sfloat dEQP-VK.image.extended_usage_bit.attachment_write.r16_sfloat Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-05 09:51:33 +01:00
Samuel Pitoiset	7643c71527	amd/common: correctly detect if we need ring buffers When allocate_user_sgprs() was called, ctx->stage was actually unset and 0 is for the vertex shader. This doesn't change anything for now because of the spill support thing. Though, the number of user SGPRs has to be fixed for merged shaders on GFX9. It was broken before anyway. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-05 09:49:51 +01:00
Samuel Pitoiset	50cfad0298	amd/common: use ac_image_load when lod is zero This might decrease VGPR spilling, because we no longer have to use v4i32 for 2D fetches when level == 0. We now use v2i32 for those cases. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 09:49:45 +01:00
Samuel Pitoiset	85769759bf	radv: limit the scissor bug workaround to Vega 10 and Raven Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-05 09:47:49 +01:00
Alejandro Piñeiro	2719467eb6	glsl/standalone: set MaxTransformFeedbackBuffers Using 4, as it is the default value on mesa. See mesa/main/config.h and the following commit that introduced the value: `15ac66e331` Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-05 08:52:22 +01:00
Alejandro Piñeiro	0ba3de2ad7	glsl/standalone: set MaxVertexStreams ARB_transform_feedback3 sets a minimum of 1, ARB_gpu_shader5 a minimum of 4. It shouldn't matter too much, so choosing the latter. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-05 08:52:22 +01:00
Alejandro Piñeiro	8dcf131f04	glsl/standalone: set MaxUniformBufferBindings Used to handle how many ubo you can define on the context. Minimum defined as 36 on ARB_uniform_buffer_object spec, up to 84 on OpenGL 4.6 (12 per stage at each moment). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-05 08:52:22 +01:00
Alejandro Piñeiro	937b210551	glsl/standalone: point which arguments are mandatory Every now and then I execute the standalone compiler, get the non-version error, and need to remember what I'm doing wrong Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2018-01-05 08:52:22 +01:00
Timothy Arceri	4a0c24f2dd	ac: rework ac_llvm_extract_elem() Simplifies the logic a little and asserts index is 0. Suggested-by: Nicolai Hähnle <nhaehnle@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 12:20:38 +11:00
Timothy Arceri	71f82dc9a3	st/glsl_to_nir/radeonsi: enable tessellation shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	9755eeb15a	gallium/tgsi: add patch support to tgsi_get_gl_varying_semantic() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	452586b56a	radeonsi: add dummy implementation of si_nir_scan_tess_ctrl() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	14adf7853a	ac/radeonsi: add load_tess_coord() to the abi Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	eb1e555cfd	radeonsi: make si_llvm_emit_tcs_epilogue compatible with emit_outputs abi Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	91f3c4ec1b	radeonsi/nir: gather tess properties Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	9e1a3caf32	ac/radeonsi: add tcs_rel_ids to the abi Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	9c2f877830	radeonsi: add unpack_llvm_param() helper This allows us to pass the llvm param directly rather than looking it up. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	f93740efc1	ac: add {tcs,tes}_patch_id to the abi Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	15c6f3fdd5	radeonsi: add nir support for tcs outputs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	b99ebaa4fd	ac: move some helpers to ac_llvm_build.c We will call these from the radeonsi NIR backend. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	2deb822075	ac: add store_tcs_outputs() to the abi Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	8be0135082	radeonsi: add si_nir_load_input_tcs() V2: drop type param and just use ctx->i32 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	234507b3cf	radeonsi: add get_dw_address_from_generic_indices() helper This will be used by both the tgsi and nir backends. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	b104e7e172	ac: call load_tcs_input() via the abi This also enables some code sharing with tes. V2: drop type param and just use ctx->i32 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	b09a3196e0	ac: add load_tes_inputs() to the abi V2: drop type param and just use ctx->i32 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Timothy Arceri	e04bf8a619	radeonsi: add si_nir_load_input_tes() V2: drop type param and just use ctx->i32 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Tim Rowley	396c006d90	swr/rast: fix invalid sign masks in avx512 simdlib code Should be 0x80000000 instead of 0x8000000. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-04 13:35:17 -06:00
Bas Nieuwenhuizen	76daa30e4a	radv: Use correct flush bits for flushing L2 during CB/DB flushes. Copied from radeonsi. Putting in the correct metadata flush commands for eventually not flushing L2 on CB/DB switch. Does not remove the need for V_028A90_CACHE_FLUSH_AND_INV_TS_EVENT at the moment. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-04 19:35:36 +01:00
Bas Nieuwenhuizen	f2c9f13ec2	radv: Invalidate L1 for VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT. These are just shader reads, so we need to invalidate L1. Fixes: `6dbb0eaccc` "radv: handle subpass cache flushes" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-04 19:35:36 +01:00
Samuel Pitoiset	2670ebb584	radv/gfx9: reduce the number of input VGPRs for the GS stage This can still be improved, but let's start with this. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-04 18:43:25 +01:00
Samuel Pitoiset	a4d2782664	amd/common: scan if gl_PrimitiveID is used before translating to LLVM It makes more sense to move all scan stuff in the same place. Also, we don't really need to duplicate the uses_primid field for each stage. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-04 18:43:09 +01:00
Samuel Pitoiset	3b2cb2f99a	amd/common: scan if gl_InvocationID is used Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-04 18:43:07 +01:00
Rob Herring	aa187fe7bf	egl/android: Fix build break with dri2_initialize_android _EGLDisplay parameter Commit `2f421651ac` ("egl: let each platform decided how to handle LIBGL_ALWAYS_SOFTWARE") broke the build due to copy-n-paste of a misnamed function parameter: src/egl/drivers/dri2/platform_android.c:1183:8: error: use of undeclared identifier 'disp' Rather than just fixing 'disp', rename the function parameter 'dpy' to 'disp' to align with the other EGL platforms' implementations. Fixes: `2f421651ac` ("egl: let each platform decided how to handle LIBGL_ALWAYS_SOFTWARE") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Rob Herring <robh@kernel.org>	2018-01-04 10:18:10 -06:00
Alex Smith	00a81e9909	anv: Add missing unlock in anv_scratch_pool_alloc Fixes hangs seen due to the lock not being released here. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-04 14:54:02 +00:00
Ilia Mirkin	e923075a82	mesa/bindless: fix missing image _Layer initialization Some later code relies on _Layer to set first/last_layer. Make sure it's always initialized. Detected by valgrind's conditional jump/move with uninit value logic. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-01-04 02:22:25 -05:00
Józef Kucia	f222cf3c6d	radeonsi: fix alpha-to-coverage if color writes are disabled If alpha-to-coverage is enabled, we have to compute alpha even if color writes are disabled. Signed-off-by: Józef Kucia <joseph.kucia@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-04 01:58:33 +01:00
Bas Nieuwenhuizen	79724c89f8	ac: rename has_sync_file to has_fence_to_handle. sync_files are in linux since 4.7, while the amdgpu fence_to_handle ioctl is only in 4.15. In particular we don't need it for sync_file in radv, because everything happens via syncobjs, which got support earlier than fence_to_handle. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-04 01:12:09 +01:00
Bas Nieuwenhuizen	c99426ea83	ac/nir: Handle loading data from compact arrays. Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-04 00:14:23 +01:00
Bas Nieuwenhuizen	1c78e4f053	radv: Allow writing 0 scissors. When rasterization is disabled we can have that few. Fixes: `76603aa90b` "radv: Drop the default viewport when 0 viewports are given." Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-04 00:14:19 +01:00
Bas Nieuwenhuizen	5158603182	radv: Use correct HTILE expanded words. Seems like users are actually hitting 0xFFFFFFFF, which breaks things for them, and the Mad Max regression is fixed, so let's put this in once more. v2: Use 0xf for depth-only htile. (Dave) Fixes: `af2844116f` "radv: Revert HTILE reset word to 0xFFFFFFFF." Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-01-04 00:14:03 +01:00
Marek Olšák	4f19cc82f9	ac: rename has_syncobj_wait -> has_syncobj_wait_for_submit Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-01-04 00:07:45 +01:00
Eric Anholt	81567d01c1	broadcom/vc5: Fix internal type/bpp for RGB10_A2UI images. I found that we were getting GPU hangs on most tests rendering to them, and the simulator was assertion failing.	2018-01-03 14:31:37 -08:00
Eric Anholt	a93fd7b41e	broadcom/vc5: Try to fix up compressed texture load/store. We were trying to load/store the logical width/height number of compressed blocks. As long as the textures were large, single-level, and the load/store at (0,0), it kind of worked.	2018-01-03 14:31:36 -08:00
Eric Anholt	44237b3f85	broadcom/vc5: Fix image_h value for CPU-side tiling on miplevels > 1. Fixes overflow that caused failure in dEQP-GLES3.functional.texture.filtering.2d.sizes.128x128_linear.	2018-01-03 14:31:36 -08:00
Eric Anholt	e60e3a56a2	broadcom/vc5: Fix discard_if during control flow. I want to do the SETMSF.IFA to discard only if execute == 0 and cond, so our dest of the PUSHZ needs to be nonzero if execute or !cond are nonzero. Fixes dEQP-GLES3.functional.shaders.discard.dynamic_loop_dynamic.	2018-01-03 14:31:36 -08:00
Eric Anholt	7836c85919	broadcom/vc5: Disable early Z when the stencil func isn't ALWAYS. Apparently the other funcs will have observable differences when early Z is enabled. Fixes (new) simulator assertion failures in dEQP-GLES3.functional.rasterizer_discard.basic.clear_depth.	2018-01-03 14:31:36 -08:00
Eric Anholt	635131a238	broadcom/vc5: Don't emit component 3/4 F16 TLB writes for float/vec2. Fixes a simulator assertion failure on dEQP-GLES3.functional.fragment_out.array.fixed.r8_highp_float.	2018-01-03 14:31:28 -08:00
Eric Anholt	deb552ca27	nir: Add a helper to get the uvec4 type. I needed this in the vc5 compiler. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-01-03 14:25:23 -08:00
Eric Anholt	39811a2894	broadcom/vc5: Introduce enums for internal depth/type, with V3D prefixes.	2018-01-03 14:25:23 -08:00
Eric Anholt	d3e8a4b96c	broadcom/xml: Fix up safe name confusion with prefixing. For enums we were doubling the underscore if the value had a numeric first character of its name (which safe_name() adds an underscore to). A little helper function cleans up the other instance of prefixing while also fixing this.	2018-01-03 14:25:23 -08:00
Eric Anholt	48cabc1e75	broadcom/vc5: Turn the decimate mode field into an enum in the XML.	2018-01-03 14:25:23 -08:00
Eric Anholt	17cb634b1c	broadcom/vc5: Turn the output image format into an enum.	2018-01-03 14:25:23 -08:00
Eric Anholt	883a9b02c9	broadcom/vc5: Turn the CLE XML's memory format into an enum.	2018-01-03 14:25:23 -08:00
Eric Anholt	8e5a0ed953	broadcom/vc5: Emit flat shade flags for varying components > 24. This means that with no flatshading we'll emit the single-byte ZERO_ALL_FLAT_SHADE_FLAGS, and otherwise emit a set of FLAT_SHADE_FLAGS to get all the bits we need set. There's a _SET enum in the packet we could use to possibly set entire ranges of the bitfield without using another packet, but this at least fixes the conformance failure.	2018-01-03 14:25:23 -08:00
Eric Anholt	2056e4a777	broadcom/vc5: Emit proper flatshading code for glShadeModel(GL_FLAT). In updating the simulator, behavior changed slightly so that our old code wasn't getting glxgears's flatshading interpolated right. Emit flat shading code just like we would for a normal flat-shaded varying, by passing a flag in the shader key for glShadeModel(GL_FLAT) state and customizing the color inputs based on that.	2018-01-03 14:25:23 -08:00
Eric Anholt	4764699552	broadcom/vc5: Rely on OVRTMUOUT always being set. It seems that the HW team has decided that it's the only supported mode, and it's the mode I actually meant to be using but forgot. Our table of return_32_bit should have matched the default non-OVRTMUOUT behavior, so this change should be invisible. However, the change revealed that some of my return_size checks for swizzling were a bit confused in the shadow case, so I had to move them to draw time once we have both the sampler and the view together. Fixes assertion failures in the updated simulator, where the non-OVRTMUOUT support has been removed.	2018-01-03 14:25:23 -08:00
Eric Anholt	ba965084b6	broadcom/vc5: Move texture return channel setup into the compiler. The compiler decides how many LDTMUs we're going to emit, and that must match the P1 flags. This brings the return channel counting to a single place (so all that's passed into the compiler is "how many return channels you may request from this texture's format"), and was a necessary step for shadow samplers once we stop using OVRTMUOUT=0.	2018-01-03 14:25:23 -08:00
Eric Anholt	ac4054ca17	broadcom/vc5: Switch to setting the primitive list format in the RCL. This means that we get a single copy of it emitted, instead of once at the start of each tile (though it's still executed once per tile). Fixes assertion failures with the updated simulator.	2018-01-03 14:25:23 -08:00
Eric Anholt	7d8b19f0dd	broadcom/vc5: Switch to using the C++ interface for the simulator. In newer versions they've removed the C interface, so make one here. This also isolates the Mesa codebase from the simulator codebase, so we don't have conflicts over things like "unreachable"	2018-01-03 14:25:23 -08:00
Mario Kleiner	190ac52827	mesa: Add GL_UNSIGNED_INT_2_10_10_10_REV OES read type for BGRX1010102. As Marek noted, the GL_RGBA + GL_UNSIGNED_INT_2_10_10_10_REV type combo is also good for readback of BGRX1010102 framebuffers, not only for BGRA1010102 framebuffers for use with glReadPixels() under GLES, so add it for the GL_IMPLEMENTATION_COLOR_READ_TYPE_OES query. Successfully tested on gallium r600 driver with a (quickly hacked for RGBA 10 10 10 0) dEQP testcase dEQP-EGL.functional.wide_color.window_1010102_colorspace_default. Suggested-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:57 +01:00
Mario Kleiner	2d8e1a6a27	st/dri: Add option to control exposure of 10 bpc color configs. Some clients may not like rgb10 fbconfigs and visuals. Support driconf option 'allow_rgb10_configs' on gallium to allow per application enable/disable. The option defaults to enabled. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:57 +01:00
Mario Kleiner	e5ff036c67	st/dri: Add support for BGR[A/X]1010102 formats. Exposes RGBA 10 10 10 2 and 10 10 10 0 visuals and fbconfigs for rendering. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:57 +01:00
Mario Kleiner	ef99597d95	st/dri: Support texture_from_pixmap for BGR[A/X]1010102 formats. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:57 +01:00
Mario Kleiner	893723a64b	st/dri2: Add buffer handling for BGR[A/X]1010102 formats. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:57 +01:00
Mario Kleiner	f52aa596d4	st/dri2: Add format translations for BGR[A/X]1010102 formats. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:57 +01:00
Mario Kleiner	6e665d07eb	st/mesa: Handle BGR[A/X]1010102 formats. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:57 +01:00
Mario Kleiner	3f867d1299	egl/wayland: Add Wayland shm swrast support for RGB10 winsys buffers. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:56 +01:00
Mario Kleiner	859ccd2096	egl/wayland: Add Wayland dmabuf support for RGB10 winsys buffers. (v2) Successfully tested under Weston 3.0. Photometer confirms 10 rgb bits from rendering to display. v2: Rebased onto master for dri2_teardown_wayland(). Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:56 +01:00
Mario Kleiner	84fd5151cd	egl/wayland: Add Wayland drm support for RGB10 winsys buffers. Successfully tested under Weston 3.0. Photometer confirms 10 rgb bits from rendering to display. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:56 +01:00
Mario Kleiner	82a2ede9aa	egl/x11: Handle depth 30 drawables for EGL_KHR_image_pixmap. Enables eglCreateImageKHR() with target set to EGL_NATIVE_PIXMAP_KHR to handle color depth 30 X11 drawables. Note that in theory the drawable depth 32 case in the current implementation is ambiguous: A depth 32 drawable could be of format ARGB8888 or ARGB2101010, therefore an assignment of __DRI_IMAGE_FORMAT_ARGB8888 for a pixmap of ARGB2101010 format would be wrong. In practice however, the X-Server (as of v1.19) does not provide any depth 32 visuals for ARGB2101010 EGL/GLX configs. Those are associated with depth 30 visuals without an alpha channel instead. Therefore the switch-case depth 32 branch is only executed for ARGB8888 pixmaps and we get away with this. Tested with KDE Plasma 5 under X11, DRI2 and DRI3/Present, selecting EGL + OpenGL compositing and different fbconfigs with/without 2 bit alpha channel. glxinfo confirms use of depth 30 visuals for ARGB2101010 only. Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:56 +01:00
Mario Kleiner	d0b320c941	egl/x11: Handle depth 30 drawables under software rasterizer. For fixing eglCreateWindowSurface() under swrast, as tested with LIBGL_ALWAYS_SOFTWARE=1. Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:56 +01:00
Mario Kleiner	be3d88e539	egl/x11: Match depth 30 RGB visuals to 32-bit RGBA EGLConfigs. Similar to the matching of 24 bit RGB visuals to 32-bit RGBA EGLConfigs, so X11 compositors won't alpha-blend any config with a destination alpha buffer during compositing. Additionally this fixes failure to select ARGB2101010 configs via eglChooseConfig() with EGL_ALPHA_BITS 2 on a depth 30 X-Screen. The X-Server doesn't provide any visuals of depth 32 for ARGB2101010 configs, it only provides depth 30 visuals. Therefore if we'd only match ARGB2101010 configs to depth 32 RGBA visuals, we would not ever get a visual for such a config. This was apparent in piglit tests for egl configs, which are fixed by this commit. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:56 +01:00
Mario Kleiner	01ce3f2cad	mesa: Add GL_RGBA + GL_UNSIGNED_INT_2_10_10_10_REV for OES read type. This format + type combo is good for BGRA1010102 framebuffers for use with glReadPixels() under GLES, so add it for the GL_IMPLEMENTATION_COLOR_READ_TYPE_OES query. Allows successful testing of 10 bpc / depth 30 rendering with dEQP test case dEQP-EGL.functional.wide_color.window_1010102_colorspace_default. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:56 +01:00
Mario Kleiner	cfb98bcdd0	i965/screen: Honor 'allow_rgb10_configs' option. (v2) Allows to prevent exposing RGB10 configs and visuals to clients. v2: Rename expose_rgb10_configs to allow_rgb10_configs, as suggested by Emil. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:56 +01:00
Mario Kleiner	67674ad0dc	dri/common: Add option to allow exposure of 10 bpc color configs. (v2) Some clients may not like RGB10X2 and RGB10A2 fbconfigs and visuals. Add a new driconf option 'allow_rgb10_configs' to allow per application enable/disable. The option defaults to enabled. v2: Rename expose_rgb10_configs to allow_rgb10_configs, as suggested by Emil. Add comment to option parsing, to make sure it stays before the ->InitScreen(). Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:56 +01:00
Mario Kleiner	9e63cbacb6	i965/screen: Add basic support for rendering 10 bpc/depth 30 framebuffers. (v3) Expose formats which are supported at least back to Gen 5 Ironlake, possibly further. Allow creation of 10 bpc winsys buffers for drawables. glxinfo now lists new RGBA 10 10 10 2/0 formats. v2: Move the BGRA/BGRX1010102 formats before the RGBA/RGBX8888 32 bit formats, as the code comments require. Thanks Emil! Update num_formats from 3 to 5, to keep the special Android handling intact. v3: Use num_formats = ARRAY_SIZE(formats) - 2 as suggested by Tapani, to only exclude the last 2 Android formats, add Tapani's r-b. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:56 +01:00
Mario Kleiner	87faa8b1e8	i965/screen: Add XRGB2101010 and ARGB2101010 support for DRI3. Allow DRI3/Present buffer sharing for 10 bpc buffers. Otherwise composited desktops under DRI3 will only display black client areas for redirected windows. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:56 +01:00
Mario Kleiner	304f363a96	loader/dri3: Add XRGB2101010 and ARGB2101010 support. To allow DRI3/Present buffer sharing for 10 bpc buffers. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:56 +01:00
Mario Kleiner	f3878aa622	dri: Add 10 bpc formats as available formats. (v2) Used to support ARGB2101010 and XRGB2101010 winsys framebuffers / drawables, but added other 10 bpc fourcc's as well for consistency with definitions in wayland_drm.h, gbm.h, and drm_fourcc.h. v2: Align new defines with tabs instead of spaces, for consistency with remainder of that block of definitions, as suggested by Tapani. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:56 +01:00
Mario Kleiner	26c4d804ff	i965: Support accelerated blit for depth 30 formats. (v2) Extend intel_miptree_blit() to handle at least ARGB2101010 -> XRGB2101010, ARGB2101010 -> ARGB2101010, and XRGB2101010 -> XRGB2101010 via the BLT engine, but not XRGB2101010 -> ARGB2101010 yet. This works as tested under Compiz, KDE-5, Gnome-Shell. v2: Restrict BLT fast path to exclude XRGB2101010 -> ARGB2101010, as intel_miptree_set_alpha_to_one() isn't ready to set 2 bit alpha channels to 1.0 yet. However, couldn't find a test case where this specific blit would be needed, so maybe not much of a point to improve here. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:56 +01:00
Mario Kleiner	6945f313c4	i965: Support xrgb/argb2101010 formats for glx_texture_from_pixmap. Makes compositing under X11/GLX work. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-03 22:57:55 +01:00
Tim Rowley	ad218754c7	swr/rast: fix MemoryBuffer build break for llvm-6 LLVM api change. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104381 Tested-by: Laurent Carlier <lordheavym@gmail.com> Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2018-01-03 11:42:00 -06:00
Rob Herring	28234c5bf8	Android: util: fix locale generation in options.h The parameters to gen_xmlpool.py are wrong and cause the following warnings: Warning: language 'out/target/product/linaro_x86_64/gen/STATIC_LIBRARIES/libmesa_util_intermediates/xmlpool/es/LC_MESSAGES/options.mo' not found. Warning: language 'out/target/product/linaro_x86_64/gen/STATIC_LIBRARIES/libmesa_util_intermediates/xmlpool/nl/LC_MESSAGES/options.mo' not found. Warning: language 'out/target/product/linaro_x86_64/gen/STATIC_LIBRARIES/libmesa_util_intermediates/xmlpool/fr/LC_MESSAGES/options.mo' not found. Warning: language 'out/target/product/linaro_x86_64/gen/STATIC_LIBRARIES/libmesa_util_intermediates/xmlpool/sv/LC_MESSAGES/options.mo' not found. Warning: language 'external/mesa3d/src/util/xmlpool/t_options.h' not found. Warning: language 'out/target/product/linaro_x86_64/gen/STATIC_LIBRARIES/libmesa_util_intermediates/xmlpool' not found. Warning: language 'de' not found. Warning: language 'es' not found. Warning: language 'nl' not found. Warning: language 'fr' not found. Warning: language 'sv' not found. The result is English is the only language in options.h. Use "$<" instead of "$^" because we only need the first dependency (the script), not all dependencies. Signed-off-by: Rob Herring <robh@kernel.org>	2018-01-03 09:49:08 -06:00
Kenneth Graunke	74e1d6e20c	i965: Drop support for the legacy SNORM -> Float equation. Older OpenGL defines two equations for converting from signed-normalized to floating point data. These are: f = (2c + 1)/(2^b - 1) (equation 2.2) f = max{c/(2^(b-1) - 1), -1.0} (equation 2.3) Both OpenGL 4.2+ and OpenGL ES 3.0+ mandate that equation 2.3 is to be used in all scenarios, and remove equation 2.2. DirectX uses equation 2.3 as well. Intel hardware only supports equation 2.3, so Gen7.5+ systems that use the vertex fetcher hardware to do the conversions always get formula 2.3. This can make a big difference for 10-10-10-2 formats - the 2-bit value can represent 0 with equation 2.3, and cannot with equation 2.2. Ivybridge and older were using equation 2.2 for OpenGL, and 2.3 for ES. Now that Ivybridge supports OpenGL 4.2, this is wrong - we need to use the new rules, at least in core profile. That would leave Gen4-6 doing something different than all other hardware, which seems...lame. With context version promotion, applications that requested a pre-4.2 context may get promoted to 4.2, and thus get the new rules. Zero cases have been reported of this being a problem. However, we've received a report that following the old rules breaks expectations. SuperTuxKart apparently renders the cars red when following equation 2.2, and works correctly when following equation 2.3: https://github.com/supertuxkart/stk-code/issues/2885#issuecomment-353858405 So, this patch deletes the legacy equation 2.2 support entirely, making all hardware and APIs consistently use the new equation 2.3 rules. If we ever find an application that truly requires the old formula, then we'd likely want that application to work on modern hardware, too. We'd likely restore this support as a driconf option. Until then, drop it. This commit will regress Piglit's draw-vertices-2101010 test on pre-Haswell without the corresponding Piglit patch to accept either formula (commit 35daaa1695ea01eb85bc02f9be9b6ebd1a7113a1): draw-vertices-2101010: Accept either SNORM conversion formula. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisforbes@google.com>	2018-01-02 16:51:42 -08:00
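As a rough illustration of the two SNORM conversion rules quoted in the commit above, the following Python sketch evaluates both for every 2-bit signed value. The function names are hypothetical and this is not the driver's code; it only restates equations 2.2 and 2.3 to show why a 2-bit component can encode 0 under equation 2.3 but not under equation 2.2.

```python
def snorm_to_float_eq22(c: int, bits: int) -> float:
    """Legacy rule (equation 2.2): f = (2c + 1) / (2^b - 1)."""
    return (2 * c + 1) / (2 ** bits - 1)


def snorm_to_float_eq23(c: int, bits: int) -> float:
    """Current rule (equation 2.3): f = max(c / (2^(b-1) - 1), -1.0)."""
    return max(c / (2 ** (bits - 1) - 1), -1.0)


# All values representable by a 2-bit two's-complement component.
two_bit_values = range(-2, 2)

legacy = [snorm_to_float_eq22(c, 2) for c in two_bit_values]
current = [snorm_to_float_eq23(c, 2) for c in two_bit_values]

for c, old, new in zip(two_bit_values, legacy, current):
    print(f"c={c:+d}  eq 2.2 -> {old:+.4f}  eq 2.3 -> {new:+.4f}")
```

Under equation 2.3 the two most negative codes both clamp to -1.0, which leaves one code free to represent an exact 0; equation 2.2 spreads the four codes symmetrically around 0 and never lands on it.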
Ian Romanick	bd32d4d067	meta: Don't pollute the texture namespace tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-02 16:23:52 -08:00
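The problem scenario in the commit above can be sketched with a toy name table. Everything here (the `TextureNamespace` class, its `gen` and `upload` methods, the string payloads) is hypothetical and only models the name collision; it is not Mesa's implementation.

```python
class TextureNamespace:
    """Toy model of a shared texture name table (hypothetical, not Mesa code)."""

    def __init__(self):
        self.objects = {}

    def gen(self):
        """Return the lowest unused name, as a Gen-style call might."""
        name = 1
        while name in self.objects:
            name += 1
        self.objects[name] = ""
        return name

    def upload(self, name, data):
        """Bind a name (creating it on first use) and replace its data."""
        self.objects[name] = data


ns = TextureNamespace()

# 1. An internal helper generates a name in the application's namespace.
meta_name = ns.gen()
ns.upload(meta_name, "meta scratch")

# 2. The application uses name 1 directly, without calling Gen.
ns.upload(1, "application texture")

# 3. The helper runs again and overwrites the application's data.
ns.upload(meta_name, "meta scratch (second call)")
clobbered = ns.objects[1]

# Sketch of the fix: the helper keeps a private object with no public name.
ns_fixed = TextureNamespace()
meta_private = {"data": "meta scratch"}  # never entered into ns_fixed
ns_fixed.upload(1, "application texture")
meta_private["data"] = "meta scratch (second call)"
preserved = ns_fixed.objects[1]
```

In the first run the application's texture is silently replaced; in the second, the helper's object lives outside the name table, so the application's name 1 is untouched.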
Ian Romanick	5325a34ed7	meta: Use _mesa_bind_texture instead of _mesa_BindTexture Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-02 16:23:52 -08:00
Ian Romanick	e0ad314568	meta: Use _mesa_CreateTextures instead of _mesa_GenTextures Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-02 16:23:52 -08:00
Ian Romanick	173e3045a9	meta: Track temporary textures using gl_texture_object instead of GL API object handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-02 16:23:51 -08:00
Ian Romanick	c36e3d3016	meta/blit: Track temporary texture using gl_texture_object instead of GL API object handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-02 16:23:51 -08:00
Ian Romanick	05f4be9641	meta/blit: Use _mesa_bind_texture instead of _mesa_BindTexture Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-02 16:23:51 -08:00
Ian Romanick	d17e6bc48e	meta/blit: Don't bind texture in _mesa_meta_bind_rb_as_tex_image All of the callers of _mesa_meta_bind_rb_as_tex_image call _mesa_meta_setup_sampler shortly after. _mesa_meta_setup_sampler also binds the texture. This is necessary because not all paths that lead to _mesa_meta_setup_sampler come through _mesa_meta_bind_rb_as_tex_image. Rename the function _mesa_meta_texture_object_from_renderbuffer to reflect its true purpose. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-02 16:23:51 -08:00
Ian Romanick	7609d54e4a	meta/blit: Track source texture using gl_texture_object instead of GL API object handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-02 16:23:51 -08:00
Ian Romanick	29a948e06d	meta/blit: Since _mesa_meta_bind_rb_as_tex_image has only one output, return it Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-02 16:23:51 -08:00
Ian Romanick	44e153616d	meta/blit: Don't return the texture handle from _mesa_meta_bind_rb_as_tex_image It's always the same as *texObj->Name. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-02 16:23:51 -08:00
Ian Romanick	922ee3b493	meta/blit: Don't return the target from _mesa_meta_bind_rb_as_tex_image It's always the same as *texObj->Target. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-02 16:23:51 -08:00
Ian Romanick	9de64d0baa	meta/blit: Don't restore state of the temporary texture It's about to be destroyed, so there's no point. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-02 16:23:51 -08:00
Ian Romanick	a232df1523	meta/blit: Check the values instead of the target before restoring Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-02 16:23:50 -08:00
Ian Romanick	594d02892e	mesa: Add _mesa_bind_texture method Light-weight glBindTexture for internal use. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-02 16:23:50 -08:00
Ian Romanick	e6cef4b081	Revert "mesa: remove unused _mesa_delete_nameless_texture()" Changes in this series use this function. This reverts commit `048de9e34a`. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: Timothy Arceri <tarceri@itsqueeze.com>	2018-01-02 16:23:50 -08:00
Ian Romanick	d80be51775	mesa: Fold _mesa_record_error into its only caller Also, the comment on _mesa_record_error was wrong. dd_function_table::Error was not called because that function does not exist. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2018-01-02 16:23:50 -08:00
Lucas Stach	0158565924	etnaviv: disable in-place resolve for non-supertiled surfaces The in-place resolve probably has some additional restrictions when not operating on a super tiled surface. Disable it on non-supertiled surfaces for now to work around a GPU hang. Fixes: `78ade65956` ("etnaviv: Do GC3000 resolve-in-place when possible") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2018-01-01 22:48:06 +01:00
Bas Nieuwenhuizen	6a36bfc64d	radv: Implement binning on GFX9. Overall it does not really help or hurt. The deferred demo gets 1% improvement and some games a 3% decrease, so I don't think this should be enabled by default. But with the code upstream it is easier to experiment with it. v2: Remove initializing the registers from si_emit_config. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-31 15:07:07 +01:00
Bas Nieuwenhuizen	b0d17270ad	radv: Add flag for enabling binning. Letting it be disabled by default. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-31 13:47:51 +01:00
Kenneth Graunke	a1afef8de0	i965: Combine {VS,FS}_OPCODE_GET_BUFFER_SIZE opcodes. These are the same, we don't need a separate opcode enum per backend. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-30 20:30:34 -08:00
Rob Clark	ea0bbe8201	nir: add missing local_group_size intrinsic For GL_ARB_compute_variable_group_size Reported-by: Karol Herbst <karolherbst@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-30 12:39:07 -05:00
Rhys Kidd	60c2d09483	nv50/ir: Fix unused var warnings in release build v2: Add preventative comment (Ilia Mirkin) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>	2017-12-29 23:04:42 -05:00
Rhys Kidd	634ca4c2c3	nvc0: Fix unused var warnings in release build Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>	2017-12-29 23:04:42 -05:00
Rhys Kidd	540d829d38	nv50: Fix unused var warning in release build Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>	2017-12-29 23:04:42 -05:00
Roland Scheidegger	878bc4a5ae	r600: fix textureSize queries with tbos piglit doesn't care, but I'm quite confident that the size actually bound as range should be reported and not the base size of the resource (and some quick piglit test hacking confirms this). Also, the array in the constant buffer looks overallocated by a factor of 4. For eg, also decrease the size by another factor of 2 by using the same constant slot for both buffer size (required for txq for TBOs) and the number of layers for cube arrays, as these are mutually exclusive. Could of course use some more logic and only actually do this for the samplers/images/buffers where it's required rather than for all, but ah well... Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-30 03:30:48 +01:00
Roland Scheidegger	eafaf13686	r600: kill off native_integer shader ctx flag Maybe once upon a time it wasn't always true. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-30 03:30:48 +01:00
Bas Nieuwenhuizen	b0a6fd0274	radv: Also set DCC params for sampling for input attachment usage. Those are implemented as texture sampling, so we need to make the texture TC-compatible too. Fixes: `34d23e82ca` "radv: set some dcc parameters depending on if texture will be sampled" Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2017-12-29 23:42:30 +01:00
Bas Nieuwenhuizen	ab957243e1	radv: Enable DCC with transfers. Before this DCC was in practice disabled for most games. This enables practical DCC use. Expect a 5-10% perf increase on a bunch of games on vega @ 4k. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-29 12:22:02 +01:00
Bas Nieuwenhuizen	eb9a4c3464	radv: Decompress copy destination if formats are incompatible. If both source and destination are DCC compressed, and their formats are not compatible, we need to decompress one of them to make sure we can do reinterpretation (which needs src format == dst format). Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-29 12:21:58 +01:00
Bas Nieuwenhuizen	44fcf58744	radv: Disable DCC for GENERAL layout and compute transfer dest. Apps can use this for render feedback loops, where things are defined if they render each pixel only once. However, DCC fails here, as the level of coherence is a block not a pixel, so disable it. This is also going to help implementing other stuff. Even if we optimize this later to only happen if there actually is a loop (if possible at all ...), then the machinery is still useful to exclude images accessible by the SDMA queue when that is implemented. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-29 12:21:53 +01:00
Bas Nieuwenhuizen	95f50f7f6c	radv: Don't init DCC metadata during FS resolve. It should already be valid there + the RB will update it during rendering. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-29 12:21:49 +01:00
Bas Nieuwenhuizen	1cfab28e6e	radv: Make color meta operations layout aware. For fast clear eliminate and decompressions, we always use the most compressed format. For clears, the code already creates a renderpass on demand with the exact same layout as specified. Otherwise we start distinguishing between GENERAL and TRANSFER_DST_OPTIMAL. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-29 12:21:44 +01:00
Bas Nieuwenhuizen	3e2a6191c9	radv: Add compute DCC decompress. We do an in place copy where we read compressed and write decompressed. By doing this in sizes that cover entire DCC blocks and waiting for all reads in the block before starting to write we avoid corruption. In the end we clear the DCC metadata to 0xffffffff. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-29 12:21:40 +01:00
Bas Nieuwenhuizen	8abaa3aeaa	radv: Use the meta fast clear destructor on construction failure. Simplifies failure paths. The caller already calls radv_device_finish_meta_fast_clear_flush_state on failure. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-29 12:21:35 +01:00
Bas Nieuwenhuizen	e5feeec140	radv: Add GFX DCC decompress. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-29 12:21:31 +01:00
Bas Nieuwenhuizen	fc80f52536	radv: Don't enable DCC / TC compat HTILE for storage images. We don't get a layout when binding to a descriptor set, but can assume that the LAYOUT is GENERAL. For DCC, stores with the DCC bits set will result in a hang, so better be safe than sorry. Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-29 12:21:15 +01:00
Bas Nieuwenhuizen	516a80b579	Revert "radv/gfx9: fix block compression texture views." This reverts commit `5951578043`. The mentioned commit causes a hang in DoW3 on Vega. Fixes: `5951578043` "radv/gfx9: fix block compression texture views." Acked-by: Dave Airlie <airlied@redhat.com>	2017-12-29 11:21:43 +01:00
Brian Paul	23f37e98a1	svga: update SVGA_NEW_ flags for updating sampler state The SVGA_NEW_FS flag is needed since we now examine the fragment shader's fs_shadow_compare_units flags. The SVGA_NEW_TEXTURE_FLAGS flag is not needed since it's only for pre-VGPU10. No piglit changes. This doesn't fix any known issues but it could pop up somewhere. Suggested by Charmaine. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-12-28 22:09:29 -07:00
Brian Paul	f50924cb5b	svga: whitespace, formatting fixes in svga_state_tss.c	2017-12-28 22:09:29 -07:00
Dave Airlie	a4c23ce1b6	radv/gfx9: use correct swizzle parameter to work out border swizzle. This should fix: dEQP-VK.pipeline.sampler.view_type.*.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black and a few others in that area. Fixes: `b11c4a5546` (radv: add texture descriptor/fmask/cmask support for GFX9) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-29 12:09:13 +10:00
Dave Airlie	868377ab33	radv/gfx9: use a bigger hammer to flush cb/db caches. amdvlk is probably more subtle than this but it never uses the inv cb/db variants, we fail some CTS tests without this. Fixes: dEQP-VK.renderpass.dedicated_allocation.formats.d32_sfloat_s8_uint.input*. Fixes: `c2fbeb7ca0` (radv: add GFX9 cache flushing support.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (for now :-) Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-29 11:43:30 +10:00
Dave Airlie	5951578043	radv/gfx9: fix block compression texture views. This ports a fix from amdvlk, to fix the sizing for mip levels when block compressed images are viewed using uncompressed views. Fixes: dEQP-VK.image.texel_view_compatible.graphic.extendedbc Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-29 11:42:47 +10:00
Dave Airlie	420627e6e7	radv/gfx9: fix buffer to image for 3d images on compute queues This fixes some of the broken: dEQP-VK.synchronization.op.multi_queue.64x64x8 tests. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-29 09:37:09 +10:00
Dave Airlie	09612a62e1	radv/gfx9: fix 3d image clears on compute queues This fixes some of the broken: dEQP-VK.synchronization.op.multi_queue.64x64x8 tests. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-29 09:37:05 +10:00
Dave Airlie	d08f267814	radv/gfx9: fix 3d image to image transfers on compute queues. This fixes some of the broken: dEQP-VK.synchronization.op.multi_queue.64x64x8 tests. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-29 09:37:00 +10:00
Jason Ekstrand	967d238c69	anv/device: Mark all state buffers as needing capture Previously, we were flagging the instruction state buffer for capture but not surface state or dynamic state. We want those captured too. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-28 10:39:04 -08:00
Jason Ekstrand	69fa3fb77f	intel/aubinator: Gracefully handle dynamic state not being available Some older versions of the Vulkan driver didn't properly tag dynamic state as needing to be captured. Also, this prevents crashes when looking at dumps on older kernels. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-28 10:39:04 -08:00
Jason Ekstrand	a92d52c3c1	intel/aubinator: Free section data last We were walking the sections, printing the batches, and then freeing them in one pass. If the batch happens to reference any earlier sections (which it almost certainly will since it's at the end), we will access freed memory. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-28 10:39:04 -08:00
Eero Tamminen	5c5f2eaa08	spirv: consider bitsize when handling OpSwitch cases This reverts commit `7665383a33` and is squashed together with https://patchwork.freedesktop.org/patch/194610/ (spirv: avoid infinite loop / freeze in vtn_cfg_walk_blocks()) which fixes https://bugs.freedesktop.org/show_bug.cgi?id=104359 properly. Fixes: `9702fac68e` (spirv: consider bitsize when handling OpSwitch cases) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104359 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-28 10:38:58 -08:00
Brian Paul	e0eaeef3e7	svga: check for null fs pointer in update_samplers() This can happen when there's no active fragment shader, such as when using transform feedback. This wasn't hit by any Piglit test but is hit by Daniel Rákos' Nature demo. VMware bug 2026189. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-12-28 09:49:31 -07:00
Brian Paul	be5153fbee	st/mesa: increase size of glsl_base_type bitfields Change `59f458cd87` added more enums to glsl_base_type. We have to bump up the size of the bitfields for fields of this type for MSVC. Also, add another assertion to catch another place where this enum bitfield is used. Reviewed-by: Neha Bhende <bhenden@vmware.com>	2017-12-28 08:30:56 -07:00
Dave Airlie	ec1edd0fd2	radv: fix pipeline statistics end query on compute queue It's legal to use a pipeline stat query on a compute queue, but we'd emit the wrong packet here. This should fix it to emit the correct packet. Noticed while inspecting the mpv hang. Fixes: `ad61eac250` (radv: factor out eop event writing code. (v2)) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-28 19:31:01 +10:00
Dave Airlie	38e4467e99	radv: fix events on compute queues. The event emission wasn't sending the correct packet for gfx8 compute queues, which explains why it works on vega fine. This fixes the mpv vulkan hang. Fixes: `ad61eac250` (radv: factor out eop event writing code. (v2)) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-28 19:30:32 +10:00
Dave Airlie	ff75d3a9aa	radv: move local bos usage to a perftest flag. These seem mildly unstable on vega, crashing CTS in various fun ways, and look like they leak memory. Disable for now, but leave the option to enable them. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-28 19:30:16 +10:00
Dave Airlie	78a8b73e7d	vulkan/wsi: free cmd pools We destroy the pools but don't free the container. This fixes: dEQP-VK.wsi.xlib.swapchain.simulate_oom* Fixes: `d50937f137` (vulkan/wsi: Implement prime in a completely generic way) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-28 09:57:33 +10:00
Bas Nieuwenhuizen	a636208ace	radv: Always use fragment resolve if dest uses DCC. HW resolve does not support it either. Fixes: `2a04f5481d` "radv/meta: select resolve paths" Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-28 00:30:47 +01:00
Bas Nieuwenhuizen	da192b50b2	radv: Use correct framebuffer size for partial FS resolves. Framebuffer is from 0,0, not (dst.x, dst.y). Fixes: `69136f4e63` "radv/meta: add resolve pass using fragment/vertex shaders" Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-28 00:30:47 +01:00
Bas Nieuwenhuizen	73279da41d	radv: Fix fragment resolve destination offset. The position starts at (dst.x, dst.y), so if we want the source to start at (src.x, src.y), we have to offset by (src.x-dst.x,src.y-dst.y). Haven't tested that this fixed anything yet, but found by inspection. Fixes: `69136f4e63` "radv/meta: add resolve pass using fragment/vertex shaders" Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-28 00:26:07 +01:00
Bas Nieuwenhuizen	258ebe79a0	radv: Don't handle DCC in compute resolve. If the destination has DCC, we will use the FS resolve. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-28 00:26:07 +01:00
Bas Nieuwenhuizen	cebc9a119d	radv: Flush caches before subpass resolve. Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-28 00:26:07 +01:00
Bas Nieuwenhuizen	c39947ce30	radv: Invert condition for all samples identical during resolve. The samples_identical instruction returns 0 if they are different, so we have to do the extra work if the result is 0, not if it is != 0. Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-28 00:26:07 +01:00
Eric Engestrom	e5a7ef0013	egl: don't try the software path twice Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reported-by: Brendan King <Brendan.King@imgtec.com>	2017-12-27 22:31:56 +00:00
Eric Engestrom	81cea66ff1	egl: rename LIBGL_ALWAYS_SOFTWARE variable from UseFallback to ForceSoftware Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-27 22:31:50 +00:00
Eric Engestrom	2f421651ac	egl: let each platform decide how to handle LIBGL_ALWAYS_SOFTWARE My refactor in `47273d7312` missed this early return; because of it, setting UseFallback one layer above actually prevented the software path from being used. Remove this early return and let each platform's dri2_initialize_*() decide what it can do with the LIBGL_ALWAYS_SOFTWARE restriction. platform_{surfaceless,x11,wayland} were already handling it themselves. Fixes: `47273d7312` "egl: set UseFallback if LIBGL_ALWAYS_SOFTWARE is set" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reported-by: Brendan King <Brendan.King@imgtec.com>	2017-12-27 22:31:38 +00:00
Brendan King	e491bffc5c	egl: link libEGL against the dynamic version of libglapi Note: the following happens only when using slibtool. Since this is a very serious breakage, we will keep the workaround until a better solution is available. DRI modules store the address of the dispatch table in a TLS variable, _glapi_tls_Dispatch. Changes to the way libEGL is built in `d884d8d007` resulted in it being statically linked against libglapi, and thus containing its own copy of _glapi_tls_Dispatch. The result was that some applications would fail to work (e.g. deqp-egl, which dynamically loads libEGL), due to the DRI module storing the dispatch table address in one copy of _glapi_tls_Dispatch, and libEGL obtaining the address from another copy of the variable. Fixes: `d884d8d007` "egl/dri: link directly to libglapi.so" Signed-off-by: Brendan King <Brendan.King@imgtec.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-27 22:29:20 +00:00
Dave Airlie	d2acf97e49	radv: don't do format replacement on tc compat htile surfaces. For copies the texture unit needs to know the depth format so it can read the htile data properly. This fixes: dEQP-VK.renderpass.suballocation.formats.d32_sfloat_s8_uint.load.clear Fixes: `ad3d98da9f` (radv: enable tc compatible htile for d32s8 also.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-28 05:24:52 +10:00
Dave Airlie	5ba26ed6e5	radv/gfx9: use correct stencil format for tc compat htile. This needs to correspond to the bit depth of the Z plane. Noticed in passing while reading amdvlk. Fixes: `fc6c77e162` (radv: fix TC-compat HTILE with VK_FORMAT_D32_SFLOAT_S8_UINT on Vega) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-28 05:23:49 +10:00
Brian Paul	3e59e442c3	svga: move variant->fs_shadow_compare_units assignment Fixes a crash since the variant object isn't allocated until later in the function. Not sure how this got through. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-12-27 11:34:51 -07:00
Samuel Pitoiset	3260a96c17	amd/common: rework set_userdata_location() and rename to set_loc() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:17 +01:00
Samuel Pitoiset	4221a816e2	amd/common: rename set_userdata_location_shader() to set_loc_shader() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:15 +01:00
Samuel Pitoiset	5081fd398e	amd/common: replace set_userdata_location_indirect() by set_loc_desc() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:13 +01:00
Samuel Pitoiset	f8202ef683	amd/common: rename radv_define_vs_user_sgprs_phase2() ... to set_vs_specific_input_locs(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:11 +01:00
Samuel Pitoiset	9d5a1787ee	amd/common: rename radv_define_common_user_sgprs_phase2() ... to set_global_input_locs(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:08 +01:00
Samuel Pitoiset	9a2393a510	amd/common: rename add_user_sgpr_array_argument() to add_array_arg() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:06 +01:00
Samuel Pitoiset	b6217bdbee	amd/common: replace add_sgpr_argument() by add_arg() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:04 +01:00
Samuel Pitoiset	32bbc9eb0f	amd/common: replace add_user_sgpr_argument() by add_arg() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:25:02 +01:00
Samuel Pitoiset	e946b5360d	amd/common: replace add_vgpr_argument() by add_arg() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:59 +01:00
Samuel Pitoiset	f1242a8976	amd/common: add new add_arg() helper for SGPRs/VGPRs arguments The idea is to clean up the add arguments logic. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:57 +01:00
Samuel Pitoiset	bedfa06eaf	amd/common: rename radv_define_common_user_sgprs_phase1() ... to declare_global_input_sgprs(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:55 +01:00
Samuel Pitoiset	0f58f67abe	amd/common: rename radv_define_vs_user_sgprs_phase1() ... to declare_vs_specific_inputs_sgprs(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:53 +01:00
Samuel Pitoiset	5c91c1614c	amd/common: do not try to declare input VS SGPRs for GS It's a no-op anyway but it looked strange to me, remove it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:51 +01:00
Samuel Pitoiset	fc35a071b6	amd/common: add declare_vs_input_vgprs() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:49 +01:00
Samuel Pitoiset	3015668cad	amd/common: add declare_tes_input_vgprs() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:47 +01:00
Samuel Pitoiset	62942aa8c6	amd/common: remove unnecessary num_user_sgprs_used Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:46 +01:00
Samuel Pitoiset	6edf1fcdf5	amd/common: remove unnecessary user_sgpr_count Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-27 10:24:44 +01:00
Samuel Pitoiset	0f8290dd32	radeonsi: make use of ac_init_exec_full_mask() Similar to si_init_exec_full_mask(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-27 09:57:10 +01:00
Brian Paul	2250c967e7	svga: use tgsi_util_get_shadow_ref_src_index() in a couple of places No piglit changes. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-12-26 21:44:42 -07:00
Brian Paul	ad26999d33	tgsi: improve comment on tgsi_util_get_shadow_ref_src_index() Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-12-26 21:44:33 -07:00
Brian Paul	8571506e51	svga: fix TGSI_TEXTURE_SHADOW1D coordinate selection Fixes about 24 Piglit tex-miplevel-selection tests. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-12-26 21:44:22 -07:00
Brian Paul	1e0b64ced9	svga: fix shadow comparison failures In some cases, we do shadow comparisons in the fragment shader instead of with texture sampler state. But when we do so, we must disable the shadow comparison test in the sampler state. As it was, we were doing the comparison twice, which resulted in nonsense. Also, we had the texcoord and texel value swapped in the comparison instruction. Fixes about 38 Piglit tex-miplevel-selection tests. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-12-26 21:43:37 -07:00
Brian Paul	ac78ab951a	util: add trivial comment on u_upload_create()	2017-12-26 21:40:49 -07:00
Dave Airlie	88d09b642d	r600: fix atomic counter index mode getting emitted on pre-cayman This is a regression since I added cayman atomic support, not sure it fixes anything, but the shader dumps look better. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-27 02:02:46 +00:00
Dave Airlie	34d23e82ca	radv: set some dcc parameters depending on if texture will be sampled This is ported from amdvlk which sets the independent 64b blocks only for image which will sample dcc. I'm not sure how to port this to radeonsi. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-27 11:10:52 +10:00
Dave Airlie	db27907d78	radv/radeonsi: set dcc min uncompressed properly for APUs. This is ported from amdvlk. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-27 11:10:50 +10:00
Dave Airlie	cf363e4405	amd/common/radv/radeonsi: use register defines for dcc block sizes. These are just taken from amdvlk, we probably knew these already, but may as well port them now. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-27 11:10:35 +10:00
Timothy Arceri	a88532c612	st/glsl_to_nir: add patch support to st_nir_assign_var_locations() Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-27 11:26:08 +11:00
Timothy Arceri	351eee05d3	st/glsl_to_nir: call post opt functions after opts have finished We need to move this to a separate loop because nir_compact_varyings() can alter the IR of a previous stage. Fixes: `6648bd68fd` "st/glsl_to_nir: enable NIR link time opts" Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-27 11:26:08 +11:00
Timothy Arceri	ddc0e7941f	st/st_glsl_to_nir: call nir_lower_64bit_pack Fixes 56 crashes in the radeonsi nir backend. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-27 11:26:08 +11:00
Dave Airlie	ae556ba778	docs/features: show es3.1 compat done on r600. This was already being reported, just missed the docs. Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-27 00:08:20 +00:00
Miklós Máté	3667714ccd	mesa: always compare optype with symbolic name in ATI_fs Signed-off-by: Miklós Máté <mtmkls@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-25 14:32:23 +01:00
Miklós Máté	0fd82f5455	mesa: document ati_fragment_shader::cur_pass and swizzlerq Signed-off-by: Miklós Máté <mtmkls@gmail.com>	2017-12-25 14:32:23 +01:00
Miklós Máté	f5415dcc8c	mesa: move ATI_fs state compile changes after the error checks Both in setup and arithmetic instructions. Also, remove the useless new_*_inst() functions, and refactor check_arith_arg(), because it did two completely different things. Piglit: spec/ati_fragment_shader/error04-endshader Signed-off-by: Miklós Máté <mtmkls@gmail.com>	2017-12-25 14:32:23 +01:00
Miklós Máté	7df48c48f9	tnl: fix not having texture coords in ATI_fs in swrast ATI_fs in swrast only had access to texture coordinates if there was a valid texture bound and texturing was enabled. Piglit: spec/ati_fragment_shader/render-sources and render-notexture Signed-off-by: Miklós Máté <mtmkls@gmail.com>	2017-12-25 14:32:23 +01:00
Miklós Máté	a64d6fb730	mesa: fix not having secondary color in ATI_fs in swrast ATI_fs in swrast only had secondary color if GL_COLOR_SUM was enabled. This patch probably fixes the same issue in r200. Piglit: spec/ati_fragment_shader/render-sources and render-precedence Signed-off-by: Miklós Máté <mtmkls@gmail.com>	2017-12-25 14:32:23 +01:00
Miklós Máté	759d193ceb	mesa: fix validate for secondary interpolator This patch fixes multiple problems: - the interpolator check was duplicated - both had arg instead of argRep - I split it into color and alpha for better readability and error msg - the DOT4 check only applies to color instruction according to the spec - made the DOT4 check fatal, and improved the error msg Piglit: spec/ati_fragment_shader/error08-secondary v2: fixed formatting, added spec quotations Signed-off-by: Miklós Máté <mtmkls@gmail.com>	2017-12-25 14:32:23 +01:00
Miklós Máté	8b3a519913	mesa: fix typo in ATI_fs dstMod error checking Piglit: spec/ati_fragment_shader/error14-invalidmod Signed-off-by: Miklós Máté <mtmkls@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-25 14:32:23 +01:00
Miklós Máté	4003ad298a	mesa: fix crash when an ATI_fs pass begins with an alpha inst This fixes crash when: - first pass begins with alpha inst - first pass ends with color inst, second pass begins with alpha inst Also, use the symbolic name instead of a number. Piglit: spec/ati_fragment_shader/api-alphafirst v2: fixed formatting Signed-off-by: Miklós Máté <mtmkls@gmail.com>	2017-12-25 14:32:23 +01:00
Miklós Máté	178a3dfb0e	mesa: add fallback texture for SampleMapATI if there is nothing This fixes crash in the state tracker. Piglit: spec/ati_fragment_shader/render-notexture v2: fixed formatting, moved stuff inside the loop, moved the fallback later to fix more cases Signed-off-by: Miklós Máté <mtmkls@gmail.com>	2017-12-25 14:32:22 +01:00
Marek Olšák	f9cd6c502e	radeonsi: don't use fast color clear for small images even on APUs Increase the limit and handle non-square images better. This makes glxgears 20% faster on APUs, and a little more on dGPUs. We all use and love glxgears. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-12-25 14:23:18 +01:00
Marek Olšák	afdcf0f6b2	radeonsi: set PNT_SPRITE_ENA = point_quad_rasterization This is based on how nvc0 translates the state.	2017-12-25 14:23:02 +01:00
Marek Olšák	986e467e4c	gallium/util: add util_num_layers helper	2017-12-25 14:23:02 +01:00
Bas Nieuwenhuizen	70b5e85fc3	radv: Fix DCC compatible formats. DCC was disabled when the image format is !!supported, which is one ! too many. Ironically the commit that introduced it was supposed to lead to more DCC use ... Fixes: `969537d935` "radv: Add support for more DCC compression with VK_KHR_image_format_list." Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-23 10:58:18 +01:00
Anuj Phogat	2d04572038	Revert "i965/fs: Use align1 mode on ternary instructions on Gen10+" This reverts commit `9cd60fce9c`. Above commit caused 2000+ piglit tests to assert fail. Disabling the align1 mode on gen10 for now to avoid failures. Cc: Matt Turner <mattst88@gmail.com> Cc: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>	2017-12-22 16:40:40 -08:00
Andres Gomez	466011e46a	docs: update calendar, add news item and link release notes for 17.2.8 Signed-off-by: Andres Gomez <agomez@igalia.com>	2017-12-23 00:59:22 +02:00
Andres Gomez	7f4ea112ce	docs: add sha256 checksums for 17.2.8 Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit `3281775ab9`)	2017-12-23 00:55:35 +02:00
Andres Gomez	d18f00e160	docs: add release notes for 17.2.8 Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit `3482790712`)	2017-12-23 00:55:33 +02:00
Ilia Mirkin	0dbdb07070	freedreno: set missing internal_format when importing texture Fixes running piglits without -fbo. Probably lots of other stuff too. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-12-22 09:56:02 -05:00
Samuel Pitoiset	38f9b87af2	amd/common: add ac_export_mrt_z() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-22 10:38:49 +01:00
Samuel Pitoiset	03ef264146	amd/common: pass the family to ac_llvm_context_init() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-22 10:38:44 +01:00
Samuel Pitoiset	79c495aa37	radv: reduce the number of small surfaces that need CMASK or DCC Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-22 10:38:44 +01:00
Ilia Mirkin	05944a392e	gm107/ir: use lane 0 for manual textureGrad handling This is parallel to the pre-SM50 change which does this. Adjusts the shuffles / quadops to make the values correct relative to lane 0, and then splat the results to all lanes for the final move into the target register. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-By: Karol Herbst <kherbst@redhat.com>	2017-12-22 00:17:15 -05:00
Dave Airlie	fbac9f86aa	radv/meta: fix blit paths for depth/stencil (v2.1) This fixes the layout issue for the blit path as well. This fixes: dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint* v2: use compatible render passes. v2.1: use enum Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-22 14:11:02 +10:00
Dave Airlie	821b5379f0	radv: handle depth/stencil image copy with layouts better. (v3.1) If we are doing a general->general transfer with HIZ enabled, we want to hit the tile surface disable bits in radv_emit_fb_ds_state, however we never get the current layout to know we are in general and meta hardcoded the transfer layout which is always tile enabled. This fixes: dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_general dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.general_general v2: refactor some shared helpers for blit patches v3: we only need multiple render passes as they should be compatible. v3.1: use enum (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-22 14:10:04 +10:00
Dave Airlie	286fe1db47	radv: refactor blit2d pipeline creation This just refactors the gfx9 blit2d pipeline creation to be less lines of code. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-22 14:09:49 +10:00
Dave Airlie	9f675bf934	radv/gfx9: add support for 3d images to blit 2d paths This adds support for a 3D image reading path to the blit 2d paths, like I did for the clear paths. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Alex Smith <asmith@feralinteractive.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-22 14:09:28 +10:00
Dave Airlie	a99fa7e8a2	radv/gfx9: add 3d sampler image->buffer copy shader. (v3) On GFX9 we must access 3D textures with 3D samplers AFAICS. This fixes: dEQP-VK.api.image_clearing.core.clear_color_image.3d.single_layer on GFX9 for me. v1.1: fix tex->sampler_dim to dim v2: send layer in from outside v3: don't regress on pre-gfx9 Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Alex Smith <asmith@feralinteractive.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-22 14:08:48 +10:00
Dave Airlie	9594667899	radv: fix surface max layer count (v2) looking at traces I noticed we'd set slice_max too large sometimes. This should fix it. v2: fix missing - 1 Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-22 14:07:55 +10:00
Francisco Jerez	b3e3cb9901	intel/fs: Initialize fs_visitor::grf_used on construction. This should shut up some Valgrind errors during pre-regalloc scheduling. The errors were harmless since they could only have led to the estimation of the bank conflict penalty of an instruction pre-regalloc, which is inaccurate at that point of the program compilation, but no less accurate than the intended "return 0" fall-back path. The scheduling pass is normally re-run after regalloc with a well-defined grf_used value and accurate bank conflict information. Fixes: `acf98ff933` "intel/fs: Teach instruction scheduler about GRF bank conflict cycles." Reported-by: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-21 15:20:17 -08:00
Francisco Jerez	1aa79d5ed5	intel/fs/bank_conflicts: Use posix_memalign() instead of overaligned new to obtain vector storage. The weight_vector_type constructor was inadvertently assuming C++17 semantics of the new operator applied on a type with alignment requirement greater than the largest fundamental alignment. Unfortunately on earlier C++ dialects the implementation was allowed to raise an allocation failure when the alignment requirement of the allocated type was unsupported, in an implementation-defined fashion. It's expected that a C++ implementation recent enough to implement P0035R4 would have honored allocation requests for such over-aligned types even if the C++17 dialect wasn't active, which is likely the reason why this problem wasn't caught by our CI system. A more elegant fix would involve wrapping the __SSE2__ block in a '__cpp_aligned_new >= 201606' preprocessor conditional and continue taking advantage of the language feature, but that would yield lower compile-time performance on old compilers not implementing it (e.g. GCC versions older than 7.0). Fixes: `af2c320190` "intel/fs: Implement GRF bank conflict mitigation pass." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104226 Reported-by: Józef Kucia <joseph.kucia@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-21 15:19:59 -08:00
Mark Janes	7665383a33	Revert "spirv: consider bitsize when handling OpSwitch cases" This reverts commit `9702fac68e`, which hangs vulkancts and crucible on all platforms. The patch is being reverted because it disables continuous integration testing. The patch from bug 104359 does not apply to master. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104359	2017-12-21 12:15:40 -08:00
Dave Airlie	b81f1a592b	radv: fix issue with multisample positions and interp_var_at_sample. This fixes vmfaults seen on vega with: dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_single_sample_.128_128_1.samples_1 These were caused by the don't allocate cmask but it was just accidental. The actual problem was the shader was trying to get the sample positions from a buffer, but the buffer was never getting configured to contain them, as the previous shader never needed them. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `1171b304f3` (radv: overhaul fragment shader sample positions.) Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-22 05:42:48 +10:00
Emil Velikov	be86e5e7d5	docs: update calendar, add news item and link release notes for 17.3.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-21 17:40:16 +00:00
Emil Velikov	022258117e	docs: add sha256 checksums for 17.3.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `f66496d291`)	2017-12-21 17:39:38 +00:00
Emil Velikov	11ec85ddee	docs: add release notes for 17.3.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `4f5e85e9e9`)	2017-12-21 17:39:38 +00:00
Samuel Pitoiset	9f54675dbe	radv/gfx9: fix primitive topology when adjacency is used Found by inspection. Cc: 17.3 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-21 10:49:17 +01:00
Brian Paul	6e5b882339	glsl: disable vec3 packing/splitting in tfb separate mode This fixes a varying packing issue when using transform feedback in GL_SEPARATE_ATTRIBS mode. By the time we get to linking, we already know that the number of feedback attributes is under the GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS limit so packing isn't as critical. In fact, packing/splitting vec3 attributes can cause trouble because splitting effectively creates another TFB output which can exceed device limits. So, disable vec3 packing when it's not needed to avoid that issue. Fixes the Piglit ext_transform_feedback-separate test on VMware driver. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:17 -07:00
Brian Paul	462df64495	glsl: simplify packing class comparison Handle comparing the packing class using the same method as we do for var->data.is_xfb_only Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:17 -07:00
Brian Paul	06588a065f	glsl: document varying_matches::assign_locations() params and return value And change *components to components[] as a reminder that it's an array. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:16 -07:00
Brian Paul	544f41ff19	glsl: remove some continue statements In some cases, I think loop code is easier to read without continue statements. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:16 -07:00
Brian Paul	76fc24ba8d	glsl: use bitwise operators in varying_matches::compute_packing_class() The mix of bitwise operators with * and + to compute the packing_class values was a little weird. Just use bitwise ops instead. v2: add assertion to make sure interpolation bits fit without collision, per Timothy. Basically, rewrite function to be simpler. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:16 -07:00
Brian Paul	cd7705de44	glsl: simplify loop in varying_matches::assign_locations() The use of break/continue was kind of weird/confusing. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:16 -07:00
Brian Paul	47b4183c92	glsl: minor simplification in assign_varying_locations() Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:16 -07:00
Brian Paul	a0430bb62c	glsl: make varying_matches::is_varying_packing_safe() const Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:16 -07:00
Brian Paul	d86c9836d5	glsl: trivial comment fixes in lower_packed_varyings.cpp Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-20 11:23:16 -07:00
Andres Gomez	a42e96f522	docs: update 17.3 and 18.0 cycles for the release calendar Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Juan A. Suarez Romero <jasuarez@igalia.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-20 19:48:58 +02:00
Juan A. Suarez Romero	24aac5f81e	spirv: Makefile.nir.am: include vtn_gather_types_c.py script in tarball dist Fixes: `bb1e6ff161` ("spirv: Add a prepass to set types on vtn_values") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-12-20 17:44:35 +01:00
Lucas Stach	51523ab9fa	st/dri: allow direct YUYV import Push this format to the pipe driver unchanged. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>	2017-12-20 16:54:37 +01:00
Juan A. Suarez Romero	9702fac68e	spirv: consider bitsize when handling OpSwitch cases When walking over all the cases in an OpSwitch, take into account the bitsize of the literals to avoid getting wrong cases. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-20 10:39:15 +01:00
Tapani Pälli	fcfb423646	drirc: set allow_glsl_cross_stage_interpolation_mismatch for more games Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Suggested-by: Darius Spitznagel <d.spitznagel@goodbytez.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104288 Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-12-20 09:43:42 +02:00
Samuel Iglesias Gonsálvez	a31f0c4a36	anv: disallow VK_REMAINING_ARRAY_LAYERS in vkCmdClearAttachments() Vulkan spec doesn't specify that VK_REMAINING_ARRAY_LAYERS is allowed in the passed VkClearRect struct. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-20 06:55:41 +01:00
Ilia Mirkin	0cf6320eb5	nvc0/ir: change textureGrad to always use lane 0 as the tex origin Thanks to Karol Herbst for the debugging / tracing work that led to this change. Move to using lane 0 as the "work" lane for the texture. It is unclear why this helps, as that computation should be identical to doing it in the "correct" lane with the properly adjusted quadops. In order to be able to use the lane 0 result, we also have to ensure that lane 0 contains the proper array/indirect/shadow values. This applies to Fermi and Kepler. Maxwell+ may or may not need fixing, but that lowering logic is separate. Fixes KHR-GL45.texture_cube_map_array.sampling Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-12-19 23:09:19 -05:00
Eric Anholt	22ceb1f99b	broadcom/vc5: Add missing setting of the UIF XOR disable flag in textures. Most piglit textures happened to work out by RGBW not changing in that bit, but it did cause failures in RGBA16F fbo-generatemipmap-formats.	2017-12-19 15:55:14 -08:00
Eric Anholt	200562ad01	broadcom/vc5: Clean up the comment and code around level 0 UIF. I wrote this early in driver development, and our UIF handling is much better now.	2017-12-19 14:20:19 -08:00
Eric Anholt	5473dc2b1f	broadcom/vc5: Simplify the tiling calculations. The mb_tile_layout table was just the utile_w/h times two, so reuse the utile code instead.	2017-12-19 14:10:06 -08:00
Eric Anholt	81cdfdd670	broadcom/vc5: Return the depth in all components of depth textures. Apparently gallium's u_blitter wants depth from at least the .z component, and other swizzling appears to apply on top of that. Fixes fbo-generatemipmap-formats failures with depth formats.	2017-12-19 13:40:30 -08:00
Eric Anholt	8efb57c05e	broadcom/vc5: Enable decompressing RGTC for desktop GL support. This matches freedreno's behavior.	2017-12-19 13:40:30 -08:00
Eric Anholt	a13e9915c2	broadcom/vc5: Use u_transfer_helper for MSAA mappings.	2017-12-19 13:40:30 -08:00
Eric Anholt	7a30517cce	broadcom/vc5: Start adding support for rendering to Z32F_S8X24_UINT. There may be some more RCL work to be done (I think I need to split my Z/S stores when doing separate stencil), but this gets piglit's "texwrap GL_ARB_depth_buffer_float" working. v2: Unwrap the z32f_wrapper before calling the helper, rather than having the helper have a callback. v3: Rebase on Rob Clark's u_transfer_helper instead	2017-12-19 13:40:30 -08:00
Rob Clark	308076fd55	freedreno: add debug flag to force high priority context Mainly for testing, FD_MESA_DEBUG=hiprio will force high priority contexts. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-19 16:36:10 -05:00
Rob Clark	805a72404c	freedreno: context priority support For devices (and kernels) which support different priority ringbuffers, expose context priority support. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-19 16:36:10 -05:00
Rob Clark	0015217c1e	gallium: plumb context priority through to driver Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-12-19 16:36:10 -05:00
Rafael Antognolli	85789831b4	intel/compiler/gen10: Disable push constants. We still have gpu hangs on Cannonlake when using push constants, so disable them for now until we have a proper fix for these hangs. v2: Add warning message when creating context too. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2017-12-19 12:32:24 -08:00
Samuel Pitoiset	4237c3d645	radv: properly load unused gl_LocalInvocationID/gl_WorkGroupID components F1 2017 looks good now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-19 21:26:25 +01:00
Samuel Pitoiset	0c4a30eb51	radv: do not add extra SGPR when push constants are not used Just because the vertex stage needs some push constants does not mean that other stages need them too. This should reduce the number of loaded SGPRs in some situations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-19 21:22:18 +01:00
Samuel Pitoiset	39097282f7	radv: change the needs_push_constants logic Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-19 21:22:16 +01:00
Samuel Pitoiset	ca8f3a8d55	radv: store pipeline stages that need push constants Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-19 21:22:14 +01:00
Samuel Pitoiset	1cecaa9174	radv: remove one useless check in ac_nir_shader_info_pass() pipeline->layout can't be NULL now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-19 21:22:12 +01:00
Samuel Pitoiset	f9a07474a1	radv: remove one useless check in radv_flush_constants() pipeline->layout can't be NULL now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-19 21:22:11 +01:00
Samuel Pitoiset	00162b2108	radv: add assertions to make sure pipeline layout objects are valid The spec requires it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-19 21:22:09 +01:00
Samuel Pitoiset	3595a11648	radv: create pipeline layout objects for all meta operations They are dummy objects but the spec requires layout to not be NULL, this just makes sure we are creating valid pipeline layout objects. This will allow us to remove some useless checks. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-19 21:22:06 +01:00
Bas Nieuwenhuizen	8b5fe4b2b4	radv: Use a sort for rebuilding the sparse buffer bo list. It uses slightly more memory (though still bounded by the number of mapped ranges), but gives less quadratic behavior. Cuts 4 minutes from the runtime of the CTS .sparse. tests. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-19 21:12:48 +01:00
Rob Clark	3511a51be0	freedreno/ir3: handle VTXID_BASE for indirect draws Need to do some gymnastics to copy the parameter from the indirect parameters buffer to uniform so shader sees the correct base-vertex-id. Fixes ./bin/arb_draw_indirect-vertexid on a5xx and probably a4xx too. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-19 15:00:18 -05:00
Rob Clark	d7cb509fd3	freedreno/ir3: add ctx->mem_to_mem() For dealing with indirect-draw + gl_VertexID, we'll introduce another case where we need to use CP_MEM_TO_MEM. Rather than adding more if(a5xx)/else make this a ctx vfunc. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-19 15:00:18 -05:00
Rob Clark	0536737983	freedreno/a5xx: use vertex_id_zero_base Cmdstream traces from blob make it clear that the blob driver devs think a5xx has a real (non-zero-based) vtxid. But reality claims differently. Fixes ./bin/gl-3.2-basevertex-vertexid and probably others. This means draw-indirect is going to need some gymnastics to copy base-vertex into uniform. (a4xx probably needs that too.) Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-19 15:00:18 -05:00
Dave Airlie	a1e18e87c7	r600: clear compressed flags in image state on unbind. If we aren't binding an image, clear the compressed flags. This fixes a segfault seen with an apitrace. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104331 Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-19 05:38:12 +00:00
George Kyriazis	999f1cd5c6	swr: Account for index_bias in offsets When calculating buffer offsets for client buffers account for info.index_bias. Fixes the following piglit tests: arb_draw_elements_base_vertex-drawelements-user_varrays arb_draw_elements_base_vertex-negative-index-user_varrays Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-18 18:51:25 -06:00
Dave Airlie	6ab346c4d9	r600: only report tgsi ir compute support on evergreen+ This fixes a crash on r600/r700. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-18 21:41:46 +00:00
Bas Nieuwenhuizen	42bc25a79c	radv: Advertise sync fd import and export. Passes dEQP-VK..sync_fd. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-18 22:13:31 +01:00
Bas Nieuwenhuizen	52b3f50df8	radv: Implement sync file import/export for fences & semaphores. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-18 22:13:31 +01:00
Bas Nieuwenhuizen	b98bbdf490	radv/amdgpu: wrap sync fd import/export. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-18 22:13:31 +01:00
Dave Airlie	dd517ad96d	ac/nir: fix lds store for patch outputs. This wasn't calculating the correct value, this along with a nir patch fixes a regression in: dEQP-VK.tessellation.shader_input_output.barrier Fixes: `043d14db30` (ac/nir: don't write tcs outputs to LDS that aren't read back.) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-19 06:44:24 +10:00
Dave Airlie	0e8e7ccf9d	nir/linking: always set the used_across_stages/outputs_read bits If we don't remap and output this code would trample the outputs read bits. This fixes a regression in dEQP-VK.tessellation.shader_input_output.barrier Fixes: `1c9c42d16b` (nir: add varying component packing helpers) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-19 06:44:11 +10:00
Jason Ekstrand	3be382cd7c	spirv: Relax the validation conditions of OpSelect The Talos Principle contains shaders with an OpSelect between two vectors where the condition is a scalar boolean. This is technically against the spec but nir_builder gracefully handles it by splatting out the condition to all the channels. So long as the condition is a boolean, just emit a warning instead of failing. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104246	2017-12-18 09:48:58 -08:00
Samuel Pitoiset	8d00e63ca8	radv: remove useless radv_cmask_info::base_address_reg Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-18 11:51:11 +01:00
Samuel Pitoiset	79b34d0832	amd/common: add ac_vgt_gs_mode() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-18 11:50:50 +01:00
Samuel Pitoiset	55f8431c76	amd/common: add ac_get_cb_shader_mask() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-18 11:50:48 +01:00
Samuel Pitoiset	bb01661918	Revert "radv: do not load unused gl_LocalInvocationID/gl_WorkGroupID components" This reverts commit `2294d35b24`. We can't do this without adjusting the input SGPRs/VGPRs logic. For now, just revert it. I will send a proper solution later. It fixes a rendering issue in F1 2017 that CTS didn't catch up. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-18 11:50:02 +01:00
Dave Airlie	1bdeac545f	radv: port merge tess info from anv anv merges the tess info correctly, but radv wasn't doing this. This fixes hangs in dEQP-VK.tessellation.winding.default_domain.hlsl_triangles_ccw Fixes: `60fc0544e0` (radv/pipeline: handle tessellation shader compilation) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-18 18:36:49 +10:00
Bas Nieuwenhuizen	d27aaae4d2	radv: Add external fence support. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-18 09:31:21 +01:00
Bas Nieuwenhuizen	6abfa37879	radv: Implement VK_KHR_external_fence_fd. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-18 09:31:17 +01:00
Bas Nieuwenhuizen	969421b7da	radv: Implement fences based on syncobjs. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-18 09:31:12 +01:00
Bas Nieuwenhuizen	b308bb8773	amd/common: Add detection of the syncobj wait/signal/reset ioctls. First amdgpu bump after inclusion was 20 (which was done for local BOs). Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-18 09:31:06 +01:00
Bas Nieuwenhuizen	1c3cda7d27	radv: Add syncobj signal/reset/wait to winsys. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-18 09:31:02 +01:00
Bas Nieuwenhuizen	52be440f48	configure/meson: Bump libdrm_amdgpu version requirement. For the radv dependencies on syncobj signal/reset. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-18 09:30:55 +01:00
Tapani Pälli	61756a6ceb	android: fix vulkan driver build fixes undefined references by adding missing wsi common API Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-18 09:49:15 +02:00
Tapani Pälli	f94e234dd5	android: fix undefined references to futex API Fixes: `f98a2768ca` "mesa: Add new fast mtx_t mutex type for basic use cases" Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-18 09:24:38 +02:00
Dave Airlie	8de6d4ba0c	docs: mark GL4.3 as finished for r600 Still only on fp64 supported hw.	2017-12-18 04:30:14 +00:00
Dave Airlie	7ce3bd9af3	r600: export robust buffer access Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-18 04:30:11 +00:00
Dave Airlie	ec7008f03a	r600: export GLSL 430 Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-18 04:30:08 +00:00
Dave Airlie	91dd4e44c2	r600/cs: add compute support to caps Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-18 04:30:05 +00:00
Dave Airlie	4388bbbf29	r600: always flush between gfx and compute This is in no way optimal, but there seems to be some problems mixing at the moment, lots of hangs, it is possible, just need to figure out more magic. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-18 04:30:03 +00:00
Dave Airlie	af9e34b8d7	r600: fix unused variable warning Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-18 04:29:57 +00:00
Bas Nieuwenhuizen	b42e106d4d	radv: Fix multi-layer blits. We did not set the layer correctly for the dst, as we would keep using the base layer. Same for the source image. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102710 CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-18 01:27:49 +01:00
Rob Clark	e095b1347e	freedreno/a5xx: add a5xx blitter FD_MESA_DEBUG=noblit to disable Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-17 12:41:32 -05:00
Rob Clark	37464efa3f	freedreno: add generic blitter Basically a clone of util_blitter_blit() but with special handling to blit PIPE_BUFFER as a PIPE_TEXTURE_1D. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-17 12:41:32 -05:00
Rob Clark	b852c3bf67	freedreno: add non-draw batches for compute/blit Get rid of "gmem" (ie. tiling) ringbuffer, and just emit setup commands directly to "draw" ringbuffer for compute (and in future for blits not using the 3d pipe). This way we can have a simple flat cmdstream buffer and bypass setup related to 3d pipe. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-17 12:41:32 -05:00
Rob Clark	2697480c92	freedreno: track staging and shadow perf ctrs for the HUD Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-17 12:41:32 -05:00
Rob Clark	d848bee50f	freedreno: staging upload transfers In the busy && !needs_flush case, we can support a DISCARD_RANGE upload using a staging buffer. This is a bit different from the case of mid-batch uploads which require us to shadow the whole resource (because later draws in an earlier tile happen before earlier draws in a later tile). Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-17 12:41:32 -05:00
Rob Clark	f20013a119	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-17 12:41:32 -05:00
Bas Nieuwenhuizen	6d9849d63e	anv: Remove unused variable. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-17 14:53:46 +01:00
Marek Olšák	35c3cbad3c	radeonsi: don't call force_dcc_off for buffers This was undefined yet harmless behavior in LLVM. Not anymore - it causes a hang now. Cc: 17.3 <mesa-stable@lists.freedesktop.org> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2017-12-16 01:22:01 +01:00
Kenneth Graunke	02720f8d24	isl: Don't require VALIGN_2 for R32G32B32_FLOAT on Haswell. According to the RENDER_SURFACE_STATE internal documentation, the R32G32B32_FLOAT restriction is marked "IVB" only. We choose to apply it to Ivybridge and Baytrail, but not Haswell. Apparently fixes KHR-GL46.texture_size_promotion.functional on Haswell. Changes these tests from crashing to skipping on Haswell: - KHR-GL46.direct_state_access.textures_storage_multisample_2d_rgb32f - KHR-GL46.direct_state_access.textures_storage_multisample_3d_rgb32f Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-15 14:00:09 -08:00
Boyuan Zhang	2ec48039b8	radeon/uvd: add and manage render picture list Create a list in decoder to store all render picture buffer pointers that are currently being used in reference picture lists. During get message buffer call, check each pointer in render_pic_list[] within given pic->ref[] list, remove pointers that are no longer being used by pic->ref[]. Then add current render surface pointer to the render_pic_list[] and assign the associated index to result.curr_idx. As a result, result.curr_idx will have the correct index to represent the current render picture, instead of the previous incrementing values. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-12-15 16:04:31 -05:00
Boyuan Zhang	f2bfd1cbb7	radeon/vcn: add and manage render picture list Create a list in decoder to store all render picture buffer pointers that are currently being used in reference picture lists. During get message buffer call, check each pointer in render_pic_list[] within given pic->ref[] list, remove pointers that are no longer being used by pic->ref[]. Then add current render surface pointer to the render_pic_list[] and assign the associated index to result.curr_idx. As a result, result.curr_idx will have the correct index to represent the current render picture, instead of the previous incrementing values. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-12-15 16:04:31 -05:00
Boyuan Zhang	d9727f31a8	vl: remove is idr flag Remove is_idr flag since it is not being used anymore. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-12-15 16:04:05 -05:00
Boyuan Zhang	3181065b7f	st/va: directly use idr pic flag Remove is_idr flag, and use idr_pic_flag provided by vaapi directly Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-12-15 16:04:05 -05:00
Boyuan Zhang	130e1d142f	radeon/vce: determine idr by pic type Vaapi encode interface provides idr frame flags, where omx interface doesn't. Therefore, change to use picture type to determine idr frame, which will work for both interfaces. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2017-12-15 16:04:05 -05:00
Boyuan Zhang	c87d91b9d8	radeon/vcn: determine idr by pic type Vaapi encode interface provides idr frame flags, where omx interface doesn't. Therefore, change to use picture type to determine idr frame, which will work for both interfaces. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-12-15 16:04:05 -05:00
Emil Velikov	5d03a68640	util: scons: wire up the sha1 test Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-12-15 19:01:12 +00:00
Tim Rowley	f475ac3c40	swr/rast: Move more RTAI handling out of binner Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:57:12 -06:00
Tim Rowley	11a9d4f9b5	swr/rast: EXTRACT2 changed from vextract/vinsert to vshuffle Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:57:06 -06:00
Tim Rowley	12adf2c815	swr/rast: Fix cache of API thread event manager Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:57:01 -06:00
Tim Rowley	c68b2d5c79	swr/rast: Replace VPSRL with LSHR Replace use of x86 intrinsic with general llvm IR instruction. Generates the same final assembly. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:56:54 -06:00
Tim Rowley	20f9006603	swr/rast: Rework thread binding parameters for machine partitioning Add BASE_NUMA_NODE, BASE_CORE, BASE_THREAD parameters to SwrCreateContext. Add optional SWR_API_THREADING_INFO parameter to SwrCreateContext to control reservation of API threads. Add SwrBindApiThread() function to allow binding of API threads to reserved HW threads. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:56:46 -06:00
Tim Rowley	182cc51a50	swr/rast: Pull of RTAI gather & offset out of clip/bin code Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:56:40 -06:00
Tim Rowley	ca59b2e75c	swr/rast: Remove no-op VBROADCAST of vID Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:56:36 -06:00
Tim Rowley	01a57c11cb	swr/rast: SIMD16 Fetch - Fully widen 32-bit integer vertex components Also widen the 16-bit and 8-bit integer vertex component gathers to SIMD16. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:56:30 -06:00
Tim Rowley	fa3105cdb5	swr/rast: Replace INSERT2 vextract/vinsert with JOIN2 vshuffle Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:56:25 -06:00
Tim Rowley	b38ac9dca1	swr/rast: SIMD16 Fetch - Fully widen 16-bit float vertex components Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:56:19 -06:00
Tim Rowley	df54678ba0	swr/rast: SIMD16 Fetch - Fully widen 32-bit float vertex components Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:56:03 -06:00
Tim Rowley	fbc27ff027	swr/rast: Pass prim to ClipSimd Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:55:54 -06:00
Tim Rowley	8b06920796	swr/rast: Pull most of the VPAI manipulation out of the binner/clipper Move out of binner/clipper; hand them down from the frontend code instead. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:55:49 -06:00
Tim Rowley	f882891684	swr/rast: Move GatherScissors to header Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:55:42 -06:00
Tim Rowley	cdb61d45cd	swr/rast: Rewrite Shuffle8bpcGatherd using shuffle Ease future code maintenance, prepare for folding simd8 and simd16 versions. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:55:38 -06:00
Tim Rowley	3ec98ab5d4	swr/rast: Convert gather masks to Nx1bit Simplifies calling code, gets gather function interface closer to llvm's masked_gather. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:55:33 -06:00
Tim Rowley	36e276b6b0	swr/rast: WIP - Widen fetch shader to SIMD16 Widen vertex gather/storage to SIMD16 for all component types. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:55:28 -06:00
Tim Rowley	6d5275498a	swr/rast: Corrections to multi-scissor handling binner's GatherScissors() will be turned into a real gather in the not too distant future. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:55:24 -06:00
Tim Rowley	0e9e247687	swr/rast: Binner fixes for viewport index offset handling Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:55:19 -06:00
Tim Rowley	f2e3900a1e	swr/rast: Remove unneeded copy of gather mask Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-15 10:55:01 -06:00
Chris Wilson	a68873f668	i965: Allow old begin/end queryobj for gen4/5 with HW contexts Since we have HW contexts on gen4/5, we could take advantage of them, as done for gen6+ in commit `e32cd5ffbb` ("i965: Rely on hardware contexts for query objects on Gen6+."), to only emit a pair of counters at begin/end queryobj, rather than around every primitive. However, to keep queryobj working in the meantime as we bringup support for HW ctx on gen4/5, we can keep using the existing code. References: `e32cd5ffbb` ("i965: Rely on hardware contexts for query objects on Gen6+.") Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-15 13:41:18 +00:00
Rob Clark	d1465b3aee	freedreno: use u_transfer_helper Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-15 08:09:44 -05:00
Rob Clark	e94eb5e600	gallium/util: add u_transfer_helper Add a new helper that drivers can use to emulate various things that need special handling in particular in transfer_map: 1) z32_s8x24.. gl/gallium treats this as a single buffer with depth and stencil interleaved but hardware frequently treats this as separate z32 and s8 buffers. Special pack/unpack handling is needed in transfer_map/unmap to pack/unpack the exposed buffer 2) fake RGTC.. GPUs designed with GLES in mind, but which can otherwise do GL3, if native RGTC is not supported it can be emulated by converting to uncompressed internally, but needs pack/unpack in transfer_map/unmap 3) MSAA resolves in the transfer_map() case v2: add MSAA resolve based on Eric's "gallium: Add helpers for MSAA resolves in pipe_transfer_map()/unmap()." patch; avoid wrapping pipe_resource, to make it possible for drivers to use both this and threaded_context. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-15 08:09:44 -05:00
Tapani Pälli	eac1aad624	i965: enable EXT_disjoint_timer_query extension Following dEQP cases pass: dEQP-EGL.functional.get_proc_address.extension.gl_ext_disjoint_timer_query dEQP-EGL.functional.client_extensions.disjoint Piglit test 'ext_disjoint_timer_query-simple' passes with these changes. No changes/regression observed in Intel CI. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-15 08:42:48 +02:00
Tapani Pälli	33f73345da	mesa: GL_EXT_disjoint_timer_query extension API bits Patch adds GL_GPU_DISJOINT_EXT and enables to use timer queries when EXT_disjoint_timer_query is enabled. v2: enable extension only when EXT_disjoint_timer_query set Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-15 08:42:48 +02:00
Tapani Pälli	0a202dd5e8	glapi: add GL_EXT_disjoint_timer_query Most entrypoints already available via other extensions like GL_EXT_occlusion_query_boolean, GL_EXT_timer_query. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-15 08:42:48 +02:00
Tapani Pälli	80d96ca4c8	mesa: add DisjointOperation to gl_shared_state This state will be used by EXT_disjoint_timer_query. As first usage, patch sets DisjointOperation true when gpu reset happens. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-15 08:42:48 +02:00
Eric Anholt	49e2586bfc	broadcom/vc5: Fix a typo in memcmp for sig unpack checking. This shockingly ended up working out, because only the first byte of sig is used and (sizeof(sig) != 0) == 1. Fixes a compiler warning. Link: https://bugs.freedesktop.org/show_bug.cgi?id=104183	2017-12-14 14:36:24 -08:00
Eric Anholt	1171f1749d	broadcom/vc5: Enable NIR txd lowering on all txd instructions. Fixes almost all of piglit's arb_shader_texture_lod grad tests, except for the base -texgrad/texgradcube ones which fail on what appear to be precision problems. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-14 14:36:17 -08:00
Eric Anholt	0bead224fe	nir: Add a new lowering option to lower all txd to txl. VC5 requires that all txd are lowered in the shader. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-14 14:36:17 -08:00
Eric Anholt	b08b628994	nir: Fix interaction of GL_CLAMP lowering with texture offsets. We want the clamping of the coordinate to apply after the offset, so we need to do math to lower the offset out of the instruction. Fixes texwrap offset cases for GL_CLAMP with GL_NEAREST on vc5. Note: I moved the get_texture_size() verbatim, so that it was defined before use. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-14 14:36:17 -08:00
Eric Anholt	52f024b052	broadcom/vc5: Fix shader input/outputs for gallium's new NIR linking.	2017-12-14 14:36:17 -08:00
Roland Scheidegger	1ae48963f7	gallivm: implement accurate corner behavior for textureGather with cube maps The spec says the missing texel (when we wrap around both x and y axis) should be synthesized as the average of the 3 other texels. For bilinear filtering however we instead adjusted the filter weights (because, while the complexity looks similar, there would be 4 times as many color values to fix up than weights). Obviously this could not work for gather (hence accurate corner filtering was disabled with gather). Implement this by just doing it as the spec implies - calculate the 4th texel as the average of the other 3. With gather of course there's only one color to worry about, so it's not all that many instructions neither (albeit surely the whole cube map filtering is hilariously complex). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-12-14 22:59:55 +01:00
Roland Scheidegger	a485ad0bcd	gallivm: fix an issue with NaNs with seamless cube filtering Cube texture wrapping is a bit special since the values (post face projection) always are within [0,1], so we took advantage of that and omitted some clamps. However, we can still get NaNs (either because the coords already had NaNs, or the face projection generated them), and in fact we didn't handle them quite safely. I've seen -INT_MAX + 1 being propagated through as the final int coord value, albeit I didn't observe a crash. (Not quite a coincidence, since any stride mul with -INT_MAX or -INT_MAX+1 will turn up as a small positive number - nevertheless, I'd rather not try my luck, I'm not entirely sure it can't really turn up negative either due to seamless coord swapping, plus ifloor of a NaN is not guaranteed to return -INT_MAX by any standard. And we kill off NaNs similarly with ordinary texture wrapping too.) So kill off the NaNs by using the common max against zero method. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-12-14 22:59:55 +01:00
Jason Ekstrand	4b8c9ea46b	intel/tools: Convert aubinator over to the common framework Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-14 13:27:24 -08:00
Jason Ekstrand	35f9c27be3	intel/batch-decoder: Decode registers Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-14 13:27:22 -08:00
Jason Ekstrand	81e4ecbc19	intel/batch-decoder: Decode dynamic state Unfortunately, in aubinator and aubinator_error_decode we don't always know how many of a given state we have, so we must guess. One day, we'll come up with a way to annotate the batch to solve this problem. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-14 13:27:20 -08:00
Jason Ekstrand	4ac2ee9001	intel/batch-decoder: Decode constants, binding tables, and samplers Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-14 13:27:18 -08:00
Jason Ekstrand	d374423eab	intel/tools: Switch aubinator_error_decode over to the gen_print_batch The shared framework can now do everything that aubinator_error_decode ever did and more. It's time to make the switch. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-14 13:27:16 -08:00
Jason Ekstrand	c86671c438	intel/batch-decoder: Decode graphics shaders Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-14 13:27:15 -08:00
Jason Ekstrand	d4081fb778	intel/batch-decoder: Decode vertex and index buffers Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-14 13:27:13 -08:00
Jason Ekstrand	e27ec208ed	intel/batch-decoder: Decode MEDIA_INTERFACE_DESCRIPTOR_LOAD Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-14 13:27:12 -08:00
Jason Ekstrand	be20043d00	intel/tools: Add the start of a generic batch decoder Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-14 13:27:10 -08:00
Jason Ekstrand	4cb96fbd91	intel/decoder: Expose the raw field value in the iterator Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-14 13:27:09 -08:00
Jason Ekstrand	79269e8f4b	intel/disasm: Take a devinfo in gen_disasm_create Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-14 13:27:06 -08:00
Jason Ekstrand	a7ae72032f	intel/decoder: Take a bit offset in gen_print_group Previously, if a group was nested in another group such that it didn't start on a dword boundary, we would decode it as if it started at the start of its first dword. This changes things to work even more in terms of bits so that we can properly decode these structs. This affects MOCS, attribute swizzles, and several other things. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-14 13:27:04 -08:00
Jason Ekstrand	dca8f466ee	intel/decoder: Stop rounding down to the nearest dword Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-14 13:27:03 -08:00
Jason Ekstrand	f264640693	intel/decoder: Convert the iterator to work entirely in bits Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-14 13:27:01 -08:00
Jason Ekstrand	ada705b671	intel/decoder: Drop gen_field_decode helper It's unused Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-12-14 13:26:44 -08:00
Samuel Pitoiset	225b198802	amd/common: add ac_build_waitcnt() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:24:44 +01:00
Samuel Pitoiset	24601810e9	amd/common: more use of i32_1 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:24:42 +01:00
Samuel Pitoiset	ec4e566560	amd/common: more use of i32_0 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:24:41 +01:00
Samuel Pitoiset	d43e72fd8c	radeonsi: make use of ac_build_fdiv() And move the comment to amd/common. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:24:38 +01:00
Samuel Pitoiset	88522e2bcd	radv: export SampleMask from pixel shaders at full rate Use 16_ABGR instead of 32_ABGR if Z isn't written. Ported from RadeonSI. No CTS regressions on Polaris. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:23:28 +01:00
Samuel Pitoiset	45872a0a6d	radeonsi: make use of ac_get_spi_shader_z_format() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:23:25 +01:00
Samuel Pitoiset	91f4d746e4	amd/common: add ac_get_spi_shader_z_format() ac_shader_util.c will contain shader helpers for RadeonSI and RADV. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:23:23 +01:00
Samuel Pitoiset	90c3bf0789	radv: do not load the local invocation index when it's unused Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:22:26 +01:00
Samuel Pitoiset	2294d35b24	radv: do not load unused gl_LocalInvocationID/gl_WorkGroupID components We should also not load the input SGPRs and VGPRS, but let's start with this for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:22:06 +01:00
Samuel Pitoiset	e001944410	amd/common: scan which components of gl_LocalInvocationID are used Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:22:04 +01:00
Samuel Pitoiset	42285ed8c3	amd/common: scan which components of gl_WorkGroupID are used Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:22:02 +01:00
Samuel Pitoiset	5a761167f5	radv: set FORCE_SIMD_DIST(1) for compute when profitable Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:20:59 +01:00
Samuel Pitoiset	75b1c4997f	radv: calculate best compute resource limits Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:20:57 +01:00
Samuel Pitoiset	9fdc1437ba	radv: store the dispatch initiator into the device Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:20:55 +01:00
Samuel Pitoiset	2e58ef46a8	radv: replace grid_components_used by uses_grid_size Use a boolean instead because the number of needed SGPRs is always 3. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:19:42 +01:00
Samuel Pitoiset	97e57740d8	radv: always emit all compute block components The number of grid components is always 3 when gl_NumWorkGroups is declared, because it relies on the number of components of nir_intrinsic_load_num_work_groups. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:19:39 +01:00
Emil Velikov	271fc8606a	docs: update calendar, add news item and link release notes for 17.2.7 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-14 13:52:11 +00:00
Emil Velikov	7ddc3d9f15	docs: add sha256 checksums for 17.2.7 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-14 13:49:32 +00:00
Emil Velikov	0811bb3bd3	docs: add release notes for 17.2.7 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-14 13:49:30 +00:00
Harish Krupo	96fc5fbf23	egl/android: Provide an option for the backend to expose KHR_image From android cts 8.0_r4, a new test case checks if all the required egl extensions are exposed. In the current implementation we expose KHR_image if KHR_image_base and KHR_image_pixmap are supported but KHR_image spec does not mandate the existence of both the extensions. This patch preserves the current check and also provides the backend with an option to expose the KHR_image extension. Test: run cts -m CtsOpenGLTestCases -t \ android.opengl.cts.OpenGlEsVersionTest#testRequiredEglExtensions Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-14 13:43:03 +02:00
Bas Nieuwenhuizen	4eb0dca46b	radv: Don't advertise VK_EXT_debug_report. We never supported it. Missed during copy and pasting. Fixes: `17201a2eb0` "radv: port to using updated anv entrypoint/extension generator." Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-12-14 10:05:22 +01:00
Kenneth Graunke	fd3fc5f547	i965: Don't allocate an MCS for 16x MSAA and width > 8192. The hardware doesn't support this, and isl_surf_get_mcs_surf will fail. I feel a bit bad replicating this logic, but we want to decide up front. This fixes the following test when run with --deqp-surface-width=16384: - GTF-GL46.gtf30.GL3Tests.framebuffer_blit.framebuffer_blit_error_blitframebuffer_multisampled_framebuffers_different_sample_count Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-12-14 00:37:33 -08:00
Rob Herring	546633dce2	Android: fix missing generation of vtn_gather_types.c Commit `bb1e6ff161` ("spirv: Add a prepass to set types on vtn_values") added generation of vtn_gather_types.c, but forgot to add it to the Android build files. Fixes: `bb1e6ff161` ("spirv: Add a prepass to set types on vtn_values") Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Rob Herring <robh@kernel.org>	2017-12-13 16:20:15 -06:00
Dylan Baker	e5d8ffdda6	mesa: Add glSpecializeShaderARB to common_desktop_functions CC: Nicolai Hähnle <nicolai.haehnle@amd.com> CC: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104231 Fixes: `46b21b8f90` ("mesa: add GL_ARB_gl_spirv boilerplate") Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-13 13:24:57 -08:00
Tomasz Figa	5364e73624	egl/android: Partially handle HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED There is no API available to properly query the IMPLEMENTATION_DEFINED format. As a workaround we rely here on gralloc allocating either an arbitrary YCbCr 4:2:0 or RGBX_8888, with the latter being recognized by lock_ycbcr failing. Reviewed-on: https://chromium-review.googlesource.com/566793 Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Chad Versace <chadversary@chromium.org> Signed-off-by: Robert Foss <robert.foss@collabora.com> Signed-off-by: Rob Herring <robh@kernel.org>	2017-12-13 14:51:48 -06:00
Bruce Cherniak	ea2ee9cd19	swr: Correct texture allocation and limit max size to 2GB This patch fixes piglit tex3d-maxsize by correcting 4 things: The total_size calculation was using 32-bit math, therefore a >4GB allocation request overflowed and was not returning false (unsupported). Changed AlignedMalloc arguments from "unsigned int" to size_t, to handle >4GB allocations. Added error checking on texture allocations to fail gracefully. Finally, temporarily decreased supported max texture size from 4GB to 2GB. The gallivm texture-sampler needs some additional work to correctly handle larger than 2GB textures (offsets to LLVMBuildGEP are signed). I'm working on a follow-on patch to allow up to 4GB textures, as this is useful in HPC visualization applications. Fixes piglit tex3d-maxsize. v2: Updated patch description to clarify ">4GB". Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2017-12-13 14:44:04 -06:00
Bruce Cherniak	709f5bdc4a	swr: Fix KNOB_MAX_WORKER_THREADS thread creation override. Environment variable KNOB_MAX_WORKER_THREADS allows the user to override default thread creation and thread binding. Previous commit to adjust linux cpu topology caused setting this KNOB to bind all threads to a single core. This patch restores correct functionality of override. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>	2017-12-13 14:44:01 -06:00
Dylan Baker	1774c10361	meson: fix glx-test race This test should rely on dispatch.h being generated, but it doesn't. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-12-13 09:37:12 -08:00
Brian Paul	c27a6c45c2	gallium/docs: document behavior of set_sample_mask() The sample mask is used even if msaa is not explicitly enabled when we have a framebuffer with multisampled surfaces. That's DX behavior and what the Radeon drivers do. Not sure about other drivers at this point. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-12-13 08:38:07 -07:00
Brian Paul	0f2bd31baf	glsl: trivial whitespace fixes in link_varyings.cpp	2017-12-13 08:38:07 -07:00
Jordan Justen	dc07bb5fd1	program: Don't reset SamplersValidated when restoring from shader cache Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103988 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-13 00:31:06 -08:00
Kai Wasserbäch	729fec6eab	mesa: remove second include of errors.h in src/mesa/main/glspirv.c Fixes: `5bc03d2508` ("mesa: implement SPIR-V loading in glShaderBinary") Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-12 23:42:53 -08:00
Timothy Arceri	3308f4b81a	radeonsi: create get_tcs_tes_buffer_address helper This will be shared between the NIR and TGSI backends. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-13 13:41:53 +11:00
Timothy Arceri	a5f9ac2928	ac: fix nir_op_f2f64 Without this we get the error "FPExt only operates on FP" when converting the following: vec1 32 ssa_5 = b2f ssa_4 vec1 64 ssa_6 = f2f64 ssa_5 Which results in: %44 = and i32 %43, 1065353216 %45 = fpext i32 %44 to double With this patch we now get: %44 = and i32 %43, 1065353216 %45 = bitcast i32 %44 to float %46 = fpext float %45 to double Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-13 13:20:28 +11:00
Timothy Arceri	cab5513b47	nir: fix shift for uint64_t Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-13 13:20:28 +11:00
Timothy Arceri	dd119a4263	st/glsl_to_nir: skip forced array splitting for tcs nir_lower_io_to_temporaries() does not support tcs so we cannot assume there are no indirects here. Also the radeonsi backend (the only backend to support tess) has support for tcs indirects so there is no need to lower them anyway. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-13 13:20:28 +11:00
Francisco Jerez	acab52f520	intel/fs/bank_conflicts: Don't touch Gen7 MRF hack registers. Fixes: `af2c320190` "intel/fs: Implement GRF bank conflict mitigation pass." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104199 Reported-by: Darius Spitznagel <d.spitznagel@goodbytez.de> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-12-12 12:05:45 -08:00
Kevin Rogovin	b1ce812c51	i965: compute scratch space size correctly for Gen9+ Fixes: `8ecdbb6136` "i965: Pretend there are 4 subslices for compute shader threads on Gen9+." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104005 Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>	2017-12-12 10:02:43 -08:00
Kevin Rogovin	eea9027f87	i965: Program MEDIA_VFE_STATE in a more readable fashion. This patch is purely for readability improvements when programming the MEDIA_VFE_STATE. Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-12-12 10:02:31 -08:00
Brian Paul	7469966ed2	cso: add point rasterization sanity check assertion Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-12-12 09:46:18 -07:00
Brian Paul	38a4fd8ad6	gallium/u_blitter: replace tabs with spaces Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-12 09:46:18 -07:00
Brian Paul	7a46063803	xlib: call _mesa_warning() instead of fprintf() We use _mesa_warning() everywhere else in this code. Change requested by Rick Irons of Mathworks. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-12 09:44:59 -07:00
Brian Paul	63b03dc924	gallium/util: don't pass a pipe_resource to util_resource_is_array_texture() No need to pass a pipe_resource when we can just pass the target. This makes the function potentially more usable. Rename it too. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-12 09:44:59 -07:00
Brian Paul	dde8309cde	gallium/aux: include nr_samples in util_resource_size() computation This function is only used in two places: 1. VMware driver, but only for HUD reporting 2. st/nine state tracker, used for texture memory accounting Fixes: `a69efa9482` ("util: add new util_resource_size() function in u_resource.[ch]") Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-12 09:44:59 -07:00
Brian Paul	09b69828a3	svga: trivial whitespace/formatting fixes in svga_pipe_rasterizer.c	2017-12-12 09:44:59 -07:00
Brian Paul	71ac73ce76	st/mesa: trivial whitespace/formatting fixes in st_atom_rasterizer.c	2017-12-12 09:44:59 -07:00
Jason Ekstrand	9718ce44c2	spirv: Handle image and sampler function parameters	2017-12-12 07:34:46 -08:00
Jason Ekstrand	7d3ebd1286	spirv/cfg: Refactor the function parameter loop a bit	2017-12-12 07:34:46 -08:00
Jason Ekstrand	e6ba457c99	spirv/cfg: Be a bit more precise about function parameters Pointers with no storage type are converted to inout variables but SSA values and pointers with a storage type (which turns into a uint or uvec2) are just input variables.	2017-12-12 07:34:46 -08:00
Jason Ekstrand	aaeda8d7d4	spirv: Make sampled images a real type Previously, we just gave them exactly the same type as the respective image (which already had a sampler2D or similar type). Now they have their own base type and a pointer to the vtn_type for the image. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-12-12 07:34:46 -08:00
Eric Engestrom	ec0a4fcec0	i915: add missing 0 defines Thanks to Emil's -Wundef, t_dd_dmatmp.h now complains that intel_render.c is missing a couple `#define`s. Assigning them to 0 keeps the existing behaviour; I'll let someone else turn them on if this is the behaviour that was intended. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-12 13:59:46 +00:00
Nicolai Hähnle	accb7d4390	mesa: refuse to compile SPIR-V shaders or link mixed shaders Note that gl_shader::CompileStatus will also indicate whether a shader has been successfully specialized. v2: Use the 'spirv_data' member of gl_shader to know if it is a SPIR-V shader, instead of a dedicated flag. (Timothy Arceri) v3: Use bool instead of GLboolean. (Ian Romanick) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-12 08:18:32 +01:00
Nicolai Hähnle	4ccd00d762	mesa/shaderapi: add a getter for GL_SPIR_V_BINARY_ARB v2: Use the 'spirv_data' member of gl_shader instead of a dedicated flag. (Timothy Arceri) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-12 08:18:32 +01:00
Nicolai Hähnle	5bc03d2508	mesa: implement SPIR-V loading in glShaderBinary v2: * Add a gl_shader_spirv_data member to gl_shader, which already encapsulates a gl_spirv_module where the binary will be saved. (Eduardo Lima) * Just use the 'spirv_data' member to know whether a gl_shader has the SPIR_V_BINARY_ARB state. (Timothy Arceri) * Remove redundant argument checks. Move extension presence check to API entry point where the rest of checks are. Retype 'n' and 'length' arguments to use the correct and more standard types. (Ian Romanick) * Fix some nitpicks. (Ian Romanick) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-12 08:18:32 +01:00
Eduardo Lima Mitev	a8889f5cc7	mesa/glspirv: Add struct gl_shader_spirv_data This is a per-shader structure holding the SPIR-V data associated with the shader (binary module, specialization constants and entry-point). This is needed because both gl_shader and gl_linked_shader need to share this data. Instead of copying the data, we pass a reference to it upon program linking. That's why it is reference-counted. This struct is created and associated with the shader upon calling glShaderBinary(), then subsequently filled up by the call to glSpecializeShaderARB(). v2: Readability improvements (Ian Romanick) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-12 08:18:32 +01:00
Nicolai Hähnle	74f98ab76f	mesa/glspirv: Add struct gl_spirv_module v2: * Make the SPIR-V module struct part of a larger gl_shader_spirv_data struct that will be introduced later, and don't reference it directly in gl_shader. (Eduardo Lima) * Readability improvements (Ian Romanick) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-12 08:18:32 +01:00
Nicolai Hähnle	46b21b8f90	mesa: add GL_ARB_gl_spirv boilerplate v2: * Add meson build bits (Eric Engestrom) * Return INVALID_OPERATION error on SpecializeShaderARB (Ian Romanick) v3: Include boilerplate for the GL 4.6 alias of glSpecializeShaderARB (Neil Roberts) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-12 08:18:32 +01:00
Jason Ekstrand	df657ebb68	spirv: Add support for all bit sizes in OpSwitch Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101560	2017-12-11 22:28:34 -08:00
Jason Ekstrand	58cabae8cc	spirv: Restructure the case loop in OpSwitch handling Instead of calling vtn_add_case for the default case and then looping, add an is_default variable and do everything inside the loop. This will make the next commit easier. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-11 22:28:34 -08:00
Jason Ekstrand	5f572ccc95	spirv: Add better parameter validation for vector and matrix types Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-11 22:28:34 -08:00
Jason Ekstrand	a7c2be9944	spirv: Add type validation for OpSelect Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-11 22:28:34 -08:00
Jason Ekstrand	6737b1b859	spirv: Add basic type validation for OpLoad, OpStore, and OpCopyMemory Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-12-11 22:28:34 -08:00
Jason Ekstrand	bb1e6ff161	spirv: Add a prepass to set types on vtn_values This autogenerated pass will automatically find and set the type field on all vtn_values. This way we always have the type and can use it for validation and other checks. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-11 22:28:34 -08:00
Jason Ekstrand	2c84b49ddf	spirv: Add a vtn_type field to all vtn_values At the moment, this just lets us drop the const_type for constants and unify things a bit. Eventually, we will use this to store the types of all SPIR-V SSA values. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-11 22:28:34 -08:00
Samuel Iglesias Gonsálvez	ba4bb0838b	anv: fix bug when using component qualifier in FS outputs We can write to the same output but in different components, like in this example: layout(location = 0, component = 0) out ivec2 dEQP_FragColor_0; layout(location = 0, component = 2) out ivec2 dEQP_FragColor_1; Therefore, they are not two different outputs but only one. Fixes: dEQP-VK.glsl.440.linkage.varying.component.frag_out.* v3: - Remove FRAG_RESULT_MAX. - Add const and use sizeof (Ian). - Do three-pass to set properly the locations of fragment outputs when having arrays (Jason). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-12 07:24:55 +01:00
Ilia Mirkin	0332c7484b	st/mesa: swizzle argument when there's a vector size mismatch GLSL IR operation arguments can sometimes have an implicit swizzle as a result of a vector arg and a scalar arg, where the scalar argument is implicitly expanded to the size of the vector argument. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103955 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-11 23:08:43 -05:00
Roland Scheidegger	84c363fb09	gallivm: fix texture wrapping for texture gather for mirror modes Care must be taken that all coords end up correct, the tests are very sensitive that everything is correctly rounded. This doesn't matter for bilinear filter (since picking a wrong texel with weight zero is ok), and we could also switch the per-sample coords mistakenly. While here, also optimize the coord_mirror helper a bit (we can do the mirroring directly by exploiting float rounding, no need for fixing up odd/even manually). I did not touch the mirror_clamp and mirror_clamp_to_border modes. In contrast to mirror_clamp_to_edge and mirror_repeat these are legacy modes. They are specified against old gl rules, which actually does the mirroring not per sample (so you get swapped order if the coord is in the mirrored section). I think the idea though is that they should follow the respecified mirror_clamp_to_edge rules so the order would be correct. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-12-12 04:23:02 +01:00
Jason Ekstrand	24f019fd69	spirv: Allow ignoring decorations for workgroup variables Since we switched over to lowering SLM access directly in SPIR-V -> NIR, we no longer have vtn_variables for SLM. It's all safe as with UBOs and SSBOs but we need to let it through in the assert. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104213 Fixes: `8761a04d0d` Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-12-11 19:02:47 -08:00
Jason Ekstrand	2bc9123c33	spirv: Set lengths on scalar and vector types Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-11 19:02:47 -08:00
Bas Nieuwenhuizen	3342a432fa	ac/nir: Support vulkan_resource_reindex. Fixes: `93b4cb61eb` "spirv: Allow OpPtrAccessChain for block indices" Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-12 00:16:18 +01:00
Bas Nieuwenhuizen	368f49b284	ac/nir: Don't load the descriptor in vulkan_resource_index. To support the reindex intrinsic, we need the result to be something on which we can adjust the index/address. Since it is all within a basic block, the compiler should be able to merge any extra loads. v2: Change visit_get_buffer_size too. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-12 00:16:18 +01:00
Marek Olšák	bf0904e31f	winsys/amdgpu: disable local BOs again due to worse performance Cc: 17.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-11 19:11:14 +01:00
Marek Olšák	8a821fa91c	drirc: whitelist glthread for Mount and Blade Warband again	2017-12-11 19:11:12 +01:00
Bas Nieuwenhuizen	6469669beb	radv: Don't use local BOs when allocating with export options. If the app does not plan to put a buffer or image in it (why? But it is allowed and CTS does it), they do not need to allocate it with the dedicated allocation struct. Fixes: `a639d40f13` "radv: add support for local bos. (v3)" Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-10 23:47:23 +01:00
Bas Nieuwenhuizen	b926da241a	spirv: Fix loading an entire block at once. There is no chain, so checking the length ends with a SEGFAULT. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103579 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-10 01:43:26 +01:00
Jason Ekstrand	4c7af87fb9	anv: Enable UBO pushing Push constants on Intel hardware are significantly more performant than pull constants. Since most Vulkan applications don't actively use push constants on Vulkan or at least don't use it heavily, we're pulling way more than we should be. By enabling pushing chunks of UBOs we can get rid of a lot of those pulls. On my SKL GT4e, this improves the performance of Dota 2 and Talos by around 2.5% and improves Aztec Ruins by around 2%. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-12-08 15:43:26 -08:00
Jason Ekstrand	f1ce0b905a	i965/fs: Handle !supports_pull_constants and push UBOs properly In Vulkan, we don't support classic pull constants and everything the client asks us to push, we push. However, for pushed UBOs, we still want to fall back to conventional pulls if we run out of space.	2017-12-08 15:43:25 -08:00
Jason Ekstrand	8d34077182	anv/device: Increase the UBO alignment requirement to 32 Push constants work in terms of 32-byte chunks so if we want to be able to push UBOs, everything needs to be 32-byte aligned. Currently, we only require 16-byte which is too small. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-12-08 15:43:25 -08:00
Jason Ekstrand	2f9eb045f3	anv/cmd_buffer: Add support for pushing UBO ranges In order to do this we have to modify push constant set up to handle ranges. We also have to tweak the way we handle dirty bits a bit so that we re-push whenever a descriptor set changes. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-12-08 15:43:25 -08:00
Jason Ekstrand	0c879b62b0	anv/cmd_buffer: Add some stage asserts There are several places where we look up opcodes in an array of stages. Assert that we don't end up going out-of-bounds. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-12-08 15:43:25 -08:00
Jason Ekstrand	1968cd07a2	anv/cmd_buffer: Add some helpers for working with descriptor sets Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-12-08 15:43:25 -08:00
Jason Ekstrand	1bce04deb8	anv/pipeline: Translate vulkan_resource_index to a constant when possible We want to call brw_nir_analyze_ubo_ranges immediately after anv_nir_apply_pipeline_layout and it badly wants constants. We could run an optimization step and let constant folding do it but that's way more expensive than needed. It's really easy to just handle constants in apply_pipeline_layout. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-12-08 15:43:25 -08:00
Jason Ekstrand	3b34ed79f1	i965/fs: Rewrite assign_constant_locations This rewires the logic for assigning uniform locations to work in terms of "complex alignments". The basic idea is that, as we walk the list of instructions, we keep track of the alignment and continuity requirements of each slot and assert that the alignments all match up. We then use those alignments in the compaction stage to ensure that everything gets placed at a properly aligned register. The old mechanism handled alignments by special-casing each of the bit sizes and placing 64-bit values first followed by 32-bit values. The old scheme had the advantage of never leaving a hole since all the 64-bit values could be tightly packed and so could the 32-bit values. However, the new scheme has no type size special cases so it handles not only 32 and 64-bit types but should gracefully extend to 16 and 8-bit types as the need arises. Tested-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-12-08 15:43:25 -08:00
Jason Ekstrand	597c194487	anv: Disable VK_KHR_16bit_storage The testing for this extension is currently very poor. The CTS tests only test accessing UBOs and SSBOs at dynamic offsets so none of our constant-offset paths get triggered at all. Also, there's an assertion in our handling of nir_intrinsic_load_uniform that offset % 4 == 0 which is never triggered indicating that nothing ever gets loaded from an offset which is not a dword. Both push constants and the constant offset pull paths are complex enough, we really don't want to ship without tests. We'll turn the extension back on once we have decent tests.	2017-12-08 15:42:55 -08:00
Leo Liu	6d74cb2570	radeon/vce: move destroy command before feedback command VCE processing IBs starts from session and task info at first level, other commands processed subsequently. The task info for destroy is embedded to destroy command, resulting that feedback command is not properly processed. This is causing kernel spin VM fault messages on Polaris and Vega10 card when running ends at encode application. The fix is also verified on VCE physical mode card. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: mesa-stable@lists.freedesktop.org Acked-by: Christian König <christian.koenig@amd.com>	2017-12-08 12:56:48 -05:00
Ben Crocker	060eb314eb	docs/llvmpipe: document ppc64le as alternative architecture to x86. Power8, Power8NV, and Power9 are supported on an equal footing with X86. Cc: "17.2" "17.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ben Crocker <bcrocker@redhat.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> [Eric: changed formatting, reworded a bit (with Ben's ack)] Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-08 14:49:00 +00:00
Emil Velikov	bce489a4ed	docs/release-calendar: drop 17.3.0 from the table Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-08 13:59:27 +00:00
Emil Velikov	95c9d751ce	docs: add news item and link release notes for 17.3.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-08 13:58:03 +00:00
Emil Velikov	706986bcc9	docs: add sha256 checksums for 17.3.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `49a612d158`)	2017-12-08 13:54:34 +00:00
Emil Velikov	4124ac51f4	docs: Update 17.3.0 release notes Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `8d55da9f57`)	2017-12-08 13:54:33 +00:00
Samuel Pitoiset	572b2bad1d	radv: do not print ASM to stderr when dumping shaders Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-08 11:24:24 +01:00
Samuel Pitoiset	33b329f769	radv/winsys: implement query_value() Might be useful to know the VRAM/GTT usage, the number of VRAM CPU page faults, etc. Nothing is currently using that new interface, but it's a first step. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-08 11:22:35 +01:00
Samuel Pitoiset	c202119286	radv: remove useless check radv_set_dcc_need_cmask_elim_pred() emit_fast_color_clear() already checks that. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-08 11:22:03 +01:00
Samuel Pitoiset	d90b7a4c50	radv: remove useless checks in radv_set_{color,depth}_clear_regs() Already checked by the respective callers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-08 11:22:00 +01:00
Samuel Pitoiset	c7c7b00889	radv: only re-emit the index type when it changes dota2 binds a ton of index buffers but the type is always 16-bit. Note that we have to invalidate the type when switching from indexed draws to normal draws. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-08 11:21:36 +01:00
Samuel Pitoiset	a302009b7b	radv: only reset command buffers that are not in the initial state dota2 always calls vkResetCommandBuffer() before vkBeginCommandBuffer() which is quite useless. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-08 11:21:23 +01:00
Samuel Pitoiset	a380bc7ecf	radv: track different status of a command buffer RADV_CMD_BUFFER_STATUS_INVALID is not used for now, but I think it makes sense to declare it. Could be used later with better command buffer error handling. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-08 11:21:21 +01:00
Samuel Pitoiset	fc6c77e162	radv: fix TC-compat HTILE with VK_FORMAT_D32_SFLOAT_S8_UINT on Vega Copied from RadeonSI. This fixes all CTS dEQP-VK.renderpass.dedicated_allocation.formats.d32_sfloat_s8_uint.clear.* And some other ones which use the same format. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-08 11:15:44 +01:00
Jordan Justen	4d81c8e43e	docs: Update GL_ARB_get_program_binary docs to support 1 format Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Tapani Pälli <tapani.palli@intel.com>	2017-12-08 17:01:02 +11:00
Jordan Justen	b4c37ce214	i965: Add ARB_get_program_binary support using nir_serialization This resolves an apparent game bug described in 85564. The game doesn't properly handle ARB_get_program_binary with 0 supported formats. V2 (Timothy Arceri): - less driver code as more has been moved into the common helpers. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85564 Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1) Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-12-08 17:00:57 +11:00
Jordan Justen	c1ff99fd70	main: Clear shader program data whenever ProgramBinary is called The GL_ARB_get_program_binary extension spec says: "If ProgramBinary fails to load a binary, no error is generated, but any information about a previous link or load of that program object is lost." v2: * Re-initialize shProg->data after clear. (Jordan) (Required after `6a72eba755`) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-08 16:59:25 +11:00
Jordan Justen	50c09a648f	main: add binary support to ProgramBinary V2: call generic mesa_program_binary() helper rather than driver function directly to allow greater code sharing. Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1) Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-12-08 16:59:25 +11:00
Jordan Justen	7ee54ad057	main: add binary support to GetProgramBinary V2: call generic _mesa_get_program_binary() helper rather than driver function directly to allow greater code sharing. Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1) Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-12-08 16:59:25 +11:00
Jordan Justen	e30ed18215	main: Support getting GL_PROGRAM_BINARY_LENGTH V2: call generic _mesa_get_program_binary_length() helper rather than driver function directly to allow greater code sharing. Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1) Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-12-08 16:59:25 +11:00
Jordan Justen	c20fd744fe	mesa: Add Mesa ARB_get_program_binary helper functions V2 (Timothy Arceri): - add extra code comment - stop passing around void binary and just pass program_binary_header hdr instead. - move to src/mesa/main rather than src/util V3 (Timothy Arceri): - Move more code out of the backend and into the common helpers. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-12-08 16:59:25 +11:00
Timothy Arceri	90d4abdd87	mesa: add driver callbacks for serialising ProgramBinary blobs Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-12-08 16:59:25 +11:00
Jordan Justen	64ad804e59	main: Support 1 Mesa format with get for GL_PROGRAM_BINARY_FORMATS Mesa supports either 0 or 1 formats. If 1 format is supported, it is GL_PROGRAM_BINARY_FORMAT_MESA as defined in the GL_MESA_program_binary_formats extension spec. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-08 16:59:25 +11:00
Jordan Justen	fb077d603b	main: Allow non-zero NUM_PROGRAM_BINARY_FORMATS Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-08 16:59:25 +11:00
Jordan Justen	2e28494af2	i965: Fix memory leak when serializing nir Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-08 16:59:25 +11:00
Jordan Justen	25b3ce6e3b	i965: Add brw_program_serialize_nir Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-08 16:59:22 +11:00
Jordan Justen	b3f1b765e9	i965: Free serialized nir after deserializing Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-08 16:44:35 +11:00
Jordan Justen	cdc7ac23b9	i965: Add brw_program_deserialize_nir Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-08 16:44:35 +11:00
Jordan Justen	7cf1037d5a	main, glsl: Add UniformDataDefaults which stores uniform defaults The ARB_get_program_binary extension requires that uniform values in a program be restored to their initial value just after linking. This patch saves off the initial values just after linking. When the program is restored by glProgramBinary, we can use this to copy the initial value of uniforms into UniformDataSlots. V2 (Timothy Arceri): - Store UniformDataDefaults only when serializing GLSL as this is what we want for both disk cache and ARB_get_program_binary. This saves us having to come back later and reset the Uniforms on program binary restores. Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1) Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-12-08 16:44:35 +11:00
Jordan Justen	ebd9e789c4	glsl: Split out shader program serialization This will allow us to use the program serialization to implement ARB_get_program_binary. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-08 16:44:35 +11:00
Jordan Justen	219628c118	include: Add GL_MESA_program_binary_formats to GL/GLES2 ext.h files This was merged into the OpenGL Registry in version 667c5a253781834b40a6ae9eb19d05af4542cfe1. Ref: https://github.com/KhronosGroup/OpenGL-Registry/pull/127 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-08 16:44:35 +11:00
Jordan Justen	0c48487893	mesa: add GL_PROGRAM_BINARY_FORMAT_MESA enum Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-08 16:44:35 +11:00
Francisco Jerez	4d1959e693	intel/cfg: Represent divergent control flow paths caused by non-uniform loop execution. This addresses a long-standing back-end compiler bug that could lead to cross-channel data corruption in loops executed non-uniformly. In some cases live variables extending through a loop divergence point (e.g. a non-uniform break) into a convergence point (e.g. the end of the loop) wouldn't be considered live along all physical control flow paths the SIMD thread could possibly have taken in between due to some channels remaining in the loop for additional iterations. This patch fixes the problem by extending the CFG with physical edges that don't exist in the idealized non-vectorized program, but represent valid control flow paths the SIMD EU may take due to the divergence of logical threads. This makes sense because the i965 IR is explicitly SIMD, and it's not uncommon for instructions to have an influence on neighboring channels (e.g. a force_writemask_all header setup), so the behavior of the SIMD thread as a whole needs to be considered. No changes in shader-db. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-12-07 18:27:05 -08:00
Francisco Jerez	9355116bda	intel/fs: Don't let undefined values prevent copy propagation. This makes the dataflow propagation logic of the copy propagation pass more intelligent in cases where the destination of a copy is known to be undefined for some incoming CFG edges, building upon the definedness information provided by the last patch. Helps a few programs, and avoids a handful of shader-db regressions from the next patch. shader-db results on ILK: total instructions in shared programs: 6541547 -> 6541523 (-0.00%) instructions in affected programs: 360 -> 336 (-6.67%) helped: 8 HURT: 0 LOST: 0 GAINED: 10 shader-db results on BDW: total instructions in shared programs: 8174323 -> 8173882 (-0.01%) instructions in affected programs: 7730 -> 7289 (-5.71%) helped: 5 HURT: 2 LOST: 0 GAINED: 4 shader-db results on SKL: total instructions in shared programs: 8185669 -> 8184598 (-0.01%) instructions in affected programs: 10364 -> 9293 (-10.33%) helped: 5 HURT: 2 LOST: 0 GAINED: 2 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-07 18:27:04 -08:00
Francisco Jerez	c3c1aa5aeb	intel/fs: Restrict live intervals to the subset possibly reachable from any definition. Currently the liveness analysis pass would extend a live interval up to the top of the program when no unconditional and complete definition of the variable is found that dominates all of its uses. This can lead to a serious performance problem in shaders containing many partial writes, like scalar arithmetic, FP64 and soon FP16 operations. The number of oversize live intervals in such workloads can cause the compilation time of the shader to explode because of the worse than quadratic behavior of the register allocator and scheduler when running out of registers, and it can also cause the running time of the shader to explode due to the amount of spilling it leads to, which is orders of magnitude slower than GRF memory. This patch fixes it by computing the intersection of our current live intervals with the subset of the program that can possibly be reached from any definition of the variable. Extending the storage allocation of the variable beyond that is pretty useless because its value is guaranteed to be undefined at a point that cannot be reached from any definition. According to Jason, this improves performance of the subgroup Vulkan CTS tests significantly (e.g. the runtime of the dvec4 broadcast test improves by nearly 50x). No significant change in the running time of shader-db (with 5% statistical significance). 
shader-db results on IVB: total cycles in shared programs: 61108780 -> 60932856 (-0.29%) cycles in affected programs: 16335482 -> 16159558 (-1.08%) helped: 5121 HURT: 4347 total spills in shared programs: 1309 -> 1288 (-1.60%) spills in affected programs: 249 -> 228 (-8.43%) helped: 3 HURT: 0 total fills in shared programs: 1652 -> 1597 (-3.33%) fills in affected programs: 262 -> 207 (-20.99%) helped: 4 HURT: 0 LOST: 2 GAINED: 209 shader-db results on BDW: total cycles in shared programs: 67617262 -> 67361220 (-0.38%) cycles in affected programs: 23397142 -> 23141100 (-1.09%) helped: 8045 HURT: 6488 total spills in shared programs: 1456 -> 1252 (-14.01%) spills in affected programs: 465 -> 261 (-43.87%) helped: 3 HURT: 0 total fills in shared programs: 1720 -> 1465 (-14.83%) fills in affected programs: 471 -> 216 (-54.14%) helped: 4 HURT: 0 LOST: 2 GAINED: 162 shader-db results on SKL: total cycles in shared programs: 65436248 -> 65245186 (-0.29%) cycles in affected programs: 22560936 -> 22369874 (-0.85%) helped: 8457 HURT: 6247 total spills in shared programs: 437 -> 437 (0.00%) spills in affected programs: 0 -> 0 helped: 0 HURT: 0 total fills in shared programs: 870 -> 854 (-1.84%) fills in affected programs: 16 -> 0 helped: 1 HURT: 0 LOST: 0 GAINED: 107 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-07 18:27:04 -08:00
Francisco Jerez	acf98ff933	intel/fs: Teach instruction scheduler about GRF bank conflict cycles. This should allow the post-RA scheduler to do a slightly better job at hiding latency in presence of instructions incurring bank conflicts. The main purpose of this patch is not to improve performance though, but to get conflict cycles to show up in shader-db statistics in order to make sure that regressions in the bank conflict mitigation pass don't go unnoticed. Acked-by: Matt Turner <mattst88@gmail.com>	2017-12-07 15:56:49 -08:00
Francisco Jerez	af2c320190	intel/fs: Implement GRF bank conflict mitigation pass. Unnecessary GRF bank conflicts increase the issue time of ternary instructions (the overwhelmingly most common of which is MAD) by roughly 50%, leading to reduced ALU throughput. This pass attempts to minimize the number of bank conflicts by rearranging the layout of the GRF space post-register allocation. It's in general not possible to eliminate all of them without introducing extra copies, which are typically more expensive than the bank conflict itself. In a shader-db run on SKL this helps roughly 46k shaders: total conflicts in shared programs: 1008981 -> 600461 (-40.49%) conflicts in affected programs: 816222 -> 407702 (-50.05%) helped: 46234 HURT: 72 The running time of shader-db itself on SKL seems to be increased by roughly 2.52%±1.13% with n=20 due to the additional work done by the compiler back-end. On earlier generations the pass is somewhat less effective in relative terms because the hardware incurs a bank conflict anytime the last two sources of the instruction are duplicate (e.g. while trying to square a value using MAD), which is impossible to avoid without introducing copies. E.g. for a shader-db run on SNB: total conflicts in shared programs: 944636 -> 623185 (-34.03%) conflicts in affected programs: 853258 -> 531807 (-37.67%) helped: 31052 HURT: 19 And on BDW: total conflicts in shared programs: 1418393 -> 987539 (-30.38%) conflicts in affected programs: 1179787 -> 748933 (-36.52%) helped: 47592 HURT: 70 On SKL GT4e this improves performance of GpuTest Volplosion by 3.64% ±0.33% with n=16. NOTE: This patch intentionally disregards some i965 coding conventions for the sake of reviewability. This is addressed by the next squash patch which introduces an amount of (for the most part boring) boilerplate that might distract reviewers from the non-trivial algorithmic details of the pass. 
The following patch is squashed in: SQUASH: intel/fs/bank_conflicts: Roll back to the nineties. Acked-by: Matt Turner <mattst88@gmail.com>	2017-12-07 15:56:06 -08:00
Dylan Baker	c34b53f133	meson: Fix building gallium media targets with gallium-xlib glx To demonstrate this bug run meson with the options: -Ddri-drivers= -Dglx=gallium-xlib Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-07 10:22:27 -08:00
Dylan Baker	2adc3817c6	meson: Add lmsensors to gallium libgl-xlib target. Fixes: `5e71efef44` ("meson: Add lmsensors support") Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-07 10:20:58 -08:00
Eric Engestrom	4cba39331d	meson: add dep_thread to every lib that includes threads.h Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104141 Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-12-07 17:29:42 +00:00
Eric Engestrom	f0337f0f70	meson: fix pl111 dependency on vc4 src/gallium/winsys/pl111/drm/libpl111winsys.a(pl111_drm_winsys.c.o): In function `pl111_drm_screen_create': pl111_drm_winsys.c:(.text+0x33): undefined reference to `vc4_drm_screen_create_renderonly' Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-12-07 17:21:03 +00:00
Samuel Pitoiset	5f81a43535	radv: use a faster version for nir_op_pack_half_2x16 This patch is ported from RadeonSI and it has two effects. It fixes a rendering issue which affects F1 2017 and Dawn of War 3 (Vega only) because LLVM was ending up generating the new v_mad_mix_{hi,lo} instructions which appear to be buggy in some way. Not sure if Mesa is generating something wrong or if the issue is in LLVM only. Anyway, that explains why the DOW3 issue can't be reproduced with GL on Vega. It also improves performance because v_cvt_pkrtz_f16 is faster, and because I guess the rounding mode behaviour is similar between GL and VK, we can use it. About performance, it improves Talos by +3/4% but I don't see any other impacts. No CTS regressions on Polaris. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-07 17:21:50 +01:00
Alejandro Piñeiro	25e56b2eba	mesa/spirv: move and rename nir_spirv_supported_capabilities To avoid any vulkan driver to include the GL mtypes.h. Renamed as eventually this could be used by drivers not using nir. v2: remove compiler/spirv/spirv.h from mtypes (Alejandro) v3: added the definition at compiler/shader_info.h (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-07 17:15:11 +01:00
Vadym Shovkoplias	b2490a326c	util/disk_cache: Remove unneeded free() on always null string At this point dc_job->cache_item_metadata.keys always equals NULL, so call to free() is useless Fixes: `b86ecea344` ("util/disk_cache: write cache item metadata to disk") Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-07 11:50:41 +00:00
Samuel Iglesias Gonsálvez	392638d6b5	spirv: fix bug when OpSpecConstantOp calls a conversion In that case, nir_eval_const_opcode() will evaluate the conversion but as it was using destination's bit_size, the resulting value was just a cast of the source constant value. By passing the source's bit size, it does the conversion properly. Fixes: dEQP-VK.spirv_assembly.instruction..opspecconstantop.convert* v2: - Remove invalid conversion op cases. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-07 10:19:34 +01:00
Samuel Iglesias Gonsálvez	67ec314347	spirv: allow specialization constants with bitsize different than 32 bits Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-07 10:19:34 +01:00
James Legg	947470d10b	nir/opcodes: Fix constant-folding of bitfield_insert Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104119 CC: <mesa-stable@lists.freedesktop.org> CC: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-12-07 08:59:36 +00:00
Alex Smith	8fda98c4f1	radv: Add LLVM version to the device name string Allows apps to determine the LLVM version so that they can decide whether or not to enable workarounds for LLVM issues. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-07 08:58:34 +00:00
Alejandro Piñeiro	be2c434308	mesa: remove set_entry from forward type declarations This type was used at gl_sync_object, but it is not used anymore. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-07 09:01:58 +01:00
Kenneth Graunke	8705ed13e3	meta: Fix ClearTexture with GL_DEPTH_COMPONENT. We only handled unpacking for GL_DEPTH_STENCIL formats. Cemu was hitting _mesa_problem() for an unsupported format in _mesa_unpack_float_32_uint_24_8_depth_stencil_row(), because the format was depth-only, rather than depth-stencil. Cc: "13.0 12.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94739 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103966 Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-12-06 20:35:46 -08:00
Kenneth Graunke	d6d16c0218	meta: Initialize depth/clear values on declaration. This helps avoid compiler warnings in the next commit - everything was initialized, but it wasn't obvious to static analysis. Suggested-by: Tapani Pälli <tapani.palli@intel.com>	2017-12-06 20:30:24 -08:00
Timothy Arceri	9d53ccccb2	glsl: get correct member type when processing xfb ifc arrays This fixes a crash in: KHR-GL45.enhanced_layouts.xfb_block_stride Fixes: `0822517936` "glsl: add helper to process xfb qualifiers during linking" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-12-07 15:22:23 +11:00
Gert Wollny	6c268ea79a	r600/sb: do not convert if-blocks that contain indirect array access If an array is accessed within an if block, then currently it is not known whether the value in the address register is involved in the evaluation of the if condition, and converting the if condition may actually result in out-of-bounds array access. Consequently, if blocks that contain indirect array access should not be converted. Fixes piglits on r600/BARTS: spec/glsl-1.10/execution/variable-indexing/ vs-output-array-float-index-wr vs-output-array-vec3-index-wr vs-output-array-vec4-index-wr Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104143 Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-07 09:48:41 +10:00
Dave Airlie	81683c3d42	r600: add support for compute grid/block sizes. (v2) We just pass these in from outside in a constant buffer. The shader side stores them once they are accessed once. v2: fix to not use a temp_reg. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-06 23:21:09 +00:00
Dave Airlie	4525cdb751	r600: handle image/buffer sizes correctly. This adds support to compute for the resq workarounds (buffer/cube sizes) Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-06 23:21:06 +00:00
Dave Airlie	f51458637c	r600/compute: add support for emitting compute image/buffer atoms Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-06 23:21:02 +00:00
Dave Airlie	a5a50d9c89	r600/compute: handle atomic counters in compute state. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-06 23:20:58 +00:00
Dave Airlie	c82934f212	r600/compute: add support for TGSI compute shaders. (v1.1) This adds paths to handle TGSI compute shaders and shader selection. It also avoids emitting certain things on tgsi paths, CBs, vertex buffers, config reg init (not required). v1.1: fix rat mask calc Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-06 23:20:53 +00:00
Dave Airlie	08dc205c61	r600/shader: add compute support to shader assembler Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-06 23:20:50 +00:00
Dave Airlie	7b8e1c089d	r600/texture: drop lowering 1d/2d images to linear. This appears to cause hangs with compute images. Unless we can find more specifics, just don't do this for now. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-06 23:20:20 +00:00
Alejandro Piñeiro	0398b31d1d	mesa: define nir_spirv_supported_capabilities Until now it was part of spirv_to_nir_options. But it will be used on the implementation of ARB_gl_spirv and ARB_spirv_extensions, and added to the OpenGL context, as a way to save what SPIR-V capabilities the current OpenGL implementation supports. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-06 22:25:52 +01:00
Fredrik Höglund	5e1cb16768	anv: fix a case statement in GetMemoryFdPropertiesKHR The handle type in the case statement is supposed to be VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT. Fixes: `ab18e8e59b` ("anv: Implement VK_EXT_external_memory_dma_buf") Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 20:04:39 +01:00
Fredrik Höglund	b055045378	radv: fix a case statement in GetMemoryFdPropertiesKHR The handle type in the case statement is supposed to be VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT. Fixes: `546e747867` ("radv: Implement VK_EXT_external_memory_dma_buf") Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-12-06 20:04:39 +01:00
Eric Engestrom	31d403160f	meson: fix keyword argument in declare_dependency() `declare_dependency()` takes `compile_args`, not `c_args`. It was correct in all the other `declare_dependency()` from that commit. Fixes: `0bbecc5a85` "meson: define driver dependencies" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-12-06 18:31:33 +00:00
Emil Velikov	526945f7dc	i965: include brw_pipe_control.h in the tarball Fixes: `bfe0f3a702` ("i965: Move PIPE_CONTROL defines and prototypes to brw_pipe_control.h.") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-06 17:33:57 +00:00
Emil Velikov	e964e01fdd	mesa: document _mesa_extension_override_* variables Currently there are no users of these outside of extensions.c. Provide some information why they exist and how to use them. Cc: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-12-06 17:31:53 +00:00
Emil Velikov	d7ba4f41f9	docs: annotate MESA_program_debug as obsolete It has been obsolete for years - state it explicitly. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-06 17:31:53 +00:00
George Kyriazis	bc75adcb1e	swr/scons: Fix another intermittent build failure gen_BackendPixelRate*.cpp depends on gen_ar_eventhandler.hpp. Fix missing dependency. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-12-06 11:04:02 -06:00
Marek Olšák	4038db72d4	radeonsi: make const and stream uploaders allocate read-only memory and anything that clones these uploaders, like u_threaded_context. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-06 15:19:02 +01:00
Marek Olšák	7a6643fb4c	radeonsi: use a separate allocator for fine fences Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-06 15:19:02 +01:00
Marek Olšák	3e1287caef	radeonsi/gfx9: make shader binaries use read-only memory Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-06 15:19:02 +01:00
Marek Olšák	fef51ebcea	winsys/amdgpu: make IBs use read-only memory Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-06 15:19:02 +01:00
Marek Olšák	ba59064409	radeonsi: print the buffer list for CHECK_VM Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-06 15:19:02 +01:00
Marek Olšák	010214b403	radeonsi: allow DMABUF exports for local buffers Cc: 17.3 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-06 15:19:02 +01:00
Nicolai Hähnle	20ccb51ffc	radeonsi: always place sparse buffers in VRAM Together with "radeonsi: fix the R600_RESOURCE_FLAG_UNMAPPABLE check", this ensures that sparse buffers are placed in VRAM. Noticed by an assertion that started triggering with commit `d4fac1e1d7` ("gallium/radeon: enable suballocations for VRAM with no CPU access") Fixes KHR-GL45.sparse_buffer_tests.BufferStorageTest in debug builds. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-06 11:19:00 +01:00
Nicolai Hähnle	5e2962c949	radeonsi: fix the R600_RESOURCE_FLAG_UNMAPPABLE check The flag is on the pipe_resource, not the r600_resource. I don't see an obvious bug related to this, but it could potentially lead to suboptimal placement of some resources. Fixes: `a41587433c` ("gallium/radeon: add R600_RESOURCE_FLAG_UNMAPPABLE") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-12-06 11:18:14 +01:00
Jose Maria Casanova Crespo	a1e257a5bf	i965/fs: Use untyped_surface_read for 16-bit load_ssbo SSBO loads were using byte_scattered read messages as they allow reading 16-bit size components. byte_scattered messages can only operate one component at a time so we needed to emit as many messages as components. But for vec2 and vec4 of 16-bit, being multiple of 32-bit we can use the untyped_surface_read message to read pairs of 16-bit components using only one message. Once each pair is read it is unshuffled to return the proper 16-bit components. vec3 case is assimilated to vec4 but the 4th component is ignored. 16-bit scalars are read using one byte_scattered_read message. v2: Removed use of stride = 2 on sources (Jason Ekstrand) Rework optimization using unshuffle 16 reads (Chema Casanova) v3: Use W and D types instead of HF and F in shuffle to avoid rounding errors (Jason Ekstrand) Use untyped_surface_read for 16-bit vec3. (Jason Ekstrand) v4: Use subscript instead of changing type and stride (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	ce2e572c4c	i965/fs: Optimize 16-bit SSBO stores by packing two into a 32-bit reg Currently, we use byte-scattered write messages for storing 16-bit into an SSBO. This is because untyped surface messages have a fixed 32-bit size. This patch optimizes these 16-bit writes by combining 2 values (e.g, two consecutive components aligned with 32-bits) into a 32-bit register, packing the two 16-bit words. 16-bit single component values will continue to use byte-scattered write messages. The same will happen when the first consecutive component is not aligned to 32 bits. This optimization reduces the number of SEND messages used for storing 16-bit values potentially by 2 or 4, which cuts down execution time significantly because byte-scattered writes are an expensive operation as they only write one component per message. v2: Removed use of stride = 2 on sources (Jason Ekstrand) Rework optimization using shuffle 16 write and enable writes of 16bit vec4 with only one message of 32-bits. (Chema Casanova) v3: - Fix coding style (Eduardo Lima) - Reorganize code to avoid duplication. (Jason Ekstrand) - Include new comments to explain the length calculations to fix alignment issues of components. (Jason Ekstrand) - Fix issues with writemask yz with 16-bit writes. (Jason Ekstrand) v4: (Jason Ekstrand) - Reorganize 64-bit ssbo-writes to avoid using slots_per_component. - Comment about why shuffle is needed when using byte_scattered_write. Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	66ce6ce78f	anv: Enable SPV_KHR_16bit_storage and VK_KHR_16bit_storage for SSBO/UBO Enables SPV_KHR_16bit_storage on gen 8+. VK_KHR_16bit_storage is enabled for SSBO/UBO using the VK_KHR_get_physical_device_properties2 functionality to expose if the extension is supported or not. v2: update due to rebase against master (Alejandro) v3: (Jason Ekstrand) - Move this patch up in VK_KHR_16bit_storage series enabling only storageBuffer16BitAccess and uniformAndStorageBuffer16BitAccess. - Only expose VK_KHR_16bit_storage on Gen8+ v4: (Jason Ekstrand) - Squash enable SPV_KHR_16bit_storage into VK_KHR_16bit_storage enablement for SSBO/UBO. Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jason Ekstrand	3282309f74	i965/fs: Enables 16-bit load_ubo with sampler load_ubo is using 32-bit loads as uniforms surfaces have a 32-bit surface format defined. So when reading 16-bit components with the sampler we need to unshuffle two 16-bit components from each 32-bit component. Using the sampler avoids the use of the byte_scattered_read message that needs one message for each component and is supposed to be slower. v2: (Jason Ekstrand) - Simplify component selection and unshuffling for different bitsizes - Remove SKL optimization of reading only two 32-bit components when reading 16-bits types. Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	3db31c0b06	i965/fs: Helpers for un/shuffle 16-bit pairs in 32-bit components These helpers are used to load/store 16-bit types from/to 32-bit components. The functions shuffle_32bit_load_result_to_16bit_data and shuffle_16bit_data_for_32bit_write are implemented in a similar way to the analogous functions for handling 64-bit types. v1: Explain need of temporary in shuffle operations. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	fa4a9d63bb	i965/fs: Use byte scattered read for 16-bit load_ssbo Used to enable 16-bit reads at do_untyped_vector_read, that is used on the following intrinsics: * nir_intrinsic_load_shared * nir_intrinsic_load_ssbo v2: Removed use of stride = 2 on 16-bit sources (Jason Ekstrand) v3: - Add bitsize to scattered read operation (Jason Ekstrand) - Remove implementation of 16-bit UBO read from this patch. - Avoid assertion at opt_algebraic caused by ADD of two IMM with offset with BRW_REGISTER_TYPE_UD type found on matrix tests. (Jose Maria Casanova) v4: (Jason Ekstrand) - Put if case for 16-bits at the beginning of the if ladder. - Use type_sz(dest.type) * 8 as bit_size parameter for scattered read. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	c57a3f200d	i965/fs: Add byte scattered read message and fs support v2: Fix alignment style (Topi Pohjolainen) (Jason Ekstrand) - Enable bit_size parameter to scattered messages to enable different bitsizes byte/word/dword. - Remove use of brw_send_indirect_scattered_message in favor of brw_send_indirect_surface_message. - Move scattered messages to surface messages namespace. - Assert align1 for scattered messages and assume Gen8+. - Inline brw_set_dp_byte_scattered_read. v3: (Jason Ekstrand) - Use renamed brw_byte_scattered_data_element_from_bit_size method - Assert scattered read for Gen8+ and Haswell. - Use conditional expression at components_read. - Include comment about params for scattered opcodes. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	a4031bdfa9	i965/fs: Predicate byte scattered writes if needed While on Untyped Surface messages the bits of the execution mask are ANDed with the corresponding bits of the Pixel/Sample Mask, that is not the case for byte scattered writes. That is needed to avoid ssbo stores writing on helper invocations. So when that can have an effect, we load the sample mask and predicate the send message. Note: the need for this patch was tested with a custom test. Right now the 16 bit storage CTS tests don't need this path in order to get a full pass. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	96f1926aab	i965/fs: Use byte_scattered_write on 16-bit store_ssbo We need to rely on byte scattered writes as untyped writes are 32-bit size. We could try to keep using 32-bit messages when we have two or four 16-bit elements, but for simplicity's sake, we use the same message for any component number. We revisit this approach in the following patches. v2: Removed use of stride = 2 on 16-bit sources (Jason Ekstrand) v3: (Jason Ekstrand) - Include bit_size to scattered write message and remove namespace - specific for scattered messages. - Move comment to proper place. - Squashed with i965/fs: Adjust type_size/type_slots on store_ssbo. (Jose Maria Casanova) - Take into account that get_nir_src returns now WORD types for 16-bit sources instead of DWORD. v4: (Jason Ekstrand) - Rename lenght variable to num_components. - Include assertions before emit_untyped_write. - Remove type_slot in favor of num_slot and first_slot. Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	f1a9936ee1	i965/fs: Add byte scattered write message and fs support v2: (Jason Ekstrand) - Enable bit_size parameter to scattered messages to enable different bitsizes byte/word/dword. - Remove use of brw_send_indirect_scattered_message in favor of brw_send_indirect_surface_message. - Move scattered messages to surface messages namespace. - Assert align1 for scattered messages and assume Gen8+. - Inline brw_set_dp_byte_scattered_write. v3: - Remove leftover newline (Topi Pohjolainen) - Rename brw_data_size to brw_scattered_data_element and use defines instead of an enum (Jason Ekstrand) - Assert scattered write for Gen8+ and Haswell (Jason Ekstrand) Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	d038deaa40	i965/fs: Add remove_extra_rounding_modes optimization Although from SPIR-V point of view, rounding modes are attached to the operation/destination, on i965 it is a status, so we don't need to explicitly set the rounding mode if the one we want is already set. Taking into account that the default mode is RTE, one possible optimization would be to optimize out the first RTE set for each block. For that to work, we would need to take into account block interrelationships. At this point, it is not worth complicating the optimization for such a small gain. v2: Use a single SHADER_OPCODE_RND_MODE opcode taking an immediate with the rounding mode (Curro) v3: Reset optimization for every block. (Jason Ekstrand) Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	82fa4d45e7	i965/fs: Enable rounding mode on f2f16 ops By default we don't set the rounding mode. We only set round-to-near-even or round-to-zero mode if explicitly set from nir. v2: Use a single SHADER_OPCODE_RND_MODE opcode taking an immediate with the rounding mode (Curro) v3: Use new helper brw_rnd_mode_from_nir_op (Jason Ekstrand) Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	d6cd14f213	i965/fs: Define new shader opcode to set rounding modes Although it is possible to emit them directly as AND/OR on brw_fs_nir, having a specific opcode makes it easier to remove duplicate settings later. v2: (Curro) - Set thread control to 'switch' when using the control register - Use a single SHADER_OPCODE_RND_MODE opcode taking an immediate with the rounding mode. - Avoid magic numbers setting rounding mode field at control register. v3: (Curro) - Remove redundant and add missing whitespace lines. - Match printing instruction to IR opcode "rnd_mode" v4: (Topi Pohjolainen) - Fix code style. Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	ac8d4734f6	i965: Add support for control register Control register cr0 in i965 can be used to change the rounding modes in 32-bit to 16-bit floating-point conversions. From Intel Skylake PRM, vol 07, section "Register and Register Regions", subsection "Control Register" (page 754): "Subregister cr0.0:ud contains normal operation control fields such as the floating-point mode ... " Floating-point Rounding mode is changed at bits 5:4 of cr0.0: "Rounding Mode. This field specifies the FPU rounding mode. It is initialized by Thread Dispatch. 00b = Round to Nearest or Even (RTNE) 01b = Round Up, toward +inf (RU) 10b = Round Down, toward -inf (RD) 11b = Round Toward Zero (RTZ)" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	5d5ee507fb	i965/fs: Handle 32-bit to 16-bit conversions Conversions to 16-bit need alignment between the 16-bit and 32-bit types. So the conversion operations unpack 16-bit types with a stride=2 and then apply a MOV with the conversion. v2 (Jason Ekstrand): - Avoid the general use of stride=2 for 16-bit register types. v3 (Topi Pohjolainen) - Code style fix (Jason Ekstrand) - Now nir_op_f2f16 was renamed to nir_op_f2f16_undef because conversion to f16 with undefined rounding is explicit Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	a05b6f25bf	i965/fs: Remove BRW_REGISTER_TYPE_HF assert at get_exec_type Note that we don't remove the assert at i965/vec4. At this point half float support is only for the scalar backend. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	75a88d8567	i965: Support for 16-bit base types in helper functions v2: Fixed calculation of scalar size for 16-bit types. (Jason Ekstrand) v3: Fix coding style (Topi Pohjolainen) Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Eduardo Lima <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Alejandro Piñeiro	2d28ca7000	i965/vec4: Handle 16-bit types at type_size_xvec4 These types have similar vec4 sizes as their 32-bit counterparts. The vec4 backend doesn't support 16-bit types and probably never will, but this method is called by the scalar backend at fs_visitor::nir_setup_outputs(), so we still need to provide valid vec4 sizes for 16-bit types. In the future, something different should be implemented to avoid this dependency. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev	4049c04122	spirv/nir: Add support for SPV_KHR_16bit_storage v2: Minor changes after rebase against recent master (Alejandro Pinheiro) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	e0667a8bd8	spirv: Enable FPRoundingMode decorator to nir operations SpvOpFConvert now manages the FPRoundingMode decorator for the returning values enabling the nir_rounding_mode in the conversion operation to fp16 values. v2: Fixed breaking of specialization constants. (Jason Ekstrand) v3: Avoid nir_rounding_mode * casting. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev	549894a681	spirv/nir: Handle 16-bit types v2: Added more missing implementations of 16-bit types. (Jason Ekstrand) v3: Store values in values[0].u16[i] (Jason Ekstrand) Include switches based on bitsize for 16-bit types (Chema Casanova) v4: Coding style fixes (Jason Ekstrand) Use vtn_u64_literal and u64[0] at 64-bit SpvOpConstant (Jason Ekstrand) Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Eduardo Lima <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	1f440d00d2	nir: Handle fp16 rounding modes at nir_type_conversion_op nir_type_conversion enables new operations to handle rounding modes to convert to fp16 values. Two new opcodes are enabled nir_op_f2f16_rtne and nir_op_f2f16_rtz. The undefined behaviour doesn't have any effect and uses the original nir_op_f2f16 operation. v2: Indentation fixed (Jason Ekstrand) v3: Use explicit case for undefined rounding and assert if rounding mode is used for non 16-bit float conversions (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev	2af63683bc	nir: Populate conversion opcodes to 16-bit types This will include the following NIR ALU opcodes: * nir_op_i2i16 * nir_op_i2f16 * nir_op_u2u16 * nir_op_u2f16 * nir_op_f2i16 * nir_op_f2u16 * nir_op_f2f16 v2: Remove "from" 16-bit in commit subject (Topi Pohjolainen) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Jose Maria Casanova Crespo	d711445430	nir: Add rounding modes enum v2: Added comments describing each of the rounding modes. (Jason Ekstrand) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev	5165e222d1	nir: Add support for 16-bit types (half float, int16 and uint16) v2: Renamed glsl_half_float_type() to glsl_float16_t_type(). (Jason Ekstrand) Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Eduardo Lima <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev	52b10c7f20	mesa/st: Handle 16-bit types at st_glsl_storage_type_size() This is basically to avoid "not handled in switch" warnings. v2: Let the new types hit the assertion instead. (Marek Olšák and Jason Ekstrand) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-06 08:57:18 +01:00
Eduardo Lima Mitev	59f458cd87	glsl: Add 16-bit types Adds new INT16, UINT16 and FLOAT16 base types. The corresponding GL types for half floats were reused from the AMD_gpu_shader_half_float extension. The int16 and uint16 types come from NV_gpu_shader_5 extension. This adds the builtins and the lexer support. To avoid a bunch of warnings due to cases not handled in switch, the new types have been added to a few places using same behavior as their 32-bit counterparts, except for a few trivial cases where they are already handled properly. Subsequent patches in this set will provide correct 16-bit implementations when needed. v2: * Use FLOAT16 instead of HALF_FLOAT as name of the base type. * Removed float16_t from builtin types. * Don't copy 16-bit types as if they were 32-bit values in copy_constant_to_storage(). * Use get_scalar_type() instead of adding a new custom switch statement. (Jason Ekstrand) v3: Use GL_FLOAT16_NV instead of GL_HALF_FLOAT for consistency (Ilia Mirkin) v4: Add missing 16-bit base types support in glsl_to_nir (Eduardo Lima). v5: Fix coding style (Topi Poholainen). Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Eduardo Lima <elima@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-06 08:57:18 +01:00
Jason Ekstrand	8761a04d0d	anv: Add support for the variablePointers feature Not to be confused with variablePointersStorageBuffer which is the subset of VK_KHR_variable_pointers required to enable the extension. This means we now have "full" support for variable pointers. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:54 -08:00
Jason Ekstrand	93b4cb61eb	spirv: Allow OpPtrAccessChain for block indices The SPIR-V spec is a bit underspecified when it comes to exactly how you're allowed to use OpPtrAccessChain and what it means in certain edge cases. In particular, what if the base pointer of the OpPtrAccessChain points to the base struct of an SSBO instead of an element in that SSBO. The original variable pointers implementation in mesa assumed that you weren't allowed to do an OpPtrAccessChain that adjusted the block index and asserted such. However, there are some CTS tests that do this and, if the CTS does it, someone will do it in the wild so we should probably handle it. With this commit, we significantly reduce our assumptions and should be able to handle more-or-less anything. The one assumption we still make for correctness is that if we see an OpPtrAccessChain on a pointer to a struct decorated block that the block index should be adjusted. In theory, someone could try to put an array stride on such a pointer and try to make the SSBO an implicit array of the base struct and we would not give them what they want. That said, any index other than 0 would count as an out-of-bounds access which is invalid. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:54 -08:00
Jason Ekstrand	32c859125b	anv: Handle nir_intrinsic_vulkan_resource_reindex Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:54 -08:00
Jason Ekstrand	cfb81f58a0	nir: Add a vulkan_resource_reindex intrinsic This is required for being able to handle OpPtrAccessChain in SPIR-V where the base type of the incoming pointer requires us to add to the block index instead of the byte offset. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:54 -08:00
Jason Ekstrand	ae54a4f84f	spirv: Add support for lowering workgroup access to offsets Before, we always left workgroup variables as shared nir_variables and let the driver call nir_lower_io. This adds an option to do the lowering directly in spirv_to_nir. To do this, we implicitly assign the variables a std430 layout and then treat them like a UBO or SSBO and immediately lower all the way to an offset. As a side-effect, the spirv_to_nir pass now handles variable pointers for workgroup variables. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:54 -08:00
Jason Ekstrand	f6eb5ce39c	spirv: Rename get_shared_nir_atomic_op to get_var_nir_atomic_op Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:54 -08:00
Jason Ekstrand	992aabf239	spirv: Add theoretical support for single component pointers Up until now, all pointers have been ivec2s. We're about to add support for pointers to workgroup storage and those are going to be uints. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:54 -08:00
Jason Ekstrand	843c192e2b	spirv: Use offset_pointer_dereference to instead of get_vulkan_resource_index There is no good reason why we should have the same logic repeated in get_vulkan_resource_index and vtn_ssa_offset_pointer_dereference. If we're a bit more careful about how we do things, we can just use the one function and get rid of the other entirely. This also makes the push constant special case a lot more clear. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 22:01:53 -08:00
Jason Ekstrand	6dffef6308	spirv: Refactor a couple of pointer query helpers This commit moves them both into vtn_variables.c towards the top, makes them take a vtn_builder, and replaces a hand-rolled instance of is_external_block with a function call. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 20:56:16 -08:00
Jason Ekstrand	93646fb503	spirv: Refactor the base case of offset_pointer_dereference This makes us key off of !offset instead of !block_index. It also puts the guts inside a switch statement so that we can handle more than just UBOs and SSBOs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 20:56:14 -08:00
Jason Ekstrand	98edf6bca4	spirv: Add a switch statement for the block store opcode This parallels what we do for vtn_block_load except that we don't yet support anything except SSBO loads through this path. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 20:55:39 -08:00
Jason Ekstrand	91d91ce3e2	spirv: Use a dereference instead of vtn_variable_resource_index This is equivalent and means we don't have resource index code scattered about. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-05 20:55:37 -08:00
Brian Paul	dee936c805	mesa: add const qualifier on _mesa_is_renderable_texture_format() Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-05 15:32:25 -07:00
Brian Paul	ca78b6b4f4	mesa: add const qualifier on _mesa_base_fbo_format() Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-12-05 15:32:25 -07:00
Brian Paul	08ba4a103f	mesa: s/%u/%d/ in _mesa_error() call in check_layer() The layer parameter is signed. Fixes the error message seen when running the arb_texture_multisample-errors test which checks a negative layer value. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-05 15:32:25 -07:00
Brian Paul	323e6029a3	mesa: simplify/improve some _mesa_error() calls in teximage.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-05 15:32:25 -07:00
Brian Paul	726e495bbd	mesa: trivial whitespace fixes in transformfeedback.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-05 15:32:25 -07:00
Brian Paul	b2c3fd984a	mesa: add const qualifier in test_attachment_completeness() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-05 15:32:25 -07:00
Brian Paul	78dfbc30b6	st/mesa: remove unneeded #include in st_format.h Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-05 15:32:24 -07:00
Brian Paul	9c7840f93f	st/mesa: rename a few vars to 'bindings' To be consistent. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-05 15:32:24 -07:00
Brian Paul	067f6fc1af	st/mesa: whitespace fixes in st_format.c Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-05 15:32:24 -07:00
Rob Clark	e0c6769ef5	freedreno/a5xx: hide ARB_base_instance Grrr.. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-05 16:16:18 -05:00
Rob Clark	77c595358f	freedreno/ir3: handle input/output component After the mesa/st nir linking support, we start to see inputs/outputs like: decl_var shader_out INTERP_MODE_NONE float packed:uv (VARYING_SLOT_VAR9.x, 1, 0) decl_var shader_out INTERP_MODE_NONE float packed:uv@0 (VARYING_SLOT_VAR9.y, 1, 0) (i.e. where location_frac != .x) Unfortunately I overlooked the addition of the component parameter to load_input/store_output, so when we started encountering inputs/outputs with component other than .x, we'd end up loading/storing the wrong input/output. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-05 16:03:38 -05:00
Rob Clark	fd6a96635e	mesa/st: move cloning of NIR shader for compute Since in the NIR case, the driver takes ownership of the NIR shader, we need to clone what is passed to the driver. Normally this is done as part of creating the shader variant (where a clone is needed anyway). But compute shaders have no variants, so we were cloning earlier. The problem is that after the NIR linking optimizations, we ended up cloning before all the lowering passes were done. So move this into st_get_cp_variant(), to make compute shaders work more like other shader stages. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-05 16:03:38 -05:00
Dave Airlie	12a96aaf90	r600: refactor and export some shader selector code for compute This just moves some code around to make it easier to add compute. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 20:31:50 +00:00
Dave Airlie	ca64281690	r600: add compute support to compressed resource handling. This just adds support for decompressing compute resources. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 20:31:48 +00:00
Dave Airlie	5c78d000e6	r600: update max threads per block for evergreen compute Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 20:31:37 +00:00
Dave Airlie	5f15d35efc	r600/shader: add local memory support to shader assembler. This is needed for compute shaders. v1.1: make work for vectors, fix missing lds ops. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 20:31:34 +00:00
Dave Airlie	dd3630f71c	r600/cs: add support for compute to image/buffers/atomics state This just adds the compute paths to state handling for the main objects Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 20:31:24 +00:00
Dave Airlie	84feb6c24a	r600: handle compute null key shader state Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 20:31:13 +00:00
Dave Airlie	9b8b78b457	r600: add some missing cayman register defines These are just taken from the kernel, and were seen in some fglrx dumps. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 20:09:43 +00:00
Dave Airlie	3a403a9797	r600: don't set EOP on pop or loop end This appears to be bad, compute shaders hang without it. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 20:09:33 +00:00
Dave Airlie	2fdc21bcab	r600/ssbo: refactor out buffer coord calcs and use for atomic path. The atomic rat path has a bug in the ssbo path, refactor out the address calcs from the load/store paths and reuse to fix the bug in the buffer rat atomic path. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 20:07:08 +00:00
Dave Airlie	a256506b76	r600/ssbo: fix multi-dword buffer loads. This fixes loading from different channels. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 20:07:08 +00:00
Dave Airlie	989697eccc	r600/ssbo: use r32ui format for ssbo resources. This works best for returning the correct values and sizes in tests. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 20:07:08 +00:00
Dave Airlie	275293b2b4	r600: refactor out the immediate setup code. This just refactors the same code out of the images/buffers paths. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 20:07:07 +00:00
Dave Airlie	df21cd5248	r600/shader: fix ssbo atomic operations formats. Don't try and use the image format for ssbo, just 32-bit uint. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 20:07:07 +00:00
Dave Airlie	4ee2b7c452	r600/shader: fix thread id loading. This just changes how thread id loading is done, it makes smaller shaders if we don't use thread id gprs. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 20:07:07 +00:00
Rob Herring	20d37da597	Android: enable noreturn and returns_nonnull attributes Commit `94ca8e04ad` ("spirv: Add vtn_fail and vtn_assert helpers") broke Android builds which have -Werror enabled with the following errors: external/mesa3d/src/compiler/spirv/spirv_to_nir.c:272:1: error: control may reach end of non-void function [-Werror,-Wreturn-type] external/mesa3d/src/compiler/spirv/spirv_to_nir.c:810:1: error: control may reach end of non-void function [-Werror,-Wreturn-type] ... The problem is the noreturn attribute is not enabled and we need to define HAVE_FUNC_ATTRIBUTE_NORETURN. Auditing src/util/macros.h, we're also missing HAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL and HAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT, so add them too. Fixes: `94ca8e04ad` ("spirv: Add vtn_fail and vtn_assert helpers") Cc: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Rob Herring <robh@kernel.org>	2017-12-05 07:47:04 -06:00
Marek Olšák	dbad0acfaf	gallium/u_upload_mgr: allow drivers to specify pipe_resource::flags Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-05 13:30:35 +01:00
Marek Olšák	c7f84f6513	winsys/amdgpu: add RADEON_FLAG_READ_ONLY Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-05 13:30:34 +01:00
Marek Olšák	cccf09677f	gallium/radeon: remove RADEON_HEAP_VRAM_GTT Only winsyses can set VRAM\|GTT. Drivers shouldn't if they want to use winsys allocators. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-05 13:30:34 +01:00
Marek Olšák	9ac5504df5	gallium/radeon: move setting VRAM\|GTT into winsyses The combined VRAM\|GTT heap will be removed. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-05 13:30:34 +01:00
Marek Olšák	5e805cc74b	radeonsi: flush the context after resource_copy_region for buffer exports Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-05 13:28:00 +01:00
Mauro Rossi	cd8554502e	Android: gallium/radeon: fix libmesa_amd_common dependency libmesa_amd_common static dependency is added in Android build to avoid the following building errors: In file included from external/mesa/src/gallium/drivers/radeon/r600_buffer_common.c:24: In file included from external/mesa/src/gallium/drivers/radeonsi/si_pipe.h:26: external/mesa/src/gallium/drivers/radeonsi/si_shader.h:138:10: fatal error: 'ac_binary.h' file not found ^~~~~~~~~~~~~ 1 error generated. ... In file included from external/mesa/src/gallium/drivers/radeon/r600_gpu_load.c:34: In file included from external/mesa/src/gallium/drivers/radeonsi/si_pipe.h:26: external/mesa/src/gallium/drivers/radeonsi/si_shader.h:138:10: fatal error: 'ac_binary.h' file not found ^~~~~~~~~~~~~ 1 error generated. Fixes: `950221f923` ("radeonsi: remove r600_common_screen") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-05 12:05:21 +00:00
Dave Airlie	b501ef164e	st/mesa: handle compute atomics Just reuse the cs atomics bit and emit the hw atomic state. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 10:38:35 +00:00
Dave Airlie	05f594f229	r600/atomic: add cayman version of atomic save/restore from GDS (v2) On Cayman we don't use the append/consume counters (fglrx doesn't) and they don't seem to work well with compute shaders. This just uses GDS instead to do the atomic operations. v1.1: remove unused line. v2: use EOS on cayman, it appears to work. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 10:38:07 +00:00
Dave Airlie	cf6d3caee2	r600/atomic: refactor out evergreen atomic setup/save code. For cayman we want to use different code paths. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-05 10:38:04 +00:00
Timothy Arceri	e9e6476ae5	radeonsi: pass llvm type directly to buffer_load() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-05 15:15:36 +11:00
Dylan Baker	6b4c7047d5	meson: build gallium nine state_tracker v2: - set d3d_drivers_path instead of dri_drivers_path - Fix nine guard to check for all relevant gallium drivers - Link with libswdri and libswkmsdri when necessary - Fix pkg-config generation - Add missing comma Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 14:36:58 -08:00
Dylan Baker	0ba909f0f1	meson: build gallium xa state tracker Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 14:36:56 -08:00
Dylan Baker	5a785d51a6	meson: build gallium va state tracker Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 14:36:53 -08:00
Dylan Baker	1d36dc674d	meson: build gallium omx state tracker Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 14:36:51 -08:00
Dylan Baker	22a817af8a	meson: build gallium xvmc state tracker Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 14:36:48 -08:00
Dylan Baker	68076b8747	meson: build gallium vdpau state tracker Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 14:36:38 -08:00
Dylan Baker	085070a2c8	meson: drop gallium-media argument This argument is the wrong approach for handling gallium media state trackers, since it doesn't allow for an auto option. Instead we'll use tristates, which do allow for auto. This option has never been wired to anything anyway. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 14:36:35 -08:00
Dylan Baker	f7f1b30f81	meson: extend install_megadrivers script to handle symlinking Which is required for the gallium media state trackers. v2: - Make symlinks local instead of absolute Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 14:36:26 -08:00
Dylan Baker	b065de05c6	meson: Add osmesa.sym script as a link dependency (gallium-osmesa) v2: - Add this patch Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 14:36:19 -08:00
Dylan Baker	0cb6d69a72	meson: use driver_deps for gallium osmesa v2: - Put driver_swrast in the correct field (dependencies) - Remove unused osmesa_deps Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 14:36:10 -08:00
Dylan Baker	60283769ec	meson: Use driver dependencies for libgl-xlib target v2: - put driver_swrast in the right field - add dep_threads (dep_llvm requires threads, so it masked this previously) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 14:35:48 -08:00
Dylan Baker	95a791f63e	meson: use the driver dependencies for the gallium dri target Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 14:35:43 -08:00
Dylan Baker	0bbecc5a85	meson: define driver dependencies This allows us to encapsulate the compiler and linkage requirements of each driver in a reusable way. The result will be that each target that needs a specific driver can simply add `driver_<name>` to its dependencies line and the necessary libraries and compiler args will be added. This will allow for a lot of code de-duplication between gallium targets. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 14:35:36 -08:00
Dylan Baker	831d2fb012	meson: sort gallium drivers after winsys This is a requirement of the next patch. Since meson does not have forward declarations, and we're going to define the driver dependencies in the drivers folder they need to be after the winsys so that the winsys libs are defined first. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 14:35:31 -08:00
Dylan Baker	383cdaf990	meson: Combine gallium target subdirs So that state trackers, targets, and special winsys requirements are all in a single if statement. This is a cosmetic only cleanup with no functional changes. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 14:35:03 -08:00
Nanley Chery	0a257b3fe4	i965/cnl: Avoid fast-clearing sRGB render buffers Gen10 doesn't automatically decode the clear color of sRGB buffers. To get correct rendering, avoid fast-clearing such buffers for now. The driver now passes the following piglit tests: * spec@arb_framebuffer_srgb@msaa-fast-clear * spec@ext_texture_srgb@multisample-fast-clear gl_ext_texture_srgb Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-04 14:21:47 -08:00
Dylan Baker	6a9611763b	meson: Fix overlinkage of dri3 loader This was covering for underlinkage elsewhere. With that fixed these can be removed. v2: - sort dependencies Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-04 13:15:13 -08:00
Dylan Baker	d2e9ba2782	meson: fix underlinkage without dri3 There are some cases where the dri3 loader is covering for underlinkage for GLX and EGL; provide the linkage that they actually need. v2: - remove dep_xcb_dri3 from glx. This was an oversight in v1 and is not needed. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-04 13:15:02 -08:00
Dylan Baker	b08a35b150	meson: Reformat glx code to match more common style Generally in our meson build large arrays are formated in the form: [ ..., ..., ..., $ ..., ] So use that form Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-04 13:14:00 -08:00
Samuel Pitoiset	5de7c782fb	radv: fix a crash in radv_can_dump_shader() module can be NULL, oops. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-04 19:52:14 +01:00
Chad Versace	a932aee7a8	intel/isl: Declare private array as static const It's array isl_drm.c:modifier_info[] . Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-12-04 10:16:33 -08:00
Dylan Baker	2be2565b9e	meson: Install dri.pc file when building gallium dri drivers Currently this pkg-config file is only installed if a classic dri driver is built. This is wrong, it should be installed if any dri driver is installed, which includes the gallium dri target. Reported-by: Marc Dietrich <marvin24@gmx.de> Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-04 10:14:09 -08:00
Lionel Landwerlin	2ead8f1690	anv: query CS timestamp frequency from the kernel The reference value in gen_device_info isn't going to be accurate on Gen10+. We should query it from the kernel, which reads a couple of registers to compute the actual value. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2017-12-04 18:05:20 +00:00
Lionel Landwerlin	b66e4b51bf	i965: read CS timestamp frequency from the kernel on Gen10+ We cannot figure this value out of the PCI-id anymore. Let's read it from the kernel (which computes this from a few registers). When running on a (upcoming) 4.16-rc1+ kernel, this will fix piglit tests on CNL : spec@arb_timer_query@query gl_timestamp spec@arb_timer_query@timestamp-get spec@ext_timer_query@time-elapsed Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2017-12-04 18:05:20 +00:00
Lionel Landwerlin	aa8a2a8670	drm-uapi: Update drm/i915 headers from drm-next Taken from drm-next ca797d29cd63e7b71b4eea29aff3b1cefd1ecb59 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2017-12-04 18:05:20 +00:00
Jason Ekstrand	1e565bc6ce	radv: Implement VK_KHR_get_surface_capabilities2 The WSI core code does all the hard work. Just add the wrappers and turn it on. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	0a10e3770f	vulkan/wsi: Initialize individual WSI interfaces in wsi_device_init Now that we have anv_device_init/finish functions, there's no reason to have the individual driver do any more work than that. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	2e3e55110b	vulkan/wsi: Drop some unneeded cruft from the API This drops the unneeded callbacks struct as well as the queue_get_family callback we were using before we'd pulled QueuePresent inside. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	c1b1be5196	vulkan/wsi: Add wrappers for all of the surface queries This lets us move wsi_interface to wsi_common_private.h Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	82931dc007	vulkan/wsi: Drop the can_handle_different_gpu parameter from get_support Both anv and radv can handle prime now. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	3131fd9dec	vulkan/wsi: Move wsi_swapchain to wsi_common_private.h The drivers no longer poke at this directly. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	516dfb34e1	vulkan/wsi: Add a helper for AcquireNextImage Unfortunately, due to the fact that AcquireNextImage does not take a queue, the ANV trick for triggering the fence won't work in general. We leave dealing with the fence up to the caller for now. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Dave Airlie	8ff49951c3	vulkan/wsi: move swapchain create/destroy to common code v2 (Jason Ekstrand): - Rebase - Alter the names of the helpers to better match the vulkan entrypoints - Use the helpers in anv Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	ad4c60d6b8	vulkan/wsi: Move prime blitting into queue_present This lets us save a QueueSubmit and it also makes prime a lot less X11-specific. Also, it means we can only wait on the semaphores once instead of on every blit. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	393aa3f6c9	vulkan/wsi: Move get_images into common code This moves bits out of all four corners (anv, radv, x11, wayland) and into the wsi common code. We also switch to using an outarray to ensure we get our return code right. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	1117f843fe	anv/wsi: Enable prime support Now that we're using the same common code as radv, we get prime support for free. Just enable it. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	ac95335b61	anv/wsi: Use the common QueuePresent code Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	d25a0f21e1	vulkan/wsi: Set a proper pWaitDstStageMask on the dummy submit Neither mesa driver really cares, but we should set it none the less for the sake of correctness. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	59e58c348e	vulkan/wsi: Only wait on semaphores on the first swapchain Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	b91a1953e8	vulkan/wsi: Refactor result handling in queue_present Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Dave Airlie	6dc3a5e8f0	radv/wsi: Move the guts of QueuePresent to wsi common v2 (Jason Ekstrand): - Better commit message - Rebase - Re-indent to follow wsi_common style - Drop the unneeded _swapchain from the newly added helper - Make the clone more true to the original (as per the rebase) Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	42dd06d957	vulkan/wsi: Add a WSI_FROM_HANDLE macro Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Dave Airlie	69365d72de	radv/wsi: drop allocate memory special case Just check if image has scanout flag set v2 (Jason Ekstrand): - Rebase - Also drop the now unused radv_mem_flag_bits enum Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	e12688f365	vulkan/wsi: Do image creation in common code This uses the mock extension created in a previous commit to tell the driver that the image it's just been asked to create is, in fact, a window system image with whatever assumptions that implies. There was a lot of redundant code between the two drivers to do basically exactly the same thing. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	d50937f137	vulkan/wsi: Implement prime in a completely generic way Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	df4fc68492	radv: Move wsi initialization later in physical_device_init We need it to happen after memory type setup so that we can query memory types in wsi_device_init. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	a50f93ecfb	radv/image: Implement the wsi "extension" Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	3dabb4011f	anv/image: Implement the wsi "extension" Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	a44744e01d	anv: Require a dedicated allocation for modified images This lets us set the BO tiling when we allocate the memory. This is required for GL to work properly. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	7d19e570e1	anv/image: Add a drm_format_mod field At the moment, this is always initialized to DRM_FORMAT_MOD_INVALID. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	546e747867	radv: Implement VK_EXT_external_memory_dma_buf Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	ab18e8e59b	anv: Implement VK_EXT_external_memory_dma_buf This is a modified version of the patch originally sent by Chad Versace. The primary difference is that this version claims that OPAQUE_FD and DMA_BUF are compatible handle types. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	66dc618215	vulkan/wsi: Add a mock image creation extension Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	cd881dafad	vulkan/wsi: Add wsi_swapchain_init/finish functions Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	764fc1643c	vulkan/wsi: Add a wsi_device_init function This gives the opportunity to collect some function pointers if we'd like which will be very useful in future. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Jason Ekstrand	3991098f3b	vulkan/wsi/x11: Handle the geometry check earlier in create_swapchain This fixes a potential leak if allocating the swapchain fails. Since geometry checking and bit-depth fetching is self-contained, it makes sense to just do it first so we can delete the geometry reply. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Daniel Stone	c1163f7b1c	vulkan/wsi: Add a wsi_image structure This is used to hold information about the allocated image, rather than an ever-growing function argument list. v2 (Jason Ekstrand): - Rename wsi_image_base to wsi_image Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-12-04 10:04:19 -08:00
Dave Airlie	2cbeb32555	vulkan/wsi: use function ptr definitions from the spec. This just seems cleaner, and we may expand this in future. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-12-04 10:04:19 -08:00
Kenneth Graunke	55a97db523	i965: Emit CS stall before MEDIA_VFE_STATE. This fixes hangs on GFXBench 5's Aztec Ruins benchmark. Unfortunately, it regresses OglCSCloth performance by about 10%. There are some ideas for fixing that. The Vulkan driver already emits this stall. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-12-04 10:02:46 -08:00
Kenneth Graunke	bfe0f3a702	i965: Move PIPE_CONTROL defines and prototypes to brw_pipe_control.h. We need to be able to emit PIPE_CONTROLs from genX_state_upload.c, which can't safely include brw_defines.h because it conflicts with genxml. Move all the PIPE_CONTROL related stuff together into a separate header. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-12-04 10:02:46 -08:00
Jason Ekstrand	d74b1f4809	spirv: Replace unreachable with vtn_fail Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2017-12-04 09:21:09 -08:00
Jason Ekstrand	b7ef60d846	spirv: Replace assert with vtn_assert Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2017-12-04 09:21:09 -08:00
Jason Ekstrand	94ca8e04ad	spirv: Add vtn_fail and vtn_assert helpers These helpers are much nicer than just using assert because they don't kill your process. Instead, it longjmps back to spirv_to_nir(), cleans up all the temporary memory, and nicely returns NULL. While crashing is completely OK in the Vulkan world, it's not considered to be quite so nice in GL. This should help us to make SPIR-V parsing much more robust. The one downside here is that vtn_assert is not compiled out in release builds like assert() is so it isn't free. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2017-12-04 09:21:09 -08:00
Jason Ekstrand	0c49aa0624	util: Add a NORETURN macro Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2017-12-04 09:21:09 -08:00
Jason Ekstrand	591a07632c	spirv: Do something useful with OpSource We may as well log the source language and file name. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2017-12-04 09:21:09 -08:00
Jason Ekstrand	16dfdeefc8	spirv: Rework logging This commit reworks the way that logging works in SPIR-V to provide richer and more detailed logging infrastructure. This commit contains several improvements over the old mechanism: 1) Log messages are now more detailed. They contain the SPIR-V byte offset as well as source language information from OpSource and OpLine. 2) There is now a logging callback mechanism so that errors can get propagated to the client through debug callback extensions. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2017-12-04 09:21:09 -08:00
Jason Ekstrand	11bd753c4e	spirv: Re-arrange vtn_builder initialization This simply moves allocating the vtn_builder and initializing it to the very beginning before we even parse the header. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2017-12-04 09:21:09 -08:00
Jason Ekstrand	d74bec1a54	spirv: Parent the nir_shader to the builder while building Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2017-12-04 09:21:09 -08:00
Rob Clark	1ec1ae47f7	freedreno: mark stencil buffer valid too in case of z32x24s8 The separate stencil buffer was not also getting marked as valid if written by a draw/clear, resulting in gmem2mem getting skipped. Move this into fd_batch_resource_used() which also handles the separate stencil case. Also fix restore_buffers typo. Fixes: `4ab6ab8036` freedreno: avoid mem2gmem for invalidated buffers Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-04 11:50:45 -05:00
Rob Clark	e90f1a26c3	freedreno: remove use of u_transfer Freedreno doesn't treat buffers and images differently, so its use was kind of pointless. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-04 11:50:45 -05:00
Eric Engestrom	7c3f958d23	freedreno: add -Wno-packed-bitfield-compat for meson build Otherwise huge amount of spam from instr-a2xx.h.. gcc has no way to know that freedreno was never built with such an old gcc version to care about the bugs in old gcc ;-) Reported-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> [added commit message] Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-04 11:50:45 -05:00
Samuel Iglesias Gonsálvez	fa8c1b92b7	glsl: don't run intrastage array validation when the interface type is not an array We validate that the interface block array type's definition matches. However, previously, the function could be called if a non-array interface block has different type definitions -for example, when the precision qualifier differs in a GLSL ES shader, we would create two different types-, and it would return invalid as both definitions are non-arrays. We fix this by specifying that at least one definition should be an array to call the validation. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:32:57 +01:00
Samuel Iglesias Gonsálvez	fc6d55952d	glsl/es: precision qualifier doesn't need to match in UBOs They might mismatch due to the two shaders using different GLSL versions, and that's ok in desktop GL. In ES, precision qualifiers don't need to match. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:32:57 +01:00
Pierre Moreau	9bee12160b	nvc0/ir: Properly lower 64-bit shifts when the shift value is >32 Fixes: `61d7676df7` "nvc0/ir: add support for 64-bit shift lowering on SM20/SM30" Fixes fs-shift-scalar-by-scalar.shader_test from piglit for the current set-up: uniform int64_t ival -0x7dfcfefbdf6536ff # bit pattern: 0x82030104209ac901 uniform uint64_t uval 0x1400000085010203 uniform int shl 36 uniform int shr 36 uniform int64_t iexpected_shl 0x09ac901000000000 uniform int64_t iexpected_shr -0x7dfcff0 # bit pattern: 0xfffffffff8203010 uniform uint64_t uexpected_shl 0x5010203000000000 uniform uint64_t uexpected_shr 0x0000000001400000 draw rect ortho 12 0 4 4 Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-12-04 01:03:47 -05:00
Fabian Bieler	9bdb5457f4	glsl: Match order of gl_LightSourceParameters elements. spotExponent and spotCosCutoff were swapped in the gl_builtin_uniform_element struct. Now the order matches across gl_builtin_uniform_element, glsl_struct_field and the spec. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-12-03 21:14:14 -07:00
Fabian Bieler	c3ee464d7a	glsl: Fix gl_NormalScale. GLSL shaders can access the normal scale factor with the built-in gl_NormalScale. Mesa's modelspace lighting optimization uses a different normal scale factor than defined in the spec. We have to take care not to use this factor for gl_NormalScale. Mesa already defines two separate states: state.normalScale and state.internal.normalScale. The first is used by the glsl compiler while the latter is used by the fixed function T&L pipeline. Previously the only difference was some component swizzling. With this commit state.normalScale always uses the normal scale factor for eyespace lighting. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-12-03 21:13:46 -07:00
Timothy Arceri	27888977c1	st/glsl_to_nir/radeonsi: enable gs support for nir backend Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:19 +11:00
Timothy Arceri	ccd1810bba	ac: add si_nir_load_input_gs() to the abi V2: make use of driver_location and don't expose NIR to the ABI. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:19 +11:00
Timothy Arceri	caf15ce670	ac: move build_varying_gather_values() to ac_llvm_build.h and expose Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:19 +11:00
Timothy Arceri	6fd6cb6616	ac: add basic nir -> llvm type helper Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	4184e7c417	radeonsi: create si_llvm_load_input_gs() This creates a common function that can be shared by the tgsi and nir backends. v2: use LLVMBuildBitCast() directly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	c4c8df94bd	radeonsi: pass llvm type to lds_load() v2: use LLVMBuildBitCast() directly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	650126f3e0	radeonsi: add llvm_type_is_64bit() helper Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	7ef1e42c14	radeonsi: pass llvm type to si_llvm_emit_fetch_64bit() v2: use LLVMBuildBitCast() directly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	e51ecbe980	radeonsi: add nir support for gs epilogue v2: add emit_gs_epilogue() helper function to reduce duplication. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	73918b3172	radeonsi: add nir support for es epilogue v2: make use of existing si_tgsi_emit_epilogue() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	204f547852	radeonsi: add nir support for ls epilogue v2: make use of existing si_tgsi_emit_epilogue() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	164b6d4aeb	st/glsl_to_nir: add gs support to st_nir_assign_var_locations() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	c86baf71fb	st/glsl_to_nir: use nir_lower_io_arrays_to_elements() to lower arrays This pass is more fully featured, it supports geom and tess shaders. It also supports interpolation intrinsics. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	d99c7e0ff1	nir: allow builtin arrays to be lowered Gallium's nir drivers expect this to be done. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	2bc49ac3e6	nir: add array lowering function that assumes there are no indirects The gallium glsl->nir pass currently lowers away all indirects on both inputs and outputs. This function allows us to lower vs inputs and fs outputs and also lower things one stage at a time as we don't need to worry about indirects on the other side of the shader's interface. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	f13790c92f	radv: enable nir varying array splitting Acked-by: Dave Airlie <airlied@redhat.com>	2017-12-04 12:52:18 +11:00
Timothy Arceri	6648bd68fd	st/glsl_to_nir: enable NIR link time opts Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Timothy Arceri	c16a0e11d3	radeonsi/nir: add support for packed inputs Because NIR can create non vec4 variables when implementing component packing we need to make sure not to reprocess the same slot again. Also we can drop the fs_attr_idx counter and just use driver_location. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Timothy Arceri	c3a5d74377	st/glsl_to_nir: move some calls out of st_glsl_to_nir_post_opts() NIR component packing will be inserted between these calls and the calling of st_glsl_to_nir_post_opts(). Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Timothy Arceri	90abaf8a21	st/glsl_to_nir: call some lowering passes earlier This is required so that we can enable NIR linking optimisations. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Timothy Arceri	bd98b8c74e	st/glsl_to_nir: add basic NIR opt loop helper We need to be able to do these NIR opts in the state tracker rather than the driver in order for the NIR linking opts to be useful. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Timothy Arceri	a9ac01b96f	st/glsl_to_nir: make st_glsl_to_nir() static Here we also move the extern C functions to the bottom of the file. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Timothy Arceri	d586f39cb0	st/glsl_to_nir: split the st_glsl_to_nir() function in two We want to be able to generate NIR then apply NIR optimisations. Once the optimisations are done we can then apply the new post opt function which assigns uniforms etc based on the optimised IR. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Timothy Arceri	d38f99baec	st/glsl_to_nir: create set_st_program() helper Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Timothy Arceri	da953b641d	st/glsl: move nir linking loop to new function st_link_nir() This will allow us to refactor linking and include some nir link time optimisations. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Timothy Arceri	2a35021bc6	nir: fix support for scalar arrays in nir_lower_io_types() This was just recreating the same vector type we already had and hitting an assert for scalars. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Timothy Arceri	9530b786d2	st/glsl_to_nir: add st_nir_assign_var_locations() helper This avoids packed varyings being assigned different driver locations. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Timothy Arceri	aecb9bec87	radv: enable nir component packing SaschaWillems Vulkan demo tessellation: ~4000fps -> ~4600fps Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-04 09:10:30 +11:00
Timothy Arceri	1c9c42d16b	nir: add varying component packing helpers v2: update shader info input/output masks when packing components v3: make sure interpolation loc matches, this is required for the radeonsi NIR backend. v4: `33dca36f4f` fixed nir_gather_info to update outputs_read correctly, make sure we also adjust this correctly when packing components. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)	2017-12-04 09:10:30 +11:00
Timothy Arceri	c797bc6aa7	nir: add varying array splitting pass V2: - fix matrix support, non-array matrices were being skipped in v1 v3: - handle lowering of tcs output loads correctly - correctly mark indirect locations for either in or out not both when processing a stage. - use nir_src_copy() when lowering stores. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-04 09:10:30 +11:00
Rob Clark	11efe42a73	freedreno/ir3: relax barriers Instructions with no barrier_class can move wrt. an EVERYTHING barrier. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-03 14:17:41 -05:00
Rob Clark	48eef0c182	freedreno/ir3: all mem instructions have WAR hazard It isn't just load instructions that have write-after-read hazard. Fixes stk gaussian blur compute shaders. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-03 14:17:41 -05:00
Rob Clark	e6c6495d3a	freedreno: add debug option to force emulated indirect Useful mostly for debugging indirect draw. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-03 14:17:41 -05:00
Rob Clark	f93f2f7b1e	freedreno: also mark draw-indirect buffer as read Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-03 14:17:41 -05:00
Rob Clark	4b1d0d2844	freedreno: small cleanups Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-03 14:17:41 -05:00
Rob Clark	91730fb0ff	freedreno: avoid unnecessary batch flush In some cases we can end up trying to add a write dependency on ourself, which shouldn't trigger a flush. Avoids an extra couple flushes per frame in stk. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-03 14:17:41 -05:00
Rob Clark	4ab6ab8036	freedreno: avoid mem2gmem for invalidated buffers Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-03 14:17:41 -05:00
Rob Clark	2fcf6faa06	freedreno: deferred flush support Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-03 14:17:41 -05:00
Rob Clark	15ebf387fc	freedreno: rework fence tracking ctx->last_fence isn't such a terribly clever idea, if batches can be flushed out of order. Instead, each batch now holds a fence, which is created before the batch is flushed (useful for next patch), that later gets populated after the batch is actually flushed. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-03 14:17:40 -05:00
Rob Clark	deb57fb237	freedreno: proper locking for iterating dependent batches In transfer_map(), when we need to flush batches that read from a resource, we should be holding screen->lock to guard against race conditions. Somehow deferred flush seems to make this existing race more obvious. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-03 14:17:40 -05:00
Rob Clark	ef6313ffd3	freedreno/a5xx: correct max_indicies for indirect draws Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-12-03 14:17:40 -05:00
Jason Ekstrand	e19c623128	spirv: Convert the supported_extensions struct to spirv_options This is a bit more general and lets us pass additional options into the spirv_to_nir pass beyond what capabilities we support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-02 08:09:11 -08:00
Jason Ekstrand	6bd876dcaa	spirv: Only emit functions which are actually used Instead of emitting absolutely everything, just emit the few functions that are actually referenced in some way by the entrypoint. This should save us quite a bit of time when handed large shader modules containing many entrypoints. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-02 08:07:35 -08:00
Jason Ekstrand	f5aad36d2e	spirv: Drop the impl field from vtn_builder We have a nir_builder and it has an impl field. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2017-12-02 08:07:35 -08:00
Jordan Justen	fc033742d2	i965: Serialize nir later in the linking process Fixes MESA_GLSL=cache_fb with piglit tests/spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out.shader_test Fixes: `0610a624a1` i965/link: Serialize program to nir after linking for shader cache Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103988 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-12-01 23:17:44 -08:00
Marc Dietrich	d93fabb013	configure: avoid testing for negative compiler options gcc seems to always accept unsupported negative compiler warning options: echo "int i;" | gcc -c -xc -Wno-bob - # no error echo "int i;" | gcc -c -xc -Walice - # unsupported compiler option Inverting the options fixes the tests. V2: fix options in meson build Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Marc Dietrich <marvin24@gmx.de>	2017-12-01 17:09:42 -08:00
Eric Anholt	0ed952c7e9	broadcom/vc4: Use a single-entry cached last_hindex value. Since almost all BOs will be in one CL at a time, this cache will almost always hit except for the first usage of the BO in each CL. This didn't show up as statistically significant on the minetest trace (n=340), but if I lop off the throttled lobe of the bimodal distribution, it very clearly does (0.74731% +/- 0.162093%, n=269).	2017-12-01 15:37:28 -08:00
Eric Anholt	230e646a40	broadcom/vc4: Decompose single QUADs to a TRIANGLE_FAN. No significant difference in the minetest replay, but it should reduce overhead by not requiring that we write quad indices to index buffers that we repeatedly re-upload (and making the draw packet smaller, as well). Over the course of the series the actual game seems to be up by 1-2 fps.	2017-12-01 15:37:28 -08:00
Eric Anholt	fefff74b0d	broadcom/vc4: Use the new enum functionality of the XML to decode better.	2017-12-01 15:37:28 -08:00
Eric Anholt	5167367050	broadcom/vc4: Skip emitting redundant VC4_PACKET_GEM_HANDLES. Now that there's only one user of it, it's pretty obvious how to avoid emitting redundant ones. This should save a bunch of kernel validation overhead. No statistically significant difference on the minetest trace I was looking at (n=169), but the maximum FPS is up by .3%	2017-12-01 15:37:28 -08:00
Eric Anholt	842b05d6ad	broadcom/vc4: Simplify the relocation handling for index buffers. Originally there was CL code for handling various relocations back when I had relocs for the TSDA/TA buffers. Now that the kernel handles those entirely on its own, I can inline that code into the one place using it.	2017-12-01 15:37:28 -08:00
Eric Anholt	84ab48c15c	broadcom/vc4: Fix handling of GFXH-515 workaround with a start vertex count. We failed to take the start into account for how many vertices to draw in this round, so we would end up decrementing count below 0, which as an unsigned number meant we would loop until the CLs soon ran out of space. When I wrote the code I was thinking about how to use the previously emitted shader state (no index bias baked into the elements) by emitting up to 65535 and then only re-emitting with bias for the second round, but that doesn't work if the start is over 65535. Instead, just delay emitting shader state until we get into the drawarrays GFXH-515 loop and always bake the bias in when we're doing the workaround.	2017-12-01 15:37:28 -08:00
Eric Anholt	bcb6ebe91a	broadcom/vc4: Fix the scaling factor for the GFXH-515 workaround. For triangle strips, we step by max_verts - 2.	2017-12-01 15:37:28 -08:00
Dylan Baker	f56e964e01	meson: use dep_thread instead of dependency('threads') in freedreno They are the same thing, but this is more consistent with the rest of the project. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-01 15:31:43 -08:00
Dylan Baker	5e71efef44	meson: Add lmsensors support v2: - Make -Dlmsensors=false work - Simplify auto and true cases Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-01 15:31:43 -08:00
Dylan Baker	7309207432	meson: Add support for gallium extra hud Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-01 15:31:43 -08:00
Adam Jackson	a48a6b8a40	glx: Prepare driFetchDrawable for no-config contexts When we look up the DRI drawable state we need to associate an fbconfig with the drawable. With GLX_EXT_no_config_context we can no longer infer that from the context and must instead query the server. Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-01 15:53:52 -05:00
Adam Jackson	75d5d22fb7	glx: Use __glXSendError instead of open-coding it This also fixes a bug, the error path through MakeCurrent didn't translate the error code by the extension's error base. Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-01 15:46:46 -05:00
Adam Jackson	bcb15bee52	glx: Simplify some dummy vtable interactions The dummy vtable has these slots as NULL already, no need to check for the dummy context explicitly. Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-12-01 15:46:46 -05:00
Emil Velikov	8893418e99	docs/release-calendar: update and extend v2: Missing td tag, add Andres + Juan for 17.2.8 and 17.3.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1) Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-12-01 19:30:23 +00:00
Emil Velikov	8d58e9b2cf	docs/specs: annotate MESA_set_3dfx_mode as obsolete Aimed to work with Glide, which hasn't been a thing in over 10 years. There are no drivers that implement it, so annotate it as obsolete v2: Move the extension to OLD/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Adam Jackson <ajax@redhat.com> (v1) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-01 19:30:23 +00:00
Emil Velikov	f8aea0ce47	xlib: remove dummy GLX_MESA_set_3dfx_mode implementation The implementation is a simple 'return EGL_FALSE'. Stop pretending and simply remove it. Note: the removal of XMesa API is fine, since there hasn't been any users for it in years. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-01 19:30:23 +00:00
Emil Velikov	7a4107291d	docs/specs: annotate MESA_agp_offset as obsolete No Mesa driver has implemented the extension in ages. Seemingly non Mesa drivers don't implement it either. As mentioned by Ian, the extension is effectively superseded by ARB_vertex_buffer_object. v2: Move the extension to OLD/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Adam Jackson <ajax@redhat.com> (v1) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-01 19:30:23 +00:00
Emil Velikov	bcf0ce4016	xlib: remove empty GLX_MESA_agp_offset stubs The extension was never implemented and seemingly never will. The DRI based libGL dropped support for it over 10 years ago. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-01 19:30:23 +00:00
Emil Velikov	b1e7386f1b	xlib: remove empty GLX_NV_vertex_array_range stubs The extension was never implemented and seemingly never will. The DRI based libGL dropped support for it over 10 years ago. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-12-01 19:30:23 +00:00
Rafael Antognolli	e20830db96	i965/gen10: Change the order of PIPE_CONTROL and load register. I believe the workaround describes that the MI_LOAD_REGISTER_IMM should come right after the 3DSTATE_SAMPLE_PATTERN. This fixes GPU hangs in the i965 initial state batchbuffer when running some Piglit tests with always_flush_batch=true. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-12-01 11:27:27 -08:00
Rafael Antognolli	2919adffe9	intel/compiler: Implement WaClearTDRRegBeforeEOTForNonPS. The bspec describes: "WA: Clear tdr register before send EOT in all non-PS shader kernels mov(8) tdr0:ud 0x0:ud {NoMask}" Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-12-01 11:27:27 -08:00
Rafael Antognolli	979fc1bc9b	i965/gen10: emit 3DSTATE_MULTISAMPLE more often. On CNL, we see multiple multisample failures on piglit tests. By emitting this extra state, though not documented in the bspec, those failures seem to go away. This workaround could be removed if we ever find out a better solution, but it should be good enough for now. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-12-01 11:27:19 -08:00
Dylan Baker	dbeb278e0d	meson: install khrplatform header for EGL as well as GLES Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-01 10:39:19 -08:00
Dylan Baker	91244db186	meson: install dri internal header Reported-by: Marc Dietrich <marvin24@gmx.de> Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-01 10:39:04 -08:00
Jason Ekstrand	ee57b15ec7	i965: Disable regular fast-clears (CCS_D) on gen9+ This partially reverts commit `3e57e9494c` which caused a bunch of GPU hangs on several Source titles. To date, we have no clue why these hangs are actually happening. This undoes the final effect of `3e57e9494c` and gets us back to not hanging. Tested with Team Fortress 2. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102435 Fixes: `3e57e9494c` Cc: mesa-stable@lists.freedesktop.org	2017-12-01 10:14:28 -08:00
Vadym Shovkoplias	a1b4f1877f	egl/x11: Remove unneeded free() on always null string In this condition dri2_dpy->driver_name string always equals NULL, so call to free() is useless Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-01 15:15:30 +00:00
Eric Engestrom	29ee934331	gallium/hud: use #ifdef to test for macro existence Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-01 13:49:42 +00:00
Eric Engestrom	13a7a2d455	amd: remove always-true BRAHMA_BUILD define Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-12-01 13:49:42 +00:00
Vadym Shovkoplias	d555929239	glx/dri3: Remove unused deviceName variable deviceName string is declared, assigned and freed but actually never used in dri3_create_screen() function. Fixes: `2d94601582` ("Add DRI3+Present loader") Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-12-01 13:49:42 +00:00
George Kyriazis	95adbe1a4e	swr/scons: Fix intermittent build failure gen_rasterizer*.cpp depends on gen_ar_eventhandler.hpp. Account for new dependency. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-12-01 07:47:13 -06:00
Samuel Pitoiset	80e6e71b82	radv: only reset command buffers when the allocation fails "vkAllocateCommandBuffers can be used to create multiple command buffers. If the creation of any of those command buffers fails, the implementation must destroy all successfully created command buffer objects from this command, set all entries of the pCommandBuffers array to NULL and return the error." This has been suggested by gabriel@system.is. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-01 11:38:34 +01:00
Samuel Pitoiset	921986b580	radv: do not dump meta shaders with RADV_DEBUG=shaders It's really annoying and this pollutes the output especially when a bunch of non-meta shaders are compiled. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-01 11:38:26 +01:00
Dave Airlie	4e7f6437b5	r600: add ARB_shader_storage_buffer_object support (v3) This just builds on the image support. Evergreen only has ssbo for fragment and compute no other stages. v2: handle images and ssbo in the same shader properly (Ilia) v3: fix RESQ on buffers, fix missing atom emit fix first element offset use R32 format write separate buffer rat store path. (from running deqp gles3.1 tests) Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-01 06:12:31 +00:00
Dave Airlie	c758fd05d8	r600/cayman: looks like cmpxchg moved to Z On cayman it appears the cmp component is now in Z. Fixes: arb_shader_image_load_store-dead-fragments on cayman. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-01 03:59:17 +00:00
Dave Airlie	4f3e73516c	r600/shader: fix 64->32 conversions These didn't handle the TGSI at all properly, this fixes them to use the common path for 64->32 then adds the 32->int on at the end. Fixes: generated_tests/spec/arb_gpu_shader_fp64/execution/conversion/* Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-01 03:48:35 +00:00
Samuel Pitoiset	ff0f17da14	radv: do not allocate CMASK or DCC for small surfaces The idea is ported from RadeonSI, but using 512x512 instead of 256x256 seems slightly better. This improves dota2 performance by +2%. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-11-30 21:38:30 +01:00
Samuel Pitoiset	f5955c6bf8	radv: do not set DISABLE_LSB_CEIL on GFX9 The state no longer exists on GFX9. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-30 21:38:01 +01:00
Samuel Pitoiset	319f56e675	radv: remove set but unnecessary radv_color_buffer_info::micro_tile_mode Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-30 21:38:00 +01:00
Samuel Pitoiset	4eab78b03c	radv: do not store gfx9_epitch in radv_color_buffer_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-30 21:37:58 +01:00
Dylan Baker	7776dc32eb	meson: fix glxext.h install Another typo, the glext.h header was being installed instead. Reported-by: Marc Dietrich <marvin24@gmx.de> Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-30 10:00:49 -08:00
Dylan Baker	a80a3e4cbb	meson: fix GLES3/gl31.h install This is a typo, gl32.h is installed twice. Reported-by: Marc Dietrich <marvin24@gmx.de> Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-30 10:00:49 -08:00
Marek Olšák	186adc514b	ac/surface: always compute DCC info when DCC is possible on GFX9 The same code for VI doesn't check for scanout either. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-30 18:46:11 +01:00
Marek Olšák	ed4780383c	radeonsi/gfx9: fix importing shared textures with DCC VI has 11 dwords at least. GFX9 has 10 dwords. Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-30 18:46:11 +01:00
Jon Turney	6f0ce2617e	meson: fix deps and underlinkage of libGL Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-30 15:09:21 +00:00
Jon Turney	5ef75cb02b	meson: build src/glx/windows Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-30 15:09:21 +00:00
Jon Turney	3ae998a743	meson: don't require dri2proto for darwin or windows Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-30 15:09:21 +00:00
Jon Turney	dbe36e3b17	meson: set _GNU_SOURCE on cygwin Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-30 15:09:21 +00:00
Jon Turney	9cdd41b18a	meson: set windows glx defines Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-30 15:09:21 +00:00
Dylan Baker	bb5d663b39	meson: fix generated source inclusion on macOS and Windows Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-30 15:09:21 +00:00
Vadym Shovkoplias	cdb3eb7174	intel/blorp: Fix possible NULL pointer dereferencing Fix incomplete check of input params in blorp_surf_convert_to_uncompressed() which can lead to NULL pointer dereferencing. Fixes: `5ae8043fed` ("intel/blorp: Add an entrypoint for doing bit-for-bit copies") Fixes: `f395d0abc8` ("intel/blorp: Internally expose surf_convert_to_uncompressed") Reviewed-by: Emil Velikov <emli.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-11-30 16:20:05 +02:00
Tapani Pälli	faccbaf3fa	mesa: add AllowGLSLCrossStageInterpolationMismatch workaround This fixes issues seen with certain versions of Unreal Engine 4 editor and games built with that using GLSL 4.30. v2: add driinfo_gallium change (Emil Velikov) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97852 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103801 Acked-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-30 11:43:10 +02:00
Vinson Lee	8c1e4b1afc	anv: Check if memfd_create is already defined. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103909 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-30 01:36:46 -08:00
Iago Toral Quiroga	8620f7ebbc	i965/vec4: use a temp register to compute offsets for pull loads 64-bit pull loads are implemented by emitting 2 separate 32-bit pull load messages, where the second message loads from an offset at +16B. That addition of 16B to the original offset should not alter the original offset register used as source for the pull load instruction though, since the compiler might use that same offset register in other instructions (for example, for other pull loads in the shader code that take that same offset as reference). If the pull load is 32-bit then we only need to emit one message and we don't need to do offset calculations, but in that case the optimizer should be able to drop the redundant MOV. Fixes the following test on Haswell: KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103007	2017-11-30 07:57:53 +01:00
Wladimir J. van der Laan	f1a9a724f9	etnaviv: GC7000: Factor out state based texture functionality Prepare for two texture handling paths, the descriptor-based path will be added in a future commit. These are structured so that the texture implementation handles its own state emission. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-30 07:33:20 +01:00
Wladimir J. van der Laan	075f8cd7de	etnaviv: GC7000: Move active_samplers_bits to texture This needs to be shared between texture_plain and texture_desc. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-30 07:33:16 +01:00
Wladimir J. van der Laan	260a5e2a1a	etnaviv: GC7000: Factor out incompatible texture handling logic This will be shared with the texture descriptor path. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-30 07:33:11 +01:00
Wladimir J. van der Laan	9d1f8805b0	etnaviv: GC7000: Track dirty sampler views Need this to efficiently emit texture descriptor invalidations. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-30 07:33:07 +01:00
Wladimir J. van der Laan	5cc36f9f21	etnaviv: GC7000: Make point sprites work on HALTI5 Track varying component offset of the point size output, as well as provide the offset of the point coord input. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-30 07:33:02 +01:00
Wladimir J. van der Laan	3d09bb390a	etnaviv: GC7000: State changes for HALTI3..5 Update state objects to add new state, and emit function to emit new state. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-30 07:32:33 +01:00
Wladimir J. van der Laan	acd3dff463	etnaviv: GC7000: Update screen specs for HALTI5 - This core must load shaders from memory (AFAIK) - Yet another new location for UNIFORMS Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-30 07:32:21 +01:00
Wladimir J. van der Laan	c6033e84bb	etnaviv: GC7000: Update context reset for ..HALTI5 Update context reset for HALTI3..HALTI5, sorting states for the HALTI version that has them. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-30 07:28:09 +01:00
Wladimir J. van der Laan	baff59ebf0	etnaviv: GC7000: No RS align when using BLT RS align is not necessary and might even be harmful when using the BLT engine for blitting. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-30 07:28:02 +01:00
Wladimir J. van der Laan	dd3a04c2c3	etnaviv: GC7000: BLT engine blitting support Add an implementation of key clear_blit functions using the BLT engine that replaced the RS on GC7000. Also set level->size correctly for imported resources. This is important for the BLT resolve-in-place path to work for them. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-30 07:27:57 +01:00
Wladimir J. van der Laan	079bbaec0c	etnaviv: GC7000: Factor out RS blit functionality Prepare for BLT-based blitting path by moving RS-based blitting to the RS implementation file, making this self-contained. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-30 07:27:53 +01:00
Wladimir J. van der Laan	77768b1859	etnaviv: GC7000: Move etna_coalesce to emit header file Want to be able to emit state from the texture implementation, and the blitter implementation. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-30 07:27:48 +01:00
Wladimir J. van der Laan	571d980695	etnaviv: GC7000: Support BLT as recipient for etna_stall When the BLT is involved as source or target, add an extra BLT enable/disable sequence around the sync sequence. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-30 07:27:43 +01:00
Wladimir J. van der Laan	150d8766ea	etnaviv: Use only DRAW_INSTANCED on GC3000+ The blob does this, as DRAW_INSTANCED can replace fully all the other draw commands. It is also required to handle integer vertex formats. The other path is only there for compatibility and might go away (or at least rot to become buggy due to dis-use) in newer hardware. As a by-effect this changes the behavior for GC3000-, by no longer using the index offset for DRAW_INDEXED but instead adding it to INDEX_ADDR. This should make no difference. Preparation for GC7000 support. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>	2017-11-30 07:26:55 +01:00
Wladimir J. van der Laan	23630ab1b6	etnaviv: Emit SCALE for vertex attributes This is used by HALTI2+ (GC3000+) when drawing with DRAW_INSTANCED. It is also necessary when switching between integer and floating point vertex element formats. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-30 07:26:46 +01:00
Kenneth Graunke	74e38739ca	i965: Reorganize batch/state BO fields into a 'brw_growing_bo' struct. We're about to add more of them, and need to pass the whole lot of them around together when growing them. Putting them in a struct makes this much easier. brw->batch.batch.bo is a bit of a mouthful, but it's nice to have things labeled 'batch' and 'state' now that we have multiple buffers. Fixes: `2dfc119f22` "i965: Grow the batch/state buffers if we need space and can't flush." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103101 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-11-29 17:30:35 -08:00
Kenneth Graunke	ca43616586	i965: Don't grow batch/state buffer on every emit after an overflow. Once we reach the intended size of the buffer (BATCH_SZ or STATE_SZ), we try and flush. If we're not allowed to flush, we resort to growing the buffer so that there's space for the data we need to emit. We accidentally got the threshold wrong. The first non-wrappable call beyond (e.g.) STATE_SZ would grow the buffer to floor(1.5 * STATE_SZ). The next call would see we were beyond STATE_SZ and think we needed to grow a second time - when the buffer was already large enough. We still want to flush when we hit STATE_SZ, but for growing, we should use the actual size of the buffer as the threshold. This way, we only grow when actually necessary. v2: Simplify the control flow (suggested by Jordan) Fixes: `2dfc119f22` "i965: Grow the batch/state buffers if we need space and can't flush." Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-11-29 17:30:35 -08:00
Kenneth Graunke	52d32917e1	i965: Preserve EXEC_OBJECT_CAPTURE when growing the BO. The original state buffer was marked with EXEC_OBJECT_CAPTURE. When growing it, we want to preserve that flag so we continue to capture it in GPU hang reports. Fixes: `2dfc119f22` "i965: Grow the batch/state buffers if we need space and can't flush." Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-11-29 17:30:35 -08:00
Kenneth Graunke	2af7085460	i965: Use old_bo->align when growing batch/state buffer instead of 4096. The intention here is to make the new BO use the same alignment as the old BO. This isn't strictly necessary, but we would have to update the 'alignment' field in the validation list when swapping it out, and we don't bother today. The batch and state buffers use an alignment of 4096, so this should be equivalent - it's just clearer than cut and pasting a magic constant. Fixes: `2dfc119f22` "i965: Grow the batch/state buffers if we need space and can't flush." Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-11-29 17:30:35 -08:00
Dave Airlie	2c4861e453	r600: no need to reinit compute regs Compute setup gets emitted into the normal gfx state buffer, so no need to reinit the basics. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-30 09:53:22 +10:00
Dave Airlie	ea355e29f7	r600: split cb setup code out from evergreen compute path. This just makes it easier to bypass for TGSI later. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-30 09:39:25 +10:00
Dave Airlie	77c70e5fe5	r600: add support for compute pkt flags to debug dumping. This just lets us see packets marked for compute. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-30 09:32:31 +10:00
Dave Airlie	779306c8b6	r600: fix bfe where src/dst are same. This fixes overlaps where src/dst are the same. Fixes a bunch of the deqp bitfield tests. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-30 09:32:31 +10:00
Adam Jackson	0d044351b7	gallium/dri2: Enable {GLX_ARB,EGL_KHR}_context_flush_control Reviewed-and-tested-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2017-11-29 16:00:24 -05:00
Kenneth Graunke	cfc5af588c	i965: Program the dynamic state heap size to MAX_STATE_SIZE. STATE_BASE_ADDRESS specifies a maximum size of the dynamic state section, beyond which data supposedly reads back as 0. On Gen8+, we were programming it to the size of the buffer. This worked fine until we started growing the state buffer in commit `2dfc119f22`. When the state buffer grows, the value in STATE_BASE_ADDRESS becomes too small, and our state beyond STATE_SZ bytes would read back as 0. To avoid having to update the value, we program it to MAX_STATE_SIZE. We used to program the upper bound to the maximum on older hardware anyway, so programming it too large isn't a big deal. Bogus SURFACE_STATE can easily lead to GPU hangs and misrendering. DiRT Rally was hitting the statebuffer growth path, and suffered from bad texture corruption and GPU hangs (usually around the same time). This patch fixes both issues. Fixes: `2dfc119f22` "i965: Grow the batch/state buffers if we need space and can't flush." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103101 Tested-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-29 11:48:29 -08:00
Marek Olšák	2c5f2936af	r300,r600,radeonsi: replace RADEON_FLUSH_* with PIPE_FLUSH_* and handle PIPE_FLUSH_HINT_FINISH in r300. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	950221f923	radeonsi: remove r600_common_screen Most files in gallium/radeon now include si_pipe.h. chip_class and family are now here: sscreen->info.family sscreen->info.chip_class Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	4d1fe8f964	radeonsi: remove r600_pipe_common::barrier_flags::compute_to_L2 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	c0d44fe0e9	radeonsi: remove query/apply_opaque_metadata callbacks Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	2208b760f3	radeonsi: move shader debug helpers out of r600_pipe_common.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	e4cce7dbba	radeonsi: dismantle si_common_screen_init/destroy Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	e32d3a648e	radeonsi: document our vendor string situation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	eae85b99fc	radeonsi: set all pipe buffer functions in r600_buffer_common.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	63f88644a5	radeonsi/uvd: don't call ws->query_info Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	b86feec390	radeonsi: move video queries into si_get.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	30d5f2c942	radeonsi: remove more functions from r600_pipe_common.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	757ea3e613	radeonsi: move/remove ac_shader_binary helpers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	03e2adc990	radeonsi: move all get functions to si_get.c; disk_cache_create to si_pipe.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	1823bbbb1a	radeonsi: remove R600_CONTEXT_* flags Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	d96c7e7822	radeonsi: just include si_pipe.h in r600_query.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	c63e225bff	radeonsi: remove some definitions and helpers from r600_pipe_common.h Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	175ee084ff	radeonsi: don't use fast color clear for small surfaces This removes 35+ clear eliminate passes from DOTA 2. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	8a58724ac9	radeonsi: unify code setting dirty_level_mask for fast clear Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	980bf9a27e	radeonsi: clean up si_do_fast_color_clear parameters Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	092756f23f	radeonsi: remove r600_common_context::clear_buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	b191e2d79d	radeonsi: move r600_test_dma.c into si_test_dma.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	132471bde1	radeonsi: move si_pipe_clear_buffer into si_cp_dma.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	7aa2366b70	radeonsi: move all clear() code into si_clear.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	3c4d871ca2	radeonsi: enable DCC with MSAA for VI Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	373f4a48ae	radeonsi: implement fast color clear for DCC with MSAA for VI Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	36ac7a1b0e	radeonsi: add a workaround for blending with DCC and MSAA Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	d1f65e5e99	radeonsi: clear PIPE_IMAGE_ACCESS_WRITE when it's invalid to be on the safe side Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	e3c0a5b6e8	ac/surface: enable DCC computation for MSAA Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	6863651bbd	radeonsi: fix layered DCC fast clear Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Jon Turney	2c62ccb10a	util: Also include endian.h on cygwin If u_endian.h can't determine the endianness, the default behaviour in sha1.c is to build for big-endian Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-11-29 14:04:40 +00:00
Juan A. Suarez Romero	6d540aa092	mesa: deal with vs_inputs as 64-bit unsigned integer Commit 78942e ("mesa: shrink VERT_ATTRIB bitfields to 32 bits") uses vs_prog_data->vs_inputs as if it were a 32-bit unsigned integer. But actually it is a 64-bit integer, and as such it is used in other parts of Mesa code. It is worth noting that bits from the entire range are used, and not only 32 bits. This is due to our implementation for handling 64-bit dual-slot input attributes, which requires a larger bitfield to manage them. This commit reverts the changes done in brw_draw_upload.c, keeping the rest of the changes. This fixes the following tests: - KHR-GL45.enhanced_layouts.varying_array_locations - KHR-GL45.enhanced_layouts.varying_locations Fixes: 78942e ("mesa: shrink VERT_ATTRIB bitfields to 32 bits") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103942 CC: Marek Olšák <marek.olsak@amd.com> CC: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-11-29 12:13:10 +01:00
Timothy Arceri	a39a3b4b76	mesa: rework _mesa_add_parameter() to only add a single param This is more in line with what the function's name suggests it should do, and makes the code much easier to follow. This will also make adding uniform packing support much simpler. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 21:50:48 +11:00
Dave Airlie	f8a54c489d	r600: lds load cleanups. This is just some cleanups on top of the last patch from my compute branch. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-29 13:18:44 +10:00
Gert Wollny	76837e29e3	r600_shader: only load from LDS what is really used Use the destination write mask to determine which values are really to be read from LDS and load only these. Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>	2017-11-29 13:08:29 +10:00
Dave Airlie	579ec9c311	r600/sb: handle jump after target to end of program. (v2) This fixes hangs on cayman with tests/spec/arb_tessellation_shader/execution/trivial-tess-gs_no-gs-inputs.shader_test This has a single if/else in it, and when this peephole activated, it would set the jump target to NULL if there was no instruction after the final POP. This adds a NOP if we get a jump in this case, and seems to fix the hangs, so we have a valid target for the ELSE instruction to go to, instead of 0 (which causes infinite loops). v2: update last_cf correctly. (I had some other patches hide this) Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-29 11:52:53 +10:00
Kenneth Graunke	6b91610fc6	i965: Change a ret == -1 check to ret != 0. For consistency with most other ret checks. Suggested by Chris. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-11-28 15:23:16 -08:00
Kenneth Graunke	874d41add3	i965: Use C99 struct initializers in brw_bufmgr.c. This is cleaner than using a non-standard memclear macro (which does a memset to 0) and then initializing fields after the fact. We move the declarations to where we initialized the fields. While we're at it, we move the declaration of 'ret' that goes with the ioctl, eliminating the declaration section altogether. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-11-28 15:23:16 -08:00
Kenneth Graunke	3d68329a65	i965: Move perf_debug and WARN_ONCE back to brw_context.h. These were moved to src/intel/common/gen_debug.h, but they are not common code. They assume that brw_context or gl_context variables exist, named brw or ctx. That isn't remotely true outside of i965. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-28 15:23:16 -08:00
Eric Engestrom	07d3966694	i965: const a few structs and vars to avoid writing to them by accident Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-28 15:23:16 -08:00
Kenneth Graunke	760e0156df	i965: Fix Smooth Point Enables. We want to program the 3DSTATE_RASTER field to the gl_context value, not the other way around. Fixes: `13ac46557a` (i965: Port Gen8+ 3DSTATE_RASTER state to genxml.) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-28 15:23:16 -08:00
Dylan Baker	43b0e5f5cd	meson: build virgl driver Build tested only. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-28 14:06:38 -08:00
Dylan Baker	a537231b22	meson: build svga driver on linux Build tested only. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-28 14:06:36 -08:00
Dylan Baker	5060c51b6f	meson: build r600 driver v4: - Ensure inc_amd_common defined when radeonsi is disabled (needed by r600) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Tested-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-28 14:06:33 -08:00
Dylan Baker	4ae08296d0	meson: build r300 driver This is build tested only Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-28 14:06:30 -08:00
Dylan Baker	9169dde941	meson: build i915g driver Build tested only. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-28 14:06:26 -08:00
Brian Paul	c5d199fa2c	svga: move svga_is_format_supported() to svga_format.c where the other format-related functions live. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-11-28 06:50:16 -07:00
Brian Paul	bae5b2a87c	svga: s/unsigned/SVGA3dDevCapIndex/ Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-11-28 06:50:16 -07:00
Lionel Landwerlin	addfa4c5e8	i965: perf: add support for CoffeeLake GT3 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-28 13:34:04 +00:00
Lionel Landwerlin	b5f6b9b0eb	i965: perf: add support for CoffeeLake GT2 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-28 13:34:04 +00:00
Lionel Landwerlin	74f41fd781	i965: perf: add busyness metric sets on gen8/9 platforms Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-28 13:34:04 +00:00
Lionel Landwerlin	a543ae4c2a	i965: fix time elapsed counter equations in VME/Media configs There was a mistake just in those metric sets. We probably didn't notice because they're not really interesting for 3D workloads. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-28 13:34:04 +00:00
Lionel Landwerlin	064a4831e3	i965: perf: update counter names on gen8/9 platforms Just fixing names. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-28 13:34:04 +00:00
Lionel Landwerlin	349712018b	i965: add a debug option to disable oa config loading This provides a good way to verify we haven't broken using the perf driver on older kernels (which don't have the oa config loading mechanism). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-28 13:34:04 +00:00
Lionel Landwerlin	27ee83eaf7	i965: perf: add support for userspace configurations This allows us to deploy new configurations without touching the kernel. v2: Detect loadable configs without creating one (Chris) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-28 13:34:04 +00:00
Lionel Landwerlin	3e7112e603	i965: perf: update configs for loading from userspace When making configs loadable from userspace in the kernel, we left userspace more responsibility for programming some registers. In particular, one register we used to set directly in the driver has now been moved into the configs. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-28 13:34:04 +00:00
Eric Engestrom	44fbbd6fd0	util: add mesa-sha1 test to meson Fixes: `513d7ffa23` "util: Add a SHA1 unit test program" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-11-28 11:06:04 +00:00
Eric Engestrom	9d281e1506	compiler: fix typo Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-11-28 10:54:38 +00:00
Eric Engestrom	7b85b9b877	compiler: use NDEBUG to guard asserts nir_validate.c's #endif already had the correct NDEBUG comment Fixes: `dcb1acdea0` "nir/validate: Only build in debug mode" Fixes: `9ff71b649b` "i965/nir: Validate that NIR passes call nir_metadata_preserve()" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-11-28 10:54:38 +00:00
Eric Engestrom	bb46111c01	broadcom: use NDEBUG to guard asserts Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-28 09:50:36 +00:00
Eric Engestrom	7bb89e1c8f	vc4: check preprocessor token existence using #ifdef instead of #if (other uses of USE_VC4_SIMULATOR are already correct) Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-28 09:50:36 +00:00
Ben Crocker	b43daf7bf6	docs/llvmpipe.html: Minor edits Language and spelling fixups in three places. Cc: "17.2" "17.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ben Crocker <bcrocker@redhat.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> [Eric: move two fixes from the other patch to this one.] Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-28 09:50:36 +00:00
Eric Engestrom	bca122902a	st/dri: replace hard-coded array size with ARRAY_SIZE() Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:50:36 +00:00
Nicolai Hähnle	dd07868904	radeonsi/gfx9: simplify condition for on-chip ESGS Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Nicolai Hähnle	239d2b5809	radeonsi: clarify that si_shader_selector::esgs_itemsize is set for the ES part Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Nicolai Hähnle	26da5d0317	radeonsi: use si_shader_context instead of lp_build_context in more places Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Nicolai Hähnle	1c2d19d84d	radeonsi: cleanup si_initialize_color_surface Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Nicolai Hähnle	08f6b4dd7b	radeonsi: avoid attempting to create CMASK if the tiling mode doesn't have it Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Nicolai Hähnle	e52e8326d9	radeonsi: check that we don't leak fine.buf references Just as an added precaution. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Nicolai Hähnle	377a062321	ac/surface: fix indentation Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Nicolai Hähnle	97f42d11df	amd/common: sid.h cleanups Fix a bunch of labels indicating when registers were added/removed and normalize the SI-class GRBM_GFX_INDEX. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:34:43 +01:00
Nicolai Hähnle	7e35bdad1c	st_glsl_to_tgsi: check for the tail sentinel in merge_two_dsts This fixes yet another case where DFRACEXP has only one destination. Found by address sanitizer. Fixes tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4-only-mantissa.shader_test Fixes: `3b666aa747` ("st/glsl_to_tgsi: fix DFRACEXP with only one destination") Acked-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:31:33 +01:00
Tapani Pälli	1e508e10d9	mesa/gles: adjust internal format in glTexSubImage2D error checks When floating point textures are created on OpenGL ES 2.0, the driver is free to choose the internal format used. Mesa makes this decision in adjust_for_oes_float_texture. Error checking for glTexImage2D properly checks that sized formats are not used. We use the same error checking path for glTexSubImage2D (since there is a lot of overlap), however since those checks include internalFormat checks, we need to pass the original internalFormat passed by the client. The patch adds oes_float_internal_format, which reverses adjust_for_oes_float_texture to recover that format. Fixes the following test failure: ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float (when running test with MESA_GLES_VERSION_OVERRIDE=2.0) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103227 Cc: "17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-11-28 08:57:49 +02:00
Jason Ekstrand	049b84246e	radv: Use the suffixed versions of VK_QUEUE_GLOBAL_PRIORITY_* Acked-by: Dave Airlie <airlied@redhat.com>	2017-11-27 21:42:06 -08:00
Jason Ekstrand	07850893a1	vulkan: Update the XML and headers to 1.0.66 Acked-by: Dave Airlie <airlied@redhat.com>	2017-11-27 21:41:46 -08:00
Jason Ekstrand	d7c8c7bd9d	intel/blorp: Drop blorp_resolve_ccs_attachment The only reason why we needed that version was because the Vulkan driver needed to be able to create the surface states so it could handle indirect clear colors. Now that blorp handles them natively, there's no need for the extra entrypoint. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:22:13 -08:00
Jason Ekstrand	5bc2849af9	anv: Let blorp handle indirect clear colors for CCS resolves Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:22:13 -08:00
Jason Ekstrand	34b95f88e6	anv: Move get_fast_clear_state_address into anv_private.h While we're at it, we break it into two nicely named functions. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:22:13 -08:00
Jason Ekstrand	8915621882	intel/blorp: Take a range of layers in blorp_ccs_resolve Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:22:13 -08:00
Jason Ekstrand	67b676f0c5	intel/blorp: Add initial support for indirect clear colors Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-27 16:22:12 -08:00
Jason Ekstrand	85aa4074a2	i965/blorp: Use a designated initializer for blorp_surf This way uninitialized fields get automatically zeroed and it's safe to add more fields to blorp_surf. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:22:12 -08:00
Jason Ekstrand	86becfd2de	intel/blorp: Add fast-clear to the special case in MSAA resolves This doesn't go all the way of avoiding the txf_ms if it's fast-cleared, however it does at least make us only do it once. This should improve performance of MSAA resolves in the presence of lots of clear color. Without the patch, enabling fast-clears in the multisampling Sascha demo drops the framerate by about 10%. With this patch, enabling fast-clears increases the demo's framerate by 25%. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:22:11 -08:00
Jason Ekstrand	dc21c3937c	intel/blorp/blit: Rename blorp_nir_txf_ms_mcs That name is already taken by one of the helpers in blorp_nir_builder.h and, while we haven't moved the guts of blorp_blit.c there yet, we'd like to start using some things from that header. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-27 16:19:38 -08:00
Rob Herring	46148be8e4	Android: disable warnings causing errors AOSP master has changed the build default to -Werror making all the warnings errors. Override that with -Wno-error. Signed-off-by: Rob Herring <robh@kernel.org>	2017-11-27 17:26:45 -06:00
Timothy Arceri	3e789026ca	st/glsl_to_tgsi: make use of driver_cache_blob with the disk cache driver_cache_blob was introduced with the i965 disk cache; it allows us to simplify the cache a little and possibly offers some minor speed improvements since we load the GLSL metadata and TGSI from disk in one pass. Using driver_cache_blob should also make it straightforward to implement binary support for ARB_get_program_binary in gallium. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-28 09:01:44 +11:00
Gwan-gyeong Mun	4cb27047c8	glsl: Fix typo nagivation -> navigation Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-28 08:48:55 +11:00
Emil Velikov	c7616ac069	gl_table.py: add extern C guard for the generated glapitable.h The header can be included from C++, hence contents should have appropriate notation. Cc: mesa-stable@lists.freedesktop.org Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-27 19:23:05 +00:00
Marek Olšák	6b8909f2d1	ac: pack legacy_surf_level better r600_texture: 1488 -> 1248 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:46:16 +01:00
Marek Olšák	ec15ff78c3	ac: change legacy_surf_level::slice_size to dword units The next commit will reduce the size even more. v2: typecast to uint64_t manually v3: add more typecasts, add asserts Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:44:04 +01:00
Marek Olšák	474b4a9191	ac: pack ac_surface better r600_texture: 1736 -> 1488 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:12:38 +01:00
Marek Olšák	b5444877c0	radeonsi: always initialize max_forced_staging_uploads r600_resource is malloc'd. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103808 Fixes: `4b0dc098b2` ("gallium/u_threaded: don't map big VRAM buffers for the first upload directly") Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:12:38 +01:00
Marek Olšák	95cd74abd4	radeonsi: remove an old hack for evergreen Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:12:38 +01:00
Marek Olšák	1cb731012c	radeonsi: set COMPUTE_RESOURCE_LIMITS.FORCE_SIMD_DIST when profitable ported from Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-27 14:12:38 +01:00
Dave Airlie	043d14db30	ac/nir: don't write tcs outputs to LDS that aren't read back. If the TCS doesn't read back the outputs, no need to store them to LDS in the first place. (except for tess factors). This seems to give about 50fps (3290->3330) with tessellation demo. I haven't tested if it impacts DoW3 at all. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-27 13:50:24 +10:00
Dave Airlie	33dca36f4f	nir: fill outputs_read field and add patch outputs read (v2) This is to be used for TCS optimisations on radv. v2: don't set written on reads (nha) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-27 13:50:03 +10:00
Dave Airlie	fd301472bd	r600/eg: dump event type in dumps This just makes it easier to debug some things. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-27 12:53:18 +10:00
Tobias Klausmann	068a72fbcb	nouveau/compiler: Allow to omit line numbers when printing instructions This comes in handy when checking "NV50_PROG_DEBUG=1" outputs with diff! V2: - Use environmental variable (Karol Herbst) V3: - Use the already populated nv50_ir_prog_info to forward information to the print pass (Pierre Moreau) V4: - get rid of default value in PrintPass constructor Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-26 12:51:30 -05:00
Nicolai Hähnle	0fed7f83ba	radeonsi: try flushing unflushed fences in si_fence_finish even when timeout == 0 Under certain conditions, waiting on a GL sync object should act like a flush, regardless of the timeout. Portal 2, CS:GO, and presumably other Source engine games rely on this behavior and hang during loading without this fix. Fixes: `bc65dcab3b` ("radeonsi: avoid syncing the driver thread in si_fence_finish") Signed-off-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103902 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103904	2017-11-26 16:53:00 +01:00
Ilia Mirkin	0bd83d0461	nv50/ir: move LateAlgebraicOpt to the very end Memory loads can take offsets, but the SHLADD will often attempt to consume the offsets too. As there may be multiple memory loads with the same base but different offsets, those would end up in a SHLADD instead of the offset of the memory operation. This moves the pass after we've had a chance to attempt to propagate immediate adds into the indirect offset. total instructions in shared programs : 6580681 -> 6567716 (-0.20%) total gprs used in shared programs : 944261 -> 943375 (-0.09%) total shared used in shared programs : 0 -> 0 (0.00%) total local used in shared programs : 15328 -> 15328 (0.00%) total bytes used in shared programs : 60339896 -> 60221504 (-0.20%) local shared gpr inst bytes helped 0 0 555 2698 2698 hurt 0 0 138 336 336 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-26 01:10:19 -05:00
Ilia Mirkin	3072bbef63	nv50/ir: when merging immediates/consts, load directly When a MERGE operation gets its constraint moves added, it substantially extends live ranges to be reusing an immediate from earlier in the program (not to mention the silliness of loading an immediate into a register, and then moving into another register). We detect these scenarios and insert moves that take the immediate or constbuf load directly into the register. If it's the last use, then we can just move that operation to the closer location. With SM35 (255 regs) we get these results: total instructions in shared programs : 6583670 -> 6580681 (-0.05%) total gprs used in shared programs : 950818 -> 944261 (-0.69%) total shared used in shared programs : 0 -> 0 (0.00%) total local used in shared programs : 15328 -> 15328 (0.00%) total bytes used in shared programs : 60367456 -> 60339896 (-0.05%) local shared gpr inst bytes helped 0 0 4584 3186 3186 hurt 0 0 55 968 968 I suspect they will be better for SM20 and SM30. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-26 01:10:19 -05:00
Ilia Mirkin	50e913b9c5	nv50/ir: add optimization for modulo by a non-power-of-2 value We can still use the optimized division methods which make use of multiplication with overflow. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2017-11-26 01:10:03 -05:00
Ilia Mirkin	3079993727	nv50/ir: optimize signed integer modulo by pow-of-2 It's common to use signed int modulo in GLSL. As it happens, the GLSL specs allow the result to be undefined, but that seems fairly surprising. It's not that much more effort to get it right, at least for positive modulo operators. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-25 22:48:09 -05:00
Matt Turner	676761252b	util: Just give up and define PIPE_ARCH_LITTLE_ENDIAN on MSVC MSVC doesn't support #warning?! Getting really tired of this.	2017-11-25 16:46:00 -08:00
Andres Gomez	5fa589148a	docs: remove bug 103626 from fix list as per 17.2.6 Bug https://bugs.freedesktop.org/show_bug.cgi?id=103626 was incorrectly listed as fixed. Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit `b9b60dbf55`)	2017-11-26 02:18:08 +02:00
Matt Turner	b8cbad624b	util: Use preprocessor correctly Fixes: `6a353479a7` ("util: Assume little endian in the absence of platform-specific handling")	2017-11-25 15:57:37 -08:00
Andres Gomez	63d488d10c	docs: update calendar, add news item and link release notes for 17.2.6 Signed-off-by: Andres Gomez <agomez@igalia.com>	2017-11-26 01:46:25 +02:00
Andres Gomez	b0049428b5	docs: add sha256 checksums for 17.2.6 Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit `93c2beafc0`)	2017-11-26 01:42:16 +02:00
Andres Gomez	e6acc4d528	docs: add release notes for 17.2.6 Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit `00b52f8e99`)	2017-11-26 01:42:15 +02:00
Ilia Mirkin	f39a91c152	freedreno/a4xx: add ARB_framebuffer_no_attachments support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 17:20:17 -05:00
Ilia Mirkin	4f748d12e8	freedreno/a4xx: add indirect draw support This is a copy of the a5xx logic. Fails a few tests, but basic functionality is there. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 17:20:17 -05:00
Ilia Mirkin	c3c8d48725	freedreno: regenerate pm4 header, adjust code for new names Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 17:20:17 -05:00
Ilia Mirkin	ffdcd51e66	freedreno/a4xx: add stencil texturing support Copied from a5xx, should be identical. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 17:20:17 -05:00
Ilia Mirkin	86f12e9377	freedreno/ir3: add a pass to lower tg4 to txl, enable gather on a4xx Unfortunately Adreno A4xx hardware returns incorrect results with the GATHER4 opcodes. As a result, we have to lower to 4 individual texture calls (txl since we have to force lod to 0). We achieve this using offsets, including on cube maps which normally never have offsets. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 16:56:59 -05:00
Ilia Mirkin	ab336e8b46	nir: allow texture offsets with cube maps GL doesn't have this, but some hardware supports it. This is convenient for lowering tg4 to plain texture calls, which is necessary on Adreno A4xx hardware. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Rob Clark <robdclark@gmail.com>	2017-11-25 16:56:30 -05:00
Matt Turner	c690a7a8cd	util: Fix disk_cache index calculation on big endian The cache-test test program attempts to create a collision (using key_a and key_a_collide) by making the first two bytes identical. The idea is fine -- the shader cache wants to use the first four characters of a SHA1 hex digest as the index. The following program unsigned char array[4] = {1, 2, 3, 4}; int *ptr = (int *)array; for (int i = 0; i < 4; i++) { printf("%02x", array[i]); } printf("\n"); printf("%08x\n", *ptr); prints 01020304 04030201 on little endian, and 01020304 01020304 on big endian. On big endian platforms reading the character array back as an int (as is done in disk_cache.c) does not yield the same results as reading the byte array. To get the first four characters of the SHA1 hex digest when we mask with CACHE_INDEX_KEY_MASK, we need to byte swap the int on big endian platforms. Bugzilla: https://bugs.freedesktop.org/103668 Bugzilla: https://bugs.gentoo.org/637060 Bugzilla: https://bugs.gentoo.org/636326 Fixes: `87ab26b2ab` ("glsl: Add initial functions to implement an on-disk cache") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-25 12:30:46 -08:00
Matt Turner	513d7ffa23	util: Add a SHA1 unit test program Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-25 12:30:46 -08:00
Matt Turner	532674303a	util: Fix SHA1 implementation on big endian The code defines a macro blk0(i) based on the preprocessor condition BYTE_ORDER == LITTLE_ENDIAN. If true, blk0(i) is defined as a byte swap operation. Unfortunately, if the preprocessor macros used in the test are not defined, then the comparison becomes 0 == 0 and it evaluates as true. Fixes: `d1efa09d34` ("util: import sha1 implementation from OpenBSD") Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-25 12:30:46 -08:00
Matt Turner	6a353479a7	util: Assume little endian in the absence of platform-specific handling	2017-11-25 12:30:46 -08:00
Marek Olšák	78942e7dbf	mesa: shrink VERT_ATTRIB bitfields to 32 bits There are only 32 vertex attribs now. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-11-25 17:18:22 +01:00
Marek Olšák	43abaf2ad0	mesa: remove unused vertex attrib WEIGHT We don't support ARB_vertex_blend. Note that the attribute aliasing check for ARB_vertex_program had to be rewritten. vbo_context: 20344 -> 20008 bytes gl_context: 74672 -> 74616 bytes Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-11-25 17:17:52 +01:00
Marek Olšák	2116b97418	mesa: don't assign numbers to vertex attrib enums manually I plan to remove one of them. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-11-25 17:17:52 +01:00
Marek Olšák	bd57f45168	gallium/hud: add HUD sharing within a context share group This is needed for profiling multi-context applications like Chrome. One context can record queries and another context can draw the HUD. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	11e25eb7f4	gallium/hud: update the HUD interface for multiple contexts This is the boring subset of the following commit. All new parameters are optional. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	9c5b4eb6b4	gallium/hud: prevent a crash if the recording context is inactive Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	37ded08321	gallium/hud: separate code for record context init/release Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	fc07acc21e	gallium/hud: separate code for draw context init/release Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	8caf7d51a9	gallium/hud: don't use hud->pipe in hud_parse_env_var Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	65433c3fd0	gallium/hud: use cso_get_pipe_context Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	e20364df82	cso: add cso_get_pipe_context Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	3132afdf4c	gallium/hud: pass pipe_context explicitly to most functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	0e319ed835	gallium/hud: split hud_draw into 3 separate functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	e5148791f6	st/dri: remove dead code and incorrect comment around make_current Core Mesa already handles flushing based on ContextReleaseBehavior, so the comment is wrong. Also, old_st is always NULL, because unbind_context always precedes make_current. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	6ad83b58e2	st/dri: clean up dri_unbind_context Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	2cfa319f9f	radeonsi: expose all CB performance counters on Stoney Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	797c447f1c	radeonsi: handle imported textures with DCC robustly Now you can hack the driver to enable DCC for displayable textures, and Glamor, which doesn't enable that by default, won't crash anymore. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	992b6e18d0	radeonsi: fix a typo in creating monolithic ES-GS This has no effect because both occupy the same memory in a union. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	f783677a82	radeonsi: don't write undefined output channels to LDS in LS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	b63e7d4c6f	radeonsi: use ac.lds for shared memory Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Marek Olšák	39b098dafb	radeonsi: do 64-bit LDS loads recursively Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-25 17:16:56 +01:00
Jon Turney	b6b4b2c6d8	mapi: Teach es{1,2}api/ABI-check shared library names on Cygwin Ideally we'd be able to get the library filename from libtool, but that doesn't seem to be a feature... Use of ${uname} is presumably ok here as we won't be running 'make check' if we are cross-compiling Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-24 16:53:55 +00:00
Samuel Pitoiset	1cc00b8e0e	Revert "radv: remove unnecessary memset() in radv_AllocateCommandBuffers()" This fixes two CTS regressions: - dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary - dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary These two tests are part of the mustpass lists, so presumably they are correct and my change was wrong. This reverts commit `0f68208f1d`. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-24 12:26:35 +01:00
Samuel Pitoiset	dc391a406a	radv/winsys: improve error messages when the buffer list creation failed Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-24 11:18:43 +01:00
Samuel Pitoiset	15c0df785b	radv/winsys: do not try to create a BO list with 0 buffers This happens when all BOs have the RADEON_FLAG_NO_INTERPROCESS_SHARING (DRM version >= 3.23) flag set. This flag is mainly used for reducing overhead on the userspace side because we don't have to put those BOs inside the list. Though, if the driver tries to create a list with 0 buffers inside it, libdrm returns -EINVAL and the app just crashes. This fixes a bunch of CTS dEQP-VK.sparse_resources.* fails (~100). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-24 11:18:38 +01:00
Iago Toral Quiroga	f1873956db	i965/vec4: fix splitting of interleaved attributes When we split an instruction that reads an uniform value (vstride 0) we need to respect the vstride on the second half of the instruction (that is, the second half should read the same region as the first). We were doing this already, but we didn't account for stages that have interleaved input attributes which also have a vstride of 0 and need the same treatment. Fixes the following on Haswell: KHR-GL45.enhanced_layouts.varying_locations KHR-GL45.enhanced_layouts.varying_array_locations KHR-GL45.enhanced_layouts.varying_structure_locations Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Andres Gomez <agomez@igalia.com>	2017-11-24 09:24:06 +01:00
Wladimir J. van der Laan	35548cae93	etnaviv: Emit vertex buffers consecutively Vertex buffer legacy state is no longer picked up with new drawing commands. Change to use different cases depending on the number of vertex streams in the GPU specs. This results in slightly more compact state emission as well, on all vivantes. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-23 22:24:51 +01:00
Eric Engestrom	99aea1e3de	REVIEWERS: add Alexander von Gluck IV as a reviewer for Haiku There's been some Haiku-related activity lately, so let's document who to cc on these patches. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Alexander von Gluck IV <kallisti5@unixzen.com>	2017-11-23 10:00:55 +00:00
Eric Engestrom	1d3944aeeb	genxml: fix assert guards This removes a few hundred warnings on debug builds with asserts off. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-23 09:44:16 +00:00
Eric Engestrom	f9cb2370f3	meson: add variable for mapi_abi.py instead of going back up the tree Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-23 09:44:16 +00:00
Eric Engestrom	d16af73559	meson: reorder subdirs to avoid directly including more than one level Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-23 09:44:16 +00:00
Eric Engestrom	ab0809e552	meson: fix strtof locale support check Fixes: `d1992255bb` "meson: Add build Intel "anv" vulkan driver" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-23 09:44:16 +00:00
Roland Scheidegger	71e630753e	r600: set DX10_CLAMP for compute shader too I really intended to set this for all shader stages by `3835009796` but missed it for compute shaders (because it's in a different source file...). Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-23 02:28:38 +01:00
Lionel Landwerlin	d4c52c5408	anv: flag batch & instruction BOs for capture When the kernel supports flagging our BOs, let's mark batch & instruction BOs for capture so they can be included in the error state. v2: Only add EXEC_CAPTURE if supported (Kristian) v3: Fix operator precedence issue (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-22 22:53:27 +00:00
Lionel Landwerlin	118a8c7587	anv: setup BO flags at state_pool/block_pool creation This will allow to set the flags on any anv_bo created/filled from a state pool or block pool later. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-22 22:53:27 +00:00
Gert Wollny	799d350870	r600/shader: Fix all warnings issued with "-Wall -Wextra" - fix a number of -Wsign-compare warnings - fix two warnings for -Woverride-init because TGSI_OPCODE_CEIL == 83, and the according field was defined two times. [airlied: don't use -1 with unsigned type, fix whitespace] Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-22 22:50:18 +00:00
Gert Wollny	1d076aafbc	r600: Emit EOP for more CF instruction types So far on pre-cayman chipsets, for the CF instructions CF_OP_LOOP_END, CF_OP_CALL_FS, CF_OP_POP, and CF_OP_GDS, an extra CF_NOP instruction was added to add the EOP flag, even though this is not actually needed, because all these instructions support the EOP flag. This patch removes the fixup code, adds setting the EOP flag for the according instructions as well as others like CF_OP_TEX and CF_OP_VTX, and adds writing out EOP for this type of instruction in the disassembler. This also fixes a bug where shaders were created that didn't actually have the EOP flag set in the last CF instruction, which might have resulted in GPU lockups. [airlied: cleaned up a little] Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-22 22:39:42 +00:00
Dylan Baker	c2dad6ca0a	meson: replace with_dri with with_dri_platform This fixes the windows and macos stubs to be consistent with the nix path. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-22 12:47:43 -08:00
Dylan Baker	33627d23d0	meson: add logic to select apple and windows dri This is still not fully correct (Haiku and BSD are notably probably not correct), but Linux is not regressed and this should be correct for macOS and Windows. v2: - set the dri_platform to windows on Cygwin as well (Jon) v3: - Add a better todo for Hurd (Eric) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-22 12:47:43 -08:00
Dylan Baker	2d1a3bf657	meson: Fix LLVM requires for radeonsi Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-22 12:47:43 -08:00
Dylan Baker	48f64e591f	meson: convert llvm option to tristate This option has been acting as a strange sort of half-tri state anyway. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-22 12:47:43 -08:00
Dylan Baker	4b61b07e4b	meson: Convert platform to auto This is necessary to support operating systems other than the *nix family (excluding macOS). For Linux nothing has changed, the defaults are still the same. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-22 12:47:43 -08:00
Dylan Baker	b5d98a101b	meson: Remove duplicate _GNU_SOURCE There is one provided unconditionally, and one guarded by platform == linux. Remove the unconditional one. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-22 12:47:43 -08:00
Dylan Baker	9c3e894ebe	meson: Remove completed or irrelevant TODO comments These are all either done already, or are autotools specific. The misspelled gallium G3DVL is the autotools specific bit, meson is handling that via build_by_default. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-22 12:46:00 -08:00
Dylan Baker	e89842ebbc	meson: Fix TODO for missing dl_iterate_phdr function This function is required for both the Intel "Anvil" vulkan driver and the i965 GL driver. Error out if either of those is enabled but this function isn't found. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-22 12:46:00 -08:00
Dylan Baker	2d62fc0646	meson: disable x86 asm in fewer cases. This patch allows building asm for x86 on x86_64 platforms, when the operating system is the same. Previously cross compile always turned off assembly. This allows using a cross file to cross compile x86 binaries on x86_64 with asm. This could probably be relaxed further thanks to meson's "exe_wrapper", which is a way to specify an emulator or compatibility layer (wine) that can run the foreign binaries on the build system. Since the meson build at this point only supports building on Linux I can't test this and I don't want to write/enable code that cannot even be build tested. v4: - set condition to build == x86_64 and host == x86 and build.system == host.system Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-22 12:46:00 -08:00
Dylan Baker	84486f6462	meson: Enable SSE4.1 optimizations This patch checks for and then enables sse4.1 optimizations if the host machine will be x86/x86_64. v2: - Don't compile code, it's unnecessary since we require a compiler which always has SSE4.1 (Matt) v3: - x64 -> x86_64 (Matt) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-22 12:46:00 -08:00
Eric Anholt	6a78416dab	broadcom/vc5: Fix BASE_LEVEL handling with txl. The HW doesn't add the base level anywhere (the min/max lod clamping is what does base level), so we need to add it manually in this case. Fixes piglit tex-miplevel-selection *Lod 2D.	2017-11-22 10:56:31 -08:00
Eric Anholt	c55813c22e	broadcom/vc5: Fix array texture layer count setup. Fixes piglit array-texture.	2017-11-22 10:56:31 -08:00
Eric Anholt	ad1521d708	broadcom/vc5: Don't increment primitive queries while they're paused. Fixes ext_transform_feedback-generatemipmap prims_generated	2017-11-22 10:56:31 -08:00
Eric Anholt	1214c2ea2a	broadcom/vc5: Fix incorrect padding of TF outputs. After the first output, we were padding by an extra size of the previous output. Fixes piglit ext_transform_feedback-output-type mat4x3[2] and friends.	2017-11-22 10:56:31 -08:00
Eric Anholt	b18840ac6e	broadcom/vc5: Fix UIF surface size setup for ARB_fbo's mismatched sizes. The HW was computing an implicit height for the surface based on the image size, but that may be smaller than the surface with ARB_fbo mismatched sizes. In that case, we need to tell it about the pad, either with the little 4-bit field in the RT config, or the extended field in CLEAR_COLORS_PART3. Fixes piglit arb_framebuffer_object-mixed-buffer-sizes.	2017-11-22 10:56:31 -08:00
Wladimir J. van der Laan	9f162fa107	etnaviv: Put HALTI level in specs The HALTI level is an indication of the gross architecture of the GPU. It determines, for a significant part, what feature level the GPU has, what state (especially frontend state) is there, and where it is located. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2017-11-22 14:42:06 +01:00
Wladimir J. van der Laan	391c958f08	etnaviv: Const-correctness etnaviv_emit.h The relocation structure is never changed by submitting it. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2017-11-22 14:42:00 +01:00
Juan A. Suarez Romero	1b0638c65f	meson: add si_driinfo.h in libgallium_dri v2: generate target conditionally (Dylan) Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-22 12:35:38 +01:00
Iago Toral Quiroga	a217cbd7ec	nir/gather_info: recognize load_patch_vertices_in as a system value This intrinsic is produced to load SYSTEM_VALUE_VERTICES_IN, which is generated to load gl_PatchVerticesIn in the SPIR-V path for both Vulkan and OpenGL. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-22 08:03:55 +01:00
Jordan Justen	386f6cd041	i965: Support decoding INTERFACE_DESCRIPTOR_DATA with INTEL_DEBUG=bat This will dump the INTERFACE_DESCRIPTOR_DATA along with the associated samplers & surfaces. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-21 12:11:57 -08:00
Kristian H. Kristensen	24609377f9	intel/genxml: Add helpers for determining field type Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-21 11:15:06 -08:00
Matt Turner	beaea7abfa	i965/fs: Check ADD/MAD with immediates in satprop unit test The gen had to be changed from 4 to 6 so that we could test MAD, which is new on Gen6. mad_imm_float_neg_mov_sat tests the case fixed by the previous commit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-11-21 10:13:07 -08:00
Matt Turner	a05af1f7b8	i965/fs: Handle negating immediates on MADs when propagating saturates MADs don't take immediate sources, but we allow them in the IR since it simplifies a lot of things. I neglected to consider that case. Fixes: `4009a9ead4` ("i965/fs: Allow saturate propagation to propagate negations into MADs.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103616 Reported-and-Tested-by: Ruslan Kabatsayev <b7.10110111@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-11-21 10:13:07 -08:00
Juan A. Suarez Romero	ce221cbbcf	mesa/teximage: add TEXTURE_CUBE_MAP_ARRAY target for CompressedTexImage3D From section 8.7, page 179 of OpenGL ES 3.2 spec: An INVALID_OPERATION error is generated by CompressedTexImage3D if internalformat is one of the formats in table 8.17 and target is not TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY or TEXTURE_3D. An INVALID_OPERATION error is generated by CompressedTexImage3D if internalformat is TEXTURE_CUBE_MAP_ARRAY and the “Cube Map Array” column of table 8.17 is not checked, or if internalformat is TEXTURE_3D and the “3D Tex.” column of table 8.17 is not checked. So far it was only considering TEXTURE_2D_ARRAY as valid target. But as "Cube Map Array" column is checked for all the cases, in practice we can consider also TEXTURE_CUBE_MAP_ARRAY. This fixes KHR-GLES32.core.texture_cube_map_array.etc2_texture Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-21 13:05:42 +01:00
Tapani Pälli	6236ffeb83	intel: fix disasm_info memory leaks Fixes: `4f82b17287` ("i965: Rewrite disassembly annotation code") Cc: Matt Turner <mattst88@gmail.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-11-21 08:36:43 +02:00
Timothy Arceri	04a9558497	st/glsl_to_nir: don't generate nir twice for gs This was left out of `c980a3aa31` Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-21 15:57:39 +11:00
Roland Scheidegger	b5957cee92	llvmpipe: fix snorm blending The blend math gets a bit funky due to inverse blend factors being in range [0,2] rather than [-1,1], our normalized math can't really cover this. src_alpha_saturate blend factor has a similar problem too. (Note that piglit fbo-blending-formats test is mostly useless for anything but unorm formats, since not just all src/dst values are between [0,1], but the tests are crafted in a way that the results are between [0,1] too.) v2: some formatting fixes, and fix a fairly obscure (to debug) issue with alpha-only formats (not related to snorm at all), where blend optimization would think it could simplify the blend equation if the blend factors were complementary, however was using the completely unrelated rgb blend factors instead of the alpha ones... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2017-11-21 04:06:29 +01:00
Dave Airlie	464c2d8083	r600: add cull distance support This passes all the tests in piglit. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-21 09:00:52 +10:00
Aravindan Muthukumar	971b3c019b	i965: Optimize bucket index calculation Reducing Bucket index calculation to O(1). This algorithm calculates the index using matrix method. Assuming PAGE_SIZE is 4096, matrix arrangement is as below: 1*4096 2*4096 3*4096 4*4096; 5*4096 6*4096 7*4096 8*4096; 10*4096 12*4096 14*4096 16*4096; 20*4096 24*4096 28*4096 32*4096; ... ... ... ...; ... ... ... max_cache_size. From this matrix it's clearly seen that every row follows the below way: ... ... ... n; n+(1/4)n n+(1/2)n n+(3/4)n 2n. Row is calculated as log2(size/PAGE_SIZE) Column is calculated as converting the difference between the elements to fit into power size of two and indexing it. Final Index is (row*4)+(col-1) Tested with Intel Mesa CI. Improves performance of 3DMark on BXT by 0.705966% +/- 0.229767% (n=20) v4: Review comments on style and code comments implemented (Ian). v3: Review comments implemented (Ian). v2: Review comments implemented (Jason). Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com> Signed-off-by: Kedar Karanje <kedar.j.karanje@intel.com> Reviewed-by: Yogesh Marathe <yogesh.marathe@intel.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2017-11-20 14:52:42 -08:00
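The row/column indexing described in `971b3c019b` can be sketched as follows. This is an illustration written from the commit message, not the i965 code: `EXAMPLE_PAGE_SIZE`, `floor_log2_u64` and `bucket_index` are hypothetical names, the function rounds a size up to the next bucket (an assumption about the intended behaviour), and log2 is computed with a small portable loop where a count-leading-zeros instruction would make it constant time. Buckets are 1 to 4 pages in the first row; every later row holds n+(1/4)n, n+(1/2)n, n+(3/4)n and 2n pages, where n is the last bucket of the previous row.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SIZE 4096u /* page size assumed in the commit message */

/* floor(log2(v)) for v > 0, written as a portable loop. */
static unsigned floor_log2_u64(uint64_t v)
{
    unsigned r = 0;
    assert(v != 0);
    while (v >>= 1)
        r++;
    return r;
}

/* Index of the smallest bucket that can hold `size` bytes. */
static unsigned bucket_index(uint64_t size)
{
    uint64_t pages, n, step;
    unsigned row, col;

    assert(size != 0);
    pages = (size + EXAMPLE_PAGE_SIZE - 1) / EXAMPLE_PAGE_SIZE;

    /* First row: buckets of 1, 2, 3 and 4 pages. */
    if (pages <= 4)
        return (unsigned)(pages - 1);

    /* n is the largest power of two with n < pages <= 2n. */
    n = (uint64_t)1 << floor_log2_u64(pages - 1);
    row = floor_log2_u64(n) - 1; /* n = 4 -> row 1, n = 8 -> row 2, ... */
    step = n / 4;                /* distance between buckets in this row */
    col = (unsigned)((pages - n + step - 1) / step); /* 1..4, rounded up */

    return row * 4 + (col - 1);
}
```

For example, a request of nine pages lands in row 2, column 1, which is index 8 and corresponds to the 10*4096 bucket in the matrix.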
Dylan Baker	c8417c8d25	meson: Guard the gallium dri component Currently the target has a redundant guard, and the state tracker isn't properly guarded. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-20 14:28:31 -08:00
Dylan Baker	689fb74716	meson: don't build gallium subdir unless we're building gallium This will allow us to simplify some guards within the gallium directory. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-20 14:28:31 -08:00
Eric Anholt	494effd242	broadcom/vc5: Align 1D texture miplevels to 64b. Fixes tex-miplevel-selection GL2:texture() 1D	2017-11-20 13:54:45 -08:00
Eric Anholt	9d5972da80	broadcom/vc5: Clamp min lod to the last level. Otherwise, the simulator would complain in tex-miplevel-selection that the min/max clamp was out of order. The actual HW seems to have clamped to the max anyway.	2017-11-20 13:52:33 -08:00
Eric Anholt	2c8913e224	broadcom/vc5: Increase simulator memory for tex-miplevel-selection. We were overflowing, because of all the little 4k allocations for CLs that were getting expanded to 128kb in the simulator due to the GMP alignment.	2017-11-20 13:52:33 -08:00
Tim Rowley	34838c2212	swr/rast: Repair simd8 frontend code rot Keep non-default simd8 frontend code running for comparison purposes. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-11-20 13:51:10 -06:00
Tim Rowley	005d937e15	swr/rast: Implement AVX-512 GATHERPS in SIMD16 fetch shader Disabled for now. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-11-20 13:51:06 -06:00
Tim Rowley	2e244c7168	swr/rast: Simplify GATHER* jit builder api General cleanup, and prep work for possibly moving to llvm masked gather intrinsic. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-11-20 13:51:01 -06:00
Tim Rowley	44025def06	swr/rast: Add alignment to transpose targets Needed to ensure alignment for avx512. Fixes address sanitizer crash. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-11-20 13:50:56 -06:00
Tim Rowley	bc356b0fc0	swr/rast: Cache eventmanager Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-11-20 13:50:51 -06:00
Tim Rowley	395a298fa5	swr/rast: Enable AVX-512 targets in the jitter Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-11-20 13:50:45 -06:00
Tim Rowley	37bb69fb88	swr/rast: Points with clipdistance can't go through simplepoints path Fixes piglit glsl-1.20:vs-clip-vertex-primitives and glsl-1.30:vs-clip-distance-primitives. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-11-20 13:50:38 -06:00
Tim Rowley	d9de8f3122	swr/rast: Code style change (NFC) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-11-20 13:50:29 -06:00
Tim Rowley	08512c52de	swr/rast: Widen fetch shader to SIMD16 Widen fetch shader to SIMD16, enable SIMD16 types in the jitter, and provide EXTRACT/INSERT SIMD8 <-> SIMD16 utility functions. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-11-20 13:50:23 -06:00
Tim Rowley	e612231f20	swr/rast: Support flexible vertex layout for DS output Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-11-20 13:49:59 -06:00
Nicolai Hähnle	3f17d3c017	gallium/u_threaded: avoid syncing in threaded_context_flush We could always do the flush asynchronously, but if we're going to wait for a fence anyway and the driver thread is currently idle, the additional communication overhead isn't worth it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-20 18:16:15 +01:00
Nicolai Hähnle	bc65dcab3b	radeonsi: avoid syncing the driver thread in si_fence_finish It is really only required when we need to flush for deferred fences. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-20 18:16:11 +01:00
Nicolai Hähnle	3db1ce01b1	radeonsi: recompute the relative timeout after waiting for ready fence Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-20 18:16:06 +01:00
Nicolai Hähnle	f5ea8d18ff	ddebug: fix the hang detection timeout calculation Fixes: `c9fefa062b` ("ddebug: rewrite to always use a threaded approach") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-20 18:16:03 +01:00
Nicolai Hähnle	16f8da2997	ddebug: fix use-after-free of streamout targets Fixes: `b47727a83a` ("ddebug: implement pipelined hang detection mode") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-20 18:16:00 +01:00
Nicolai Hähnle	aaebf49eba	gallium/u_threaded: properly initialize fence unflushed tokens This got lost in a rebase but never hurt anything because we happened to always sync in fence_finish anyway... Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-20 18:15:56 +01:00
Nicolai Hähnle	81aabb20f3	util/u_queue: really use futex-based fences The relevant define changed in the final revision of the simple mutex patch. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-20 18:15:53 +01:00
Nicolai Hähnle	a6e8311723	util/u_queue: fix timeout handling in util_queue_fence_wait_timeout Fixes: `e3a8013de8` ("util/u_queue: add util_queue_fence_wait_timeout") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-20 18:15:49 +01:00
Nicolai Hähnle	764bd6ef96	st/mesa: use asynchronous flushes in st_finish With threaded gallium, the driver may currently be running in another thread. In that case, we will execute all remaining commands in that thread instead of syncing, which should be better for cache locality. Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-20 18:15:07 +01:00
Nicolai Hähnle	2d8b82baaa	st/mesa: implement st_server_wait_sync properly Asynchronous flushes require a proper implementation of st_server_wait_sync, because we could have the following with threaded Gallium: Context 1 app: f = glFenceSync, then glFlush. <-- app sync --> (between Context 1 app and Context 2). Context 2: glWaitSync(f), then .. draw calls .. Context 1 driver: pipe_context::flush for glFenceSync, then pipe_context::flush for glFlush. Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-20 18:15:07 +01:00
Nicolai Hähnle	ce470af0b1	u_threaded_gallium: remove synchronization in fence_server_sync The whole point of fence_server_sync is that it can be used to avoid waiting in the application thread. Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-20 18:15:06 +01:00
Nicolai Hähnle	abeded1cac	amd: build addrlib with C++11 It is required for LLVM anyway. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103658 Fixes: `7f33e94e43` ("amd/addrlib: update to latest version") Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-20 16:26:28 +01:00
Nicolai Hähnle	df5ebe0c26	radeonsi/gfx9: fix VM fault with fetched instance divisors We need to account for SGPR locations in merged shaders. This case is exercised by KHR-GL45.enhanced_layouts.vertex_attrib_locations Fixes: `79c2e7388c` ("radeonsi/gfx9: use SPI_SHADER_USER_DATA_COMMON") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-20 16:26:10 +01:00
Samuel Pitoiset	3a32858fc3	radv: use a 16 bytes array for the sampled/storage image descriptors This allows to update them with only one memcpy(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-20 11:18:22 +01:00
Samuel Pitoiset	bc92ed04ac	radv: do not add the query pool BO to the list in vkCmdEndQuery() As per the spec, the query identified by queryPool and query must currently be active. Applications have to call vkCmdBeginQuery() before, and thus the query pool BO will already be in the list. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-20 11:18:20 +01:00
Samuel Pitoiset	cf54ea155e	radv: only load needed depth clear regs for fast depth clears Similar to how the driver sets the depth clear regs after a fast depth clear. Most of the time, this will copy a 32-bit reg instead of a 64-bit reg. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-20 10:45:27 +01:00
Samuel Pitoiset	e55b7609fa	radv: do not add the image BO in radv_set_depth_clear_regs() For the fast path, radv_fill_buffer() ensures that the BO is already in the list. For the slow path, the depth surface is part of the framebuffer which means the BO is added to the list when the framebuffer is emitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-20 10:45:23 +01:00
Samuel Pitoiset	3c6bba83f0	radv: remove useless assertion in emit_depthstencil_clear() Already checked in emit_clear(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-20 10:45:21 +01:00
Samuel Pitoiset	403a3d8061	radv: remove useless check in radv_set_depth_clear_regs() aspects can't be zero and there is an assertion that ensures it's not in emit_clear(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-20 10:45:19 +01:00
Dave Airlie	59ca0c4b44	docs/features: mark some r600 extensions supported These just looked to be missed when this file was updated. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-20 10:22:25 +10:00
George Barrett	f09c2cefdd	glsl: Catch subscripted calls to undeclared subroutines generate_array_index fails to check whether the target of a subroutine call exists in the AST, potentially passing around null ir_rvalue pointers eventuating in abort/segfault. Fixes: `fd01840c0b` ("glsl: add AoA support to subroutines") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100438	2017-11-20 11:04:04 +11:00
Eric Anholt	514db90448	broadcom/vc5: Fix up integer texture handling. The original spec I had didn't expose integer textures and suggested that you use unfiltered floats. Now there are proper formats for them. Fixes 16- and 32-bit texwrap integer tests in piglit, and dEQP-GLES3.functional.fbo.completeness.renderable.renderbuffer.color0.rgb10_a2ui.	2017-11-19 10:12:30 -08:00
Eric Anholt	65ae4527d9	broadcom/vc5: Fix simulator assertion failures about color RT clears. When we tried to clear color while storing depth, it assertion failed about basically not having enough information to decide which color RT to clear. It turns out the STORE_GENERAL picks the buffer according to the color buffer being stored, or all of them if NONE. If you're doing depth, it doesn't know which to pick.	2017-11-19 10:12:30 -08:00
Rob Clark	ae44845aff	freedreno/ir3: add texture gather support Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-18 13:39:39 -05:00
Lucas Stach	f5d477f447	etnaviv: enable full overwrite when no color buffer is present The OVERWRITE bit disables destination fetches, which is exactly what we want when there is no valid color buffer bound. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-11-18 12:33:49 +01:00
Jason Ekstrand	1eab327ba7	i965: Stop including brw_cfg.h in brw_disasm_info.h The brw_disasm_info header is included by certain tools in order to get shader assembly from binaries so it's a semi-external header. Including brw_cfg.h also pulls in brw_shader.h so you end up getting quite a bit of our back-end compiler internals. Instead, make the couple of forward declarations we need and make the header more stand-alone. This fixes the meson build. Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `4f82b17287`	2017-11-17 21:51:16 -08:00
Jason Ekstrand	0a6a137eb2	i965: Mark BOs as external when we export their handle Almost all of our BO export paths already properly marked the BO as external and added it to the handle table. Most export use-cases go through a prime fd or flink where we have a brw_bo export helper that does the right thing. The one missing one happens when you call queryImage and ask for __DRI_IMAGE_ATTRIB_HANDLE. We just grabbed the gem handle out of the BO (because it's really easy to do that) and handed it off to the client; what could go wrong? As it turns out, this path is used by basically every compositor that wants to turn around and call drmModeAddFB2 on it so it can hand it off to display. The result, as of `4b1e70cc57`, is that we no longer set MOCS_PTE on those surfaces and the kernel's attempts to disable caching fail and we get scanout corruption. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103759 Fixes: `4b1e70cc57` Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2017-11-17 17:16:44 -08:00
Jason Ekstrand	344252a27f	i965/bufmgr: Add a helper to mark a BO as external Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2017-11-17 17:16:44 -08:00
Andres Gomez	1866f7aee5	i965: Correct disasm_info usage in eu_validate test Fixes: `4f82b17287` ("i965: Rewrite disassembly annotation code") Cc: Matt Turner <mattst88@gmail.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-11-18 03:07:06 +02:00
Eric Anholt	8d5994098f	broadcom/vc5: Set up the padded height at surface creation time. This centralizes the calculation in the surface, instead of in each load/store.	2017-11-17 16:09:55 -08:00
Eric Anholt	87391e23cf	broadcom/vc5: Ensure that there is always a TLB write. This should fix some GPU hangs in our (currently always single-threaded) fragment shaders, and definitely fixes assertion failures in simulation.	2017-11-17 16:09:55 -08:00
Eric Anholt	c40ac132e4	broadcom/vc5: Fix clear color for swap_color_rb render targets. Fixes dEQP-GLES3.functional.depth_stencil_clear.depth.*	2017-11-17 16:09:55 -08:00
Eric Anholt	52f3e9e43c	broadcom/vc5: Fix pasteo in front stencil ref value setup. Fixes piglit masked-clear.	2017-11-17 16:09:55 -08:00
Eric Anholt	b63dd626b7	broadcom/vc5: Fix colormasking when we need to swap r/b colors. Fixes part of piglit masked-clear.	2017-11-17 16:09:55 -08:00
Eric Anholt	2daf941a58	broadcom/vc5: Enable the Z min/max clipping planes.	2017-11-17 16:09:55 -08:00
Eric Anholt	c259bf686c	broadcom/vc5: Fix driver for new PIPE_SHADER_CAP_MAX_HW_ATOMIC_*.	2017-11-17 16:09:54 -08:00
Brian Paul	2b5b6bebac	r300: add PIPE_SHADER_CAP_MAX_HW_ATOMIC_COUNTER* switch cases To silence compiler warnings. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-17 16:09:59 -07:00
Brian Paul	af322ed887	tgsi: s/uint/enum pipe_shader_type/ Roland Scheidegger <sroland@vmware.com>	2017-11-17 16:09:40 -07:00
Brian Paul	fdee3e1d82	tgsi: bump tgsi_opcode_info::output_mode size to 4 bits To avoid problems with MSVC. And verify size with ASSERT_BITFIELD_SIZE(). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-11-17 16:09:39 -07:00
Kenneth Graunke	a01ba366e0	i965: Revert Gen8 aspect of VF PIPE_CONTROL workaround. This apparently causes hangs on Broadwell, so let's back it out for now. I think there are other PIPE_CONTROL workarounds that we're missing. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103787	2017-11-17 14:28:22 -08:00
Adam Jackson	ddcd4b05a3	egl: Convert int to attrib in eglGetPlatformDisplay ... because converting attrib to int truncates, and that's bad. Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-17 16:43:16 -05:00
Rob Clark	1831e3fb1d	docs: update features for freedreno Just comparing glxinfo and features.txt, and it seems features.txt is fairly out of date. The a5xx specific features (compute/images/atomics/ etc) are recent. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-17 15:19:38 -05:00
Matt Turner	821ec473a8	i965: Rename intel_asm_annotation -> brw_disasm_info It was the only file named intel_* in the compiler. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-17 12:14:38 -08:00
Matt Turner	4f82b17287	i965: Rewrite disassembly annotation code The old code used an array to store each "instruction group" (the new, better name than the old overloaded "annotation"), and required a memmove() to shift elements over in the array when we needed to split a group so that we could add an error message. This was confusing and difficult to get right, not the least of which was because the array has a tail sentinel not included in .ann_count. Instead use a linked list, a data structure made for efficient insertion. Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-17 12:14:38 -08:00
Matt Turner	f80e97346b	i965: Simplify annotation_insert_error() Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-17 12:14:38 -08:00
Matt Turner	f4276ef7ef	i965: Move common code out of #ifdef I'm going to change the call in a later patch and with the difference in indentation level it wasn't immediately obvious that the calls were identical. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-17 12:14:38 -08:00
Anuj Phogat	822fd2341d	i965: Remove DWord length from MI_FLUSH_DW definition Fixes: `6165fda59b` ("i965: Program DWord Length in MI_FLUSH_DW") Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-17 11:51:28 -08:00
Jason Ekstrand	a07f7b2619	anv/cmd_buffer: Take bo_offset into account in fast clear state addresses Otherwise, if the image is not bound to the start of the buffer, we're going to be reading and writing its fast clear state in the wrong spot. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-17 11:32:21 -08:00
Jason Ekstrand	a6cc361e5f	anv/cmd_buffer: Advance the address when initializing clear colors Found by inspection Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-17 11:32:21 -08:00
Boyuan Zhang	3b7fd35d01	radeon/video: enable encode support for raven Enable h.264 encode for vcn hardware (raven) Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	549a41ed9d	radeonsi: enable vcn encode Enable vcn encode by creating radeon_encoder for vcn. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	fe50797d93	radeon/vcn: add create encoder Add implementation for create_encoder interface for vcn encode. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	3c53fbbc87	radeon/vcn: add encode get feedback Add implementation for get_feedback interface for vcn encode. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	bc9644460d	radeon/vcn: add encode destroy Add implementation for destroy interface for vcn encode. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	3f83c24366	radeon/vcn: add encode end frame Add implementation for end_frame interface for vcn encode. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	47443bc9f0	radeon/vcn: add encode bitstream Add implementation for encode_bitstream interface for vcn encode. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	f40fe728a1	radeon/vcn: add encode begin frame Add implementation for begin_frame interface for vcn encode. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	c2448f20a3	radeon/vcn: add encode header implementations Implement encoding of sps, pps, and slice headers using the newly added h.264 header coding descriptors functions based on h.264 specs. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	d940fdf765	radeon/vcn: add encode header algorithms Since bitstream headers, e.g. sps, pps, slice, are encoded in driver side, we need to add corresponding algorithms that are required to generate those headers. According to h.264 specs, signed/unsigned integer Exp-Golomb-coded syntax element with left bit first (code_se and code_ue) and unsigned integer using n bits (code_fixed_bits) descriptors function are needed. Therefore, adding those algorithms and related variables and output algorithms here. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	be996f2213	radeon/vcn: add ib implementations Implement required ibs and command buffer submission interfaces for vcn encode Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	7f7ae47385	radeon/vcn: add common encode part Add a skeleton pipe video interface and encode ib interface for video encode on vcn hardware. Add function defines and structures for vcn encode. Update Makefile.sources and meson.build with newly added files. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	58aa4dffb4	st/va: implement poc type pic_order_cnt_type is a required variable when encoding both sps and slice header, therefore we need to get this value from st, e.g. vaapi interface, and then pass it to radeon driver for encoding headers. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	76e0dcd5a9	vl: add poc type Different from vce encoding, vcn encoding requires driver side to encode bitstream header, such as pps, sps and slice header. pic_order_cnt_type is a required variable when encoding both sps and slice header, therefore we need to add this new variable here, and hold the value passed from st, e.g. vaapi interface Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	c445cdf649	winsys/amdgpu: add vcn enc cs support New cs support is needed for vcn encode Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	436a3f8d6d	radeon/common: add vcn enc ip info query New ip info query is needed for vcn encode Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	f2021d92eb	radeon/winsys: add vcn enc ring type New ring type is needed for vcn encode Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Boyuan Zhang	d3d8914275	radeon/vcn: add vcn encode interface Add a new header file for vcn encode interface Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2017-11-17 12:25:47 -05:00
Gert Wollny	b50eda8498	gallium/aux/util/u_surface.c: Silence warnings and remove unneeded MAYBE_UNUSED * Explicitly convert values to int in comparison. * Remove one MAYBE_UNUSED that is actually not needed. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-17 09:27:58 -07:00
Gert Wollny	c7bf83ef5c	gallium/aux/util/u_debug_image.c: Silence warnings -Wunused-param Decorate the according parameters with UNUSED. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-17 09:27:58 -07:00
Gert Wollny	f23f2146cb	gallium/aux/util/u_debug_flush.c: Silence warnings -Wunused-param Decorate the unused parameters with UNUSED. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-17 09:27:58 -07:00
Gert Wollny	1dca234daf	gallium/aux/util/u_debug.c: Silence warnings -Wunused-param Silence warnings by decorating the parameters with UNUSED. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-17 09:27:57 -07:00
Gert Wollny	bec80e892b	gallium/aux/util/u_format_table.py: Add UNUSED decoration to the generated function headers Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-17 09:27:57 -07:00
Gert Wollny	c7ebf95797	gallium/aux/util/u_async_debug.c: Fix -Wtype-limits warning. Use size_t instead of unsigned for new_max. realloc later expects size_t as parameter anyway. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-17 09:27:57 -07:00
Gert Wollny	537d04615d	gallium/aux/os/os_thread.h: Silence -Wunused-param. With --disable-debug a parameter is not used. Silence this warning by fake-using it. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-17 09:27:57 -07:00
Gert Wollny	4bdad3d98c	gallium/aux/util/u_debug_refcnt.h: Fix -Wunused-param warnings Annotate the according parameters accordingly. v2: move UNUSED decoration in front of parameter declaration Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:57 -07:00
Gert Wollny	2049c635ee	gallium/aux/util/u_blit.c: Fix -Wunused-param warnings Annotate the parameters accordingly. v2: move UNUSED decoration in front of parameter declaration Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> v1: Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:57 -07:00
Gert Wollny	0b984188f9	src/util/simple_mtx.h: Fix two -Wunused-param warnings. Decorate the parameters accordingly with "UNUSED" or "MAYBE_UNUSED" (for the param that is used in debug mode, but not in release mode). v2: move UNUSED decoration in front of parameter declaration Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:57 -07:00
Gert Wollny	811eb70a57	mesa/main/texcompress_s3tc_tmp.h: Fix two -Wparam-unused warnings. Decorate the params accordingly with "UNUSED". v2: move UNUSED decoration in front of parameter declaration Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:57 -07:00
Gert Wollny	28a02eb3b8	gallium/aux/util/u_transfer.c: Fix some -Wunused-param warnings. Decorate the params with "UNUSED" accordingly. v2: move UNUSED decoration in front of parameter declaration Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:57 -07:00
Gert Wollny	82e2f1ea34	gallium/aux/util/u_threaded_context.c: Fix some -Wunused-param warnings. Decorate the params accordingly with UNUSED or MAYBE_UNUSED (for params that are used in debug mode). v2: move *UNUSED decoration in front of parameter declaration Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:57 -07:00
Gert Wollny	a5da06d9b7	gallium/aux/util/u_surface.c: Silence a -Wsign-compare warning. Explicitly convert one value to compare. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-17 09:27:57 -07:00
Gert Wollny	c9fef0fa9f	gallium/aux/util/u_pstipple.c: Fix one -Wsign-compare warning in ?: construct. Silence the warning by making the conversion to int explicit. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-17 09:27:57 -07:00
Gert Wollny	2644a80ccf	gallium/aux/util/u_mm.c: Fix one -Wparam-unused warning. Decorate the unused param accordingly with "UNUSED". v2: move UNUSED decoration in front of parameter declaration Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:57 -07:00
Gert Wollny	1c0d4baaf7	gallium/aux/util/u_format_yuv.c: Fix a number of -Wunused-param warnings. Decorate the params accordingly with "UNUSED". v2: move UNUSED decoration in front of parameter declaration Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:57 -07:00
Gert Wollny	a837a3d10d	gallium/aux/util/u_format_rgtc.c: Fix a number of -Wunused-param warnings Decorate the params accordingly with "UNUSED". v2: move UNUSED decoration in front of parameter declaration Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:57 -07:00
Gert Wollny	a29181d3c1	gallium/aux/util/u_format_other.c: Fix various -Wunused-param warnings Decorate the unused params with "UNUSED". v2: move UNUSED decoration in front of parameter declaration Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:57 -07:00
Gert Wollny	3b4bacc09f	gallium/aux/util/u_format_latc.c: Fix various -Wunused-param warnings, (v2) Decorate the unused params with "UNUSED". v2: move UNUSED decoration in front of parameter declaration Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:57 -07:00
Gert Wollny	d744387b08	gallium/aux/util/u_format_etc.c: Fix eight -Wunused-param warnings (v2) Decorate the parameters accordingly with "UNUSED". v2: move UNUSED decoration in front of parameter declaration Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:56 -07:00
Gert Wollny	cf93a7fc9e	gallium/aux/util/u_format.c: Fix one -Wunused-param warning This warning was issued only in release mode. Fix it by fake-using the parameter. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:56 -07:00
Gert Wollny	e979ab70c2	gallium/aux/util/u_dump_state.c: Fix two -Wunused-parameter warnings v2: move UNUSED decoration in front of parameter declaration Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:56 -07:00
Gert Wollny	20d3e943b1	gallium/aux/util/u_dump_defines.c: Fix -Wcompare-unsigned warning u_bit_scan may return -1 that then may be interpreted as (unsigned)-1 in the following comparison, since num_names is unsigned. Convert the latter to be int as well. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-17 09:27:56 -07:00
Gert Wollny	373c263e2c	gallium/aux/util/u_debug_stack.c: Silence -Wunused-result warning asprintf is decorated with the attribute "warn_unused_result", and if the function call fails, the pointer "temp" will be undefined, but since it is used later it should contain some usable value. Test return value of asprintf and assign some safe value to "temp" if the call failed. Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:56 -07:00
Gert Wollny	9b80c03870	gallium/aux/util/u_debug_describe.c: Silence an -Wunused-param warning Annotate the unused parameter. v2: move UNUSED decoration in front of parameter declaration Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:56 -07:00
Gert Wollny	ca7d5170eb	gallium/aux/util/u_blitter.c: Silence some warnings * Annotate three parameters that are not used in release mode. * Explicitly convert an int to unsigned in an ?: construct. v2: move MAYBE_UNUSED decoration in front of parameter declaration Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-17 09:27:56 -07:00
Rob Clark	c267750bb1	freedreno/a5xx: stencil texturing support Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-17 11:19:34 -05:00
Rob Clark	a39d403202	freedreno/a5xx/gmem: fix z32/s8 restore/resolve BLIT_ZS mode is used for either combined z24/s8 or z32 in which case BLIT_S mode is used for separate stencil. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-17 11:19:34 -05:00
Rob Clark	010ebed72a	freedreno/a5xx/gmem: move ZS restore tiling hack Code motion to simplify next patch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-17 11:19:33 -05:00
Rob Clark	22605dce4b	freedreno: update generated headers	2017-11-17 11:19:33 -05:00
Brian Paul	501591e852	svga: add missing PIPE_SHADER_CAP_MAX_HW_ATOMIC_COUNTER* cases Reviewed-by: Charmaine Lee <charmainel@vmware.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-11-16 20:35:17 -07:00
Brian Paul	92c1290dc5	glsl: s/unsigned/glsl_base_type/ in glsl type code (v2) Declare glsl_type::sampled_type as glsl_base_type as we do for the base_type field. And make base_type a bitfield to save a few bytes. Update glsl_type constructor to take glsl_base_type instead of unsigned and pass GLSL_TYPE_VOID instead of zero. No Piglit regressions with llvmpipe. v2: - Declare both base_type and sampled_type as 8-bit fields - Use the new ASSERT_BITFIELD_SIZE() macro. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-16 20:35:17 -07:00
Brian Paul	940fba68c9	util/tgsi: use ASSERT_BITFIELD_SIZE() to check opcode field size I've noticed at least two places where we store the TGSI opcode in an unsigned:8 bitfield. We're at 249 opcodes now. If we hit 256 we'll need to grow those bitfields. Use the new ASSERT_BITFIELD_SIZE() macro to detect that. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-16 20:35:17 -07:00
Brian Paul	d4726b1318	st/mesa: use enum types instead of int/unsigned (v3) Use the proper enum types for various variables. Makes life in gdb a little nicer. Note that the size of enum bitfields must be one larger so the high bit is always zero (for MSVC). v2: also increase size of image_format bitfield, per Eric Engestrom. v3: use the new ASSERT_BITFIELD_SIZE() macro Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-16 20:35:17 -07:00
Brian Paul	fe81e1f975	util: add new ASSERT_BITFIELD_SIZE() macro (v3) For checking that bitfields are large enough to hold the largest expected value. v2: move into existing util/macros.h header where STATIC_ASSERT() lives. v3: add MAYBE_UNUSED to variable declaration Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-11-16 20:35:17 -07:00
Dave Airlie	c8ce3c2689	st/mesa: don't move ssbo after atomic buffers if we support hw atomics There is no need to have these overlap if we support hw atomics. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-17 13:15:38 +10:00
Kenneth Graunke	8f91aa35a5	i965: Upload invariant state once at the start of the batch on Gen4-5. We want to emit invariant state at the start of a render batch. In the past, this more or less happened: a new batch flagged BRW_NEW_CONTEXT (because we don't have hardware contexts), which triggered the brw_invariant_state atom. So, it would be emitted before any 3D drawing. (Technically, there might be some BLT commands in the batch because Gen4-5 have a single combined render/BLT ring, but that should be harmless). With the advent of BLORP, this broke. The first item in a batch might be a BLORP operation, which bypasses the normal draw upload path. So, we need to ensure invariant state happens first. To do that, we just upload it when creating a new batch. On Gen6+ we'd need to worry about whether it's a RENDER or BLT batch, but because we have a combined ring, this approach should work fine on Gen4-5. Seems to fix GPU hangs when playing hardware accelerated video with mpv -hwdec=vaapi on Ironlake. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103529 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-16 17:39:01 -08:00
Dave Airlie	59162c122f	docs: update features/relnotes for r600 shader image support. (v2) v2: update GLES Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-17 11:31:41 +10:00
Dave Airlie	8c52ece581	r600: enable ARB_shader_image_load_store, ARB_shader_image_size This also enables GL4.2 for gpus with hw fp64 (cayman, cypress) Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-17 11:31:41 +10:00
Dave Airlie	31db4a3200	r600: handle image size support. This adds support for the RESQ opcode with the workaround required due to hw bugs for buffers and cube arrays. Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-17 11:31:41 +10:00
Dave Airlie	f94f5c9a9f	r600/sb: disable SB for images. Until we can work further on sb, disable it for images for now. Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-17 11:31:41 +10:00
Dave Airlie	aa38bf658f	r600/shader: add support for load/store/atomic ops on images. This adds support to the shader assembler for load/store/atomic ops on images which are handled via the RAT operations. Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-17 11:31:41 +10:00
Dave Airlie	a6b3792843	r600: add core pieces of image support. This adds the atoms and gallium api implementations, along with support for compress/decompress paths for shader images. Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-17 11:31:40 +10:00
Dave Airlie	5689bb0022	r600/shader: implement getting thread id. We need the thread id to use the immediate buffer readback mechanism, so add support for calculating it. Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-17 11:31:40 +10:00
Dave Airlie	c119a1acb3	r600/shader: add flag to denote if shader uses images Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-17 11:31:40 +10:00
Dave Airlie	894e2deb7e	r600: implement basic memory barrier. This isn't 100% perfect (fglrx also fails a bunch of those tests) but implement the start of a memory barrier for image support. Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-17 11:31:40 +10:00
Dave Airlie	ac4f175d79	r600: allocate immed buffer resource for images. In order to do image readback we have to execute a MEM_RAT instruction that needs a buffer to transfer the result into until the shader can fetch it. Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-17 11:31:40 +10:00
Dave Airlie	77d36cbc8d	r600: handle writes_memory properly This implements proper handling for shaders with side effects. Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-17 11:31:40 +10:00
Dylan Baker	d8acf79f0c	autotools: change version TINY -> PATCH Because patch is more common than tiny for talking about the 3rd element of a version. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Emil Velikov <emli.velikov@collabora.com>	2017-11-16 16:16:45 -08:00
Dylan Baker	65fc16c974	autotools: set XA versions in configure.ac and configure header file Currently the versions are set in the header, and then sed is used to extract them, so that autotools can use them elsewhere. This is odd. Autotools is perfectly capable of configuring the header with the versions, and then they don't need to be extracted from the header. This is cleaner and more obvious. Tested with make distcheck. v2: - Split tiny -> patch change - Drop temporary variables - change XA_VERSION_* -> XA_* v3: - Finish splitting the tiny -> patch change Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Emil Velikov <emli.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)	2017-11-16 16:16:29 -08:00
Kenneth Graunke	f274687413	genxml: Fix PIPELINE_SELECT on G45/Ironlake. Original 965 sets bits 28:27 to 0, while G45 and later set it to 1. Note that the G45 docs are incorrect in this regard - see the DevCTG+ note in the Ironlake PRMs. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-16 11:01:50 -08:00
Emil Velikov	9b02230466	egl: pass the dri2_dpy to the $plat_teardown functions Cc: Mark Janes <mark.a.janes@intel.com> Fixes: `40a01c9a0e` ("egl/drm: move teardown code to the platform file") Fixes: `8d745abc00` ("egl/wayland: move teardown code to the platform file") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Dylan Baker <dylan@pnwbakers.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103784	2017-11-16 18:46:01 +00:00
Rafael Antognolli	306914db92	meson: Add dridriverdir variable to dri.pc. Xorg (and possibly other things) depend on this variable to find the path to DRI drivers. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-16 10:40:26 -08:00
Dylan Baker	bc17ac5866	docs: add documentation for building with meson v2: - Add information about CC, CXX, CFLAGS, and CXXFLAGS (Nicolai) - Add message at top that meson for mesa is still a work in progress - Add trailing "/" to directories (Eric E.) - Fix a number of spelling/grammar/style suggestions from Eric E. - Make a number of changes as suggested by Emil. v3: - Fix order of commands in example (Eric E.) - Add documentation for overriding LLVM version (Eric E.) v4: - Rebase on master - update default buildtype - add note about b_ndebug - Clarify meson configure a bit v5: - use <code> for command line arguments (Eric E.) - Add note about listing options without a build directory - Minor formatting changes (Eric E.) - Replace the CC, CFLAGS, etc section with an environment variables section, which mentions CC, CXX, CFLAGS, CXXFLAGS, LDFLAGS, and DESTDIR - Add comment that not using buildtype debug might make debugging harder - Add comment that b_ndebug and buildtype are orthogonal Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v3)	2017-11-16 09:48:09 -08:00
Kai Wasserbäch	d25123e23a	docs: Point to apt.llvm.org for development snapshot packages Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-16 16:21:09 +00:00
Eric Engestrom	ca95d7ad4e	egl: fix var type queryImage() takes an `int*`; compiler is warning about the signed<->unsigned pointer mismatch. Fixes: `0db36caa19` "egl/wayland: Add a fallback when fourcc query isn't supported" Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Frank Binns <frank.binns@imgtec.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Derek Foreman <derekf@osg.samsung.com>	2017-11-16 16:20:44 +00:00
Emil Velikov	9e74e2d13c	i915: add missing extensions.h include Otherwise we'll bail due to -Werror=implicit-function-declaration. It went unnoticed since we had a bug which did consistently set the compiler flag. Fixes: `ba8a347f93` ("mesa: split extensions overrides and glGetString(GL_EXTENSIONS)") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-16 16:11:25 +00:00
Emil Velikov	f3ea07959b	mesa: return 'unrecognized' extensions in glGetStringi Analogous to the glGetString() case - report all the extensions enabled via MESA_EXTENSION_OVERRIDE Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-16 14:17:07 +00:00
Emil Velikov	310e4485cb	mesa: rework the way we manage extra_extensions Store pointers to the tokenized strings in the gl_extensions struct. This way we can reuse them in glGetStringi() while we construct the really long string only in _mesa_make_extension_string. Only 16 pointers/strings are stored for now. v2: Warn only once when we provide more than 16 unk. extensions, rebase Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-16 14:17:07 +00:00
Emil Velikov	693682bd01	mesa: pass the ctx to _mesa_one_time_init_extension_overrides Will be needed with next commit Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-16 14:08:14 +00:00
Emil Velikov	9aa9b98e63	mesa: call atexit() only as needed If the extra_extensions string is empty there's no need to call atexit() - there's nothing to free. v2: Rebase Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2017-11-16 14:08:03 +00:00
Emil Velikov	3d81e11b49	mesa: remove unnecessary 'sort by year' for the GL extensions The sorting was originally added to work around broken games (comment says Quake3 demo) that were copying the extensions list into small buffer. Sorting does not solve the problem, since we'll still overflow and cause corruption/crash. Better workaround is to actually trim the string ... as done with a later commit which introduces the MESA_EXTENSION_MAX_YEAR env. variable. Side note: On my machine, the existing sorting makes no changes to the extensions string. Cc: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-16 14:07:14 +00:00
Emil Velikov	a3f82876f4	mesa: reuse set_extension() for _mesa_extension_override_disables We already use it for _mesa_extension_override_enables. Improve consistency and use it for both extension lists. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-16 14:07:14 +00:00
Emil Velikov	444d9e4b08	mesa: drop unnecessary copying of extra_extensions The function get_extension_override() returns a copy of a string, only for it to be copied again ... Drop the unneeded calloc/strdup/free dance. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-16 14:07:14 +00:00
Emil Velikov	e4cdce5058	mesa: remove duplicate 'disabled extensions' list While parsing MESA_EXTENSION_OVERRIDE we keep track of the disabled extensions, twice - in _mesa_extension_override_disables and disabled_extensions. Upon context creation, we use the former to modify the extensions list. Yet, we still check the updated list against disabled_extensions. Remove disabled_extensions, it's obsolete. Cc: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-16 14:07:14 +00:00
Emil Velikov	167e958a87	mesa: call _mesa_make_extension_string only as needed As of previous commit we removed the extension overrides from this function. Thus we no longer need to call it during MakeCurrent, so we can construct the extensions string when needed - _mesa_GetString. This commit effectively reverts `a879d14ecf` ("mesa: initialize extension string when context is first bound") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-16 14:07:14 +00:00
Emil Velikov	ba8a347f93	mesa: split extensions overrides and glGetString(GL_EXTENSIONS) Currently we apply the extension overrides and construct the extensions string upon MakeCurrent. They are two distinct things, so let's split the two while pushing the overrides management _before_ _mesa_compute_version(). This ensures that the version is updated to reflect the enabled/disabled extensions. Cc: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-16 14:07:14 +00:00
Emil Velikov	afd6a964a4	i965: remove ARB_compute_shader extension override Checking the override was useful in the early stages of developing the extension. Now that everything is wired, where possible, we can drop the check. Doing so allows us to simplify some of the related code. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-11-16 14:06:57 +00:00
Emil Velikov	f8812931cf	i965: use _mesa_is_desktop_gl helper Use the helper over opencoding the check. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-16 14:06:56 +00:00
Emil Velikov	6614804d1e	egl: add note about missing $plat_teardown Some platforms are missing a proper teardown function. Add a small TODO to make it obvious. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-16 14:03:11 +00:00
Emil Velikov	8d745abc00	egl/wayland: move teardown code to the platform file Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-16 14:03:10 +00:00
Emil Velikov	40a01c9a0e	egl/drm: move teardown code to the platform file Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-16 14:03:08 +00:00
Emil Velikov	938fcab08b	egl/x11: move teardown code to the platform file Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-16 14:03:06 +00:00
Emil Velikov	55245fe1c9	egl: Provide meaningful error when built w/o requested platform The current "No EGL platform enabled." is misleading and wrong. We reach said code when $platform is missing. To make this more obvious and clear provide wrappers in the header file, making the code a bit easier to follow. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-16 14:03:03 +00:00
Jon Turney	8ceccbf80d	meson: Don't define HAVE_PTHREAD only on linux I'm not sure of the reason for this. I don't see anything like this in configure.ac In include/c11/threads.h the cases are: 1) building for Windows -> threads_win32.h 2) HAVE_PTHREAD -> threads_posix.h 3) Not supported on this platform So not defining HAVE_PTHREAD for anything not Windows just means we can't build at all. When we are building for Windows, I'm not sure if dependency('threads') would ever find anything, or defining HAVE_PTHREAD has any effect, but avoid defining it there, just in case. Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-16 13:51:25 +00:00
Rob Clark	ff018a3f55	freedreno: also mark images used by draw/grid Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-16 08:44:19 -05:00
Rob Clark	92e75bf0ec	freedreno: mark SSBOs written at draw time Comment was right, implementation was wrong ;-) Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-16 08:44:19 -05:00
Rob Clark	2878af74dd	freedreno/a5xx: ARB_framebuffer_no_attachments support Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-16 08:44:19 -05:00
Kenneth Graunke	8d48671492	i965: Implement another VF cache invalidate workaround on Gen8+. ...and provide a better citation for the existing one. v2: - Apply the workaround to Gen8 too, as intended (caught by Topi). - Restructure to add bits instead of an extra flush (based on a similar patch by Rafael Antognolli). Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2017-11-16 01:43:26 -08:00
Nicolai Hähnle	f3fa3b0d95	tgsi/exec: fix LDEXP in softpipe Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103128 Fixes: `cad959d901` ("gallium: add LDEXP TGSI instruction and corresponding cap") Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-16 06:47:05 +01:00
Nicolai Hähnle	2e3d0dd6c8	threads,configure.ac,meson.build: define and use HAVE_TIMESPEC_GET Tested with Travis and Appveyor. v2: add HAVE_TIMESPEC_GET for non-Windows Scons builds v3: use check_functions in Scons (Eric) Cc: Rob Herring <robh@kernel.org> Cc: Alexander von Gluck IV <kallisti5@unixzen.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103674 Fixes: `f1a3648784` ("threads: update for late C11 changes") Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk> (v2)	2017-11-16 06:45:35 +01:00
Timothy Arceri	a8bdf0e0c4	radeonsi: copy some nir gs info v2: copy input primitive Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-16 10:54:03 +11:00
Timothy Arceri	b73ce64fb8	ac: add gs_{prim,invocation}_id to the abi Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-16 10:54:03 +11:00
Timothy Arceri	8ae92a9209	radeonsi: gather stream info in nir path Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-16 10:51:35 +11:00
Vinson Lee	cd58b98b03	mapi: Use correct shared libraries suffix on macOS. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-15 15:26:46 -08:00
Brian Paul	ae7b4fdb32	tgsi: whitespace clean-ups in tgsi_util.[ch] Trivial.	2017-11-15 16:12:44 -07:00
Brian Paul	cbfc92c04b	svga: s/unsigned/enum tgsi_texture_type/ Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-11-15 16:12:43 -07:00
Brian Paul	dc37cb0f86	tgsi: s/unsigned/enum tgsi_texture_type/ Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-11-15 16:12:43 -07:00
Frank Richter	bf41b2b262	gallium/wgl: fix default pixel format issue When creating a context without SetPixelFormat() don't blindly take the pixel format reported by GDI. Instead, look for our own closest pixel format. Minor clean-ups added by Brian Paul. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103412 Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2017-11-15 16:12:43 -07:00
Brian Paul	824e8084ed	svga: issue debug warning for unsupported two-sided stencil state We only have a single stencil read mask and write mask. Issue a warning if different front/back values are used. The Piglit gl-2.0-two-sided-stencil test hits this. Reviewed-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-11-15 16:12:43 -07:00
Brian Paul	cacb88490a	st/mesa: whitespace fixes in st_manager.c Trivial.	2017-11-15 16:12:43 -07:00
Brian Paul	8150690cac	st/mesa: whitespace clean-ups in st_context.c Trivial.	2017-11-15 16:12:43 -07:00
Brian Paul	0605a6cc89	st/mesa: move st_manager_destroy() earlier in file To avoid forward declaration. Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2017-11-15 16:12:43 -07:00
Brian Paul	3a74eb3a9b	st/mesa: move st_init_driver_flags() earlier in file To get rid of forward declaration. Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>	2017-11-15 16:12:43 -07:00
Brian Paul	955cbdf120	docs: update llvmpipe.html build instructions	2017-11-15 16:12:42 -07:00
Wladimir J. van der Laan	d61a914394	etnaviv: Add sampler TS support Sampler TS is a hardware optimization that can be used when rendering to textures. After rendering to a resource with TS enabled, the texture unit can use this to bypass lookups to empty tiles. This also means a resolve-in-place can be avoided to flush the TS. This commit is also an optimization when not using sampler TS, as resolve-in-place will now be skipped if a resource has no (valid) TS. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-15 23:27:54 +01:00
Wladimir J. van der Laan	59d76e7ab6	etnaviv: Flush TS cache before changing TS configuration This is to make sure that the TS is properly flushed to memory before rendering to a new surface starts. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-15 23:27:39 +01:00
Wladimir J. van der Laan	0d6d9b520b	etnaviv: Add TS_SAMPLER formats to etnaviv_format Sampler TS introduces yet another format enumeration for renderable+textureable formats. Introduce it into the etnaviv_format table as another column. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-15 23:27:26 +01:00
Wladimir J. van der Laan	ade528edd1	etnaviv: Check that resource has a valid TS in etna_resource_needs_flush Resources only need a resolve-to-itself if their TS is valid for any level, not just if it happens to be allocated. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-15 23:27:09 +01:00
Wladimir J. van der Laan	b24cb40188	etnaviv: rnndb update Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-15 23:26:53 +01:00
Dave Airlie	00bf875d55	radv: it isn't an error to not support a format or driver This reverts two of the vk_error changes: reporting unsupported format is common, and testing non-amdgpu drivers and ignoring them is also common. Fixes: `cd64a4f70` (radv: use vk_error() everywhere an error is returned) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-16 06:12:42 +10:00
Kenneth Graunke	5da2b26dcb	i965: Drop some reserved space remnants. BATCH_RESERVED was deleted in commit `2c46a67b41` (i965: Delete BATCH_RESERVED handling.) The reserved_space field is dead code, and the comments aren't useful these days.	2017-11-15 09:37:32 -08:00
Kenneth Graunke	e48cc01be9	intel: Drop mtypes.h include from brw_compiler.h. This isn't necessary and causes trouble for a project I'm working on.	2017-11-15 09:37:32 -08:00
Kenneth Graunke	0704702972	i965: Fold ABO state upload code into the SSBO/UBO state upload code. Having this separate could potentially make programs that rebind atomics but no other surfaces ever so slightly faster. But it's a tiny amount of code to add to the existing UBO/SSBO atom, and very related. The extra atoms have a cost on every draw call, and so dropping some of them would be nice. This also reclaims a dirty bit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-15 09:37:32 -08:00
Kenneth Graunke	ff964916dc	i965: Use nir_lower_atomics_to_ssbos and delete ABO compiler code. We use the same hardware mechanism for both atomic counters and SSBO atomics, so there's really no benefit to maintaining separate code to handle each case. Instead, we can just use Rob's shiny new NIR pass to convert atomic_uints to SSBOs, and delete piles of code. The ssbo_start section of the binding table becomes a combined ABO and SSBO section, with ABOs first, then SSBOs. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-15 09:37:32 -08:00
Kenneth Graunke	f48f52b030	i965: Make a better helper function for UBO/SSBO/ABO surface handling. This fixes the missing AutomaticSize handling in the ABO code, removes a bunch of duplicated code, and drops an extra layer of wrapping around brw_emit_buffer_surface_state(). Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-15 09:37:32 -08:00
Samuel Pitoiset	059d25a06d	radv: add the vertex buffers BO to the list at bind time This should reduce the overhead of adding a BO to the current list, especially when the list is huge. Also, when a new pipeline is bound, we only need to update the descriptor, the buffer objects should already be in the list. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-15 09:01:07 +01:00
Samuel Pitoiset	c665879455	radv: replace vb_dirty with RADV_CMD_DIRTY_VERTEX_BUFFER Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-15 09:01:05 +01:00
Samuel Pitoiset	8fd213277f	radv: drop radv_cmd_dirty_mask_t typedef I don't think we will need a 64-bit unsigned integer for the dirty flags in the future, and there is still 20 bits left. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-15 09:01:01 +01:00
Samuel Pitoiset	f697365058	radv: use an unsigned 32-bit integer for radv_queue::family_index VkDeviceQueueCreateInfo::queueFamilyIndex is an unsigned 32-bit integer. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-15 09:00:59 +01:00
Samuel Pitoiset	f9e1ff2464	radv: do not add the image BO in radv_set_dcc_need_cmask_elim_pred() radv_fill_buffer() ensures that the image BO is added to the list. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-15 09:00:57 +01:00
Samuel Pitoiset	40290c805f	radv: do not add the image BO in radv_set_color_clear_regs() radv_fill_buffer() ensures that the image BO is added to the list. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-15 09:00:54 +01:00
Roland Scheidegger	65123ee62c	r600: set the number type correctly for float rts in cb setup Float rts were always set as unorm instead of float. Not sure of the consequences, but at least it looks like the blend clamp would have been enabled, which is against the rules (only eg really bothered to even attempt to specify this correctly, r600 always used clamp anyway). Albeit r600 (not r700) setup still looks bugged to me due to never setting BLEND_FLOAT32 which must be set according to docs... Not sure if the hw really cares, no piglit change (on eg/juniper). Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-15 03:13:46 +01:00
Roland Scheidegger	570d5b7992	r600: use ieee version of rsq Both r600 and evergreen used the clamped version, whereas cayman used the ieee one. I don't think there's a valid reason for this discrepancy, so let's switch to the ieee version for r600 and evergreen too, since we generally want to stick to ieee arithmetic. With this, behavior for both rcp and rsq should now be the same for all of r600, eg, cm, all using ieee versions (albeit note rsq retains the abs behavior for everybody, which may not be a good idea ultimately). Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-15 03:13:46 +01:00
Roland Scheidegger	1c8d57a008	r600: use ieee version of rcp r600 used the clamped version for rcp, whereas both evergreen and cayman used the ieee version. I don't know why that discrepancy exists (it does so since day 1) but there does not seem to be a valid reason for this, so make it consistent. This seems now safer than before the previous commit (using the dx10 clamp bit). Note that rsq still uses clamped version (as before even though the table may have suggested otherwise for evergreen) for r600/eg, but not for cayman. Will be changed separately for better regression tracking... Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-15 03:13:46 +01:00
Roland Scheidegger	3835009796	r600: use DX10_CLAMP bit in shader setup The docs are not very concise in what this really does, however both Alex Deucher and Nicolai Hähnle suggested this only really affects instructions using the CLAMP output modifier, and I've confirmed that with the newly changed piglit isinf_and_isnan test. So, with this bit set, if an instruction has the CLAMP modifier bit (which clamps to [0,1]) set, then NaNs will be converted to zero, otherwise the result will be NaN. D3D10 would require this, glsl doesn't have modifiers (with mesa clamp(x,0,1) would get converted to such a modifier) coupled with a whatever-floats-your-boat specified NaN behavior, but the clamp behavior should probably always be used (this also matches what a decomposition into min(1.0, max(x, 0.0)) would do, if min/max also adhere to the ieee spec of picking the non-nan result). Some apps may in fact rely on this, as this prevents misrenderings in This War of Mine since using ieee muls (`ce7a045fee`), without having to use clamped rcp opcode, which would also fix this bug there. radeonsi also seems to set this bit nowadays if I see that right (albeit the llvm amdgpu code comment now says "Make clamp modifier on NaN input returns 0" instead of "Do not clamp NAN to 0" since it was changed, which also looks a bit misleading). v2: set it in all shader stages. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103544 Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-15 03:13:46 +01:00
Roland Scheidegger	aab0bfc648	r600: use min_dx10/max_dx10 instead of min/max I believe this is the safe thing to do, especially ever since the driver actually generates NaNs for muls too. The ISA docs are not very helpful here, however the dx10 versions will pick a non-nan result over a NaN one (this is also the ieee754 behavior), whereas the non-dx10 ones will pick the NaN (verified by newly changed piglit isinf-and-isnan test). Other "modern" drivers will most likely do the same. This was shown to make some difference for bug 103544, albeit it is not required to fix it. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-15 03:13:46 +01:00
Dave Airlie	3ceee04a4f	r600: fix cubemap arrays A lot of cubemap array piglits fail, port the texture type picking code from radeonsi which seems to fix most of them. For images I will port the rest of the code. Fixes: getteximage-depth gl_texture_cube_map_array-* fbo-generatemipmap-cubemap array getteximage-targets cube_array amongst others. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-15 11:26:11 +10:00
Rob Clark	7676e71113	freedreno/a5xx: small comment fix Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-14 18:12:47 -05:00
Rob Clark	d27318bdd0	freedreno/a5xx: indirect draw support A couple failures in piglit tests w/ TF or gl_VertexID + indirect draws. OTOH all the deqp tests pass (although they don't test those combinations). I suspect this could be fixed by a firmware update, but I don't think there is much we can do in mesa for that. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-14 18:10:58 -05:00
Rob Clark	f383cf9d41	freedreno/a5xx: split out helper for pipeline stalls We need a similar thing for indirect draws. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-14 18:10:51 -05:00
Rob Clark	d74029bddc	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-14 18:10:43 -05:00
Timothy Arceri	5041ea96a0	gallium/radeon: disable the cache when nir backend enabled Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-15 08:47:31 +11:00
Timothy Arceri	7273e9820e	st/glsl_to_tgsi: use tgsi_get_gl_varying_semantic() for gs/tes outputs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-15 08:26:34 +11:00
Timothy Arceri	bc308122cc	gallium/tgsi: add tess output support to tgsi_get_gl_varying_semantic() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-15 08:26:34 +11:00
Timothy Arceri	4ae9f0b580	st/glsl_to_tgsi: make use of tgsi_get_gl_varying_semantic() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-15 08:26:34 +11:00
Timothy Arceri	3d21eb3b7d	gallium/tgsi: add prim id to tgsi_get_gl_varying_semantic() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-15 08:26:34 +11:00
Anuj Phogat	fc59546e9a	i965: Make use of brw_load_register_imm32() helper function Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: Nanley Chery <nanley.g.chery@intel.com>	2017-11-14 13:23:18 -08:00
Anuj Phogat	1dc45d75bb	i965/gen8+: Fix the number of dwords programmed in MI_FLUSH_DW Number of dwords in MI_FLUSH_DW changed from 4 to 5 in gen8+. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: <mesa-stable@lists.freedesktop.org>	2017-11-14 13:23:18 -08:00
Anuj Phogat	6165fda59b	i965: Program DWord Length in MI_FLUSH_DW Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: <mesa-stable@lists.freedesktop.org>	2017-11-14 13:23:18 -08:00
Anuj Phogat	5d8164c428	anv/gen10: Enable float blend optimization Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2017-11-14 13:23:18 -08:00
Anuj Phogat	72a239266b	intel/genxml: Add Cache Mode SubSlice Register to gen10.xml Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2017-11-14 13:23:18 -08:00
Anuj Phogat	aacf1943c0	anv/gen10: Implement WaSampleOffsetIZ workaround We already have this workaround in OpenGL driver. See Mesa commit `3cf4fe2219`. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: Nanley Chery <nanley.g.chery@intel.com> Cc: Rafael Antognolli <rafael.antognolli@intel.com>	2017-11-14 13:23:18 -08:00
Andres Rodriguez	20e8dfcca9	mesa/st: add missing copyright headers to memoryobjects files Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-14 11:32:44 -08:00
Andres Rodriguez	60baf1a962	mesa: minor tidy up for memory object error strings Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-14 11:31:49 -08:00
Andres Rodriguez	f7580e7204	broadcom/vc4: fix indentation in vc4_screen.c Stumbled into this when adding a new PIPE_CAP. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-14 11:31:36 -08:00
Matt Turner	a31d038208	Revert "intel/fs: Use a pure vertical stride for large register strides" This reverts commit `e8c9e65185`. With the actual bug fixed (by commit `6ac2d16901`), this is not necessary. I'm doubtful of its correctness in any case.	2017-11-14 11:24:08 -08:00
Matt Turner	6ac2d16901	i965/fs: Fix extract_i8/u8 to a 64-bit destination The MOV instruction can extract bytes to words/double words, and words/double words to quadwords, but not byte to quadwords. For unsigned byte to quadword, we can read them as words and AND off the high byte and extract to quadword in one instruction. For signed bytes, we need to first sign extend to word and then sign extend that word to a quadword. Fixes the following test on CHV, BXT, and GLK: KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103628 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-14 10:56:18 -08:00
Matt Turner	cfcfa0b9cd	i965/fs: Split all 32->64-bit MOVs on CHV, BXT, GLK Fixes the following tests on CHV, BXT, and GLK: KHR-GL46.shader_ballot_tests.ShaderBallotFunctionBallot dEQP-VK.spirv_assembly.instruction.compute.uconvert.uint32_to_int64 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103115	2017-11-14 10:56:18 -08:00
Tim Rowley	d8489517a5	swr/rast: Faster emulated simd16 permute Speed up simd16 frontend (default) on avx/avx2 platforms; fixes performance regression caused by switch to simdlib. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-14 11:40:19 -06:00
Tim Rowley	439904847e	swr/rast: Use gather instruction for i32gather_ps on simd16/avx512 Speed up avx512 platforms; fixes performance regression caused by switch to simdlib. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-14 11:39:02 -06:00
Derek Foreman	0db36caa19	egl/wayland: Add a fallback when fourcc query isn't supported When queryImage doesn't support __DRI_IMAGE_ATTRIB_FOURCC wayland clients will die with a NULL dereference in wl_proxy_add_listener. Attempt to provide a simple fallback to keep ancient systems working. Fixes: `6595c69951` ("egl/wayland: Remove more surface specifics from create_wl_buffer") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103519 Signed-off-by: Derek Foreman <derekf@osg.samsung.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-14 15:38:43 +00:00
Marek Olšák	89e669d2fd	radeonsi: remove has_cp_dma, has_streamout flags (v2) v2: remove r600_can_dma_copy_buffer	2017-11-14 15:24:50 +01:00
Julien Isorce	b904ad7d21	i965: implement (un)mapImage Already implemented for Gallium drivers. Useful for gbm_bo_(un)map. Tests: By porting wayland/weston/clients/simple-dmabuf-drm.c to GBM. kmscube --mode=rgba kmscube --mode=nv12-1img kmscube --mode=nv12-2img piglit ext_image_dma_buf_import-refcount -auto piglit ext_image_dma_buf_import-transcode-nv12-as-r8-gr88 -auto piglit ext_image_dma_buf_import-sample_rgb -fmt=XR24 -alpha-one -auto piglit ext_image_dma_buf_import-sample_rgb -fmt=AR24 -auto piglit ext_image_dma_buf_import-sample_yuv -fmt=NV12 -auto piglit ext_image_dma_buf_import-sample_yuv -fmt=YU12 -auto piglit ext_image_dma_buf_import-sample_yuv -fmt=YV12 -auto v2: add early return if (flag & MAP_INTERNAL_MASK) v3: take input rect into account and test with kmscube and piglit. v4: handle wraparound and bo reference. v5: indent, exclude 0 width and height on the boundary, map bo independently of the image. Signed-off-by: Julien Isorce <jisorce@oblong.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-11-14 14:23:13 +00:00
Samuel Pitoiset	8a7d4092d2	radv: force enable LLVM sisched for The Talos Principle It seems safe and it improves performance by +4% (73->76). A drirc based solution is not what we want for now, keep it simple and improve later if it's really needed. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-14 15:21:50 +01:00
Samuel Pitoiset	ecabe2280c	radv: add nosisched debug option Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-14 15:21:48 +01:00
Alejandro Piñeiro	b498172d0e	spirv: fix typo on DO NOT EDIT header Introduced on commit `157c9a1341` Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-14 13:07:36 +01:00
Jon Turney	7df9a3609a	meson: if dep_dl is an empty list, it's not a dependency object It's ok to use an empty list for dependencies:, but it's not ok to try to use the found() method of it. See also https://github.com/mesonbuild/meson/issues/2324 Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-14 12:00:25 +00:00
Bas Nieuwenhuizen	7c25578863	radv: Free temporary syncobj after waiting on it. Otherwise we leak it. Fixes: `eaa56eab6d` "radv: initial support for shared semaphores (v2)" Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-11-14 10:03:02 +01:00
Bas Nieuwenhuizen	917d3b43f2	radv: Free syncobj with multiple imports. Otherwise we can leak the old syncobj. Fixes: `eaa56eab6d` "radv: initial support for shared semaphores (v2)" Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-11-14 10:03:02 +01:00
Jason Ekstrand	fb0e9b5197	i965: Track the depth and render caches separately Previously, we just had one hash set for tracking depth and render caches called brw_context::render_cache. This is less than ideal because the depth and render caches are separate and we can't track moves between the depth and the render caches. This limitation led to some unnecessary flushing around the depth cache. There are cases (mostly with BLORP) where we can end up touching a depth or stencil buffer through the render cache. To guard against this, blorp would unconditionally do a render_cache_set_check_flush on its destination which meant that if you did any rendering (including a BLORP operation) to a given surface and then used it as a blorp destination, you would end up flushing it out of the render cache before rendering into it. Things get worse when you dig into the depth/stencil state code for regular GL draw calls. Because we may end up rendering to a depth or stencil buffer via BLORP, we did a render_cache_set_check_flush on all depth and stencil buffers in brw_emit_depthbuffer to ensure that they got flushed out of the render cache prior to using them for depth or stencil testing. However, because we also need to track dirtiness for depth and stencil so that we can implement depth and stencil texturing correctly, we were adding all depth and stencil buffers to the render cache set in brw_postdraw_set_buffers_need_resolve. This meant that, if anything caused 3DSTATE_DEPTH_BUFFER to get re-emitted (currently _NEW_BUFFERS, BRW_NEW_BATCH, and BRW_NEW_BLORP), we would almost always do a full pipeline stall and render/depth cache flush. The root cause of both of these problems is that we can't tell the difference between the render and depth caches in our tracking. This commit splits our cache tracking into two sets, one for render and one for depth, and properly handles transitioning between the two. We still flush all the caches whenever anything needs to be flushed. The idea is that if we're going to take the hit of a flush and stall, we may as well flush everything in the hopes that we can avoid a flush by something else later. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-13 21:51:59 -08:00
Jason Ekstrand	d6d0ac95d5	i965/blorp: Add more destination flushing Right now we just always flush the destination for render and aren't particularly careful about depth or stencil. Soon, flush_for_render isn't going to do the same thing as flush_for_depth and we may be doing a good deal less depth flushing so we should be a bit more precise. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-13 21:51:59 -08:00
Jason Ekstrand	4a09070295	i965: Add more precise cache tracking helpers In theory, this will let us track the depth and render caches separately. Right now, they're just wrappers around brw_render_cache_set_* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-13 21:51:59 -08:00
Jason Ekstrand	6830ba0d3b	i965: Add stencil buffers to cache set regardless of stencil texturing We may access them as a texture using blorp regardless of whether or not stencil texturing is enabled. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2017-11-13 21:51:59 -08:00
Jason Ekstrand	4b1e70cc57	i965: Switch over to fully external-or-not MOCS scheme Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-13 21:35:52 -08:00
Jason Ekstrand	d7a19d69eb	i965: Use PTE MOCS for all external buffers We were already using PTE for all render targets in case one happened to get scanned out. However, this still wasn't 100% correct because there are still possibly cases where we may want to texture from an external buffer even though we don't know the caching mode. This can happen, for instance, on buffers imported from another GPU via prime. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101691 Cc: "17.3" <mesa-stable@lists.freedesktop.org> Tested-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-13 21:35:44 -08:00
Jason Ekstrand	bc933d0e84	intel/blorp: Make the MOCS setting part of blorp_address This makes our MOCS settings significantly more flexible. Cc: "17.3" <mesa-stable@lists.freedesktop.org> Tested-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-13 19:40:10 -08:00
Jason Ekstrand	deec84fd77	anv/blorp: Add a device parameter to blorp_surf_for_anv_image Cc: "17.3" <mesa-stable@lists.freedesktop.org> Tested-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-13 19:40:09 -08:00
Jason Ekstrand	4639cc716e	intel/blorp: Use mocs.tex for depth stencil Cc: "17.3" <mesa-stable@lists.freedesktop.org> Tested-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-13 19:39:57 -08:00
Kenneth Graunke	866158b4b6	intel/tools/error: Decode compute shaders. This is a bit more annoying than your average shader - we need to look at MEDIA_INTERFACE_DESCRIPTOR_LOAD in the batch buffer, then hop over to the dynamic state buffer to read the INTERFACE_DESCRIPTOR_DATA, then hop over to the instruction buffer to decode the program. Now that we store all the buffers before decoding, we can actually do this fairly easily. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-13 17:11:02 -08:00
Kenneth Graunke	7049c38655	intel/tools/error: Use do-while for field iterator loops. while loops skip the first field of the instruction/structure, which is not what the code intended. It works out because the field we're looking for doesn't happen to be first, but we ought to do it right regardless. Found while writing the next patch, where Kernel Start Pointer is the first field of INTERFACE_DESCRIPTOR_DATA. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-13 17:11:02 -08:00
Kenneth Graunke	8b749ee0ea	intel/tools/error: Decode shaders while decoding batch commands. This makes aubinator_error_decode's shader dumping work like aubinator. Instead of printing them after the fact, it prints them right inside the 3DSTATE_VS/HS/DS/GS/PS packet that references them. This saves you the effort of cross-referencing things and jumping back and forth. It also reduces a bunch of book-keeping, and eliminates the limitation that we could only handle 4096 programs. That code was also broken and failed to print any shaders if there were under 4096 programs. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-13 17:11:02 -08:00
Kenneth Graunke	4979bf2728	intel/tools/error: Save error state sections and decode them later. This lets us complete parsing and storing of each buffer's data before we begin decoding the batchbuffer. This makes it possible to inspect the state buffer and program buffer, so we can properly decode any indirect state or shader programs. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-11-13 17:11:02 -08:00
Kenneth Graunke	eb8ad56ed2	intel/tools/error: Fix null termination of ring name string. Ported from intel_error_decode. We don't want to run off the end. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-11-13 17:11:01 -08:00
Kenneth Graunke	ac17b38e79	intel/tools/error: Drop unused MAX_RINGS #define. Dead code. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-11-13 17:11:01 -08:00
Kenneth Graunke	596e860317	intel/tools/error: Refactor buffer matching, add more buffers. Based on a similar patch to intel_error_decode by Chris Wilson. While we're de-duplicating the gtt_offset calculation, we can simplify it to assume two hex digits are there - the kernel has done this since v4.6, and we already require error states from v4.10. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-11-13 17:10:51 -08:00
Kenneth Graunke	4bb119f00b	intel/tools/error: Only decode a few sections of error states. These three are the only we can reasonably decode with genxml. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-11-13 17:10:38 -08:00
Kenneth Graunke	00981e7c47	intel/tools/error: Drop unused parameters from decode() helper. Also change count from a pointer into a value. We were supposed to be resetting it to 0 (and failed to), but that's gone since we dropped the pre-ascii85 handling. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-11-13 17:10:38 -08:00
Kenneth Graunke	1898bf11a8	intel/tools/error: Drop support for non-ascii85 encoded error states. Error state files used to look like: render ring --- gtt_offset = 0x0e8f6000 00000000 : 69040000 00000004 : 79090000 ... 00007ffc : 00000000 --- ringbuffer = 0x00001000 There were thousands of lines between sections. The file format changed with Kernel 4.10, and now has a single ascii85-encoded line following each section heading. This is much easier to parse. There are a bunch of bugs in our handling of the old style format, where we'd decode the wrong data, at the wrong time. Fixing all of these is going to be a giant pain. It's also a lot of extra code complexity. In order to properly decode indirect state, or compute shaders, we'll also need to parse data in advance of decoding, which is going to be a giant pain with this ad-hoc "decode everywhere!" mentality. So, let's just drop support for the older file format. This unfortunately requires an error state generated by Kernel 4.10 or later. That's probably not the end of the world, as we encourage users to upgrade to the latest kernel when encountering GPU hangs anyway. It might be a giant pain for people with LTS kernels, though... Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-11-13 17:10:38 -08:00
Kenneth Graunke	53586f88d7	intel/tools/error: Do ascii85 decode first. The dashes "---" may occur within an ascii85 block, but only an ascii85 block starts with ':' or '~'. Ported from Chris Wilson's intel-gpu-tools commit: bceec7e1d8a160226b783c6344eae8cbf4ece144 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-11-13 17:10:38 -08:00
Alexander von Gluck IV	8f9d9ddcae	c11/haiku: Define missing timespec_get on Haiku Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-13 16:45:14 -06:00
Alexander von Gluck IV	f09e001a05	egl/haiku: Correct invalid void* conversion in calloc Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-13 16:45:10 -06:00
Dylan Baker	46a7fdd7ca	meson: Remove build_by_default from amd code This is the same logic as the previous two patches. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-13 13:43:20 -08:00
Dylan Baker	49fa074726	meson: Don't build intel shared components by default It's a neat idea, and still useful in some cases, but the intel common code is used by i965 and anvil only, this is a little clearer. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-13 13:43:15 -08:00
Dylan Baker	2bfd34c518	meson: don't use build_by_default for specific gallium drivers Using build_by_default : false is convenient for dependencies that can be pulled in by various diverse components of the build system, the gallium hardware/software drivers and state trackers do not fit that description. Instead, these should be guarded using the variable that tracks whether that driver should be enabled. This leaves a few helper libraries: trace, rbug, etc, and the generic winsys bits as `build_by_default : false` because there are a large number of gallium components that pull them in. v2: - remove build_by_default from winsys convenience libs as well. v3: - Always put drivers before winsys for consistency Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1) Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-13 13:43:12 -08:00
Dave Airlie	63b6eb9cb9	r600/shader: handle bitfield extract semantics properly. Fixes: tests/spec/arb_gpu_shader5/execution/built-in-functions/fs-bitfieldExtract.shader_test Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-14 06:16:06 +10:00
Dave Airlie	0442809205	r600: handle bitfieldInsert corner case. This handles the bits >= 32 corner case in bitfieldInsert. Fixes: tests/spec/arb_gpu_shader5/execution/built-in-functions/fs-bitfieldInsert.shader_test. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-14 06:16:06 +10:00
Dave Airlie	53d5dda6f8	r600: add gs tri strip adjacency fix. Like radeonsi: generate GS prolog to (partially) fix triangle strip adjacency rotation evergreen hw suffers from the same problem, so rotate the geometry inputs to fix this. This fixes: ./bin/glsl-1.50-geometry-primitive-types GL_TRIANGLE_STRIP_ADJACENCY on evergreen. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-14 06:16:06 +10:00
Dave Airlie	f3f8615d76	r600: fix isoline tess factor component swapping. As per radeonsi, the tess factor components for isolines are reversed. Fixes: tests/spec/arb_tessellation_shader/execution/isoline.shader_test Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-14 06:16:06 +10:00
Dave Airlie	50330d7115	r600/shader: reserve first register of vertex shader. r0 in input into vertex shaders contains things like vertexid, we need to reserve it even if we have no inputs. This fixes a bunch of tessellation piglits. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-14 06:16:05 +10:00
Jason Ekstrand	00fb21b570	meson: Move -Dvulkan-drivers handling higher in the file The window-system auto-detection code (specifically for glx) relies on with_any_vk being available. This fixes the Vulkan-only build. Also, this puts it up near the handling of -Ddri-drivers and -Dgallium-drivers which seems to make a bit more sense. Fixes: `118a7f0441` "meson: add support for xlib glx" Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-13 11:55:06 -08:00
Jason Ekstrand	3a922d6a61	meson: Stop requiring platforms for Vulkan It should be perfectly valid to build a completely headless Vulkan driver. We don't need to require a platform. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-13 11:54:44 -08:00
Dave Airlie	da31e2c22d	r600: don't emit atomic save if we have no atomic counters. Otherwise we end up emitting the fence. Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-14 05:40:47 +10:00
Adam Jackson	257edb5b9a	glx/dri3: Fix passing renderType into glXCreateContext Without this, trying to create a GLX_RGBA_FLOAT_TYPE_ARB context would fail, because GLX_RGBA_TYPE would be a mismatch with the fbconfig. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2017-11-13 10:40:30 -05:00
Adam Jackson	033cfb17db	glx/drisw: Fix glXMakeCurrent(dpy, None, ctx) This is perfectly legal in GL 3.0+. Fixes piglit/glx-create-context-current-no-framebuffer. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2017-11-13 10:39:48 -05:00
Adam Jackson	bc1bc6f512	glx: Lower GLX opcode lookup into SendMakeCurrentRequest Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2017-11-13 10:39:33 -05:00
Jason Ekstrand	93200ea26d	aubinator: Don't skip the first field in each subgroup The previous iteration algorithm would advance the field pointer right after we advance the group. This meant that you would end up with skipping the first field of the group. In the common case, where the only field is a struct (e.g. 3DSTATE_VERTEX_BUFFERS), it would get skipped entirely. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-13 07:37:23 -08:00
Jason Ekstrand	74a9e51696	intel/genxml: Delete empty groups They serve no purpose other than to just fill empty space in the packet so each dword has something. Just disallowing empty groups is a bit easier on some of the tools. This does not change the generated packing headers in any way. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-13 07:37:23 -08:00
Jason Ekstrand	54a6f7eaca	anv: Don't crash on invalid heap sizes when the PCI ID is overridden	2017-11-13 07:37:23 -08:00
Alex Smith	4122d00846	nir/spirv: tg4 requires a sampler Gather operations in both GLSL and SPIR-V require a sampler. Fixes gathers returning garbage when using separate texture/samplers (on AMD, was using an invalid sampler descriptor). Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-13 13:38:18 +00:00
Alex Smith	e9eb3c4753	spirv: Use correct type for sampled images We should use the result type of the OpSampledImage opcode, rather than the type of the underlying image/samplers. This resolves an issue when using separate images and shadow samplers with glslang. Example: layout (...) uniform samplerShadow s0; layout (...) uniform texture2D res0; ... float result = textureLod(sampler2DShadow(res0, s0), uv, 0); For this, for the combined OpSampledImage, the type of the base image was being used (which does not have the Depth flag set, whereas the result type does), therefore it was not being recognised as a shadow sampler. This led to the wrong LLVM intrinsics being emitted by RADV. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-13 13:37:50 +00:00
Alejandro Piñeiro	157c9a1341	spirv: add DO NOT EDIT warning on generated spirv_info.c Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-13 13:28:44 +01:00
Thomas Hellstrom	54a58b2856	loader/dri3: Improve dri3 thread-safety It turned out that with recent changes that call into dri3 from glFinish(), different threads can end up waiting for X events simultaneously, causing deadlocks since they steal events from each other and update the dri3 counters behind each other's backs. This patch intends to improve on that. It allows at most one thread at a time to wait on events for a single drawable. If another thread intends to do the same, it's put to sleep until the first thread finishes waiting, and then it rechecks counters and optionally retries the waiting. Threads that poll for X events never pull X events off the event queue if there are other threads waiting for events on that drawable. Counters in the dri3 drawable structure are protected by a mutex. Finally, the mutex we introduce is never held while waiting for the X server to avoid unnecessary stalls. This does not make dri3 drawables completely thread-safe but at least it's a first step. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102358 Fixes: `d5ba75f888` "st/dri2 Plumb the flush_swapbuffer functionality through to dri3" Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-13 12:43:39 +01:00
Juan A. Suarez Romero	2b72ab58e5	etnaviv: automake,meson: include common_3d.xml.h in the sources lists v2: include the file also in the meson.build (Eric Engestrom). Fixes: `f1e1c60ff6` ("etnaviv: Update from rnndb") Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-13 12:26:59 +01:00
Tapani Pälli	41f7de477c	egl: EXT_pixel_format_float plumbing Patch adds support and capability to match with new surface attribute, component type. Currently no configs with floating point type are exposed. With this change, following dEQP test starts to pass: dEQP-EGL.functional.choose_config.color_component_type_ext.dont_care dEQP-EGL.functional.choose_config.color_component_type_ext.fixed dEQP-EGL.functional.choose_config.color_component_type_ext.float Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2017-11-13 12:40:26 +02:00
Samuel Pitoiset	934b77f2fe	radv: add unlikely() around radv_save_descriptors() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:05:40 +01:00
Samuel Pitoiset	305745457c	radv: optimize calling radv_cmd_buffer_trace_emit() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:05:38 +01:00
Samuel Pitoiset	957d42271b	radv: optimize calling radv_save_pipeline() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:05:36 +01:00
Samuel Pitoiset	ebab5c8ff4	radv: use vk_zalloc instead of vk_alloc+memset Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:05:35 +01:00
Samuel Pitoiset	0f68208f1d	radv: remove unnecessary memset() in radv_AllocateCommandBuffers() This should not be needed, if the allocation fails an error is returned and the host should handle it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:05:32 +01:00
Samuel Pitoiset	66da4c75bc	radv: remove useless initializations in radv_create_cmd_buffer() There is a memset() above. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:05:30 +01:00
Samuel Pitoiset	3d95fde661	radv: remove useless memset() in radv_CreateFence() All radv_fence fields are initialized here. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:05:28 +01:00
Samuel Pitoiset	cd64a4f705	radv: use vk_error() everywhere an error is returned For consistency and it might help for debugging purposes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:05:26 +01:00
Samuel Pitoiset	4e16c6a41e	radv: make radv_emit_framebuffer_state() static Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:04:25 +01:00
Samuel Pitoiset	be01197d8d	radv: do not emit the framebuffer when restoring a pass Instead just dirty RADV_CMD_DIRTY_FRAMEBUFFER and it will be re-emitted if necessary before the next draw. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:04:22 +01:00
Samuel Pitoiset	f87c58dde3	radv: prefetch VBO descriptors at the right place Just after the vertex shader. This seems to give a minor boost for, at least, Serious Sam Fusion 2017 and Dawn of War 3. I don't see any real impacts with The Talos Principle. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:03:16 +01:00
Samuel Pitoiset	9444a34f4a	radv: add radv_emit_prefetch_TC_L2_async() helper Will be used for VBO descriptors prefetching. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:03:13 +01:00
Samuel Pitoiset	36c2e46328	radv: rename radv_emit_shaders_prefetch() to radv_emit_prefetch() For consistency because this function will also prefetch VBO descriptors. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:03:11 +01:00
Iago Toral Quiroga	456e10944f	glsl/linker: use without_array() to retrieve type This is what we do in the condition too, so it makes sense. v2: Only compute without_array() once (Ilia). Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-13 09:22:26 +01:00
Dave Airlie	bec716e844	radv: emit esgs ring size in one place. This register is the same on all gpus so far, so emit it in one place and also for the pre-gfx9 gpus set the value in the pipeline creation. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-13 07:17:09 +00:00
Dave Airlie	031e591923	radv: move calculating vs out info regs into pipeline. This moves some calculations of register values into the pipeline construction, it saves looking at outinfo in the cmd buffer emit. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-13 07:16:53 +00:00
Rob Clark	4a9aad96aa	freedreno/a5xx: fix SSBO emit for non-zero offset Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:29:00 -05:00
Rob Clark	5f25ab4fee	freedreno/a5xx: remove obsolete comment Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:29:00 -05:00
Rob Clark	8fcee858d5	freedreno/ir3: don't create split/fo if only writing .x In case an instruction only writes one register, and it is .x, we can skip the extra level of fanout indirection. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	e7b2719f69	freedreno/a5xx: indirect grids Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	471aa1b6d0	freedreno/a5xx: add global size compute cap Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	62981bbe65	freedreno/ir3: turn on std430 packing Seems to fix dEQP compute related tests.. and matches what i965 does, so perhaps there is some assumption that std430 packing is on by default somewhere in NIR?	2017-11-12 12:28:59 -05:00
Rob Clark	bedbe7f90c	freedreno/a5xx: image support	2017-11-12 12:28:59 -05:00
Rob Clark	819a613ae3	freedreno/ir3: moar better scheduler Add a new pass that inserts additional dependencies, rather than simply relying on SSA srcs added in the nir->ir3 frontend. This makes it easier to deal with barriers, but the additional false deps also lets us deal properly with ensuring a write depends on all previous reads. Since conversion to barrier instructions is lossy (ie. just knowing the instruction doesn't tell us enough about what other instructions the barrier applies to), use barrier_class/barrier_conflict fields in the ir3_instruction to retain this information. This could probably be relaxed somewhat by considering which array/ buffer/image variable is being referenced. Ie. a write to buffer A can overtake a read from buffer B, if B is not coherent. (right?) Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	15ea8d128a	freedreno/ir3: move macros I want to add a growable array to ir3_instruction, so we can append false dependencies for purposes of scheduling barriers, atomics, and dealing with write after read hazards. Just code motion preparing for next patch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	9edfc369c0	freedreno/ir3: image support Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	eaae81058c	freedreno/ir3: shared variable support Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	dd75abc6f3	freedreno/ir3: some SSBO cleanups/fixes Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	2f8bdf2e2b	freedreno/ir3: split out INSTR4F instructions Atomic instructions take a different # of src args depending on .g or .l variant, split these out into different helpers with INSTR*F() helper macro that lets you specify instruction flag. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	0038deb256	freedreno/ir3: cat6 encoding fixes Instruction encoding/decoding fixes needed for images, shared variables, etc. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	4e9a6c6868	freedreno/ir3: add barriers Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	4c711f4d18	freedreno/ir3: invert is_same_type_mov() logic Some instructions (like barriers) have no dst, which causes problems with dereferencing a NULL dst. Flip the logic around to reject opc's that can't be a type of move first, to filter out those instructions. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	6da5130074	freedreno/ir3: add cat7 instructions Needed for memory and execution barriers. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	33f5f63b8f	freedreno/ir3: add SSBO get_buffer_size() support Somehow I overlooked this when adding initial SSBO support. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	b267a08404	freedreno/ir3: extract helper for common consts User consts and driver consts such as UBO addresses and immediates are handled the same for all shader stages, so split out a shared helper for these, to make it easier to add more. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	13fe1feb62	freedreno: add image view state tracking It is unfortunate that image state isn't a real CSO, since (at least for a4xx/a5xx) it is a combination of sampler and "SSBO" image state, and it would be useful to pre-compute the state block "register" values rather than doing it at emit time. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	12c1c3ab23	freedreno: update generated headers Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	0006b860ce	mesa/st/nir: assign driver_location for images Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Rob Clark	ecbe1e976f	st/program: fix compute shader nir references In case the IR is NIR, the driver takes reference to the nir_shader. Also, because there are no variants, we need to clone the shader, instead of sharing the reference with gl_program, which would result in a double free in _mesa_delete_program(). Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-12 12:28:59 -05:00
Rob Clark	5009dc55f2	freedreno/ir3: rename ir3_compile -> ir3_context Having both an ir3_compile (which was really context for compiling a single shader variant) and ir3_compiler (which is the compiler object that compiles all variants, ie. basically holds the RA regset) is a bit confusing. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-12 12:28:59 -05:00
Kenneth Graunke	9a0465b3a3	intel/tools: Fix detection of enabled shader stages. We renamed "Function Enable" to "Enable", which broke our detection of whether shaders are enabled or not. So, we'd see a bunch of HS/DS packets with program offsets of 0, and think that was a valid TCS/TES. Fixes: `c032cae9ff` (genxml: Rename "Function Enable" to "Enable".) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-12 00:16:40 -08:00
Timothy Arceri	b99fb1a04d	st/atifs: remove unrequired initialisation of gl_program fields As far as I can tell these fields are only used to query arb program info and are not related to ATI_fragment_shader. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Miklós Máté <mtmkls@gmail.com>	2017-11-12 11:59:22 +11:00
Timothy Arceri	8fe6abd964	ac: add emit_vertex to the abi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-12 11:08:26 +11:00
Timothy Arceri	dc42a2177c	radeonsi: rework gs_vtx_offset handling This simplifies things a bit and will enable it to work with the common NIR -> LLVM code. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-12 11:08:26 +11:00
Timothy Arceri	8c9f3f2c46	nir: add streams to nir data This will be used by gallium drivers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-12 11:08:26 +11:00
Marek Olšák	3a71eac783	st/dri: fix deadlock when waiting on android fences Android fences can't be deferred, because st/dri calls fence_finish with ctx = NULL, so the driver can't flush u_threaded_context. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-11 04:12:53 +01:00
Rob Clark	881f6e741f	meson: Guard freedreno build with with_gallium_freedreno. This prevents build failures when libdrm_freedreno is unavailable, which started happening after the ir3_compiler build was enabled. (Patch by Rob, commit message by Ken). Fixes: `fecd04a66a` ("freedreno/ir3: fix standalone compiler meson build") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-10 17:11:48 -08:00
Andres Gomez	d48492074a	docs: update calendar, add news item and link release notes for 17.2.5 Signed-off-by: Andres Gomez <agomez@igalia.com>	2017-11-11 01:25:40 +02:00
Andres Gomez	30a3da45fa	docs: add sha256 checksums for 17.2.5 Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit `96ad27f8fc`)	2017-11-11 01:25:01 +02:00
Andres Gomez	bb40d2f3b8	docs: add release notes for 17.2.5 Signed-off-by: Andres Gomez <agomez@igalia.com> (cherry picked from commit `ae52410bf0`)	2017-11-11 01:24:56 +02:00
Jason Ekstrand	2927014313	i965/gen10: Use the correct form of \| for the RCPFE workaround Found by inspection Fixes: `d3d0fe4572` Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-11-10 14:36:57 -08:00
Kenneth Graunke	b8d42cccd0	i965: Make L3 configuration atom listen for TCS/TES program updates. The L3 configuration code already considers the TCS and TES programs, but failed to listen for TCS/TES program changes. This was somehow missing. Fixes: `e9644cb1f9` ("i965: Consider tessellation in get_pipeline_state_l3_weights.") Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2017-11-10 13:34:59 -08:00
Dylan Baker	ad9c2f5469	meson: build gallium-xlib based glx Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-10 13:00:01 -08:00
Dylan Baker	118a7f0441	meson: add support for xlib glx There is a bunch of churn in the main meson.build so that we can correctly set the auto tristate of GLX. In particular, don't build xlib-based glx when dri and gallium are disabled but vulkan is enabled, in that case just turn glx off. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-10 13:00:01 -08:00
Dylan Baker	13752af4ed	meson: move gl pkgconfig generation out of glx Because the same generation logic is required by xlib glx and gallium-xlib glx, it makes sense to pull it out. v2: - Ensure that libgl is defined before trying to generate a pkgconfig file with it. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-10 13:00:01 -08:00
Dylan Baker	dc0ec581f2	meson: move wayland-egl into egl folder This ensures that it's properly guarded, but also makes the code clearer by grouping related things together. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-10 13:00:01 -08:00
Dylan Baker	140b688c57	meson: add nir_builder_opcodes_h to gallium_auxiliary This creates a dependency on this header being generated before trying to compile any of these targets, as well as passing the correct -I to the compiler to ensure it's included correctly. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-10 12:59:54 -08:00
Dylan Baker	7210d0096a	gallium/xlib: remove GL_{MAJOR,MINOR,TINY} These variables were removed from autotools in 2008 (sha: `80f68e1b6a`), but they have lived on here. The Scons build meanwhile doesn't set a patch/tiny version at all, just major and minor. This patch removes the unused variables and simply sets the version, leaving patch/tiny as 0 since that's what the autotools build has been doing forever. This shouldn't change any behavior. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-10 12:40:08 -08:00
Timothy Arceri	f9e5216f71	radeonsi: get llvm types from ac Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-11 06:54:25 +11:00
Jon Turney	d1edf6e396	glx/windows: Fix building libwindowsdri when libX11 headers are installed in a non-standard location Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Adam Jackson <ajax@redhat.com>	2017-11-10 18:24:12 +00:00
Jon Turney	764f1e4d45	util: include unistd.h, which may be required for usleep prototype This seems to be dropped in `222a2fb9` "util: move os_time.[ch] to src/util" ../../../src/util/os_time.c: In function ‘os_time_sleep’: ../../../src/util/os_time.c:104:4: error: implicit declaration of function ‘usleep’ [-Werror=implicit-function-declaration] Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-10 18:20:35 +00:00
Dylan Baker	854455498c	autotools: Set C++ visibility flags on Intel These flags are set for C sources, but not C++. This causes symbol visibility leaks from the C++ parts of the Intel compiler. Fixes: `700bebb958` ("i965: Move the back-end compiler to src/intel/compiler") Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-11-10 09:41:55 -08:00
Andres Gomez	b8c85f5acc	docs/releasing: improve the pre-announce template and examples v2: Choose a proper rejection example (Emil). Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-10 19:36:24 +02:00
Andres Gomez	3bc16339e5	docs/releasing: drop manually exported variables during smoke test Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-10 19:36:00 +02:00
Andres Gomez	5ccbac8e04	docs/releasing: drop custom LLVM_CONFIG if previously manually set Cc: Emil Velikov <emil.velikov@collabora.com> Cc: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-10 19:34:51 +02:00
Marek Olšák	e456d4def5	st/dri: fix android fence regression Fixes piglit - egl_khr_fence_sync/android_native tests. Broken by `884a0b2a9e`. Introduce state-tracker flush flags, analogous to the pipe ones. Use the former with stapi->flush(). Fixes: `884a0b2a9e` ("st/dri: use stapi flush instead of pipe flush when creating fences") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-10 17:17:13 +01:00
Nicolai Hähnle	e7972b8943	util/u_thread: fix compilation on Mac OS Apparently, it doesn't have pthread barriers. p_config.h (which was originally used to guard this code) uses the __APPLE__ macro to detect Mac OS. Fixes: `f0d3a4de75` ("util: move pipe_barrier into src/util and rename to util_barrier") Cc: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-10 16:37:54 +01:00
Nicolai Hähnle	f53570a7a6	util/u_queue: handle OS_TIMEOUT_INFINITE in util_queue_fence_wait_timeout Fixes e.g. piglit/bin/bufferstorage-persistent read -auto Fixes: `e6dbc804a8` ("winsys/amdgpu: handle cs_add_fence_dependency for deferred/unsubmitted fences") Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-10 16:37:47 +01:00
Nicolai Hähnle	ee880e91cc	gallium/u_threaded: fix end_query regression Ouch... Fixes: `244536d3d6` ("gallium/u_threaded: avoid syncs for get_query_result") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103653 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-10 16:37:37 +01:00
Bruce Cherniak	d473f91758	swr: Fixed an uncommon freed-memory access during state validation State validation is performed during clear and draw calls. Validation during clear was still accessing vertex buffer state. When the currently set vertex buffers are client arrays, this could lead to accessing freed memory. Such is the case with the VMD application. Previously, vertex buffer validation depended on a dirty bit or the draw info indicating an indexed draw. This required special handling for clears. But, vertex buffer validation still occurred which was unnecessary and wrong. Now, only minimal validation is performed during clear, deferring the remainder to the next draw. And, by setting the dirty bit in swr_draw_vbo for indexed draws, vertex buffer validation is only dependent upon a single dirty bit. This fixes a bug exposed by the VMD application when changing models. Reviewed-By: George Kyriazis <george.kyriazis@intel.com>	2017-11-10 08:55:42 -06:00
Rob Clark	fecd04a66a	freedreno/ir3: fix standalone compiler meson build Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-10 08:57:33 -05:00
Rob Clark	86154acb57	freedreno/ir3: correct # of dest components for intrinsics Don't rely on intr->num_components having a valid value. It doesn't seem to anymore for non-vectorized intrinsics. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-10 08:57:33 -05:00
Rob Clark	3fcf18634c	freedreno/ir3: remove bogus assert The ssbo atomic instructions are not vectorized. So num_components is not expected to be valid. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-11-10 08:57:33 -05:00
Rob Clark	ef4c42fc3a	nir: handle get_buffer_size in nir_lower_atomics_to_ssbo Overlooked initially, but we need to remap the SSBO index for this as well. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-10 08:57:33 -05:00
Chad Versace	cd6f79a71d	anv/meson: Generate dev_icd.json I tested this in a setup where the builddir was outside of the srcdir. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-09 16:29:33 -08:00
Chad Versace	b7441ef252	anv: Fix architecture in intel_icd.{arch}.json Use the host arch, not the target arch. In Meson and in recent Autotools, the host arch is where the binary will be used. The target arch is useful only when compiling a compiler. See: http://mesonbuild.com/Cross-compilation.html See: https://www.gnu.org/software/automake/manual/html_node/Cross_002dCompilation.html Reported-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-09 16:29:31 -08:00
Chad Versace	9ec33972cc	radv: Fix architecture in radeon_icd.{arch}.json Use the host arch, not the target arch. In Meson and in recent Autotools, the host arch is where the binary will be used. The target arch is useful only when compiling a compiler. See: http://mesonbuild.com/Cross-compilation.html See: https://www.gnu.org/software/automake/manual/html_node/Cross_002dCompilation.html Reported-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-09 16:29:28 -08:00
Chad Versace	2a4798ad98	anv: Refactor anv_GetImageSubresourceLayout() Its helper function, anv_surface_get_subresource_layout(), was not very helpful. So fold it into the main function. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-09 16:01:59 -08:00
Chad Versace	69e3f0b02e	anv/image: Refactor choice of isl_tiling_flags_t Instead of choosing the tiling flags inside make_surface(), which is called once per aspect in a loop, and which chooses the same tiling for each aspect, choose the tiling flags exactly once before entering the aspect loop. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-09 16:01:59 -08:00
Chad Versace	7bb4387105	anv: Refactor anv_get_format_plane() - explicit unsupported The same local variable, 'plane_format', was returned on success and failure. Be more explicit in distinguishing the two cases: return 'plane_format' on success and return 'unsupported' on failure. This simplifies the diff in upcoming patches for VK_EXT_image_drm_format_modifier. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-09 16:01:59 -08:00
Chad Versace	3ee7f4bc2f	anv: Remove anv_physical_device_get_format_properties() Fold its body into its sole caller, anv_GetPhysicalDeviceFormatProperties(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-09 16:01:59 -08:00
Chad Versace	891d237667	anv: Simplify anv_physical_device_get_format_properties() Now that get_image_format_properties() returns the correct VkFormatFeatureFlags, we can remove the unneeded if-branch and some local variables. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-09 16:01:59 -08:00
Chad Versace	b3e2ce0580	anv: Simplify anv_get_image_format_properties() Now that get_image_format_features() has a VkImageTiling parameter, we can bypass anv_physical_device_get_format_properties() and call get_image_format_features() directly. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-09 16:01:59 -08:00
Chad Versace	cd3fe376e0	anv: Rename get_image_format_properties() The name is misleading. It looks like vkGetPhysicalDeviceImageFormatProperties(), but it actually implements vkGetPhysicalDeviceFormatProperties. Let's rename it to what it actually does, get_image_format_features(), because it returns VkFormatFeatureFlags. For consistency, also rename get_buffer_format_properties() to get_buffer_format_features(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-09 16:01:59 -08:00
Chad Versace	17ac61a2c9	anv: Fix get_image_format_properties() - YCbCr Teach it to calculate the format features for YCbCr. The goal (which is completed in this patch) is to incrementally fix get_image_format_properties() to return a correct result. Previously, it returned incorrect VkFormatFeatureFlags which the caller needed to clean up. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-09 16:01:59 -08:00
Chad Versace	eaa49ec3fc	anv: Fix get_image_format_properties() - 3-channel formats Teach it to calculate the format features for 3-channel formats. The goal is to incrementally fix get_image_format_properties() to return a correct result. Currently, it returns incorrect VkFormatFeatureFlags which the caller must clean up. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-09 16:01:59 -08:00
Chad Versace	6394e4a380	anv: Refactor get_image_format_properties() - Reduce params Replace parameters 'enum isl_format' and 'struct anv_format_plane' with new parameter 'const struct anv_format *'. The goal is to incrementally fix get_image_format_properties() to return a correct result. Currently, it returns incorrect VkFormatFeatureFlags which the caller must clean up. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-09 16:01:59 -08:00
Chad Versace	66647074a4	anv: Refactor get_image_format_properties() - base_isl_format Rename parameter 'base' to 'base_isl_format'. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-09 16:01:59 -08:00
Chad Versace	c22a9f10be	anv: Refactor get_image_format_properties() - plane_format Rename parameter 'format' to 'plane_format'. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-09 16:01:59 -08:00
Chad Versace	096fc6915b	anv: Refactor get_image_format_properties() - ASTC Teach it to calculate the format features for ASTC. The goal is to incrementally fix get_image_format_properties() to return a correct result. Currently, it returns incorrect VkFormatFeatureFlags which the caller must clean up. v2: New commit message Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-09 16:01:59 -08:00
Chad Versace	8ae4e97536	anv: Refactor get_image_format_properties() - depthstencil (v2) Teach it to calculate the features of depthstencil formats. The goal is to incrementally fix get_image_format_properties() to return a correct result. Currently, it returns incorrect VkFormatFeatureFlags which the caller must clean up. v2: New commit message Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-09 16:01:59 -08:00
Chad Versace	6720abf292	anv: Better types for 'aspect' function params Some functions have a comment that says "Exactly one bit must be in 'aspect'". So change the type of their 'aspect' parameter from VkImageAspectFlags to VkImageAspectFlagBits. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-09 16:01:59 -08:00
Chad Versace	342c811646	anv: Refactor get_buffer_format_properties() Make it a stand-alone function. Pre-patch, for some formats the function returned incorrect VkFormatFeatureFlags which were cleaned up by the caller. This prepares for a cleaner implementation of VK_EXT_image_drm_format_modifier. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-09 16:01:59 -08:00
Eric Anholt	62deeaa23a	broadcom/vc4: Fix simulator mode for the MADVISE usage.	2017-11-09 15:51:56 -08:00
Marek Olšák	272fe94942	mesa: enable ARB_texture_buffer_* extensions in the Compatibility profile We already have piglit tests testing alpha, luminance, and intensity formats. They were skipped by piglit until now. Additionally, I'm enabling one ARB_texture_buffer_range piglit test to run with the compat profile. i965 behavior is unchanged except that it doesn't expose TBOs in the Compat profile. Not sure how that affects the GL version override. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-09 23:55:31 +01:00
Dave Airlie	d4ebdc1a54	docs: update r600 atomic counter status. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-10 08:39:36 +10:00
Dave Airlie	06993e4ee3	r600: add support for hw atomic counters. (v3) This adds support for the evergreen/cayman atomic counters. These are implemented using GDS append/consume counters. The values for each counter are loaded before drawing and saved after each draw using special CP packets. v2: move hw atomic assignment into driver. v3: fix messing up caps (Gert Wollny), only store ranges in driver, drop buffers. Signed-off-by: Dave Airlie <airlied@redhat.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-By: Gert Wollny <gw.fossdev@gmail.com>	2017-11-10 08:39:36 +10:00
Dave Airlie	9e62654d4b	st/mesa: add support for hw atomics to glsl->tgsi. (v5) This adds support for creating the hw atomic tgsi from the glsl codepaths. v2: drop the atomic index and move to backend. v3: drop buffer decls. (Marek) v4: fix off by one (Gert) v5: fix off by one the other way (Dave) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-10 08:39:35 +10:00
Dave Airlie	58d061ceef	st/mesa: setup hw atomic limits. (v1.1) HW atomics need to use caps to set some limits, and some other limits may also need limiting. This fixes things up to work for evergreen hw, it may need more changes in the future if other hw wants to use this path. v1.1: fix indent. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-10 08:39:35 +10:00
Dave Airlie	9f1db21f28	st/mesa: start adding support for hw atomics atom. (v2) This adds a new atom that calls the new driver API to bind buffers containing hw atomics. v2: fixup bindings for sparse buffers. (mareko/nha) don't bind buffer atomics when hw atomics are enabled. use NewAtomicBuffer (mareko) Tested-By: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-10 08:39:35 +10:00
Dave Airlie	e0fb6de313	mesa/program: add hw atomic counter file This is needed for the GLSL->TGSI translation for hw atomic counters. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-10 08:39:35 +10:00
Dave Airlie	cca5617348	gallium: add hw atomic buffer binding API. This API binds atomic buffers for all bound shaders (as per the GL semantics). This is needed to support cross shader hw atomic counters. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-10 08:39:35 +10:00
Dave Airlie	4b0b82770a	gallium/tgsi: start adding hw atomics (v3.2) This adds support for hw atomic counters to TGSI. A new register file for storing atomic counters is added, along with a new atomic counter semantic, along with docs for both. v2: drop semantic, move hw counter to backend, Ilia pointed out SSO would have busted my plan, and he was right. v3: drop BUFFER decls. (Marek) v3.1: minor fixups for whitespace, set ureg error if we overflow the hw atomic limits. (nha) v3.2: fix some docs inconsistencies (Ilia) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-10 08:39:35 +10:00
Dave Airlie	2a06423c00	gallium: add CAPs to support HW atomic counters. (v3) This looks like an evergreen specific feature, but with atomic counters AMD have hw specific counters they use instead of operating on buffers directly. These are separate to the buffer atomics, so require different limits and code paths. I've left the CAP for atomic type extensible in case someone else has a variant on this sort of thing (freedreno maybe?) and needs to change it. This adds all the CAPs required to add support for those atomic counters, along with a related CAP for limiting the number of output resources. I'd like to land this and the st patch then I can start to upstream the evergreen support for these and other GL4.x features. v2: drop the ATOMIC_COUNTER_MODE cap, just use the return from the HW counters. If 0 we use the current mode. v3: fix some rebase errors (Gert Wollny) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-By: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-10 08:39:34 +10:00
Dave Airlie	24baca6e75	r600/query: drop rest of vi workaround code. This isn't needed in r600 anymore. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-10 08:39:16 +10:00
Roland Scheidegger	dd38a4ee0d	docs: Fix GL_MESA_program_debug enums `13b303ff92` added the actual enums but didn't remove the already existing XXXX ones. (And also duplicated the "fragment" names instead of using the "vertex" names.) Fixes: `13b303ff92` "docs: Update the list of used MESA GL enums." Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-11-09 23:35:12 +01:00
Brian Paul	b99c6288c1	st/mesa: remove 'struct' keyword on function parameter st_src_reg is a class, not a struct. Simply remove 'struct' to silence a MSVC compiler warning (class vs. struct mismatch). Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-11-09 14:13:59 -07:00
Brian Paul	750ee4182e	threads: fix MinGW build breakage Fixes: `f1a3648784` ("threads: update for late C11 changes") Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-11-09 14:13:59 -07:00
Brian Paul	f8bae523d9	mesa: s/GLint/gl_buffer_index/ for _ColorDrawBufferIndexes Also fix local variable declarations and replace -1 with BUFFER_NONE. No Piglit changes. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-11-09 14:13:59 -07:00
Brian Paul	366453f4d3	mesa: s/GLint/gl_buffer_index/ for _ColorReadBufferIndex BUFFER_NONE is -1 so no reason for GLint. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-11-09 14:13:59 -07:00
Brian Paul	9862a8403e	mesa: minor reformatting, add const to gl_external_samplers() This function should probably be moved elsewhere, too. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-11-09 14:13:59 -07:00
Brian Paul	ad5f407b61	st/mesa: whitespace clean-up in st_mesa_to_tgsi.c Remove trailing whitespace, fix indentation, wrap lines to 78 columns, etc. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-11-09 14:13:59 -07:00
Dylan Baker	1873327c4b	meson: implement default driver arguments This allows drivers to be set by OS/arch in a sane manner. v2: - set _drivers to a list of drivers instead of manually assigning each with_* v3: - Use "auto" instead of "default", which matches the value of other automatically configured options. - Set vulkan drivers as well - Add error message if no automatic drivers are known for a given arch/OS combo - use not(darwin or windows) instead of (linux or bsd), which is probably more accurate (that way Solaris and other nix systems aren't excluded) - rename softpipe to swrast, as swrast is the actual option name Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2017-11-09 13:10:59 -08:00
Kenneth Graunke	8ecdbb6136	i965: Pretend there are 4 subslices for compute shader threads on Gen9+. Similar to what we did for pixel shader threads - see gen_device_info.c. We don't want to bump the actual Maximum Number of Threads though, so we adjust it here. For pixel shaders, we don't use max_wm_threads, so we could just bump it globally. Supposedly fixes Piglit tests: arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-i64vec3-int64_t arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-i64vec4-int64_t arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-u64vec4-uint64_t Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-11-09 12:34:11 -08:00
Dylan Baker	3e9533d9b8	meson: Add script to use VERSION file for getting version Meson has up until this point set its version in the root meson.build script, while the other build systems read the VERSION file. This is just "one more thing" to duplicate between meson and every other build system. This script is a simple "read, strip, print" sort of deal to allow meson to read the VERSION file. I chose to implement this in python since python is portable, and to keep the meson.build script clean. This is also complicated by the fact that the project() call must be the first non-comment, non-blank line in the toplevel meson.build script. v2: - Move from scripts/ to bin/ - use python explicitly to run the scripts to support windows Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-09 11:19:53 -08:00
Boris Brezillon	359a8f6ae5	broadcom/vc4: Mark BOs as purgeable when they enter the BO cache This patch makes use of the DRM_IOCTL_VC4_GEM_MADVISE ioctl to mark all BOs placed in the mesa BO cache as purgeable so that the system can reclaim this memory under memory pressure. v2: - Removed BOs from the cache when they've been purged by the kernel - Check whether the madvise ioctl is supported or not before using it v3: Don't walk the whole list when we find a busy BO (by anholt, acked by Boris) Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-09 10:57:17 -08:00
Boris Brezillon	7d72af4f09	drm-uapi: Update vc4 header from drm-next Taken from drm-next d65d31388a23 ("Merge tag 'drm-misc-next-fixes-2017-11-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-next") v2: Add the NOTSUPP definition from the final drm-next version, not the commit (anholt). Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>	2017-11-09 10:57:14 -08:00
Eric Anholt	ebcb4c2156	meson: Enable VC4's NEON assembly support. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-09 09:40:30 -08:00
Eric Anholt	9c9fd8ff37	meson: Always link libgallium_dri.so against dep_thread. Somehow on my cross build the -pthread is getting lost. All the other deps seem to work out fine. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Tested-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-09 09:40:27 -08:00
Eric Anholt	4c94607c21	meson: Drop stale comment about making valgrind conditional. It was fixed in `5c2ff5773a`. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Tested-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-09 09:40:25 -08:00
Eric Anholt	d975e6940e	meson: Leave dep_llvm empty if !with_llvm The gallium auxiliary build would link against llvm, for the gallivm code that it didn't build. This broke the build on my armhf cross, where libLLVM-3.9.so is not multiarch and thus points to x86-64 libs. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Tested-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-09 09:40:03 -08:00
Adam Jackson	015cc6bb7c	Revert "glx: Implement GLX_EXT_no_config_context (v2)" Pushed ahead of things actually working. This reverts commit `5293b96b16`.	2017-11-09 11:41:14 -05:00
Marek Olšák	9ceb057ebf	radeonsi: pack r600_surface better 160 -> 136 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-09 17:32:14 +01:00
Marek Olšák	169525684f	radeonsi: pack r600_texture better 1752 -> 1736 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-09 17:32:14 +01:00
Marek Olšák	f8a4b606a2	radeonsi: clean up r600_surface 216 -> 160 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-09 17:32:14 +01:00
Marek Olšák	6916ee7e17	radeonsi: remove r600_texture::non_disp_tiling Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-09 17:32:14 +01:00
Marek Olšák	a06fe75eac	radeonsi: remove DBG_NO_DISCARD_RANGE Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-09 17:32:14 +01:00
Adam Jackson	5293b96b16	glx: Implement GLX_EXT_no_config_context (v2) This more or less ports EGL_KHR_no_config_context to GLX. v2: Enable the extension only for those backends that support it. Khronos: https://github.com/KhronosGroup/OpenGL-Registry/pull/102 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Adam Jackson <ajax@redhat.com>	2017-11-09 09:35:30 -05:00
Adam Jackson	3f66d54a2a	glx: Prepare the DRI backends for GLX_EXT_no_config_context This should be safe as these backends already support the EGL version of this extension. DRI1 is not affected because it does not support GLX_ARB_create_context anyway. DRI-Windows is not prepared to implement this as there's no equivalent WGL extension, and wglCreateContextAttribs seems to really want the HDC's pixel format to be set. Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-11-09 09:35:22 -05:00
Adam Jackson	74b701d84c	glx: Relax validate_renderType_against_config for EXT_no_config_context Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Adam Jackson <ajax@redhat.com>	2017-11-09 09:35:07 -05:00
Nicolai Hähnle	ffc2060616	anv: fix build failure Fixes: `e3a8013de8` ("util/u_queue: add util_queue_fence_wait_timeout")	2017-11-09 14:49:19 +01:00
Nicolai Hähnle	cc78d77043	mesa: flush and wait after creating a fallback texture Fixes non-deterministic failures in dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_sync.images.texture_source.teximage2d_render and others in dEQP-EGL.functional.sharing.gles2.multithread.* Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:20:58 +01:00
Nicolai Hähnle	46444613cf	mesa: increase MaxServerWaitTimeout The current value was introduced in commit `a27180d0d8`, which claims that it represents ~1.11 years. However, it is interpreted in nanoseconds, so it actually only represents ~9.8 hours. That seems a bit short. Use the largest value consistent with both int32 and int64. It corresponds to ~292 years in nanoseconds. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:20:58 +01:00
Nicolai Hähnle	fbda7958ff	st/mesa: remove redundant flushes from st_flush st_flush should flush state tracker-internal state and the pipe, but not mesa/main state. Of the four callers: - glFlush/glFinish already call FLUSH_{VERTICES,STATE}. - st_vdpau doesn't need to call them. - st_manager will now call them explicitly. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:20:58 +01:00
Nicolai Hähnle	884a0b2a9e	st/dri: use stapi flush instead of pipe flush when creating fences There may be pending operations (e.g. vertices) that need to be flushed by the state tracker. Found by inspection. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:20:58 +01:00
Nicolai Hähnle	b921da3b74	radeonsi: use a threaded context even for debug contexts Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:04 +01:00
Nicolai Hähnle	1a6d9e087a	radeonsi: record and dump time of flush Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:04 +01:00
Nicolai Hähnle	b07569ad8b	ddebug: optionally handle transfer commands like draws Transfer commands can have associated GPU operations. Enabled by passing GALLIUM_DDEBUG=transfers. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:03 +01:00
Nicolai Hähnle	18fd2a859d	ddebug: dump context and before/after times of draws Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:03 +01:00
Nicolai Hähnle	ba2f2b6f2a	ddebug: generalize print_named_xxx via a PRINT_NAMED macro Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:03 +01:00
Nicolai Hähnle	c9fefa062b	ddebug: rewrite to always use a threaded approach This patch has multiple goals: 1. Off-load the writing of records in 'always' mode to another thread for performance. 2. Allow using ddebug with threaded contexts. This really forces us to move some of the "after_draw" handling into another thread. 3. Simplify the different modes of ddebug, both in the code and in the user interface, i.e. GALLIUM_DDEBUG. In particular, there's no 'pipelined' anymore, since we're always pipelined; and 'noflush' is replaced by 'flush', since we no longer flush by default. 4. Fix the fences in pipelining mode. They previously relied on writes via pipe_context::clear_buffer. However, on radeonsi, those could (quite reasonably) end up in the SDMA buffer. So we use the newly added PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE fences instead. 5. Improve pipelined mode overall, using the finer grained information provided by the new fences. Overall, the result is that pipelined mode should be more useful, and using ddebug in default mode is much less invasive, in the sense that it changes the overall driver behavior less (which is kind of crucial for a driver debugging tool). An example of the new hang debug output: Gallium debugger active. Hang detection timeout is 1000ms. GPU hang detected, collecting information... Draw # driver prev BOP TOP BOP dump file ------------------------------------------------------------- 2 YES YES YES NO /home/nha/ddebug_dumps/shader_runner_19919_00000000 3 YES NO YES NO /home/nha/ddebug_dumps/shader_runner_19919_00000001 4 YES NO YES NO /home/nha/ddebug_dumps/shader_runner_19919_00000002 5 YES NO YES NO /home/nha/ddebug_dumps/shader_runner_19919_00000003 Done. We can see that there were almost certainly 4 draws in flight when the hang happened: the top-of-pipe fence was signaled for all 4 draws, the bottom-of-pipe fence for none of them. In virtually all cases, we'd expect the first draw in the list to be at fault, but due to the GPU parallelism, it's possible (though highly unlikely) that one of the later draws causes a component to get stuck in a way that prevents the earlier draws from making progress as well. (In the above example, there were actually only 3 draws truly in flight: the last draw is a blit that waits for the earlier draws; however, its top-of-pipe fence is emitted before the cache flush and wait, and so the fact that the draw hasn't truly started yet can only be seen from a closer inspection of GPU state.) Acked-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:03 +01:00
Nicolai Hähnle	e8bb8758dd	ddebug: use an atomic increment when numbering files Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:03 +01:00
Nicolai Hähnle	d6710fe874	dd/util: extract dd_get_debug_filename_and_mkdir Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:03 +01:00
Nicolai Hähnle	8491fcafab	gallium/u_dump: add and use util_dump_transfer_usage Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:02 +01:00
Nicolai Hähnle	9b8033a4a7	gallium/u_dump: add util_dump_ns Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:02 +01:00
Nicolai Hähnle	6f4a03b08a	gallium/u_dump: export util_dump_ptr Change format to %p while we're at it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:02 +01:00
Nicolai Hähnle	125a915052	radeonsi: implement PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE v2: use uncached system memory for the fence, and use the CPU to clear it so we never read garbage when checking the fence Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:55 +01:00
Nicolai Hähnle	e4627ac8fb	radeonsi: document some subtle details of fence_finish & fence_server_sync v2: remove the change to si_fence_server_sync, we'll handle that more robustly Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:50 +01:00
Nicolai Hähnle	14b9fa75e4	gallium: add pipe_context::callback For running post-draw operations inside the driver thread. ddebug will use it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:50 +01:00
Nicolai Hähnle	2bdfbb0e53	gallium/u_threaded: implement pipe_context::set_log_context Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:49 +01:00
Nicolai Hähnle	244536d3d6	gallium/u_threaded: avoid syncs for get_query_result Queries should still get marked as flushed when flushes are executed asynchronously in the driver thread. To this end, the management of the unflushed_queries list is moved into the driver thread. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:49 +01:00
Nicolai Hähnle	609a230375	gallium/u_threaded: implement asynchronous flushes This requires out-of-band creation of fences, and will be signaled to the pipe_context::flush implementation by a special TC_FLUSH_ASYNC flag. v2: - remove an incorrect assertion - handle fence_server_sync for unsubmitted fences by relying on the improved cs_add_fence_dependency - only implement asynchronous flushes on amdgpu Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:42 +01:00
Nicolai Hähnle	11b380ed0c	gallium/u_threaded: mark queries flushed only for non-deferred flushes The driver uses (and must use) the flushed flag of queries as a hint that it does not have to check for synchronization with currently queued up commands. Deferred flushes do not actually flush queued up commands, so we must not set the flushed flag for them. Found by inspection. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:42 +01:00
Nicolai Hähnle	78a4750d91	radeonsi: move fence functions to si_fence.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:42 +01:00
Nicolai Hähnle	e6dbc804a8	winsys/amdgpu: handle cs_add_fence_dependency for deferred/unsubmitted fences The idea is to fix the following interleaving of operations that can arise from deferred fences: Thread 1 / Context 1 Thread 2 / Context 2 -------------------- -------------------- f = deferred flush <------- application-side synchronization -------> fence_server_sync(f) ... flush() flush() We will now stall in fence_server_sync until the flush of context 1 has completed. This scenario was unlikely to occur previously, because applications seem to be doing Thread 1 / Context 1 Thread 2 / Context 2 -------------------- -------------------- f = glFenceSync() glFlush() <------- application-side synchronization -------> glWaitSync(f) ... and indeed they probably have to use this ordering to avoid deadlocks in the GLX model, where all GL operations conceptually go through a single connection to the X server. However, it's less clear whether applications have to do this with other WSI (i.e. EGL). Besides, even this sequence of GL commands can be translated into the Gallium-level sequence outlined above when Gallium threading and asynchronous flushes are used. So it makes sense to be more robust. As a side effect, we no longer busy-wait on submission_in_progress. We won't enable asynchronous flushes on radeon, but add a cs_add_fence_dependency stub anyway to document the potential issue. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:22 +01:00
Nicolai Hähnle	1e5c9cf590	gallium: add PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE bits These bits are intended to be used by the ddebug hang detection and are named in analogy to the Vulkan stage bits (and the corresponding Radeon pipeline event). Hang detection needs fences on the granularity of individual commands, which nothing else really covers. The closest alternative would have been PIPE_QUERY_GPU_FINISHED, but (a) queries are a per-context object and we really want a per-screen object, (b) queries don't offer a wait with timeout, and (c) in any case, PIPE_QUERY_GPU_FINISHED is meant to imply that GPU caches are flushed, which the new bits explicitly aren't. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 13:58:16 +01:00
Nicolai Hähnle	ea6df1ce37	gallium: add PIPE_FLUSH_ASYNC and PIPE_FLUSH_HINT_FINISH Also document some subtleties of pipe_context::flush. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 13:58:16 +01:00
Nicolai Hähnle	e3a8013de8	util/u_queue: add util_queue_fence_wait_timeout v2: - style fixes - fix missing timeout handling in futex path Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 13:58:10 +01:00
Nicolai Hähnle	f1a3648784	threads: update for late C11 changes C11 threads were changed to use struct timespec instead of xtime, and thrd_sleep got a second argument. See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1554.htm and http://en.cppreference.com/w/c/thread/{thrd_sleep,cnd_timedwait,mtx_timedlock} Note that cnd_timedwait is spec'd to be relative to TIME_UTC / CLOCK_REALTIME. v2: Fix Windows build errors. Tested with a default Appveyor config that uses Visual Studio 2013. Judging from Brian's email and random internet sources, Visual Studio 2015 does have timespec and timespec_get, hence the _MSC_VER-based guard which I have not tested. Cc: Jose Fonseca <jfonseca@vmware.com> Cc: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2017-11-09 11:57:22 +01:00
Nicolai Hähnle	c50743f61c	gallium: remove unused and deprecated u_time.h Cc: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:57:22 +01:00
Nicolai Hähnle	222a2fb998	util: move os_time.[ch] to src/util Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:57:21 +01:00
Nicolai Hähnle	f76a6cb337	radeonsi: always use async compiles when creating shader/compute states With Gallium threaded contexts, creating shader/compute states is effectively a screen operation, so we should not use context state. In particular, this allows us to avoid using the context's LLVM TargetMachine. This isn't an issue yet because u_threaded_context filters out non-async debug callbacks, and we disable threaded contexts for debug contexts. However, we may want to change that in the future. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:53:20 +01:00
Nicolai Hähnle	b650fc09c3	radeonsi: fix potential use-after-free of debug callbacks Found by inspection. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:53:20 +01:00
Nicolai Hähnle	dd7c273e87	radeonsi: move pipe debug callback to si_context Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:53:19 +01:00
Nicolai Hähnle	185061aef4	u_queue: add util_queue_finish for waiting for previously added jobs Schedule one job for every thread, and wait on a barrier inside the job execution function. v2: avoid alloca (fixes Windows build error) Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2017-11-09 11:53:19 +01:00
Nicolai Hähnle	f0d3a4de75	util: move pipe_barrier into src/util and rename to util_barrier The #if guard is probably not 100% equivalent to the previous PIPE_OS check, but if anything it should be an over-approximation (are there pthread implementations without barriers?), so people will get either a good implementation or compile errors that are easy to fix. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:53:19 +01:00
Nicolai Hähnle	28c95cdb29	gallium: add async debug message forwarding helper v2: use util_vasprintf for Windows portability Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2017-11-09 11:53:19 +01:00
Nicolai Hähnle	637240d824	st/mesa: guard sampler views changes with a mutex Some locking is unfortunately required, because well-formed GL programs can have multiple threads racing to access the same texture, e.g.: two threads/contexts rendering from the same texture, or one thread destroying a context while the other is rendering from or modifying a texture. Since even the simple mutex caused noticeable slowdowns in the piglit drawoverhead micro-benchmark, this patch uses a slightly more involved approach to keep locks out of the fast path: - the initial lookup of sampler views happens without taking a lock - a per-texture lock is only taken when we have to modify the sampler view(s) - since each thread mostly operates only on the entry corresponding to its context, the main issue is re-allocation of the sampler view array when it needs to be grown, but the old copy is not freed Old copies of the sampler views array are kept around in a linked list until the entire texture object is deleted. The total memory wasted in this way is roughly equal to the size of the current sampler views array. Fixes non-deterministic memory corruption in some dEQP-EGL.functional.sharing.gles2.multithread.* tests, e.g. dEQP-EGL.functional.sharing.gles2.multithread.simple.images.texture_source.create_texture_render Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:50:55 +01:00
Nicolai Hähnle	8d20c660a9	st/mesa: re-arrange st_finalize_texture Move the early-out for surface-based textures earlier. This narrows the scope of the locking added in a follow-up commit. Fix one remaining case of initializing a surface-based texture without properly finalizing it. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:50:55 +01:00
Nicolai Hähnle	0dcf30e550	gallium: clarify the constraints on sampler_view_destroy r600 expects the context that created the sampler view to still be alive (there is a per-context list of sampler views). svga currently bails when the context of destruction is not the same as creation. The GL state tracker, which is the only one that runs into the multi-context subtleties (due to share groups), already guarantees that sampler views are destroyed before their context of creation is destroyed. Most drivers are context-agnostic, so the warning message in pipe_sampler_view_release doesn't really make sense. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:50:54 +01:00
Nicolai Hähnle	0f54ee6072	radeonsi: reduce the scope of sel->mutex in si_shader_select_with_key We only need the lock to guard changes in the variant linked list. The actual compilation can happen outside the lock, since we use the ready fence as a guard. v2: fix double-unlock Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:37:51 +01:00
Nicolai Hähnle	4f493c79ee	radeonsi: use ready fences on all shaders, not just optimized ones There's a race condition between si_shader_select_with_key and si_bind_XX_shader: Thread 1 Thread 2 -------- -------- si_shader_select_with_key begin compiling the first variant (guarded by sel->mutex) si_bind_XX_shader select first_variant by default as state->current si_shader_select_with_key match state->current and early-out Since thread 2 never takes sel->mutex, it may go on rendering without a PM4 for that shader, for example. The solution taken by this patch is to broaden the scope of shader->optimized_ready to a fence shader->ready that applies to all shaders. This does not hurt the fast path (if anything it makes it faster, because we don't explicitly check is_optimized). It will also allow reducing the scope of sel->mutex locks, but this is deferred to a later commit for better bisectability. Fixes dEQP-EGL.functional.sharing.gles2.multithread.simple.buffers.bufferdata_render Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:37:51 +01:00
Nicolai Hähnle	d1ff082637	u_queue: add a futex-based implementation of fences Fences are now 4 bytes instead of 96 bytes (on my 64-bit system). Signaling a fence is a single atomic operation in the fast case plus a syscall in the slow case. Testing if a fence is signaled is the same as before (a simple comparison), but waiting on a fence is now no more expensive than just testing it in the fast (already signaled) case. v2: - style fixes - use p_atomic_xxx macros with the right barriers Acked-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:37:39 +01:00
Nicolai Hähnle	574c59d4f9	u_queue: add util_queue_fence_reset Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:37:39 +01:00
Nicolai Hähnle	1b9d5ece55	u_queue: export util_queue_fence_signal Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:37:38 +01:00
Nicolai Hähnle	b20f955bc1	u_queue: group fence functions together Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:37:38 +01:00
Nicolai Hähnle	0a7f17cf5b	util/u_atomic: add p_atomic_xchg The closest to it in the old-style gcc builtins is __sync_lock_test_and_set, however, that is only guaranteed to work with values 0 and 1 and only provides an acquire barrier. I also don't know about other OSes, so we provide a simple & stupid emulation via p_atomic_cmpxchg. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:37:30 +01:00
Nicolai Hähnle	b4b2a951c8	util: move futex helpers into futex.h v2: style fixes Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2017-11-09 11:37:22 +01:00
Kenneth Graunke	688d695868	glsl: Make #pragma STDGL invariant(all) only modify outputs. According to the GLSL ES 3.20, GLSL 4.50, and GLSL 1.20 specs: "To force all output variables to be invariant, use the pragma #pragma STDGL invariant(all) before all declarations in a shader." Notably, this is only supposed to affect output variables. Furthermore, "Only variables output from a shader can be candidates for invariance." It looks like this has been wrong since we first supported the pragma in 2011 (commit `86b4398cd1`). Fixes dEQP-GLES2.functional.shaders.preprocessor.pragmas.pragma_fragment. v2: Now that all cases are identical (other than compute shaders, which have no output variables anyway), we can drop the switch statement entirely. We also don't need the current_function == NULL check; this was a hold over from when we had a single var_mode_out for both function parameters and shader varyings, in the bad old days. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-11-08 23:11:48 -08:00
Tapani Pälli	c591b1e594	i965: expose SRGB visuals and turn on EGL_KHR_gl_colorspace Patch exposes sRGB visuals and adds DRI integer query support for __DRI2_RENDERER_HAS_FRAMEBUFFER_SRGB. Further changes make sure that we mark if the app explicitly wanted sRGB and for these framebuffers we don't turn sRGB off in intel_gles3_srgb_workaround. This way we keep compatibility for existing applications relying on default sRGB and only add more visual support. With this change, the following dEQP tests start to pass: dEQP-EGL.functional.wide_color.window_8888_colorspace_srgb dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_srgb v2: some code cleanup (Emil Velikov) update num_formats correctly (reported by deveee@gmail.com) v3: cleanup, remove redundant is_srgb rename explicit_srgb as 'need_srgb' to follow style better Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v2) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102264 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102354 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102503	2017-11-09 07:43:25 +02:00
Neil Roberts	4dc8458cd1	glsl: Transform fb buffers are only active if a variable uses them The GL spec will soon be revised to clarify that a buffer binding for a transform feedback buffer is only required if a variable is actually defined to use the buffer binding point. Previously a declaration for the default transform buffer would make it require a binding even if nothing was declared to use the default buffer. Affects: KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list_and_api Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-09 05:39:42 +01:00
Jason Ekstrand	951a5dc4cc	intel/nir: Use the correct indirect lowering masks in link_shaders Previously, if we were linking a vec4 VS with a SIMD8/16 FS, we wouldn't lower indirects on the fragment shader which is wrong. Instead of using a single indirect mask, take advantage of our new little helper. Reviewed-by: Timothy Arceri <tarceri at itsqueeze.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-08 20:10:04 -08:00
Ilia Mirkin	f317f72f73	r600g: use SIMPLE_FLOAT for blending to enable some optimizations Radeonsi also sets this flag. Seems to avoid pulling up the destination RT value when the dst blend factor is zero if it's not otherwise being loaded. Among other things, it allows blending to overwrite infinity/NaN values in the destination RT. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-08 22:35:27 -05:00
Ilia Mirkin	35433494f3	nv50: make blending work so that zero wins in a multiplication This matches nvc0 behavior, tested with the fbo-float-nan piglit. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2017-11-08 22:32:43 -05:00
Ian Romanick	9c53b80ff9	glsl: Minor cleanups after previous commit I think it's more clear to only call emit_access once. The only difference between the two calls is the value of size_mul used for the offset parameter... but you really have to look at it to be sure. The s/is_64bit/is_double/ change is because there are no int64_t or uint64_t matrix types. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	c18d8c61d6	glsl: Use more link_calculate_matrix_stride in lower_buffer_access I was going to squash this with the previous commit, but there's a lot of churn in that commit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	1a2beae1b3	glsl: Use link_calculate_matrix_stride in lower_buffer_access and friends Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	24e78d99db	glsl: Refactor matrix stride calculation into a utility function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	88f5588f77	glsl/linker: Optimize swizzles again after linking Without this, the SPIR-V generator has to deal with a bunch of junk like: (swiz z (swiz xxx (swiz x (var_ref packed:binormal.z,light_dir)))) It seems better to cull that stuff out than to add code to deal with it. The problem is the way swizzles to and from scalars have to be handled in SPIR-V. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	ef1ca06ce8	glsl: Combine nop-swizzle optimization with swizzle-swizzle optimization Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	c858abb14f	glsl: Make the swizzle-swizzle optimization greedy If there is a long sequence of swizzled swizzles, compact all of them down to a single swizzle. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	ae1fd09c1d	glsl: Remove program_resource_visitor::visit_field(const glsl_struct_field *) I could not find any remaining users. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-08 18:37:29 -08:00
Ian Romanick	2c7657f62c	glsl: Silence unused parameter warning glsl/lower_shared_reference.cpp: In member function ‘virtual void {anonymous}::lower_shared_reference_visitor::insert_buffer_access(void, ir_dereference, const glsl_type, ir_rvalue, unsigned int, int)’: glsl/lower_shared_reference.cpp:244:58: warning: unused parameter ‘channel’ [-Wunused-parameter] int channel) ^~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-08 18:37:29 -08:00
Dave Airlie	6bec8bcd79	ac/nir: add support for all intrinsics. (v2) This is derived from tgsi/radeonsi code from the GLSL intrinsics. This should pre-fix radv for the upcoming spirv patches. v2: actually use wait_cnt, sleep deprived dad time! (Bas) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-09 01:25:59 +00:00
Timothy Arceri	87f02ddfd1	amdgpu: use simple mtx Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-09 12:07:48 +11:00
Timothy Arceri	f0857fe87b	mesa: use simple mtx in core mesa Results from x11perf -copywinwin10 on Eric's SKL: 4.33338% ± 0.905054% (n=40) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Yogesh Marathe <yogesh.marathe@intel.com>	2017-11-09 12:07:48 +11:00
Timothy Arceri	f98a2768ca	mesa: Add new fast mtx_t mutex type for basic use cases While modern pthread mutexes are very fast, they still incur a call to an external DSO and overhead of the generality and features of pthread mutexes. Most mutexes in mesa only need lock/unlock, and the idea here is that we can inline the atomic operation and make the fast case just two instructions. Mutexes are subtle and finicky to implement, so we carefully copy the implementation from Ulrich Drepper's well-written and well-reviewed paper: "Futexes Are Tricky" http://www.akkadia.org/drepper/futex.pdf We implement "mutex3", which gives us a mutex that has no syscalls on uncontended lock or unlock. Further, the uncontended case boils down to a cmpxchg and an untaken branch and the uncontended unlock is just a locked decr and an untaken branch. We use __builtin_expect() to indicate that contention is unlikely so that gcc will put the contention code out of the main code flow. A fast mutex only supports lock/unlock, can't be recursive or used with condition variables. We keep the pthread mutex implementation around for the few places where we use condition variables or recursive locking. For platforms or compilers where futex and atomics aren't available, simple_mtx_t falls back to the pthread mutex. The pthread mutex lock/unlock overhead shows up on benchmarks for CPU bound applications. Most CPU bound cases are helped and some of our internal bind_buffer_object heavy benchmarks gain up to 10%. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-09 12:07:48 +11:00
Timothy Arceri	6a72eba755	mesa: rework how we free gl_shader_program_data When I introduced gl_shader_program_data one of the intentions was to fix a bug where a failed linking attempt freed data required by a currently active program. However I seem to have failed to finish hooking up the final steps required to have the data hang around. Here we create a fresh instance of gl_shader_program_data every time we link. gl_program has a reference to gl_shader_program_data so it will be freed once the program is no longer active. Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Neil Roberts <nroberts@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102177	2017-11-09 12:07:48 +11:00
Timothy Arceri	9c33533586	glsl: use the correct parent when allocating program data members Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-09 12:07:48 +11:00
Timothy Arceri	cf05bb506a	glsl: drop cache_fallback This turned out to be a dead end, it is much easier and less error prone to just cache the IR used by the drivers backend e.g. TGSI or NIR. Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-09 12:07:48 +11:00
Kenneth Graunke	a16dc04ad5	i965: properly initialize brw->cs.base.stage to MESA_SHADER_COMPUTE This has a bit of a surprising effect: For the render pipeline, the upload_sampler_state_table atom emits 3DSTATE_BINDING_TABLE_POINTERS_XS. It tries to avoid this for compute: if (GEN_GEN >= 7 && stage_state->stage != MESA_SHADER_COMPUTE) { /* Emit a 3DSTATE_SAMPLER_STATE_POINTERS_XS packet. */ genX(emit_sampler_state_pointers_xs)(brw, stage_state); } ... However, we were failing to initialize brw->cs.base.stage, so it was left as 0 (MESA_SHADER_VERTEX), causing this condition to break. We then emitted 3DSTATE_SAMPLER_STATE_POINTERS_VS in GPGPU mode, when trying to upload CS samplers. Nothing good can come of this. Found by inspection while debugging a GPU hang. Jordan believes this helps the Deus Ex: Mankind Divided benchmark mode's stability when running with shader cache. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-11-08 15:26:18 -08:00
Jason Ekstrand	3e63cf893f	intel/nir: Break the linking code into a helper in brw_nir.c Reviewed-by: Timothy Arceri <tarceri at itsqueeze.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-08 14:09:51 -08:00
Jason Ekstrand	7364f080f9	intel/nir: Add a helper for getting the NoIndirect mask Reviewed-by: Timothy Arceri <tarceri at itsqueeze.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-08 14:09:49 -08:00
Matt Turner	77a63d190a	nir: Don't print swizzles when there are more than 4 components ... as can happen with various types like mat4, or else we'll smash the stack writing past the end of components_local[]. Fixes: `5a0d3e1129` ("nir: Print the components referenced for split or packed shader in/outs.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-11-08 13:22:26 -08:00
Dylan Baker	34593e978c	meson: Add threads dependencies to glsl_compiler executable Fixes compiling the optional standalone glsl compiler. Reported-by: DrNick (on irc) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-08 11:36:02 -08:00
Andreas Boll	a6932faae1	glsl: Fix typo fragement -> fragment Fixes: `94d669b0d2` ("glsl: enforce fragment shader input restrictions in GLSL ES 3.10") Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-08 18:30:48 +00:00
Andreas Boll	4f29ed38f3	broadcom/vc5: Remove unused v3d_compiler.c Unused since original import of VC5. Fixes: `ade416d023` ("broadcom: Add VC5 NIR compiler.") Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-08 18:30:47 +00:00
Andreas Boll	6e4d65f674	broadcom/vc5: Add vc5_drm.h to the release tarball Fixes: `45bb8f2957` ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.") Cc: 17.3 <mesa-stable@lists.freedesktop.org> Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-08 18:30:45 +00:00
Gert Wollny	6905d005ef	clover: use the unified check for c++11 instead of the gcc version number So far clover based its test for compiler support on the version of gcc, while in reality support for c++11 is required. This patch replaces the version check by the check unified for all modules that require c++11. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-08 16:03:38 +00:00
Gert Wollny	8f18528cea	swr: Replace the check for c++11 by the unified version Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-08 16:03:38 +00:00
Gert Wollny	09ad2576ec	configure: check for -std=c++11 support and enable st/mesa test accordingly Add a check that tests whether the c++ compiler supports c++11, either by default, by adding the compiler flag -std=c++11, or by adding a compiler flag that the user has specified via the environment variable CXX11_CXXFLAGS. The test only does a very shallow check of c++11 support, i.e. it tests whether the define __cplusplus >= 201103L to confirm language support by the compiler, and it checks whether the header <tuple> is available to test the availability of the c++11 standard library. A make file conditional HAVE_STD_CXX11 is provided that is used in this patch to enable the test in st/mesa if C++11 support is available. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102665 Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-08 16:03:34 +00:00
Emil Velikov	6dd56fafe2	configure.ac: append to existing initializer override flags Currently we were overwriting the existing warning flags, instead of adding new [as applicable]. Fixes `c5d2e2d43f` ("configure: Test for -Wno-initializer-overrides") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-08 15:53:00 +00:00
Emil Velikov	63811f3b7c	configure.ac: append to existing MSVC compat flags Currently we were overwriting the existing warning flags, instead of adding new [as applicable]. v2: Add missing space before -Werror (Eric) Fixes `e4b2b69e82` ("configure: Add and use AX_CHECK_COMPILE_FLAG") Cc: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-08 15:53:00 +00:00
Dylan Baker	8a36f025f4	meson: Allow building glvnd with EGL and non-dri based GLX Because meson mirrors the autotools logic, it needs the same changes to allow building glvnd based egl. v2: - change if to elif (Eric) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-08 15:53:00 +00:00
Emil Velikov	85a017230c	configure.ac: require xcb* for the omx/va/... when using x11 platform Targets such as omx and va can work w/o anything X related. Mandate the xcb* dependencies only when the X11 platform is selected. Reported-by: Lukas Rusak <lorusak@gmail.com> Fixes: `63e11ac2b5` ("configure: error out if building VA w/o supported platform") Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Lukas Rusak <lorusak@gmail.com> (v1)	2017-11-08 15:53:00 +00:00
Emil Velikov	b4967561c0	configure.ac: loosen --enable-glvnd check to honour egl Currently we error out when building GLVND w/o GLX. That was the original premise before we had EGL. As the commit says, that error should be reworked to honour both - do so. v2: Drop noop *);; (Eric) Reported-by: Lukas Rusak <lorusak@gmail.com> Fixes: `ce562f9e3f` ("EGL: Implement the libglvnd interface for EGL (v3)") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Lukas Rusak <lorusak@gmail.com> (v1)	2017-11-08 15:52:56 +00:00
Emil Velikov	61e99ce267	egl/android: add a note about .swap_buffers_with_damage Android implements the API and does the native damage handling itself. At the same time it a) does call the vendor's eglSwapBuffersWithDamageKHR b) does not implement eglSetDamageRegionKHR There's something strange happening here. For now simply note about the 'lack' of eglSwapBuffersWithDamageKHR support. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-08 14:42:12 +00:00
Emil Velikov	c7b65c330f	wayland-drm: static inline wayland_drm_buffer_get The function is effectively a direct function call into libwayland-server.so. Thus GBM no longer depends on the wayland-drm static library, making the build more straight forward. And the resulting binary is a bit smaller. Note: we need to move struct wayland_drm_callbacks further up, otherwise we'll get an error since the type is incomplete. v2: Rebase, beef-up commit message, update meson, move struct wayland_drm_callbacks. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> # meson bit only Acked-by: Eric Engestrom <eric.engestrom@imgtec.com> # for the rest Reviewed-by: Dylan Baker <dylan@pnwbakers.com> # meson	2017-11-08 14:40:12 +00:00
Emil Velikov	ba414dba4f	automake: intel: correctly append to the LIBADD variable Commit `05fc62d89f` sets the variable, yet it forgot to update the existing reference to append (instead of assign). Thus as-is the expat library was discarded from the link chain when building with Android. Fixes: `05fc62d89f` ("automake: intel: move expat handling where it's used") Cc: Hongxu Jia <hongxu.jia@windriver.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-08 14:23:57 +00:00
Emil Velikov	6ef9482b78	configure: enable the OpenCL ICD by default Nearly all the distributions* that build Mesa OpenCL, enable the ICD. Since building a non-ICD driver has the chance of conflicting with existing OpenCL binary (libOpenCL.so). Furthermore, some applications expect the library to provide annotated/versioned symbols. https://lists.freedesktop.org/archives/mesa-dev/2017-September/171093.html *Fedora, Suse, Arch, Debian, Ubuntu, FreeBSD use the ICD Gentoo manages the conflicting files via eselect. Cc: Matt Turner <mattst88@gmail.com> Cc: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-By: Aaron Watry <awatry@gmail.com>	2017-11-08 14:10:33 +00:00
Emil Velikov	0cd0958544	targets/opencl: don't hardcode the icd file install to /etc/... Use $(sysconfdir) instead of hardcoding /etc. While the OpenCL spec expects the file in /etc, people building their stack can override that, esp. !Linux users. Furthermore this removes a fundamental violation, which results in the system file being overwritten even as one explicitly sets --prefix and/or DESTDIR. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-By: Aaron Watry <awatry@gmail.com>	2017-11-08 14:10:07 +00:00
Emil Velikov	01d91b3718	amd: add amdgpu_asic_addr.h to the sources list Otherwise it will be missing from the release tarball Fixes: `7f33e94e43` ("amd/addrlib: update to latest version") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-08 14:07:27 +00:00
Tobias Droste	5d61fa4e68	gallivm: Use new LLVM fast-math-flags API LLVM 6 changed the API on the fast-math-flags: https://reviews.llvm.org/rL317488 NOTE: This also enables the new flag 'ApproxFunc' to allow for approximations for library functions (sin, cos, ...). I'm not completely convinced that this is something mesa should do. Signed-off-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2017-11-08 10:44:19 +01:00
Juan A. Suarez Romero	d5a641106b	glsl: add varying resources for arrays of complex types This patch is mostly a patch done by Ilia Mirkin. It fixes KHR-GL45.enhanced_layouts.varying_structure_locations. v2: fix locations for TCS/TES/GS inputs and outputs (Ilia) CC: Ilia Mirkin <imirkin@alum.mit.edu> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-11-08 10:12:07 +01:00
Timothy Arceri	36be8c2fcf	st/glsl_to_nir: use nir_shader_gather_info() Use the NIR helper rather than the GLSL IR helper to get in/out masks. This allows us to ignore varyings removed by NIR optimisations. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-08 17:33:14 +11:00
Timothy Arceri	c980a3aa31	st/glsl_to_nir: generate NIR earlier We want to use nir_shader_gather_info() because the GLSL IR version might be including varyings that NIR later eliminates. To do this we need to generate NIR before we start using the in/out bitmasks. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-08 17:33:14 +11:00
Timothy Arceri	f6c0504abc	st/glsl_to_nir: delay adding built-in uniforms to Parameters list Delaying adding built-in uniforms until after we convert to NIR gives us a better chance to optimise them away. Also NIR allows us to iterate over the uniforms directly so should be faster. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-08 17:33:14 +11:00
Marek Olšák	7f33e94e43	amd/addrlib: update to latest version This uses C++11 initializer lists. I just overwrote all Mesa files with internal addrlib and discarded hunks that we should probably keep, but I might have missed something. The code depending on ADDR_AM_BUILD is removed. We can add it back next time if needed. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-08 00:55:13 +01:00
Eric Anholt	3bfcd31e98	broadcom/vc5: Flush the job when it grows over 1GB. Fixes GL_OUT_OF_MEMORY from streaming-texture-leak (and will hopefully keep piglit from ooming on my no-swap platform, as well).	2017-11-07 12:58:03 -08:00
Eric Anholt	50906e4583	broadcom/vc5: Do 16-bit unpacking of integer texture returns properly. We were doing f16 unpacks, which trashed "1" values. Fixes many piglit texwrap GL_EXT_texture_integer cases.	2017-11-07 12:58:03 -08:00
Eric Anholt	80da60947b	broadcom/vc5: Fix pausing of transform feedback. Gallium disables it by removing the streamout buffers, not by binding a program that doesn't have TF outputs. Fixes piglit "ext_transform_feedback2/counting with pause"	2017-11-07 12:58:00 -08:00
Eric Anholt	25d199f67d	broadcom/vc5: Add support for GL_RASTERIZER_DISCARD Fixes piglit discard-drawarrays.	2017-11-07 12:57:49 -08:00
Eric Anholt	dfff9ce45e	broadcom/vc5: Fix scheduling for a non-SFU R4 write after a dead R4 write. The v3d_qpu_writes_r*() were only checking for fixed-function accumulator writes, not normal ALU writes to those regs. Fixes fs-discard-exit-2 on simulation (but not HW).	2017-11-07 12:57:49 -08:00
Eric Anholt	9ccb6621be	broadcom/vc5: Add partial transform feedback query support. We have to compute the queries in software, so we're counting the primitives by hand. We still need to make sure to not increment the PRIMITIVES_EMITTED if we overflowed, but leave that for later.	2017-11-07 12:57:43 -08:00
Eric Anholt	4f33344e7a	broadcom/vc5: Add occlusion query support. Fixes all of piglit's OQ tests.	2017-11-07 12:56:40 -08:00
Jason Ekstrand	d002950e54	intel/fs/nir: Return Q types from brw_reg_type_for_bit_size Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-11-07 10:41:24 -08:00
Jason Ekstrand	dee58ecd2e	intel/fs/nir: Use Q immediates for load_const on gen8+ Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-11-07 10:41:24 -08:00
Jason Ekstrand	9bb34892bf	intel/fs/nir: Setup immediates based on type in i2b and f2b Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-11-07 10:41:24 -08:00
Jason Ekstrand	1cb210f4bc	intel/reg: Add helpers for 64-bit integer immediates Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2017-11-07 10:41:24 -08:00
Jason Ekstrand	df81b81fb9	compiler/nir_types: Handle vectors in glsl_get_array_element Most of NIR doesn't allow doing array indexing on a vector (though it does on a matrix). However, nir_lower_io handles it just fine and this behavior is needed for shared variables in Vulkan. This commit makes glsl_get_array_element do something sensible for vector types and makes nir_validate happy with them. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:41:24 -08:00
Jason Ekstrand	ad77775809	nir: Validate base types on array dereferences We were already validating that the parent type goes along with the child type but we weren't actually validating that the parent type is reasonable. This fixes that. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:41:24 -08:00
Jason Ekstrand	ab9220edd6	nir,intel/compiler: Use a fixed subgroup size The GL_ARB_shader_ballot spec says that gl_SubGroupSizeARB is declared as a uniform. This means that it cannot change across an invocation such as a draw call or a compute dispatch. For compute shaders, we're ok because we only ever use one dispatch size. For fragment, however, the hardware dynamically chooses between SIMD8 and SIMD16 which violates the spec. Instead, let's just pick a subgroup size based on the shader stage. The fixed size we choose for compute shaders is a bit higher than strictly needed but there's no real harm in that. The advantage is that, if they do anything interesting with the value, NIR will see it as an immediate and can optimize better. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	a026458020	nir/lower_subgroups: Lower ballot intrinsics to the specified bit size Ballot intrinsics return a bitfield of subgroups. In GLSL and some SPIR-V extensions, they return a uint64_t. In SPV_KHR_shader_ballot, they return a uvec4. Also, some back-ends would rather pass around 32-bit values because it's easier than messing with 64-bit all the time. To solve this mess, we make nir_lower_subgroups take a new parameter called ballot_bit_size and it lowers whichever thing it gets in from the source language (uint64_t or uvec4) to a scalar with the specified number of bits. This replaces a chunk of the old lowering code. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	8c2bf020fd	nir/builder: Add a nir_imm_intN_t helper This lets you easily build integer immediates of arbitrary bit size. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	9b35faba42	nir/lower_system_values: Lower SUBGROUP__MASK based on type The SUBGROUP__MASK system values are uint64_t when coming in from GLSL but uvec4 when coming in from SPIR-V. Lowering based on type allows us to nicely handle both. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	3ee91ee6ac	nir: Make ballot intrinsics variable-size This way they can return either a uvec4 or a uint64_t. At the moment, this is a no-op since we still always return a uint64_t. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	ad127afcfd	nir: Add a ssa_dest_init_for_type helper This would be useful in a number of places Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	28da82f978	nir: Add a new subgroups lowering pass This commit pulls nir_lower_read_invocations_to_scalar along with most of the guts of nir_opt_intrinsics (which mostly does subgroup lowering) into a new nir_lower_subgroups pass. There are various other bits of subgroup lowering that we're going to want to do so it makes a bit more sense to keep it all together in one pass. We also move it in i965 to happen after nir_lower_system_values because we want to handle the subgroup mask system value intrinsics here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	1ca3a94427	intel/fs: Don't use automatic exec size inference The automatic exec size inference can accidentally mess things up if we're not careful. For instance, if we have add(4) g38.2<4>D g38.1<8,2,4>D g38.2<8,2,4>D then the destination register will end up having a width of 2 with a horizontal stride of 4 and a vertical stride of 8. The EU emit code sees the width of 2 and decides that we really wanted an exec size of 2 which doesn't do what we wanted. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	dc4cf11dfc	intel/fs: Explicitly set EXECUTE_1 where needed Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	ab378734f5	intel/eu: Explicitly set EXECUTE_1 where needed Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	8280560705	intel/eu: Make automatic exec sizes a configurable option We have had a feature in codegen for some time that tries to automatically infer the execution size of an instruction from the width of its destination. For things such as fixed function GS, clipper, and SF programs, this is very useful because they tend to have lots of hand-rolled register setup and trying to specify the exec size all the time would be prohibitive. For things that come from a higher-level IR, however, it's easier to just set the right size all the time and the automatic exec sizes can, in fact, cause problems. This commit makes it optional while enabling it by default. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	7a82ad54bb	intel/fs: Rework zero-length URB write handling Originally we tried to handle this case based on slots_valid. However, there are a number of ways that this can go wrong. For one, we throw away any trailing slots which either aren't written or are set to VARYING_SLOT_PAD. Second, even if PSIZ is a valid slot, we may not actually write anything there. Between the lot of these, it was possible to end up in a case where we tried to do a regular URB write but ended up with a length of 1 which is invalid. This commit moves it to the end and makes it based on a new boolean flag urb_written. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-07 10:37:52 -08:00
Jason Ekstrand	6132992cdb	intel/compiler/fs: Set up subgroup invocation as a system value Subgroup invocation is computed using a vector immediate and some dispatch-aware arithmetic. Unfortunately, due to the vector arithmetic, and the fact that it's frequently read 16-wide, it's not something that can easily be CSEd by the back-end compiler. There are a few different possible approaches to this problem: 1) Emit the code to calculate the subgroup invocation on-the-fly and trust NIR to do the CSE. This is what we were doing. 2) Add a back-end instruction for the subgroup ID. This has the advantage of helping the back-end compiler with CSE but has the downside of very poor scheduling for the calculation because it has to be emitted in the back-end. 3) Emit the calculation at the top of the program and re-use the result. This gets rid of the CSE problem but comes at the cost of an extra live register. This commit switches us from 1) to 3). We choose to store the subgroup invocation values as a W type to reduce the impact of the extra live register. Trusting NIR and using 1) was fine but we're soon going to want to use the subgroup invocation value for other things in the back-end compiler and this makes it much easier to do without having to worry about CSE problems. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	295605c930	intel/cs: Push subgroup ID instead of base thread ID We're going to want subgroup ID for SPIR-V subgroups eventually anyway. We really only want to push one and calculate the other from it. It makes a bit more sense to push the subgroup ID because it's simpler to calculate and because it's a real API thing. The only advantage to pushing the base thread ID is to avoid a single SHL in the shader. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	6411defdcd	intel/cs: Re-run final NIR optimizations for each SIMD size With the advent of SPIR-V subgroup operations, compute shaders will have to be slightly different depending on the SIMD size at which they execute. In order to allow us to do dispatch-width specific things in NIR, we re-run the final NIR stages for each SIMD width. One side-effect of this change is that we start rallocing fs_visitors which means we need DECLARE_RALLOC_CXX_OPERATORS. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	4e79a77cdc	intel/compiler: Move the destructor from vec4_visitor to backend_shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	16ada419d7	i965/fs: Get rid of the early return in brw_compile_cs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	80ddfab2f5	intel/cs: Rework the way thread local ID is handled Previously, brw_nir_lower_intrinsics added the param and then emitted a load_uniform intrinsic to load it directly. This commit switches things over to use a specific NIR intrinsic for the thread id. The one thing I don't like about this approach is that we have to copy thread_local_id over to the new visitor in import_uniforms. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	25f7453c9e	intel/fs: Mark 64-bit values as being contiguous This isn't often a problem but, when we're in a compute shader, we must push the thread local ID so we decrement the amount of available push space by 1 and it's no longer even and 64-bit data can, in theory, span it. By marking those uniforms contiguous, we ensure that they never get split in half between push and pull constants. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-07 10:37:52 -08:00
Jason Ekstrand	c4c8cba705	intel/cs: Ignore runtime_check_aads_emit for CS It's only set on gen4-5 which clearly don't support compute shaders. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	d4de813d86	intel/cs: Stop setting dispatch_grf_start_reg Nothing ever reads it for compute shaders because it's always 1. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	b1a9cdede4	intel/cs: Drop max_dispatch_width checks from compile_cs The only things that adjust fs_visitor::max_dispatch_width are render target writes which don't happen in compute shaders so they're pointless. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	1077981eb5	intel/fs: Remove min_dispatch_width from fs_visitor It's 8 for everything except compute shaders. For compute shaders, there's no need to duplicate the computation and it's just a possible source of error. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	b299ded02e	intel/fs: use pull constant locations to check for first compile of a shader Before, we were bailing in assign_constant_locations based on the minimum dispatch size. The more direct thing to do is simply to check for whether or not we have constant locations and bail if we do. For nir_setup_uniforms, it's completely safe to do it multiple times because we just copy a value from the NIR shader. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	103081c9a9	intel/fs: Retype dest to match value in read[First]Invocation This is what we really wanted all along. Always retyping to D works because that's what get_nir_src() always gives us, at least for 32-bit types. The SPIR-V variants of these operations accept arbitrary types and we need this if we're going to handle 64 or 16-bit values. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	ebaee9da4a	intel/fs: Uniformize the index in readInvocation The index is any value provided by the shader and this can be called in non-uniform control flow so we can't just take component 0. Found by inspection. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	b67230de63	intel/fs: Protect opt_algebraic from OOB BROADCAST indices Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	aa4ff4b98c	i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	ec8c6649f1	i965/fs/nir: Minor refactor of store_output Stop retyping the output of shuffle_64bit_data_for_32bit_write. It's always BRW_REGISTER_TYPE_D which is perfectly fine for writing out. Also, when we change get_nir_src to return something with a 64-bit type for 64-bit values, the retyping will not be at all what we want. Also, retyping the output based on src.type before we whack it back to 32 bits is a problem because the output is always 32 bits. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	030d2b5016	i965/fs: Return a fs_reg from shuffle_64bit_data_for_32bit_write All callers of this function allocate a fs_reg expressly to pass into it. It's much easier if we just let the helper allocate the register. While we're here, we switch it to doing the MOVs with an integer type so that we don't accidentally canonicalize floats on half of a double. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	6197a6b7ac	i965/fs/nir: Simplify 64-bit store_output The swizzles weren't doing any good because swiz is just XYZW. Also, we were emitting an extra set of MOVs because shuffle_64bit_data_for_32bit already does a MOV for us. Finally, the temporary was only ever used inside the inner loop so there's no need for it to actually be an array. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	18fde36ced	intel/fs: Use the original destination region for int MUL lowering Some hardware (CHV, BXT) have special restrictions on register regions when doing integer multiplication. We want to respect those when we lower to DxW multiplication. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-07 10:37:52 -08:00
Jason Ekstrand	d54f8ec744	intel/fs: Fix integer multiplication lowering for src/dst hazards Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-07 10:37:52 -08:00
Jason Ekstrand	fd1bcccc2d	intel/fs: Fix MOV_INDIRECT for 64-bit values on little-core The same workaround we need for 64-bit values on little core also takes care of the Ivy Bridge problem and does so a bit more efficiently so we can drop that code while we're here. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-07 10:37:52 -08:00
Jason Ekstrand	6041a31e77	intel/eu: Fix broadcast instruction for 64-bit values on little-core We're not using broadcast for any 64-bit types right now since we mostly use it for emit_uniformize on 32-bit buffer indices. However, SPIR-V subgroups are going to need it for 64-bit so let's make it work. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	10e4feed39	intel/eu/reg: Add a subscript() helper This is similar to the identically named fs_reg helper. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-07 10:37:52 -08:00
Jason Ekstrand	068beb41d8	intel/eu: Just modify the offset in brw_broadcast This means we have to drop const from a variable but it also means that 100% of the code which deals with the offset limit is in one place. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	e3bcc86133	intel/compiler: Add some restrictions to MOV_INDIRECT and BROADCAST These restrictions effectively already existed due to the way we use indirect sources but weren't being directly enforced. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 10:37:52 -08:00
Jason Ekstrand	1b8ef49f48	intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all For some reason, the any/all predicates don't work properly with SIMD32. In particular, it appears that a SEL with a QtrCtrl of 2H doesn't read the correct subset of the flag register and you end up getting garbage in the second half. Work around this by using a pair of 1-wide MOVs and scattering the result. This fixes the any/all instructions for SIMD32. Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-07 10:37:52 -08:00
Jason Ekstrand	1f41663007	intel/fs: Use an explicit D type for vote any/all/eq intrinsics The any/all intrinsics return a boolean value so D or UD is the correct type. Unfortunately, get_nir_dest has the annoying behavior of returning a float type by default. This causes format conversion which gives us -1.0f or 0.0f in the register. If the consumer of the result does an integer comparison to zero, it will give you the right boolean value but if we do something more clever based on the 0/~0 assumption for booleans, this will give the wrong value. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-07 10:37:52 -08:00
Jason Ekstrand	6c00240bc6	intel/fs: Don't stomp f0.1 in SIMD16 ballot In fragment shaders f0.1 is used for discards so doing ballot after a discard can potentially cause the discard to not happen. However, we don't support SIMD32 fragment shaders yet so this isn't a problem. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-07 10:37:52 -08:00
Jason Ekstrand	def013a863	intel/fs: Use ANY/ALL32 predicates in SIMD32 We have ANY/ALL32 predicates and, for the most part, they work just fine. (See the next commit for more details.) Also, due to the way that flag registers are handled in hardware, instruction splitting is able to split the CMP correctly. Specifically, that hardware looks at the execution group and knows to shift its flag usage up correctly so a 2H instruction will write to f0.1 instead of f0.0. Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-07 10:37:52 -08:00
Jason Ekstrand	0d905597fe	intel/fs: Be more explicit about our placement of [un]zip Before, we were careful to place the zip after the last of the split instructions but did unzip on-demand. This changes things so that the unzips go before all of the split instructions and the zip comes explicitly after all the split instructions. As a side-effect of this change, we now emit the split instruction from highest SIMD group to lowest instead of low to high. We could have kept the old behavior, but it shouldn't matter and this made the code easier. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-07 10:37:52 -08:00
Jason Ekstrand	fcd4adb9d0	intel/fs: Pass builders instead of blocks into emit_[un]zip This makes it far more explicit where we're inserting the instructions rather than the magic "before and after" stuff that the emit_[un]zip helpers did based on block and inst. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-07 10:37:52 -08:00
Jason Ekstrand	e8c9e65185	intel/fs: Use a pure vertical stride for large register strides Register strides higher than 4 are uncommon but they can happen. For instance, if you have a 64-bit extract_u8 operation, we turn that into UB -> UQ MOV with a source stride of 8. Our previous calculation would try to generate a stride of <32;8,8>:ub which is invalid because the maximum horizontal stride is 4. To solve this problem, we instead use a stride of <8;1,0>. As noted in the comment, this does not work as a destination but that's ok as very few things actually generate that stride. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-07 10:37:52 -08:00
Eric Anholt	bd24f4890f	broadcom/vc5: Skip emitting textures that aren't used. Fixes crashes when ARB_fp uses texture[1] but not 0, as in piglit's fp-fragment-position.	2017-11-07 09:40:25 -08:00
Eric Anholt	3d5e62dcfa	broadcom/vc5: Add missing SRGBA8 ETC2 support. Fixes piglit oes_compressed_etc2_texture-miptree srgb8-alpha8.	2017-11-07 09:40:25 -08:00
Eric Anholt	6079f7c3c3	broadcom/vc5: Disable early Z test when the FS writes Z. Fixes piglit early-z.	2017-11-07 09:40:25 -08:00
Eric Anholt	eeb9e80272	broadcom/vc5: Shift the min/max lod fields by the BASE_LEVEL. The lod clamping is what limits you between base and last level, and the base level field is just there to help decide where the min/mag change happens. Fixes tex-miplevel-selection GL2:texture()	2017-11-07 09:40:25 -08:00
Eric Anholt	521e1d0275	broadcom/vc5: Add support for anisotropic filtering.	2017-11-07 09:40:25 -08:00
Eric Anholt	a266f78741	broadcom/vc5: Fix mipmap filtering enums. The ordering of the values was even less obvious than I thought, with both the mip filter and the min filter being in different bits depending on whether the mip filter is none. Fixes piglit fs-textureLod-miplevels.shader_test	2017-11-07 09:40:25 -08:00
Eric Anholt	73ec70bf13	broadcom/vc5: Fix height padding of small UIF slices. The HW doesn't pad the slice's height to make a full 4x4 group of UIF blocks. We just need to pad to columns, and the start of the next column appears in the bottom of the previous column's last block. Fixes piglit fs-textureOffset-2D.	2017-11-07 09:40:24 -08:00
Eric Anholt	e23c6991be	broadcom/vc5: Print the actual offsets in HW for our resource layout debug. The alignment of level 0 is non-obvious, so it's hard to turn a faulting address into a slice without this.	2017-11-07 09:40:24 -08:00
Eric Anholt	426c352336	broadcom/vc5: Set the available VS outputs to match the FS inputs. Fixes piglit glsl-es-3.00/minimum-maximums.txt.	2017-11-07 09:40:24 -08:00
Eric Anholt	f1797928fd	broadcom/vc5: Set the max texture LOD bias. The field is signed 8.8, so the usual 16.0f fits. Fixes piglit gl-2.1-minmax.	2017-11-07 09:40:24 -08:00
Eric Anholt	47bd9dac19	broadcom/vc5: Fix translation of stencil ops. They aren't quite in the same order as the gallium defines. Fixes piglit gl-2.0-two-sided-stencil.	2017-11-07 09:40:24 -08:00
Eric Anholt	3be820477f	broadcom/vc5: Move stencil state packing to the CSO. Only the stencil ref comes in as dynamic state at emit time.	2017-11-07 09:19:48 -08:00
Eric Anholt	3da39f2297	broadcom/vc5: Introduce a helper for pre-packing our V3DXX structs. This is so much more pleasant to write than the manual V3D33_whatever_pack() calls, and will be useful for when we start doing actual per-V3D compiles.	2017-11-07 09:19:48 -08:00
Eric Anholt	078b163a9c	broadcom/vc5: Add a cl_emit() variant for merging with a pre-packed struct. Cleans up the hand-written code, at the cost of another ugly macro.	2017-11-07 09:19:48 -08:00
Eric Anholt	735b844b1b	broadcom/vc5: Skip emitting depth offset while disabled. The enable flag is also in the rasterizer state, so it will be emitted once it's needed.	2017-11-07 09:19:48 -08:00
Eric Anholt	386e9362a5	broadcom/vc5: Don't emit stencil config if not doing stencil test. As with blending, we'll have the bit flagged again when it gets reenabled in CONFIGURATION_BITS, so there's no need to emit test state if we're not testing.	2017-11-07 09:19:48 -08:00
Eric Anholt	f90ee6eb2b	broadcom/vc5: Don't emit updated blend factors/funcs while disabled. The dirty bit will be flagged again when re-enabled. Keeps us from emitting blend state in CLs that never do blending.	2017-11-07 09:19:48 -08:00
Eric Anholt	dd429cb2db	broadcom/vc5: Fix missing enum decode for indexed primitives.	2017-11-07 09:19:48 -08:00
Eric Anholt	bb6997e6a3	broadcom/vc5: Drop padding bits from the bottom of the TSDA address. Fixes misaligned-looking addresses in decode.	2017-11-07 09:19:48 -08:00
Eric Anholt	949ac638bc	broadcom/vc5: Make sure the TMU indirect struct is appropriately aligned. I was hoping that this would help with fbo-generatemipmap hangs, but no luck.	2017-11-07 09:19:48 -08:00
Kenneth Graunke	cb47de4ff0	broadcom/genxml: Fix decoding of groups with small fields. Groups containing fields smaller than a byte were probably not being decoded correctly. For example: <group count="32" start="32" size="4"> <field name="Vertex Element Enables" start="0" end="3" type="uint"/> </group> gen_field_iterator_next would properly walk over each element of the array, incrementing group_iter. However, the code to print the actual values only considered iter->field->start/end, which are 0 and 3 in the above example. So it would always fetch bits 3:0 of the current byte, printing the same value over and over. Cc: Eric Anholt <eric@anholt.net>	2017-11-07 09:19:48 -08:00
Eric Anholt	47dac5d2bc	broadcom/vc5: Use DEPTH24_STENCIL8 for rendering to depth-only textures. The HW puts the pad bits at the top for DEPTH_COMPONENT24, but we need it at the bottom for texturing. Using the format with stencil probably means we won't be able to do Z24 and separate S8, but I wasn't planning on supporting that anyway. Fixes hiz-depth-read-fbo-d24-s0	2017-11-07 09:19:48 -08:00
Chad Versace	3ea37d0a2a	anv: Suffix anv-private 'VK' tokens with 'ANV' I saw VK_IMAGE_ASPECT_ANY_COLOR_BIT while hacking anv_formats.c and got confused. "Huh? What extension added that?". No extension defines it; anv_private.h defines it. To remove confusion, rename the anv-private VK tokens as if they were extension tokens with the ANV vendor suffix. I found only two such tokens: VK_IMAGE_ASPECT_ANY_COLOR_BIT VK_IMAGE_ASPECT_PLANES_BITS Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-07 09:06:41 -08:00
Chad Versace	012b54c6b1	anv: Remove unused variable 'gen' In anv_physical_device_get_format_properties(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-11-07 09:06:30 -08:00
Marek Olšák	33000e7c43	radeonsi: add si_screen::has_ls_vgpr_init_bug Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-07 17:58:40 +01:00
Marek Olšák	cde664ab81	radeonsi: use ac_create_target_machine Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-07 17:58:38 +01:00
Marek Olšák	81f81fdb54	radeonsi: use ac_get_llvm_processor_name Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-07 17:58:36 +01:00
Marek Olšák	c29f5fe41c	radeonsi/gfx9: don't set gs_table_depth Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-07 17:58:33 +01:00
Marek Olšák	e616743dab	radeonsi/gfx9: limit the scissor bug workaround to Vega10 and Raven only Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-07 17:26:36 +01:00
Marek Olšák	24e9004708	radeonsi: remove unused field in the PCI ID table Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2017-11-07 17:26:36 +01:00
Miklós Máté	cf47dfe8f1	mesa: fix deleting the dummy ATI_fs The DummyShader is used by GenFragmentShadersATI() as a placeholder to mark IDs as allocated. Context cleanup wants to delete everything in ctx->Shared->ATIShaders, and crashes on these placeholders with this backtrace: ==15060== Invalid free() / delete / delete[] / realloc() ==15060== at 0x482F478: free (vg_replace_malloc.c:530) ==15060== by 0x57694F4: _mesa_delete_ati_fragment_shader (atifragshader.c:68) ==15060== by 0x58B33AB: delete_fragshader_cb (shared.c:208) ==15060== by 0x5838836: _mesa_HashDeleteAll (hash.c:295) ==15060== by 0x58B365F: free_shared_state (shared.c:377) ==15060== by 0x58B3BC2: _mesa_reference_shared_state (shared.c:469) ==15060== by 0x578687F: _mesa_free_context_data (context.c:1366) ==15060== by 0x595E9EC: st_destroy_context (st_context.c:642) ==15060== by 0x5987057: st_context_destroy (st_manager.c:772) ==15060== by 0x5B018B6: dri_destroy_context (dri_context.c:217) ==15060== by 0x5B006D3: driDestroyContext (dri_util.c:511) ==15060== by 0x4A1CBE6: dri3_destroy_context (dri3_glx.c:170) ==15060== Address 0x7b5dae0 is 0 bytes inside data symbol "DummyShader" Also, DeleteFragmentShadersATI() should not assert on DummyShader, just remove the hash entry. Normally one would define a shader after GenFragmentShadersATI(), and BindFragmentShaderATI() replaces the placeholder with a real object. However, the specification doesn't say that one has to define a shader for each allocated ID. Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-11-07 17:26:36 +01:00
Michel Dänzer	cd3b55ad07	gallium: Guard assertions by NDEBUG instead of DEBUG This matches the standard assert.h header. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-07 16:47:15 +01:00
Eric Engestrom	1e6f9ea212	meson: only turn on Mesa's DEBUG for buildtype==debug As discussed in this thread: https://lists.freedesktop.org/archives/mesa-dev/2017-November/175104.html Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Chad Versace <chadversary@chromium.org>	2017-11-07 11:01:32 +00:00
Eric Engestrom	d5597f09c6	meson: switch default build type to debugoptimized As discussed in this thread: https://lists.freedesktop.org/archives/mesa-dev/2017-November/175104.html Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Michel Dänzer <michel@daenzer.net> Cc: Christian Schmidbauer <ch.schmidbauer@gmail.com> Cc: Eero Tamminen <eero.t.tamminen@intel.com> Cc: Ernst Sjöstrand <ernstp@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Andres Rodriguez <andresx7@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Tested-by: Chad Versace <chadversary@chromium.org>	2017-11-07 11:00:03 +00:00
Eric Engestrom	cc15460e18	meson: drop GLESv1 .so version back to 1.0.0 autotools generates libGLESv1_CM.so.1.0.0, so let's make sure meson does the same. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-07 10:47:20 +00:00
Eric Engestrom	5be1b1a8ce	meson: standardize .so version to major.minor.patch This `version` field defines the filename for the .so. The plain .so as well as .so.$major are always symlinks to this. Unless I'm mistaken, only the major is ever used, so this shouldn't matter, but for consistency with autotools (and in case it does matter), let's always have all 3 major.minor.patch components. (The soname isn't affected, and is always .so.$major) Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-07 10:47:20 +00:00
Dave Airlie	0084f4a422	ac/nir: for ubo load use correct num_components I was hacking something stupid in doom, and hit an assert for the bitcast following this, it definitely looks like this should be the number of 32-bit components, not the instr level ones. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-07 14:54:19 +10:00
Gwan-gyeong Mun	fb87c40a58	nir: fix a typo Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-06 18:11:24 -08:00
Tomasz Figa	0886be093f	glsl: Allow precision mismatch on dead data with GLSL ES 1.00 Commit `259fc50545` added a linker error for mismatching uniform precision, as required by the GLES 3.0 specification and conformance test-suite. Several Android applications, including Forge of Empires, have shaders which violate this rule, on a dead varying that will be eliminated. The problem affects a large number of applications using the Cocos2D engine, and other GLES implementations accept this, so it poses a serious application compatibility issue. Starting from GLSL ES 3.0, declarations with conflicting precision qualifiers are explicitly prohibited. However GLSL ES 1.00 does not clearly specify the behavior, except that "Uniforms are defined to behave as if they are using the same storage in the vertex and fragment processors and may be implemented this way. If uniforms are used in both the vertex and fragment shaders, developers should be warned if the precisions are different. Conversion of precision should never be implicit." The word "used" is not clear in this context and might refer to 1) declared (same as GLES 3.x) 2) referred after post-processing, or 3) linked after all optimizations are done. Looking at existing applications, 2) or 3) seems to be widely adopted. To avoid compatibility issues, turn the error into a warning if GLSL ES version is lower than 3.0 and the data is dead in at least one of the shaders. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97532 Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-06 15:16:03 -08:00
Timothy Arceri	a9000cb860	i965: disable NIR linking on HSW and below Fixes: `379b24a40d` "i965: make use of nir linking" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103537 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-11-07 09:14:05 +11:00
Dave Airlie	201b3b8d0d	radv: move is_local up to the winsys level. We can avoid adding the buffer in the non-local case, this will avoid all the overhead of the indirect call. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 21:45:59 +00:00
Dave Airlie	25660499b6	radv: wrap cs_add_buffer in an inline. (v2) The next patch will try and avoid calling the indirect function. v2: add a missing conversion. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 21:45:59 +00:00
Dave Airlie	31b5da7958	radv: when loading regs no need to add buffer The function that calls us has just added the buffer to the list already, no need to try and add it again. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 21:44:49 +00:00
Dave Airlie	3bf8be41b8	radv: pre-calculate user_data_0 registers and store in pipeline There's no point recalculating these the whole time on descriptor emission, just store them at pipeline creation. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 21:44:49 +00:00
Adam Jackson	d547e18184	docs: Mark GLX_ARB_context_flush_control done Requires an unreleased X server, but from the client GLX side this is as done as it gets. Signed-off-by: Adam Jackson <ajax@redhat.com>	2017-11-06 16:21:57 -05:00
Neil Roberts	6ce9006d76	i965: Enable flush control Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Neil Roberts <neil@linux.intel.com>	2017-11-06 16:09:03 -05:00
Adam Jackson	791d06b23b	drisw: Enable flush control for llvmpipe and softpipe Hilariously this is a fairly big win. Neil's multi-context-test improves from ~24 to ~36 fps with llvmpipe on a Core i5-3317U. softpipe also improves, from about 2.25 to 3.09 fps (when it's that slow, you're allowed to be that precise). I'd have added it to swrast classic, but the testcase wants GL 3.0 and shaders, and that's not a thing classic has, so I figured making it work on softpipe was crime enough. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2017-11-06 16:09:03 -05:00
Adam Jackson	5cc06bec19	gallium: Wire up flush control Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2017-11-06 16:09:03 -05:00
Adam Jackson	c0be3aae6c	egl: Implement EGL_KHR_context_flush_control Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2017-11-06 16:09:03 -05:00
Neil Roberts	ba7679f48d	glx: Implement GLX_ARB_context_flush_control Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Neil Roberts <neil@linux.intel.com>	2017-11-06 16:09:02 -05:00
Neil Roberts	b89067c84f	dri: Add a flush control extension This advertises that the driver can accept a new context attribute __DRI_CTX_ATTRIB_RELEASE_BEHAVIOR. Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Neil Roberts <neil@linux.intel.com>	2017-11-06 16:09:02 -05:00
Neil Roberts	6d87500fe1	dri: Change __DriverApiRec::CreateContext to take a struct for attribs Previously the CreateContext method of __DriverApiRec took a set of arguments to describe the attribute values from the window system API's CreateContextAttribs function. As more attributes get added this could quickly get unworkable and every new attribute needs a modification for every driver. To fix that, pass the attribute values in a struct instead. The struct has a bitmask to specify which members are used. The first three members (two for the GL version and one for the flags) are always set. If the bit is not set in the attribute mask then it can be assumed the attribute has the default value. Drivers will error if unknown bits in the mask are set. Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Neil Roberts <neil@linux.intel.com>	2017-11-06 16:09:02 -05:00
Neil Roberts	8c0729fd99	intel: Don't flush the old context in intelMakeCurrent It shouldn't be necessary to flush the context within the driver implementation because the old context is explicitly flushed in _mesa_make_current which is called a little further on. It is useful to only have a single place that flushes when switching contexts to make it easier to later implement the GL_KHR_context_flush_control extension. The flush in intelMakeCurrent was added in commit `5505865` to implement the GLX semantics that the context should be flushed when it is released. When the commit was made there was no flush in _mesa_make_current because it was only added later in `93102b4c`. I think that later commit effectively makes the first commit redundant. Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Neil Roberts <neil@linux.intel.com>	2017-11-06 16:08:58 -05:00
Adam Jackson	9ef7158a09	egl/dri2: Factor out context attribute initialization Signed-off-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-06 16:08:58 -05:00
Wladimir J. van der Laan	96463614a3	etnaviv: Don't over-pad compressed textures HALIGN_FOUR/SIXTEEN has no meaning for compressed textures, and we can't render to them anyway. So use the tightest possible packing. This avoids bugs with non-power-of-two block sizes. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-06 21:31:20 +01:00
Wladimir J. van der Laan	93ba3f29bb	etnaviv: ASTC texture support Add ASTC texture support for hardware that supports this (currently only GC3000 on i.MX6qp is known to have this). Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-06 21:30:54 +01:00
Wladimir J. van der Laan	f1e1c60ff6	etnaviv: Update from rnndb Updated as of etna_viv commit 3b4a8ec. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-11-06 21:29:19 +01:00
Dave Airlie	4bcb48b831	radv: add initial copy descriptor support. (v2) It appears the latest dota2 vulkan uses this, and we get a hang in VR mode without it. v2: remove finishme I left in after finishing. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 19:12:39 +00:00
Marek Olšák	71f5fe36b7	gallium/u_vbuf: use signed vertex buffers offsets for optimal uploads Uploaded data must start at (stride * start), because we can't modify start in all cases. If it's the first allocation, it's also the amount of memory wasted. If the starting offset is larger than the size of the upload buffer, the buffer is re-created, used for 1 upload, and then thrown away. If the upload is small, most of the buffer space is unused and wasted. Keep doing that and the OOM killer comes. It's actually pretty quick. With signed VB offsets, we can set min_out_offset = 0 in u_upload_alloc/u_upload_data. This fixes OOM situations with SPECviewperf.	2017-11-06 19:09:12 +01:00
Marek Olšák	3f58988b81	radeonsi: enable signed vertex buffer offsets	2017-11-06 19:09:12 +01:00
Marek Olšák	24d6318d24	gallium: add PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET	2017-11-06 19:09:12 +01:00
Juan A. Suarez Romero	e17e8934f9	automake: include git_sha1.h.in in release tarball Fixes: make[2]: Leaving directory '/home/local/mesa/mesa-17.4.0-devel/_build/sub/src' make[2]: *** No rule to make target '../../../src/git_sha1.h.in', needed by 'git_sha1.h'. Stop. Makefile:660: recipe for target 'all-recursive' failed Fixes: `16be271c6e` "git_sha1_gen: use git_sha1.h.in on all build systems" Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-11-06 18:18:42 +01:00
Marek Olšák	adab7f16ff	radeonsi: don't map big VRAM buffers for the first upload directly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-06 16:23:20 +01:00
Marek Olšák	4b0dc098b2	gallium/u_threaded: don't map big VRAM buffers for the first upload directly This improves Paraview "many spheres" performance 4x along with the radeonsi commit. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-06 16:23:20 +01:00
Marek Olšák	a5d3999c31	gallium/u_threaded: clean up tc_improve_map_buffer_flags and prevent reentry Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-06 16:23:20 +01:00
Dave Airlie	60a9705e00	radv: move descriptor sets out of cmd_state. Instead of storing all the pointers and zeroing them all out, just store a valid bitmask in the state. This also moves the CmdBindPipeline path down the cpu usage path for the multithreading demo as it no longer has to traverse MAX_SETS to find the active descriptor sets. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 01:11:03 +00:00
Dave Airlie	3a0d098252	radv: add helper for setting a descriptor. This is just a simple refactor. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 01:11:00 +00:00
Dave Airlie	b48063a2f2	radv: move vertex binding out of cmd state. This isn't required to be cleared, since buffers are only linked by vertex elements, so if elements are clear then no buffers should be referenced. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 01:10:56 +00:00
Dave Airlie	7365626d78	radv: reorder cmd_state to remove a hole. This just removes a hole in the cmd_state and packs some bools together. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 01:10:53 +00:00
Dave Airlie	f0ae06a13c	radv: free attachments on end command buffer. If we allocate attachments in the begin command buffer due to the render pass continue bit, we were leaking them. Since renderpasses inside a cmd buffer malloc/free these properly, and set to NULL, we just need to call free at end. Fixes a memory leak with multithreading demo. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 01:03:47 +00:00
Bas Nieuwenhuizen	608af05ffb	radv: Optimize calling radv_save_descriptors. uint32_t data[MAX_SETS * 2] = {}; was getting executed before the exit and took significant amounts of time. By having the check outside the function, we skip the execution of the clear. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-04 20:18:17 +01:00
Bas Nieuwenhuizen	cecbcf4b2d	radv: Use an array to store descriptor sets. The vram_list linked list resulted in lots of pointer chasing. Replacing this with an array instead improves descriptor set allocation CPU usage by 3x at least (when also considering the free), because it had to iterate through 300-400 sets on average. Not a huge improvement as the pre-improvement CPU usage was only about 2.3% in the busiest thread. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-04 20:18:17 +01:00
Pierre Moreau	b041687ed1	nv50,nvc0: Display shared memory usage in pipe_debug_message Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>	2017-11-04 14:12:07 -04:00
Pierre Moreau	efe532b739	nv50,nvc0: Copy shared memory per block to the program info structure and back In OpenCL/CUDA kernels, shared memory usage can be defined within the kernel code. That usage will only be picked up while parsing the SPIR-V, during the translation phase of the program. Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>	2017-11-04 14:12:07 -04:00
Pierre Moreau	49752e99f8	nv50/ir: Store shared memory per block in nv50_ir_prog_info Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>	2017-11-04 14:12:07 -04:00
Anuj Phogat	898e5555de	i965/gen10: Implement Wa3DStateMode This workaround doesn't fix any of the piglit hangs we've seen on CNL. But it might be fixing something we haven't tested yet. V2: Remove the bits enabling Float blend optimization. It is enabled through CACHE_MODE_SS register. Update the comment. Move gen10 if block on top of gen9 if block. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-03 14:30:34 -07:00
Anuj Phogat	6c681b4cc1	i965/gen10: Enable float blend optimization This optimization is enabled for previous generations too. See Mesa commit `c17e214a6b` On CNL this bit has been moved to CACHE_MODE_SS register. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-03 14:30:34 -07:00
Anuj Phogat	d3d0fe4572	i965/gen10: Implement WaForceRCPFEHangWorkaround This workaround doesn't fix any of the piglit hangs we've seen on CNL. But it might be fixing something we haven't tested yet. V2: Add the check for Post Sync Operation. Update the workaround comment. Use braces around if-else. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-03 14:30:34 -07:00
Anuj Phogat	3cf4fe2219	i965/gen10: Implement WaSampleOffsetIZ workaround There are a few other (duplicate) workarounds which have similar recommendations: WaFlushHangWhenNonPipelineStateAndMarkerStalled WaCSStallBefore3DSamplePattern WaPipeControlBefore3DStateSamplePattern WaPipeControlBefore3DStateSamplePattern has some extra recommendations if the driver is using mid-batch context restore. Ignoring it for now because we're not doing mid-batch context restore in Mesa. This workaround doesn't fix any of the piglit hangs we've seen on CNL. But it might be fixing something we haven't tested yet. V2: Use brw_load_register_imm32() to program CACHE_MODE_0. Get rid of brw_flush_gpu_caches(). V3: Make the workaround helper functions static. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-11-03 14:30:33 -07:00
Anuj Phogat	7a09be2dc9	i965/gen10: Don't set Antialiasing Enable in 3DSTATE_RASTER if num_samples > 1 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-03 14:30:33 -07:00
Anuj Phogat	2d10eb5ed8	i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-03 14:30:33 -07:00
Andrey Grodzovsky	19fc3cdcfb	winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx. Fixes reverted patch `f03b7c9` by doing VMID reservation per process and not per context. Also updates required amdgpu libdrm version since the change involved interface updates in amdgpu libdrm. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-11-03 18:06:17 +01:00
Lionel Landwerlin	24ec29b919	i965: perf: list registers to program for queries Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-03 14:25:36 +00:00
Lionel Landwerlin	285a2192f9	i965: perf: factorize code for availability Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-03 14:23:39 +00:00
Lionel Landwerlin	05231a4e74	i965: perf: make revision variable available This will be used in the next commit to build up register programming. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-11-03 14:23:22 +00:00
Nicolai Hähnle	ca63a5ed3e	glsl: fix interpolateAtXxx(some_vec[idx], ...) with dynamic idx The dynamic index of a vector (not array!) is lowered to a sequence of conditional assignments. However, the interpolate_at_* expressions require that the interpolant is an l-value of a shader input. So instead of doing conditional assignments of parts of the shader input and then interpolating that (which is nonsensical), we interpolate the entire shader input and then do conditional assignments of the interpolated result. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-03 14:30:08 +01:00
Nicolai Hähnle	4f42450b86	glsl: allow any l-value of an input variable as interpolant in interpolateAt* The intended rule has been clarified in GLSL 4.60, Section 8.13.2 (Interpolation Functions): "For all of the interpolation functions, interpolant must be an l-value from an in declaration; this can include a variable, a block or structure member, an array element, or some combination of these. Component selection operators (e.g., .xy) may be used when specifying interpolant." For members of interface blocks, var->data.must_be_shader_input must be determined on-the-fly after lowering interface blocks, since we don't want to disable varying packing for an entire block just because one input in it is used in interpolateAt. v2: keep setting must_be_shader_input in ast_function (Ian) v3: follow the relaxed rule of GLSL 4.60 v4: only apply the relaxed rules to desktop GL (the ES WG decided that the relaxed rules may apply in a future version but not retroactively; see also dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_centroid.negative.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101378 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-11-03 14:30:08 +01:00
Dave Airlie	57372c5a42	nir/serialize: fix build with gcc 4.4.7 I had to build on RHEL6 today, and noticed this. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-03 15:03:35 +10:00
Dave Airlie	0722b6d693	i915g: remove some unknown cap warnings.	2017-11-03 15:03:30 +10:00
Dave Airlie	cc69f2385e	i915g: make gears run again. We need to validate some structs exist before we dirty the states, and avoid the problem in some other places. Fixes: `e027935a7` ("st/mesa: don't update unrelated states in non-draw calls such as Clear")	2017-11-03 15:03:30 +10:00
Timothy Arceri	6e2eb96b64	ac: remove the remaining duplicate llvm types Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	e73a467005	ac: remove unused v4f32 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	7f4966731f	ac: add v2f32 to the common code and make use of it Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	cd6cfd1095	ac: use the ac f16 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	8f651ae062	ac: use the ac f32 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	368654a299	ac: use the ac f64 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	d927db0672	ac: use the common v8i32 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	9db51b2393	ac: use the common v4i32 llvm type Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:46 +11:00
Timothy Arceri	ee376ac6f4	ac: add v3i32 to the common code and make use of it Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	309a51411d	ac: add v2i32 to the common code and use it Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	c64cfa0392	ac: use the ac i64 llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	3d45acf71c	ac: remove unused i16 llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	4d4799643d	ac: use the ac ivoidt llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	209ad5c16f	ac: use the ac i8 llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	21d71189ec	ac: use the ac i1 llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	bd59a0bb8b	ac: use the ac i32 llvm type Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:54:45 +11:00
Timothy Arceri	439a2febc4	ac/radeonsi: add support for tex instr without a dereference These are produced by nir_lower_bitmap(). Adding the missing dereference would cause other issues that need to be hacked around, such as skipping sampler lowering and uniform location assignment, so this change seems the correct way to go. Fixes 194 piglit crashes on radeonsi using NIR. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:19:51 +11:00
Timothy Arceri	440d08fe93	nir: skip lowering sampler if there is no dereference This avoids a crash on the output of nir_lower_bitmap(). Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:19:46 +11:00
Dave Airlie	de126b0402	r600: add support for early depth/stencil. This adds support for the early depth/stencil property found on image shaders. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-03 09:33:37 +10:00
Dave Airlie	f3c6149c26	r600: add support for emitting RAT instructions to the assembler. This adds support for emitting RAT instructions to the assembler. RAT instructions are used to implement image accessors. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-03 09:33:33 +10:00
Dave Airlie	159bf38c3a	r600: add support for mark bit to the assembler. This adds support to the assembler for the mark bit on the export word1. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-03 09:33:30 +10:00
Dave Airlie	90ca378080	r600: add support for valid pixel mode on CF clauses This just adds support to the assembler for setting the valid pixel mode on the CF clause. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-03 09:33:26 +10:00
Dave Airlie	d584b4671f	r600: add support for some ALU sources. These special ALU sources provide the shader engine, simd and hw wave ids. These are required for images support. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-03 09:31:50 +10:00
Samuel Pitoiset	bad31f6a65	radv: use the optimal packets order for dispatch calls This should reduce the time where compute units are idle, mainly for meta operations because they use a bunch of compute shaders. This seems to have a really minor positive effect for Talos, at least. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-02 23:03:59 +01:00
Timothy Arceri	cf5f8f55c3	nir: add tess patch support to nir_remove_unused_varyings() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 08:58:39 +11:00
Dylan Baker	4ff6187b84	es2api/ABI-check: Add es3.x symbols Currently this ABI check only checks for es2 symbols, but es3.x symbols are also exposed. Exposing these symbols is recommended by Khronos, and as such the test should accept that as ABI. see: https://lists.freedesktop.org/archives/mesa-stable/2016-June/004545.html for the discussion about exposing these symbols cc: Ian Romanick <idr@freedesktop.org> Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2017-11-02 14:50:52 -07:00
Dylan Baker	a5635d993a	meson: Set c visibility args for wayland-drm Because otherwise gbm will expose wayland symbols that it shouldn't. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-02 14:50:18 -07:00
Timothy Arceri	4837ad4832	st/glsl_to_nir: pass gl_shader_program to st_finalize_nir() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 08:32:35 +11:00
Bas Nieuwenhuizen	806721429a	radv: Don't expose heaps with 0 memory. It confuses CTS. This pregenerates the heap info into the physical device, so we can use it for translating contiguous indices into our "standard" ones. This also makes the WSI a bit smarter in case the first preferred heap does not exist. Reviewed-by: Dave Airlie <airlied@redhat.com> CC: <mesa-stable@lists.freedesktop.org>	2017-11-02 20:28:19 +01:00
Dylan Baker	a29869e872	gbm: Don't traverse backwards for includes This is just a bad idea and should be avoided. Instead, make the #include flat and fix the build systems to pass the proper -I flags v2: - add an inc_wayland_drm instead passing a path to include_directories (Emil) - update commit message (Emil) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> (v1) Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)	2017-11-02 11:39:45 -07:00
Dylan Baker	10d869535c	automake: Remove unused include path Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-02 11:39:44 -07:00
Marek Olšák	529cdce799	radeonsi: remove 'Authors:' comments It's inaccurate. Instead, see the copyright and use "git log" and "git blame" to know the authorship. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-02 18:19:03 +01:00
Jason Ekstrand	172e8e42c4	intel/fs: Don't allocate a param array for zero push constants Thanks to the ralloc invariant of "any pointer returned from ralloc can be used as a context", calling ralloc_size with a size of zero will cause it to allocate at least a header. If we don't have any push constants, then NULL is perfectly acceptable (and even preferred). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-11-02 09:55:21 -07:00
Jason Ekstrand	7b4387519c	intel/fs: Alloc pull constants off mem_ctx It doesn't actually matter since the only user of push constants, i965, ralloc_steals it back to NULL but it's more consistent and probably fixes memory leaks in some error cases. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org	2017-11-02 09:55:21 -07:00
Dylan Baker	cb831d918b	Revert "meson: bump libdrm version required by amdgpu" This reverts commit `d364684711`. The commit that bumped the autotools version was reverted, so let's revert the meson version to match. fixes: `1f2640bfa9` "Revert "winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx."" Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-11-02 09:49:03 -07:00
Tim Rowley	0023b5ae67	gallivm: allow arch rounding with avx512 Fixes piglit vs-roundeven-{float,vec[234]} with simd16 VS. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-11-02 10:24:54 -05:00
Wladimir J. van der Laan	0ba4320d94	etnaviv: Allow clearing constant buffer using buffer==NULL user_buffer==NULL Prevents an assertion when using GALLIUM_HUD with ioquake3, when cso_restore_constant_buffer_slot0 restores an empty constant buffer in slot 0. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2017-11-02 11:03:30 +01:00
Wladimir J. van der Laan	bc71c31842	etnaviv: Don't flush on transfer when UNSYNCHRONIZED Structure code to only flush when we will potentially call cpu_prep. This prevents spurious flushes in applications that heavily rely on u_uploader. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2017-11-02 11:00:26 +01:00
Wladimir J. van der Laan	8fbd82f464	etnaviv: don't do resolve-in-place without valid TS GC3000 resolve-in-place assumes that the TS state is configured. If it is not, this will result in MMU errors. This is especially apparent when using glGenMipmaps(). Fixes: `78ade65956` ("etnaviv: Do GC3000 resolve-in-place when possible") Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2017-11-02 10:58:48 +01:00
Samuel Pitoiset	c39f39106d	radv: make radv_bind_descriptor_set() static Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-02 09:36:14 +01:00
Dave Airlie	799ef80059	radv: make sure we set buffers as shareable properly. This should make sure we don't treat exports buffers as local bos. Fixes: `a639d40f13` (radv: add support for local bos. (v3)) Tested-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-02 01:01:29 +00:00
Dylan Baker	6594213cfa	svga: Use __asm__ instead of asm __asm__ is portable, and allows the svga driver to be compiled with the c99 standard instead of requiring the gnu99 standard. I have compile tested this with GCC and Clang on Linux. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2017-11-01 15:05:26 -07:00
Marek Olšák	1f2640bfa9	Revert "winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx." This reverts commit `f03b7c9ad9`. The libdrm interface is wrong.	2017-11-01 21:42:31 +01:00
Lionel Landwerlin	8d8b9d11c9	intel: decoder: enable decoding a single field Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 17:23:49 +00:00
Lionel Landwerlin	bb16503542	intel: decoder: expose missing find_enum() Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 17:23:49 +00:00
Lionel Landwerlin	ad876f721e	intel: decoder: extract field value computation Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 17:23:49 +00:00
Lionel Landwerlin	81aee9fd4b	intel: decoder: rename field() to field_value() We would like to avoid collisions with variables named field. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 17:23:49 +00:00
Lionel Landwerlin	69d158573a	intel: decoder: rename internal function to free name Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 17:23:49 +00:00
Lionel Landwerlin	20156931bf	intel: decoder: simplify field_is_header() Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 17:23:49 +00:00
Lionel Landwerlin	cab93a901e	intel: common: make intel utils available from C++ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 17:23:49 +00:00
Lionel Landwerlin	ea14ba0179	intel: decoder: remove unused platform field Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 17:23:49 +00:00
Lionel Landwerlin	938f62a1c7	intel: error-decode: implement a rolling window of programs If we have more programs than what we can store, aubinator_error_decode will assert. Instead let's have a rolling window of programs. v2: Fix overflowing issues (Eric Engestrom) v3: Go through programs starting at idx_program (Scott) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 17:23:49 +00:00
Brian Paul	eedecb4eca	gallium: increase pipe_sampler_view::target bitfield size for MSVC MSVC treats enums as being signed. The 4-bit target field isn't large enough to correctly store the value 8 (for PIPE_TEXTURE_CUBE_ARRAY). The bitfield value 0x8 was being interpreted as -8 so matching the target with PIPE_TEXTURE_CUBE_ARRAY in switch statements, etc. was failing. To keep the structure size the same, we reduce the format field from 16 bits to 15. There don't appear to be any other enum bitfields which need to be adjusted. This fixes a number of Piglit cube map array tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2017-11-01 11:06:02 -06:00
Eric Engestrom	5d4ffb9970	mapi: fix .so path in ABI-check Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2017-11-01 15:43:46 +00:00
Lionel Landwerlin	38f338c19a	intel: decoder: extract instruction/structs length Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 13:49:12 +00:00
Lionel Landwerlin	279531672e	intel: decoder: pack iterator variable declarations Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 13:49:12 +00:00
Lionel Landwerlin	1cf1591abd	intel: decoder: simplify creation of struct when 0-allocated Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 13:49:12 +00:00
Lionel Landwerlin	eb00b8b18c	intel: decoder: add destructor for gen_spec This makes use of ralloc to simplify the destruction. We can also store instructions in hash tables. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 13:49:12 +00:00
Lionel Landwerlin	de213b4af8	intel: decoder: expose helper to test header fields These fields are of little importance as they're used to recognize instructions. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 13:19:20 +00:00
Lionel Landwerlin	68e1853ea3	intel: decoder: don't read qword outside instruction/struct limit We used to print invalid data when the last field was being clamped to 32bits due to Dword Length of the whole instruction. Here is an example where the decoder read part of the next instruction instead of stopping at the 32bit limit: 0x000ce0b4: 0x10000002: MI_STORE_DATA_IMM 0x000ce0b4: 0x10000002 : Dword 0 DWord Length: 2 Store Qword: 0 Use Global GTT: false 0x000ce0b8: 0x00045010 : Dword 1 Core Mode Enable: 0 Address: 0x00045010 0x000ce0bc: 0x00000000 : Dword 2 0x000ce0c0: 0x00000000 : Dword 3 Immediate Data: 8791026489807077376 With this change we have the proper value : 0x000ce0b4: 0x10000002: MI_STORE_DATA_IMM (4 Dwords) 0x000ce0b4: 0x10000002 : Dword 0 DWord Length: 2 Store Qword: 0 Use Global GTT: false 0x000ce0b8: 0x00045010 : Dword 1 Core Mode Enable: 0 Address: 0x00045010 0x000ce0bc: 0x00000000 : Dword 2 0x000ce0c0: 0x00000000 : Dword 3 Immediate Data: 0 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 13:19:20 +00:00
Lionel Landwerlin	f5e5ca1e21	intel: decoder: split out getting the next field and decoding it Due to the new way we handle fields, we must not forget the first field when decoding instructions. The issue was that the advance function was called first and skipped the first field. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 13:19:20 +00:00
Lionel Landwerlin	ffa011d1e3	intel: decoder: move field name copy This should be inside the function that actually decodes fields. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 13:19:20 +00:00
Lionel Landwerlin	0698318d1a	intel: decoder: reorder iterator init function Making the next change more readable. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 13:19:20 +00:00
Lionel Landwerlin	1b369acdd8	intel: common: print out all dword with field spanning multiple dwords For example, we were skipping Dword 3 in this PIPE_CONTROL : 0x000ce130: 0x7a000004: PIPE_CONTROL DWord Length: 4 0x000ce134: 0x00000010 : Dword 1 Flush LLC: false Destination Address Type: 0 (PPGTT) LRI Post Sync Operation: 0 (No LRI Operation) Store Data Index: 0 Command Streamer Stall Enable: false Global Snapshot Count Reset: false TLB Invalidate: false Generic Media State Clear: false Post Sync Operation: 0 (No Write) Depth Stall Enable: false Render Target Cache Flush Enable: false Instruction Cache Invalidate Enable: false Texture Cache Invalidation Enable: false Indirect State Pointers Disable: false Notify Enable: false Pipe Control Flush Enable: false DC Flush Enable: false VF Cache Invalidation Enable: true Constant Cache Invalidation Enable: false State Cache Invalidation Enable: false Stall At Pixel Scoreboard: false Depth Cache Flush Enable: false 0x000ce138: 0x00000000 : Dword 2 Address: 0x00000000 0x000ce140: 0x00000000 : Dword 4 Immediate Data: 0 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 13:19:20 +00:00
Lionel Landwerlin	3ae5c57916	intel: decoder: build sorted linked lists of fields The xml files don't always have fields in order. This might confuse our parsing of the commands. Let's have the fields in order. To do this, the easiest way is to use a linked list. It also helps a bit with the iterator. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 13:19:20 +00:00
Lionel Landwerlin	957a6eea7a	intel: common: expose gen_spec fields Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-11-01 13:19:20 +00:00
Eric Engestrom	f0ab3f7635	travis: build meson first for quicker feedback Meson is much quicker to build Mesa, giving quicker feedback if executed first. Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-11-01 09:57:32 +00:00
Eric Engestrom	d364684711	meson: bump libdrm version required by amdgpu Fixes: `f03b7c9ad9` "winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx." Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-11-01 09:57:32 +00:00
Jordan Justen	1a61a8b9a7	i965: Initialize disk shader cache if MESA_GLSL_CACHE_DISABLE is false (Apologies for the double negative.) For now, the shader cache is disabled by default on i965 to allow us to verify its stability. In other words, to enable the shader cache on i965, set MESA_GLSL_CACHE_DISABLE to false or 0. If the variable is unset, then the shader cache will be disabled. We use the build-id of i965_dri.so for the timestamp, and the pci device id for the device name. v2: * Simplify code by forcing link to include build id sha. (Matt) v3: * Don't use a for loop with snprintf for bin to hex. (Matt) * Assume fixed length render and timestamp string to further simplify code. Cc: Matt Turner <mattst88@gmail.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:46:53 -07:00
Jordan Justen	ccb700526f	dri drivers: Always add the sha1 build-id v4: * Add Android build changes. (Emil) Cc: Dylan Baker <dylanx.c.baker@intel.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Jordan Justen	e5b141634c	disk_cache: Fix issue reading GLSL metadata This would cause the read of the metadata content to fail, which would prevent the linking from being skipped. Seen on Rocket League with i965 shader cache. Fixes: `b86ecea344` "util/disk_cache: write cache item metadata to disk" Cc: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Jordan Justen	e6ecd7d73f	glsl/shader_cache: Save fs (BlendSupport) metadata Fixes many GL 4.5 CTS blend tests, such as: * GL45-CTS.blend_equation_advanced.extension_directive_enable * GL45-CTS.blend_equation_advanced.extension_directive_warn * GL45-CTS.blend_equation_advanced.blend_all.GL_MULTIPLY_KHR_all_qualifier * GL45-CTS.blend_equation_advanced.blend_specific.GL_COLORBURN_KHR v2: * Directly save the BlendSupport field to avoid potentially including a pointer in the future if the structure is updated. (tarceri) Cc: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Jordan Justen	7f5204a0db	i965: Initialize sha1 hash of dri config options Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Jordan Justen	478a73fdfa	i965: Don't link when the program was found in the disk cache Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Jordan Justen	c3a8ae105c	i965: add cache fallback support using serialized nir If the i965 gen program cannot be loaded from the cache, then we fall back to using a serialized nir program. This is based on "i965: add cache fallback support" by Timothy Arceri <timothy.arceri@collabora.com>. Tim's version was written to fall back to compiling from source, and therefore had to be much more complex. After Connor and Jason implemented nir serialization, I was able to rewrite and greatly simplify this patch. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Timothy Arceri	a4078b819f	i965: add support for cached shaders with xfb qualifiers For now this disables the shader cache when transform feedback is enabled via the GL API as we don't currently allow for it when generating the sha for the shader. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Timothy Arceri	15f39e8654	mesa/glsl: add api_enabled flag to gl_transform_feedback_info This will be used to disable the shader cache when xfb is enabled via the api as we don't currently allow for it when generating the sha for the shader. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Jordan Justen	8a019f5601	i965: Add shader cache support for compute v2: * Use MAYBE_UNUSED. (Matt) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Timothy Arceri	42383faf51	i965: add shader cache support for tess stages v2: * Use MAYBE_UNUSED. (Matt) [jordan.l.justen@intel.com: _cached_program => brw_disk_cache__program] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Timothy Arceri	5a4afd822f	i965: add shader cache support for geometry shaders v2: * Use MAYBE_UNUSED. (Matt) [jordan.l.justen@intel.com: _cached_program => brw_disk_cache__program] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Timothy Arceri	2589e7ddaf	i965: Add shader cache support for vertex and fragment stages This enables the cache on vertex and fragment shaders only. v2: * Use MAYBE_UNUSED. (Matt) [jordan.l.justen@intel.com: reword subject] [jordan.l.justen@intel.com: _cached_program => brw_disk_cache__program] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Timothy Arceri	516d50db31	i965: add initial implementation of on disk shader cache This uses the Mesa disk_cache support to write out the final linked binary for vertex and fragment shader programs. This is based off the initial implementation done by Carl Worth. It has been significantly reworked, first by Tim Arceri, and then by Jordan Justen. v2: * Squash 'i965: add image param shader cache support' * Squash 'i965: add shader cache support for pull param pointers' * Substantially simplified by a rework on top of Jason's `2975e4c56a`. * Rename load_program_data to read_program_data. (Jason) v3: * Simplify and align program read/write. (Jason) v4: * Don't save prog_data size since we know it from the stage. (Ken) * Don't save program size, since prog_data includes the size. (Ken) * Remove `assert` that potentially could be triggered by disk corruption of the cache entries. (Ken) * Fix compute shader scratch allocation. (Ken) * Remove special case mapping for non-LLC. (Ken) * Remove SET_UPLOAD_PARAMS macro [jordan.l.justen@intel.com: _cached_program => brw_disk_cache__program] [jordan.l.justen@intel.com: brw_shader_cache.c => brw_disk_cache.c] [jordan.l.justen@intel.com: don't map to write program when LLC is present] [jordan.l.justen@intel.com: set program_written_to_cache on read from cache] [jordan.l.justen@intel.com: only try cache when status is linking_skipped] [jordan.l.justen@intel.com: all v2-v4 changes noted above] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Jordan Justen	f9d5a7add4	i965: Calculate thread_count in brw_alloc_stage_scratch Previously, thread_count was sent in from the stage after some stage specific calculations. Those stage specific calculations were moved into brw_alloc_stage_scratch, which will allow the shader cache to also use the same calculations. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Jordan Justen	f082d7f64f	intel/compiler: Add functions to get prog_data and prog_key sizes for a stage v2: * Return unsigned instead of size_t. (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Jordan Justen	05b1193361	intel/compiler: Add union types for prog_data and prog_key stages Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Jordan Justen	4c7a1ec62a	blob: Don't set overrun if reading 0 bytes at end of data Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Jordan Justen	3dcbc5cdaa	intel/compiler: Remove final_program_size from brw_compile_* The caller can now use brw_stage_prog_data::program_size which is set by the brw_compile_* functions. Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Carl Worth	540636045f	intel/compiler: add new field for storing program size This will be used by the on disk shader cache. v2: * Set in brw_compile_* rather than brw_codegen_*. (Jason) Signed-off-by: Timothy Arceri <timothy.arceri@collabora.com> [jordan.l.justen@intel.com: Only add to brw_stage_prog_data] Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Jordan Justen	1edf0fe612	i965: Don't rely on nir for uses_texture_gather When a program is restored from the shader cache, prog->nir will be NULL, but prog->info will be restored. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-31 23:36:54 -07:00
Jordan Justen	0610a624a1	i965/link: Serialize program to nir after linking for shader cache If the shader cache is enabled, after linking the program, we serialize the program to nir. This will be saved out by the glsl shader cache support. Later, if the same program is found in the cache, we can use the nir for a fallback in the unlikely case that the gen binary program is not found in the cache. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Jordan Justen	6b815e405d	glsl/shader_cache: Save and restore serialized nir in gl_program v3: * Rename serialized_nir* to driver_cache_blob*. (Tim) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:54 -07:00
Jordan Justen	571bee96d5	main: Add driver cache blob fields to gl_program These fields can be used to optionally save off a driver blob with the program metadata. For example, serialized nir, or tgsi. v3: * Rename serialized_nir* to driver_cache_blob. (Tim) Free memory. (Jason) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:53 -07:00
Jason Ekstrand	54f691311c	nir: Add hooks for testing serialization Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-31 23:36:53 -07:00
Connor Abbott	120da00975	nir: add serialization and deserialization v2 (Jason Ekstrand): - Various whitespace cleanups - Add helpers for reading/writing objects - Rework derefs - [de]serialize nir_shader::num_* - Fix uses of blob_reserve_bytes - Use a bitfield struct for packing tex_instr data v3: - Zero nir_variable struct on deserialization. (Jordan) - Allow nir_serialize.h to be included in C++. (Jordan) - Handle NULL info.name. (Jason) - Set info.name to NULL when name is NULL. (Jordan) Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 23:36:53 -07:00
Dave Airlie	57892a23be	mesa/st: implement max combined output resources limiting. if the driver sets the cap, then use the value it gives us. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-01 10:07:07 +10:00
Dave Airlie	d3fdd66401	gallium: add cap for driver specified max combined shader resources. Some hw (evergreen) has a limit on how many combined (images/buffers/mrts) a fragment shader can access. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-01 10:07:03 +10:00
Gert Wollny	69eee511c6	r600/sb: bail out if prepare_alu_group() doesn't find a proper scheduling It is possible that the optimizer ends up in an infinite loop in post_scheduler::schedule_alu(), because post_scheduler::prepare_alu_group() does not find a proper scheduling. This can be deduced from pending.count() being larger than zero and not getting smaller. This patch works around this problem by signalling this failure so that the optimizer bails out and the un-optimized shader is used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103142 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-01 09:33:40 +10:00
Timothy Arceri	e80bbd6f52	radeonsi: fix culldist_writemask in nir path The shared si_create_shader_selector() code already offsets the mask. Fixes the following piglit tests: arb_cull_distance/clip-cull-3.shader_test arb_cull_distance/clip-cull-4.shader_test Fixes: `29d7bdd179` (radeonsi: scan NIR shaders to obtain required info) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-01 09:41:11 +11:00
Neil Roberts	b697ece10a	nir/opt_intrinsics: Fix values for gl_SubGroupG{e,t}MaskARB Previously the values were calculated by just shifting ~0 by the invocation ID. This would end up including bits that are higher than gl_SubGroupSizeARB. The corresponding CTS test effectively requires that these high bits be zero so it was failing. There is a Piglit test as well but this appears to be checking the wrong values so it passes. For the two greater-than bitmasks, this patch adds an extra mask with (~0>>(64-gl_SubGroupSizeARB)) to force these bits to zero. Fixes: KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102680#c3 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Neil Roberts <nroberts@igalia.com>	2017-10-31 23:28:00 +01:00
Nanley Chery	9e849eb8bb	i965: Check CCS_E compatibility for texture view rendering Only use CCS_E to render to a texture that is CCS_E-compatible with the original texture's miptree (linear) format. This prevents render operations from writing data that can't be decoded with the original miptree format. On Gen10, with the new CCS_E-enabled formats handled, this enables the driver to pass the arb_texture_view-rendering-formats piglit test. v2. Add a TODO for texturing. (Jason) Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 14:26:23 -07:00
Nanley Chery	c7baaafe54	intel/isl: Disable some gen10 CCS_E formats for now CannonLake additionally supports R11G11B10_FLOAT and four 10-10-10-2 formats with CCS_E. None of these formats fit within the current blorp_copy framework so disable them until support is added. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-31 14:26:23 -07:00
Eric Engestrom	3ba973fe37	meson: pass correct args to gles2 ABI test Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 18:00:11 +00:00
Eric Engestrom	8a3022ffa0	meson: pass correct args to gles1 ABI test Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 18:00:06 +00:00
Eric Engestrom	5eb7bd0e77	meson: pass correct args to gbm symbol test Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 18:00:00 +00:00
Eric Engestrom	be301ab724	meson: pass correct args to wayland-egl symbol test Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:54 +00:00
Eric Engestrom	64f17440b6	automake+meson: don't run egl symbol check on libglvnd lib We might want to add a symbol check for the glvnd variant though. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:49 +00:00
Eric Engestrom	1946de2b71	meson: pass correct env/args to egl tests Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:44 +00:00
Eric Engestrom	ddb3a695a8	gles2: fail symbol check if lib is missing Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:39 +00:00
Eric Engestrom	4e7612c54d	gles1: fail symbol check if lib is missing Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:34 +00:00
Eric Engestrom	d830351bfa	gbm: fail symbol check if lib is missing Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:29 +00:00
Eric Engestrom	4e43ba5687	wayland-egl: fail symbol check if lib is missing Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:24 +00:00
Eric Engestrom	0529a63384	egl: fail symbol check if lib is missing Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-31 17:59:19 +00:00
Dylan Baker	1c59119368	meson: set visibility flags on gbm This is done in autotools, and is an oversight in the meson build. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-31 10:41:19 -07:00
Dylan Baker	c16486f5db	meson: Don't link gbm with threads It's supposed to be linked with pthread-stubs (if the platform needs pthread-stubs). Pthread stubs support isn't (yet) implemented in the meson build, so add a TODO. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-31 10:41:19 -07:00
Dylan Baker	0589331d54	meson: Use true and false instead of yes and no for tristate options This allows a user to not care whether they're setting a tristate or a boolean option, which is a nice user facing feature, and something I've personally run into. Suggested-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-31 10:37:17 -07:00
Andrey Grodzovsky	f03b7c9ad9	winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-10-31 16:55:24 +01:00
Erik Faye-Lund	5c2ff5773a	meson: do not search for needless deps If we don't want to use these deps, there's no good reason to search for them in the first place. This should shave a bit of time for the initial build. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-31 15:44:13 +01:00
Samuel Pitoiset	5010436e09	radv: bail out when binding the same vertex buffers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-31 10:16:38 +01:00
Samuel Pitoiset	11fdc2cd34	radv: bail out when binding the same index buffer DOW3 appears to hit this path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-31 10:16:35 +01:00
Erik Faye-Lund	cf41c19d9f	meson: use dep_m in libgallium u_format_other.c uses sqrtf, which on some systems requires a math library. So let's make sure we link with it. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-31 08:10:37 +01:00
Timothy Arceri	e92405c55a	radv: use correct alloc function when loading from disk Fixes regression in: dEQP-VK.api.object_management.alloc_callback_fail.graphics_pipeline Fixes: `1e84e53712` "radv: add cache items to in memory cache when reading from disk" Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-31 14:51:55 +11:00
Plamena Manolova	048d4c45c9	i965: Fix ARB_indirect_parameters logic. This patch modifies the ARB_indirect_parameters logic in brw_draw_prims, so that our implementation isn't affected if another application attempts to use predicates. Previously we were using a predicate with a DELTAS_EQUAL comparison operation and relying on the MI_PREDICATE_DATA register being 0. Our code to initialize MI_PREDICATE_DATA to 0 was incorrect, so we were accidentally using whatever value was written there. Because the kernel does not initialize the MI_PREDICATE_DATA register on hardware context creation, we might inherit the value from whatever context was last running on the GPU (likely another process). The Haswell command parser also does not currently allow us to write the MI_PREDICATE_DATA register. Rather than fixing this and requiring an updated kernel, we switch to a different approach which uses a SRCS_EQUAL predicate that makes no assumptions about the states of any of the predicate registers. Fixes Piglit's spec/arb_indirect_parameters/tf-count-arrays test. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103085 Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-30 20:40:05 -07:00
Kenneth Graunke	877dd14e88	i965: Don't flag BRW_NEW_SURFACES unless some push constants are dirty. Due to a gaffe on my part, we were re-emitting all binding table entries on every single draw call. The push_constant_packets atom listens to BRW_NEW_DRAW_CALL, but skips emitting 3DSTATE_CONSTANT_XS for each stage unless stage_state->push_constants_dirty is true. However, it flagged BRW_NEW_SURFACES unconditionally at the end, by mistake. Instead, it should only flag it if we actually emit 3DSTATE_CONSTANT_XS for a stage. We can move it a few lines up, inside the loop - the early continues will skip over it if push constants aren't dirty for a stage. With INTEL_NO_HW=1 set, improves performance of GFXBench5 gl_driver_2 on Apollolake at 1280x720 by 1.01122% +/- 0.470723% (n=35). Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2017-10-30 20:38:08 -07:00
Kenneth Graunke	28fcf5cd94	intel/genxml: Fix decoding of groups with fields smaller than a DWord. Groups containing fields smaller than a DWord were not being decoded correctly. For example: <group count="32" start="32" size="4"> <field name="Vertex Element Enables" start="0" end="3" type="uint"/> </group> gen_field_iterator_next would properly walk over each element of the array, incrementing group_iter, and calling iter_group_offset_bits() to advance to the proper DWord. However, the code to print the actual values only considered iter->field->start/end, which are 0 and 3 in the above example. So it would always fetch bits 3:0 of the current DWord when printing values, instead of advancing to each element of the array, printing bits 0-3, 4-7, 8-11, and so on. To fix this, we add new iter->start/end tracking, which properly advances for each instance of a group's field. Caught by Matt Turner while working on 3DSTATE_VF_COMPONENT_PACKING, with a patch to convert it to use an array of bitfields (the example above). This also fixes the decoding of 3DSTATE_SBE's "Attribute Active Component Format" fields. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-30 20:22:55 -07:00
Ian Romanick	53c7b8bdca	glsl: Fix bad formatting in a comment Trivial Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2017-10-30 20:08:25 -07:00
Eric Anholt	2a77c763fe	broadcom/vc5: Force blending to treat alpha as 1 for formats without alpha. Fixes fbo-blending-formats on RGB8 and 565. We will still need to demote blending to shader code in the MRT case to fix it in general, but that can be added when we start doing 32F blending (which also needs to be done in the shader).	2017-10-30 13:31:32 -07:00
Eric Anholt	61bb0df60e	broadcom/vc5: Do BGRA vs RGBA swapping for the BLEND_CONSTANT_COLOR. Fixes many of the fbo-blending-formats tests.	2017-10-30 13:31:32 -07:00
Eric Anholt	2e3c7beb1e	broadcom/vc5: Pack clear colors according to the TLB internal format/type. The previous packing I did got us all the R16F and R32F formats, where the pipe format basically matched the TLB's format, but since the clear color will just be memcpyed to the TLB, we should be looking at its format for deciding how to pack. Fixes RGB565, RGB5_A1 and RGBA10 fbo-clear-formats tests and improves 4444.	2017-10-30 13:31:32 -07:00
Eric Anholt	828299d1bd	broadcom/vc5: Don't do r/b channel swapping on 565. The HW's format actually matches the gallium format.	2017-10-30 13:31:32 -07:00
Eric Anholt	9e5df1897c	broadcom/vc5: Use the proper gallium format for our RGB10_A2. This keeps us from needing our own reswizzling of the B vs R fields.	2017-10-30 13:31:31 -07:00
Eric Anholt	6d1809a6d6	broadcom/vc5: Add some comments about the texture/output format ordering. The output formats are consistent with their channels appearing from low to high in their name. Textures are interpreted the same way, but their names may have the channels swapped around. I'm retaining the texture names so that we are consistent with the documentation, but I want to leave a warning for others.	2017-10-30 13:31:28 -07:00
Eric Anholt	2d6088f2a3	broadcom/vc5: Drop duplicated setup of clip_window_height_in_pixels.	2017-10-30 13:31:28 -07:00
Eric Anholt	1b32786de6	broadcom/vc5: Don't forget to actually turn on stencil testing. I had the rest of stencil state set up, but forgot to actually enable it in the higher level configuration bits packet.	2017-10-30 13:31:28 -07:00
Eric Anholt	4d2619a6b3	broadcom/vc5: Stop lowering negates to subs. In the case of fneg(0.0), we were getting back 0.0 instead of -0.0. We were also needing an immediate 0 value for ineg, when there's an opcode to do the job properly. Fixes fs-floatBitsToInt-neg.shader_test.	2017-10-30 13:31:28 -07:00
Eric Anholt	a797f0eb63	broadcom/vc5: Set up MSAA texture type according to the internal format. It gets most of EXT_framebuffer_multisample-formats passing, but doesn't really work for texture views.	2017-10-30 13:31:28 -07:00
Eric Anholt	fe6fc579cb	broadcom/vc5: Use the sampler view's format, not the resource's. This should help with texture views, though I just noticed this while reading the code.	2017-10-30 13:31:27 -07:00
Eric Anholt	0ec4b4178f	broadcom/vc5: Emit raw loads for MSAA buffers. Similar to stores, but we also need to emit dummy stores in between each load, to flush out the previous queued load.	2017-10-30 13:31:27 -07:00
Eric Anholt	464f1fb733	broadcom/vc5: Use raw stores for MSAA buffers. We were storing the resolved pixels in all cases, but nr_samples > 0 means we should be keeping the per-sample values. We will probably want to change the job structure at some point, as we'll want to recognize full-buffer resolves and do the resolved store in the same job as the original rendering, meaning we'll need to track both the MSAA and single-sample resources in the job. However, this will be enough to build the rest of the MSAA support.	2017-10-30 13:31:27 -07:00
Eric Anholt	e717e3e7cd	broadcom/vc5: Add lowering for txf_ms to a txf on a 2x2-scaled texture. The HW has no native sampler support for multisample textures, but since we only need to support txf_ms and the layout is UIF, we just need to scale up the texcoords and then add in the sample. This drops the old TEXTURE_MSAA_ADDR special uniform, since we're treating MSAA textures as textures, rather than basically texbos like VC4 had to.	2017-10-30 13:31:27 -07:00
Eric Anholt	b1a8b3979c	broadcom/vc5: Lay out MSAA textures/renderbuffers as UIF scaled by 4. We just need to multiply width/height by 2 each, and always set them up as UIF tiling, since that's how the TLB will store them in raw (per-sample) mode.	2017-10-30 13:31:27 -07:00
Eric Anholt	1d8105a167	broadcom/vc5: Keep output height pad out of the store TLB general address. The equivalent load already had the pad separated out.	2017-10-30 13:31:24 -07:00
Eric Anholt	99c69027e4	broadcom/vc5: Drop padding bits from the texture shader state's address.	2017-10-30 13:31:22 -07:00
Eric Anholt	cf3759a9a4	broadcom/vc5: Drop alignment bits from texture P1's address.	2017-10-30 13:31:19 -07:00
Eric Anholt	607031f411	broadcom/vc5: Drop alignment bits from Z/S rendering mode config address. Improves CLIF dumping output.	2017-10-30 13:31:16 -07:00
Eric Anholt	d0f7053369	broadcom/xml: Fix address packing for address with >= 8 alignment bits. We were handling the intra-byte padding fine, but with a 24-bit address (bottom 8 bits implied 0) we would end up off by 8 bytes in our shift, impacting vc5's load/store general packets (all other packets we have had <8 bits of padding).	2017-10-30 13:31:16 -07:00
Eric Anholt	40280b0abe	broadcom/clif: Print out the contents of the generic tile list. This is the real meat of the RCL, so let's get it printed again.	2017-10-30 13:31:16 -07:00
Eric Anholt	10fa685b53	broadcom/clif: Move the CL printing part of CL dumps to a helper. This will let me reuse the printing for processing branches to other CLs.	2017-10-30 13:31:16 -07:00
Eric Anholt	125f2a751e	broadcom/vc5: Lower unpack_*_4x8 to normal math. We only have 2x16 unpacking in our ALUs. To enable this, we also need lower_fdiv for its new instructions, which had been handled at a higher level previously.	2017-10-30 13:31:16 -07:00
Eric Anholt	eecdbaa985	broadcom/vc5: Add PIPE_TEX_WRAP_CLAMP support for linear-filtered textures. I already had the texture's wrapping set up to use different behavior for nearest or linear, so we just needed to saturate the coordinates in linear mode to get the "proper" blend between the edge and border values.	2017-10-30 13:31:16 -07:00
Eric Anholt	e798455330	broadcom/vc5: Disable GL_ARB_transform_feedback3. We don't seem to have a way to generally handle gl_SkipComponents.	2017-10-30 13:31:15 -07:00
Eric Anholt	e2d9ed4f39	broadcom/vc5: Fix gl_FragCoord pixel center setup. Fixes glsl-arb-fragment-coord-conventions.	2017-10-30 13:31:15 -07:00
Eric Anholt	bacbcafec1	broadcom/vc5: Always set up 1D textures as raster order. 1D is the exception to "all V3D textures are tiled", since tiling 1D textures would just waste memory and cache space. This ended up being a problem once we started actually marking 1D textures as 1D instead of 2D.	2017-10-30 13:31:15 -07:00
Eric Anholt	443e1984d2	broadcom/xml: Throw an #error in XML-based codegen for a >1bit bool I've debugged two nasty errors now due to copy-and-pasting a bool type when writing a uint field. Make sure I don't do that again.	2017-10-30 13:31:12 -07:00
Eric Anholt	e2f114b32b	broadcom/vc4: Fix bool marking on Rasterizer Oversample Mode. We don't set this field using the XML codegen, but this would help us decode the right value in case of 16x (VG) oversampling.	2017-10-30 13:27:03 -07:00
Eric Anholt	45e70bdc8c	broadcom/vc5: Mark lookup type as uint, not bool. Fixes non-2D texturing.	2017-10-30 13:27:03 -07:00
Eric Anholt	77c7b98ba5	broadcom/vc5: Fix GPU hang with no vertex elements used by the VS. Like VC4, we need to at least have one element set up, but unlike VC4 it seems we don't need to read it to keep the HW happy. Fixes GPU hangs with glsl-no-vertex-attribs.shader_test.	2017-10-30 13:25:45 -07:00
Eric Engestrom	2117d03310	git_sha1_gen: create empty file in fallback path I missed this part in my conversion, the old stream redirection meant the file was always created. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103496 Fixes: `7088622e5f` "buildsys: move file regeneration logic to the script itself" Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-30 17:21:58 +00:00
Lionel Landwerlin	a1faf48636	intel: common: silence compiler warning Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-30 17:15:50 +00:00
Eduardo Lima Mitev	f9de7f5596	glsl/linker: Check that re-declared, inter-shader built-in blocks match From GLSL 4.5 spec, section "7.1 Built-In Language Variables", page 130 of the PDF states: "If multiple shaders using members of a built-in block belonging to the same interface are linked together in the same program, they must all redeclare the built-in block in the same way, as described in section 4.3.9 “Interface Blocks” for interface-block matching, or a link-time error will result." Fixes: * GL45-CTS.CommonBugs.CommonBug_PerVertexValidation v2 (Neil Roberts): Explicitly look for gl_PerVertex in the symbol tables instead of waiting to find a variable in the interface. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102677 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com>	2017-10-30 18:10:39 +01:00
Eduardo Lima Mitev	f5fe99ac85	glsl: Use the utility function to copy symbols between symbol tables This effectively factorizes a couple of similar routines. v2 (Neil Roberts): Non-trivial rebase on master Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com>	2017-10-30 18:10:39 +01:00
Eduardo Lima Mitev	4c62a270a9	glsl_parser_extra: Add utility to copy symbols between symbol tables Some symbols gathered in the symbols table during parsing are needed later for the compile and link stages, so they are moved along the process. Currently, only functions and non-temporary variables are copied between symbol tables. However, the built-in gl_PerVertex interface blocks are also needed during the linking stage (the last step), to match re-declared blocks of inter-stage shaders. This patch adds a new utility function that will factorize current code that copies functions and variables between two symbol tables, and in addition will copy explicitly declared gl_PerVertex blocks too. The function will be used in a subsequent patch. v2 (Neil Roberts): Allow the src symbol table to be NULL and explicitly copy the gl_PerVertex symbols in case they are not referenced in the exec_list. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Signed-off-by: Neil Roberts <nroberts@igalia.com>	2017-10-30 18:10:39 +01:00
Eric Engestrom	ceaad79f85	i965: remove unused variable Fixes: `2c873060d3` "i965: Delete unused brw_vs_prog_data::nr_attributes field." Cc: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2017-10-30 16:32:05 +00:00
Eric Engestrom	c5ec155685	meson: wire up egl/android Cc: Rob Herring <robh@kernel.org> Cc: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-30 16:32:05 +00:00
Ian Romanick	6403efbe74	glsl: Remove ir_binop_greater and ir_binop_lequal expressions NIR does not have these instructions. TGSI and Mesa IR both implement them using < and >=, respectively. Removing them deletes a bunch of code and means I don't have to add code to the SPIR-V generator for them. v2: Rebase on 2+ years of change... and fix a major bug added in the rebase. text data bss dec hex filename 8255291 268856 294072 8818219 868e2b 32-bit i965_dri.so before 8254235 268856 294072 8817163 868a0b 32-bit i965_dri.so after 7815339 345592 420592 8581523 82f193 64-bit i965_dri.so before 7813995 345560 420592 8580147 82ec33 64-bit i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-30 09:27:09 -07:00
Ian Romanick	34f7e761bc	glsl/parser: Track built-in types using the glsl_type directly Without the lexer changes, tests/glslparsertest/glsl2/tex_rect-02.frag fails. Before this change, the parser would determine that sampler2DRect is not a valid type because the call to state->symbols->get_type() in ast_type_specifier::glsl_type() would return NULL. Since ast_type_specifier::glsl_type() is now going to return the glsl_type pointer that it received from the lexer, it doesn't have an opportunity to generate an error. text data bss dec hex filename 8255243 268856 294072 8818171 868dfb 32-bit i965_dri.so before 8255291 268856 294072 8818219 868e2b 32-bit i965_dri.so after 7815195 345592 420592 8581379 82f103 64-bit i965_dri.so before 7815339 345592 420592 8581523 82f193 64-bit i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-30 09:27:09 -07:00
Ian Romanick	747c057530	glsl/parser: Return the glsl_type object from the lexer This allows us to use a single token for every built-in type except void. text data bss dec hex filename 8275163 269336 294072 8838571 86ddab 32-bit i965_dri.so before 8255243 268856 294072 8818171 868dfb 32-bit i965_dri.so after 7836963 346552 420592 8604107 8349cb 64-bit i965_dri.so before 7815195 345592 420592 8581379 82f103 64-bit i965_dri.so after Yes, the 64-bit binary shrinks by 21k. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-30 09:27:09 -07:00
Ian Romanick	4171900cf1	glsl/parser: Allocate identifier inside classify_identifier Passing YYSTYPE into classify_identifier enables a later patch. text data bss dec hex filename 8310339 269336 294072 8873747 876713 32-bit i965_dri.so before 8275163 269336 294072 8838571 86ddab 32-bit i965_dri.so after 7845579 346552 420592 8612723 836b73 64-bit i965_dri.so before 7836963 346552 420592 8604107 8349cb 64-bit i965_dri.so after Yes, the 64-bit binary shrinks by 8k. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-30 09:27:09 -07:00
Ian Romanick	792acfc44a	glsl/parser: Move anonymous struct name handling to the parser There are two callers of the constructor, and they are right next to each other. Move the "#anon_struct" name handling to the parser so that the conditional can be removed. I've also deleted part of the comment (about the memory leak) because I don't think it's quite accurate or relevant. text data bss dec hex filename 8310399 269336 294072 8873807 87674f 32-bit i965_dri.so before 8310339 269336 294072 8873747 876713 32-bit i965_dri.so after 7845611 346552 420592 8612755 836b93 64-bit i965_dri.so before 7845579 346552 420592 8612723 836b73 64-bit i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-30 09:27:09 -07:00
Ian Romanick	fc07ab165b	glsl/parser: Silence unused parameter warning glsl/glsl_parser_extras.cpp: In constructor ‘ast_struct_specifier::ast_struct_specifier(void*, const char*, ast_declarator_list*)’: glsl/glsl_parser_extras.cpp:1675:50: warning: unused parameter ‘lin_ctx’ [-Wunused-parameter] ast_struct_specifier::ast_struct_specifier(void *lin_ctx, const char *identifier, ^~~~~~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-30 09:27:09 -07:00
Ian Romanick	d70e8ef1c1	glsl: Silence unused parameter warnings glsl/standalone_scaffolding.cpp: In function ‘GLbitfield _mesa_program_state_flags(const gl_state_index*)’: glsl/standalone_scaffolding.cpp:103:66: warning: unused parameter ‘state’ [-Wunused-parameter] _mesa_program_state_flags(const gl_state_index state[STATE_LENGTH]) ^ glsl/standalone_scaffolding.cpp: In function ‘char* _mesa_program_state_string(const gl_state_index*)’: glsl/standalone_scaffolding.cpp:109:67: warning: unused parameter ‘state’ [-Wunused-parameter] _mesa_program_state_string(const gl_state_index state[STATE_LENGTH]) ^ glsl/standalone_scaffolding.cpp: In function ‘void _mesa_delete_shader(gl_context*, gl_shader*)’: glsl/standalone_scaffolding.cpp:115:40: warning: unused parameter ‘ctx’ [-Wunused-parameter] _mesa_delete_shader(struct gl_context *ctx, struct gl_shader *sh) ^~~ glsl/standalone_scaffolding.cpp: In function ‘void _mesa_delete_linked_shader(gl_context*, gl_linked_shader*)’: glsl/standalone_scaffolding.cpp:123:47: warning: unused parameter ‘ctx’ [-Wunused-parameter] _mesa_delete_linked_shader(struct gl_context *ctx, ^~~ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-30 09:27:09 -07:00
Mauro Rossi	7dae419aa7	Android: move drivers' symlinks to /vendor (v2) Having moved gallium_dri.so library to /vendor/lib/dri also symlinks need to be coherently created using TARGET_OUT_VENDOR instead of TARGET_OUT or all non Intel drivers will not be loaded with Android N and earlier, thus causing SurfaceFlinger SIGABRT (v2) simplification of post install command Fixes: `c3f75d483c` ("Android: move libraries to /vendor") Cc: 17.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1) Reviewed-by: Rob Herring <robh@kernel.org> (v1) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-30 15:41:31 +00:00
Emil Velikov	fc7816fd4e	Revert "foo" This reverts commit `27d5a7bce0`. I fat fingered it, failing to reset the checkout before applying the sequential commit.	2017-10-30 15:32:56 +00:00
Emil Velikov	6997d222f5	docs/release-calendar: update - 17.3.0-rc2 is out Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-30 15:24:10 +00:00
Emil Velikov	27d5a7bce0	foo Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2017-10-30 15:22:26 +00:00
Lionel Landwerlin	a8b1715b8a	util: hashtable: make hashing prototypes match It seems nobody's using the string hashing function. If you try to pass it directly to the hashtable creation function, you'll get compiler warning for non matching prototypes. Let's make them match. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-30 15:18:00 +00:00
Andres Gomez	c220202a73	docs: update calendar, add news item and link release notes for 17.2.4 Signed-off-by: Andres Gomez <agomez@igalia.com>	2017-10-30 16:58:51 +02:00
Andres Gomez	d100e966e8	docs: add sha256 checksums for 17.2.4 Signed-off-by: Andres Gomez <agomez@igalia.com>	2017-10-30 16:55:19 +02:00
Andres Gomez	bccb074014	docs: add release notes for 17.2.4 Signed-off-by: Andres Gomez <agomez@igalia.com>	2017-10-30 16:55:18 +02:00
Alex Smith	134a40d2a6	radv: Fix -Wformat-security issue Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103513 Fixes: `de88979413` ("radv: Implement VK_AMD_shader_info") Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-10-30 10:58:56 +01:00
Timothy Arceri	1e84e53712	radv: add cache items to in memory cache when reading from disk Otherwise we will leak them, load duplicates from disk rather than memory and never write items loaded from disk to the apps pipeline cache. Fixes: `fd24be134f` 'radv: make use of on-disk cache' Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-30 17:49:54 +11:00
Tapani Pälli	446c5726ec	i965: fix blorp stage_prog_data->param leak Patch uses mem_ctx for allocation to ensure param array gets freed later. ==6164== 48 bytes in 1 blocks are definitely lost in loss record 61 of 193 ==6164== at 0x4C2EB6B: malloc (vg_replace_malloc.c:299) ==6164== by 0x12E31C6C: ralloc_size (ralloc.c:121) ==6164== by 0x130189F1: fs_visitor::assign_constant_locations() (brw_fs.cpp:2095) ==6164== by 0x13022D32: fs_visitor::optimize() (brw_fs.cpp:5715) ==6164== by 0x13024D5A: fs_visitor::run_fs(bool, bool) (brw_fs.cpp:6229) ==6164== by 0x1302549A: brw_compile_fs (brw_fs.cpp:6570) ==6164== by 0x130C4B07: blorp_compile_fs (blorp.c:194) ==6164== by 0x130D384B: blorp_params_get_clear_kernel (blorp_clear.c:79) ==6164== by 0x130D3C56: blorp_fast_clear (blorp_clear.c:332) ==6164== by 0x12EFA439: do_single_blorp_clear (brw_blorp.c:1261) ==6164== by 0x12EFC4AF: brw_blorp_clear_color (brw_blorp.c:1326) ==6164== by 0x12EFF72B: brw_clear (brw_clear.c:297) Fixes: `8d90e28839` ("intel/compiler: Allocate pull_param in assign_constant_locations") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable@lists.freedesktop.org	2017-10-30 08:19:37 +02:00
Kevin Rogovin	d30b5f2f9b	i965: correctly assign SamplerCount of INTERFACE_DESCRIPTOR_DATA We were dividing by 4 twice. This also papered over a bug where we were neglecting to clamp the sampler count to the [0, 16] range. This should have no functional impact, it only affects prefetching. v2 [Kenneth Graunke]: - Clamp sampler_count to [0, 16] to avoid overflowing the valid values for this field. Write a commit message. Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-29 22:41:23 -07:00
Kenneth Graunke	992e2cf57f	i965: Only set key->high_quality_derivatives when it matters. This avoids recompiles for shaders that don't use explicit derivatives when ctx->Hint.FragmentShaderDerivative == GL_NICEST. For example, GFXBench 5 Aztec Ruins sets the GL_NICEST hint before compiling any shaders, but none of them use dFdx() or dFdy() - only implicit derivatives. This doesn't eliminate any recompiles, but does eliminate one of the reasons for doing so. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-29 20:54:16 -07:00
Kenneth Graunke	86c68bb886	nir: Make nir_gather_info collect a uses_fddx_fddy flag. i965 turns fddx/fddy into their coarse/fine variants based on the ctx->Hint.FragmentShaderDerivative setting. It needs to know whether this can impact a shader in order to better guess NOS settings. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-29 20:52:20 -07:00
Kenneth Graunke	f970e4d481	i965: Update brw_wm_debug_recompile() for newer key entries. Also, reorder them to match the structure's field order, to make it easier to check that they're all present. Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-29 20:52:14 -07:00
Kenneth Graunke	d1b392d060	i965: Delete brw_wm_prog_key::drawable_height. This has been unused since we switched to nir_lower_wpos_ytransform. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-29 20:52:02 -07:00
Alex Smith	de88979413	radv: Implement VK_AMD_shader_info This allows an app to query shader statistics and get a disassembly of a shader. RenderDoc git has support for it, so this allows you to view shader disassembly from a capture. When this extension is enabled on a device (or when tracing), we now disable pipeline caching, since we don't get the shader debug info when we retrieve cached shaders. v2: Improvements to resource usage reporting v3: Disassembly string must be null terminated (string_buffer's length does not include the terminator) v4: Fixed LDS reporting. (Bas) Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-29 00:28:45 +02:00
Christian Gmeiner	0a23841a98	etnaviv: add ext_texture_srgb support Following piglits are passing: - glean@texture_srgb - spec@ext_texture_srgb@fbo-srgb - spec@ext_texture_srgb@tex-srgb - spec@ext_texture_srgb@texwrap formats - spec@ext_texture_srgb@texwrap formats-s3tc Btw. this enables GL 2.1 :-) Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-10-28 21:20:42 +02:00
Topi Pohjolainen	97e01adfd5	intel/compiler/gen9: Pixel shader header only workaround Fixes intermittent GPU hangs on Broxton with an Intel internal test case. There are plenty of similar fragment shaders in piglit that do not use any varyings and any uniforms. According to the documentation special timing is needed between pipeline stages. Apparently we just don't hit that with piglit. Even with the failing test case one doesn't always get the hang. Moreover, according to the error states the hang happens significantly later than the execution of the problematic shader. There are multiple render cycles (primitive submissions) in between. I've also seen error states where the ACTHD points outside the batch. Almost as if the hardware writes somewhere that gets used later on. That would also explain why piglit doesn't suffer from this - most tests kick off one render cycle and any corruption is left unseen. v2 (Ken): Instead of enabling push constants, enable one of the inputs (PSIZ). v3 (Ken, Jason): Use LAYER instead making vulkan emit_3dstate_sbe() happy. Cc: "17.3 17.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-10-28 10:07:29 +03:00
Brian Paul	2b612431f5	scons: fix OSMesa driver build Fixes: `ea53d9a8eb` "glapi: include generated headers without path" Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-10-27 20:56:13 -06:00
Brian Paul	05592cebd4	scons: fix scons build to find generated glapitable.h Fixes: `ea53d9a8eb` "glapi: include generated headers without path" Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-10-27 16:26:26 -06:00
Brian Paul	1fe4c7b2af	gallium: s/unsigned/enum pipe_prim_type/ In the vbuf_render::set_primitive() functions. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-10-27 16:26:26 -06:00
Roland Scheidegger	3e4fd2d4b1	draw: don't cull tris with zero area Culling tris with zero area seems like a great idea, but apparently with fill mode line (and point) we're supposed to draw them, at least some tests for some other state tracker complained otherwise. Such tris also always seem to be back facing (not sure if this can be inferred from anything, since in a mathematical sense it cannot really be determined), so make sure to account for this when filling in the face information. (For solid tris, this is of course unnecessary, drivers will throw the tris away later in any case.) Reviewed-by: Brian Paul <brianp@vmware.com>	2017-10-27 22:37:19 +02:00
Dylan Baker	f7f12780c8	meson: Add a dependency on nir_opcodes_h for freedreno This fixes a race condition in the build. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2017-10-27 11:30:13 -07:00
Dylan Baker	f121a669c7	meson: build gallium based osmesa This has been tested with the osdemo from mesa-demos v2: - Add SELinux dependency - fix typo GALLIUM_LLVM -> GALLIUM_LLVMPIPE Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-27 11:06:45 -07:00
Dylan Baker	cbbd5bb889	meson: build classic osmesa This builds the classic (non-gallium) osmesa with meson. This has been tested with the osdemo application from mesa-demos. v2: - Remove unrelated change - Add SELinux dependency to osmesa Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-27 11:06:45 -07:00
Dylan Baker	7503ab687b	meson: Add generated files to non-shared glapi Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-27 11:06:07 -07:00
Dylan Baker	ea53d9a8eb	glapi: include generated headers without path This has been tested with make dist-check and with meson. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-27 11:06:07 -07:00
Dylan Baker	5daed06da2	osmesa: Include generated headers without path This makes things much easier to ensure correctness with meson. Tested with make dist-check and with meson. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-27 11:06:07 -07:00
Dylan Baker	06c6675560	meson: move gallium include declarations to src These are used by non-gallium osmesa, so they need to be defined outside of the gallium subdirectory. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-27 11:06:07 -07:00
Dylan Baker	63c360d7b2	meson: fix glprocs.h generator There was a typo that causes the generated file to be called gl_procs.h instead. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-27 11:06:07 -07:00
Dylan Baker	1f11ac4395	meson: rename all instances of xf86vm to xxf86vm Because consistency Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-27 11:06:07 -07:00
Dylan Baker	b92992e95c	meson: fix pkg-config Gl Require.Private xf86vm -> xxf86vm Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-27 11:06:07 -07:00
Kenneth Graunke	4f538c3f99	mesa: Accept GL_BACK in get_fb0_attachment with ARB_ES3_1_compatibility. According to the ARB_ES3_1_compatibility specification, glGetFramebufferAttachmentParameteriv is supposed to accept BACK, and it behaves exactly like BACK_LEFT. Fixes a GL error in GFXBench 5 Aztec Ruins. Cc: "17.3 17.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2017-10-27 10:19:07 -07:00
Brian Paul	7a718667f3	gallium/os: fix align_malloc() / os_malloc_aligned() comment mix-up os_free_aligned() is the counterpart to os_malloc_aligned(). Trivial.	2017-10-27 09:54:26 -06:00
Alejandro Piñeiro	fd011376cb	formatquery: use correct target check for IMAGE_FORMAT_COMPATIBILITY_TYPE From the spec: "IMAGE_FORMAT_COMPATIBILITY_TYPE: The matching criteria use for the resource when used as an image textures is returned in <params>. This is equivalent to calling GetTexParameter" So we would need to return None for any target not supported by GetTexParameter. By mistake, we were using the target check for GetTexLevelParameter. v2: fix typo (GetTextParameter vs GetTexParemeter) on comment (Ilia Mirkin) Reviewed-by: Antia Puentes <apuentes@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-27 15:04:03 +02:00
Eric Engestrom	05a94a4dfc	meson: bring MESA_GIT_SHA1 in line with other build systems Meson's vcs_tag() uses the output of `git describe`, eg. 17.3-branchpoint-5-gfbf29c3cd15ae831e249+ Whereas the other build systems used a script that outputs only the sha1 of the HEAD commit, eg. fbf29c3cd1 Given that this information is used by printing it next to the version number, there's some redundancy here, and inconsistency between build systems. Bring Meson in line by making it use the same script, with the added advantage of now supporting the MESA_GIT_SHA1_OVERRIDE env var. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-27 13:38:37 +01:00
Eric Engestrom	7088622e5f	buildsys: move file regeneration logic to the script itself Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-27 13:38:37 +01:00
Samuel Pitoiset	a41e2e9cf5	radv: allow to use a compute shader for resetting the query pool Serious Sam Fusion 2017 uses a huge number of occlusion queries, and the allocated query pool buffer is greater than 4096 bytes. This slightly improves performance (tested in Ultra) from 117.2 FPS to 119.7 FPS (~+2%) on my RX480. This also improves Talos, from 69 FPS to 72/73 FPS (~+5%). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-27 13:47:06 +02:00
Samuel Pitoiset	0d61109bb7	radv: make radv_fill_buffer() return the needed flush bits Only needed when the CS path is used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-27 13:47:03 +02:00
Eric Engestrom	4b9421d45d	meson: wire up selinux Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-27 11:57:03 +01:00
Eric Engestrom	866c8a94d4	wayland-egl: fix wayland cflags Fixes: `80bfff5c4f` "wayland-egl: adds CFLAGS for wayland.egl.h include" Suggested-by: Daniel Stone <daniel@fooishbar.org> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2017-10-27 11:57:03 +01:00
Eric Engestrom	5d44e35a8f	vc4: fix release build Mesa's DEBUG and assert's NDEBUG are not tied to each other, so we need to explicitly compile this code out. Fixes: `3df7892878` "vc4: Drop reloc_count tracking for debug asserts on non-debug builds." Cc: Eric Anholt <eric@anholt.net> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-27 11:57:03 +01:00
Tapani Pälli	0b131ca427	i965: unref push_const_bo in intelDestroyContext Valgrind shows that leak is caused by gen6_upload_push_constant, add unref push_const_bo per stage to destructor to fix this (like done for scratch_bo). ==10952== 144 bytes in 1 blocks are definitely lost in loss record 44 of 66 ==10952== at 0x4C30A1E: calloc (vg_replace_malloc.c:711) ==10952== by 0x8C02847: bo_alloc_internal.constprop.10 (brw_bufmgr.c:344) ==10952== by 0x8C425C4: intel_upload_space (intel_upload.c:101) ==10952== by 0x8C22ED0: gen6_upload_push_constants (gen6_constant_state.c:154) v2: remove if conditions, brw_bo_unreference handles NULL (Ken, Emil) Fixes: `24891d7c05` ("i965: Store per-stage push constant BO pointers.") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2017-10-27 13:49:13 +03:00
Tapani Pälli	eeb3515c3f	i965: remove if conditions from scratch_bo unref brw_bo_unreference handles NULL case Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-27 13:48:56 +03:00
Kenneth Graunke	70cd05d6ac	anv: Fix assert about source attrs. Asserting slot >= 2 made sense when the URB read offset was always 1 (pair of slots). Commit `566a0c43f0` made it possible to read from the VUE header in slot 0, by adjusting the offset to be 0. So, this assert is now bogus. Use the one from GL. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-27 03:01:13 -07:00
Kenneth Graunke	49d3c004f1	anv: Drop URB entry output read handling in 3DSTATE_XS. Commit `566a0c43f0` started setting the 3DSTATE_SBE bit to override these values with the one calculated there. So, they're dead. Stop setting them. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-27 03:01:13 -07:00
Kenneth Graunke	2c873060d3	i965: Delete unused brw_vs_prog_data::nr_attributes field. Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-10-27 02:53:38 -07:00
Samuel Pitoiset	dd79aa4ad3	radeonsi: update hack for HTILE corruption in ARK: Survival Evolved It appears that flushing the DB metadata is actually not sufficient since the driver uses the new VS blit shaders. This looks quite strange though, but it seems like we need to flush DB for fixing the corruption. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102955 Fixes: `69ccb9dae7` (radeonsi: use new VS blit shaders (VS inputs in SGPRs) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-27 10:47:30 +02:00
Dave Airlie	a639d40f13	radv: add support for local bos. (v3) This uses the new kernel interfaces for reduced cs overhead. We only set the local flag for memory allocations that don't have a dedicated allocation and ones that aren't imports. v2: add to all the internal buffer creation paths. v3: missed some command submission paths, handle 0/empty bo lists. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 23:59:28 +01:00
Jason Ekstrand	39c5c12f8f	i965/miptree: Take an isl_format in render_aux_usage Not all rendering matches the miptree format. We allow rendering to texture views so there are cases where it may not match. In those cases, our current scheme of just passing the value of ctx->sRGBEnabled isn't viable. Instead, just do what we do for texturing and pass the view format in directly. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: mesa-stable@lists.freedesktop.org	2017-10-26 15:24:38 -07:00
Jason Ekstrand	78e50185d6	i965/blorp: Use more temporary isl_format variables Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: mesa-stable@lists.freedesktop.org	2017-10-26 15:24:38 -07:00
Jason Ekstrand	94389943b6	i965/blorp: Use blorp_to_isl_format for src_isl_format in blit_miptrees Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: mesa-stable@lists.freedesktop.org	2017-10-26 15:24:38 -07:00
Jason Ekstrand	8ab9820d34	spirv: Claim support for the simple memory model It's rather surprising that we've never actually hit this before. Apparently, Ian's SPIR-V generator currently claims the Simple memory model when you don't do anything complex. We really shouldn't assert-fail on it. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: mesa-stable@lists.freedesktop.org	2017-10-26 15:24:38 -07:00
Rob Herring	90dd6e5bb9	Android: egl: add dependency on libnativewindow system/window.h is no longer available by default and is part of libnativewindow, so add it to the shared libraries. It has to be conditional because the library is only present in O and later. Really, we should only be depending on vndk/window.h now, but that's only in O and changing would be pretty invasive. Signed-off-by: Rob Herring <robh@kernel.org>	2017-10-26 16:06:53 -05:00
Dylan Baker	eb3bb03b34	meson: build nouveau vieux driver Build tested only. v2: - fix spelling error (veaux -> vieux) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-26 11:30:56 -07:00
Dylan Baker	eecd2860ff	meson: build r200 driver v2: - remove TODO that is done Build tested only Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-26 11:30:56 -07:00
Dylan Baker	191e785c81	meson: build r100 driver build tested only Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-26 11:30:56 -07:00
Dylan Baker	8da36268d4	install_megadrivers: print the full path with driver name Instead of just the path. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-26 11:30:56 -07:00
Kevin Rogovin	e640b3fe13	intel/tools/disasm: correctly observe FILE *out parameter Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-26 10:43:48 -07:00
Kevin Rogovin	75d10e4c84	intel/compiler: brw_validate_instructions to take const void* instead of void* The disassembler does not (and should not) be modifying the data. Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-26 10:43:48 -07:00
Eric Engestrom	109de3049d	loader: drop empty function alias While at it, drop the duplicate return. Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emli.velikov@collabora.com>	2017-10-26 16:25:33 +01:00
Marek Olšák	3f8e3c2bd8	radeonsi: add a workaround for weird s_buffer_load_dword behavior on SI See my LLVM patch which fixes the root cause. Users have to apply this patch and then they have 2 choices: - Downgrade to LLVM 5.0 - Update to LLVM git after my LLVM patch is pushed. It won't be possible to use current and earlier development version of LLVM 6.0. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: 17.3 <mesa-stable@lists.freedesktop.org>	2017-10-26 16:44:01 +02:00
Greg V	9d8c91eb91	util: use OpenBSD/NetBSD code on FreeBSD/DragonFly Obtained from: FreeBSD ports Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com> [Emil Velikov: wrap long line] Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-26 15:11:38 +01:00
Greg V	cece4ff6a3	winsys/svga/drm: add ERESTART define for *BSD Obtained from: FreeBSD ports Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-10-26 15:11:38 +01:00
Greg V	db8519a369	loader: use drmGetDeviceNameFromFd2 from libdrm Reduce code duplication and automatically benefit from OS-specific fixes to libdrm (e.g. in FreeBSD ports). API was introduced with 2.4.74 and we already require 2.4.75 globally. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103283 Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-26 15:11:35 +01:00
Daniel Stone	9f7ed60b3e	meson: wayland-egl depends on wayland-client Since wayland-egl.h is currently provided by the core Wayland package, depend on wayland-client to make sure we get it in our include path. Signed-off-by: Daniel Stone <daniels@collabora.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Fixes: `108d257a16` ("meson: build libEGL") Cc: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Gert Wollny <gw.fossdev@gmail.com>	2017-10-26 13:41:09 +01:00
Rob Clark	4f0f80776f	freedreno: implement pipe->invalidate_resource() Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-10-26 08:39:32 -04:00
Rob Clark	9fc7de827e	freedreno: GL_ARB_texture_barrier Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-10-26 08:39:32 -04:00
Rob Clark	a6bd23e43b	freedreno/a5xx: rename invalidate_resource() This is different from pipe->invalidate_resource().. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-10-26 08:39:32 -04:00
Rob Clark	4afcadbcc2	freedreno/a5xx: mem2gmem is read-only for BO This should be OUT_RELOC() since the operation isn't writing to the buffer. Technically it doesn't matter much currently, since we'd do a gmem2mem later anyway. But that will change. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-10-26 08:39:32 -04:00
Rob Clark	a4744c2ae7	freedreno: small rename Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-10-26 08:39:32 -04:00
Leo Liu	ea3dc75d72	radeon/video: add gfx9 offsets when rejoin the video surface For CPU access. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Christian König <christian.koenig@amd.com>	2017-10-26 08:30:21 -04:00
Samuel Pitoiset	06a12f250f	radv: only copy the dynamic states that changed When binding a new pipeline, we applied all dynamic states without checking if they really need to be re-emitted. This doesn't seem to be useful for the meta operations because only the viewports/scissors are updated. This should reduce the number of commands added to the IB when a new graphics pipeline is bound. Also, rename radv_dynamic_state_copy() to radv_bind_dynamic_state() and set the dirty flags directly there. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-26 09:37:05 +02:00
Samuel Pitoiset	b1e31c1911	radv: store the dynamic state mask into radv_dynamic_state Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-26 09:37:03 +02:00
Samuel Pitoiset	672cf692fb	radv: only emit the depth bounds test values when set dynamically The depth bounds test values are either set at pipeline creation or dynamically using vkCmdSetDepthBounds(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-26 09:37:00 +02:00
Iago Toral Quiroga	13652e7516	glsl/linker: Fix type checks for location aliasing From the OpenGL 4.6 spec, section 4.4.1 Input Layout Qualifiers, Page 68, (Location aliasing): "Further, when location aliasing, the aliases sharing the location must have the same underlying numerical type (floating-point or integer)." The current implementation is too strict, since it checks that the base types are an exact match instead. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	7276ccf8ed	glsl/linker: refactor check_location_aliasing Mostly, this merges the type checks with all the other checks so we only have a single loop for this. Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	e2abb75b0e	glsl/linker: validate explicit locations for SSO programs v2: - we only need to validate inputs to the first stage and outputs from the last stage, everything else has already been validated during cross_validate_outputs_to_inputs (Timothy). - Use MAX_VARYING instead of MAX_VARYINGS_INCL_PATCH (Ilia) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	bdaf058978	glsl/linker: generalize validate_explicit_variable_location for SSO For non-SSO programs, we only need to validate outputs, since the cross validation of outputs to inputs will ensure that we produce linker errors for invalid inputs too. However, for the SSO path there is no output to input validation, so we need to validate inputs explicitly. Generalize the function so it can handle this as well. Also, notice that vertex shader inputs and fragment shader outputs are already validated in assign_attribute_or_color_locations() for both SSO and non-SSO paths, so we should not try to validate that here again (in fact, the function would require explicit paths to handle these two cases properly). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	e7b7fe314e	glsl/linker: create a helper function to validate explicit locations Currently, we only validate explicit locations for non-SSO programs. This creates a helper that we can call from both SSO and non-SSO paths directly, so we can reuse all the logic behind this. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	ab40acb453	glsl/linker: outputs in the same location must share auxiliary storage From ARB_enhanced_layouts: "[...]when location aliasing, the aliases sharing the location must have the same underlying numerical type (floating-point or integer) and the same auxiliary storage and interpolation qualification.[...]" Add code to the linker to validate that aliased locations do have the same aux storage. Fixes: KHR-GL45.enhanced_layouts.varying_location_aliasing_with_mixed_auxiliary_storage Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	0b565f715d	glsl/linker: outputs in the same location must share interpolation From ARB_enhanced_layouts: "[...]when location aliasing, the aliases sharing the location must have the same underlying numerical type (floating-point or integer) and the same auxiliary storage and interpolation qualification.[...]" Add code to the linker to validate that aliased locations do have the same interpolation. Fixes: KHR-GL45.enhanced_layouts.varying_location_aliasing_with_mixed_interpolation Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	c4545676d7	glsl/linker: fix location aliasing checks for interface variables The existing code was checking the whole interface variable rather than its members, which is not what we want: we want to check aliasing for each member in the interface variable. Surprisingly, there are piglit tests that verify this and were passing due to a bug in the existing code: when we were computing the last component used by an interface variable we would use the 'vector' path and multiply by vector_elements, which is 0 for interface variables. This made the loop that checks for aliasing be a no-op and not add the interface variable to the list of outputs so then we would fail to link when we did not see a matching output for the same input in the next stage. Since the tests expect a linker error to happen, they would pass, but not for the right reason. Unfortunately, the current implementation uses ir_variable instances to keep track of explicit locations. Since we don't have ir_variables instances for individual interface members, we need to have a custom struct with the data we need. This struct has the ir_variable (which for interface members is the whole interface variable), plus the data that we need to validate for each aliased location, for now only the base type, which for interface members we will take from the appropriate field inside the interface variable. Later patches will expand this custom struct so we can also check other requirements for location aliasing, specifically that we have matching interpolation and auxiliary storage, that once again, we will take from the appropriate field members for the interface variables. 
v2: - Use MAX_VARYING instead of MAX_VARYINGS_INCL_PATCH (Ilia) Fixes: KHR-GL45.enhanced_layouts.varying_block_automatic_member_locations Fixes (these were passing before but for incorrect reasons): tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-location-overlap.shader_test tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-mixed-order-overlap.shader_test Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	6aa68772d4	glsl/linker: refactor link-time validation of output locations Move the checks for explicit locations to a separate function. We will use this in a follow-up patch to validate locations for interface variables where we need to validate each interface member rather than the interface variable itself. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-10-26 08:40:14 +02:00
Iago Toral Quiroga	b944617224	glsl/linker: report linker errors for invalid explicit locations on inputs We were assuming that if an input has an invalid explicit location it would fail to link because it would not find the corresponding output, however, since we look for the matching output by indexing the explicit_locations array with the input location, we still need to ensure that we don't index out of bounds. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-26 08:40:14 +02:00
Dave Airlie	16cfbef44c	ac/llvm: drop pointless wrappers around umsb/imsb Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 15:59:34 +10:00
Dave Airlie	82d47b9d38	ac/llvm: consolidate find lsb function. This was the same between si and ac. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 15:59:31 +10:00
Dave Airlie	de2b241111	ac/llvm: drop v4f32empty. (v2) This was unused. v2: drop args. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 15:59:22 +10:00
Dave Airlie	a76b6c2192	ac/llvm: add i1false/i1true to common code. These get used in a fair few places. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 15:59:18 +10:00
Dave Airlie	88b7ddbe65	ac/llvm: use the ac i32 0/1 and f32 0/1 llvm types. This just avoids having two copies of these. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 15:59:13 +10:00
Dave Airlie	f925f5b074	ac/nir: move lds declaration/load/store into shared code. This was duplicated between both drivers, share here. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 15:59:11 +10:00
Dave Airlie	74fc9e9186	st/mesa: enable nir path for all shaders. There is no reason to block this here, if a driver enables it, let it handle it. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 00:55:59 +01:00
Dave Airlie	3ee2e98aff	st/program: add support for gs/tes/tcs nir shaders. This probably needs more work but this just adds the initial code to convert gs/tcs/tes nir based shaders in the state tracker. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 00:55:56 +01:00
Dave Airlie	3c34d11589	st/program: rework basic variant interface This just passes st_common_program and uses it. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 00:55:52 +01:00
Jason Ekstrand	3720d913dd	anv/entrypoints: Dump useful data if mako throws an exception Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-25 16:14:09 -07:00
Jason Ekstrand	e0519294c7	nir/opt_intrinsics: Rework progress This commit fixes two issues: First, we were returning false regardless of whether or not the function made progress. Second, we were calling nir_metadata_preserve far more often than needed; we only need to call it once per impl. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-25 16:14:09 -07:00
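The two fixes described in the commit above (report progress truthfully, update metadata once per impl rather than once per rewritten instruction) follow a common pass shape. The sketch below illustrates that shape only: `struct instr`, `struct impl`, `rewrite_instr`, and the `metadata_preserve_calls` counter are hypothetical stand-ins, not NIR types or functions, and updating metadata only when progress was made is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a function implementation and its instructions. */
struct instr { int op; };
struct impl {
    struct instr *instrs;
    size_t count;
    int metadata_preserve_calls; /* counts metadata updates, for illustration */
};

/* Rewrites op 1 into op 2 and reports whether it changed anything. */
static bool rewrite_instr(struct instr *in)
{
    if (in->op != 1)
        return false;
    in->op = 2;
    return true;
}

/* Accumulate progress across all instructions, touch metadata once per
 * impl, and return the accumulated progress flag to the caller. */
static bool opt_impl(struct impl *impl)
{
    bool progress = false;

    for (size_t i = 0; i < impl->count; i++)
        progress |= rewrite_instr(&impl->instrs[i]);

    if (progress)
        impl->metadata_preserve_calls++; /* once per impl, not per instruction */

    return progress;
}
```

A caller that loops "until no pass makes progress" depends on the return value being accurate, which is why returning false unconditionally is a real bug rather than a cosmetic one.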
Jason Ekstrand	d24311b7b5	intel/compiler: Call nir_lower_system_values in brw_preprocess_nir Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-25 16:14:09 -07:00
Jason Ekstrand	ece206b848	i965/program: Move nir_lower_system_values higher up We want this to get called before nir_lower_subgroups which is going in brw_preprocess_nir. Now that nir_lower_wpos_ytransform can handle system values, this should be safe to do. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-25 16:14:09 -07:00
Jason Ekstrand	2cfa3ef438	nir/lower_wpos_ytransform: Support system value intrinsics Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-25 16:14:09 -07:00
Jason Ekstrand	279f8fb69c	anv/pipeline: Call nir_lower_system_values after brw_preprocess_nir We currently have a bug where nir_lower_system_values gets called before nir_lower_var_copies so it will miss any system value uses which come from a copy_var intrinsic. Moving it to after brw_preprocess_nir fixes this problem. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable@lists.freedesktop.org	2017-10-25 16:14:09 -07:00
Jason Ekstrand	afa0ddb81e	anv/pipeline: Drop nir_lower_clip_cull_distance_arrays We already handle it in brw_preprocess_nir Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-25 16:14:09 -07:00
Jason Ekstrand	e758b6519d	anv/pipeline: Dump shader immediately after spirv_to_nir Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-25 16:14:09 -07:00
Jason Ekstrand	562b8d458c	intel/eu: Use EXECUTE_1 for JMPI The PRM says "The execution size must be 1." In `73137997e2`, the execution size was set to 1 when it should have been BRW_EXECUTE_1 (which maps to 0). Later, in `dc2d3a7f5c`, JMPI was used for line AA on gen6 and earlier and we started manually stomping the execution size to BRW_EXECUTE_1 in the generator. This commit fixes the original bug and makes brw_JMPI just do the right thing. Reviewed-by: Matt Turner <mattst88@gmail.com> Fixes: `73137997e2`	2017-10-25 16:14:09 -07:00
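The bug class in the commit above is passing a literal channel count where an encoded field value is expected. The sketch below assumes the field stores log2 of the channel count, which is consistent with the message's statement that BRW_EXECUTE_1 maps to 0. The enum and `channels_from_field` are hypothetical names for illustration, not the driver's definitions.

```c
#include <assert.h>

/* Hypothetical encoding: the field holds log2(number of channels). */
enum exec_size {
    EXECUTE_1  = 0,
    EXECUTE_2  = 1,
    EXECUTE_4  = 2,
    EXECUTE_8  = 3,
    EXECUTE_16 = 4
};

/* Decode an execution-size field back into a channel count. */
static unsigned channels_from_field(unsigned field)
{
    assert(field <= EXECUTE_16);
    return 1u << field;
}
```

Under this encoding, writing the literal `1` into the field selects two channels, not one, which is the mismatch with the "execution size must be 1" requirement.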
Alejandro Piñeiro	4723933b8e	i965/fs: Add brw_reg_type_from_bit_size utility method Returns the brw_type for a given ssa.bit_size, and a reference type. So if bit_size is 64, and the reference type is BRW_REGISTER_TYPE_F, it returns BRW_REGISTER_TYPE_DF. The same applies if bit_size is 32 and reference type is BRW_REGISTER_TYPE_HF it returns BRW_REGISTER_TYPE_F v2 (Jason Ekstrand): - Use better unreachable() messages - Add Q types Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-25 16:14:09 -07:00
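The mapping described above (keep the "family" of the reference type, pick the width from the bit size) can be sketched as a nested switch. The enum below is a hypothetical, reduced set of types and `reg_type_from_bit_size` is an illustrative name; neither is the i965 definition, and only the float cases are taken directly from the commit message.

```c
#include <assert.h>

/* Hypothetical register types: float, signed and unsigned families. */
enum reg_type {
    TYPE_HF, TYPE_F, TYPE_DF, /* 16/32/64-bit float */
    TYPE_W,  TYPE_D, TYPE_Q,  /* 16/32/64-bit signed integer */
    TYPE_UW, TYPE_UD, TYPE_UQ /* 16/32/64-bit unsigned integer */
};

/* Return the type in the same family as `ref` with the requested width. */
static enum reg_type reg_type_from_bit_size(unsigned bit_size, enum reg_type ref)
{
    enum reg_type base;

    switch (ref) {
    case TYPE_HF: case TYPE_F:  case TYPE_DF: base = TYPE_HF; break;
    case TYPE_W:  case TYPE_D:  case TYPE_Q:  base = TYPE_W;  break;
    case TYPE_UW: case TYPE_UD: case TYPE_UQ: base = TYPE_UW; break;
    default:
        assert(!"invalid reference type");
        return ref;
    }

    switch (bit_size) {
    case 16: return base;                      /* HF, W or UW */
    case 32: return (enum reg_type)(base + 1); /* F, D or UD */
    case 64: return (enum reg_type)(base + 2); /* DF, Q or UQ */
    default:
        assert(!"invalid bit size");
        return ref;
    }
}
```

The `base + 1` / `base + 2` arithmetic relies on the enum listing each family in width order, which is a property of this sketch's enum only.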
Jason Ekstrand	99778e7f9f	i965/fs/nir: Use the nir_src_bit_size helper Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-25 16:14:09 -07:00
Jason Ekstrand	fa6e74e33e	intel/fs: Handle flag read/write aliasing in needs_src_copy In order to implement the ballot intrinsic, we do a MOV from flag register to some GRF. If that GRF is used in a SEL, cmod propagation helpfully changes it into a MOV from the flag register with a cmod. This is perfectly valid but when lower_simd_width comes along, it simply splits into two instructions which both have conditional modifiers. This is a problem since we're reading the flag register. This commit makes us check whether or not flags_written() overlaps with the flag values that we are reading via the instruction source and, if we have any interference, will force us to emit a copy of the source. Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2017-10-25 16:14:09 -07:00
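The check described above reduces to an overlap test between the flag bits an instruction writes and the flag bits it reads through its sources. The sketch below assumes both sets are available as bitmasks. `struct inst` and its fields are hypothetical, and the real needs_src_copy considers other conditions that are not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical instruction summary: one bit per flag subregister. */
struct inst {
    uint32_t flags_written;  /* flag bits written, e.g. by a conditional modifier */
    uint32_t src_flags_read; /* flag bits read through the instruction's sources */
};

/* A copy of the source is needed when splitting the instruction would let
 * the first half overwrite flag bits that the second half still reads. */
static bool needs_src_copy_for_flags(const struct inst *in)
{
    return (in->flags_written & in->src_flags_read) != 0;
}
```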
Jan Vesely	a6d38f476b	clover: Fix compilation after clang r315871 v2: use a more generic compat function v3: rename and formatting cleanup Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103388 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net> CC: <mesa-stable@lists.freedesktop.org>	2017-10-25 18:57:42 -04:00
Marek Olšák	b85cd69415	glsl_to_tgsi: remove unused glsl_version variable trivial	2017-10-26 00:43:31 +02:00
Bas Nieuwenhuizen	61a9ef4ab1	radv: Compute ac keys from pipeline key. The beginning of the end for the shader keys. Not entirely sure what I'm going to replace them with for the compiler though, so this is the first step. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-10-26 00:28:40 +02:00
Bas Nieuwenhuizen	49d035122e	radv: Add single pipeline cache key. To decouple the key used for info gathering and the cache from whatever we pass to the compiler. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-10-26 00:28:40 +02:00
Bas Nieuwenhuizen	de38491a57	radv: Don't compute as_ls/as_es before hashing. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-10-26 00:28:40 +02:00
Jordan Justen	87e71726e0	glsl_to_nir: Zero nir_constant in constant_copy for valgrind & nir_serialize Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-25 12:36:21 -07:00
Jordan Justen	16867154d8	glsl_to_nir: Zero nir_variable struct for valgrind & nir_serialize Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-25 12:36:21 -07:00
Jordan Justen	78550869a1	nir: Zero nir_load_const_instr::value for valgrind & nir_serialize Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-25 12:36:21 -07:00
Jordan Justen	b35e8c3b86	intel/nir: Zero local index const struct for valgrind & nir_serialize Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-25 12:36:21 -07:00
Jordan Justen	d917f57c2f	nir: Zero local_size const struct for valgrind & nir_serialize Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-25 12:36:21 -07:00
Jordan Justen	abbcdc9b69	glsl: Add field initializers for glsl_struct_field default constructor This helps valgrind when encode_type_to_blob is used. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-25 12:36:21 -07:00
Jason Ekstrand	23327af91c	compiler/types: Support [de]serializing void types Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-25 12:36:21 -07:00
Jason Ekstrand	c1b84256cc	nir/intrinsics: Set the correct num_indices for load_output Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-25 12:36:20 -07:00
Connor Abbott	7686f0b316	glsl: move shader_cache type handling to glsl_types Not sure if this is the best place to put it, but we're going to need this for NIR too. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-25 12:36:20 -07:00
Alex Smith	9626128f32	vulkan: Update headers and registry to 1.0.64 Acked-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Alex Smith <asmith@feralinteractive.com>	2017-10-26 05:17:57 +10:00
Matthew Nicholls	27a0b24bf2	ac/nir: generate correct instruction for atomic min/max on unsigned images v2: fix silly typo Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-25 20:52:58 +02:00
Roland Scheidegger	20c77ae639	gallium/util: remove some block alignment assertions These assertions were revisited a couple of times in the past, and they still weren't quite right. The problem I was seeing (with some other state tracker) was a copy between two 512x512 s3tc textures, but from mip level 0 to mip level 8. Therefore, the destination has only size 2x2 (not a full block), so the box width/height was only 2, causing the assertion to trigger for src alignment. As far as I can tell, such a copy is completely legal, and because a correct assertion would get ridiculously complicated just get rid of it for good. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-10-25 19:52:24 +02:00
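The arithmetic behind the failing assertion above can be checked directly: a 512x512 texture at mip level 8 is 2x2, which is smaller than one compressed block. The sketch assumes 4x4 blocks (as used by S3TC formats) and uses hypothetical helper names, not the gallium utility functions.

```c
#include <stdbool.h>

/* Size of one dimension at a given mip level, never smaller than 1. */
static unsigned minify_dim(unsigned size, unsigned level)
{
    unsigned v = size >> level;
    return v ? v : 1;
}

/* The kind of check the removed assertions made: is the extent a whole
 * number of blocks? */
static bool is_block_aligned(unsigned extent, unsigned block_size)
{
    return extent % block_size == 0;
}
```

With these helpers, `minify_dim(512, 8)` is 2 and `is_block_aligned(2, 4)` is false, so a box covering the whole destination level fails a naive alignment test even though the copy itself is valid.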
Eric Engestrom	7983adc60f	meson: be explicit about the version required This way, we know what we're allowed to use (no nested include lists for instance) and users get immediate feedback when trying to use unsupported versions, rather than a cryptic crash or things being silently not built correctly. Cc: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-25 14:05:56 +01:00
Erik Faye-Lund	9e5a5a11ed	meson: add opt-out of libunwind Libunwind has some issues on some platforms, so let's allow people who have issues to opt-out. This is similar to what we do in automake, and the implementation is modelled after our opt-out for valgrind. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-25 14:05:24 +02:00
Harish Krupo	d37bcf3cc2	gles2: support for GL_EXT_occlusion_query_boolean Following test checking entrypoints passes: dEQP-EGL.functional.get_proc_address.extension.gl_ext_occlusion_query_boolean Piglit test 'ext_occlusion_query_boolean-any-samples' passes with these changes. No changes/regression observed in WebGL occlusion tests or Intel CI. v2: add es2="2.0" for glapi entrypoints, clean up xml dispatch_sanity changes (fix 'make check') Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-25 14:10:38 +03:00
Tapani Pälli	f5bec8583a	mesa: enum checks for GL_EXT_occlusion_query_boolean Some of the checks are valid for generic ES 3.2 as well. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-25 14:10:38 +03:00
Samuel Pitoiset	9711979df0	radv: print NIR before LLVM IR and disassembly It's still printed after linking, but it makes more sense to have SPIRV->NIR->LLVM IR->ASM. Fixes: `f0a2bbd1a4` (radv: move nir print after linking is done) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-25 11:46:53 +02:00
Bas Nieuwenhuizen	5bfbab2fdc	radv: Fix truncation issue hexifying the cache uuid for the disk cache. Going from binary to hex has a 2x blowup. Fixes: `1421625292` 'radv: create on-disk shader cache' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-25 09:50:05 +02:00
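The "2x blowup" noted above means a buffer sized for the binary UUID is too small for its hex form: n bytes need 2n characters plus a terminating NUL. The sketch below illustrates the sizing rule with hypothetical helpers (`hex_buf_size`, `hexify`); it is not the radv code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for the hex form of n binary bytes, including the NUL. */
static size_t hex_buf_size(size_t n)
{
    return 2 * n + 1;
}

/* Hex-encode `len` bytes of `in` into `out`, which must hold at least
 * hex_buf_size(len) bytes. */
static void hexify(const uint8_t *in, size_t len, char *out, size_t out_size)
{
    static const char digits[] = "0123456789abcdef";

    assert(out_size >= hex_buf_size(len));
    for (size_t i = 0; i < len; i++) {
        out[2 * i]     = digits[in[i] >> 4];  /* high nibble */
        out[2 * i + 1] = digits[in[i] & 0xf]; /* low nibble */
    }
    out[2 * len] = '\0';
}
```

For a 16-byte UUID, `hex_buf_size(16)` is 33, so a destination sized like the source would keep only part of the identifier.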
Timothy Arceri	767ca5bdf1	radv: enable lower to scalar nir pass This will allow dead components of varyings to be removed. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-25 17:02:40 +11:00
Timothy Arceri	8ebaf8192a	ac: add support for explicit component packing This is needed for RADV to support explicit component packing. This is also required to use the new NIR component splitting / packing passes. V2: - add component packing support for interpolate_at* intrinsics - improve store packing support when not all varyings are scalar as spotted by Bas the store source was incorrectly offset. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-25 17:02:40 +11:00
Timothy Arceri	e0e0666584	i965: fix unused var warnings in release build Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-25 14:26:39 +11:00
Dave Airlie	d8cefaa197	radv: use device name in cache creation like radeonsi. Not sure how useful this is, but it makes it more consistent. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-25 02:26:01 +01:00
Dave Airlie	3cd3035ace	radv: use a define for the transition point between cp and compute shader For certain buffer meta ops we can use the CP or a compute shader, we should use a define rather than hardcoding 4096, which allows for easier testing and more consistency. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-25 10:01:13 +10:00
Kenneth Graunke	b704538b00	docs: Mark GL_KHR_no_error as done. Drivers have supported KHR_no_error for a while. We'd been leaving it marked as "in progress" because there's a zillion places that could get slightly more optimized. But, Timothy and Samuel have already done piles of work, and I think we have a solid implementation at this point. Let's check it off the list. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-10-24 16:56:58 -07:00
Kenneth Graunke	66b4a7a79e	i965: Call gen6_upload_push_constants() even when the stage is disabled. This properly sets stage_state->push_constant_dirty = true, so that we emit 3DSTATE_CONSTANT_XS to disable the constant buffer for the shader stage. It also sets stage_state->push_const_size = 0. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-10-24 16:14:04 -07:00
Kenneth Graunke	16096e9119	i965: Drop a bunch of downcasting and upcasting of gl_program pointers. We have a gl_program and we want a gl_program. There's no point in converting to brw_program and back again. This probably made more sense in the old days before Tim dropped a layer of subclassing. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-10-24 16:14:02 -07:00
Kenneth Graunke	90ed2a10bb	i965: Move _mesa_shader_write_subroutine_indices down a level. Now we call it in one place instead of making every caller do it. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-10-24 16:13:59 -07:00
Dave Airlie	a5499b639c	radv: only emit dfsm packets if dfsm is allowed. radeonsi only emits these when dfsm is enabled, so for now just hinge them on a flag we never set. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-24 23:00:57 +01:00
Rob Clark	4aa69cc425	meson: build freedreno Mostly copy/pasta from Dylan Baker's conversion of nouveau and i965. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-24 15:33:40 -04:00
Rob Clark	2207af032b	meson: extract out variable for nir_algebraic.py Also needed in freedreno/ir3. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-24 15:33:40 -04:00
Rob Clark	0ca8d53215	freedreno/ir3: use a flag instead of setting PYTHONPATH Similar to `848da66222`, pass an arg to ir3_nir_trig.py to add to python path, rather than using $PYTHONPATH, to prep for meson build support. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-24 15:33:40 -04:00
Kenneth Graunke	583ce96c94	i965: Don't disable CCS for RT dependencies when dispatching compute. Compute shaders don't have access to the framebuffer, so there's no point in worrying whether a texture is bound as a render target. This saves a bunch of resolves in GFXBench4 Manhattan 3.1, but doesn't seem to impact performance at all, at least on Apollolake. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-10-24 11:31:33 -07:00
Eric Anholt	e91c3540fc	i965: Fix memmem compiler warnings. gcc is throwing this warning in my meson build: ../src/intel/compiler/brw_eu_validate.c:50:11: warning argument 1 null where non-null expected [-Wnonnull] return memmem(haystack.str, haystack.len, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ needle.str, needle.len) != NULL; ~~~~~~~~~~~~~~~~~~~~~~~ The first check for CONTAINS has a NULL error_msg.str and 0 len. The glibc implementation will exit without looking at any haystack bytes if haystack.len < needle.len, so this was safe, but silence the warning anyway by guarding against implementation variability. Fixes: `122ef3799d` ("i965: Only insert error message if not already present") Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-10-24 10:51:18 -07:00
Rob Clark	eed9685dd6	freedreno: per-context fd_pipe To enable per-context priorities, we need to have per-context pipe's. Unfortunately we still need to keep the global screen pipe, mostly just for screen->get_timestamp(). Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-10-24 12:56:51 -04:00
Rob Clark	9c32333a58	freedreno: rename pipe -> vsc_pipe To add context priority support we need to have an fd_pipe per context, rather than per-screen. This conflicts with the existing ctx->pipe (which is actually a visibility stream pipe, a hw resource). So just rename it. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-10-24 12:56:51 -04:00
Rob Clark	7e7096307a	freedreno: pass context flags through to fd_context_init() Prep work for later patch. Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-10-24 12:56:51 -04:00
Brian Paul	7a6c6e73a8	gallium/util: use util_snprintf() in u_socket_connect() Instead of plain snprintf(). To fix the MSVC build. snprintf() is used in various places in Mesa/gallium, but apparently, not in code built with MSVC. Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-24 08:17:15 -06:00
Benjamin Gordon	de3555f834	configure: Allow android as an EGL platform I'm working on radeonsi support in the Chrome OS Android container (ARC++). Mesa in ARC++ uses autotools instead of Android.mk, but all the necessary EGL bits are there, so the existing check is too strict. Signed-off-by: Benjamin Gordon <bmgordon@chromium.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-24 14:46:22 +01:00
Marek Olšák	2a414c3961	radeonsi: postponed KILL isn't postponed anymore, but maintains WQM This restores performance for the drirc workaround, i.e. KILL_IF does: visible = src0 >= 0; kill_flag &= visible; // accumulate kills amdgcn_kill(wqm_vote(visible)); // kill fully dead quads only And all helper pixels are killed at the end of the shader: amdgcn_kill(kill_flag); Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-24 14:56:34 +02:00
Marek Olšák	da0083f123	radeonsi: use postponed KILL only when derivatives are used Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-24 14:56:34 +02:00
Marek Olšák	478afbe525	ac: use llvm.amdgcn.kill with LLVM 6.0 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-24 14:56:34 +02:00
Marek Olšák	1ff9e27cbd	ac: replace ac_build_kill with ac_build_kill_if_false This will be a new LLVM intrinsic and will also work nicely with llvm.amdgcn.wqm.vote. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-24 14:56:34 +02:00
Timothy Arceri	f0a2bbd1a4	radv: move nir print after linking is done We now have linking optimisations so we want to delay dumping the nir until after these are complete. Fixes: `06f05040eb` (radv: Link shaders) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-24 10:41:38 +11:00
Dave Airlie	11d688d9f0	mesa/bufferobj: don't double negate the range This fixes a regression I introduced refactoring this code, I managed to invert range twice, I moved the inversion into the common code, but forgot to stop doing it in the callee. Fixes: GL45-CTS.multi_bind.dispatch_bind_buffers_base Fixes: `35ac13ed3` (mesa/bufferobj: consolidate some codepaths between ubo/ssbo/atomics.) Reported-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-24 08:40:23 +10:00
Timothy Arceri	013313cf89	radv: clone meta shaders before linking The IR is reused in different pipeline combinations so we need to clone it to avoid link time optimisations messing up the original copy. Fixes: `06f05040eb` (radv: Link shaders) Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-24 09:27:40 +11:00
Brian Paul	069211f205	gallium/util: don't call close() on Windows in u_tests.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-23 15:10:44 -06:00
Brian Paul	5134c0dedf	mesa: use util_strdup() macro in u_debug_symbol.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-23 15:10:38 -06:00
Brian Paul	89372220b3	mesa: use util_strdup() macro in symbol_table.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-23 15:10:32 -06:00
Brian Paul	acd6ea0cc0	util: add util_strdup() wrapper macro To work around MSVC warning that strdup() is a deprecated POSIX function. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-23 15:10:24 -06:00
Brian Paul	6230773936	gallium/util: replace gethostbyname() with getaddrinfo() Compiling with MSVC options /we4995 /we4996 (a subset of /sdl) generates a warning that the gethostbyname() function is deprecated in favor of getaddrinfo() or GetAddrInfoW(). Replace the call with getaddrinfo(). Untested. There are no callers to u_socket_connect() in Gallium. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-23 15:10:01 -06:00
Alex Smith	fee9d05e21	radv: Update code pointer correctly if a variant is already created This was the actual cause of GPU hangs fixed by `0fdd531457` ("radv: Fix pipeline cache locking issues"), since multiple threads would end up trying to create the variants for a single entry. Now that we're locking around the whole of this function, this isn't really necessary (we either create all or none of the variants), but fix this anyway in case things change later. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> CC: 17.3 <mesa-stable@lists.freedesktop.org>	2017-10-23 22:36:54 +02:00
Kenneth Graunke	013d331220	i965: Revert absolute mode for constant buffer pointers. The kernel doesn't initialize the value of the INSTPM or CS_DEBUG_MODE2 registers at context initialization time. Instead, they're inherited from whatever happened to be running on the GPU prior to first run of a new context. So, when we started setting these, other contexts in the system started inheriting our values. Since this controls whether 3DSTATE_CONSTANT_* takes a pointer or an offset, getting the wrong setting is fatal for almost any process which isn't expecting this. Unfortunately, VA-API and Beignet don't initialize this (nor does older Mesa), so they will die horribly if we start doing this. UXA and SNA don't use any push constants, so they are unaffected. Until we have some kind of solution to this problem, I'm going to revert this patch and abandon using the feature for now. It will lead to fewer pushed UBO ranges on Broadwell+, which may lead to lower performance, though I don't have any data on the impact. Cc: "17.3 17.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102774	2017-10-23 12:03:22 -07:00
Dylan Baker	d4567efa5c	meson: build imx driver Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-10-23 11:45:55 -07:00
Dylan Baker	51558a1d6c	meson: build etnaviv driver + winsys Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-10-23 11:45:38 -07:00
Eric Anholt	ba85525fce	ac: Silence a compiler warning about results[0]. We know that num_components will be > 0, but the compiler doesn't. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-23 10:14:40 -07:00
Eric Anholt	34c04c734f	ac: Fix a compiler warning for possibly undefined "name" Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-23 10:14:40 -07:00
Dylan Baker	77f7ef0287	meson: fix egl build for meson version < 0.43 Meson 0.43 added the ability to pass nested lists to include_directories, so the code that we have works for 0.43, but not for 0.42. This patch changes the include_directories list to be flat so it works with 0.42 fixes: `108d257a16` ("meson: build libEGL") Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-10-23 10:14:40 -07:00
Nicolai Hähnle	f9ccfda9bc	amd/common/gfx9: workaround DCC corruption more conservatively Fixes KHR-GL45.texture_swizzle.smoke and others on Vega. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102809 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-23 18:10:20 +02:00
Emil Velikov	a90b4329df	docs/release-calendar: update - 17.3.0-rc1 is out Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-23 14:31:15 +01:00
Ilia Mirkin	4d24a7cb97	glsl: fix derived cs variables There are two issues with the current implementation. First, it relies on the layout(local_size_*) happening in the same shader as the main function, and secondly it doesn't work for variable group sizes. In both cases, the simplest fix is to move the setup of these derived values to a later time, similar to how the gl_VertexID workarounds are done. There already exist system values defined for both of the derived values, so we use them unconditionally, and lower them after linking is performed. While we're at it, we move to using gl_LocalGroupSizeARB instead of gl_WorkGroupSize for variable group sizes. Also the dead code elimination avoidance can be removed, since there can be situations where gl_LocalGroupSizeARB is needed but has not been inserted for the shader with main function. As a result, the lowering code has to insert its own copies of the system values if needed. Reported-by: Stephane Chevigny <stephane.chevigny@polymtl.ca> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103393 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-10-23 08:34:56 -04:00
Emil Velikov	4302df8c8e	docs: add 17.4.0-devel release notes template Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-23 13:07:06 +01:00
Emil Velikov	c85d64cf67	mesa: bump version to 17.4.0-devel Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-23 13:00:43 +01:00
Juan A. Suarez Romero	2665d012a8	radv: automake: include radv_extensions.py in the tarball Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-23 12:37:01 +02:00
Bas Nieuwenhuizen	a548b727a1	ac/nir: Only clamp shadow reference on radeonsi. Vulkan CTS does not expect the value to be clamped (at least for D32), and it makes a difference even though depth is in [0,1], due to strict inequalities. I couldn't find anything in the Vulkan spec about this, but the test seemed to be copied from GL tests and the GL spec only specifies clamping for fixed point formats. Hence I expect radeonsi to run into this at some point as well, but given that they still have a use case with the Z16->Z32 promotion, I'll leave that for someone else to clean up. This at least fixes radv dEQP-VK.texture.shadow.* on VI. Fixes: `0f9e32519b` 'ac/nir: clamp shadow texture comparison value on VI' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-23 09:13:38 +02:00
Bas Nieuwenhuizen	c07d719e8b	radv: Disallow indirect outputs for GS on GFX9 as well. Since it also uses the output vector before writing to memory. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-10-23 00:27:44 +02:00
Bas Nieuwenhuizen	2c5b43c87f	ac/nir: Fix nir_texop_lod on GFX for 1D arrays. Fixes: `1bcb953e16` 'radv: handle GFX9 1D textures' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-23 00:27:44 +02:00
Dave Airlie	da9c3cd3ee	radv/ac/nir: only emit tess factors to storage if tes reads them Otherwise we just need to write them to the tf ring. This seems to improve the tessellation demo on Bonaire ~2190->~2230 fps Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-23 07:10:29 +10:00
Bas Nieuwenhuizen	6ce550453f	radv: Don't use vgpr indexing for outputs on GFX9. Due to LLVM bugs. Fixes a bunch of dEQP-VK.glsl.indexing.* tests. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-22 02:36:37 +02:00
Bas Nieuwenhuizen	ad727b96b6	ac/nir: Account for compact array index in GS input load from LDS. Mirrors the vram path. Fixes: `d4ecc3c929` 'ac/nir: Add loading from LDS for merged GS.' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-21 22:29:40 +02:00
Bas Nieuwenhuizen	67648c0faa	radv: Don't compile shaders when they are cached already. When the gs_copy_shader is NULL (due to an incomplete cache), but the main shaders are found, we still do the nir, but we shouldn't compile the shaders again. For merged shaders we should also account for the missing shaders. Fixes: `ce03c119ce` 'radv: Add code to compile merged shaders.' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-21 22:29:34 +02:00
Bas Nieuwenhuizen	3bf954b28e	radv: Don't check for max GL GS invocations. We specify 127 instead of 32 as the limit in vulkan. Fixes: `6bc42855f9` 'radv: enable GS on GFX9' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-21 22:29:09 +02:00
Bas Nieuwenhuizen	050f7e2df2	radv: Don't explicitly reference vertex shader for draw_id. With merged shaders the vertex shader may not exist. This got in because the offending patch was written before merged shaders were upstream, but committed after. Fixes: `75dfab24a2` 'radv: refactor indirect draws with radv_draw_info' Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-10-21 20:00:22 +02:00
Bas Nieuwenhuizen	20fb15bfe4	radv: Don't reset cmd_buffer->state.dirty. Otherwise for non-indexed draws we set and immediately unset RADV_CMD_DIRTY_INDEX_BUFFER. As all the set functions should clear their own bit, this is unnecessary. Fixes: `341529dbee` 'radv: use optimal packet order for draws' Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-10-21 20:00:16 +02:00
Bas Nieuwenhuizen	fb55477990	radv: Correctly detect changed shaders for vertex descriptors. As they were emitted after the new pipeline, the changed pipeline detection was not working anymore. Fixes: `341529dbee` 'radv: use optimal packet order for draws' Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-10-21 19:59:44 +02:00
Bas Nieuwenhuizen	24fe4e6143	ac/nir: Set larger workgroup size for GS on GFX9. They don't take a single wave anymore and we need the barriers. Fixes: `6bc42855f9` 'radv: enable GS on GFX9' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-21 12:46:44 +02:00
Bas Nieuwenhuizen	9e82f2b3ea	ac/nir: Take the max workgroup size of all provided shaders. Fixes: `ffaf4d608a` 'radv: Enable tessellation shaders for GFX9.' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-21 12:46:28 +02:00
Alex Smith	0fdd531457	radv: Fix pipeline cache locking issues Need to lock around the whole process of retrieving cached shaders, and around GetPipelineCacheData. This fixes GPU hangs observed when creating multiple pipelines in parallel, which appeared to be due to invalid shader code being pulled from the cache. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 03:52:43 +02:00
Lionel Landwerlin	c71d44c7f8	anv: don't assert on device init on Cannonlake v2: Warn that support is still in alpha (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-21 02:37:33 +01:00
Lionel Landwerlin	0c95adaf9e	anv: disable stencil pma fix on Gen > 9 This workaround isn't listed on Gen10. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-21 02:37:33 +01:00
Lionel Landwerlin	0c92651a3b	blorp: enable R32G32B32X32 blorp ccs copies Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-21 02:37:33 +01:00
Eric Anholt	48615d1ead	meson: Fix vc5 deps on the XML-generated headers. I typoed and was depending on v3d_xml.h (the gzipped xml), not on the v3d_packet_v33_pack.h that the compiler and QPU packing actually use.	2017-10-20 17:16:00 -07:00
Eric Anholt	07bfdb478b	broadcom/vc5: Propagate vc4 aliasing fix to vc5. See `e5fea0d621`	2017-10-20 17:09:47 -07:00
Stefan Schake	e5fea0d621	broadcom/vc4: Fix aliasing issue This was causing Android clang version 3.8.256229 to miscompile, presumably due to strict aliasing. Fixes: `14dc281c13` ("vc4: Enforce one-uniform-per-instruction after optimization.")	2017-10-20 17:09:35 -07:00
Dylan Baker	035ec7a2bb	meson: Add support for EGL glvnd Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Lyude Paul <lyude@redhat.com>	2017-10-20 16:46:48 -07:00
Dylan Baker	108d257a16	meson: build libEGL This is based heavily on Daniel Stone's work for the same, rebased on master and with a number of TODO's fixed. This does not implement glvnd (which is coming in a later patch) Meson builds egl slightly differently than autotools, namely it doesn't build an intermediate shared library. It doesn't do this because meson doesn't have problems with the name of the library being dynamically generated, so the glvnd and non-glvnd code can follow the same path. v2: - Don't reuse variable (Eric E.) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2017-10-20 16:46:48 -07:00
Dylan Baker	ddf06a05ad	meson: move wayland_drm_protocol generation to wayland-drm These files are needed by both vulkan wayland-wsi and by egl wayland-wsi, since the XML file is in src/egl/wayland/wayland-drm and we can include this directory in such a way that it will be loaded before egl and vulkan this allows us to avoid multiple calls to the same generator. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-and-Tested-by: Eric Engestrom <eric@engestrom.ch>	2017-10-20 16:46:48 -07:00
Dylan Baker	8d3b1210cb	meson: Don't allow glx to be built without platform_x11 Previously this failed to change with_glx to disabled from auto if platform_x11 was unset or if no opengl apis were being built. v2: - swap conditional positions Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-and-Tested-by: Eric Engestrom <eric@engestrom.ch>	2017-10-20 16:46:48 -07:00
Dylan Baker	8792a9e01b	meson: bump libdrm_amdgpu requirement to 2.4.85 fixes: `b603725703` ("configure.ac: Bump libdrm_amdgpu version to 2.4.85.") Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 16:45:39 -07:00
Eric Anholt	5a0d3e1129	nir: Print the components referenced for split or packed shader in/outs. Having 4 variables all called "gl_in_TexCoord0@n" isn't very informative, much better to see: decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0 (VARYING_SLOT_VAR0.x, 1, 0) decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@0 (VARYING_SLOT_VAR0.y, 1, 0) decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@1 (VARYING_SLOT_VAR0.z, 1, 0) decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@2 (VARYING_SLOT_VAR0.w, 1, 0) v2: Handle arrays and structs better (by Timothy) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-10-20 16:26:46 -07:00
Eric Anholt	d9ce4ac990	nir: Add a safety check that we don't remove dead I/O vars after lowering. The pass only looks at var load/store intrinsics, not input load/store intrinsics, so assert that we don't see the other type. v2: Adjust comment indentation. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-10-20 16:26:07 -07:00
Andres Rodriguez	a2c6fbb3ee	radv: disable implicit sync for radv allocated bos v3 Implicit sync kicks in when a buffer is used by two different amdgpu contexts simultaneously. Jobs that use explicit synchronization mechanisms end up needlessly waiting to be scheduled for long periods of time in order to achieve serialized execution. This patch disables implicit synchronization for all radv allocations except for wsi bos. The only systems that require implicit synchronization are DRI2/3 and PRIME. v2: mark wsi bos as RADV_MEM_IMPLICIT_SYNC v3: Add drm version check (Bas) Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:15:54 +02:00
Andres Rodriguez	eff2bdbd82	radv: factor out radv_alloc_memory This allows us to pass extra parameters to the memory allocation operation that are not defined in the vulkan spec. This is useful for internal usage. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:15:49 +02:00
Andres Rodriguez	92724338ba	radv: Expose VK_EXT_global_priority Expose the extension string as supported Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:01:44 +02:00
Andres Rodriguez	9f7edf4d1f	radv: don't skip PS/VS partial flush This patch helps lower high priority compute latency. Found by bisecting a perf regression on computeparticles with high priority compute queues enabled. Reverting this micro-optimization doesn't seem to have any negative effect on performance on Dota2 or ssao. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:01:44 +02:00
Andres Rodriguez	fd04f3eb86	radv: Implement VK_EXT_global_priority This extension allows the caller to change a queue's system wide priority. This is useful for applications with specific latency constraints. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:01:44 +02:00
Andres Rodriguez	557de3b9ae	radeonsi: hardcode shader WAVE_LIMIT to the maximum value This is part of a cooperative scheduling approach used by radv. All drivers in the stack must opt-in to resource arbitration, otherwise GL based apps will be able to ignore system priorities. We always hardcode the field to its maximum value, instead of attempting to calculate an approximate usage. In testing, there were no benefits to using anything other than the maximum. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:01:44 +02:00
Andres Rodriguez	986c4b0bd4	radv: hardcode shader WAVE_LIMIT to the maximum value When WAVE_LIMIT is set, a submission will opt-in for SPI based resource scheduling. Because this mechanism is cooperative, we must ensure that all submissions have this field set, otherwise they will bypass resource arbitration. We always hardcode the field to its maximum value, instead of attempting to calculate an approximate usage. In testing, there were no benefits to using anything other than the maximum. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:01:44 +02:00
Andres Rodriguez	b7c2f70656	vulkan: update headers & registry to VK 1.0.63 Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:01:44 +02:00
Bas Nieuwenhuizen	b603725703	configure.ac: Bump libdrm_amdgpu version to 2.4.85. For VK_EXT_global_priority in radv. Acked-by: Andres Rodriguez <andresx7@gmail.com>	2017-10-21 01:01:44 +02:00
Eric Anholt	9b5fa214f4	broadcom/vc5: Use SETMSF to handle discards. A bit of spec text suggested that (like vc4) condition codes should be used for discards, and the simulator was fine with it, but the 7268 disagrees and you have to use SETMSF instead or the color comes through. Fixes glsl-fs-discard-01 and many of the interpolation-with-clipping tests.	2017-10-20 15:59:41 -07:00
Eric Anholt	a48a38937c	broadcom/vc5: Set the snorm/unorm packing functions to be lowered. We don't have native instructions for them, so set up the lowering. Once we support the bfi instructions that get generated, they should start actually working.	2017-10-20 15:59:41 -07:00
Eric Anholt	0e6fee7328	broadcom/vc5: Fix pasteo that broke vertex texturing. We weren't ever filling in the texture state record, so we'd dereference NULL from the shader.	2017-10-20 15:59:41 -07:00
Eric Anholt	34690536a7	broadcom/vc5: Move default attribute value setup to the CSO and fix them. I was generating some stub values to bring the driver up, but fill them in properly now. We now set 1.0 or 1u as appropriate, and thanks to being in their own BO it fixes piglit failures on the 7268 (where our 4-byte alignment was insufficient). Fixes const-packHalf2x16.shader_test	2017-10-20 15:59:41 -07:00
Eric Anholt	fb15168919	broadcom/vc5: Move most of the shader state attribute record to the CSO. This should reduce our draw-time overhead, and puts the code where it should go long term.	2017-10-20 15:53:55 -07:00
Eric Anholt	f4ff8f74ee	broadcom/vc5: Fix build failure from nir_shader::stage removal. Fixes: `59fb59ad54` ("nir: Get rid of nir_shader::stage")	2017-10-20 15:53:55 -07:00
Matt Turner	9cd60fce9c	i965/fs: Use align1 mode on ternary instructions on Gen10+ Align1 mode offers some nice features over align16, like access to more data types and the ability to use a 16-bit immediate. This patch does not start using any new features. It just emits ternary instructions in align1 mode. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-10-20 15:00:17 -07:00
Matt Turner	8c16c9c677	i965: Add align1 ternary instruction emission support Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-10-20 15:00:17 -07:00
Matt Turner	f11fa5ac6c	i965: Add align1 ternary instruction disassembler support Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-10-20 15:00:17 -07:00
Matt Turner	6c7fc9b73a	i965: Add align1 ternary instruction-word support Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-10-20 15:00:17 -07:00
Matt Turner	3b2c868848	i965: Add align1 ternary instruction support to conversion functions Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-10-20 15:00:17 -07:00
Matt Turner	281e8b8f27	i965: Add align1 ternary instruction field encodings Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-10-20 15:00:17 -07:00
Matt Turner	5f6ee55e68	i965: Add functions to abstract access to 3src register types Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-10-20 15:00:17 -07:00
Matt Turner	e15dac319b	i965: Rename brw_inst's functions that access the 3src register type Put hw_ in the name so that it's clear these are the hardware encodings. Similar to commit `9fb8323328` ("i965: Rename brw_inst's functions that access the register type") Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-10-20 15:00:16 -07:00
Matt Turner	e7f3b82e03	i965: Rename brw_inst 3src functions in preparation for align1 Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-10-20 15:00:16 -07:00
Matt Turner	ba50b538af	i965: Print subreg in units of type-size on ternary instructions The instruction word contains SubRegNum[4:2] so it's in units of dwords (hence the * 4 to get it in terms of bytes). Before this patch, the subreg would have been wrong for DF arguments. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-10-20 15:00:16 -07:00
Matt Turner	3f14150e9a	i965: Add functions for brw_reg_type <-> hw 3src type Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-10-20 15:00:16 -07:00
Matt Turner	4c857d1f3b	i965: Move brw_reg_type_is_floating_point to brw_reg_type.h I'm going to call this from brw_inst.h, and I don't want to have to include all of brw_reg.h. Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>	2017-10-20 15:00:16 -07:00
Jason Ekstrand	59fb59ad54	nir: Get rid of nir_shader::stage It's redundant with nir_shader::info::stage. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-20 12:49:17 -07:00
Samuel Pitoiset	341529dbee	radv: use optimal packet order for draws Ported from RadeonSI. The time where shaders are idle should be shorter now. This can give a little boost, like +6% with the dynamicubo Vulkan demo. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 20:07:53 +02:00
Samuel Pitoiset	af6985b309	radv: add radv_emit_shaders_prefetch() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 20:07:53 +02:00
Samuel Pitoiset	0d85f4a9e2	radv: add radv_emit_shader_prefetch() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 20:07:53 +02:00
Marek Olšák	46f452dd5f	st/mesa: correct a u_vbuf comment trivial.	2017-10-20 18:56:20 +02:00
Christian Gmeiner	65ccee2dc2	etnaviv: fix implicit conversion warning Gallium's query_type used in APIs is unsigned. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-10-20 12:42:55 +02:00
Christian Gmeiner	57a586828f	etnaviv: enable occlusion query if GPU supports it Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-10-20 12:42:48 +02:00
Christian Gmeiner	246243d447	etnaviv: add support for occlusion queries Passes most occlusion query piglits. The following piglits are broken: - spec@arb_occlusion_query@occlusion_query_meta_fragments - spec@arb_occlusion_query@occlusion_query_meta_save - spec@arb_occlusion_query2@render v1 -> v2: - use one sample provider for all occlusion query types - add comment about 'magic' value 0x1DF5E76 Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-10-20 12:42:44 +02:00
Christian Gmeiner	282d8698ec	etnaviv: add basic infrastructure for hw queries No hardware query is supported yet. v1 -> v2 - removed query_type from struct etna_hw_sample_provider Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-10-20 12:42:40 +02:00
Christian Gmeiner	b8c335c91b	etnaviv: update headers from rnndb Update to etna_viv commit 6c9c706. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-10-20 12:42:35 +02:00
Chris Wilson	aa65dcd1d7	relnotes/17.3: EGL_IMG_context_priority is now implemented Suggested-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-20 11:28:18 +01:00
Chris Wilson	f72392231b	i965: Report supported context priorities to EGL/DRI Hook up the RendererQuery for __DRI2_RENDERER_HAS_CONTEXT_PRIORITY to report the available DRM_I915_GEM_CONTEXT_SETPARAM options based on the per-client default context. The kernel will validate the request to change the property, so we get an accurate reflection of available support (based on kernel version and privilege) and we should only have to do it once during screen setup -- although the SETPARAM should be fast, they are still an ioctl each. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-20 11:28:17 +01:00
Chris Wilson	1617fca6d1	i965: Pass the EGL/DRI context priority through to the kernel Decode the EGL/DRI priority enum into the [-1023, 1023] range as interpreted by the kernel and call DRM_I915_GEM_CONTEXT_SETPARAM to adjust the priority. We use 0 as the default medium priority (also the kernel default) and so only need adjust up or down. By only doing the adjustment if not setting to medium, we can faithfully report any error whilst setting without worrying about kernel version. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-20 11:28:17 +01:00
Chris Wilson	21023954f8	i965: Record the presence of the kernel scheduler Mention to the debug log if the kernel scheduler is enabled; and in particular if it has preemption enabled. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-20 11:28:17 +01:00
Chris Wilson	98c2b7f9fa	i965: Sync i915_drm.h from kernel for IMG_context_priority Pulling in changes up to kernel commit ac14fbd460d0ec16e7750e40dcd8199b0ff83d0a Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Oct 3 21:34:53 2017 +0100 drm/i915/scheduler: Support user-defined priorities and including the fixup from kernel commit 822a4b673284672af697ccd66e8795f8a712a90d Author: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Date: Fri Oct 6 13:45:59 2017 +0300 drm/i915: Don't use BIT() in UAPI section for implementing IMG_context_priority. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-20 11:28:17 +01:00
Chris Wilson	5c5618338a	egl,dri: Propagate context priority hint to driver->CreateContext Jump through the layers of abstraction between egl and dri in order to feed the context priority attribute through to the backend. This requires us to read the value from the base _egl_context, convert it to a DRI attribute, parse it again in the generic context creator before passing it to the driver as a function parameter. In order to not require us to pass back the actual value of the context priority after creation, we impose that drivers should report the available set of priorities during screen setup (and then they may choose to fail if given an invalid value as that should have been checked at the user boundary.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ben Widawsky <ben@bwidawsk.net> # i915/i965 Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-20 11:28:17 +01:00
Chris Wilson	95ecf3df62	egl: Support IMG_context_priority IMG_context_priority https://www.khronos.org/registry/egl/extensions/IMG/EGL_IMG_context_priority.txt "This extension allows an EGLContext to be created with a priority hint. It is possible that an implementation will not honour the hint, especially if there are constraints on the number of high priority contexts available in the system, or system policy limits access to high priority contexts to appropriate system privilege level. A query is provided to find the real priority level assigned to the context after creation." The extension adds a new eglCreateContext attribute for choosing a priority hint. This stub parses the attribute and copies into the base struct _egl_context, and hooks up the query similarly. Since the attribute is purely a hint, I have no qualms about the lack of implementation before reporting back the value the user gave! v2: Remember to set the default ContextPriority value to medium. v3: Use the driRendererQuery interface to probe the backend for supported priority values and use those to mask the EGL interface. v4: Treat the priority attrib as a hint and gracefully mask any requests not supported by the driver, the EGLContext will remain at medium priority. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Rob Clark <robdclark@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Emil Velikov <emli.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-20 11:28:17 +01:00
Fredrik Höglund	e2053b8e3d	radv: don't flush the VS when srcStageMask == TOP_OF_PIPE_BIT The Vulkan specification says: "... an execution dependency with only VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT in the source stage mask will effectively not wait for any prior commands to complete." Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-10-20 11:37:51 +02:00
Samuel Pitoiset	565c22158f	radv: mark total_count as MAYBE_UNUSED in CmdSet{Viewport,Scissor} Fixes two compilation warnings in release build. Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-10-20 11:22:19 +02:00
Samuel Pitoiset	c8f2b73656	radv: rename radv_cmd_buffer_flush_state() to radv_draw() Similar to the dispatch codepath. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 11:20:16 +02:00
Samuel Pitoiset	9e45e5c9fd	radv: emit primitive restart from radv_emit_draw_registers() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 11:20:14 +02:00
Samuel Pitoiset	93207a8e89	radv: add radv_emit_draw_registers() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 11:20:12 +02:00
Samuel Pitoiset	9466856456	radv: refactor indirect draws (+count buffer) with radv_draw_info Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 11:20:11 +02:00
Samuel Pitoiset	75dfab24a2	radv: refactor indirect draws with radv_draw_info Indirect draws with a count buffer will be refactored in a separate patch. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 11:20:08 +02:00
Samuel Pitoiset	03afa95d9f	radv: refactor simple and indexed draws with radv_draw_info Similar to the dispatch compute logic but for draw calls. For convenience, indirect draws will be converted in a separate patch. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 11:20:05 +02:00
Samuel Pitoiset	54fa635f82	radv: re-emit VGT_INDEX_TYPE because non-indexed draws overwrite it Only on CIK and later. We should only update VGT_INDEX_TYPE but it seems easier to re-emit all the index buffer packets. Fixes: `966d66f28f` (radv: do not re-emit the index buffer for every draw call) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 10:40:01 +02:00
Samuel Pitoiset	eae46f192e	radv: clear the dirty flags in the corresponding emit helpers This will allow us to fix the VGT_INDEX_TYPE issue properly. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 10:39:28 +02:00
Samuel Pitoiset	68cd3564a0	radv: rename RADV_CMD_DIRTY_RENDER_TARGETS to RADV_CMD_DIRTY_FRAMEBUFFER To be consistent with the emit function name. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 10:39:26 +02:00
Samuel Pitoiset	94e69f4141	radv: move DB_COUNT_CONTROL initialization to si_emit_config() CLEAR_STATE will initialize DB_COUNT_CONTROL to 0 for CIK+. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 10:38:11 +02:00
Samuel Iglesias Gonsálvez	9e515cf381	i965/vec4: remove setting default LOD in the backend It is already done in NIR. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-20 08:29:53 +02:00
Samuel Iglesias Gonsálvez	c6d7d09bd0	i965/fs: remove setting default LOD in the backend It is already done in NIR. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-20 08:29:53 +02:00
Samuel Iglesias Gonsálvez	e382890e25	nir: set default lod to texture opcodes that needed it but don't provide it v2: - Use helper to add a new source to the texture instruction. v3: - Use nir_tex_instr_src_index() to simplify the patch (Jason). Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-20 08:29:09 +02:00
Bas Nieuwenhuizen	6bc42855f9	radv: enable GS on GFX9 Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 07:14:00 +01:00
Bas Nieuwenhuizen	73749caf0e	radv: calculate and emit GFX9 GS registers to pipeline state. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 06:23:47 +01:00
Bas Nieuwenhuizen	9961ae2447	ac/nir: Fix up GS input vgprs. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 06:23:37 +01:00
Bas Nieuwenhuizen	d4ecc3c929	ac/nir: Add loading from LDS for merged GS. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 06:23:29 +01:00
Bas Nieuwenhuizen	ec53e52742	ac/nir: Add ES output to LDS for GFX9. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 06:23:18 +01:00
Bas Nieuwenhuizen	3e77333030	ac/nir: Add merged GS function. [airlied: merged fixup + and fixed up a couple more bits]. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 06:23:14 +01:00
Bas Nieuwenhuizen	f82797b56d	radv: Only emit TES when it exists. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 06:14:14 +01:00
Bas Nieuwenhuizen	6e21b7a294	radv: Use control shader presence for detecting tess. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 06:11:10 +01:00
Dave Airlie	5bc5e07d81	radv: fixup tess eval shader when combined. This fixes some access to the tess eval shader when it's combined with geometry on gfx9. This is a review of Bas's commit: radv: Prevent crashing by accessing TES for VGT reuse depth. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-20 06:11:10 +01:00
Bas Nieuwenhuizen	e6acc20b6a	radv: Set VGT_GS_MODE properly for gfx9 Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 05:55:11 +01:00
Dave Airlie	99281c1e8f	radv: ensure correct outinfo is picked. This struct used to rely on being in a union, it isn't anymore, so we have to pick the correct outinfo struct now. This should fix a regression since the union became a struct. dEQP-VK.tessellation.geometry_interaction.point_size.vertex_set_geometry_set Fixes: `6078a3bd51` (ac/nir: Allow ac_shader_variant_info to contain info about multiple stages.) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-20 14:44:09 +10:00
George Kyriazis	f9d239e11f	swr: Rework scratch space allocation Remove allocation of > 2kbyte buffers into context memory in swr_copy_to_scatch_space() (which is used to copy small vertex/index buffers and shader constants to a scratch space to be used by the upcoming draw.) Large shader constant allocations need to be done in the circular scratch buffer instead of context memory, because their values persist across render calls. Also lower SCRATCH_SINGLE_ALLOCATION_LIMIT to 8k, since allocations of larger buffers will get too large for the circular scratch space. Fixes render issues with CEI Ensight. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-10-19 20:18:09 -05:00
Bas Nieuwenhuizen	ffaf4d608a	radv: Enable tessellation shaders for GFX9. It mostly works now. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 01:50:43 +02:00
Dave Airlie	1dda214d9c	ac/nir: init full exec mask for merged shaders. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-20 01:50:40 +02:00
Dave Airlie	14978a1c3b	radv: drop unused r600_htile_info. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-20 00:38:57 +01:00
Dave Airlie	c8eb3558cc	radv: fix CLEAR_STATE packet length. Looking at shader traces I noticed some registers were missing, one of them was being eaten by the wrong clear state length. Fixes: `4f42ea4dc` (radv: use CLEAR_STATE for initializing some registers) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-19 23:56:48 +01:00
Dylan Baker	a447f9fe7b	meson: don't build gallium dri target if gallium is disabled Otherwise -Dgallium-drivers= will cause libmesa_gallium to be built and the megadriver install script to attempt to install drivers without any actual drivers being built. fixes: `66f97f6640` ("meson: build radeonsi") Reported-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Lyude Paul <lyude@redhat.com>	2017-10-19 15:17:34 -07:00
Timothy Arceri	087e010b2b	radv: copy indirect lowering settings from radeonsi It looks like the original indirect mask was probably copied from ANV. Sascha Willems demo results: tessellation ~4000 -> ~4200 fps V2: continue lowering local indirects due to llvm deficiencies. Tested-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 08:01:26 +11:00
Timothy Arceri	5549b47d7b	radv: stop redundant setting of active_stages We already set it above in the NIR compilation loop. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-10-20 08:01:26 +11:00
Timothy Arceri	bebfeb7e1c	ac: move some code out of loop in store_tcs_output() Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-10-20 08:01:26 +11:00
Bas Nieuwenhuizen	228325f4b7	radv: Modify rsrc1/rsrc2 generation for merged tess. No OC_LDS_EN for HS, and the included LS vgpr_comp_cnt is at a different offset. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:44 +02:00
Bas Nieuwenhuizen	8250efb90a	radv: Set correct registers for merged shader rings. We need different regs to end up in s0/s1. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:39 +02:00
Bas Nieuwenhuizen	6a074f87be	radv: Add GFX9 HS emitting code. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:34 +02:00
Bas Nieuwenhuizen	b096245030	radv: Remove remaining hard coded references to VS. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:31 +02:00
Bas Nieuwenhuizen	91b033f4f6	radv: Update GFX9 user data regs for GS/tess. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:27 +02:00
Bas Nieuwenhuizen	ce03c119ce	radv: Add code to compile merged shaders. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:23 +02:00
Bas Nieuwenhuizen	640f2c458f	ac/nir: Add LS-HS input VGPR workaround. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:19 +02:00
Bas Nieuwenhuizen	0a182e73d9	ac/nir: Compile the bodies of multiple shaders. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:15 +02:00
Bas Nieuwenhuizen	56d8af1ec5	ac/nir: Expand user SGPR descriptions a bit. To prevent VS/TCS collisions in merged shaders. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:07 +02:00
Bas Nieuwenhuizen	25efef40d2	ac/nir: Don't write to the dynamic HS word on GFX9. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:04 +02:00
Bas Nieuwenhuizen	d8bd693d03	ac/nir: Add function creation for merged LS+HS. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:25:00 +02:00
Bas Nieuwenhuizen	0cdc8b26f8	ac/nir: Make scan_shader_output_decl less dependent on the context. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:24:56 +02:00
Bas Nieuwenhuizen	6078a3bd51	ac/nir: Allow ac_shader_variant_info to contain info about multiple stages. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:24:51 +02:00
Bas Nieuwenhuizen	a996ed1f9b	ac/nir: Change interface to allow multiple source shaders. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:24:47 +02:00
Bas Nieuwenhuizen	872b21487c	ac/nir: Add HS calling convention. Needed for GFX9 merged shaders. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:24:42 +02:00
Bas Nieuwenhuizen	163a4bf386	ac: Parse the new HS RSRC1 register. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-19 22:24:20 +02:00
Tim Rowley	bfda35c8dd	swr: knob overrides for Intel Xeon Phi Architecture benefits from having more threads/work outstanding. Patch by Jan Zielinski. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-10-19 13:10:55 -05:00
Tim Rowley	028ffa5e18	swr/rast: Add api to override draws in flight Allow draws in flight to be overridden via SWR_CREATECONTEXT_INFO. Patch by Jan Zielinski. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-10-19 13:10:55 -05:00
Tim Rowley	2559f2b93e	swr/rast: Widen fetch shader to SIMD16 (disabled for now) Refactored the gather operation to process 16 elements at a time via paired SIMD8 operations. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-10-19 13:10:55 -05:00
Tim Rowley	49090ccf54	swr/rast: Change DS memory allocation Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-10-19 13:10:55 -05:00
Tim Rowley	04ea03d99d	swr/rast: Fix indentation Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-10-19 13:10:55 -05:00
Tim Rowley	62e2d657c8	swr/rast: Miscellaneous viewport array code changes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-10-19 13:10:55 -05:00
Tim Rowley	ed1db803fa	swr/rast: Minor changes for os-x Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-10-19 13:10:55 -05:00
Kenneth Graunke	82144b7392	i965: Don't disable aux buffers for non-overlapping miplevels. Meta's GenerateMipmap implementation binds the same image for both sampling and rendering - but it samples from one miplevel while rendering the next. This is a false self-dependency, and there's no need to disable auxiliary buffers in this case. In fact, we really want to leave it enabled so the new miplevels gain color compression. Thankfully, the texture object's _MaxLevel is always one shy of the miplevel being rendered. So we can simply check if irb->mt_level overlaps with the texture's defined levels. If not, there's no self-dependency and we can leave the auxiliary buffers enabled. Fixes a performance regression in GFXBench4 Car Chase, which apparently calls glGenerateMipmap() on every frame. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103247 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-19 11:10:00 -07:00
Kenneth Graunke	fa6ca6991b	i965: Remove the intel_miptree_prepare_fb_fetch wrapper. Now that intel_miptree_prepare_texture takes levels and layers, there's not much use in this anymore. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-19 11:10:00 -07:00
Kenneth Graunke	e208d7f874	i965: Only resolve texture levels/layers that are accessed. This should avoid unnecessary resolves when working with texture views. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-19 11:10:00 -07:00
Kenneth Graunke	0954ce1000	i965: Make intel_miptree_prepare_texture() take level/layer arguments. This effectively exports intel_miptree_prepare_texture_slices() as intel_miptree_prepare_texture(). The hope is to avoid resolves when using texture views that access a subset of the levels/layers. For now, we pass the same arguments to separate the mechanical change from the one that actually modifies our behavior. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-19 11:10:00 -07:00
Tim Rowley	33bdbc1db4	gallium: add more exceptions to tgsi_util_get_inst_usage_mask A number of double/int64 operations don't have matching read and write usage masks, which the fallthrough case of tgsi_util_get_inst_usage_mask assumes for componentwise tagged instructions. No regressions in llvmpipe piglit; fixes a large number of swr regressions. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-19 12:49:32 -05:00
Kenneth Graunke	113a6a639f	isl: Fix width check in isl_gen7_choose_msaa_layout. The restriction is supposed to apply if the width field is >= 8192, meaning the actual width value is >= 8193. The code also incorrectly used == for some reason. Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-19 10:21:45 -07:00
Kenneth Graunke	68f69ebdcc	i965: Use is_scheduling_barrier instead of schedule_node::is_barrier. Commit `a73116ecc6` tried to make add_barrier_deps() walk to the next barrier, and stop. To accomplish that, it added an is_barrier flag. Unfortunately, this only works half of the time. The issue is that add_barrier_deps() walks both backward (to the previous barrier), and forward (to the next barrier). It also sets is_barrier. Assuming that we're processing instructions in forward order, this means that is_barrier will be set for previous instructions, but not future ones. So we'll never see it, and walk further than we need to. dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 now compiles its shaders in 3.6 seconds instead of 3.3 minutes. Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Pallavi G <pallavi.g@intel.com>	2017-10-19 10:19:20 -07:00
Kenneth Graunke	3d112a7cd4	i965: Move fs_inst::has_side_effects()'s eot check to the parent class. This eliminates a layer of wrapping, and makes a backend_instruction sufficient. The downside is that it exposes 'eot' to the vec4 backend, which it doesn't need, but can basically happily ignore. Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Pallavi G <pallavi.g@intel.com>	2017-10-19 10:19:20 -07:00
Roland Scheidegger	77b8392858	tgsi: fix tgsi_util_get_inst_usage_mask The logic for handling shadow coords was completely broken. Fixes `be3ab867bd`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103265 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-19 16:33:39 +02:00
Emil Velikov	a6c55243b9	docs: update calendar, add news item and link release notes for 17.2.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-19 13:31:59 +01:00
Emil Velikov	d5fdc37263	docs: add sha256 checksums for 17.2.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `facc851818`)	2017-10-19 13:31:59 +01:00
Emil Velikov	b1605550a6	docs: add release notes for 17.2.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `28dc4b64f2`)	2017-10-19 13:31:59 +01:00
Iago Toral Quiroga	2d87caa279	glsl/linker: produce error when invalid explicit locations are used We only need to add a check to validate output locations here. For inputs with invalid locations we will fail to link when we can't find a matching output in the same (invalid) location. v2: compute location slots properly depending on shader stage and variable type / direction Fixes: KHR-GL45.enhanced_layouts.varying_location_limit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-19 11:27:12 +02:00
Iago Toral Quiroga	16631ca30e	i965/sbe: fix active components for SSO programs with over 16 inputs When we have up to 16 FS inputs, the SF unit will reorder our inputs to be consecutive, however, when we have more than 16 we need to read our inputs from the URB exactly as they have been output from the previous stage. This means that for SSO we have to consider if we have URB padding due to unused input locations. Specifically, this affects gen9 active components programming, since for things to work in scenarios with over 16 inputs that have padded regions we need to ensure that we program active components for the padded regions too. If we don't do this the hardware won't read the URB properly for inputs located after padded regions. Found empirically. Fixes (these also require a patch in CTS): KHR-GL45.enhanced_layouts.varying_locations KHR-GL45.enhanced_layouts.varying_array_locations Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-19 08:31:42 +02:00
Chris Wilson	b7c655f700	i965: Do not log a perf warning when mapping an idle bo We only want to scare the user away from causing a GPU stall for mapping a busy bo. The time taken to instantiate the set of pages for a buffer and their mmapping is unavoidable and flagging idle bo as being busy is "crying wolf". Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-19 07:12:39 +01:00
Matt Turner	e9796ebca7	i965: Use a union to bitcast a float ... which does not break C's aliasing rules.	2017-10-18 22:16:46 -07:00
Darren Salt	5767ce7d0d	drirc: Group a few games in the glthread whitelist together. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-10-19 03:28:34 +02:00
Darren Salt	80c20b29d8	drirc: Enable glthread for more games (Saints Row 4 & Gat out of Hell). “Saints Row: Gat out of Hell” benefits from this on slower CPUs in that usage spikes on individual cores are avoided, which in turn makes it harder to hit a bug which causes broken audio and the game to hang on exit. “Saints Row IV” appears to be fine either way, but also exhibits the audio breakage bug: glthread is therefore being enabled on the grounds that it should make it a little harder to hit that bug. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-10-19 03:28:34 +02:00
Samuel Pitoiset	535aa43df0	radv: reset dirty flags after flushing all states Move it to radv_cmd_buffer_flush_state() because if rasterizerDiscardEnable is true, the flags are not cleared. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-18 21:21:48 +02:00
Samuel Pitoiset	966d66f28f	radv: do not re-emit the index buffer for every draw call It can only be changed when CmdBindIndexBuffer() is called or when a secondary buffer is used. Though not always, but let's re-emit the packets in this situation for now. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-18 21:21:43 +02:00
Samuel Pitoiset	e5480be0d1	radv: remove useless mask operation in radv_cs_emit_draw_indexed_packet() This saves a few CPU cycles when CmdDrawIndexed() is used a lot. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-18 21:21:30 +02:00
Bas Nieuwenhuizen	fa226e9933	radv: Do not read from the disk cache with RADV_DEBUG=nocache. Otherwise the flag is borderline useless. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-18 20:37:10 +02:00
Alex Smith	2cccc74f56	radv: Set active_stages after getting cached shaders Fixes: `7d45d22fdd` ("radv: switch to using radv_create_shaders()") Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-18 20:37:10 +02:00
Alex Smith	f557673237	radv: Don't free NIR shaders if tracing Fixes a crash while generating a hang report. Fixes: `7d45d22fdd` ("radv: switch to using radv_create_shaders()") Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-18 20:37:10 +02:00
Marek Olšák	84f3afc2e1	Revert "egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}" This reverts commit `8cb84c8477`. This fixes crashing shader-db/run.	2017-10-18 20:23:42 +02:00
Marek Olšák	2cb9ab53dd	Revert "egl: drop EGL driver `name`" This reverts commit `6414d6bd8d`. This is needed to apply the next revert.	2017-10-18 20:23:24 +02:00
Miklós Máté	f37af5ec8d	st/mesa: set dimension for constants in ATI_fragment_shader This fixes an assertion failure introduced by `30a2f0dfd4`. Fixes: `30a2f0dfd4` ("radeonsi: add an assertion that only Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-10-18 19:36:53 +02:00
Michel Dänzer	8c9e7c9638	st/osmesa: include u_inlines.h for pipe_resource_reference Fixes build failure due to unresolved symbol. Fixes: `7561da367b` "st/mesa: Initialize textures array in st_framebuffer_validate" Trivial.	2017-10-18 18:44:58 +02:00
Michel Dänzer	7561da367b	st/mesa: Initialize textures array in st_framebuffer_validate And just reference pipe_resources to it in the validate callbacks. Avoids pipe_resource leaks when st_framebuffer_validate ends up calling the validate callback multiple times, e.g. when a window is resized. v2: * Use generic stable tag instead of Fixes: tag, since the problem could already happen before the commit referenced in v1 (Thomas Hellstrom) * Use memset to initialize the array on the stack instead of allocating the array with os_calloc. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2017-10-18 18:28:00 +02:00
Eric Engestrom	47273d7312	egl: set UseFallback if LIBGL_ALWAYS_SOFTWARE is set Suggested-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-18 17:25:41 +01:00
Eric Engestrom	6414d6bd8d	egl: drop EGL driver `name` The "DRI2" name was reported as confusing when printing EGL infos (one user reported thinking DRI3 was not working on his X server), and the only alternative is Haiku, which can only be used on a Haiku machine. The name therefore doesn't add any information that the user wouldn't know already, so let's just drop it. Cc: Kai Wasserbäch <kai@dev.carbon-project.org> Suggested-by: Emil Velikov <emil.l.velikov@gmail.com> Related-to: `b174a1ae72` ("egl: Simplify the "driver" interface") Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-18 17:25:41 +01:00
Eric Engestrom	d7e769abec	egl: drop always-false TestOnly option Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-18 17:25:41 +01:00
Nicholas Miell	3012885b3f	Fix the xf86vm meson dependency The pkg-config file is called xxf86vm. Signed-off-by: Nicholas Miell <nmiell@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-18 17:25:41 +01:00
Eric Engestrom	8cb84c8477	egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku} Note: dropping the EGL_BAD_ALLOC in egl_haiku because it's overwritten by the EGL_NOT_INITIALIZED in eglInitialize(). Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-18 17:25:41 +01:00
Eric Engestrom	4893673b15	egl_dri2: drop dri2_egl_driver struct Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-18 17:25:41 +01:00
Eric Engestrom	7823cfe9fe	egl_dri2: move glFlush out of struct dri2_egl_driver There's no reason to store this there, it doesn't depend on the driver. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-18 17:25:41 +01:00
Roland Scheidegger	3d0deed12a	llvmpipe: handle shader sample mask output This probably isn't all that useful for GL, but there are apis where sample_mask is a valid output even without msaa. Just discard the pixel if the sample_mask doesn't include the bit for sample 0. Reviewed-by: Brian Paul <brianp@vmware.com>	2017-10-18 18:16:44 +02:00
Vinson Lee	c5124fbc74	anv: Fix instance typos. Fix build error. CC vulkan/vulkan_libvulkan_common_la-anv_device.lo In file included from vulkan/anv_device.c:33:0: vulkan/anv_device.c: In function ‘anv_AllocateMemory’: vulkan/anv_device.c:1562:37: error: ‘struct anv_device’ has no member named ‘instace’; did you mean ‘instance’? result = vk_errorf(device->instace, device, ^ vulkan/anv_private.h:317:17: note: in definition of macro ‘vk_errorf’ __vk_errorf(instance, obj, REPORT_OBJECT_TYPE(obj), error,\ ^~~~~~~~ Fixes: `9775894f10` ("anv: Move size check from anv_bo_cache_import() to caller (v2)") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-18 09:08:08 -07:00
Brian Paul	e17aa6cd9d	mesa: fix trivial typo in _mesa_PixelMapusv() error string Signed-off-by: Brian Paul <brianp@vmware.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103323	2017-10-18 09:53:00 -06:00
Eric Engestrom	2515eb63f8	meson: move expat dependency where it's needed Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-18 14:27:20 +01:00
Hongxu Jia	05fc62d89f	automake: intel: move expat handling where it's used Linking libvulkan_intel.so can fail, due to unresolved references to libexpat.so. EXPAT_CFLAGS should be moved as well. Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-18 14:27:20 +01:00
Timothy Arceri	e5e9e21e9f	radv: don't create dummy fs when compiling compute stage Fixes: `d1c9f30d7f` "radv: add radv_create_shaders() helper" Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-18 22:47:53 +11:00
Samuel Pitoiset	e6b9abf294	radv: use the dispatch initiator for indirect dispatches Missed that when I allowed waves to be launched out-of-order. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-18 11:22:41 +02:00
Samuel Pitoiset	095e709717	radv: remove XtoY_temps structs Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-18 11:22:39 +02:00
Tapani Pälli	6ef9bea734	anv: Install as Vulkan HAL module in Android.mk build Now that anvil fully implements the Vulkan HAL interface, we can install it as the vendor HAL module at /vendor/lib/hw/vulkan.${board}.so. To do so: - Rename LOCAL_MODULE to vulkan.$(TARGET_BOARD_PLATFORM). - Use LOCAL_PROPRIETARY_MODULE to install under vendor path. Tested by running different Sascha Willems demos on Android-IA. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> [chadv: Extract this hunk from Tapani's patch, and embed it as stand-alone patch in my arc-vulkan series]. Signed-off-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-18 00:23:38 -07:00
Chad Versace	053d4c328f	anv: Implement VK_ANDROID_native_buffer (v9) This implementation is correct (afaict), but takes two shortcuts regarding the import/export of Android sync fds. Shortcut 1. When Android calls vkAcquireImageANDROID to import a sync fd into a VkSemaphore or VkFence, the driver instead simply blocks on the sync fd, then puts the VkSemaphore or VkFence into the signalled state. Thanks to implicit sync, this produces correct behavior (with extra latency overhead, perhaps) despite its ugliness. Shortcut 2. When Android calls vkQueueSignalReleaseImageANDROID to export a collection of wait semaphores as a sync fd, the driver instead submits the semaphores to the queue, then returns sync fd -1, which informs the caller that no additional synchronization is needed. Again, thanks to implicit sync, this produces correct behavior (with extra batch submission overhead) despite its ugliness. I chose to take the shortcuts instead of properly importing/exporting the sync fds for two reasons: Reason 1. I've already tested this patch with dEQP and with demo apps. It works. I wanted to get the tested patches into the tree now, and polish the implementation afterwards. Reason 2. I want to run this on a 3.18 kernel (gasp!). In 3.18, i915 supports neither Android's sync_fence, nor upstream's sync_file, nor drm_syncobj. Again, I tested these patches on Android with a 3.18 kernel and they work. I plan to quickly follow-up with patches that remove the shortcuts and properly import/export the sync fds. Non-Testing =========== I did not test at all using the Android.mk buildsystem. I may have broken it. Please test and review that. Testing ======= I tested with 64-bit ARC++ on a Skylake Chromebook and a 3.18 kernel.
The following pass (as of patchset v9): - a little spinning cube demo APK - several Sascha demos - dEQP-VK.info.* - dEQP-VK.api.wsi.android.* (except dEQP-VK.api.wsi.android.swapchain.*.image_usage, because dEQP wants to create swapchains with VK_IMAGE_USAGE_STORAGE_BIT) - dEQP-VK.api.smoke.* - dEQP-VK.api.info.instance.* - dEQP-VK.api.info.device.* v2: - Reject VkNativeBufferANDROID if the dma-buf's size is too small for the VkImage. - Stop abusing VkNativeBufferANDROID by passing it to vkAllocateMemory during vkCreateImage. Instead, directly import its dma-buf during vkCreateImage with anv_bo_cache_import(). [for jekstrand] - Rebase onto Tapani's VK_EXT_debug_report changes. - Drop `CPPFLAGS += $(top_srcdir)/include/android`. The dir does not exist. v3: - Delete duplicate #include "anv_private.h". [per Tapani] - Try to fix the Android-IA build in Android.vulkan.mk by following Tapani's example. v4: - Unset EXEC_OBJECT_ASYNC and set EXEC_OBJECT_WRITE on the imported gralloc buffer, just as we do for all other winsys buffers in anv_wsi.c. [found by Tapani] v5: - Really fix the Android-IA build by ensuring that Android.vulkan.mk uses Mesa's vulkan.h and not Android's. Insert -I$(MESA_TOP)/include before -Iframeworks/native/vulkan/include. [for Tapani] - In vkAcquireImageANDROID, submit signal operations to the VkSemaphore and VkFence. [for zhou] v6: - Drop copy-paste duplication in vkGetSwapchainGrallocUsageANDROID(). [found by zhou] - Improve comments in vkGetSwapchainGrallocUsageANDROID(). v7: - Fix vkGetSwapchainGrallocUsageANDROID() to inspect its VkImageUsageFlags parameter. [for tfiga] - This fix regresses dEQP-VK.api.wsi.android.swapchain.*.image_usage because dEQP wants to create swapchains with VK_IMAGE_USAGE_STORAGE_BIT. v8: - Drop unneeded goto in vkAcquireImageANDROID. [for tfiga] v8.1: (minor changes) - Drop errant hunks added by rerere in anv_device.c. - Drop explicit mention of VK_ANDROID_native_buffer in anv_entrypoints_gen.py.
[for jekstrand] v9: - Isolate as much Android code as possible, moving it from anv_image.c to anv_android.c. Connect the files with anv_image_from_gralloc(). Remove VkNativeBufferANDROID params from all anv_image.c funcs. [for krh] - Replace some intel_loge() with vk_errorf() in anv_android.c. - Use © in copyright line. [for krh] Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v5) Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> (v9) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v9) Cc: zhoucm1 <david1.zhou@amd.com> Cc: Tomasz Figa <tfiga@chromium.org>	2017-10-18 00:23:38 -07:00
Chad Versace	9775894f10	anv: Move size check from anv_bo_cache_import() to caller (v2) This change prepares for VK_ANDROID_native_buffer. When the user imports a gralloc handle into a VkImage using VK_ANDROID_native_buffer, the user provides no size. The driver must infer the size from the internals of the gralloc buffer. The patch is essentially a refactor patch, but it does change behavior in some edge cases, described below. In what follows, the "nominal size" of the bo refers to anv_bo::size, which may not match the bo's "actual size" according to the kernel. Post-patch, the nominal size of the bo returned from anv_bo_cache_import() is always the size of imported dma-buf according to lseek(). Pre-patch, the bo's nominal size was difficult to predict. If the imported dma-buf's gem handle was not resident in the cache, then the bo's nominal size was align(VkMemoryAllocateInfo::allocationSize, 4096). If it was resident, then the bo's nominal size was whatever the cache returned. As a consequence, the first cache insert decided the bo's nominal size, which could be significantly smaller compared to the dma-buf's actual size, as the nominal size was determined by VkMemoryAllocateInfo::allocationSize and not lseek(). I believe this patch cleans up that messy behavior. For an imported or exported VkDeviceMemory, anv_bo::size should now be the true size of the bo, if I correctly understand the problem (which I possibly don't). v2: - Preserve behavior of aligning size to 4096 before checking. [for jekstrand] - Check size with < instead of <=, to match behavior of commit `c0a4f56` "anv: bo_cache: allow importing a BO larger than needed". [for chadv]	2017-10-17 23:46:06 -07:00
Dylan Baker	fbf39fd7c3	meson: turn on pl111 not vc4 when pl111 driver specificed Reviewed-by: Eric Anholt <eric@anholt.net> fixes: `1918c9b162` ("meson: Add support for the pl111 driver.") Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-10-17 15:34:35 -07:00
Bas Nieuwenhuizen	06f05040eb	radv: Link shaders. Here we make use of NIR the linking helpers to remove unused varyings. Sascha Willems demo results: computecullandlod 39 -> 41 fps pipelines ~6100 -> ~6200 fps Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-10-18 09:19:35 +11:00
Timothy Arceri	dbbf10541b	radv: reuse the multiple shader store & load functions for gs copy variant Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-18 09:19:35 +11:00
Timothy Arceri	351f9dde60	radv: remove some now unused shader compile code Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-18 09:19:35 +11:00
Timothy Arceri	7d45d22fdd	radv: switch to using radv_create_shaders() Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-18 09:19:35 +11:00
Bas Nieuwenhuizen	d1c9f30d7f	radv: add radv_create_shaders() helper This is a combined shader creation helper that will help us to create the shaders for each stage at once. This will allow us to do some link time optimisations. Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-10-18 09:19:35 +11:00
Bas Nieuwenhuizen	ed9218f154	radv: add radv_hash_shaders() helper This will be used to create a hash of the combined shaders in the pipeline. Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-10-18 09:19:35 +11:00
Bas Nieuwenhuizen	7f29055751	radv: Add multiple shader cache store & load functions. Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-10-18 09:19:35 +11:00
Bas Nieuwenhuizen	670c02b430	radv: Change cache datastructures for combined pipelines. Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-10-18 09:19:35 +11:00
Timothy Arceri	56998558f4	radv: reorder init function calls Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-18 09:19:35 +11:00
Eric Anholt	4f3e380fa0	meson: Add support for the vc5 driver. v2: Default vc5 to off, since it requires the simulator currently. Add missing dep on the XML generation from libbroadcom_vc5. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v1)	2017-10-17 13:41:59 -07:00
Eric Anholt	1918c9b162	meson: Add support for the pl111 driver. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-17 13:41:59 -07:00
Eric Anholt	1ae8018a6a	meson: Add support for the vc4 driver. Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-17 13:41:59 -07:00
Marek Olšák	2f4705afde	radeonsi: if there's just const buffer 0, set it in place of CONST/SSBO pointer SI_SGPR_CONST_AND_SHADER_BUFFERS now contains the pointer to const buffer 0 if there is no other buffer there. Benefits: - there is no constbuf descriptor upload and shader load It's assumed that all constant addresses are within bounds. Non-constant addresses are clamped against the last declared CONST variable. This only works if the state tracker ensures the bound constant buffer matches what the shader needs. Once we get 32-bit pointers, we can only do this for user constant buffers where the driver is in charge of the upload so that it can guarantee a 32-bit address. The real performance benefit might not be measurable. These apps get 100% theoretical benefit in all shaders (except where noted): - antichamber - batman arkham origins - borderlands 2 - borderlands pre-sequel - brutal legend - civilization BE - CS:GO - deadcore - dota 2 -- most shaders - europa universalis - grid autosport -- most shaders - left 4 dead 2 - legend of grimrock - life is strange - payday 2 - portal - rocket league - serious sam 3 bfe - talos principle - team fortress 2 - thea - unigine heaven - unigine valley -- also sanctuary and tropics - wasteland 2 - xcom: enemy unknown & enemy within - tesseract - unity (engine) Changed stats only: SGPRS: 2059998 -> 2086238 (1.27 %) VGPRS: 1626888 -> 1626904 (0.00 %) Spilled SGPRs: 7902 -> 7865 (-0.47 %) Code Size: 60924520 -> 60982660 (0.10 %) bytes Max Waves: 374539 -> 374526 (-0.00 %) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	854593b8eb	ac: clean up ac_build_indexed_load function interfaces Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	cdb21dfffa	radeonsi: handle 64-bit loads earlier in fetch_constant Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	ee0e1a47ce	radeonsi: add si_descriptors::gpu_address and remove buffer_offset This allows us to change the pointer arbitrarily. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	6d2664880c	radeonsi: unify code for extracting a buffer address from a descriptor Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	8d2685d129	radeonsi: remove atom parameter from si_upload_descriptors Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	4ddce1b1a4	radeonsi: pack si_descriptors better again Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	859eeffb3d	radeonsi: emit dirty consecutive pointers in one SET_SH_REG packet IB size: -1.6% Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	36626ffe46	radeonsi: split si_emit_shader_pointer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	69325fa88d	radeonsi: generalize the SI_VS_SHADER_POINTER_MASK macro Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	79c2e7388c	radeonsi/gfx9: use SPI_SHADER_USER_DATA_COMMON IB size: -0.4% Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	b762a08896	radeonsi/gfx9: move RW_BUFFERS from s[0:1] to s[8:9] for HS and GS Let's use the same user data SGPRs in all stages. (for SPI_SHADER_USER_DATA_COMMON_0) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	0aafedbbb2	radeonsi: add GFX-IB-size query to the HUD It shows the sum of all IBs per frame. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	4d944c72b1	winsys/amdgpu: disable CPU caching for GFX & SDMA IBs This should decrease IB fetch latency. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	49f5ce39c1	winsys/amdgpu: don't do read-modify-write on command buffers i.e. don't use \|= Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Eric Anholt	cde209960c	broadcom/vc4: Fix false-positive for the tiling ioctls on simulator mode. If there happened to be an ENOENT lying around, we would try using the ioctls later and fail out resource allocation.	2017-10-17 12:35:16 -07:00
Eric Anholt	b202f90f65	broadcom/vc4: Skip BO labeling when in simulator mode. It was calling down into i915 trying to label the BO, which is definitely not the right thing.	2017-10-17 12:35:16 -07:00
Eric Anholt	d623a34ab2	broadcom/vc5: Don't forget to set the RT format for 1555 textures. Fixes dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb5_a1	2017-10-17 12:35:16 -07:00
Chad Versace	b5dc551014	anv: Add func anv_gem_get_tiling() Will use in VK_ANDROID_native_buffer. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-17 11:08:26 -07:00
Chad Versace	eb69a61806	anv: Move close(fd) from anv_bo_cache_import to its callers (v2) This will allow us to implement VK_ANDROID_native_buffer without dup'ing the fd. We must close the fd in VK_KHR_external_memory_fd, but we should not in VK_ANDROID_native_buffer. v2: - Add missing close(fd) for case VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR, subcase ANV_SEMAPHORE_TYPE_BO. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-17 11:08:26 -07:00
Chad Versace	076a279a1a	anv: Add field anv_image::planes[]::bo_is_owned (v2) If this flag is set, then the image and the bo have the same lifetime. vkDestroyImage will release the bo. We need this for VK_ANDROID_native_buffer, because that extension creates the VkImage and imports its memory during the same call, vkCreateImage. v2: Rebase onto VK_KHR_bind_memory2. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-17 11:08:26 -07:00
Chad Versace	a9ca8f370d	anv: Better support for Android logging (v2) In src/intel/vulkan/*, redirect all instances of printf, vk_error, anv_loge, anv_debug, anv_finishme, anv_perf_warn, anv_assert, and their many variants to the new intel_log functions. I believe I caught them all. The other subdirs of src/intel are left for a future exercise. v2: - Rebase onto Tapani's VK_EXT_debug_report changes. - Drop unused #include <cutils/log.h>. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-17 11:08:26 -07:00
Chad Versace	aa716db0f6	intel: Add simple logging façade for Android (v2) I'm bringing up Vulkan in the Android container of Chrome OS (ARC++). On Android, stdio goes to /dev/null. On Android, remote gdb is even more painful than the usual remote gdb. On Android, nothing works like you expect and debugging is hell. I need logging. This patch introduces a small, simple logging API that can easily wrap Android's API. On non-Android platforms, this logger does nothing fancy. It follows the time-honored Unix tradition of spewing everything to stderr with minimal fuss. My goal here is not perfection. My goal is to make a minimal, clean API, that people hate merely a little instead of a lot, and that's good enough to let me bring up Android Vulkan. And it needs to be fast, which means it must be small. No one wants their game to miss frames while aiming a flaming bow into the jaws of an angry robot t-rex, and thus become t-rex breakfast, because some fool had too much fun designing a bloated, ideal logging API. If people like it, perhaps we should quickly promote it to src/util. The API looks like this: #define INTEL_LOG_TAG "intel-vulkan" #define DEBUG intel_logd("try hard thing with foo=%d", foo); n = try_foo(...); if (n < 0) { intel_loge("%s:%d: foo failed bigtime", __FILE__, __LINE__); return VK_ERROR_DEVICE_LOST; } And produces this on non-Android: intel-vulkan: debug: try hard thing with foo=93 intel-vulkan: error: anv_device.c:182: foo failed bigtime v2: Fix meson build. [for dcbaker] Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-17 11:08:26 -07:00
Tapani Pälli	3555d36139	anv/android: Link to libsync, liblog in Android.mk chadv: I made this patch by extracting the hunk from Tapani's patch in https://lists.freedesktop.org/archives/mesa-dev/2017-September/169602.html. Signed-off-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-17 11:08:26 -07:00
Chad Versace	3791fe23af	anv/android: Link to Android libraries in the autotools build A first step to supporting Vulkan on ARC++. Mesa on ARC++ uses Autotools, not Android.mk. Doing this now, even before VK_ANDROID_native_buffer is implemented, allows us to incrementally add Android support to the Autotools build. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-17 11:08:26 -07:00
Eric Engestrom	320018be77	meson: s/radv_extensions/radv_extensions_c/ to respect var convention Suggested-by: Dylan Baker <dylan@pnwbakers.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-17 19:07:09 +01:00
Eric Engestrom	1f0e80f897	meson: track python script dependency Suggested-by: Andres Gomez <agomez@igalia.com> Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-17 19:07:03 +01:00
Henri Verbeet	3de87f7cd7	vulkan/wsi: Free the event in x11_manage_fifo_queues(). Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Henri Verbeet <hverbeet@gmail.com> Fixes: `e73d136a02` ("vulkan/wsi/x11: Implement FIFO mode.") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-17 17:17:15 +01:00
Eric Engestrom	cde7859273	meson: add missing radv_extensions.c generation for libvulkan_radeon Fixes: `17201a2eb0` "radv: port to using updated anv entrypoint/extension generator." Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Andres Gomez <agomez@igalia.com>	2017-10-17 16:19:21 +01:00
Jason Ekstrand	759ab66db0	anv/apply_pipeline_layout: Use nir_tex_instr_remove_src Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-17 07:36:00 -07:00
Jason Ekstrand	41c75b5354	nir: Add a helper for adding texture instruction sources Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-17 07:36:00 -07:00
Mark Thompson	31fb7bbe0b	st/va: Return correct width and height for encode/decode support Previously this would return the largest possible buffer size, which is much larger than the codecs themselves support. This caused confusion when client applications attempted to decode 8K video thinking it was supported when it isn't. Signed-off-by: Mark Thompson <sw@jkqxz.net> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-10-17 08:23:55 -04:00
Mark Thompson	ba28c1c9f7	st/va: Fix config entrypoint handling Consistently use it as a PIPE_VIDEO_ENTRYPOINT. v2: Return an error if the entrypoint is not set (Christian). Signed-off-by: Mark Thompson <sw@jkqxz.net> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-10-17 08:23:55 -04:00
Mark Thompson	b6f41e393e	st/va: Disable vaExportSurfaceHandle() This is not in libva 2.0, so it shouldn't be enabled yet. Signed-off-by: Mark Thompson <sw@jkqxz.net> Acked-by: Christian König <christian.koenig@amd.com>	2017-10-17 08:23:55 -04:00
Dave Airlie	35c66f3e40	radv/image: bump all the offset to uint64_t. So one of the CTS tests tries to allocate a 16384x1 2048 array texture. This overflows a bunch of calculations when we want it tiled as the height goes to 128. addrlib returns us the correct size (16GB or so), but we mangle it in the htile calcs due to the 32-bit offset fields, then userspace gives us the reduced number and we try to allocate it on a heap and things blow up. We really need to give the app back the correct size for the image so we can blow up properly in memory allocation later. This should fix hangs in dEQP-VK.pipeline.render_to_image.core.1d_array.huge.width_layers.r8g8b8a8_unorm_d32_sfloat_s8_uint since Fixes: `ad3d98da9f` (radv: enable tc compatible htile for d32s8 also.) Now there's an open question if we should be enabling tc-compat htile at all for shallow textures like the above. This might cause some other weird side effects in CTS even without the tc compat so: Cc: "17.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-17 08:28:48 +01:00
Dave Airlie	17201a2eb0	radv: port to using updated anv entrypoint/extension generator. This ports radv to using the anv entrypoint/extension generator code. No differences on enabled extensions list in vulkaninfo. Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-17 16:50:32 +10:00
Dave Airlie	c00256a12c	radv: enable VK_KHX_multiview always. This was in the wrong place. Fixes: `ba51ad2f2` (radv: Expose VK_KHX_multiview.) Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-17 16:48:14 +10:00
Marek Olšák	5d071bf04b	Revert "mesa: fix texture updates for ATI_fragment_shader" This reverts commit `9d54025cd1`. It breaks KOTOR. Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>	2017-10-17 04:16:17 +02:00
Miklós Máté	1b86dbc144	mesa: remove redundant NULL check in update_single_program_texture_state update_single_program_texture() never returns NULL. Signed-off-by: Miklós Máté <mtmkls@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-10-17 04:16:17 +02:00
Dylan Baker	43a6e84927	meson: build mesa test. v2: - add dependency on dispatch.h generator (which this test needs) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> (v1)	2017-10-16 16:39:26 -07:00
Dylan Baker	c7081a3b08	.travis: Don't build gallium drivers in non-gallium test targets Simply disable gallium in non-gallium builds. For some reason the gallium driver won't link on ubuntu 14.04 (it will on 16.04, debian testing, and arch) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-10-16 16:32:43 -07:00
Dylan Baker	61631be3a9	meson: refactor meson_options To put one argument on each line. This results in the file being much longer, but I think much more readable. Suggested-by: Eero Tamminen <eero.t.tamminen@intel.com> Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	6a9ad20b7c	meson: build llvmpipe Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	de24d61765	meson: build softpipe This doesn't include llvmpipe. v2: - Fix inconsistent use of with_gallium_swrast and with_gallium_softpipe. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	813b4b09f9	meson: build nouveau (gallium) driver Tested with a GK107. v2: - Add target for nouveau standalone compiler. This target is not built by default. v3: - Add nouveau to list of drivers built by default Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric at anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	b154b44ae3	meson: build radeonsi gallium driver This hooks up the bits necessary to build gallium dri drivers, with radeonSI as the first example driver. This isn't tested yet. v4: - drop radeonsi generated header from sources. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric at anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	66c94b9313	meson: build gallium winsys for dri, null, and wrapper Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric at anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	66f97f6640	meson: build radeonsi This builds the radeonsi (and radeon) window system bits and gallium driver bits. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric at anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	f3d03a2cf7	meson: Build gallium dri state tracker Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric at anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	4d701ee969	meson: build gallium helper drivers This builds ddebug, noop, rbug, and trace drivers. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric at anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	d451a11b21	meson: Build gallium pipe-loader Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-10-16 16:32:43 -07:00
Dylan Baker	50c28dfa81	meson: split and simplify dependencies Rather than group dependencies in complex groups, use a flatter structure with split dependencies to avoid checking for the same dependencies twice. v2: - Fix building vulkan drivers without gallium or dri drivers v3: - Drop TODO comment that is done - Fix typo in commit message Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	b1b65397d0	meson: Build gallium auxiliary v2: - guard gallivm files with "with_llvm" instead of "dep_llvm.found()" Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> (v1)	2017-10-16 16:32:43 -07:00
Dylan Baker	af9d276134	meson: build libmesa_gallium Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	02cf3a8f39	meson: Add option to toggle LLVM Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	8e611878c4	meson: always set GLX_USE_TLS This can be applied to all GLX implementations, and in autotools this is guarded only by the --enable-glx-tls flag. Since this is on by default in autotools, and is strictly better than being off, the meson build doesn't even have a toggle for it. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	90b5ec6c5f	meson: Don't try to install dri drivers unless one is built This confused the with_dri flag which is meant to control Direct Rendering Infrastructure, not classic drivers Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	601bd7296f	meson: Set _GNU_SOURCE When we start adding non-free software platforms support we'll need to guard this, but for now it should be fine as is. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	e21e0a6a70	meson: add checks for version script and dynamic list These are used by gallium drivers. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	e4796ab7c8	configure: commit test files These are currently auto-generated, but meson needs the same files, so let's commit them to reduce duplication. v3: - Rename .build to build-support Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>	2017-10-16 16:32:43 -07:00
Dylan Baker	3b209e9304	meson: Add switch for texture float Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-16 16:32:43 -07:00
Kenneth Graunke	9e779e59b2	Revert "i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2" This reverts commit `d80cbbeaff`. It turns out that formats do matter - the framebuffer's miptree has an sRGB format, and the one we created did not. This broke rendering when using KWin compositing, GNOME Terminal Fedora (with a transparent background), and Qt menu rendering in general, to name a few. It's been a month and this hasn't been fixed, and I'm sick of reverting this patch or applying NAK'd hacks and restarting various programs at random times every day, multiple times a day, to keep my desktop environment functional. The only benefit of this patch was to prepare the way for modifiers, which AFAIK aren't finished yet anyway, so there's really no downside to reverting it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102924	2017-10-16 16:02:53 -07:00
Rob Herring	c6e584f194	Android: add libmesa_nir dependency to libmesa_dricore Commit `32fcced7b4` ("meta: Unset the textures_used_by_txf bitfield.") added a dependency in libmesa_dricore to NIR headers, but failed to add libmesa_nir as a dependency resulting in a build error: In file included from external/mesa3d/src/mesa/drivers/common/meta.c:90: external/mesa3d/src/compiler/nir/nir.h:48:10: fatal error: 'nir_opcodes.h' file not found Add libmesa_nir as a static library dependency to libmesa_dricore. Fixes: `32fcced7b4` ("meta: Unset the textures_used_by_txf bitfield.") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Rob Herring <robh@kernel.org>	2017-10-16 14:49:37 -05:00
Chris Wilson	2c4097aff1	i965: Only put external handles into the handle ht We know that we will only ever need to lookup an external handle and so can defer adding a bo to the external ht until it is ever exported or imported, keeping that hashtable compact. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-16 11:52:59 -07:00
Eric Engestrom	b05820621d	svga: format the version string like the rest of mesa All 4 other version strings do it like this. ((Also, double parentheses just look confusing)) Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-10-16 18:52:41 +01:00
Eric Engestrom	16be271c6e	git_sha1_gen: use git_sha1.h.in on all build systems Meson already uses this, let's get the other build sys to use it too. Note: rstrip() was dropped, as truncating to the first 10 chars already gets rid of the terminating newline (not an issue with the env var either, unless maliciously crafted to break the build... not sure this is a real-world issue). Verified to work and give the same output as before on both python 2 and 3 :) Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2017-10-16 18:52:35 +01:00
Brian Paul	4542a63254	svga: fix format_conversion_table breakage The new A1B5G5R5_UNORM, X1B5G5R5_UNORM formats were added in the wrong place in commit `ef874ee450`. Fixes: `ef874ee450` "gallium: Add support for 5551 with the 1-bit field in the low bit." Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>	2017-10-16 10:58:02 -06:00
Jason Ekstrand	92d3f21ec2	i965/miptree: Drop the invalidate parameter from copy_teximage This was a leftover from i915. The one caller in i965 always passes in false so there's no point in having the parameter. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-16 08:06:02 -07:00
Jason Ekstrand	b03b19f558	anv: Get rid of gen fall-through In the early days of the Vulkan driver, we thought it would be a good idea to just make genN just fall back to the genN-1 code if it didn't need to be any different for genN. While this seemed like a good idea, it ultimately ended up being far simpler to just recompile everything. We haven't been using the fall-through functionality for some time so we're better off just deleting it so it doesn't accidentally start causing problems. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-16 08:04:56 -07:00
Jason Ekstrand	9cec35579c	intel/common: Improve the comments for sample positions These are pulled directly from brw_multisample_state.h Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-10-16 08:04:56 -07:00
Samuel Pitoiset	f16382d35b	radv: update ia_multi_vgt when executing secondary buffers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-16 14:50:30 +02:00
Samuel Pitoiset	47d7d18613	radv: be smarter with the draw packets when executing secondary buffers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-16 14:50:27 +02:00
Samuel Pitoiset	b253f3189a	radv: always dirty some states after executing secondary buffers The spec requires the number of buffer to be greater than 0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-16 14:09:51 +02:00
Samuel Pitoiset	4e65b4ea4b	radv: be smarter with pipelines when emitting secondary buffers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-16 14:09:51 +02:00
Jakob Bornecrantz	67dd52e7e8	docs: Add EXT_memory_objects extensions to features.txt These extensions are good for Vulkan interop, so track them. Signed-off-by: Jakob Bornecrantz <jakob.bornecrantz@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-16 11:05:41 +01:00
Timothy Arceri	f1eb5e6399	nir: add component level support to remove_unused_io_vars() Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-16 09:06:53 +11:00
Timothy Arceri	9f7127f5d2	glsl: mark xfb inputs as always_active_io We won't split varyings marked as always active because there is no point in doing so. This means we need to mark both sides of the interface as always active otherwise we will have a mismatch and start removing things we shouldn't. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-16 09:06:53 +11:00
Timothy Arceri	6af5e0bec9	nir: add variant of lower_io_to_scalar to be called earlier This is intended to be called before nir_lower_io() so that we can do some linking optimisations with the results. It can also be used with drivers that don't use nir_lower_io() at all such as RADV. v2: pass mode mask rather than first and last stage integer. Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-16 09:06:53 +11:00
Timothy Arceri	3b59f5ca17	nir: add glsl_channel_type() helper Reviewed-by: Eric Anholt <eric@anholt.net>	2017-10-16 09:06:53 +11:00
Timothy Arceri	421c1b9bd6	nir: add glsl_type_is_64bit() to nir_types Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-16 09:06:53 +11:00
Ilia Mirkin	790b5c4a38	a2xx: add support for a few 16-bit color rendering formats The rest should be possible too, just needs some additional investigation. Passes fbo-*-formats piglit tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2017-10-15 12:09:21 -04:00
Wladimir J. van der Laan	d3af7f5153	freedreno/a20x: Enable rendering to RGBA/RGBX Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-15 12:09:14 -04:00
Wladimir J. van der Laan	c10eeb454d	freedreno/a20x: Fix rendering to BGRX Make sure that BGRX rendering is swapped the correct way around. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-10-15 12:09:03 -04:00
Brian Paul	c7a81dcea9	mesa: minor simplification in test_attachment_completeness() We already have a pointer to the texture object. Use it here. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-10-14 10:30:27 -06:00
Lucas Stach	4daee6733f	etnaviv: rework TS enable to be a derived state Draw operations should not use the TS if the TS buffer content is invalid, as this leads to wrong rendering or even GPU hangs. As the TS valid status can change between draws (clear operations changing it to valid, blits using the RS to the color or ZS buffer changing it to invalid), the TS_MEM_CONFIG must be updated before each draw if the status has changed. This fixes the remaining TS related piglit failures (regressions of a standard run against a piglit run with TS completely disabled). Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-10-14 16:40:08 +02:00
Lucas Stach	34360ac6ed	etnaviv: skip unused vertex attributes when assigning VS inputs When not all of the vertex attributes are actually used in the shader, we end up with some inputs without an assigned reg. Those are marked as invalid and must be skipped when assigning the inputs, as those would overwrite other valid inputs otherwise. Fixes piglit drawpixels and a bunch of other tests using the st_draw path. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>	2017-10-14 16:39:46 +02:00
Samuel Pitoiset	0c1aecf177	radv: do not allocate CMASK for non-MSAA images with 128 bit formats This saves some useless CMASK initializations/eliminations in the Vulkan SSAO demo. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-14 12:25:48 +02:00
Samuel Pitoiset	a4c08c8cd5	radv: set correct INDEX_TYPE for indexed indirect draws on GFX9 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-14 12:05:19 +02:00
Samuel Pitoiset	3e5f27faf3	radv: add the draw count buffer to the list of buffers My guess is that the GPU is going to report VM faults if vkCmdDrawIndirectCountAMD() (and friends) are used. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-14 12:05:19 +02:00
Jason Ekstrand	1cec500c69	blob: Use intptr_t instead of ssize_t ssize_t is a GNU extension and is not available on Windows or MacOS. Instead, we use intptr_t which should be effectively equivalent and is part of the C standard. This should fix the Windows and Mac OS builds. Fixes: `3af1c82989` Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103253 Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Tested-by: Vinson Lee <vlee@freedesktop.org>	2017-10-13 15:02:34 -07:00
Kenneth Graunke	77d3d71f23	i965: Rename brw->no_batch_wrap to intel_batchbuffer::no_wrap This really makes more sense in the intel_batchbuffer struct. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-10-13 11:16:41 -07:00
Kenneth Graunke	d22bc4ba52	i965: Delete dead brw_context fields. fast_clear_op is leftover from the meta-fast-clear days. No idea what the other thing was for, but it isn't used now. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-10-13 11:16:41 -07:00
Emil Velikov	9b753e8ca3	mapi/shared-glapi/test: rework glapitable.h handling Currently all the build systems but Meson generate the header in src/mapi/glapi. Meson cannot do that since: - it does not allow user control over the location of output files - moving the generation rule(s) causes explosion due to the unusual structure of glapi and friends - copying the file into the correct location is a non-trivial task To work around the above deficiency in the least invasive way, let's adjust the #include directive and add a few -I flags to the autotools build. Note: both builddir and srcdir should be used. Otherwise building from a release tarball fails badly. Cc: Dylan Baker <dylanx.c.baker@intel.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-13 11:12:08 -07:00
Dylan Baker	142dc8b9de	meson: fix blob test includes Since blob.h moved up to src/compiler the test should include that instead of src/compiler/glsl fixes: `0e3bd56c6e` ("compiler: Move blob up a level") Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-13 10:40:23 -07:00
Emil Velikov	ee779c93d5	Revert "make: Fix test to be meson compatible" This reverts commit `fc48ad2427`. That commit references the previous commit as its justification for changing behaviour. Unlike said commit, though, there's nothing obviously wrong there. I'll take a closer look at why Meson fails to pick up the file, but in the interim reverting this commit fixes the normal distcheck target.	2017-10-13 14:57:33 +01:00
Mark Thompson	e7f24859ca	st/dri: Add definitions to allow importing 16-bit surfaces Necessary to support P010/P016 surfaces for video. Signed-off-by: Mark Thompson <sw@jkqxz.net> Acked-by: Leo Liu <leo.liu@amd.com>	2017-10-13 08:11:47 -04:00
Mario Kleiner	556037f131	i965: Complete 'expose RGBA visuals only on Android' Commit `731ba6924a` "expose RGBA visuals only on Android" replaced ARRAY_SIZE(formats) by num_formats, but there are 3 loops which add configs, and only one was updated to num_formats. Also update loops for configs with accumulation buffer and multisample configs. Fixes: `731ba6924a` "i965: expose RGBA visuals only on Android" Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-13 12:56:13 +01:00
Emil Velikov	df3a430180	configure.ac: add missing LLVM components for OpenCL Coverage and LTO seems to be hard requirements for Clang, while coroutines is needed as of LLVM/Clang 4.0. Mark the last one as "optional" so we handle every case. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-10-13 12:56:13 +01:00
Emil Velikov	36d6d1e931	configure.ac: add llvm_add_optional_component helper We want to add "optional" components, which have been added with later LLVM versions. One such in-tree example is inteljitevents. Others are to follow shortly. v2: Use the correct function, add blank line between functions (Tobias) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-10-13 12:56:13 +01:00
Emil Velikov	a7ecf7b86f	Travis: add binutils 2.26 for a few more LLVM 3.9 builds Otherwise we error out at link stage as follows: /usr/lib/llvm-3.9/lib/libLLVMAMDGPUCodeGen.a(R600OptimizeVectorRegisters.cpp.o): unrecognized relocation (0x2a) in section `.text._ZNK12_GLOBAL__N_119R600VectorRegMerger16getAnalysisUsageERN4llvm13AnalysisUsageE' /usr/bin/ld: final link failed: Bad value Cc: mesa-stable@lists.freedesktop.org Cc: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-13 12:56:13 +01:00
Emil Velikov	13a53c4f5c	configure.ac: rework llvm libs handling for 3.9+ Earlier versions need different quirks, but as of LLVM 3.9 llvm-config provides --link-shared/link-static toggles. The output of which seems to be reliable - looking at LLVM 3.9, 4.0 and 5.0. Note that the earlier code will still be used for pre-3.9 LLVM and is unchanged. This effectively fixes LLVM static linking, while providing a clearer and more robust solution for future versions. Mildly interesting side notes: - build-mode (introduced with 3.8) was buggy with 3.8 It shows "static" when built with -DLLVM_LINK_LLVM_DYLIB=ON, yet it was consistent with --libs. The latter shows the static libraries. - libnames and libfiles are broken with LLVM 3.9 The library prefix and extension is printed twice liblibLLVM-3.9.so.so v2: Invoke llvm-config twice, instead of using sed, to combine the two lines into one (Tobias) Cc: mesa-stable@lists.freedesktop.org Cc: Dieter Nützel <Dieter@nuetzel-hh.de> Cc: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-10-13 12:56:12 +01:00
Emil Velikov	98fdff7247	configure.ac: factor out detection for old and buggy llvm As of LLVM 3.9 one could use consistent ways to handle the component. Factor out the current handling, as it will be used for older versions. Cc: mesa-stable@lists.freedesktop.org Cc: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-10-13 12:56:12 +01:00
Emil Velikov	9032e2cdcc	configure.ac: remove no longer necessary llvm-config --libs check Prior to the refactor/cleanup by Tobias one could add an invalid component to LLVM_COMPONENTS. Since that's no longer the case we can drop the current check. Cc: Tobias Droste <tdroste@gmx.de> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tobias Droste <tdroste@gmx.de>	2017-10-13 12:56:12 +01:00
Emil Velikov	66ebdfbd44	eglmesaext: add forward declaration for struct wl_buffers The user does not need to know the specifics of the struct, as only a pointer to it is used. Just forward declare the struct making the header self-contained. v2: Remove deprecation warning text/bugzilla - patch does not help there. Cc: Greg V <greg@unrelenting.technology> Fixes: `5cddb1ce3c` ("wayland: Add an extension to create wl_buffers from EGLImages") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)	2017-10-13 12:56:12 +01:00
Emil Velikov	a14ecdab16	configure.ac: bump Clover LLVM requirement to 3.9 The only driver that utilises Clover already depends on LLVM 3.9. Close to every supported distribution has said version. Additionally libclc also requires LLVM 3.9. With this in mind, we can safely bump the requirement. There is a handful of dead code that we could remove, which will be resolved with later commits. Note: this drops the LLVM 3.6 build from the Travis build. LLVM 3.9 (and later) are already covered in there. https://lists.freedesktop.org/archives/mesa-dev/2017-September/170028.html v2: Add reference to discussion thread (Eric), adjust libclc LLVM req. (Jan). Cc: Aaron Watry <awatry@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> Acked-by: Vedran Miletić <vedran@miletic.net> Acked-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Francisco Jerez <currojerez@riseup.net>	2017-10-13 12:56:12 +01:00
Emil Velikov	acb84ffbc7	wayland-drm: constify the callbacks struct, take 2 Now that wayland-drm (correctly) keeps a local copy of the callbacks, this should no longer cause explosions. After all, the symbol is local, constant data. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Tested-by: Derek Foreman <derekf@osg.samsung.com>	2017-10-13 12:56:12 +01:00
Emil Velikov	0cfd6f6cfc	wayland-drm: use a copy of the wayland_drm_callbacks struct The callbacks may be called even when they are no longer valid. Say, the user is dlclose(ing) libEGL while the buffers are being destroyed. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Tested-by: Derek Foreman <derekf@osg.samsung.com>	2017-10-13 12:56:12 +01:00
Emil Velikov	872a373bc8	egl/dri: don't crash when createImageFromRenderbuffer2 is NULL The __DRI_IMAGE version can be 17 or over, while the function pointer is NULL. Guard for that instead of crashing. Fixes: `bad24395d9` ("egl/dri: use createImageFromRenderbuffer2 when available") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-13 12:56:12 +01:00
Ville Syrjälä	2289964f4f	meson: Build i915 Build i915 with meson. More or less copied from i965, with all the unneeded cruft removed, and the libdrm_intel dependency added. Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Eric Anholt <eric@anholt.net> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-13 14:29:00 +03:00
Ville Syrjälä	66b1597a88	meson: Fix xf86vm dep The pkg-config file is called xxf86vm.pc not xf86vm.pc. Cc: Dylan Baker <dylan@pnwbakers.com> Cc: Eric Anholt <eric@anholt.net> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2017-10-13 14:28:41 +03:00
Jason Ekstrand	79d403417c	intel/cs: Make thread_local_id a regular builtin param This is a lot more natural than special casing it all over the place. We still have to do a bit of special-casing in assign_constant_locations but it's not special-cased quite as bad as it was before. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:31 -07:00
Jason Ekstrand	8d90e28839	intel/compiler: Allocate pull_param in assign_constant_locations Now that everything is nicely ralloc'd, we can allocate the pull_param array in assign_constant_locations instead of higher up. We can also re-allocate the param array so that it's exactly the needed size. This should save us some memory because we're not allocating the total needed param space for both push and pull. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:31 -07:00
Jason Ekstrand	29737eac98	intel: Allocate prog_data::[pull_]param deeper inside the compiler Now that we're always growing the param array as-needed, we can allocate the param array in common code and stop repeating the allocation everywhere. In order to keep things sane, we ralloc the [pull_]param array off of the compile context and then steal it back to a NULL context later. This doesn't get us all the way to where prog_data::[pull_]param is purely an out parameter of the back-end compiler but it gets us a lot closer. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:31 -07:00
Jason Ekstrand	c3d54d0375	ralloc: Allow reparenting to a NULL context Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:31 -07:00
Jason Ekstrand	2e317a4b6d	anv/pipeline: Refactor setup of the prog_data::param array Now that the only thing we put in the array up-front are client push constants, we can simplify anv_pipeline_compile a bit. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:31 -07:00
Jason Ekstrand	6b31229592	anv/pipeline: Grow the param array for images Before, we were calculating up-front and then filling in later. Now we just grow as needed in anv_nir_apply_pipeline_layout. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:31 -07:00
Jason Ekstrand	63c938fd18	anv/pipeline: Whack nir->num_uniforms to MAX_PUSH_CONSTANT_SIZE This way any image uniforms end up having locations higher than MAX_PUSH_CONSTANT_SIZE. There's no bug here at the moment, but this consistency will make the next commit easier. Also, because nir_apply_pipeline_layout properly increments nir->num_uniforms when it expands the param array, we no longer need to stomp it to match prog_data::nr_params because it already does. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:31 -07:00
Jason Ekstrand	4dfb8b3416	intel/vs: Grow the param array for clip planes Instead of requiring the caller of brw_compile_vs to figure it out, just grow the param array on-demand. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:30 -07:00
Jason Ekstrand	6bcc5c0c75	intel/cs: Grow prog_data::param on-demand for thread_local_id_index Instead of making the caller of brw_compile_cs add something to the param array for thread_local_id_index, just add it on-demand in brw_nir_intrinsics and grow the array. This is now safe to do because everyone is now using ralloc for prog_data::param. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:30 -07:00
Jason Ekstrand	b1d1b7222a	intel/compiler: Make brw_nir_lower_intrinsics compute-specific It's already only ever called from brw_compile_cs and only handles compute intrinsics. Let's just make it CS-specific. We can always make it handle other stages again later if we want. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:30 -07:00
Jason Ekstrand	2db9470d88	intel/compiler: Add a helper for growing the prog_data::param array Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:30 -07:00
Jason Ekstrand	c0435b204a	intel/compiler: Stop adding params for texture sizes We haven't needed this ever since we started using NIR for lowering rectangle textures. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:30 -07:00
Jason Ekstrand	4d4f149376	i965: Only add the wpos state reference if we lowered something Otherwise, in the ARB program case _mesa_add_state_reference may grow the parameter array which will cause brw_nir_setup_arb_uniforms to write past the end of the param array because it only looks at the parameter list length but the param array is allocated based on nir->num_uniforms. The only reason this hasn't caused us problems is because we are padding out the param array for fragment programs unnecessarily. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:30 -07:00
Jason Ekstrand	4efd079aba	intel/compiler: Add a flag for pull constant support The Vulkan driver does not support pull constants. It simply limits things such that we can always push everything. Previously, we were determining whether or not to push things based on whether or not the prog_data::pull_param array is non-null. This is rather hackish and about to stop working. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:30 -07:00
Jason Ekstrand	9df64b5666	anv/pipeline: Ralloc prog_data::param of the compile mem_ctx This way we stop leaking it. This is completely safe because, when we hand it off to anv_shader_bin_create or anv_pipeline_cache_upload_kernel, they make a copy of the entire param array. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:30 -07:00
Jason Ekstrand	490d80fd1a	anv/pipeline: Add a mem_ctx parameter to anv_pipeline_compile This lets us avoid some of the manual ralloc stealing and prepares for future commits in which we will want to ralloc prog_data::param. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:30 -07:00
Jason Ekstrand	cfc7ed75eb	i965: Store image_param in brw_context instead of prog_data This burns an extra 10k of memory or so in the case where you don't have any images. However, if you have several shaders which use images, this should be much less memory. It also gets rid of a part of prog_data that really has nothing to do with the compiler. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:30 -07:00
Jason Ekstrand	6ee4b352c9	i965: Use prog->info.num_images for needs_dc computation This should be just as good as looking in prog_data but removes our one state setup dependency on brw_stage_prog_data::nr_image_param. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:29 -07:00
Jason Ekstrand	2975e4c56a	intel: Rewrite the world of push/pull params This moves us away from the array of pointers model and onto a model where each param is represented by a generic uint32_t handle. We reserve 2^16 of these handles for builtins that get generated by somewhere inside the compiler and have well-defined meanings. Generic params have handles whose meanings are defined by the driver. The primary downside to this new approach is that it moves a little bit of the work that we would normally do at compile time to draw time. On my laptop this hurts OglBatch6 by no more than 1% and doesn't seem to have any measurable effect on OglBatch7. So, while this may come back to bite us, it doesn't look too bad. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:29 -07:00
Jason Ekstrand	faad828b16	i965: Get rid of gen7_cs_state.c The only thing it was handling was push constants. We pull the actual constant upload code into gen6_constant_state.c and the atoms into genX_state_upload.c. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:29 -07:00
Jason Ekstrand	9b3f917f9e	i965: Add a helper for populating constant buffers Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:29 -07:00
Jason Ekstrand	d640627159	i965: Move brw_upload_pull_constants to gen6_constant_state.c Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:39:29 -07:00
Jason Ekstrand	3442c9fc3e	nir: Get rid of the variable on vote intrinsics This looks like a copy+paste error. They don't actually write into that variable as would be implied by putting the return there. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable@lists.freedesktop.org	2017-10-12 22:39:29 -07:00
Jason Ekstrand	a0947921eb	nir/opcodes: Fix constant-folding of ufind_msb We didn't fold correctly in the case of 0x1 because we never let the loop counter hit 0. Switching it to bit >= 0 solves this problem. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2017-10-12 22:39:29 -07:00
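The off-by-one described in the ufind_msb entry above can be illustrated with a small standalone sketch. This is not the NIR source: the function names are hypothetical, and the loop shape is only assumed from the commit message ("we never let the loop counter hit 0", fixed by switching to `bit >= 0`).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pre-fix loop: bit 0 is never tested, so an input of 0x1
 * falls through and is reported as "no bit set" (-1). */
static int ufind_msb_buggy(uint32_t value)
{
    for (int bit = 31; bit > 0; bit--) {
        if ((value >> bit) & 1u)
            return bit;
    }
    return -1;
}

/* Hypothetical post-fix loop: the counter is allowed to reach 0, so the
 * lowest bit is examined as well. */
static int ufind_msb_fixed(uint32_t value)
{
    for (int bit = 31; bit >= 0; bit--) {
        if ((value >> bit) & 1u)
            return bit;
    }
    return -1;
}

/* Small self-check showing where the two versions disagree. */
static void ufind_msb_selfcheck(void)
{
    assert(ufind_msb_buggy(0x1u) == -1); /* wrong: bit 0 is set */
    assert(ufind_msb_fixed(0x1u) == 0);  /* correct */
}
```

The two versions agree on every input except those whose only set bit is bit 0, which is why the bug only showed up when folding the constant 0x1.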
Jason Ekstrand	ac3b73ac8d	meta: Delete the PBO texsubimage path for real Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 22:38:40 -07:00
Jason Ekstrand	b8ab78d1af	anv/pipeline_cache: Rework to use multialloc and blob This gets rid of all of our hand-rolled size calculation and serialization code and replaces it with safe "standards" that are used elsewhere in anv and mesa. This should be significantly safer than rolling our own. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-12 21:47:06 -07:00
Jason Ekstrand	2d29dd9ee4	anv/pipeline: Declare bind maps closer to their use This is just a trivial cleanup. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-12 21:47:06 -07:00
Jason Ekstrand	ba4b7e9c44	anv/multialloc: Add new add_size helper Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-12 21:47:06 -07:00
Jason Ekstrand	6a41a52e62	compiler/blob: Make some parameters void instead of uint8_t There are certain advantages to using uint8_t internally such as well-defined arithmetic on all platforms. However, interfaces that work in terms of raw data should use a void* type. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-12 21:47:06 -07:00
Jason Ekstrand	4d56ff0a71	compiler/blob: Constify the reader Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-12 21:47:06 -07:00
Jason Ekstrand	3af1c82989	compiler/blob: Add (reserve\|overwrite)_(uint32\|intptr) helpers These helpers not only call blob_reserve_bytes but also make sure that the blob is properly aligned as if blob_write_* were called. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-12 21:47:06 -07:00
Connor Abbott	6935440967	compiler/blob: make blob_reserve_bytes() more useful Despite the name, it could only be used if you immediately wrote to the pointer. Nobody was using it outside of one test, so clearly this behavior wasn't that useful. Instead, make it return an offset into the data buffer so that the result isn't invalidated if you later write to the blob. In conjunction with blob_overwrite_bytes(), this will be useful for leaving a placeholder and then filling it in later, which we'll need to do for handling phi nodes when serializing NIR. v2 (Jason Ekstrand): - Detect overflow in the offset + to_write computation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-12 21:47:06 -07:00
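The reserve-then-overwrite pattern from the blob_reserve_bytes() entry above can be sketched as follows. This is a simplified illustration, not Mesa's blob implementation: `struct sketch_blob` and the `sketch_*` functions are hypothetical names. The point is that a reserved offset stays valid across a later `realloc`, while a raw pointer into the buffer would not, and that the overwrite bounds check is written so that `offset + n` cannot overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer (not Mesa's struct blob). */
struct sketch_blob {
    uint8_t *data;
    size_t size;       /* bytes in use */
    size_t allocated;  /* bytes allocated */
};

/* Make room for `extra` more bytes; may move `data`. */
static bool sketch_grow(struct sketch_blob *b, size_t extra)
{
    if (extra > SIZE_MAX - b->size)
        return false;                       /* size + extra would overflow */
    size_t need = b->size + extra;
    if (need <= b->allocated)
        return true;
    size_t cap = b->allocated ? b->allocated : 16;
    while (cap < need) {
        if (cap > SIZE_MAX / 2) { cap = need; break; }
        cap *= 2;
    }
    uint8_t *p = realloc(b->data, cap);
    if (!p)
        return false;
    b->data = p;
    b->allocated = cap;
    return true;
}

/* Reserve `n` bytes and return their offset, or -1 on failure. */
static intptr_t sketch_reserve(struct sketch_blob *b, size_t n)
{
    if (!sketch_grow(b, n))
        return -1;
    size_t offset = b->size;
    b->size += n;
    return (intptr_t)offset;
}

/* Append `n` bytes at the end of the blob. */
static bool sketch_write(struct sketch_blob *b, const void *src, size_t n)
{
    intptr_t offset = sketch_reserve(b, n);
    if (offset < 0)
        return false;
    if (n > 0)
        memcpy(b->data + offset, src, n);
    return true;
}

/* Fill in previously reserved bytes; the check avoids computing offset + n. */
static bool sketch_overwrite(struct sketch_blob *b, size_t offset,
                             const void *src, size_t n)
{
    if (offset > b->size || n > b->size - offset)
        return false;
    if (n > 0)
        memcpy(b->data + offset, src, n);
    return true;
}
```

A typical use is to reserve a 4-byte placeholder, write a variable amount of data after it, and then overwrite the placeholder with a count or offset once it is known.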
Jason Ekstrand	8ae03af4ed	compiler/blob: Allow for fixed-size blobs with a NULL data pointer These can be used to easily count up the number of bytes that will be required by "writing" it into the NULL blob. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-12 21:47:06 -07:00
Jason Ekstrand	26f6d4e5c7	compiler/blob: Add a concept of a fixed-allocation blob Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-12 21:47:06 -07:00
Jason Ekstrand	49bb9f785a	compiler/blob: Switch to init/finish instead of create/destroy There's no reason why that tiny bit of memory needs to be on the heap. We always put blob_reader on the stack, so why not do the same with the writable blob. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-12 21:47:06 -07:00
Jason Ekstrand	0e3bd56c6e	compiler: Move blob up a level We're going to want to use the blob for Vulkan pipeline caching so it makes sense to have it in libcompiler not libglsl. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-12 21:47:06 -07:00
Jason Ekstrand	8f42a43d08	meson: Add inc_compiler to the libglsl includes	2017-10-12 21:47:06 -07:00
Jason Ekstrand	e03717efbd	glsl/blob: Return false from grow_to_fit if we've ever failed Otherwise we could have a failure followed by a smaller write that succeeds and get a corrupted blob. If we ever OOM, we should stop. v2 (Jason Ekstrand): - Initialize the new boolean member in create_blob Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: mesa-stable@lists.freedesktop.org	2017-10-12 21:47:06 -07:00
Jason Ekstrand	7118851374	glsl/blob: Return false from ensure_can_read on overrun Otherwise, if you have a large read fail and then try to do a small read, the small read may succeed even though it's at the wrong offset. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: mesa-stable@lists.freedesktop.org	2017-10-12 21:47:06 -07:00
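The "sticky failure" behaviour described in the ensure_can_read entry above can be sketched with a simplified reader. The struct and function names below are hypothetical and are not Mesa's blob_reader API; the sketch only illustrates why, once one read has overrun, later smaller reads must also fail rather than succeed at the wrong offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified reader (not Mesa's struct blob_reader). */
struct sketch_reader {
    const uint8_t *data;
    size_t size;
    size_t pos;      /* invariant: pos <= size */
    bool overrun;    /* sticky: set on the first failed read, never cleared */
};

static bool sketch_can_read(struct sketch_reader *r, size_t n)
{
    if (r->overrun)
        return false;            /* a previous read already failed */
    if (n <= r->size - r->pos)
        return true;
    r->overrun = true;           /* remember the failure */
    return false;
}

static bool sketch_read(struct sketch_reader *r, void *dst, size_t n)
{
    if (!sketch_can_read(r, n))
        return false;
    if (n > 0)
        memcpy(dst, r->data + r->pos, n);
    r->pos += n;
    return true;
}
```

With this shape, a caller can perform a whole sequence of reads and check the overrun flag once at the end instead of after every call.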
Chris Wilson	c866e0b3ca	i965: Share the flush for brw_blorp_miptree_download into a pbo As all users of brw_blorp_miptree_download() must emit a full pipeline and cache flush when targeting a user PBO (as that PBO may then be subsequently bound or be bound anywhere and outside of the driver dirty tracking) move that flush into brw_blorp_miptree_download() itself. v2 (Ken): Rebase without userptr stuff so it can land sooner. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 19:58:40 -07:00
Jason Ekstrand	760a5815d4	meta: Delete the PBO texture upload/download path Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 19:58:40 -07:00
Jason Ekstrand	cdf626294e	i965: Use blorp instead of meta for PBO pixel reads Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 19:58:40 -07:00
Jason Ekstrand	f933ef00e1	i965: Use blorp instead of meta for PBO texture downloads Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 19:58:40 -07:00
Jason Ekstrand	157faa407f	i965/tex: Use blorp texture upload for all CCS_E textures This improves the FillTex benchmark in GLBench 2.7 by 30% on my Broxton. On Ken's Broxton which only has single-channel ram, it improves by 210%. v2 (Ken): Check mt->aux_usage == ISL_AUX_USAGE_CCS_E rather than using intel_miptree_is_lossless_compressed(). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 19:58:40 -07:00
Jason Ekstrand	dffda6cbbb	i965: Use blorp instead of meta for PBO texture uploads Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 19:58:40 -07:00
Jason Ekstrand	1a05bbe6a4	i965: Add blorp-based texture upload and download paths v1 (Topi Pohjolainen): original patch. v2 (Topi Pohjolainen): - Fix return value (s/MESA_FORMAT_NONE/false/) (Anuj) - Move _mesa_tex_format_from_format_and_type() just in the end avoiding additional if-block (Anuj) - Explain better the array alignment restriction (Anuj) - Do not bail out in case of gl_pixelstore_attrib::ImageHeight, it is handled by _mesa_image_offset() automatically (Ken). - Support 1D_ARRAY by flipping depth, width and y, z (Ken). v3 (Topi Pohjolainen): - Contrary to v2, do not try to handle gl_pixelstore_attrib::ImageHeight. Currently there are no tests in piglit or cts for it. One could possibly copy or modify tests/texturing/texsubimage.c. There, however, seems to be number of corner cases to consider. Moreover, current meta path applies the packing height for both source and targets when determining the offset. This would probably require re-visiting also. v4 (Topi Pohjolainen): Rebased on top of merged drm-bacon v5 (Jason Ekstrand): - Move to brw_blorp.c - Significant refactoring - Fixed 1-D array textures - Simplified handling of PBOs vs. CPU data. - Handle gl_pixelstore_attrib::ImageHeight. It turns out there are piglit tests that cover this. The original version was failing them because of an error in the way it handled 1-D array textures. - Add support for texture download v6 (Kenneth Graunke): Rebase fixes: - Use intel_miptree_check_level_layer instead of deleted fields - Update for mesa_format_supports_render[] rename. - Pass 'false' (read-only) to intel_bufferobj_buffer v7 (Kenneth Graunke): - Fix brw_blorp_download_miptree to pass 'false' (not read only) for the destination buffer (caught by Chris Wilson). - Fix blorp_get_client_bo to pass intel_bufferobj_buffer !read_only for the 'writable' parameter instead of 'false' (caught by Jason). 
- Support GL_BGR, GL_BGRA, GL_BGRA_INTEGER, GL_BGR_INTEGER, allowing us to use this for ReadPixels on the window system buffer (caught by Chris Wilson). - Fix y-flipping bugs in download path (exposed by BGRA support). - Fix false vs. NULL return value in blorp_get_client_bo. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-10-12 19:58:40 -07:00
Kenneth Graunke	acd3e073e4	i965: Refactor y-flipping coordinate transform. I want to reuse it for the BLORP download path. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-10-12 19:58:40 -07:00
Jason Ekstrand	52f39d6910	i965/tex: Check if there is data to upload up-front Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 19:58:40 -07:00
Jason Ekstrand	d9ed4f6c32	i965/barrier: Do the correct flushes for framebuffer access Framebuffer access includes framebuffer reads so we need to invalidate the texture cache. We do not, however, need to flush the depth cache because you cannot bind a depth texture as an image. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 19:58:40 -07:00
Jason Ekstrand	45991479a3	i965/barrier: Do the correct flushes for texture updates Texture uploads and downloads may go through the render pipe which may result in texturing from or rendering to the texture or the PBO. We need to flush accordingly. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 19:58:40 -07:00
Eric Anholt	2f1cdd7137	include: Revert out the update of the Khronos GLX extension header. They made a mistake in the MESA_swap_control XML, which I'm pursuing in their github. Until then, we can just back this piece out. Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2017-10-12 19:49:14 -07:00
Kenneth Graunke	cb9a4ae6c0	i965: Ignore GL_SKIP_DECODE_EXT for textures accessed via texelFetch(). The GL_EXT_texture_sRGB_decode spec says: "The conversion of sRGB color space components to linear color space is always performed if the texel lookup function is one of the texelFetch builtin functions. Otherwise, if the texel lookup function is one of the texture builtin functions or one of the texture gather functions, the conversion of sRGB color space components to linear color space is controlled by the TEXTURE_SRGB_DECODE_EXT parameter. If the TEXTURE_SRGB_DECODE_EXT parameter is DECODE_EXT, the conversion of sRGB color space components to linear color space is performed. If the TEXTURE_SRGB_DECODE_EXT parameter is SKIP_DECODE_EXT, the value is returned without decoding. However, if the texture is also accessed with a texelFetch function, then the result of texture builtin functions and/or texture gather functions may be returned with decoding or without decoding." This patch makes i965 force sRGB decoding for any textures accessed via texelFetch(). If textures are accessed via texelFetch() and a regular texture access function, this will affect the other ones too - which is fine - it's undefined according to the last paragraph quoted. We could make both work, but we'd have to emit multiple SURFACE_STATEs, and have two binding table sections, like we do for texture gather hacks on older platforms. Fixes the following Android O CTS test: dEQP-GLES31.functional.srgb_texture_decode.skip_decode.srgba8.texel_fetch Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-12 17:22:42 -07:00
Kenneth Graunke	32fcced7b4	meta: Unset the textures_used_by_txf bitfield. Drivers that use Meta are happily using blitting data using texelFetch and GL_SKIP_DECODE_EXT, but the GL_EXT_texture_sRGB spec unfortunately makes GL_SKIP_DECODE_EXT not necessarily work with texelFetch. As a hack, just unset the texture_used_by_txf bitfield so we can continue with the old desired behavior. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-12 17:22:42 -07:00
Kenneth Graunke	a576c148cd	nir: Make nir_shader_gather_info() track texelFetch texture accesses. For TGSI-based drivers, st_glsl_to_tgsi records this information. For NIR-based drivers, nir_shader_gather_info() will do so. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-12 17:22:42 -07:00
Kenneth Graunke	fbf4c2916c	compiler: Move gl_program::TexelFetchSamplers to shader_info. I'd like to put this sort of metadata in the shader_info structure, rather than adding more things to gl_program. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-12 17:22:39 -07:00
Dave Airlie	fb972ed4e5	radv: take unsafe_math and sisched into account when hashing shaders. We want to generate different variants for sisched and unsafe_math shader variants, so add them to the hash key. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-12 23:46:55 +01:00
Dave Airlie	26f1ba94a3	mesa/bufferobj: fix atomic offset/size get When I realigned the bufferobj code, I didn't see the getters were different, realign the getters to work the same as ssbo. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103214 Fixes: `65d3ef7cd` (mesa: align atomic buffer handling code with ubo/ssbo (v1.1)) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-13 07:53:34 +10:00
Marek Olšák	69730dc589	relnotes: document EGL_ANDROID_native_fence_sync on radeonsi	2017-10-12 22:27:55 +02:00
Eric Anholt	89e02db81f	include: Update GL headers from khronos opengl registry. Taken from their c6a99aff31874697741a08cbc8a3488606ce59c7, keeping the BUILDING_MESA hunk in place. Reviewed-by: Daniel Stone <daniels@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 12:45:07 -07:00
Eric Anholt	6de8f1f970	mapi: Update extension number of MESA_tile_raster_order. Reviewed-by: Daniel Stone <daniels@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-10-12 12:44:51 -07:00
Eric Anholt	dbf9e4fbf8	broadcom/vc5: Remove the u_resource_vtbl usage. Like for vc4, this was just a wasted indirection.	2017-10-12 12:44:27 -07:00
Eric Anholt	376a0a9b08	mesa: Disallow GL_RED/GL_RG with half-floats on GLES2. Sure, you'd think that the combination of GL_OES_texture_half_float and GL_EXT_texture_rg would mean that GL_RG16F exists, but it doesn't. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103227 Fixes: `c16a7443e9` ("mesa: Expose GL_OES_required_internalformat on GLES contexts.") Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-12 12:42:13 -07:00
Marek Olšák	f536f45250	radeonsi: implement sync_file import/export Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-12 21:07:48 +02:00
Marek Olšák	162502370c	winsys/amdgpu: implement sync_file import/export syncobj is used internally for interactions with command submission. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-12 21:07:41 +02:00

3281 changed files with 441328 additions and 171915 deletions

									
										1

.editorconfig
									
												
				@@ -11,6 +11,7 @@ tab_width = 8

				[*.{c,h,cpp,hpp,cc,hh}]

				indent_style = space

				indent_size = 3

				max_line_length = 78

				[{Makefile*,*.mk}]

				indent_style = tab

7

.mailmap


@@ -145,9 +145,16 @@ Edward O'Callaghan <funfunctor@folklore1984.net> <eocallaghan@alterapraxis.com>
 Emeric Grange <emeric.grange@gmail.com> Emeric <emeric.grange@gmail.com>
 Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.com>
 Emil Velikov <emil.l.velikov@gmail.com> <emil.veliko@collabora.com>
 Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@collabora.co.uk>
 Emil Velikov <emil.l.velikov@gmail.com> <emil.veliikov@collabora.com>
 Emil Velikov <emil.l.velikov@gmail.com> <emil.velikov@gmail.com>
 Emil Velikov <emil.l.velikov@gmail.com> <emmil.velikov@collabora.com>
 Eric Anholt <eric@anholt.net> Eric Anholt <anholt@FreeBSD.org>
 Eric Engestrom <eric@engestrom.ch> <eric.engestrom@imgtec.com>
 Eugeni Dodonov <eugeni.dodonov@intel.com> <eugeni@mandriva.com>
 Fabian Bieler <der.fabe@gmx.net> <fabianbieler@fastmail.fm>

744

.travis.yml

File diff suppressed because it is too large

									
										19

Android.common.mk
									
												
				@@ -31,12 +31,12 @@ LOCAL_C_INCLUDES += \

				MESA_VERSION := $(shell cat $(MESA_TOP)/VERSION)

				LOCAL_CFLAGS += \

					-Wno-error \

					-Wno-unused-parameter \

					-Wno-pointer-arith \

					-Wno-missing-field-initializers \

					-Wno-initializer-overrides \

					-Wno-mismatched-tags \

					-DVERSION=\"$(MESA_VERSION)\" \

					-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \

					-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\"

				@@ -51,11 +51,15 @@ LOCAL_CFLAGS += \

					-DHAVE___BUILTIN_EXPECT \

					-DHAVE___BUILTIN_FFS \

					-DHAVE___BUILTIN_FFSLL \

					-DHAVE_DLFCN_H \

					-DHAVE_FUNC_ATTRIBUTE_FLATTEN \

					-DHAVE_FUNC_ATTRIBUTE_UNUSED \

					-DHAVE_FUNC_ATTRIBUTE_FORMAT \

					-DHAVE_FUNC_ATTRIBUTE_PACKED \

					-DHAVE_FUNC_ATTRIBUTE_ALIAS \

					-DHAVE_FUNC_ATTRIBUTE_NORETURN \

					-DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL \

					-DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT \

					-DHAVE___BUILTIN_CTZ \

					-DHAVE___BUILTIN_POPCOUNT \

					-DHAVE___BUILTIN_POPCOUNTLL \

				@@ -65,8 +69,14 @@ LOCAL_CFLAGS += \

					-DHAVE_PTHREAD=1 \

					-DHAVE_DLADDR \

					-DHAVE_DL_ITERATE_PHDR \

					-DHAVE_LINUX_FUTEX_H \

					-DHAVE_ENDIAN_H \

					-DHAVE_ZLIB \

					-DMAJOR_IN_SYSMACROS \

					-DVK_USE_PLATFORM_ANDROID_KHR \

					-fvisibility=hidden \

					-fno-math-errno \

					-fno-trapping-math \

					-Wno-sign-compare

				LOCAL_CPPFLAGS += \

				@@ -80,6 +90,13 @@ LOCAL_CPPFLAGS += \

				LOCAL_CONLYFLAGS += \

					-std=c99

				# c11 timespec_get is part of bionic as well

				# https://android-review.googlesource.com/c/718518

				# This means releases from P and earlier won't need this

				ifeq ($(filter 5 6 7 8 9, $(MESA_ANDROID_MAJOR_VERSION)),)

				LOCAL_CFLAGS += -DHAVE_TIMESPEC_GET

				endif

				ifeq ($(strip $(MESA_ENABLE_ASM)),true)

				ifeq ($(TARGET_ARCH),x86)

				LOCAL_CFLAGS += \

									
										8

Android.mk
									
												
				@@ -24,7 +24,7 @@

				# BOARD_GPU_DRIVERS should be defined.  The valid values are

				#

				#   classic drivers: i915 i965

				#   gallium drivers: swrast freedreno i915g nouveau pl111 r300g r600g radeonsi vc4 virgl vmwgfx etnaviv imx

				#   gallium drivers: swrast freedreno i915g nouveau kmsro r300g r600g radeonsi vc4 virgl vmwgfx etnaviv

				#

				# The main target is libGLES_mesa.  For each classic driver enabled, a DRI

				# module will also be built.  DRI modules will be loaded by libGLES_mesa.

				@@ -39,6 +39,7 @@ endif

				MESA_DRI_MODULE_REL_PATH := dri

				MESA_DRI_MODULE_PATH := $(TARGET_OUT_SHARED_LIBRARIES)/$(MESA_DRI_MODULE_REL_PATH)

				MESA_DRI_MODULE_UNSTRIPPED_PATH := $(TARGET_OUT_SHARED_LIBRARIES_UNSTRIPPED)/$(MESA_DRI_MODULE_REL_PATH)

				MESA_DRI_LDFLAGS := -Wl,--build-id=sha1

				MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk

				MESA_PYTHON2 := python

				@@ -51,15 +52,14 @@ gallium_drivers := \

					freedreno.HAVE_GALLIUM_FREEDRENO \

					i915g.HAVE_GALLIUM_I915 \

					nouveau.HAVE_GALLIUM_NOUVEAU \

					pl111.HAVE_GALLIUM_PL111 \

					kmsro.HAVE_GALLIUM_KMSRO \

					r300g.HAVE_GALLIUM_R300 \

					r600g.HAVE_GALLIUM_R600 \

					radeonsi.HAVE_GALLIUM_RADEONSI \

					vmwgfx.HAVE_GALLIUM_VMWGFX \

					vc4.HAVE_GALLIUM_VC4 \

					virgl.HAVE_GALLIUM_VIRGL \

					etnaviv.HAVE_GALLIUM_ETNAVIV \

					imx.HAVE_GALLIUM_IMX

					etnaviv.HAVE_GALLIUM_ETNAVIV

				ifeq ($(BOARD_GPU_DRIVERS),all)

				MESA_BUILD_CLASSIC := $(filter HAVE_%, $(subst ., , $(classic_drivers)))

									
										6

CleanSpec.mk
									
												
				@@ -10,7 +10,7 @@ $(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/STATIC_LIBRARIES/libmesa_*_interm

				$(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/SHARED_LIBRARIES/i9?5_dri_intermediates)

				$(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/SHARED_LIBRARIES/libglapi_intermediates)

				$(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/SHARED_LIBRARIES/libGLES_mesa_intermediates)

				$(call add-clean-step, rm -rf $(HOST_OUT_release)/*/EXECUTABLES/mesa_*_intermediates)

				$(call add-clean-step, rm -rf $(HOST_OUT_release)/*/EXECUTABLES/glsl_compiler_intermediates)

				$(call add-clean-step, rm -rf $(HOST_OUT_release)/*/STATIC_LIBRARIES/libmesa_*_intermediates)

				$(call add-clean-step, rm -rf $(HOST_OUT)/*/EXECUTABLES/mesa_*_intermediates)

				$(call add-clean-step, rm -rf $(HOST_OUT)/*/EXECUTABLES/glsl_compiler_intermediates)

				$(call add-clean-step, rm -rf $(HOST_OUT)/*/STATIC_LIBRARIES/libmesa_*_intermediates)

				$(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/SHARED_LIBRARIES/*_dri_intermediates)

									
										19

Makefile.am
									
												
				@@ -22,6 +22,7 @@

				SUBDIRS = src

				AM_DISTCHECK_CONFIGURE_FLAGS = \

					--enable-autotools \

					--enable-dri \

					--enable-dri3 \

					--enable-egl \

				@@ -35,6 +36,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \

					--enable-glx-tls \

					--enable-nine \

					--enable-opencl \

					--enable-opencl-icd \

					--enable-opengl \

					--enable-va \

					--enable-vdpau \

				@@ -44,7 +46,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \

					--enable-libunwind \

					--with-platforms=x11,wayland,drm,surfaceless \

					--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \

					--with-gallium-drivers=i915,nouveau,r300,pl111,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr,etnaviv,imx \

					--with-gallium-drivers=i915,nouveau,r300,kmsro,r600,radeonsi,freedreno,svga,swrast,vc4,tegra,virgl,swr,etnaviv \

					--with-vulkan-drivers=intel,radeon

				ACLOCAL_AMFLAGS = -I m4

				@@ -56,7 +58,15 @@ EXTRA_DIST = \

					doxygen \

					bin/git_sha1_gen.py \

					scons \

					SConstruct

					SConstruct \

					build-support/conftest.dyn \

					build-support/conftest.map \

					meson.build \

					meson_options.txt \

					bin/meson.build \

					include/meson.build \

					bin/install_megadrivers.py \

					bin/meson_get_version.py

				noinst_HEADERS = \

					include/c99_alloca.h \

				@@ -67,12 +77,15 @@ noinst_HEADERS = \

					include/drm-uapi/drm_fourcc.h \

					include/drm-uapi/drm_mode.h \

					include/drm-uapi/i915_drm.h \

					include/drm-uapi/tegra_drm.h \

					include/drm-uapi/v3d_drm.h \

					include/drm-uapi/vc4_drm.h \

					include/D3D9 \

					include/GL/wglext.h \

					include/HaikuGL \

					include/no_extern_c.h \

					include/pci_ids

					include/pci_ids \

					include/vulkan

				# We list some directories in EXTRA_DIST, but don't actually want to include

				# the .gitignore files in the tarball.

									
										79

README.rst
									
										Normal file
									
												
				@@ -0,0 +1,79 @@

				`Mesa <https://mesa3d.org>`_ - The 3D Graphics Library

				======================================================

				Source

				------

				This repository lives at https://gitlab.freedesktop.org/mesa/mesa.

				Other repositories are likely forks, and code found there is not supported.

				Build status

				------------

				Travis:

				.. image:: https://travis-ci.org/mesa3d/mesa.svg?branch=master

				    :target: https://travis-ci.org/mesa3d/mesa

				Appveyor:

				.. image:: https://img.shields.io/appveyor/ci/mesa3d/mesa.svg

				    :target: https://ci.appveyor.com/project/mesa3d/mesa

				Coverity:

				.. image:: https://scan.coverity.com/projects/139/badge.svg?flat=1

				    :target: https://scan.coverity.com/projects/mesa

				Build & install

				---------------

				You can find more information in our documentation (`docs/install.html

				<https://mesa3d.org/install.html>`_), but the recommended way is to use

				Meson (`docs/meson.html <https://mesa3d.org/meson.html>`_):

				.. code-block:: sh

				  $ mkdir build

				  $ cd build

				  $ meson ..

				  $ sudo ninja install

				Support

				-------

				Many Mesa devs hang on IRC; if you're not sure which channel is

				appropriate, you should ask your question on `Freenode's #dri-devel

				<irc://chat.freenode.net#dri-devel>`_, someone will redirect you if

				necessary.

				Remember that not everyone is in the same timezone as you, so it might

				take a while before someone qualified sees your question.

				To figure out who you're talking to, or which nick to ping for your

				question, check out `Who's Who on IRC

				<https://dri.freedesktop.org/wiki/WhosWho/>`_.

				The next best option is to ask your question in an email to the

				mailing lists: `mesa-dev\@lists.freedesktop.org

				<https://lists.freedesktop.org/mailman/listinfo/mesa-dev>`_

				Bug reports

				-----------

				If you think something isn't working properly, please file a bug report

				(`docs/bugs.html <https://mesa3d.org/bugs.html>`_).

				Contributing

				------------

				Contributions are welcome, and step-by-step instructions can be found in our

				documentation (`docs/submittingpatches.html

				<https://mesa3d.org/submittingpatches.html>`_).

				Note that Mesa uses email mailing-lists for patches submission, review and

				discussions.

17

REVIEWERS


@@ -72,7 +72,18 @@ F: src/loader/
 EGL
 R: Eric Engestrom <eric@engestrom.ch>
 R: Emil Velikov <emil.l.velikov@gmail.com>
 F: src/egl/
 F: include/EGL/
 HAIKU
 R: Alexander von Gluck IV <kallisti5@unixzen.com>
 F: include/HaikuGL/
 F: src/egl/drivers/haiku/
 F: src/gallium/state_trackers/hgl/
 F: src/gallium/targets/haiku-softpipe/
 F: src/gallium/winsys/sw/hgl/
 F: src/hgl/
 GALLIUM LOADER
 R: Emil Velikov <emil.l.velikov@gmail.com>
@@ -107,6 +118,7 @@ MESON BUILD
 R: Dylan Baker <dylan@pnwbakers.com>
 R: Eric Engestrom <eric@engestrom.ch>
 F: */meson.build
 F: meson.build
 F: meson_options.txt
 ANDROID EGL SUPPORT
@@ -126,3 +138,8 @@ F:	src/gallium/drivers/freedreno/
 GLX
 R: Adam Jackson <ajax@redhat.com>
 F: src/glx/
 VULKAN
 R: Eric Engestrom <eric@engestrom.ch>
 F: src/vulkan/
 F: include/vulkan/

									
										7

SConstruct
									
												
				@@ -27,6 +27,13 @@ import SCons.Util

				import common

				#######################################################################

				# Minimal scons version

				EnsureSConsVersion(2, 4)

				EnsurePythonVersion(2, 7)

				#######################################################################

				# Configuration options

2

VERSION


@@ -1 +1 @@
 17.3.0-devel
 19.0.0-rc5

									
										30

appveyor.yml
									
												
				@@ -33,31 +33,41 @@ branches:

				# - https://www.appveyor.com/blog/2014/06/04/shallow-clone-for-git-repositories

				clone_depth: 100

				# https://www.appveyor.com/docs/build-cache/

				cache:

				- win_flex_bison-2.5.9.zip

				- llvm-3.3.1-msvc2013-mtd.7z

				- '%LOCALAPPDATA%\pip\Cache -> appveyor.yml'

				- win_flex_bison-2.5.15.zip

				- llvm-5.0.1-msvc2017-mtd.7z

				os: Visual Studio 2013

				os: Visual Studio 2017

				init:

				# Appveyor defaults core.autocrlf to input instead of the default (true), but

				# that can hide problems processing CRLF text on Windows

				- git config --global core.autocrlf true

				environment:

				  WINFLEXBISON_ARCHIVE: win_flex_bison-2.5.9.zip

				  LLVM_ARCHIVE: llvm-3.3.1-msvc2013-mtd.7z

				  WINFLEXBISON_VERSION: 2.5.15

				  LLVM_ARCHIVE: llvm-5.0.1-msvc2017-mtd.7z

				install:

				# Check git config

				- git config core.autocrlf

				# Check pip

				- python --version

				- python -m pip --version

				# Install Mako

				- python -m pip install Mako==1.0.6

				- python -m pip install Mako==1.0.7

				# Install pywin32 extensions, needed by SCons

				- python -m pip install pypiwin32

				# Install python wheels, necessary to install SCons via pip

				- python -m pip install wheel

				# Install SCons

				- python -m pip install scons==2.5.1

				- python -m pip install scons==3.0.1

				- scons --version

				# Install flex/bison

				- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "https://downloads.sourceforge.net/project/winflexbison/old_versions/%WINFLEXBISON_ARCHIVE%"

				- set WINFLEXBISON_ARCHIVE=win_flex_bison-%WINFLEXBISON_VERSION%.zip

				- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "https://github.com/lexxmark/winflexbison/releases/download/v%WINFLEXBISON_VERSION%/%WINFLEXBISON_ARCHIVE%"

				- 7z x -y -owinflexbison\ "%WINFLEXBISON_ARCHIVE%" > nul

				- set Path=%CD%\winflexbison;%Path%

				- win_flex --version

				@@ -69,10 +79,10 @@ install:

				- set LLVM=%CD%\llvm

				build_script:

				- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1

				- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.1 llvm=1

				after_build:

				- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1 check

				- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=14.1 llvm=1 check

				# It's possible to setup notification here, as described in

3

bin/.cherry-ignore Normal file


@@ -0,0 +1,3 @@
 # Both of these were already merged with different shas
 da48cba61ef6fefb799bf96e6364b70dbf4ec712
 c812c740e60c14060eb89db66039111881a0f42f

									
										2

bin/bugzilla_mesa.sh
									
												
				@@ -23,7 +23,7 @@ echo "<ul>"

				echo ""

				# extract fdo urls from commit log

				git log $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before | sort -n -u | sed -e $use_after |\

				git log --pretty=medium $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before | sort -n -u | sed -e $use_after |\

				while read url

				do

					id=$(echo $url | cut -d'=' -f2)

									
										81

bin/get-fixes-pick-list.sh
									
											
				@@ -1,81 +0,0 @@

				#!/bin/sh

				# Script for generating a list of candidates [referenced by a Fixes tag] for

				# cherry-picking to a stable branch

				#

				# Usage examples:

				#

				# $ bin/get-fixes-pick-list.sh

				# $ bin/get-fixes-pick-list.sh > picklist

				# $ bin/get-fixes-pick-list.sh | tee picklist

				# Use the last branchpoint as our limit for the search

				latest_branchpoint=`git merge-base origin/master HEAD`

				# List all the commits between day 1 and the branch point...

				git log --reverse --pretty=%H $latest_branchpoint > already_landed

				# ... and the ones cherry-picked.

				git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\

					grep "cherry picked from commit" |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//'  > already_picked

				# Grep for commits with Fixes tag

				git log --reverse --pretty=%H -i --grep="fixes:" $latest_branchpoint..origin/master |\

				while read sha

				do

					# Check to see whether the patch is on the ignore list ...

					if [ -f bin/.cherry-ignore ] ; then

						if grep -q ^$sha bin/.cherry-ignore ; then

							continue

						fi

					fi

					# Skip if it has been already cherry-picked.

					if grep -q ^$sha already_picked ; then

						continue

					fi

					# Place every "fixes:" tag on its own line and join with the next word

					# on its line or a later one.

					fixes=`git show -s $sha | tr -d "\n" | sed -e 's/fixes:[[:space:]]*/\nfixes:/Ig' | grep "fixes:" | sed -e 's/\(fixes:[a-zA-Z0-9]*\).*$/\1/'`

					# For each one try to extract the tag

					fixes_count=`echo "$fixes" | wc -l`

					warn=`(test $fixes_count -gt 1 && echo $fixes_count) || echo 0`

					while [ $fixes_count -gt 0 ] ; do

						# Treat only the current line

						id=`echo "$fixes" | tail -n $fixes_count | head -n 1 | cut -d : -f 2`

						fixes_count=$(($fixes_count-1))

						# Bail out if we cannot find suitable id.

						# Any specific validation the $id is valid and not some junk, is

						# implied with the follow up code

						if [ "x$id" = x ] ; then

							continue

						fi

						# Check if the offending commit is in branch.

						# Be that cherry-picked ...

						# ... or landed before the branchpoint.

						if grep -q ^$id already_picked ||

						   grep -q ^$id already_landed ; then

							printf "Commit \"%s\" fixes %s\n" \

							       "`git log -n1 --pretty=oneline $sha`" \

							       "$id"

							warn=$(($warn-1))

						fi

					done

					if [ $warn -gt 0 ] ; then

						printf "WARNING: Commit \"%s\" has more than one Fixes tag\n" \

						       "`git log -n1 --pretty=oneline $sha`"

					fi

				done

				rm -f already_picked

				rm -f already_landed

									
										124

bin/get-pick-list.sh
									
												
				@@ -7,21 +7,107 @@

				# $ bin/get-pick-list.sh

				# $ bin/get-pick-list.sh > picklist

				# $ bin/get-pick-list.sh | tee picklist

				#

				# The output is as follows:

				# [nomination_type] commit_sha commit summary

				is_stable_nomination()

				{

					git show --pretty=medium --summary "$1" | grep -q -i -o "CC:.*mesa-stable"

				}

				is_typod_nomination()

				{

					git show --pretty=medium --summary "$1" | grep -q -i -o "CC:.*mesa-dev"

				}

				fixes=

				# Helper to handle various mistypos of the fixes tag.

				# The tag string itself is passed as argument and normalised within.

				#

				# Resulting string in the global variable "fixes" and contains entries

				# in the form "fixes:$sha"

				is_sha_nomination()

				{

					fixes=`git show --pretty=medium -s $1 | tr -d "\n" | \

						sed -e 's/'"$2"'/\nfixes:/Ig' | \

						grep -Eo 'fixes:[a-f0-9]{8,40}'`

					fixes_count=`echo "$fixes" | grep "fixes:" | wc -l`

					if test $fixes_count -eq 0; then

						return 1

					fi

					# Throw a warning for each invalid sha

					while test $fixes_count -gt 0; do

						# Treat only the current line

						id=`echo "$fixes" | tail -n $fixes_count | head -n 1 | cut -d : -f 2`

						fixes_count=$(($fixes_count-1))

						if ! git show $id >/dev/null 2>&1; then

							echo WARNING: Commit $1 lists invalid sha $id

						fi

					done

					return 0

				}

				# Checks if at least one of offending commits, listed in the global

				# "fixes", is in branch.

				sha_in_range()

				{

					fixes_count=`echo "$fixes" | grep "fixes:" | wc -l`

					while test $fixes_count -gt 0; do

						# Treat only the current line

						id=`echo "$fixes" | tail -n $fixes_count | head -n 1 | cut -d : -f 2`

						fixes_count=$(($fixes_count-1))

						# Be that cherry-picked ...

						# ... or landed before the branchpoint.

						if grep -q ^$id already_picked ||

						   grep -q ^$id already_landed ; then

							return 0

						fi

					done

					return 1

				}

				is_fixes_nomination()

				{

					is_sha_nomination "$1" "fixes:[[:space:]]*"

					if test $? -eq 0; then

						return 0

					fi

					is_sha_nomination "$1" "fixes[[:space:]]\+"

				}

				is_brokenby_nomination()

				{

					is_sha_nomination "$1" "broken by"

				}

				is_revert_nomination()

				{

					is_sha_nomination "$1" "This reverts commit "

				}

				# Use the last branchpoint as our limit for the search

				latest_branchpoint=`git merge-base origin/master HEAD`

				# Grep for commits with "cherry picked from commit" in the commit message.

				git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\

				# List all the commits between day 1 and the branch point...

				git log --reverse --pretty=%H $latest_branchpoint > already_landed

				# ... and the ones cherry-picked.

				git log --reverse --pretty=medium --grep="cherry picked from commit" $latest_branchpoint..HEAD |\

					grep "cherry picked from commit" |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked

				# Grep for commits that were marked as a candidate for the stable tree.

				git log --reverse --pretty=%H -i --grep='^CC:.*mesa-stable' $latest_branchpoint..origin/master |\

				# Grep for potential candidates

				git log --reverse --pretty=%H -i --grep='^CC:.*mesa-stable\|^CC:.*mesa-dev\|\<fixes\>\|\<broken by\>\|This reverts commit' $latest_branchpoint..origin/master |\

				while read sha

				do

					# Check to see whether the patch is on the ignore list.

					if [ -f bin/.cherry-ignore ] ; then

					if test -f bin/.cherry-ignore; then

						if grep -q ^$sha bin/.cherry-ignore ; then

							continue

						fi

				@@ -32,7 +118,33 @@ do

						continue

					fi

					git log -n1 --pretty=oneline $sha | cat

					if is_fixes_nomination "$sha"; then

						tag=fixes

					elif is_brokenby_nomination "$sha"; then

						tag=brokenby

					elif is_revert_nomination "$sha"; then

						tag=revert

					elif is_stable_nomination "$sha"; then

						tag=stable

					elif is_typod_nomination "$sha"; then

						tag=typod

					else

						continue

					fi

					case "$tag" in

					fixes | brokenby | revert )

						if ! sha_in_range; then

							continue

						fi

						;;

					* )

						;;

					esac

					printf "[ %8s ] " "$tag"

					git --no-pager show --no-patch --oneline $sha

				done

				rm -f already_picked

				rm -f already_landed
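
The `is_sha_nomination` helper in the diff above joins the commit message into one line and then pulls out every hexadecimal id of 8 to 40 characters that follows a "fixes" tag. A minimal Python sketch of that extraction step is given below for illustration only. The function name `extract_fixes_shas` is hypothetical and is not part of the script, and the sketch folds the two shell patterns ("fixes:" with optional whitespace, and "fixes" followed by whitespace) into a single regular expression instead of trying them one after the other.

```python
import re

# Hypothetical helper, not part of bin/get-pick-list.sh.
# The tag itself is matched case-insensitively, the sha is lowercase hex only,
# mirroring the sed "I" flag and the case-sensitive grep in the shell version.
_FIXES_RE = re.compile(r'(?i:fixes)(?::\s*|\s+)([a-f0-9]{8,40})')


def extract_fixes_shas(message):
    """Return the commit ids named by "Fixes" tags in a commit message."""
    # The shell script removes newlines first (tr -d "\n"), so do the same.
    flat = message.replace('\n', '')
    return [match.group(1) for match in _FIXES_RE.finditer(flat)]


if __name__ == '__main__':
    sample = (
        'radv: bitcast 16-bit outputs to integers\n'
        '\n'
        'Fixes: b722b29f10 ("radv: add support for 16bit input/output")\n'
    )
    print(extract_fixes_shas(sample))
```

Like the shell version, the sketch only recognises the tag and the id. Checking that the id exists in the repository, or that it is part of the stable branch, would still need separate `git` calls.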

									
										42

bin/get-typod-pick-list.sh
									
											
				@@ -1,42 +0,0 @@

				#!/bin/sh

				# Script for generating a list of candidates which have typos in the nomination line

				#

				# Usage examples:

				#

				# $ bin/get-typod-pick-list.sh

				# $ bin/get-typod-pick-list.sh > picklist

				# $ bin/get-typod-pick-list.sh | tee picklist

				# NB:

				# This script intentionally _never_ checks for specific version tag

				# Should we consider folding it with the original get-pick-list.sh

				# Use the last branchpoint as our limit for the search

				latest_branchpoint=`git merge-base origin/master HEAD`

				# Grep for commits with "cherry picked from commit" in the commit message.

				git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\

					grep "cherry picked from commit" |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked

				# Grep for commits that were marked as a candidate for the stable tree.

				git log --reverse --pretty=%H -i --grep='^CC:.*mesa-dev' $latest_branchpoint..origin/master |\

				while read sha

				do

					# Check to see whether the patch is on the ignore list.

					if [ -f bin/.cherry-ignore ] ; then

						if grep -q ^$sha bin/.cherry-ignore ; then

							continue

						fi

					fi

					# Check to see if it has already been picked over.

					if grep -q ^$sha already_picked ; then

						continue

					fi

					git log -n1 --pretty=oneline $sha | cat

				done

				rm -f already_picked

									
										23

bin/git_sha1_gen.py
									
										Executable file → Normal file
									
												
				@@ -1,11 +1,10 @@

				#!/usr/bin/env python

				"""

				Generate the contents of the git_sha1.h file.

				The output of this script goes to stdout.

				"""

				import argparse

				import os

				import os.path

				import subprocess

				@@ -27,7 +26,25 @@ def get_git_sha1():

				        git_sha1 = ''

				    return git_sha1

				def write_if_different(contents):

				    """

				    Avoid touching the output file if it doesn't need modifications

				    Useful to avoid triggering rebuilds when nothing has changed.

				    """

				    if os.path.isfile(args.output):

				        with open(args.output, 'r') as file:

				            if file.read() == contents:

				                return

				    with open(args.output, 'w') as file:

				        file.write(contents)

				parser = argparse.ArgumentParser()

				parser.add_argument('--output', help='File to write the #define in',

				                    required=True)

				args = parser.parse_args()

				git_sha1 = os.environ.get('MESA_GIT_SHA1_OVERRIDE', get_git_sha1())[:10]

				if git_sha1:

				    sys.stdout.write('#define MESA_GIT_SHA1 "git-%s"\n' % git_sha1.rstrip())

				    write_if_different('#define MESA_GIT_SHA1 " (git-' + git_sha1 + ')"')

				else:

				    write_if_different('#define MESA_GIT_SHA1 ""')
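
The `write_if_different` helper added in this diff only rewrites the output when its contents would change, which keeps the file's modification time stable and avoids needless rebuilds. The sketch below shows the same idea as a standalone function. The explicit `path` parameter and the boolean return value are assumptions made for the example; the script itself reads the path from `args.output` and returns nothing.

```python
import os
import tempfile


def write_if_different(path, contents):
    """Write contents to path unless the file already holds exactly that text.

    Returns True when the file was written and False when it was left alone.
    (The return value is an addition for this sketch.)
    """
    if os.path.isfile(path):
        with open(path, 'r') as existing:
            if existing.read() == contents:
                return False
    with open(path, 'w') as out:
        out.write(contents)
    return True


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, 'git_sha1.h')
        define = '#define MESA_GIT_SHA1 " (git-0123456789)"'
        print(write_if_different(target, define))  # first call creates the file
        print(write_if_different(target, define))  # second call finds no change
```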

									
										37

bin/install_megadrivers.py
									
										Executable file → Normal file
									
												
				@@ -1,6 +1,5 @@

				#!/usr/bin/env python

				# encoding=utf-8

				# Copyright © 2017 Intel Corporation

				# Copyright © 2017-2018 Intel Corporation

				# Permission is hereby granted, free of charge, to any person obtaining a copy

				# of this software and associated documentation files (the "Software"), to deal

				@@ -35,19 +34,39 @@ def main():

				    parser.add_argument('drivers', nargs='+')

				    args = parser.parse_args()

				    to = os.path.join(os.environ.get('MESON_INSTALL_DESTDIR_PREFIX'), args.libdir)

				    if os.path.isabs(args.libdir):

				        to = os.path.join(os.environ.get('DESTDIR', '/'), args.libdir[1:])

				    else:

				        to = os.path.join(os.environ['MESON_INSTALL_DESTDIR_PREFIX'], args.libdir)

				    master = os.path.join(to, os.path.basename(args.megadriver))

				    if not os.path.exists(to):

				        if os.path.lexists(to):

				            os.unlink(to)

				        os.makedirs(to)

				    shutil.copy(args.megadriver, master)

				    for each in args.drivers:

				        driver = os.path.join(to, each)

				        if os.path.exists(driver):

				            os.unlink(driver)

				        print('installing {} to {}'.format(args.megadriver, to))

				        os.link(master, driver)

				    for driver in args.drivers:

				        abs_driver = os.path.join(to, driver)

				        if os.path.lexists(abs_driver):

				            os.unlink(abs_driver)

				        print('installing {} to {}'.format(args.megadriver, abs_driver))

				        os.link(master, abs_driver)

				        try:

				            ret = os.getcwd()

				            os.chdir(to)

				            name, ext = os.path.splitext(driver)

				            while ext != '.so':

				                if os.path.lexists(name):

				                    os.unlink(name)

				                os.symlink(driver, name)

				                name, ext = os.path.splitext(name)

				        finally:

				            os.chdir(ret)

				    os.unlink(master)
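
The new loop in `main()` strips one extension at a time from a versioned driver name until the name ends in `.so`, creating a symlink for each shorter name along the way. The sketch below isolates just the name computation so the resulting chain can be inspected without touching the filesystem. The function `symlink_names` is hypothetical, and the extra `ext` check is an added guard (not in the script) so that a name containing no `.so` at all ends the loop.

```python
import os.path


def symlink_names(driver):
    """List the shorter names that would be symlinked to a versioned driver.

    Hypothetical helper for illustration: it mirrors the splitext loop from
    install_megadrivers.py but only collects names instead of calling
    os.symlink().
    """
    names = []
    name, ext = os.path.splitext(driver)
    # Stop once the stripped extension is ".so"; the "ext" test is an added
    # guard against names without any ".so" component.
    while ext and ext != '.so':
        names.append(name)
        name, ext = os.path.splitext(name)
    return names


if __name__ == '__main__':
    print(symlink_names('libvdpau_r600.so.1.0.0'))
    print(symlink_names('i965_dri.so'))
```

A plain `.so` name yields an empty list, so in the script only versioned libraries would get the extra links.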

									
										88

bin/meson-cmd-extract.py
									
										Executable file
									
												
				@@ -0,0 +1,88 @@

				#!/usr/bin/env python3

				# Copyright © 2019 Intel Corporation

				# Permission is hereby granted, free of charge, to any person obtaining a copy

				# of this software and associated documentation files (the "Software"), to deal

				# in the Software without restriction, including without limitation the rights

				# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell

				# copies of the Software, and to permit persons to whom the Software is

				# furnished to do so, subject to the following conditions:

				# The above copyright notice and this permission notice shall be included in

				# all copies or substantial portions of the Software.

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE

				# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,

				# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE

				# SOFTWARE.

				"""This script reads a meson build directory and gives back the command line it

				was configured with.

				This only works for meson 0.49.0 and newer.

				"""

				import argparse

				import ast

				import configparser

				import pathlib

				import sys

				def parse_args() -> argparse.Namespace:

				    """Parse arguments."""

				    parser = argparse.ArgumentParser()

				    parser.add_argument(

				        'build_dir',

				        help='Path to the meson build directory')

				    args = parser.parse_args()

				    return args

				def load_config(path: pathlib.Path) -> configparser.ConfigParser:

				    """Load config file."""

				    conf = configparser.ConfigParser()

				    with path.open() as f:

				        conf.read_file(f)

				    return conf

				def build_cmd(conf: configparser.ConfigParser) -> str:

				    """Rebuild the command line."""

				    args = []

				    for k, v in conf['options'].items():

				        if ' ' in v:

				            args.append(f'-D{k}="{v}"')

				        else:

				            args.append(f'-D{k}={v}')

				    cf = conf['properties'].get('cross_file')

				    if cf:

				        args.append('--cross-file={}'.format(cf))

				    nf = conf['properties'].get('native_file')

				    if nf:

				        # this will be in the form "['str', 'str']", so use ast.literal_eval to

				        # convert it to a list of strings.

				        nf = ast.literal_eval(nf)

				        args.extend(['--native-file={}'.format(f) for f in nf])

				    return ' '.join(args)

				def main():

				    args = parse_args()

				    path = pathlib.Path(args.build_dir, 'meson-private', 'cmd_line.txt')

				    if not path.exists():

				        print('Cannot find the necessary file to rebuild command line. '

				              'Is your meson version >= 0.49.0?', file=sys.stderr)

				        sys.exit(1)

				    conf = load_config(path)

				    cmd = build_cmd(conf)

				    print(cmd)

				if __name__ == '__main__':

				    main()
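
Reviewer note: a minimal sketch of what `build_cmd` does with the `[options]` and `[properties]` sections, run against an in-memory sample instead of a real build directory. The sample contents, the option values, and the function name `rebuild` are assumptions for illustration; `native_file` handling is left out to keep the sketch short.

```python
import configparser

# Hypothetical contents of <build_dir>/meson-private/cmd_line.txt; the option
# names and values here are made up for illustration.
SAMPLE = """\
[options]
buildtype = release
c_args = -O2 -g

[properties]
cross_file = cross.ini
"""


def rebuild(text):
    """Rebuild a meson command line from cmd_line.txt-style text."""
    conf = configparser.ConfigParser()
    conf.read_string(text)
    args = []
    for key, value in conf['options'].items():
        # Quote values containing spaces, as the script in the diff does.
        if ' ' in value:
            args.append(f'-D{key}="{value}"')
        else:
            args.append(f'-D{key}={value}')
    cross = conf['properties'].get('cross_file')
    if cross:
        args.append(f'--cross-file={cross}')
    return ' '.join(args)


print(rebuild(SAMPLE))
```

Note that a default `configparser.ConfigParser` lower-cases option names, so an option spelled with upper-case letters would come back in lower case.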

									
										21

bin/meson.build
									
										Normal file
									
												
				@@ -0,0 +1,21 @@

				# Copyright © 2017 Eric Engestrom

				# Permission is hereby granted, free of charge, to any person obtaining a copy

				# of this software and associated documentation files (the "Software"), to deal

				# in the Software without restriction, including without limitation the rights

				# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell

				# copies of the Software, and to permit persons to whom the Software is

				# furnished to do so, subject to the following conditions:

				# The above copyright notice and this permission notice shall be included in

				# all copies or substantial portions of the Software.

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE

				# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,

				# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE

				# SOFTWARE.

				git_sha1_gen_py = files('git_sha1_gen.py')

									
										35

bin/meson_get_version.py
									
										Normal file
									
												
				@@ -0,0 +1,35 @@

				#!/usr/bin/env python

				# encoding=utf-8

				# Copyright © 2017 Intel Corporation

				# Permission is hereby granted, free of charge, to any person obtaining a copy

				# of this software and associated documentation files (the "Software"), to deal

				# in the Software without restriction, including without limitation the rights

				# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell

				# copies of the Software, and to permit persons to whom the Software is

				# furnished to do so, subject to the following conditions:

				# The above copyright notice and this permission notice shall be included in

				# all copies or substantial portions of the Software.

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE

				# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,

				# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE

				# SOFTWARE.

				from __future__ import print_function

				import os

				def main():

				    filename = os.path.join(os.environ['MESON_SOURCE_ROOT'], 'VERSION')

				    with open(filename) as f:

				        version = f.read().strip()

				    print(version, end='')

				if __name__ == '__main__':

				    main()

3

build-support/conftest.dyn Normal file


@@ -0,0 +1,3 @@
 {
 	radeon_drm_winsys_create;
 };

6

build-support/conftest.map Normal file


@@ -0,0 +1,6 @@
 VERSION_1 {
     global:
         main;
     local:
         *;
 };

									
										8

common.py
									
												
				@@ -86,7 +86,7 @@ def AddOptions(opts):

				        from SCons.Options.EnumOption import EnumOption

				    opts.Add(EnumOption('build', 'build type', 'debug',

				                        allowed_values=('debug', 'checked', 'profile',

				                                        'release', 'opt')))

				                                        'release')))

				    opts.Add(BoolOption('verbose', 'verbose output', 'no'))

				    opts.Add(EnumOption('machine', 'use machine-specific assembly code',

				                        default_machine,

				@@ -99,17 +99,13 @@ def AddOptions(opts):

				                        'enable static code analysis where available', 'no'))

				    opts.Add(BoolOption('asan', 'enable Address Sanitizer', 'no'))

				    opts.Add('toolchain', 'compiler toolchain', default_toolchain)

				    opts.Add(BoolOption('gles', 'EXPERIMENTAL: enable OpenGL ES support',

				                        'no'))

				    opts.Add(BoolOption('llvm', 'use LLVM', default_llvm))

				    opts.Add(BoolOption('openmp', 'EXPERIMENTAL: compile with openmp (swrast)',

				                        'no'))

				    opts.Add(BoolOption('debug', 'DEPRECATED: debug build', 'yes'))

				    opts.Add(BoolOption('profile', 'DEPRECATED: profile build', 'no'))

				    opts.Add(BoolOption('quiet', 'DEPRECATED: profile build', 'yes'))

				    opts.Add(BoolOption('texture_float',

				                        'enable floating-point textures and renderbuffers',

				                        'no'))

				    opts.Add(BoolOption('swr', 'Build OpenSWR', 'no'))

				    if host_platform == 'windows':

				        opts.Add('MSVC_VERSION', 'Microsoft Visual C/C++ version')

				        opts.Add('MSVC_USE_SCRIPT', 'Microsoft Visual C/C++ vcvarsall script', True)

704

configure.ac


File diff suppressed because it is too large

									
										13

docs/autoconf.html
									
												
				@@ -26,6 +26,12 @@

				  </ul>

				</ol>

				<h2>ATTENTION:</h2>

				<p>

				    The autotools build is being replaced by the <a href="meson.html">meson</a>

				    build system. If you haven't yet, now is a good time to try using meson and

				    report any issues you run into.

				</p>

				<h2 id="basic">1. Basic Usage</h2>

				@@ -94,6 +100,13 @@ Currently there's only one config file provided when dri drivers are

				enabled - it's <code>drirc</code>.</p>

				</dd>

				<dt><code>--datadir=DIR</code></dt>

				<dd><p>This option specifies the directory where the data files will

				be installed. The default is <code>${prefix}/share</code>.

				Currently when dri drivers are enabled, <code>drirc.d/</code> is at

				this place.</p>

				</dd>

				<dt><code>--enable-static, --disable-shared</code></dt>

				<dd><p>By default, Mesa

				will build shared libraries. Either of these options will force static

									
										4

docs/codingstyle.html
									
												
				@@ -83,7 +83,7 @@ We try to quote the OpenGL specification where prudent:

				    *     "An INVALID_OPERATION error is generated for any of the following

				    *     conditions:

				    *

				    *     * <length> is zero."

				    *     * &lt;length&gt; is zero."

				    *

				    * Additionally, page 94 of the PDF of the OpenGL 4.5 core spec

				    * (30.10.2014) also says this, so it's no longer allowed for desktop GL,

				@@ -94,7 +94,7 @@ Function comment example:

				<pre>

				   /**

				    * Create and initialize a new buffer object.  Called via the

				    * ctx->Driver.CreateObject() driver callback function.

				    * ctx-&gt;Driver.CreateObject() driver callback function.

				    * \param  name  integer name of the object

				    * \param  type  one of GL_FOO, GL_BAR, etc.

				    * \return  pointer to new object or NULL if error

									
										1

docs/contents.html
									
												
				@@ -43,6 +43,7 @@

				<li><a href="install.html" target="_parent">Compiling / Installing</a>

				  <ul>

				    <li><a href="autoconf.html" target="_parent">Autoconf</a></li>

				    <li><a href="meson.html" target="_parent">Meson</a></li>

				  </ul>

				</li>

				<li><a href="precompiled.html" target="_parent">Precompiled Libraries</a>

									
										6

docs/download.html
									
												
				@@ -102,9 +102,9 @@ In the past, GLUT, GLU and the Mesa demos were released in conjunction with

				Mesa releases.  But since GLUT, GLU and the demos change infrequently, they

				were split off into their own git repositories:

				<a href="https://cgit.freedesktop.org/mesa/glut/">GLUT</a>,

				<a href="https://cgit.freedesktop.org/mesa/glu/">GLU</a> and

				<a href="https://cgit.freedesktop.org/mesa/demos/">Demos</a>,

				<a href="https://gitlab.freedesktop.org/mesa/glut">GLUT</a>,

				<a href="https://gitlab.freedesktop.org/mesa/glu">GLU</a> and

				<a href="https://gitlab.freedesktop.org/mesa/demos">Demos</a>,

				</p>

				</div>

									
										1

docs/egl.html
									
												
				@@ -168,6 +168,7 @@ the X server directly using (XCB-)DRI2 protocol.</p>

				<p>This driver can share DRI drivers with <code>libGL</code>.</p>

				</dd>

				</dl>

				<h2>Packaging</h2>

									
										72

docs/envvars.html
									
												
				@@ -88,22 +88,40 @@ This is a work-around for that.

				<li>MESA_GL_VERSION_OVERRIDE - changes the value returned by

				glGetString(GL_VERSION) and possibly the GL API type.

				<ul>

				<li> The format should be MAJOR.MINOR[FC]

				<li> FC is an optional suffix that indicates a forward compatible context.

				This is only valid for versions &gt;= 3.0.

				<li> GL versions &lt; 3.0 are set to a compatibility (non-Core) profile

				<li> GL versions = 3.0, see below

				<li> GL versions &gt; 3.0 are set to a Core profile

				<li> Examples: 2.1, 3.0, 3.0FC, 3.1, 3.1FC

				<ul>

				<li> 2.1 - select a compatibility (non-Core) profile with GL version 2.1

				<li> 3.0 - select a compatibility (non-Core) profile with GL version 3.0

				<li> 3.0FC - select a Core+Forward Compatible profile with GL version 3.0

				<li> 3.1 - select a Core profile with GL version 3.1

				<li> 3.1FC - select a Core+Forward Compatible profile with GL version 3.1

				</ul>

				<li> Mesa may not really implement all the features of the given version.

				(for developers only)

				  <li>The format should be MAJOR.MINOR[FC|COMPAT]

				  <li>FC is an optional suffix that indicates a forward compatible

				      context. This is only valid for versions &gt;= 3.0.

				  <li>COMPAT is an optional suffix that indicates a compatibility

				      context or GL_ARB_compatibility support. This is only valid for

				      versions &gt;= 3.1.

				  <li>GL versions &lt;= 3.0 are set to a compatibility (non-Core)

				      profile

				  <li>GL versions = 3.1, depending on the driver, it may or may not

				      have the ARB_compatibility extension enabled.

				  <li>GL versions &gt;= 3.2 are set to a Core profile

				  <li>Examples: 2.1, 3.0, 3.0FC, 3.1, 3.1FC, 3.1COMPAT, X.Y, X.YFC,

				      X.YCOMPAT.

				  <ul>

				    <li>2.1 - select a compatibility (non-Core) profile with GL

				        version 2.1.

				    <li>3.0 - select a compatibility (non-Core) profile with GL

				        version 3.0.

				    <li>3.0FC - select a Core+Forward Compatible profile with GL

				        version 3.0.

				    <li>3.1 - select GL version 3.1 with GL_ARB_compatibility enabled

				        per the driver default.

				    <li>3.1FC - select GL version 3.1 with forward compatibility and

				        GL_ARB_compatibility disabled.

				    <li>3.1COMPAT - select GL version 3.1 with GL_ARB_compatibility

				        enabled.

				    <li>X.Y - override GL version to X.Y without changing the profile.

				    <li>X.YFC - select a Core+Forward Compatible profile with GL

				        version X.Y.

				    <li>X.YCOMPAT - select a Compatibility profile with GL version

				        X.Y.

				  </ul>

				  <li>Mesa may not really implement all the features of the given

				      version. (for developers only)

				</ul>

				<li>MESA_GLES_VERSION_OVERRIDE - changes the value returned by

				glGetString(GL_VERSION) for OpenGL ES.
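
Reviewer note: the updated MESA_GL_VERSION_OVERRIDE wording in this hunk gives the format MAJOR.MINOR[FC|COMPAT], with FC valid only for versions >= 3.0 and COMPAT valid only for versions >= 3.1. The following is a minimal sketch of those documented rules. It is not Mesa's parser, and the function name `parse_gl_override` is hypothetical.

```python
import re

# MAJOR.MINOR followed by an optional FC or COMPAT suffix.
_PATTERN = re.compile(r'(\d+)\.(\d+)(FC|COMPAT)?')


def parse_gl_override(value):
    """Split an override string into (major, minor, suffix).

    Hypothetical helper that only checks the rules stated in the docs.
    """
    match = _PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f'not in MAJOR.MINOR[FC|COMPAT] form: {value!r}')
    major, minor = int(match.group(1)), int(match.group(2))
    suffix = match.group(3) or ''
    if suffix == 'FC' and (major, minor) < (3, 0):
        raise ValueError('FC is only valid for versions >= 3.0')
    if suffix == 'COMPAT' and (major, minor) < (3, 1):
        raise ValueError('COMPAT is only valid for versions >= 3.1')
    return major, minor, suffix


print(parse_gl_override('3.1COMPAT'))
print(parse_gl_override('2.1'))
```

Which profile a given version ends up with still depends on the driver, as the list above describes for 3.1.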

				@@ -128,13 +146,23 @@ your system. For example under the default settings you may end up with a 1GB

				cache for x86_64 and another 1GB cache for i386.

				<li>MESA_GLSL_CACHE_DIR - if set, determines the directory to be used

				for the on-disk cache of compiled GLSL programs. If this variable is

				not set, then the cache will be stored in $XDG_CACHE_HOME/mesa (if

				that variable is set), or else within .cache/mesa within the user's

				not set, then the cache will be stored in $XDG_CACHE_HOME/mesa_shader_cache (if

				that variable is set), or else within .cache/mesa_shader_cache within the user's

				home directory.

				<li>MESA_GLSL - <a href="shading.html#envvars">shading language compiler options</a>

				<li>MESA_NO_MINMAX_CACHE - when set, the minmax index cache is globally disabled.

				<li>MESA_SHADER_CAPTURE_PATH - see <a href="shading.html#capture">Capturing Shaders</a></li>

				<li>MESA_SHADER_DUMP_PATH and MESA_SHADER_READ_PATH - see <a href="shading.html#replacement">Experimenting with Shader Replacements</a></li>

				<li>MESA_VK_VERSION_OVERRIDE - changes the Vulkan physical device version

				    as returned in VkPhysicalDeviceProperties::apiVersion.

				  <ul>

				    <li>The format should be MAJOR.MINOR[.PATCH]</li>

				    <li>This will not let you force a version higher than the driver's

				        instance version as advertised by vkEnumerateInstanceVersion</li>

				    <li>This can be very useful for debugging but some features may not be

				        implemented correctly. (For developers only)</li>

				  </ul>

				</li>

				</ul>
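
Reviewer note: the updated MESA_GLSL_CACHE_DIR text in this hunk describes a three-step fallback for the shader cache location. The following is a minimal sketch of that lookup order as documented. The function name `shader_cache_dir` is hypothetical, `HOME` is assumed to stand in for "the user's home directory", and the explicit directory is returned as-is because the text does not say whether Mesa appends a subdirectory to it.

```python
import posixpath


def shader_cache_dir(env):
    """Resolve the cache directory from a mapping of environment variables."""
    explicit = env.get('MESA_GLSL_CACHE_DIR')
    if explicit:
        # Returned unchanged; any subdirectory handling is not covered here.
        return explicit
    xdg = env.get('XDG_CACHE_HOME')
    if xdg:
        return posixpath.join(xdg, 'mesa_shader_cache')
    # Assumption: HOME names the user's home directory.
    home = env.get('HOME', '')
    return posixpath.join(home, '.cache', 'mesa_shader_cache')


print(shader_cache_dir({'XDG_CACHE_HOME': '/tmp/xdg'}))
print(shader_cache_dir({'HOME': '/home/user'}))
```

Passing a plain dict instead of reading `os.environ` directly keeps the lookup order easy to exercise in isolation.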

				@@ -241,7 +269,7 @@ Mesa EGL supports different sets of environment variables.  See the

				    Especially useful to toggle hud at specific points of application and

				    disable for unencumbered viewing the rest of the time. For example, set

				    GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_TOGGLE_SIGNAL to 10 (SIGUSR1).

				    Use kill -10 <pid> to toggle the hud as desired.

				    Use kill -10 &lt;pid&gt; to toggle the hud as desired.

				<li>GALLIUM_HUD_DUMP_DIR - specifies a directory for writing the displayed

				    hud values into files.

				<li>GALLIUM_DRIVER - useful in combination with LIBGL_ALWAYS_SOFTWARE=true for

				@@ -313,6 +341,12 @@ such as the OpenGL program's name and command line arguments.

				<li>See the driver code for other, lesser-used variables.

				</ul>

				<h3>WGL environment variables</h3>

				<ul>

				<li>WGL_SWAP_INTERVAL - to set a swap interval, equivalent to calling

				wglSwapIntervalEXT() in an application.  If this environment variable

				is set, application calls to wglSwapIntervalEXT() will have no effect.

				</ul>

				<h3>VA-API state tracker environment variables</h3>

				<ul>

									
										4

docs/extensions.html
									
												
				@@ -23,7 +23,7 @@ The specifications follow.

				<ul>

				<li><a href="specs/MESA_agp_offset.spec">MESA_agp_offset.spec</a>

				<li><a href="specs/OLD/MESA_agp_offset.spec">MESA_agp_offset.spec</a>

				<li><a href="specs/MESA_copy_sub_buffer.spec">MESA_copy_sub_buffer.spec</a>

				<li><a href="specs/MESA_drm_image.spec">MESA_drm_image.spec</a>

				<li><a href="specs/MESA_multithread_makecurrent.spec">MESA_multithread_makecurrent.spec</a>

				@@ -33,7 +33,7 @@ The specifications follow.

				<li><a href="specs/OLD/MESA_program_debug.spec">MESA_program_debug.spec</a> (obsolete)

				<li><a href="specs/MESA_release_buffers.spec">MESA_release_buffers.spec</a>

				<li><a href="specs/OLD/MESA_resize_buffers.spec">MESA_resize_buffers.spec</a> (obsolete)

				<li><a href="specs/MESA_set_3dfx_mode.spec">MESA_set_3dfx_mode.spec</a>

				<li><a href="specs/OLD/MESA_set_3dfx_mode.spec">MESA_set_3dfx_mode.spec</a>

				<li><a href="specs/MESA_shader_debug.spec">MESA_shader_debug.spec</a>

				<li><a href="specs/OLD/MESA_sprite_point.spec">MESA_sprite_point.spec</a> (obsolete)

				<li><a href="specs/MESA_swap_control.spec">MESA_swap_control.spec</a>

									
										20

docs/faq.html
									
												
				@@ -16,7 +16,7 @@

				<center>

				<h1>Mesa Frequently Asked Questions</h1>

				Last updated: 9 October 2012

				Last updated: 19 September 2018

				</center>

				<br>

				@@ -373,18 +373,16 @@ the archives) is a good way to get information.

				<h2>4.3 Why isn't GL_EXT_texture_compression_s3tc implemented in Mesa?</h2>

				<p>

				The <a href="http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_compression_s3tc.txt">specification for the extension</a>

				indicates that there are intellectual property (IP) and/or patent issues

				to be dealt with.

				</p>

				<p>We've been unsuccessful in getting a response from S3 (or whoever owns

				the IP nowadays) to indicate whether or not an open source project can

				implement the extension (specifically the compression/decompression

				algorithms).

				Oh but it is! Prior to 2nd October 2017, the Mesa project did not include s3tc

				support due to intellectual property (IP) and/or patent issues around the s3tc

				algorithm.

				</p>

				<p>

				In the mean time, a 3rd party <a href="https://dri.freedesktop.org/wiki/S3TC">

				plug-in library</a> is available.

				As of Mesa 17.3.0, Mesa now officially supports s3tc, as the patent has expired.

				</p>

				<p>

				In versions prior to this, a 3rd party <a href="https://dri.freedesktop.org/wiki/S3TC">

				plug-in library</a> was required.

				</p>

				</div>

BIN
docs/favicon.ico Normal file


Binary file not shown.

Size: 13 KiB

BIN
docs/favicon.png Normal file


Binary file not shown.

Size: 2.9 KiB

323

docs/features.txt


@@ -24,16 +24,19 @@ not started
 # OpenGL Core and Compatibility context support
 OpenGL 3.1 and later versions are only supported with the Core profile.
 There are no plans to support GL_ARB_compatibility. The last supported OpenGL
 version with all deprecated features is 3.0. Some of the later GL features
 are exposed in the 3.0 context as extensions.
 Some drivers do not support the Compatibility profile or the
 ARB_compatibility extensions.  If an application does not request a
 specific version without the forward-compatibility flag, such drivers
 will be limited to OpenGL 3.0.  If an application requests OpenGL 3.1,
 it will get a context that may or may not have the ARB_compatibility
 extension enabled.  Some of the later GL features are exposed in the 3.0
 context as extensions.
 Feature                                                 Status
 ------------------------------------------------------- ------------------------
 GL 3.0, GLSL 1.30 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
 GL 3.0, GLSL 1.30 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr, virgl
   glBindFragDataLocation, glGetFragDataLocation         DONE
   GL_NV_conditional_render (Conditional rendering)      DONE ()
@@ -60,12 +63,12 @@ GL 3.0, GLSL 1.30 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llv
   glVertexAttribI commands                              DONE
   Depth format cube textures                            DONE ()
   GLX_ARB_create_context (GLX 1.4 is required)          DONE
   Multisample anti-aliasing                             DONE (freedreno (*), llvmpipe (*), softpipe (*), swr (*))
   Multisample anti-aliasing                             DONE (freedreno/a5xx, freedreno (*), llvmpipe (*), softpipe (*), swr (*))
 (*) freedreno, llvmpipe, softpipe, and swr have fake Multisample anti-aliasing support
 (*) freedreno (a2xx-a4xx), llvmpipe, softpipe, and swr have fake Multisample anti-aliasing support
 GL 3.1, GLSL 1.40 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
 GL 3.1, GLSL 1.40 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr, virgl
   Forward compatible context support/deprecations       DONE ()
   GL_ARB_draw_instanced (Instanced drawing)             DONE ()
@@ -78,7 +81,7 @@ GL 3.1, GLSL 1.40 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llv
   GL_EXT_texture_snorm (Signed normalized textures)     DONE ()
 GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
 GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr, virgl
   Core/compatibility profiles                           DONE
   Geometry shaders                                      DONE ()
@@ -87,13 +90,13 @@ GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
   GL_ARB_fragment_coord_conventions (Frag shader coord) DONE (freedreno)
   GL_ARB_provoking_vertex (Provoking vertex)            DONE (freedreno)
   GL_ARB_seamless_cube_map (Seamless cubemaps)          DONE (freedreno)
   GL_ARB_texture_multisample (Multisample textures)     DONE ()
   GL_ARB_texture_multisample (Multisample textures)     DONE (freedreno/a5xx)
   GL_ARB_depth_clamp (Frag depth clamp)                 DONE (freedreno)
   GL_ARB_sync (Fence objects)                           DONE (freedreno)
   GLX_ARB_create_context_profile                        DONE
 GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
 GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, virgl
   GL_ARB_blend_func_extended                            DONE (freedreno/a3xx, swr)
   GL_ARB_explicit_attrib_location                       DONE (all drivers that support GLSL)
@@ -102,23 +105,23 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
   GL_ARB_shader_bit_encoding                            DONE (freedreno, swr)
   GL_ARB_texture_rgb10_a2ui                             DONE (freedreno, swr)
   GL_ARB_texture_swizzle                                DONE (freedreno, swr)
   GL_ARB_timer_query                                    DONE (swr)
   GL_ARB_timer_query                                    DONE (freedreno, swr)
   GL_ARB_instanced_arrays                               DONE (freedreno, swr)
   GL_ARB_vertex_type_2_10_10_10_rev                     DONE (freedreno, swr)
 GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
 GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
   GL_ARB_draw_buffers_blend                             DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_draw_indirect                                  DONE (i965/gen7+, llvmpipe, softpipe, swr)
   GL_ARB_draw_indirect                                  DONE (freedreno, i965/gen7+, llvmpipe, softpipe, swr)
   GL_ARB_gpu_shader5                                    DONE (i965/gen7+)
   - 'precise' qualifier                                 DONE
   - Dynamically uniform sampler array indices           DONE (softpipe)
   - Dynamically uniform UBO array indices               DONE ()
   - Dynamically uniform UBO array indices               DONE (freedreno)
   - Implicit signed -> unsigned conversions             DONE
   - Fused multiply-add                                  DONE ()
   - Packing/bitfield/conversion functions               DONE (softpipe)
   - Enhanced textureGather                              DONE (softpipe)
   - Packing/bitfield/conversion functions               DONE (freedreno, softpipe)
   - Enhanced textureGather                              DONE (freedreno, softpipe)
   - Geometry shader instancing                          DONE (llvmpipe, softpipe)
   - Geometry shader multiple streams                    DONE ()
   - Enhanced per-sample shading                         DONE ()
@@ -126,97 +129,97 @@ GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
   - New overload resolution rules                       DONE
   GL_ARB_gpu_shader_fp64                                DONE (i965/gen7+, llvmpipe, softpipe)
   GL_ARB_sample_shading                                 DONE (i965/gen6+, nv50)
   GL_ARB_shader_subroutine                              DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_shader_subroutine                              DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_tessellation_shader                            DONE (i965/gen7+)
   GL_ARB_texture_buffer_object_rgb32                    DONE (i965/gen6+, llvmpipe, softpipe, swr)
   GL_ARB_texture_buffer_object_rgb32                    DONE (freedreno, i965/gen6+, llvmpipe, softpipe, swr)
   GL_ARB_texture_cube_map_array                         DONE (i965/gen6+, nv50, llvmpipe, softpipe)
   GL_ARB_texture_gather                                 DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_texture_query_lod                              DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_texture_gather                                 DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_texture_query_lod                              DONE (freedreno, i965, nv50, llvmpipe, softpipe)
   GL_ARB_transform_feedback2                            DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_transform_feedback3                            DONE (i965/gen7+, llvmpipe, softpipe, swr)
 GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
 GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
   GL_ARB_ES2_compatibility                              DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_get_program_binary                             DONE (0 binary formats)
   GL_ARB_ES2_compatibility                              DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_get_program_binary                             DONE (0 or 1 binary formats)
   GL_ARB_separate_shader_objects                        DONE (all drivers)
   GL_ARB_shader_precision                               DONE (i965/gen7+, all drivers that support GLSL 4.10)
   GL_ARB_vertex_attrib_64bit                            DONE (i965/gen7+, llvmpipe, softpipe)
   GL_ARB_viewport_array                                 DONE (i965, nv50, llvmpipe, softpipe)
 GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, radeonsi
 GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, r600, radeonsi, virgl
   GL_ARB_texture_compression_bptc                       DONE (i965, r600)
   GL_ARB_texture_compression_bptc                       DONE (freedreno, i965)
   GL_ARB_compressed_texture_pixel_storage               DONE (all drivers)
   GL_ARB_shader_atomic_counters                         DONE (i965, softpipe)
   GL_ARB_shader_atomic_counters                         DONE (freedreno/a5xx, i965, softpipe)
   GL_ARB_texture_storage                                DONE (all drivers)
   GL_ARB_transform_feedback_instanced                   DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_base_instance                                  DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_shader_image_load_store                        DONE (i965, softpipe)
   GL_ARB_transform_feedback_instanced                   DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_base_instance                                  DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_shader_image_load_store                        DONE (freedreno/a5xx, i965, softpipe)
   GL_ARB_conservative_depth                             DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_420pack                       DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_packing                       DONE (all drivers)
   GL_ARB_internalformat_query                           DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_internalformat_query                           DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_map_buffer_alignment                           DONE (all drivers)
 GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, radeonsi
 GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, r600, radeonsi, virgl
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_ES3_compatibility                              DONE (all drivers that support GLSL 3.30)
   GL_ARB_clear_buffer_object                            DONE (all drivers)
   GL_ARB_compute_shader                                 DONE (i965, softpipe)
   GL_ARB_copy_image                                     DONE (i965, nv50, r600, softpipe, llvmpipe)
   GL_ARB_compute_shader                                 DONE (freedreno/a5xx, i965, softpipe)
   GL_ARB_copy_image                                     DONE (i965, nv50, softpipe, llvmpipe)
   GL_KHR_debug                                          DONE (all drivers)
   GL_ARB_explicit_uniform_location                      DONE (all drivers that support GLSL)
   GL_ARB_fragment_layer_viewport                        DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_framebuffer_no_attachments                     DONE (i965, r600, softpipe)
   GL_ARB_fragment_layer_viewport                        DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_framebuffer_no_attachments                     DONE (freedreno, i965, softpipe)
   GL_ARB_internalformat_query2                          DONE (all drivers)
   GL_ARB_invalidate_subdata                             DONE (all drivers)
   GL_ARB_multi_draw_indirect                            DONE (i965, r600, llvmpipe, softpipe, swr)
   GL_ARB_multi_draw_indirect                            DONE (freedreno, i965, llvmpipe, softpipe, swr)
   GL_ARB_program_interface_query                        DONE (all drivers)
   GL_ARB_robust_buffer_access_behavior                  DONE (i965)
   GL_ARB_shader_image_size                              DONE (i965, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (i965, softpipe)
   GL_ARB_stencil_texturing                              DONE (i965/hsw+, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_texture_buffer_range                           DONE (nv50, i965, r600, llvmpipe)
   GL_ARB_shader_image_size                              DONE (freedreno/a5xx, i965, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (freedreno/a5xx, i965, softpipe)
   GL_ARB_stencil_texturing                              DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr)
   GL_ARB_texture_buffer_range                           DONE (freedreno, nv50, i965, llvmpipe)
   GL_ARB_texture_query_levels                           DONE (all drivers that support GLSL 1.30)
   GL_ARB_texture_storage_multisample                    DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_texture_view                                   DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_texture_view                                   DONE (freedreno, i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_vertex_attrib_binding                          DONE (all drivers)
 GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, radeonsi
 GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, r600, radeonsi
   GL_MAX_VERTEX_ATTRIB_STRIDE                           DONE (all drivers)
   GL_ARB_buffer_storage                                 DONE (i965, nv50, r600, llvmpipe, swr)
   GL_ARB_clear_texture                                  DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_enhanced_layouts                               DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_buffer_storage                                 DONE (freedreno, i965, nv50, llvmpipe, swr)
   GL_ARB_clear_texture                                  DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_enhanced_layouts                               DONE (i965, nv50, llvmpipe, softpipe, virgl)
   - compile-time constant expressions                   DONE
   - explicit byte offsets for blocks                    DONE
   - forced alignment within blocks                      DONE
   - specified vec4-slot component numbers               DONE (i965, nv50, llvmpipe, softpipe)
   - specified vec4-slot component numbers               DONE
   - specified transform/feedback layout                 DONE
   - input/output block locations                        DONE
   GL_ARB_multi_bind                                     DONE (all drivers)
   GL_ARB_query_buffer_object                            DONE (i965/hsw+)
   GL_ARB_texture_mirror_clamp_to_edge                   DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_texture_stencil8                               DONE (i965/hsw+, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_vertex_type_10f_11f_11f_rev                    DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_texture_mirror_clamp_to_edge                   DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
   GL_ARB_texture_stencil8                               DONE (freedreno, i965/hsw+, nv50, llvmpipe, softpipe, swr, virgl)
   GL_ARB_vertex_type_10f_11f_11f_rev                    DONE (i965, nv50, llvmpipe, softpipe, swr, virgl)
 GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi
   GL_ARB_ES3_1_compatibility                            DONE (i965/hsw+)
   GL_ARB_clip_control                                   DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_conditional_render_inverted                    DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_cull_distance                                  DONE (i965, nv50, llvmpipe, softpipe, swr)
   GL_ARB_derivative_control                             DONE (i965, nv50, r600)
   GL_ARB_ES3_1_compatibility                            DONE (i965/hsw+, r600, virgl)
   GL_ARB_clip_control                                   DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_conditional_render_inverted                    DONE (freedreno, i965, nv50, r600, llvmpipe, softpipe, swr, virgl)
   GL_ARB_cull_distance                                  DONE (i965, nv50, r600, llvmpipe, softpipe, swr, virgl)
   GL_ARB_derivative_control                             DONE (i965, nv50, r600, virgl)
   GL_ARB_direct_state_access                            DONE (all drivers)
   GL_ARB_get_texture_sub_image                          DONE (all drivers)
   GL_ARB_shader_texture_image_samples                   DONE (i965, nv50, r600)
   GL_ARB_texture_barrier                                DONE (i965, nv50, r600)
   GL_ARB_shader_texture_image_samples                   DONE (i965, nv50, r600, virgl)
   GL_ARB_texture_barrier                                DONE (freedreno, i965, nv50, r600, virgl)
   GL_KHR_context_flush_control                          DONE (all - but needs GLX/EGL extension to be useful)
   GL_KHR_robustness                                     DONE (i965)
   GL_EXT_shader_integer_mix                             DONE (all drivers that support GLSL)
@@ -225,39 +228,39 @@ GL 4.6, GLSL 4.60
   GL_ARB_gl_spirv                                       in progress (Nicolai Hähnle, Ian Romanick)
   GL_ARB_indirect_parameters                            DONE (i965/gen7+, nvc0, radeonsi)
   GL_ARB_pipeline_statistics_query                      DONE (i965, nvc0, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_polygon_offset_clamp                           DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, swr)
   GL_ARB_shader_atomic_counter_ops                      DONE (i965/gen7+, nvc0, radeonsi, softpipe)
   GL_ARB_pipeline_statistics_query                      DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe, swr)
   GL_ARB_polygon_offset_clamp                           DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, llvmpipe, swr, virgl)
   GL_ARB_shader_atomic_counter_ops                      DONE (freedreno/a5xx, i965/gen7+, nvc0, r600, radeonsi, softpipe, virgl)
   GL_ARB_shader_draw_parameters                         DONE (i965, nvc0, radeonsi)
   GL_ARB_shader_group_vote                              DONE (i965, nvc0, radeonsi)
   GL_ARB_spirv_extensions                               in progress (Nicolai Hähnle, Ian Romanick)
   GL_ARB_texture_filter_anisotropic                     DONE (i965, nv50, nvc0, r600, radeonsi, softpipe (*), llvmpipe (*))
   GL_ARB_transform_feedback_overflow_query              DONE (i965/gen6+, radeonsi, llvmpipe, softpipe)
   GL_KHR_no_error                                       started (Timothy Arceri)
   GL_ARB_texture_filter_anisotropic                     DONE (freedreno, i965, nv50, nvc0, r600, radeonsi, softpipe (*), llvmpipe (*))
   GL_ARB_transform_feedback_overflow_query              DONE (i965/gen6+, nvc0, radeonsi, llvmpipe, softpipe, virgl)
   GL_KHR_no_error                                       DONE (all drivers)
 (*) softpipe and llvmpipe advertise 16x anisotropy but simply ignore the setting
 These are the extensions cherry-picked to make GLES 3.1
 GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, radeonsi
 GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, r600, radeonsi, virgl
   GL_ARB_arrays_of_arrays                               DONE (all drivers that support GLSL 1.30)
   GL_ARB_compute_shader                                 DONE (i965/gen7+, softpipe)
   GL_ARB_draw_indirect                                  DONE (i965/gen7+, r600, llvmpipe, softpipe, swr)
   GL_ARB_compute_shader                                 DONE (freedreno/a5xx, i965/gen7+, softpipe)
   GL_ARB_draw_indirect                                  DONE (freedreno, i965/gen7+, llvmpipe, softpipe, swr)
   GL_ARB_explicit_uniform_location                      DONE (all drivers that support GLSL)
   GL_ARB_framebuffer_no_attachments                     DONE (i965/gen7+, r600, softpipe)
   GL_ARB_framebuffer_no_attachments                     DONE (freedreno, i965/gen7+, softpipe)
   GL_ARB_program_interface_query                        DONE (all drivers)
   GL_ARB_shader_atomic_counters                         DONE (i965/gen7+, softpipe)
   GL_ARB_shader_image_load_store                        DONE (i965/gen7+, softpipe)
   GL_ARB_shader_image_size                              DONE (i965/gen7+, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (i965/gen7+, softpipe)
   GL_ARB_shader_atomic_counters                         DONE (freedreno/a5xx, i965/gen7+, softpipe)
   GL_ARB_shader_image_load_store                        DONE (freedreno/a5xx, i965/gen7+, softpipe)
   GL_ARB_shader_image_size                              DONE (freedreno/a5xx, i965/gen7+, softpipe)
   GL_ARB_shader_storage_buffer_object                   DONE (freedreno/a5xx, i965/gen7+, softpipe)
   GL_ARB_shading_language_packing                       DONE (all drivers)
   GL_ARB_separate_shader_objects                        DONE (all drivers)
   GL_ARB_stencil_texturing                              DONE (nv50, r600, llvmpipe, softpipe, swr)
   GL_ARB_texture_multisample (Multisample textures)     DONE (i965/gen7+, nv50, r600, llvmpipe, softpipe)
   GL_ARB_stencil_texturing                              DONE (freedreno, nv50, llvmpipe, softpipe, swr)
   GL_ARB_texture_multisample (Multisample textures)     DONE (freedreno/a5xx, i965/gen7+, nv50, llvmpipe, softpipe)
   GL_ARB_texture_storage_multisample                    DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_vertex_attrib_binding                          DONE (all drivers)
   GS5 Enhanced textureGather                            DONE (i965/gen7+, r600)
   GS5 Packing/bitfield/conversion functions             DONE (i965/gen6+, r600)
   GS5 Enhanced textureGather                            DONE (freedreno, i965/gen7+)
   GS5 Packing/bitfield/conversion functions             DONE (freedreno/a5xx, i965/gen6+)
   GL_EXT_shader_integer_mix                             DONE (all drivers that support GLSL)
   Additional functionality not covered above:
@@ -266,72 +269,138 @@ GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, radeonsi
       glGetBooleani_v - restrict to GLES enums
       gl_HelperInvocation support                       DONE (i965, r600)
 GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+
 GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+, radeonsi, virgl
   GL_EXT_color_buffer_float                             DONE (all drivers)
   GL_KHR_blend_equation_advanced                        DONE (i965, nvc0)
   GL_KHR_debug                                          DONE (all drivers)
   GL_KHR_robustness                                     DONE (i965, nvc0, radeonsi)
   GL_KHR_texture_compression_astc_ldr                   DONE (i965/gen9+)
   GL_KHR_robustness                                     DONE (i965, nvc0)
   GL_KHR_texture_compression_astc_ldr                   DONE (freedreno, i965/gen9+)
   GL_OES_copy_image                                     DONE (all drivers)
   GL_OES_draw_buffers_indexed                           DONE (all drivers that support GL_ARB_draw_buffers_blend)
   GL_OES_draw_elements_base_vertex                      DONE (all drivers)
   GL_OES_geometry_shader                                DONE (i965/hsw+, nvc0, radeonsi)
   GL_OES_geometry_shader                                DONE (i965/hsw+, nvc0)
   GL_OES_gpu_shader5                                    DONE (all drivers that support GL_ARB_gpu_shader5)
   GL_OES_primitive_bounding_box                         DONE (i965/gen7+, nvc0, radeonsi)
   GL_OES_sample_shading                                 DONE (i965, nvc0, r600, radeonsi)
   GL_OES_sample_variables                               DONE (i965, nvc0, r600, radeonsi)
   GL_OES_primitive_bounding_box                         DONE (i965/gen7+, nvc0)
   GL_OES_sample_shading                                 DONE (i965, nvc0, r600)
   GL_OES_sample_variables                               DONE (i965, nvc0, r600)
   GL_OES_shader_image_atomic                            DONE (all drivers that support GL_ARB_shader_image_load_store)
   GL_OES_shader_io_blocks                               DONE (All drivers that support GLES 3.1)
   GL_OES_shader_multisample_interpolation               DONE (i965, nvc0, r600, radeonsi)
   GL_OES_shader_multisample_interpolation               DONE (i965, nvc0, r600)
   GL_OES_tessellation_shader                            DONE (all drivers that support GL_ARB_tessellation_shader)
   GL_OES_texture_border_clamp                           DONE (all drivers)
   GL_OES_texture_buffer                                 DONE (i965, nvc0, radeonsi)
   GL_OES_texture_cube_map_array                         DONE (i965/hsw+, nvc0, radeonsi)
   GL_OES_texture_buffer                                 DONE (freedreno, i965, nvc0)
   GL_OES_texture_cube_map_array                         DONE (i965/hsw+, nvc0)
   GL_OES_texture_stencil8                               DONE (all drivers that support GL_ARB_texture_stencil8)
   GL_OES_texture_storage_multisample_2d_array           DONE (all drivers that support GL_ARB_texture_multisample)
 Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES version:
   GL_ARB_bindless_texture                               DONE (radeonsi)
   GL_ARB_bindless_texture                               DONE (nvc0, radeonsi)
   GL_ARB_cl_event                                       not started
   GL_ARB_compute_variable_group_size                    DONE (nvc0, radeonsi)
   GL_ARB_ES3_2_compatibility                            DONE (i965/gen8+)
   GL_ARB_fragment_shader_interlock                      not started
   GL_ARB_ES3_2_compatibility                            DONE (i965/gen8+, radeonsi, virgl)
   GL_ARB_fragment_shader_interlock                      DONE (i965)
   GL_ARB_gpu_shader_int64                               DONE (i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe)
   GL_ARB_parallel_shader_compile                        not started, but Chia-I Wu did some related work in 2014
   GL_ARB_post_depth_coverage                            DONE (i965)
   GL_ARB_post_depth_coverage                            DONE (i965, nvc0)
   GL_ARB_robustness_isolation                           not started
   GL_ARB_sample_locations                               not started
   GL_ARB_seamless_cubemap_per_texture                   DONE (i965, nvc0, radeonsi, r600, softpipe, swr)
   GL_ARB_sample_locations                               DONE (nvc0)
   GL_ARB_seamless_cubemap_per_texture                   DONE (freedreno, i965, nvc0, radeonsi, r600, softpipe, swr, virgl)
   GL_ARB_shader_ballot                                  DONE (i965/gen8+, nvc0, radeonsi)
   GL_ARB_shader_clock                                   DONE (i965/gen7+, nv50, nvc0, radeonsi)
   GL_ARB_shader_stencil_export                          DONE (i965/gen9+, radeonsi, softpipe, llvmpipe, swr)
   GL_ARB_shader_clock                                   DONE (i965/gen7+, nv50, nvc0, r600, radeonsi, virgl)
   GL_ARB_shader_stencil_export                          DONE (i965/gen9+, r600, radeonsi, softpipe, llvmpipe, swr, virgl)
   GL_ARB_shader_viewport_layer_array                    DONE (i965/gen6+, nvc0, radeonsi)
   GL_ARB_sparse_buffer                                  DONE (radeonsi/CIK+)
   GL_ARB_sparse_texture                                 not started
   GL_ARB_sparse_texture2                                not started
   GL_ARB_sparse_texture_clamp                           not started
   GL_ARB_texture_filter_minmax                          not started
   GL_EXT_memory_object                                  DONE (radeonsi)
   GL_EXT_memory_object_fd                               DONE (radeonsi)
   GL_EXT_memory_object_win32                            not started
   GL_EXT_render_snorm                                   DONE (i965, radeonsi)
   GL_EXT_semaphore                                      DONE (radeonsi)
   GL_EXT_semaphore_fd                                   DONE (radeonsi)
   GL_EXT_semaphore_win32                                not started
   GL_EXT_texture_norm16                                 DONE (i965, r600, radeonsi, nvc0)
   GL_KHR_blend_equation_advanced_coherent               DONE (i965/gen9+)
   GL_KHR_texture_compression_astc_hdr                   DONE (i965/bxt)
   GL_KHR_texture_compression_astc_sliced_3d             DONE (i965/gen9+)
   GL_KHR_texture_compression_astc_sliced_3d             DONE (i965/gen9+, radeonsi)
   GL_OES_depth_texture_cube_map                         DONE (all drivers that support GLSL 1.30+)
   GL_OES_EGL_image                                      DONE (all drivers)
   GL_OES_EGL_image_external_essl3                       not started
   GL_OES_EGL_image_external                             DONE (all drivers)
   GL_OES_EGL_image_external_essl3                       DONE (all drivers)
   GL_OES_required_internalformat                        DONE (all drivers)
   GL_OES_surfaceless_context                            DONE (all drivers)
   GL_OES_texture_compression_astc                       DONE (core only)
   GL_OES_texture_float                                  DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_float_linear                           DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_half_float                             DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_half_float_linear                      DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_view                                   not started - based on GL_ARB_texture_view
   GL_OES_texture_float                                  DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_float_linear                           DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_half_float                             DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_half_float_linear                      DONE (freedreno, i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
   GL_OES_texture_view                                   DONE (freedreno, i965/gen8+, r600, radeonsi, nv50, nvc0, softpipe, llvmpipe, swr)
   GL_OES_viewport_array                                 DONE (i965, nvc0, radeonsi)
   GLX_ARB_context_flush_control                         not started
   GLX_ARB_robustness_application_isolation              not started
   GLX_ARB_robustness_share_group_isolation              not started
 GL_EXT_direct_state_access subfeatures (in the spec order):
   GL 1.1: Client commands                               not started
   GL 1.0-1.3: Matrix and transpose matrix commands      not started
   GL 1.1-1.2: Texture commands                          not started
   GL 1.2: 3D texture commands                           not started
   GL 1.2.1: Multitexture commands                       not started
   GL 1.2.1-3.0: Indexed texture commands                not started
   GL 1.2.1-3.0: Indexed generic queries                 not started
   GL 1.2.1: EnableIndexed.. Get*Indexed                 not started
   GL_ARB_vertex_program                                 not started
   GL 1.3: Compressed texture and multitexture commands  not started
   GL 1.5: Buffer commands                               not started
   GL 2.0-2.1: Uniform and uniform matrix commands       not started
   GL_EXT_texture_buffer_object                          not started
   GL_EXT_texture_integer                                not started
   GL_EXT_gpu_shader4                                    not started
   GL_EXT_gpu_program_parameters                         not started
   GL_NV_gpu_program4                                    n/a
   GL_NV_framebuffer_multisample_coverage                n/a
   GL 3.0: Renderbuffer/framebuffer commands, Gen*Mipmap not started
   GL 3.0: CopyBuffer command                            not started
   GL_EXT_geometry_shader4 commands (expose in GL 3.2)   not started
   GL_NV_explicit_multisample                            n/a
   GL 3.0: Vertex array/attrib/query/map commands        not started
   Matrix GL tokens                                      not started
 GL_EXT_direct_state_access additions from other extensions (complete list):
   GL_AMD_framebuffer_sample_positions                   n/a
   GL_AMD_gpu_shader_int64                               not started
   GL_ARB_bindless_texture                               not started
   GL_ARB_buffer_storage                                 not started
   GL_ARB_clear_buffer_object                            not started
   GL_ARB_framebuffer_no_attachments                     not started
   GL_ARB_gpu_shader_fp64                                not started
   GL_ARB_instanced_arrays                               not started
   GL_ARB_internalformat_query2                          not started
   GL_ARB_sparse_texture                                 n/a
   GL_ARB_sparse_buffer                                  not started
   GL_ARB_texture_buffer_range                           not started
   GL_ARB_texture_storage                                not started
   GL_ARB_texture_storage_multisample                    not started
   GL_ARB_vertex_attrib_64bit                            not started
   GL_ARB_vertex_attrib_binding                          not started
   GL_EXT_buffer_storage                                 not started
   GL_EXT_external_buffer                                not started
   GL_EXT_separate_shader_objects                        n/a
   GL_EXT_sparse_texture                                 n/a
   GL_EXT_texture_storage                                n/a
   GL_EXT_vertex_attrib_64bit                            not started
   GL_EXT_EGL_image_storage                              n/a
   GL_NV_bindless_texture                                n/a
   GL_NV_gpu_shader5                                     n/a
   GL_NV_texture_multisample                             n/a
   GL_NV_vertex_buffer_unified_memory                    n/a
   GL_NVX_linked_gpu_multicast                           n/a
   GLX_NV_copy_buffer                                    n/a
 The following extensions are not part of any OpenGL or OpenGL ES version, and
 we DO NOT WANT implementations of these extensions for Mesa.
@@ -343,39 +412,55 @@ we DO NOT WANT implementations of these extensions for Mesa.
 Vulkan 1.0 -- all DONE: anv, radv
 Khronos extensions that are not part of any Vulkan version:
 Vulkan 1.1 -- all DONE: anv, radv
   VK_KHR_16bit_storage                                  in progress (Alejandro)
   VK_KHR_android_surface                                not started
   VK_KHR_bind_memory2                                   DONE (anv, radv)
   VK_KHR_dedicated_allocation                           DONE (anv, radv)
   VK_KHR_descriptor_update_template                     DONE (anv, radv)
   VK_KHR_display                                        not started
   VK_KHR_display_swapchain                              not started
   VK_KHR_external_fence                                 not started
   VK_KHR_external_fence_capabilities                    not started
   VK_KHR_external_fence_fd                              not started
   VK_KHR_external_fence_win32                           not started
   VK_KHR_device_group                                   not started
   VK_KHR_device_group_creation                          not started
   VK_KHR_external_fence                                 DONE (anv, radv)
   VK_KHR_external_fence_capabilities                    DONE (anv, radv)
   VK_KHR_external_memory                                DONE (anv, radv)
   VK_KHR_external_memory_capabilities                   DONE (anv, radv)
   VK_KHR_external_memory_fd                             DONE (anv, radv)
   VK_KHR_external_memory_win32                          not started
   VK_KHR_external_semaphore                             DONE (radv)
   VK_KHR_external_semaphore_capabilities                DONE (radv)
   VK_KHR_external_semaphore_fd                          DONE (radv)
   VK_KHR_external_semaphore_win32                       not started
   VK_KHR_external_semaphore                             DONE (anv, radv)
   VK_KHR_external_semaphore_capabilities                DONE (anv, radv)
   VK_KHR_get_memory_requirements2                       DONE (anv, radv)
   VK_KHR_get_physical_device_properties2                DONE (anv, radv)
   VK_KHR_get_surface_capabilities2                      DONE (anv)
   VK_KHR_incremental_present                            DONE (anv, radv)
   VK_KHR_maintenance1                                   DONE (anv, radv)
   VK_KHR_maintenance2                                   DONE (anv, radv)
   VK_KHR_maintenance3                                   DONE (anv, radv)
   VK_KHR_multiview                                      DONE (anv, radv)
   VK_KHR_relaxed_block_layout                           DONE (anv, radv)
   VK_KHR_sampler_ycbcr_conversion                       DONE (anv)
   VK_KHR_shader_draw_parameters                         DONE (anv, radv)
   VK_KHR_storage_buffer_storage_class                   DONE (anv, radv)
   VK_KHR_variable_pointers                              DONE (anv, radv)
 Khronos extensions that are not part of any Vulkan version:
   VK_KHR_8bit_storage                                   DONE (anv)
   VK_KHR_android_surface                                not started
   VK_KHR_create_renderpass2                             DONE (anv, radv)
   VK_KHR_display                                        DONE (anv, radv)
   VK_KHR_display_swapchain                              DONE (anv, radv)
   VK_KHR_draw_indirect_count                            DONE (radv)
   VK_KHR_external_fence_fd                              DONE (anv, radv)
   VK_KHR_external_fence_win32                           not started
   VK_KHR_external_memory_fd                             DONE (anv, radv)
   VK_KHR_external_memory_win32                          not started
   VK_KHR_external_semaphore_fd                          DONE (anv, radv)
   VK_KHR_external_semaphore_win32                       not started
   VK_KHR_get_display_properties2                        DONE (anv, radv)
   VK_KHR_get_surface_capabilities2                      DONE (anv, radv)
   VK_KHR_image_format_list                              DONE (anv, radv)
   VK_KHR_incremental_present                            DONE (anv, radv)
   VK_KHR_mir_surface                                    not started
   VK_KHR_push_descriptor                                DONE (anv, radv)
   VK_KHR_sampler_mirror_clamp_to_edge                   DONE (anv, radv)
   VK_KHR_shader_draw_parameters                         DONE (anv, radv)
   VK_KHR_shared_presentable_image                       not started
   VK_KHR_storage_buffer_storage_class                   DONE (anv, radv)
   VK_KHR_surface                                        DONE (anv, radv)
   VK_KHR_swapchain                                      DONE (anv, radv)
   VK_KHR_variable_pointers                              DONE (anv, radv)
   VK_KHR_wayland_surface                                DONE (anv, radv)
   VK_KHR_win32_keyed_mutex                              not started
   VK_KHR_win32_surface                                  not started

									
										2

docs/helpwanted.html
									
												
				@@ -47,7 +47,7 @@ You can find some further To-do lists here:

				<b>Common To-Do lists:</b>

				</p>

				<ul>

				  <li><a href="https://cgit.freedesktop.org/mesa/mesa/tree/docs/features.txt">

				  <li><a href="https://gitlab.freedesktop.org/mesa/mesa/blob/master/docs/features.txt">

				    <b>features.txt</b></a> - Status of OpenGL 3.x / 4.x features in Mesa.</li>

				</ul>

									
										289

docs/index.html
									
												
				@@ -15,6 +15,291 @@

				<div class="content">

				<h1>News</h1>

				<h2>January 17, 2019</h2>

				<p>

				<a href="relnotes/18.3.2.html">Mesa 18.3.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>December 27, 2018</h2>

				<p>

				<a href="relnotes/18.2.8.html">Mesa 18.2.8</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: It is anticipated that 18.2.8 will be the final release in the

				18.2 series. Users of 18.2 are encouraged to migrate to the 18.3

				series in order to obtain future fixes.

				</p>

				<h2>December 13, 2018</h2>

				<p>

				<a href="relnotes/18.2.7.html">Mesa 18.2.7</a> is released.

				This is a bug-fix release.

				</p>

				<h2>December 11, 2018</h2>

				<p>

				<a href="relnotes/18.3.1.html">Mesa 18.3.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>December 7, 2018</h2>

				<p>

				<a href="relnotes/18.3.0.html">Mesa 18.3.0</a> is released.  This is a

				new development release.  See the release notes for more information

				about the release.

				</p>

				<h2>November 28, 2018</h2>

				<p>

				<a href="relnotes/18.2.6.html">Mesa 18.2.6</a> is released.

				This is a bug-fix release.

				</p>

				<h2>November 15, 2018</h2>

				<p>

				<a href="relnotes/18.2.5.html">Mesa 18.2.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>October 31, 2018</h2>

				<p>

				<a href="relnotes/18.2.4.html">Mesa 18.2.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>October 19, 2018</h2>

				<p>

				<a href="relnotes/18.2.3.html">Mesa 18.2.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>October 5, 2018</h2>

				<p>

				<a href="relnotes/18.2.2.html">Mesa 18.2.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>September 24, 2018</h2>

				<p>

				<a href="relnotes/18.1.9.html">Mesa 18.1.9</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: It is anticipated that 18.1.9 will be the final release in the

				18.1 series. Users of 18.1 are encouraged to migrate to the 18.2

				series in order to obtain future fixes.

				</p>

				<h2>September 21, 2018</h2>

				<p>

				<a href="relnotes/18.2.1.html">Mesa 18.2.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>September 7, 2018</h2>

				<p>

				<a href="relnotes/18.1.8.html">Mesa 18.1.8</a> and

				<a href="relnotes/18.2.0.html">Mesa 18.2.0</a> are released.

				These are, respectively, a bug-fix release from the 18.1 branch and a

				new development release.  See the release notes for more information

				about the releases.

				</p>

				<h2>August 24, 2018</h2>

				<p>

				<a href="relnotes/18.1.7.html">Mesa 18.1.7</a> is released.

				This is a bug-fix release.

				</p>

				<h2>August 13, 2018</h2>

				<p>

				<a href="relnotes/18.1.6.html">Mesa 18.1.6</a> is released.

				This is a bug-fix release.

				</p>

				<h2>July 27, 2018</h2>

				<p>

				<a href="relnotes/18.1.5.html">Mesa 18.1.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>July 13, 2018</h2>

				<p>

				<a href="relnotes/18.1.4.html">Mesa 18.1.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>June 29, 2018</h2>

				<p>

				<a href="relnotes/18.1.3.html">Mesa 18.1.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>June 15, 2018</h2>

				<p>

				<a href="relnotes/18.1.2.html">Mesa 18.1.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>June 3, 2018</h2>

				<p>

				<a href="relnotes/18.0.5.html">Mesa 18.0.5</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: It is anticipated that 18.0.5 will be the final release in the

				18.0 series. Users of 18.0 are encouraged to migrate to the 18.1

				series in order to obtain future fixes.

				</p>

				<h2>June 1, 2018</h2>

				<p>

				<a href="relnotes/18.1.1.html">Mesa 18.1.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>May 18, 2018</h2>

				<p>

				<a href="relnotes/18.1.0.html">Mesa 18.1.0</a> is released.  This is a

				new development release.  See the release notes for more information

				about the release.

				</p>

				<h2>May 17, 2018</h2>

				<p>

				<a href="relnotes/18.0.4.html">Mesa 18.0.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>May 7, 2018</h2>

				<p>

				<a href="relnotes/18.0.3.html">Mesa 18.0.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>April 28, 2018</h2>

				<p>

				<a href="relnotes/18.0.2.html">Mesa 18.0.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>April 18, 2018</h2>

				<p>

				<a href="relnotes/18.0.1.html">Mesa 18.0.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>April 18, 2018</h2>

				<p>

				<a href="relnotes/17.3.9.html">Mesa 17.3.9</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: It is anticipated that 17.3.9 will be the final release in the

				17.3 series. Users of 17.3 are encouraged to migrate to the 18.0

				series in order to obtain future fixes.

				</p>

				<h2>April 03, 2018</h2>

				<p>

				<a href="relnotes/17.3.8.html">Mesa 17.3.8</a> is released.

				This is a bug-fix release.

				</p>

				<h2>March 27, 2018</h2>

				<p>

				<a href="relnotes/18.0.0.html">Mesa 18.0.0</a> is released.  This is a

				new development release.  See the release notes for more information

				about the release.

				</p>

				<h2>March 21, 2018</h2>

				<p>

				<a href="relnotes/17.3.7.html">Mesa 17.3.7</a> is released.

				This is a bug-fix release.

				</p>

				<h2>February 26, 2018</h2>

				<p>

				<a href="relnotes/17.3.6.html">Mesa 17.3.6</a> is released.

				This is a bug-fix release.

				</p>

				<h2>February 19, 2018</h2>

				<p>

				<a href="relnotes/17.3.5.html">Mesa 17.3.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>February 15, 2018</h2>

				<p>

				<a href="relnotes/17.3.4.html">Mesa 17.3.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>January 18, 2018</h2>

				<p>

				<a href="relnotes/17.3.3.html">Mesa 17.3.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>January 8, 2018</h2>

				<p>

				<a href="relnotes/17.3.2.html">Mesa 17.3.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>December 22, 2017</h2>

				<p>

				<a href="relnotes/17.2.8.html">Mesa 17.2.8</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: It is anticipated that 17.2.8 will be the final release in the

				17.2 series. Users of 17.2 are encouraged to migrate to the 17.3

				series in order to obtain future fixes.

				</p>

				<h2>December 21, 2017</h2>

				<p>

				<a href="relnotes/17.3.1.html">Mesa 17.3.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>December 14, 2017</h2>

				<p>

				<a href="relnotes/17.2.7.html">Mesa 17.2.7</a> is released.

				This is a bug-fix release.

				</p>

				<h2>December 8, 2017</h2>

				<p>

				<a href="relnotes/17.3.0.html">Mesa 17.3.0</a> is released.  This is a

				new development release.  See the release notes for more information

				about the release.

				</p>

				<h2>November 25, 2017</h2>

				<p>

				<a href="relnotes/17.2.6.html">Mesa 17.2.6</a> is released.

				This is a bug-fix release.

				</p>

				<h2>November 10, 2017</h2>

				<p>

				<a href="relnotes/17.2.5.html">Mesa 17.2.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>October 30, 2017</h2>

				<p>

				<a href="relnotes/17.2.4.html">Mesa 17.2.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>October 19, 2017</h2>

				<p>

				<a href="relnotes/17.2.3.html">Mesa 17.2.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>October 2, 2017</h2>

				<p>

				@@ -26,6 +311,10 @@ This is a bug-fix release.

				<p>

				<a href="relnotes/17.1.10.html">Mesa 17.1.10</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: It is anticipated that 17.1.10 will be the final release in the

				17.1 series. Users of 17.1 are encouraged to migrate to the 17.2

				series in order to obtain future fixes.

				</p>

				<h2>September 17, 2017</h2>

									
										48

docs/install.html
									
												
				@@ -22,6 +22,7 @@

				  <li><a href="#prereq-general">General prerequisites</a>

				  <li><a href="#prereq-dri">For DRI and hardware acceleration</a>

				  </ul>

				<li><a href="#meson">Building with meson</a>

				<li><a href="#autoconf">Building with autoconf (Linux/Unix/X11)</a>

				<li><a href="#scons">Building with SCons (Windows/Linux)</a>

				<li><a href="#android">Building with AOSP (Android)</a>

				@@ -39,9 +40,10 @@ Build system.

				</p>

				<ul>

				<li>Autoconf is required when building on *nix platforms.

				<li><a href="https://mesonbuild.com">meson</a> is recommended when building on *nix platforms.

				<li>Autoconf is another option when building on *nix platforms.

				<li><a href="http://www.scons.org/">SCons</a> is required for building on

				Windows and optional for Linux (it's an alternative to autoconf/automake.)

				Windows and optional for Linux (it's an alternative to autoconf/automake or meson.)

				</li>

				<li>Android Build system when building as native Android component. Autoconf

				is used when building ARC.

				@@ -57,7 +59,7 @@ willing to maintain support for other compiler get in touch.

				<ul>

				<li>GCC 4.2.0 or later (some parts of Mesa may require later versions)

				<li>clang - exact minimum requirement is currently unknown.

				<li>Microsoft Visual Studio 2013 Update 4 or later is required, for building on Windows.

				<li>Microsoft Visual Studio 2015 or later is required, for building on Windows.

				</ul>

				@@ -72,10 +74,12 @@ you think you've spotted a bug let developers know by filing a

				<ul>

				<li><a href="https://www.python.org/">Python</a> - Python is required.

				Version 2.6.4 or later should work.

				When building with scons 2.7 is required.

				When building with meson 3.5 or newer is required.

				When building with autotools 2.7, or 3.5 or later are required.

				</li>

				<li><a href="http://www.makotemplates.org/">Python Mako module</a> -

				Python Mako module is required. Version 0.3.4 or later should work.

				Python Mako module is required. Version 0.8.0 or later should work.

				</li>

				<li>lex / yacc - for building the Mesa IR and GLSL compiler.

				<div>

				@@ -111,11 +115,31 @@ the packaging tool used by your distro.

				  ... # others

				</pre>

				<h1 id="autoconf">2. Building with autoconf (Linux/Unix/X11)</h1>

				<h1 id="meson">2. Building with meson</h1>

				<p>

				The primary method to build Mesa on Unix systems is with autoconf.

				Meson is the newest build system in Mesa. It can currently build for

				*nix systems like Linux and BSD, and will be able to build for Windows as well.

				</p>

				<p>

				The general approach is:

				</p>

				<pre>

				  meson builddir/

				  ninja -C builddir/

				  sudo ninja -C builddir/ install

				</pre>

				<p>

				Please read the <a href="meson.html">detailed meson instructions</a>

				for more information.

				</p>

				<h1 id="autoconf">3. Building with autoconf (Linux/Unix/X11)</h1>

				<p>

				Although meson is recommended, another supported way to build on *nix systems

				is with autoconf.

				</p>

				<p>

				@@ -133,7 +157,7 @@ for more details.

				<h1 id="scons">3. Building with SCons (Windows/Linux)</h1>

				<h1 id="scons">4. Building with SCons (Windows/Linux)</h1>

				<p>

				To build Mesa with SCons on Linux or Windows do

				@@ -169,7 +193,7 @@ Additional information is available in <a href="README.WIN32">README.WIN32</a>.

				<h1 id="android">4. Building with AOSP (Android)</h1>

				<h1 id="android">5. Building with AOSP (Android)</h1>

				<p>

				Currently one can build Mesa for Android as part of the AOSP project, yet

				@@ -188,7 +212,7 @@ Android-x86 and/or other resources.

				</p>

				<h1 id="libs">5. Library Information</h1>

				<h1 id="libs">6. Library Information</h1>

				<p>

				When compilation has finished, look in the top-level <code>lib/</code>

				@@ -226,7 +250,7 @@ versions of libGL and device drivers.

				</p>

				<h1 id="pkg-config">6. Building OpenGL programs with pkg-config</h1>

				<h1 id="pkg-config">7. Building OpenGL programs with pkg-config</h1>

				<p>

				Running <code>make install</code> will install package configuration files

									
										34

docs/llvmpipe.html
									
												
				@@ -20,7 +20,7 @@

				The Gallium llvmpipe driver is a software rasterizer that uses LLVM to

				do runtime code generation.

				Shaders, point/line/triangle rasterization and vertex processing are

				implemented with LLVM IR which is translated to x86 or x86-64 machine

				implemented with LLVM IR which is translated to x86, x86-64, or ppc64le machine

				code.

				Also, the driver is multithreaded to take advantage of multiple CPU cores

				(up to 8 at this time).

				@@ -32,24 +32,36 @@ It's the fastest software rasterizer for Mesa.

				<ul>

				<li>

				   <p>An x86 or amd64 processor; 64-bit mode recommended.</p>

				   <p>

				   For x86 or amd64 processors, 64-bit mode is recommended.

				   Support for SSE2 is strongly encouraged.  Support for SSE3 and SSE4.1 will

				   yield the most efficient code.  The fewer features the CPU has the more

				   likely is that you run into underperforming, buggy, or incomplete code.

				   likely it is that you will run into underperforming, buggy, or incomplete code.

				   </p>

				   <p>

				   For ppc64le processors, use of the Altivec feature (the Vector

				   Facility) is recommended if supported; use of the VSX feature (the

				   Vector-Scalar Facility) is recommended if supported AND Mesa is

				   built with LLVM version 4.0 or later.

				   </p>

				   <p>

				   See /proc/cpuinfo to know what your CPU supports.

				   </p>

				</li>

				<li>

				   <p>LLVM: version 3.4 recommended; 3.3 or later required.</p>

				   <p>Unless otherwise stated, LLVM version 3.4 is recommended; 3.3 or later is required.</p>

				   <p>

				   For Linux, on a recent Debian based distribution do:

				   </p>

				<pre>

				     aptitude install llvm-dev

				</pre>

				   <p>

				   If you want development snapshot builds of LLVM for Debian and derived

				   distributions like Ubuntu, you can use the APT repository at <a

				   href="https://apt.llvm.org/" title="Debian Development packages for LLVM"

				   >apt.llvm.org</a>, which are maintained by Debian's LLVM maintainer.

				   </p>

				   <p>

				   For a RPM-based distribution do:

				   </p>

				@@ -108,10 +120,10 @@ To build everything on Linux invoke scons as:

				  scons build=debug libgl-xlib

				</pre>

				Alternatively, you can build it with GNU make, if you prefer, by invoking it as

				Alternatively, you can build it with autoconf/make with:

				<pre>

				  make linux-llvm

				  ./configure --enable-glx=gallium-xlib --with-gallium-drivers=swrast --disable-dri --disable-gbm --disable-egl

				  make

				</pre>

				but the rest of these instructions assume that scons is used.

				@@ -228,8 +240,8 @@ build/linux-???-debug/gallium/drivers/llvmpipe:

				</ul>

				<p>

				Some of this tests can output results and benchmarks to a tab-separated-file

				for posterior analysis, e.g.:

				Some of these tests can output results and benchmarks to a tab-separated file

				for later analysis, e.g.:

				</p>

				<pre>

				  build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv

				@@ -240,8 +252,8 @@ for posterior analysis, e.g.:

				<ul>

				<li>

				  When looking to this code by the first time start in lp_state_fs.c, and 

				  then skim through the lp_bld_* functions called in there, and the comments

				  When looking at this code for the first time, start in lp_state_fs.c, and

				  then skim through the lp_bld_* functions called there, and the comments

				  at the top of the lp_bld_*.c functions.

				</li>

				<li>

									
										3

docs/mesa.css
									
												
				@@ -29,6 +29,9 @@ pre {

					/*font-family: monospace;*/

					font-size: 10pt;

					/*color: black;*/

					background-color: #eee;

					margin-left: 2em;

					padding: .5em;

				}

				iframe {

									
										336

docs/meson.html
									
										Normal file
									
												
				@@ -0,0 +1,336 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Compilation and Installation using Meson</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Compilation and Installation using Meson</h1>

				<ul>

				  <li><a href="#basic">Basic Usage</a></li>

				  <li><a href="#cross-compilation">Cross-compilation and 32-bit builds</a></li>

				</ul>

				<h2 id="basic">1. Basic Usage</h2>

				<p><strong>The Meson build system is generally considered stable and ready

				for production</strong></p>

				<p>The meson build is tested on Linux, macOS, Cygwin, Haiku, FreeBSD,

				DragonflyBSD, and NetBSD, and should work on OpenBSD.</p>

				<p><strong>Mesa requires Meson >= 0.45.0 to build.</strong>

				Some older versions of meson do not check that they are too old and will error

				out in odd ways.

				</p>

				<p>

				The meson program is used to configure the source directory and generates

				either a ninja build file or Visual Studio® build files. The latter must

				be enabled via the <code>--backend</code> switch, as ninja is the default backend on all

				operating systems. Meson only supports out-of-tree builds, and must be passed a

				directory to put built and generated sources into. We'll call that directory

				"build" for examples.

				</p>

				<pre>

				    meson build/

				</pre>

				<p>

				To see a description of your options you can run <code>meson configure</code>

				along with a build directory to view the selected options for. This will show

				your meson global arguments and project arguments, along with their defaults

				and your local settings.

				</p>

				<p>

				Meson does not currently support listing options before configuring a build

				directory, but this feature is being discussed upstream.

				For now, the only way to see what options exist is to look at the

				<code>meson_options.txt</code> file at the root of the project.

				</p>

				<pre>

				    meson configure build/

				</pre>

				<p>

				With additional arguments <code>meson configure</code> is used to change

				options on an already configured build directory. All options passed to this

				command are in the form <code>-D "command"="value"</code>.

				</p>

				<pre>

				    meson configure build/ -Dprefix=/tmp/install -Dglx=true

				</pre>

				<p>

				Note that options taking lists (such as <code>platforms</code>) are

				<a href="http://mesonbuild.com/Build-options.html#using-build-options">a bit

				more complicated</a>, but the simplest form compatible with Mesa options

				is to use a comma to separate values (<code>-D platforms=drm,wayland</code>)

				and brackets to represent an empty list (<code>-D platforms=[]</code>).

				</p>

				<p>

				Once you've run the initial <code>meson</code> command successfully you can use

				your configured backend to build the project. With ninja, the -C option can

				be used to point at a directory to build.

				</p>

				<pre>

				    ninja -C build/

				</pre>

				<p>

				Without arguments, it will produce libGL.so and/or several other libraries

				depending on the options you have chosen. Later, if you want to rebuild for a

				different configuration, you should run <code>ninja clean</code> before

				changing the configuration, or create a new out of tree build directory for

				each configuration you want to build

				<a href="http://mesonbuild.com/Using-multiple-build-directories.html">as

				recommended in the documentation</a>.

				</p>

				<p>

				Autotools automatically updates translation files as part of the build process;

				meson does not do this. Instead, if you want translated drirc files you will need

				to invoke non-default targets for ninja to update them:

				<code>ninja -C build/ xmlpool-pot xmlpool-update-po xmlpool-gmo</code>

				</p>

				<dl>

				<dt><code>Environment Variables</code></dt>

				<dd><p>Meson supports the standard CC and CXX environment variables for

				changing the default compiler. Meson does support CFLAGS, CXXFLAGS, etc., but

				their use is discouraged because of the many caveats in using them. It is

				recommended to use <code>-D${lang}_args</code> and

				<code>-D${lang}_link_args</code> instead. Among the benefits of these options

				is that they are guaranteed to persist across rebuilds and reconfigurations.

				Meson does not allow changing the compiler in a configured builddir; you will need

				to create a new build dir for a different compiler.

				</p>

				<pre>

				    CC=clang CXX=clang++ meson build-clang

				    ninja -C build-clang

				    ninja -C build-clang clean

				    meson configure build-clang -Dc_args="-Wno-typedef-redefinition"

				    ninja -C build-clang

				</pre>

				<p>

				The default compiler depends on your operating system. Meson supports most of

				the popular compilers; a complete list is available

				<a href="http://mesonbuild.com/Reference-tables.html#compiler-ids">here</a>.

				</p>

				<p>Meson also honors <code>DESTDIR</code> for installs</p>
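
				<p>
				As a minimal sketch of a staged install, assuming an already configured
				build directory named <code>build/</code> and a purely illustrative staging
				path <code>/tmp/mesa-stage</code>, <code>DESTDIR</code> can be set for the
				install step:
				</p>

				<pre>
				    # /tmp/mesa-stage is a hypothetical path chosen for this example
				    DESTDIR=/tmp/mesa-stage ninja -C build/ install
				</pre>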

				</dd>

				<dt><code>LLVM</code></dt>

				<dd><p>Meson includes upstream logic to wrap llvm-config using its standard

				dependency interface.

				</p></dd>

				<dd><p>

				As of meson 0.49.0 meson also has the concept of a

				<a href="https://mesonbuild.com/Native-environments.html">"native file"</a>.

				These files provide information about the native build environment (as opposed

				to a cross build environment). They are ini formatted and can override where to

				find llvm-config:

				custom-llvm.ini

				<pre>

				    [binaries]

				    llvm-config = '/usr/local/bin/llvm/llvm-config'

				</pre>

				Then configure meson:

				<pre>

				    meson builddir/ --native-file custom-llvm.ini

				</pre>

				</p></dd>

				<dd><p>

				For selecting llvm-config for cross compiling a

				<a href="https://mesonbuild.com/Cross-compilation.html#defining-the-environment">"cross file"</a>

				should be used. It uses the same format as the native file above:

				cross-llvm.ini

				<pre>

				    [binaries]

				    ...

				    llvm-config = '/usr/lib/llvm-config-32'

				</pre>

				Then configure meson:

				<pre>

				    meson builddir/ --cross-file cross-llvm.ini

				</pre>

				See the <a href="#cross-compilation">Cross Compilation</a> section for more information.

				</p></dd>

				<dd><p>

				For older versions of meson <code>$PATH</code> (or <code>%PATH%</code> on

				windows) will be searched for llvm-config (and llvm-config$version and

				llvm-config-$version), you can override this environment variable to control

				the search: <code>PATH=/path/with/llvm-config:$PATH meson build</code>.

				</p></dd>

				</dl>

				<dl>

				<dt><code>PKG_CONFIG_PATH</code></dt>

				<dd><p>The

				<code>pkg-config</code> utility is a hard requirement for configuring and

				building Mesa on Unix-like systems. It is used to search for external libraries

				on the system. This environment variable is used to control the search path for

				<code>pkg-config</code>. For instance, setting

				<code>PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig</code> will search for package

				metadata in <code>/usr/X11R6</code> before the standard directories.</p>

				</dd>

				</dl>

				<p>

				One of the oddities of meson is that some options are passed differently to

				<code>meson</code> than to <code>meson configure</code>. These options are

				passed as --option=foo to <code>meson</code>, but -Doption=foo to <code>meson

				configure</code>. Mesa defined options are always passed as -Doption=foo.

				</p>
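
				<p>
				The difference might look like the following sketch, which assumes a build
				directory named <code>build/</code>; the option values are only examples.
				Here <code>buildtype</code> is a meson built-in option and
				<code>platforms</code> is a Mesa-defined one:
				</p>

				<pre>
				    # initial configuration: built-in options are passed as --option=value
				    meson build/ --buildtype=release -Dplatforms=drm,wayland

				    # existing build directory: every option is passed as -Doption=value
				    meson configure build/ -Dbuildtype=debugoptimized -Dplatforms=drm,wayland
				</pre>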

				<p>For those coming from autotools be aware of the following:</p>

				<dl>

				<dt><code>--buildtype/-Dbuildtype</code></dt>

				<dd><p>This option will set the compiler debug/optimisation levels to aid

				debugging the Mesa libraries.</p>

				<p>Note that in meson this defaults to <code>debugoptimized</code>, and

				not setting it to <code>release</code> will yield non-optimal

				performance and binary size. Not using <code>debug</code> may interfere

				with debugging as some code and validation will be optimized away.

				</p>

				<p> For those wishing to pass their own optimization flags, use the <code>plain</code>

				buildtype, which causes meson to inject no additional compiler arguments, only

				those in the C/CXXFLAGS and those that mesa itself defines.</p>

				</dd>

				</dl>

				<dl>

				<dt><code>-Db_ndebug</code></dt>

				<dd><p>This option controls assertions in meson projects. When set to <code>false</code>

				(the default) assertions are enabled; when set to <code>true</code> they are disabled. This

				is unrelated to the <code>buildtype</code>; setting the latter to

				<code>release</code> will not turn off assertions.

				</p>

				</dd>

				</dl>
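
				<p>
				Because the two options are independent, a build meant for deployment would
				typically set both. A minimal sketch, assuming an already configured build
				directory named <code>build/</code>:
				</p>

				<pre>
				    # optimized build with assertions explicitly turned off
				    meson configure build/ -Dbuildtype=release -Db_ndebug=true
				    ninja -C build/
				</pre>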

				<h2 id="cross-compilation">2. Cross-compilation and 32-bit builds</h2>

				<p><a href="https://mesonbuild.com/Cross-compilation.html">Meson supports

				cross-compilation</a> by specifying a number of binary paths and

				settings in a file and passing this file to <code>meson</code> or

				<code>meson configure</code> with the <code>--cross-file</code>

				parameter.</p>

				<p>This file can live at any location, but you can use the bare filename

				(without the folder path) if you put it in $XDG_DATA_HOME/meson/cross or

				~/.local/share/meson/cross</p>

				<p>Below are a few examples of cross files, but keep in mind that you

				will likely have to alter them for your system.</p>

				<p>

				Those running on ArchLinux can use the AUR-maintained packages for some

				of those, as they'll have the right values for your system:

				<ul>

				  <li><a href="https://aur.archlinux.org/packages/meson-cross-x86-linux-gnu">meson-cross-x86-linux-gnu</a></li>

				  <li><a href="https://aur.archlinux.org/packages/meson-cross-aarch64-linux-gnu">meson-cross-aarch64-linux-gnu</a></li>

				</ul>

				</p>

				<p>

				32-bit build on x86 linux:

				<pre>

				[binaries]

				c = '/usr/bin/gcc'

				cpp = '/usr/bin/g++'

				ar = '/usr/bin/gcc-ar'

				strip = '/usr/bin/strip'

				pkgconfig = '/usr/bin/pkg-config-32'

				llvm-config = '/usr/bin/llvm-config32'

				[properties]

				c_args = ['-m32']

				c_link_args = ['-m32']

				cpp_args = ['-m32']

				cpp_link_args = ['-m32']

				[host_machine]

				system = 'linux'

				cpu_family = 'x86'

				cpu = 'i686'

				endian = 'little'

				</pre>

				</p>

				<p>

				64-bit build on ARM linux:

				<pre>

				[binaries]

				c = '/usr/bin/aarch64-linux-gnu-gcc'

				cpp = '/usr/bin/aarch64-linux-gnu-g++'

				ar = '/usr/bin/aarch64-linux-gnu-gcc-ar'

				strip = '/usr/bin/aarch64-linux-gnu-strip'

				pkgconfig = '/usr/bin/aarch64-linux-gnu-pkg-config'

				exe_wrapper = '/usr/bin/qemu-aarch64-static'

				[host_machine]

				system = 'linux'

				cpu_family = 'aarch64'

				cpu = 'aarch64'

				endian = 'little'

				</pre>

				</p>

				<p>

				64-bit build on x86 windows:

				<pre>

				[binaries]

				c = '/usr/bin/x86_64-w64-mingw32-gcc'

				cpp = '/usr/bin/x86_64-w64-mingw32-g++'

				ar = '/usr/bin/x86_64-w64-mingw32-ar'

				strip = '/usr/bin/x86_64-w64-mingw32-strip'

				pkgconfig = '/usr/bin/x86_64-w64-mingw32-pkg-config'

				exe_wrapper = 'wine'

				[host_machine]

				system = 'windows'

				cpu_family = 'x86_64'

				cpu = 'i686'

				endian = 'little'

				</pre>

				</p>

				</div>

				</body>

				</html>

31

docs/patents.txt


@@ -1,31 +0,0 @@
 ARB_texture_float:
     Silicon Graphics, Inc. owns US Patent #6,650,327, issued November 18,
 [1].
     SGI believes this patent contains necessary IP for graphics systems
     implementing floating point rasterization and floating point
     framebuffer capabilities described in ARB_texture_float extension, and
     will discuss licensing on RAND terms, on an individual basis with
     companies wishing to use this IP in the context of conformant OpenGL
     implementations [2].
     The source code to implement ARB_texture_float extension is included
     and can be toggled on at compile time, for those who purchased a
     license from SGI, or are in a country where the patent does not apply,
     etc.
     The software is provided "as is", without warranty of any kind, express
     or implied, including but not limited to the warranties of
     merchantability, fitness for a particular purpose and noninfringement.
     In no event shall the authors or copyright holders be liable for any
     claim, damages or other liability, whether in an action of contract,
     tort or otherwise, arising from, out of or in connection with the
     software or the use or other dealings in the software.
     You should contact a lawyer or SGI's legal department if you want to
     enable this extension.
 [1] https://www.google.com/patents/about?id=mIIOAAAAEBAJ&dq=6650327
 [2] https://www.opengl.org/registry/specs/ARB/texture_float.txt

									
										2

docs/precompiled.html
									
												
				@@ -24,10 +24,12 @@ Some Linux distributions closely follow the latest Mesa releases. On others one

				has to use unofficial channels.

				<br>

				There are some general directions:

				<ul>

				<li>Debian/Ubuntu based distros - PPA: xorg-edgers, oibaf and padoka</li>

				<li>Fedora - Copr: erp and che</li>

				<li>OpenSuse/SLES - OBS: X11:XOrg and pontostroy:X11</li>

				<li>Gentoo/Archlinux - officially provided/supported</li>

				</ul>

				</p>

				</div>

									
										167

docs/release-calendar.html
									
												
				@@ -23,6 +23,16 @@ Mesa provides feature/development and stable releases.

				The table below lists the date and release manager that is expected to do the

				specific release.

				<br>

				Regular updates will ensure that the schedule for the current and the

				next two feature releases are shown in the table.

				<br>

				In order to keep the whole releasing team up to date with the tools

				used, best practices and other details, the member in charge of the

				next feature release will be in constant rotation.

				<br>

				The way the release schedule works is

				explained <a href="releasing.html#schedule" target="_parent">here</a>.

				<br>

				Take a look <a href="submittingpatches.html#criteria" target="_parent">here</a>

				if you'd like to nominate a patch in the next stable release.

				</p>

				@@ -39,78 +49,129 @@ if you'd like to nominate a patch in the next stable release.

				<th>Notes</th>

				</tr>

				<tr>

				<td rowspan="5">17.2</td>

				<td>2017-10-13</td>

				<td>17.2.3</td>

				<td rowspan="4">18.3</td>

				<td>2019-01-30</td>

				<td>18.3.3</td>

				<td>Emil Velikov</td>

				<td></td>

				<td>

				</tr>

				<tr>

				<td>2017-10-27</td>

				<td>17.2.4</td>

				<td>2019-02-13</td>

				<td>18.3.4</td>

				<td>Emil Velikov</td>

				<td>

				</tr>

				<tr>

				<td>2019-02-27</td>

				<td>18.3.5</td>

				<td>Emil Velikov</td>

				<td>

				</tr>

				<tr>

				<td>2019-03-13</td>

				<td>18.3.6</td>

				<td>Emil Velikov</td>

				<td>Last planned 18.3.x release</td>

				</tr>

				<tr>

				<td rowspan="4">19.0</td>

				<td>2019-01-29</td>

				<td>19.0.0-rc1</td>

				<td>Dylan Baker</td>

				<td>

				</tr>

				<tr>

				<td>2019-02-05</td>

				<td>19.0.0-rc2</td>

				<td>Dylan Baker</td>

				<td>

				</tr>

				<tr>

				<td>2019-02-12</td>

				<td>19.0.0-rc3</td>

				<td>Dylan Baker</td>

				<td>

				</tr>

				<tr>

				<td>2019-02-19</td>

				<td>19.0.0-rc4</td>

				<td>Dylan Baker</td>

				<td>Last planned RC/Final release</td>

				</tr>

				<tr>

				<td rowspan="4">19.1</td>

				<td>2019-04-30</td>

				<td>19.1.0-rc1</td>

				<td>Andres Gomez</td>

				<td></td>

				<td>

				</tr>

				<tr>

				<td>2017-11-10</td>

				<td>17.2.5</td>

				<td>2019-05-07</td>

				<td>19.1.0-rc2</td>

				<td>Andres Gomez</td>

				<td></td>

				<td>

				</tr>

				<tr>

				<td>2017-11-24</td>

				<td>17.2.6</td>

				<td>2019-05-14</td>

				<td>19.1.0-rc3</td>

				<td>Andres Gomez</td>

				<td></td>

				<td>

				</tr>

				<tr>

				<td>2017-12-08</td>

				<td>17.2.7</td>

				<td>Emil Velikov</td>

				<td>Final planned release for the 17.2 series</td>

				</tr>

				<tr>

				<td rowspan="7">17.3</td>

				<td>2017-10-20</td>

				<td>17.3.0-rc1</td>

				<td>Emil Velikov</td>

				<td></td>

				</tr>

				<tr>

				<td>2017-10-27</td>

				<td>17.3.0-rc2</td>

				<td>Emil Velikov</td>

				<td></td>

				</tr>

				<tr>

				<td>2017-11-03</td>

				<td>17.3.0-rc3</td>

				<td>Emil Velikov</td>

				<td></td>

				</tr>

				<tr>

				<td>2017-11-10</td>

				<td>17.3.0-rc4</td>

				<td>Emil Velikov</td>

				<td>May be promoted to 17.3.0 final</td>

				</tr>

				<tr>

				<td>2017-11-24</td>

				<td>17.3.1</td>

				<td>2019-05-21</td>

				<td>19.1.0-rc4</td>

				<td>Andres Gomez</td>

				<td></td>

				<td>Last planned RC/Final release</td>

				</tr>

				<tr>

				<td>2017-12-08</td>

				<td>17.3.2</td>

				<td rowspan="4">19.2</td>

				<td>2019-08-06</td>

				<td>19.2.0-rc1</td>

				<td>Emil Velikov</td>

				<td></td>

				<td>

				</tr>

				<tr>

				<td>2017-12-22</td>

				<td>17.3.3</td>

				<td>2019-08-13</td>

				<td>19.2.0-rc2</td>

				<td>Emil Velikov</td>

				<td></td>

				<td>

				</tr>

				<tr>

				<td>2019-08-20</td>

				<td>19.2.0-rc3</td>

				<td>Emil Velikov</td>

				<td>

				</tr>

				<tr>

				<td>2019-08-27</td>

				<td>19.2.0-rc4</td>

				<td>Emil Velikov</td>

				<td>Last planned RC/Final release</td>

				</tr>

				<tr>

				<td rowspan="4">19.3</td>

				<td>2019-10-15</td>

				<td>19.3.0-rc1</td>

				<td>Juan A. Suarez</td>

				<td>

				</tr>

				<tr>

				<td>2019-10-22</td>

				<td>19.3.0-rc2</td>

				<td>Juan A. Suarez</td>

				<td>

				</tr>

				<tr>

				<td>2019-10-29</td>

				<td>19.3.0-rc3</td>

				<td>Juan A. Suarez</td>

				<td>

				</tr>

				<tr>

				<td>2019-11-05</td>

				<td>19.3.0-rc4</td>

				<td>Juan A. Suarez</td>

				<td>Last planned RC/Final release</td>

				</tr>

				</table>

									
										114

docs/releasing.html
									
												
				@@ -21,6 +21,7 @@

				<li><a href="#overview">Overview</a>

				<li><a href="#schedule">Release schedule</a>

				<li><a href="#pickntest">Cherry-pick and test</a>

				<li><a href="#stagingbranch">Staging branch</a>

				<li><a href="#branch">Making a branchpoint</a>

				<li><a href="#prerelease">Pre-release announcement</a>

				<li><a href="#release">Making a new release</a>

				@@ -54,10 +55,11 @@ For example:

				<h1 id="schedule">Release schedule</h1>

				<p>

				Releases should happen on Fridays. Delays can occur although those should be keep

				to a minimum.

				Releases should happen on Wednesdays. Delays can occur although those

				should be kept to a minimum.

				<br>

				See our <a href="release-calendar.html" target="_parent">calendar</a> for the

				See our <a href="release-calendar.html" target="_parent">calendar</a>

				for information about how the release schedule is planned, and the

				date and other details for individual releases.

				</p>

				@@ -66,6 +68,9 @@ date and other details for individual releases.

				<li>Available approximately every three months.

				<li>Initial timeplan available 2-4 weeks before the planned branchpoint (rc1)

				on the mesa-announce@ mailing list.

				<li>Typically, the final release will happen after 4

				candidates. Additional ones may be needed in order to resolve blocking

				regressions, though.

				<li>A <a href="#prerelease">pre-release</a> announcement should be available

				approximately 24 hours before the final (non-rc) release.

				</ul>

				@@ -83,6 +88,12 @@ Note: There is one or two releases overlap when changing branches. For example:

				<br>

				The final release from the 12.0 series Mesa 12.0.5 will be out around the same

				time (or shortly after) 13.0.1 is out.

				<br>

				This also means that, because a final release may be delayed by the

				need for additional candidates to resolve blocking regressions,

				the release manager might have to update

				the <a href="release-calendar.html" target="_parent">calendar</a> with

				additional bug fix releases of the current stable branch.

				</p>

				@@ -96,7 +107,7 @@ described in the same section.

				<p>

				Nomination happens in the mesa-stable@ mailing list. However,

				maintainer is resposible of checking for forgotten candidates in the

				the maintainer is responsible for checking for forgotten candidates in the

				master branch. This is achieved by a combination of ad-hoc scripts and

				a casual search for terms such as regression, fix, broken and similar.

				</p>
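The casual keyword search described above could be sketched as a small shell filter. This is only an illustration: the helper name `filter_fix_candidates` and the revision range in the usage comment are assumptions, not part of the Mesa tooling.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mesa's scripts): read one-line commit
# summaries on stdin and keep those that mention a typical fix keyword.
# The match is case-insensitive and uses the terms named in the text.
filter_fix_candidates() {
    grep -i -E 'regression|fix|broken'
}

# Example usage; the revision range is an assumed example:
#   git log --oneline --no-merges origin/18.2..origin/master | filter_fix_candidates
```

A keyword filter like this only narrows the list. Each hit still has to be checked by hand against the nomination criteria.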

				@@ -111,18 +122,21 @@ the autoconf and scons build.

				<p>Done continuously up-to the <a href="#prerelease">pre-release</a> announcement.</p>

				<p>

				As an exception, patches can be applied up-to the last ~1h before the actual

				release. This is made <strong>only</strong> with explicit permission/request,

				and the patch <strong>must</strong> be very well contained. Thus it cannot

				affect more than one driver/subsystem.

				</p>

				<p>

				Currently Ilia Mirkin and AMD devs have requested "permanent" exception.

				Developers can request, <em>as an exception</em>, patches to be applied up-to

				the last one hour before the actual release. This is made <strong>only</strong>

				with explicit permission/request, and the patch <strong>must</strong> be very

				well contained. Thus it cannot affect more than one driver/subsystem.

				</p>

				<p>The following developers have requested a permanent exception:</p>

				<ul>

				<li>make distcheck, scons and scons check must pass

				<li><em>Ilia Mirkin</em>

				<li><em>AMD team</em>

				</ul>

				<p>The following must pass:</p>

				<ul>

				<li>make distcheck, scons and scons check

				<li>Testing with different versions of system components (LLVM and others) is also

				performed where possible.

				<li>As a general rule, testing with various combinations of configure

				@@ -130,9 +144,9 @@ switches, depending on the specific patchset.

				</ul>

				<p>

				Achieved by combination of local ad-hoc scripts, mingw-w64 cross

				compilation and AppVeyor plus Travis-CI, the latter as part of their

				Github integration.

				These are achieved by a combination of <a href="#basictesting">local testing</a>,

				which includes mingw-w64 cross compilation and AppVeyor plus Travis-CI, the

				latter two as part of their Github integration.

				</p>

				<p>

				@@ -209,6 +223,25 @@ system and making some every day's use until the release may be a good

				idea too.

				</p>

				<h1 id="stagingbranch">Staging branch</h1>

				<p>

				A live branch, which contains the currently merged/rejected patches, is available

				in the main repository under <code>staging/X.Y</code>. For example:

				</p>

				<pre>

					staging/18.1 - WIP branch for the 18.1 series

					staging/18.2 - WIP branch for the 18.2 series

				</pre>

				<p>

				Notes:

				</p>

				<ul>

				<li>People are encouraged to test the staging branch and report regressions.</li>

				<li>The branch history is not stable and it <strong>will</strong> be rebased.</li>

				</ul>
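Because the staging branch is rebased, pulling into an older local copy may not apply cleanly. One way to follow it is to reset the local branch to the remote one on each update, sketched here on the assumption that the main repository is the `origin` remote. The helper names `staging_branch` and `sync_staging` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the staging branch name for a series such as 18.2.
# The pattern is only a loose sanity check on the X.Y form.
staging_branch() {
    case "$1" in
        [0-9]*.[0-9]*) printf 'staging/%s\n' "$1" ;;
        *) echo "expected a series such as 18.2, got '$1'" >&2; return 1 ;;
    esac
}

# Hypothetical helper: fetch the remote (assumed to be "origin") and re-point
# the local branch at the remote state. Local commits on that branch are
# dropped, which is intended here because the branch history is rebased.
sync_staging() {
    branch=$(staging_branch "$1") || return 1
    git fetch origin &&
    git checkout -B "$branch" "origin/$branch"
}
```

For example, `sync_staging 18.2` would update a local `staging/18.2` checkout before testing.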

				<h1 id="branch">Making a branchpoint</h1>

				@@ -272,6 +305,11 @@ It is followed by a brief period (normally 24 or 48 hours) before the actual

				release is made.

				</p>

				<p>

				Remember to add a note warning that this is the final release in a series, if

				that is the case.

				</p>

				<h2>Terminology used</h2>

				<ul><li>Nominated</ul>

				@@ -311,6 +349,10 @@ The candidate for the Mesa X.Y.Z is now available. Currently we have:

				 - NUMBER nominated (outstanding)

				 - and NUMBER rejected patches

				[If applicable:

				Note: this is the final anticipated release in the SERIES series. Users are

				encouraged to migrate to the NEXT_SERIES series in order to obtain future fixes.]

				BRIEF SUMMARY OF CHANGES

				Take a look at section "Mesa stable queue" for more information.

				@@ -374,6 +416,9 @@ Queued (NUMBER)

				AUTHOR (NUMBER):

				      COMMIT SUMMARY

				[If applicable:

				Squashed with

				      COMMIT SUMMARY]

				For example:

				@@ -382,16 +427,21 @@ Jonas Pfeil (1):

				Squashed with

				      ralloc: don't leave out the alignment factor

				Rejected (NUMBER)

				=================

				Rejected (11)

				=============

				AUTHOR (NUMBER):

				      SHA     COMMIT SUMMARY

				Reason: ...

				For example:

				Emil Velikov (1)

				      a39ad18 configure.ac: honour LLVM_LIBDIR when linking against LLVM

				Reason: The patch was reverted shortly after it was merged.

				</pre>

				@@ -408,7 +458,7 @@ Ensure the latest code is available - both in your local master and the

				relevant branch.

				</p>

				<h3>Perform basic testing</h3>

				<h3 id="basictesting">Perform basic testing</h3>

				<p>

				Most of the testing should already be done during the

				@@ -457,9 +507,9 @@ Here is one solution that I've been using.

					cd .. &amp;&amp; rm -rf mesa-$__version

					# Test the automake binaries

					tar -xaf mesa-$__version.tar.xz &amp;&amp; cd mesa-$__version

					# Restore LLVM_CONFIG, if applicable:

					# export LLVM_CONFIG=`echo $save_LLVM_CONFIG`; unset save_LLVM_CONFIG

					tar -xaf mesa-$__version.tar.xz &amp;&amp; cd mesa-$__version

					./configure \

						--with-dri-drivers=i965,swrast \

						--with-gallium-drivers=swrast \

				@@ -471,10 +521,14 @@ Here is one solution that I've been using.

						--enable-egl \

						--with-platforms=x11,drm,wayland,surfaceless

					make &amp;&amp; DESTDIR=`pwd`/test make install

					__glxinfo_cmd='glxinfo 2>&amp;1 | egrep -o "Mesa.*|Gallium.*|.*dri\.so"'

					__glxgears_cmd='glxgears 2>&amp;1 | grep -v "configuration file"'

					__es2info_cmd='es2_info 2>&amp;1 | egrep "GL_VERSION|GL_RENDERER|.*dri\.so"'

					__es2gears_cmd='es2gears_x11 2>&amp;1 | grep -v "configuration file"'

					# Drop LLVM_CONFIG, if applicable:

					# unset LLVM_CONFIG

					__glxinfo_cmd='glxinfo 2&gt;&amp;1 | egrep -o "Mesa.*|Gallium.*|.*dri\.so"'

					__glxgears_cmd='glxgears 2&gt;&amp;1 | grep -v "configuration file"'

					__es2info_cmd='es2_info 2&gt;&amp;1 | egrep "GL_VERSION|GL_RENDERER|.*dri\.so"'

					__es2gears_cmd='es2gears_x11 2&gt;&amp;1 | grep -v "configuration file"'

					test "x$LD_LIBRARY_PATH" != 'x' &amp;&amp; __old_ld="$LD_LIBRARY_PATH"

					export LD_LIBRARY_PATH=`pwd`/test/usr/local/lib/:"${__old_ld}"

					export LIBGL_DRIVERS_PATH=`pwd`/test/usr/local/lib/dri/

				@@ -500,8 +554,10 @@ Here is one solution that I've been using.

					unset LIBGL_DRIVERS_PATH

					unset LIBGL_DEBUG

					unset LIBGL_ALWAYS_SOFTWARE

					unset GALLIUM_DRIVER

					export VK_ICD_FILENAMES=`pwd`/src/intel/vulkan/dev_icd.json

					steam steam://rungameid/570  -vconsole -vulkan

					unset VK_ICD_FILENAMES

				</pre>

				<h3>Update version in file VERSION</h3>

				@@ -580,7 +636,8 @@ Something like the following steps will do the trick:

				<p>

				Also, edit docs/relnotes.html to add a link to the new release notes,

				edit docs/index.html to add a news entry, and remove the version from

				edit docs/index.html to add a news entry and a note in case of the

				last release in a series, and remove the version from

				docs/release-calendar.html. Then commit and push:

				</p>

				@@ -596,6 +653,11 @@ docs/release-calendar.html. Then commit and push:

				Use the generated template during the releasing process.

				</p>

				<p>

				Again, remember to add a note warning that this is the final release in a

				series, if that is the case.

				</p>

				<h1 id="website">Update the mesa3d.org website</h1>

									
										44

docs/relnotes.html
									
												
				@@ -21,6 +21,50 @@ The release notes summarize what's new or changed in each Mesa release.

				</p>

				<ul>

				<li><a href="relnotes/18.3.2.html">18.3.2 release notes</a>

				<li><a href="relnotes/18.2.8.html">18.2.8 release notes</a>

				<li><a href="relnotes/18.2.7.html">18.2.7 release notes</a>

				<li><a href="relnotes/18.3.1.html">18.3.1 release notes</a>

				<li><a href="relnotes/18.3.0.html">18.3.0 release notes</a>

				<li><a href="relnotes/18.2.6.html">18.2.6 release notes</a>

				<li><a href="relnotes/18.2.5.html">18.2.5 release notes</a>

				<li><a href="relnotes/18.2.4.html">18.2.4 release notes</a>

				<li><a href="relnotes/18.2.3.html">18.2.3 release notes</a>

				<li><a href="relnotes/18.2.2.html">18.2.2 release notes</a>

				<li><a href="relnotes/18.1.9.html">18.1.9 release notes</a>

				<li><a href="relnotes/18.2.1.html">18.2.1 release notes</a>

				<li><a href="relnotes/18.2.0.html">18.2.0 release notes</a>

				<li><a href="relnotes/18.1.8.html">18.1.8 release notes</a>

				<li><a href="relnotes/18.1.7.html">18.1.7 release notes</a>

				<li><a href="relnotes/18.1.6.html">18.1.6 release notes</a>

				<li><a href="relnotes/18.1.5.html">18.1.5 release notes</a>

				<li><a href="relnotes/18.1.4.html">18.1.4 release notes</a>

				<li><a href="relnotes/18.1.3.html">18.1.3 release notes</a>

				<li><a href="relnotes/18.1.2.html">18.1.2 release notes</a>

				<li><a href="relnotes/18.0.5.html">18.0.5 release notes</a>

				<li><a href="relnotes/18.1.1.html">18.1.1 release notes</a>

				<li><a href="relnotes/18.1.0.html">18.1.0 release notes</a>

				<li><a href="relnotes/18.0.4.html">18.0.4 release notes</a>

				<li><a href="relnotes/18.0.3.html">18.0.3 release notes</a>

				<li><a href="relnotes/18.0.2.html">18.0.2 release notes</a>

				<li><a href="relnotes/18.0.1.html">18.0.1 release notes</a>

				<li><a href="relnotes/17.3.9.html">17.3.9 release notes</a>

				<li><a href="relnotes/17.3.8.html">17.3.8 release notes</a>

				<li><a href="relnotes/18.0.0.html">18.0.0 release notes</a>

				<li><a href="relnotes/17.3.7.html">17.3.7 release notes</a>

				<li><a href="relnotes/17.3.6.html">17.3.6 release notes</a>

				<li><a href="relnotes/17.3.5.html">17.3.5 release notes</a>

				<li><a href="relnotes/17.3.4.html">17.3.4 release notes</a>

				<li><a href="relnotes/17.3.3.html">17.3.3 release notes</a>

				<li><a href="relnotes/17.3.2.html">17.3.2 release notes</a>

				<li><a href="relnotes/17.2.8.html">17.2.8 release notes</a>

				<li><a href="relnotes/17.3.1.html">17.3.1 release notes</a>

				<li><a href="relnotes/17.2.7.html">17.2.7 release notes</a>

				<li><a href="relnotes/17.3.0.html">17.3.0 release notes</a>

				<li><a href="relnotes/17.2.6.html">17.2.6 release notes</a>

				<li><a href="relnotes/17.2.5.html">17.2.5 release notes</a>

				<li><a href="relnotes/17.2.4.html">17.2.4 release notes</a>

				<li><a href="relnotes/17.2.3.html">17.2.3 release notes</a>

				<li><a href="relnotes/17.2.2.html">17.2.2 release notes</a>

				<li><a href="relnotes/17.1.10.html">17.1.10 release notes</a>

				<li><a href="relnotes/17.2.1.html">17.2.1 release notes</a>

									
										181

docs/relnotes/17.2.3.html
									
										Normal file
									
												
				@@ -0,0 +1,181 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.2.3 Release Notes / October 19, 2017</h1>

				<p>

				Mesa 17.2.3 is a bug fix release which fixes bugs found since the 17.2.2 release.

				</p>

				<p>

				Mesa 17.2.3 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				fb305eecfeec1fd771fdc96fff973c51871f7bd35fd2bd56cacc27b4b8823220  mesa-17.2.3.tar.gz

				a0b0ec8f7b24dd044d7ab30a8c7e6d3767521e245f88d4ed5dd93315dc56f837  mesa-17.2.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101832">Bug 101832</a> - [PATCH][regression][bisect] Xorg fails to start after f50aa21456d82c8cb6fbaa565835f1acc1720a5d</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102852">Bug 102852</a> - Scons: Support the new Scons 3.0.0</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102940">Bug 102940</a> - Regression: Vulkan KMS rendering crashes since 17.2</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Smith (1):</p>

				<ul>

				  <li>radv: Add R16G16B16A16_SNORM fast clear support</li>

				</ul>

				<p>Bas Nieuwenhuizen (2):</p>

				<ul>

				  <li>nir/spirv: Allow loop breaks in a switch body.</li>

				  <li>radv: Only set the MTYPE flags on GFX9+.</li>

				</ul>

				<p>Ben Crocker (4):</p>

				<ul>

				  <li>gallivm: fix typo in debug_printf message</li>

				  <li>gallivm: allow additional llc options</li>

				  <li>gallivm/ppc64le: adjust VSX code generation control.</li>

				  <li>gallivm/ppc64le: allow environmental control of Altivec code generation</li>

				</ul>

				<p>Daniel Stone (2):</p>

				<ul>

				  <li>egl/wayland: Check queryImage return for wl_buffer</li>

				  <li>egl/wayland: Don't use dmabuf with no modifiers</li>

				</ul>

				<p>Dave Airlie (2):</p>

				<ul>

				  <li>radv: emit fmuladd instead of fma to llvm.</li>

				  <li>radv: lower ffma in nir.</li>

				</ul>

				<p>Emil Velikov (6):</p>

				<ul>

				  <li>cherry-ignore: add "anv: Remove unreachable cases from isl_format_for_size"</li>

				  <li>cherry-ignore: add "anv/wsi: Allocate enough memory for the entire image"</li>

				  <li>swr/rast: do not crash on NULL strings returned by getenv</li>

				  <li>wayland-drm: use a copy of the wayland_drm_callbacks struct</li>

				  <li>eglmesaext: add forward declaration for struct wl_buffers</li>

				  <li>Update version to 17.2.3</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>scons: use python3-compatible print()</li>

				</ul>

				<p>Ilia Mirkin (2):</p>

				<ul>

				  <li>nv50/ir: fix 64-bit integer shifts</li>

				  <li>nv50,nvc0: fix push hint logic in presence of a start offset</li>

				</ul>

				<p>Jason Ekstrand (6):</p>

				<ul>

				  <li>intel/compiler: Don't cmod propagate into a saturated operation</li>

				  <li>intel/compiler: Don't propagate cmod into integer multiplies</li>

				  <li>glsl/blob: Return false from ensure_can_read on overrun</li>

				  <li>glsl/blob: Return false from grow_to_fit if we've ever failed</li>

				  <li>nir/opcodes: Fix constant-folding of ufind_msb</li>

				  <li>nir: Get rid of the variable on vote intrinsics</li>

				</ul>

				<p>Juan A. Suarez Romero (1):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.2.2</li>

				</ul>

				<p>Józef Kucia (3):</p>

				<ul>

				  <li>anv: Fix vkCmdFillBuffer()</li>

				  <li>spirv: Fix SpvOpAtomicISub</li>

				  <li>anv: Do not assert() on VK_ATTACHMENT_UNUSED</li>

				</ul>

				<p>Leo Liu (3):</p>

				<ul>

				  <li>st/va: use pipe transfer_map to map upload buffer</li>

				  <li>st/vdpau: don't re-allocate interlaced buffer with packed YUV format</li>

				  <li>st/va: don't re-allocate interlaced buffer with pakced format</li>

				</ul>

				<p>Lionel Landwerlin (4):</p>

				<ul>

				  <li>intel: compiler: vec4: add missing default 0 lod</li>

				  <li>anv/cmd_buffer: fix push descriptors with set &gt; 0</li>

				  <li>anv/cmd_buffer: Reset state in cmd_buffer_destroy</li>

				  <li>anv: bo_cache: allow importing a BO larger than needed</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>mesa: fix texture updates for ATI_fragment_shader</li>

				  <li>st/mesa: don't use pipe_surface for passing information about EGLImage</li>

				  <li>glsl_to_tgsi: fix instruction order for bindless textures</li>

				</ul>

				<p>Nicolai Hähnle (14):</p>

				<ul>

				  <li>st/glsl_to_tgsi: fix conditional assignments to packed shader outputs</li>

				  <li>amd/common: fix build_cube_select</li>

				  <li>radeonsi/gfx9: fix geometry shaders without output vertices</li>

				  <li>util/queue: fix a race condition in the fence code</li>

				  <li>glsl/lower_instruction: handle denorms and overflow in ldexp correctly</li>

				  <li>radeonsi: move current_rast_prim to r600_common_context</li>

				  <li>radeonsi: don't discard points and lines</li>

				  <li>radeonsi: deduce rast_prim correctly for tessellation point mode</li>

				  <li>radeonsi: fix maximum advertised point size / line width</li>

				  <li>st/mesa: don't clobber glGetInternalformat* buffer for GL_NUM_SAMPLE_COUNTS</li>

				  <li>st/glsl_to_tgsi: fix indirect access to 64-bit integer</li>

				  <li>st/glsl_to_tgsi: fix a use-after-free in merge_two_dsts</li>

				  <li>radeonsi: clamp depth comparison value only for fixed point formats</li>

				  <li>radeonsi: clamp border colors for upgraded depth textures</li>

				</ul>

				<p>Rob Clark (2):</p>

				<ul>

				  <li>freedreno/a5xx: align height to GMEM</li>

				  <li>freedreno/a5xx: fix missing restore state</li>

				</ul>

				</div>

				</body>

				</html>

									
										132

docs/relnotes/17.2.4.html
									
										Normal file
									
												
				@@ -0,0 +1,132 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.2.4 Release Notes / October 30, 2017</h1>

				<p>

				Mesa 17.2.4 is a bug fix release which fixes bugs found since the 17.2.3 release.

				</p>

				<p>

				Mesa 17.2.4 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				cb266edc5cf7226219ebaf556ca2e03dff282e0324d20afd80423a5754d1272c  mesa-17.2.4.tar.gz

				5ba408fecd6e1132e5490eec1a2f04466214e4c65c8b89b331be844768c2e550  mesa-17.2.4.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102774">Bug 102774</a> - [BDW] [Bisected] Absolute constant buffers break VAAPI in mpv</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103388">Bug 103388</a> - Linking libcltgsi.la (llvm/codegen/libclllvm_la-common.lo) fails with &quot;error: no match for 'operator-'&quot; with GCC-7, Mesa from Git and current LLVM revisions</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (8):</p>

				<ul>

				  <li>cherry-ignore: configure.ac: rework llvm detection and handling</li>

				  <li>cherry-ignore: glsl: fix derived cs variables</li>

				  <li>cherry-ignore: added 17.3 nominations.</li>

				  <li>cherry-ignore: radv: Don't use vgpr indexing for outputs on GFX9.</li>

				  <li>cherry-ignore: radv: Disallow indirect outputs for GS on GFX9 as well.</li>

				  <li>cherry-ignore: mesa/bufferobj: don't double negate the range</li>

				  <li>cherry-ignore: broadcom/vc5: Propagate vc4 aliasing fix to vc5.</li>

				  <li>Update version to 17.2.4</li>

				</ul>

				<p>Bas Nieuwenhuizen (1):</p>

				<ul>

				  <li>ac/nir: Fix nir_texop_lod on GFX for 1D arrays.</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>radv/image: bump all the offset to uint64_t.</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.2.3</li>

				</ul>

				<p>Henri Verbeet (1):</p>

				<ul>

				  <li>vulkan/wsi: Free the event in x11_manage_fifo_queues().</li>

				</ul>

				<p>Jan Vesely (1):</p>

				<ul>

				  <li>clover: Fix compilation after clang r315871</li>

				</ul>

				<p>Jason Ekstrand (4):</p>

				<ul>

				  <li>nir/intrinsics: Set the correct num_indices for load_output</li>

				  <li>intel/fs: Handle flag read/write aliasing in needs_src_copy</li>

				  <li>anv/pipeline: Call nir_lower_system_valaues after brw_preprocess_nir</li>

				  <li>intel/eu: Use EXECUTE_1 for JMPI</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i965: Revert absolute mode for constant buffer pointers.</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>Revert "mesa: fix texture updates for ATI_fragment_shader"</li>

				</ul>

				<p>Matthew Nicholls (1):</p>

				<ul>

				  <li>ac/nir: generate correct instruction for atomic min/max on unsigned images</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>st/mesa: Initialize textures array in st_framebuffer_validate</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: add the draw count buffer to the list of buffers</li>

				</ul>

				<p>Stefan Schake (1):</p>

				<ul>

				  <li>broadcom/vc4: Fix aliasing issue</li>

				</ul>

				</div>

				</body>

				</html>

									
										156

docs/relnotes/17.2.5.html
									
										Normal file
									
												
				@@ -0,0 +1,156 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.2.5 Release Notes / November 10, 2017</h1>

				<p>

				Mesa 17.2.5 is a bug fix release which fixes bugs found since the 17.2.4 release.

				</p>

				<p>

				Mesa 17.2.5 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				25b40e72fad64b096c2d8d6fe9579369954debe7970d4ad53e5033c7eec2918b  mesa-17.2.5.tar.gz

				7f7f914b7b9ea0b15f2d9d01a4375e311b0e90e55683b8e8a67ce8691eb1070f  mesa-17.2.5.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97532">Bug 97532</a> - Regression: GLB 2.7 &amp; Glmark-2 GLES versions segfault due to linker precision error (259fc505) on dead variable</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102680">Bug 102680</a> - [OpenGL CTS] KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102809">Bug 102809</a> - Rust shadows(?) flash random colours</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (8):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.2.4</li>

				  <li>cherry-ignore: radv: copy indirect lowering settings from radeonsi</li>

				  <li>cherry-ignore: i965: fix blorp stage_prog_data-&gt;param leak</li>

				  <li>cherry-ignore: etnaviv: don't do resolve-in-place without valid TS</li>

				  <li>cherry-ignore: intel/fs: Alloc pull constants off mem_ctx</li>

				  <li>cherry-ignore: added 17.3 nominations.</li>

				  <li>cherry-ignore: automake: include git_sha1.h.in in release tarball</li>

				  <li>Update version to 17.2.5</li>

				</ul>

				<p>Bas Nieuwenhuizen (3):</p>

				<ul>

				  <li>radv: Don't expose heaps with 0 memory.</li>

				  <li>radv: Don't use vgpr indexing for outputs on GFX9.</li>

				  <li>radv: Disallow indirect outputs for GS on GFX9 as well.</li>

				</ul>

				<p>Dave Airlie (3):</p>

				<ul>

				  <li>i915g: make gears run again.</li>

				  <li>radv: free attachments on end command buffer.</li>

				  <li>radv: add initial copy descriptor support. (v2)</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>vc4: fix release build</li>

				</ul>

				<p>Gert Wollny (1):</p>

				<ul>

				  <li>r600/sb: bail out if prepare_alu_group() doesn't find a proper scheduling</li>

				</ul>

				<p>Jason Ekstrand (4):</p>

				<ul>

				  <li>spirv: Claim support for the simple memory model</li>

				  <li>i965/blorp: Use blorp_to_isl_format for src_isl_format in blit_miptrees</li>

				  <li>i965/blorp: Use more temporary isl_format variables</li>

				  <li>i965/miptree: Take an isl_format in render_aux_usage</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>mesa: Accept GL_BACK in get_fb0_attachment with ARB_ES3_1_compatibility.</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>radeon/video: add gfx9 offsets when rejoin the video surface</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>st/dri: don't expose modifiers in EGL if the driver doesn't implement them</li>

				  <li>ac/surface/gfx9: don't allow DCC for the smallest mipmap levels</li>

				</ul>

				<p>Nanley Chery (1):</p>

				<ul>

				  <li>i965: Check CCS_E compatibility for texture view rendering</li>

				</ul>

				<p>Neil Roberts (1):</p>

				<ul>

				  <li>nir/opt_intrinsics: Fix values for gl_SubGroupG{e,t}MaskARB</li>

				</ul>

				<p>Nicolai Hähnle (1):</p>

				<ul>

				  <li>amd/common/gfx9: workaround DCC corruption more conservatively</li>

				</ul>

				<p>Tapani Pälli (1):</p>

				<ul>

				  <li>i965: unref push_const_bo in intelDestroyContext</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>radv: copy indirect lowering settings from radeonsi</li>

				</ul>

				<p>Tomasz Figa (1):</p>

				<ul>

				  <li>glsl: Allow precision mismatch on dead data with GLSL ES 1.00</li>

				</ul>

				<p>Topi Pohjolainen (1):</p>

				<ul>

				  <li>intel/compiler/gen9: Pixel shader header only workaround</li>

				</ul>

				</div>

				</body>

				</html>

									
										187

docs/relnotes/17.2.6.html
									
										Normal file
									
												
				@@ -0,0 +1,187 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.2.6 Release Notes / November 25, 2017</h1>

				<p>

				Mesa 17.2.6 is a bug fix release which fixes bugs found since the 17.2.5 release.

				</p>

				<p>

				Mesa 17.2.6 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				a9ed76702ffb14ad674ad48899f5c8c7e3a0f987911878a5dfdc4117dce5b415  mesa-17.2.6.tar.gz

				6ad85224620330be26ab68c8fc78381b12b38b610ade2db8716b38faaa8f30de  mesa-17.2.6.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100438">Bug 100438</a> - glsl/ir.cpp:1376: ir_dereference_variable::ir_dereference_variable(ir_variable*): Assertion `var != NULL' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102177">Bug 102177</a> - [SKL] ES31-CTS.core.sepshaderobjs.StateInteraction fails sporadically</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103115">Bug 103115</a> - [BSW BXT GLK] dEQP-VK.spirv_assembly.instruction.compute.sconvert.int32_to_int64</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103519">Bug 103519</a> - wayland egl apps crash on start with mesa 17.2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103529">Bug 103529</a> - [GM45] GPU hang with mpv fullscreen (bisected)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103628">Bug 103628</a> - [BXT, GLK, BSW] KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103787">Bug 103787</a> - [BDW,BSW] gpu hang on spec.arb_pipeline_statistics_query.arb_pipeline_statistics_query-comp</li>

				</ul>

				<h2>Changes</h2>

				<p>Adam Jackson (2):</p>

				<ul>

				  <li>glx/drisw: Fix glXMakeCurrent(dpy, None, ctx)</li>

				  <li>glx/dri3: Fix passing renderType into glXCreateContext</li>

				</ul>

				<p>Alex Smith (2):</p>

				<ul>

				  <li>spirv: Use correct type for sampled images</li>

				  <li>nir/spirv: tg4 requires a sampler</li>

				</ul>

				<p>Andres Gomez (14):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.2.5</li>

				  <li>cherry-ignore: intel/fs: Use a pure vertical stride for large register strides</li>

				  <li>cherry-ignore: intel/nir: Use the correct indirect lowering masks in link_shaders</li>

				  <li>cherry-ignore: intel/fs: Use the original destination region for int MUL lowering</li>

				  <li>cherry-ignore: intel/fs: refactors</li>

				  <li>cherry-ignore: r600/shader: reserve first register of vertex shader.</li>

				  <li>cherry-ignore: anv/cmd_buffer: Advance the address when initializing clear colors</li>

				  <li>cherry-ignore: anv/cmd_buffer: Take bo_offset into account in fast clear state addresses</li>

				  <li>cherry-ignore: i965: Mark BOs as external when we export their handle</li>

				  <li>cherry-ignore: added 17.3 nominations.</li>

				  <li>cherry-ignore: glsl: Fix typo fragement -&gt; fragment</li>

				  <li>cherry-ignore: egl: pass the dri2_dpy to the $plat_teardown functions</li>

				  <li>cherry-ignore: Revert "intel/fs: Use a pure vertical stride for large register strides"</li>

				  <li>Update version to 17.2.6</li>

				</ul>

				<p>Anuj Phogat (2):</p>

				<ul>

				  <li>i965: Program DWord Length in MI_FLUSH_DW</li>

				  <li>i965/gen8+: Fix the number of dwords programmed in MI_FLUSH_DW</li>

				</ul>

				<p>Bas Nieuwenhuizen (2):</p>

				<ul>

				  <li>radv: Free syncobj with multiple imports.</li>

				  <li>radv: Free temporary syncobj after waiting on it.</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>r600: fix isoline tess factor component swapping.</li>

				</ul>

				<p>Derek Foreman (1):</p>

				<ul>

				  <li>egl/wayland: Add a fallback when fourcc query isn't supported</li>

				</ul>

				<p>Dylan Baker (1):</p>

				<ul>

				  <li>autotools: Set C++ visibility flags on Intel</li>

				</ul>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>targets/opencl: don't hardcode the icd file install to /etc/...</li>

				  <li>configure.ac: loosen --enable-glvnd check to honour egl</li>

				  <li>configure.ac: require xcb* for the omx/va/... when using x11 platform</li>

				</ul>

				<p>George Barrett (1):</p>

				<ul>

				  <li>glsl: Catch subscripted calls to undeclared subroutines</li>

				</ul>

				<p>Jason Ekstrand (9):</p>

				<ul>

				  <li>intel/fs: Use ANY/ALL32 predicates in SIMD32</li>

				  <li>intel/fs: Use an explicit D type for vote any/all/eq intrinsics</li>

				  <li>intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all</li>

				  <li>intel/eu/reg: Add a subscript() helper</li>

				  <li>intel/fs: Fix MOV_INDIRECT for 64-bit values on little-core</li>

				  <li>intel/fs: Fix integer multiplication lowering for src/dst hazards</li>

				  <li>intel/fs: Mark 64-bit values as being contiguous</li>

				  <li>intel/fs: Rework zero-length URB write handling</li>

				  <li>i965: Add stencil buffers to cache set regardless of stencil texturing</li>

				</ul>

				<p>Kenneth Graunke (5):</p>

				<ul>

				  <li>i965: properly initialize brw-&gt;cs.base.stage to MESA_SHADER_COMPUTE</li>

				  <li>i965: Make L3 configuration atom listen for TCS/TES program updates.</li>

				  <li>intel/tools: Fix detection of enabled shader stages.</li>

				  <li>i965: Implement another VF cache invalidate workaround on Gen8+.</li>

				  <li>i965: Upload invariant state once at the start of the batch on Gen4-5.</li>

				</ul>

				<p>Matt Turner (2):</p>

				<ul>

				  <li>i965/fs: Fix extract_i8/u8 to a 64-bit destination</li>

				  <li>i965/fs: Split all 32-&gt;64-bit MOVs on CHV, BXT, GLK</li>

				</ul>

				<p>Neil Roberts (1):</p>

				<ul>

				  <li>glsl: Transform fb buffers are only active if a variable uses them</li>

				</ul>

				<p>Nicolai Hähnle (1):</p>

				<ul>

				  <li>ddebug: fix use-after-free of streamout targets</li>

				</ul>

				<p>Tim Rowley (2):</p>

				<ul>

				  <li>swr/rast: Use gather instruction for i32gather_ps on simd16/avx512</li>

				  <li>swr/rast: Faster emulated simd16 permute</li>

				</ul>

				<p>Timothy Arceri (3):</p>

				<ul>

				  <li>glsl: drop cache_fallback</li>

				  <li>glsl: use the correct parent when allocating program data members</li>

				  <li>mesa: rework how we free gl_shader_program_data</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/17.2.7.html
									
										Normal file
									
												
				@@ -0,0 +1,247 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.2.7 Release Notes / December 14, 2017</h1>

				<p>

				Mesa 17.2.7 is a bug fix release which fixes bugs found since the 17.2.6 release.

				</p>

				<p>

				Mesa 17.2.7 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				e8d837a1cd55014e636e9caf6c75cfbe1b3e4be9ab3fa125f5ef38398aa12e97  mesa-17.2.7.tar.gz

				50cfdea8df55045797b4d0409591c04c784d9551c4da09b8178874dbe5a37a68  mesa-17.2.7.tar.xz

				</pre>
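<p>
A downloaded tarball can be checked against the checksums listed above. The
following is a minimal sketch, not part of the Mesa release tooling: the helper
names <code>sha256_of_file</code> and <code>matches_published_checksum</code>
are hypothetical, and the file is assumed to be in the current directory.
</p>

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large tarballs are never held in memory whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected_hex):
    """Compare a file's digest to a published checksum (hypothetical helper).

    Surrounding whitespace and letter case in `expected_hex` are ignored.
    """
    return sha256_of_file(path) == expected_hex.strip().lower()
```

<p>
For example, <code>matches_published_checksum("mesa-17.2.7.tar.xz", "50cfdea8df55045797b4d0409591c04c784d9551c4da09b8178874dbe5a37a68")</code>
should return <code>True</code> only for an intact copy of that tarball.
</p>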

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94739">Bug 94739</a> - Mesa 11.1.2 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101378">Bug 101378</a> - interpolateAtSample check for input parameter is too strict</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102006">Bug 102006</a> - gstreamer vaapih264enc segfault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU HANG: ecode 9:0:0x86df7cf9, in csgo_linux64 [4947], reason: Hang on rcs, action: reset</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102552">Bug 102552</a> - Null dereference due to not checking return value of util_format_description</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102677">Bug 102677</a> - [OpenGL CTS] KHR-GL45.CommonBugs.CommonBug_PerVertexValidation fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103098">Bug 103098</a> - [OpenGL CTS] KHR-GL45.enhanced_layouts.varying_structure_locations fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103227">Bug 103227</a> - [G965 G45 ILK] ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103393">Bug 103393</a> - glDispatchComputeGroupSizeARB : gl_GlobalInvocationID.x != gl_WorkGroupID.x * gl_LocalGroupSizeARB.x + gl_LocalInvocationID.x</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103412">Bug 103412</a> - gallium/wgl: Another fix to context creation without prior SetPixelFormat()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103616">Bug 103616</a> - Increased difference from reference image in shaders</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - [SNB] ES3-CTS.functional.shaders.precision</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103732">Bug 103732</a> - [swr] often gets stuck in piglit's glx-multi-context-single-window test</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103909">Bug 103909</a> - anv_allocator.c:113:1: error: static declaration of ‘memfd_create’ follows non-static declaration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103966">Bug 103966</a> - Mesa 17.2.5 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104119">Bug 104119</a> - radv: OpBitFieldInsert produces 0 with a loop counter for Insert</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104143">Bug 104143</a> - r600/sb: clobbers gl_Position -&gt; gl_FragCoord</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Smith (1):</p>

				<ul>

				  <li>radv: Add LLVM version to the device name string</li>

				</ul>

				<p>Andres Gomez (2):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.2.6</li>

				  <li>docs: remove bug 103626 from fix list as per 17.2.6</li>

				</ul>

				<p>Ben Crocker (2):</p>

				<ul>

				  <li>docs/llvmpipe.html: Minor edits</li>

				  <li>docs/llvmpipe: document ppc64le as alternative architecture to x86.</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>r600/sb: handle jump after target to end of program. (v2)</li>

				</ul>

				<p>Denis Pauk (1):</p>

				<ul>

				  <li>gallium/{r600, radeonsi}: Fix segfault with color format (v2)</li>

				</ul>

				<p>Eduardo Lima Mitev (3):</p>

				<ul>

				  <li>glsl_parser_extra: Add utility to copy symbols between symbol tables</li>

				  <li>glsl: Use the utility function to copy symbols between symbol tables</li>

				  <li>glsl/linker: Check that re-declared, inter-shader built-in blocks match</li>

				</ul>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>gl_table.py: add extern C guard for the generated glapitable.h</li>

				  <li>cherry-ignore: radeonsi: allow DMABUF exports for local buffers</li>

				  <li>Update version to 17.2.7</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>broadcom/vc4: Fix handling of GFXH-515 workaround with a start vertex count.</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>compiler: use NDEBUG to guard asserts</li>

				</ul>

				<p>Fabian Bieler (2):</p>

				<ul>

				  <li>glsl: Match order of gl_LightSourceParameters elements.</li>

				  <li>glsl: Fix gl_NormalScale.</li>

				</ul>

				<p>Frank Richter (1):</p>

				<ul>

				  <li>gallium/wgl: fix default pixel format issue</li>

				</ul>

				<p>George Kyriazis (1):</p>

				<ul>

				  <li>swr: Handle resource across context changes</li>

				</ul>

				<p>Gert Wollny (2):</p>

				<ul>

				  <li>r600: Emit EOP for more CF instruction types</li>

				  <li>r600/sb: do not convert if-blocks that contain indirect array access</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>glsl: fix derived cs variables</li>

				</ul>

				<p>James Legg (1):</p>

				<ul>

				  <li>nir/opcodes: Fix constant-folding of bitfield_insert</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>i965: Disable regular fast-clears (CCS_D) on gen9+</li>

				</ul>

				<p>Juan A. Suarez Romero (1):</p>

				<ul>

				  <li>glsl: add varying resources for arrays of complex types</li>

				</ul>

				<p>Julien Isorce (1):</p>

				<ul>

				  <li>st/va: change frame_idx from array to hash table</li>

				</ul>

				<p>Kai Wasserbäch (1):</p>

				<ul>

				  <li>docs: Point to apt.llvm.org for development snapshot packages</li>

				</ul>

				<p>Kenneth Graunke (3):</p>

				<ul>

				  <li>meta: Initialize depth/clear values on declaration.</li>

				  <li>meta: Fix ClearTexture with GL_DEPTH_COMPONENT.</li>

				  <li>i965: Fix Smooth Point Enables.</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>radeonsi: fix layered DCC fast clear</li>

				  <li>radeonsi/gfx9: fix importing shared textures with DCC</li>

				  <li>radeonsi: flush the context after resource_copy_region for buffer exports</li>

				</ul>

				<p>Matt Turner (4):</p>

				<ul>

				  <li>i965/fs: Handle negating immediates on MADs when propagating saturates</li>

				  <li>util: Fix SHA1 implementation on big endian</li>

				  <li>util: Fix disk_cache index calculation on big endian</li>

				  <li>i965/fs: Unpack count argument to 64-bit shift ops on Atom</li>

				</ul>

				<p>Nicolai Hähnle (3):</p>

				<ul>

				  <li>radeonsi: fix the R600_RESOURCE_FLAG_UNMAPPABLE check</li>

				  <li>glsl: allow any l-value of an input variable as interpolant in interpolateAt*</li>

				  <li>glsl: fix interpolateAtXxx(some_vec[idx], ...) with dynamic idx</li>

				</ul>

				<p>Pierre Moreau (1):</p>

				<ul>

				  <li>nvc0/ir: Properly lower 64-bit shifts when the shift value is &gt;32</li>

				</ul>

				<p>Tapani Pälli (1):</p>

				<ul>

				  <li>mesa/gles: adjust internal format in glTexSubImage2D error checks</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>glsl: get correct member type when processing xfb ifc arrays</li>

				</ul>

				<p>Vadym Shovkoplias (2):</p>

				<ul>

				  <li>intel/blorp: Fix possible NULL pointer dereferencing</li>

				  <li>glx/dri3: Remove unused deviceName variable</li>

				</ul>

				<p>Vinson Lee (1):</p>

				<ul>

				  <li>anv: Check if memfd_create is already defined.</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/17.2.8.html
									
										Normal file
									
												
				@@ -0,0 +1,112 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.2.8 Release Notes / December 22, 2017</h1>

				<p>

				Mesa 17.2.8 is a bug fix release which fixes bugs found since the 17.2.7 release.

				</p>

				<p>

				Mesa 17.2.8 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				c715c3a3d6fe26a69c096f573ec416e038a548f0405e3befedd5136517527a84  mesa-17.2.8.tar.gz

				6e940345cceaadfd805d701ed2b956589fa77fe8c39991da30ed51ea6b9d095f  mesa-17.2.8.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102710">Bug 102710</a> - vkCmdBlitImage with arrayLayers &gt; 1 fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103007">Bug 103007</a> - [OpenGL CTS] [HSW] KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103544">Bug 103544</a> - Graphical glitches r600 in game this war of mine linux native</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103579">Bug 103579</a> - Vertex shader causes compiler to crash in SPIRV-to-NIR</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (6):</p>

				<ul>

				  <li>cherry-ignore: swr: Fix KNOB_MAX_WORKER_THREADS thread creation override.</li>

				  <li>cherry-ignore: added 17.3 nominations.</li>

				  <li>cherry-ignore: radv: port merge tess info from anv</li>

				  <li>cherry-ignore: main: Clear shader program data whenever ProgramBinary is called</li>

				  <li>cherry-ignore: r600: set DX10_CLAMP for compute shader too</li>

				  <li>Update version to 17.2.8</li>

				</ul>

				<p>Bas Nieuwenhuizen (2):</p>

				<ul>

				  <li>spirv: Fix loading an entire block at once.</li>

				  <li>radv: Fix multi-layer blits.</li>

				</ul>

				<p>Brian Paul (2):</p>

				<ul>

				  <li>xlib: call _mesa_warning() instead of fprintf()</li>

				  <li>gallium/aux: include nr_samples in util_resource_size() computation</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.2.7</li>

				</ul>

				<p>Iago Toral Quiroga (1):</p>

				<ul>

				  <li>i965/vec4: use a temp register to compute offsets for pull loads</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>radeon/vce: move destroy command before feedback command</li>

				</ul>

				<p>Matt Turner (2):</p>

				<ul>

				  <li>util: Assume little endian in the absence of platform-specific handling</li>

				  <li>util: Add a SHA1 unit test program</li>

				</ul>

				<p>Roland Scheidegger (2):</p>

				<ul>

				  <li>r600: use min_dx10/max_dx10 instead of min/max</li>

				  <li>r600: use DX10_CLAMP bit in shader setup</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/17.3.0.html
									
												
				@@ -14,7 +14,7 @@

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.0 Release Notes / TBD</h1>

				<h1>Mesa 17.3.0 Release Notes / December 8, 2017</h1>

				<p>

				Mesa 17.3.0 is a new development release.

				@@ -33,7 +33,8 @@ because compatibility contexts are not supported.

				<h2>SHA256 checksums</h2>

				<pre>

				TBD.

				0cb1ffe2b4637d80f08df3bdfeb300352dcffd8ff4f6711278639b084e3f07f9  mesa-17.3.0.tar.gz

				29a0a3a6c39990d491a1a58ed5c692e596b3bfc6c01d0b45e0b787116c50c6d9  mesa-17.3.0.tar.xz

				</pre>

				@@ -51,19 +52,194 @@ Note: some of the new features are only available with certain drivers.

				<li>GL_ARB_texture_filter_anisotropic on i965, nv50, nvc0, r600, radeonsi</li>

				<li>GL_EXT_memory_object on radeonsi</li>

				<li>GL_EXT_memory_object_fd on radeonsi</li>

				<li>EGL_ANDROID_native_fence_sync on radeonsi with a future kernel (possibly 4.15)</li>

				<li>EGL_IMG_context_priority on i965</li>

				</ul>

				<h2>Bug fixes</h2>

				<ul>

				TBD

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97532">Bug 97532</a> - Regression: GLB 2.7 &amp; Glmark-2 GLES versions segfault due to linker precision error (259fc505) on dead variable</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100438">Bug 100438</a> - glsl/ir.cpp:1376: ir_dereference_variable::ir_dereference_variable(ir_variable*): Assertion `var != NULL' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100613">Bug 100613</a> - Regression in Mesa 17 on s390x (zSystems)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101334">Bug 101334</a> - AMD SI cards: Some vulkan apps freeze the system</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101378">Bug 101378</a> - interpolateAtSample check for input parameter is too strict</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101655">Bug 101655</a> - Explicit sync support for android</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101691">Bug 101691</a> - gfx corruption on windowed 3d-apps running on dGPU</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101709">Bug 101709</a> - [llvmpipe] piglit gl-1.0-scissor-offscreen regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101766">Bug 101766</a> - Assertion `!&quot;invalid type&quot;' failed when constant expression involves literal of different type</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101832">Bug 101832</a> - [PATCH][regression][bisect] Xorg fails to start after f50aa21456d82c8cb6fbaa565835f1acc1720a5d</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101851">Bug 101851</a> - [regression] libEGL_common.a undefined reference to '__gxx_personality_v0'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101867">Bug 101867</a> - Launch options window renders black in Feral Games in current Mesa trunk</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101876">Bug 101876</a> - SIGSEGV when launching Steam</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101910">Bug 101910</a> - [BYT] ES31-CTS.functional.copy_image.non_compressed.viewclass_96_bits.rgb32f_rgb32f</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101925">Bug 101925</a> - playstore/webview crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101941">Bug 101941</a> - Getting different output depending on attribute declaration order</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101961">Bug 101961</a> - Serious Sam Fusion hangs system completely</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101981">Bug 101981</a> - Commit ddc32537d6db69198e88ef0dfe19770bf9daa536 breaks rendering in multiple applications</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101982">Bug 101982</a> - Weston crashes when running an OpenGL program on i965</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101983">Bug 101983</a> - [G33] ES2-CTS.functional.shaders.struct.uniform.sampler_nested* regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101989">Bug 101989</a> - ES3-CTS.functional.state_query.integers.viewport_getinteger regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102006">Bug 102006</a> - gstreamer vaapih264enc segfault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102014">Bug 102014</a> - Mesa git build broken by commit bc7f41e11d325280db12e7b9444501357bc13922</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102015">Bug 102015</a> - [Regression,bisected]: Segfaults with various programs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102024">Bug 102024</a> - FORMAT_FEATURE_SAMPLED_IMAGE_BIT not supported for D16_UNORM and D32_SFLOAT</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102038">Bug 102038</a> - assertion failure in update_framebuffer_size</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102050">Bug 102050</a> - commit b4f639d02a causes build breakage on Android 32bit builds</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102052">Bug 102052</a> - No package 'expat' found</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102062">Bug 102062</a> - Segfault at eglCreateContext in android-x86</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102125">Bug 102125</a> - [softpipe] piglit arb_texture_view-targets regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102148">Bug 102148</a> - Crash when running qopenglwidget example on mesa llvmpipe win32</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102177">Bug 102177</a> - [SKL] ES31-CTS.core.sepshaderobjs.StateInteraction fails sporadically</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102201">Bug 102201</a> - [regression, SI] GPU crash in Unigine Valley</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102241">Bug 102241</a> - gallium/wgl: SwapBuffers freezing regularly with swap interval enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102274">Bug 102274</a> - assertion failure in ir_validate.cpp:240</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102308">Bug 102308</a> - segfault in glCompressedTextureSubImage3D</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102358">Bug 102358</a> - WarThunder freezes at start, with activated vsync (vblank_mode=2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102377">Bug 102377</a> - PIPE_*_4BYTE_ALIGNED_ONLY caps crashing</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102429">Bug 102429</a> - [regression, SI] Performance decrease in Unigine Valley &amp; Heaven</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU HANG: ecode 9:0:0x86df7cf9, in csgo_linux64 [4947], reason: Hang on rcs, action: reset</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102454">Bug 102454</a> - glibc 2.26 doesn't provide anymore xlocale.h</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102461">Bug 102461</a> - [llvmpipe] piglit glean fragprog1 XPD test 1 regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102467">Bug 102467</a> - src/mesa/state_tracker/st_cb_readpixels.c:178]: (warning) Redundant assignment</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102496">Bug 102496</a> - Frontbuffer rendering corruption on mesa master</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102502">Bug 102502</a> - [bisected] Kodi crashes since commit 707d2e8b - gallium: fold u_trim_pipe_prim call from st/mesa to drivers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102530">Bug 102530</a> - [bisected] Kodi crashes when launching a stream - commit bd2662bf</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102552">Bug 102552</a> - Null dereference due to not checking return value of util_format_description</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102565">Bug 102565</a> - u_debug_stack.c:114: undefined reference to `_Ux86_64_getcontext'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102573">Bug 102573</a> - fails to build on armel</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102665">Bug 102665</a> - test_glsl_to_tgsi_lifetime.cpp:53:67: error: ‘&gt;&gt;’ should be ‘&gt; &gt;’ within a nested template argument list</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102677">Bug 102677</a> - [OpenGL CTS] KHR-GL45.CommonBugs.CommonBug_PerVertexValidation fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102680">Bug 102680</a> - [OpenGL CTS] KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102685">Bug 102685</a> - piglit.spec.glsl-1_50.compiler.vs-redeclares-pervertex-out-before-global-redeclaration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102774">Bug 102774</a> - [BDW] [Bisected] Absolute constant buffers break VAAPI in mpv</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102809">Bug 102809</a> - Rust shadows(?) flash random colours</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102844">Bug 102844</a> - memory leak with glDeleteProgram for shader program type GL_COMPUTE_SHADER</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102847">Bug 102847</a> - swr fail to build with llvm-5.0.0</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102852">Bug 102852</a> - Scons: Support the new Scons 3.0.0</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102904">Bug 102904</a> - piglit and gl45 cts linker tests regressed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102924">Bug 102924</a> - mesa (git version) images too dark</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102940">Bug 102940</a> - Regression: Vulkan KMS rendering crashes since 17.2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102955">Bug 102955</a> - HyperZ related rendering issue in ARK: Survival Evolved</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102999">Bug 102999</a> - [BISECTED,REGRESSION] Failing Android EGL dEQP with RGBA configs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103002">Bug 103002</a> - string_buffer_test.cpp:43: error: ISO C++ forbids initialization of member ‘str1’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103085">Bug 103085</a> - [ivb byt hsw] piglit.spec.arb_indirect_parameters.tf-count-arrays</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103098">Bug 103098</a> - [OpenGL CTS] KHR-GL45.enhanced_layouts.varying_structure_locations fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103101">Bug 103101</a> - [SKL][bisected] DiRT Rally GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103115">Bug 103115</a> - [BSW BXT GLK] dEQP-VK.spirv_assembly.instruction.compute.sconvert.int32_to_int64</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103128">Bug 103128</a> - [softpipe] piglit fs-ldexp regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103214">Bug 103214</a> - GLES CTS functional.state_query.indexed.atomic_counter regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103227">Bug 103227</a> - [G965 G45 ILK] ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103247">Bug 103247</a> - Performance regression: car chase, manhattan</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103253">Bug 103253</a> - blob.h:138:1: error: unknown type name 'ssize_t'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103265">Bug 103265</a> - [llvmpipe] piglit depth-tex-compare regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103323">Bug 103323</a> - Possible unintended error message in file pixel.c line 286</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103388">Bug 103388</a> - Linking libcltgsi.la (llvm/codegen/libclllvm_la-common.lo) fails with &quot;error: no match for 'operator-'&quot; with GCC-7, Mesa from Git and current LLVM revisions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103393">Bug 103393</a> - glDispatchComputeGroupSizeARB : gl_GlobalInvocationID.x != gl_WorkGroupID.x * gl_LocalGroupSizeARB.x + gl_LocalInvocationID.x</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103412">Bug 103412</a> - gallium/wgl: Another fix to context creation without prior SetPixelFormat()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103519">Bug 103519</a> - wayland egl apps crash on start with mesa 17.2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103529">Bug 103529</a> - [GM45] GPU hang with mpv fullscreen (bisected)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103537">Bug 103537</a> - i965: Shadow of Mordor broken since commit 379b24a40d3d34ffdaaeb1b328f50e28ecb01468 on Haswell</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103544">Bug 103544</a> - Graphical glitches r600 in game this war of mine linux native</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103616">Bug 103616</a> - Increased difference from reference image in shaders</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103628">Bug 103628</a> - [BXT, GLK, BSW] KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103759">Bug 103759</a> - plasma desktop corrupted rendering</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103787">Bug 103787</a> - [BDW,BSW] gpu hang on spec.arb_pipeline_statistics_query.arb_pipeline_statistics_query-comp</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103909">Bug 103909</a> - anv_allocator.c:113:1: error: static declaration of ‘memfd_create’ follows non-static declaration</li>

				</ul>

				<h2>Changes</h2>

				<ul>

				TBD

				</ul>

				</div>

				</body>

									
										191

docs/relnotes/17.3.1.html
									
										Normal file
									
												
				@@ -0,0 +1,191 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.1 Release Notes / December 21, 2017</h1>

				<p>

				Mesa 17.3.1 is a bug fix release which fixes bugs found since the 17.3.0 release.

				</p>

				<p>

				Mesa 17.3.1 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				b0bb0419dbe3043ed4682a28eaf95721f427ca3f23a3c2a7dc77dbe8a3b6384d  mesa-17.3.1.tar.gz

				9ae607e0998a586fb2c866cfc8e45e6f52d1c56cb1b41288253ea83eada824c1  mesa-17.3.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94739">Bug 94739</a> - Mesa 11.1.2 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102710">Bug 102710</a> - vkCmdBlitImage with arrayLayers &gt; 1 fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103579">Bug 103579</a> - Vertex shader causes compiler to crash in SPIRV-to-NIR</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103966">Bug 103966</a> - Mesa 17.2.5 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104119">Bug 104119</a> - radv: OpBitFieldInsert produces 0 with a loop counter for Insert</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104143">Bug 104143</a> - r600/sb: clobbers gl_Position -&gt; gl_FragCoord</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Smith (1):</p>

				<ul>

				  <li>radv: Add LLVM version to the device name string</li>

				</ul>

				<p>Bas Nieuwenhuizen (3):</p>

				<ul>

				  <li>spirv: Fix loading an entire block at once.</li>

				  <li>radv: Don't advertise VK_EXT_debug_report.</li>

				  <li>radv: Fix multi-layer blits.</li>

				</ul>

				<p>Ben Crocker (1):</p>

				<ul>

				  <li>docs/llvmpipe: document ppc64le as alternative architecture to x86.</li>

				</ul>

				<p>Brian Paul (2):</p>

				<ul>

				  <li>xlib: call _mesa_warning() instead of fprintf()</li>

				  <li>gallium/aux: include nr_samples in util_resource_size() computation</li>

				</ul>

				<p>Bruce Cherniak (1):</p>

				<ul>

				  <li>swr: Fix KNOB_MAX_WORKER_THREADS thread creation override.</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>radv: port merge tess info from anv</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.3.0</li>

				  <li>util: scons: wire up the sha1 test</li>

				  <li>cherry-ignore: meson: fix strtof locale support check</li>

				  <li>cherry-ignore: util: add mesa-sha1 test to meson</li>

				  <li>Update version to 17.3.1</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>broadcom/vc4: Fix handling of GFXH-515 workaround with a start vertex count.</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>compiler: use NDEBUG to guard asserts</li>

				</ul>

				<p>Fabian Bieler (2):</p>

				<ul>

				  <li>glsl: Match order of gl_LightSourceParameters elements.</li>

				  <li>glsl: Fix gl_NormalScale.</li>

				</ul>

				<p>Gert Wollny (1):</p>

				<ul>

				  <li>r600/sb: do not convert if-blocks that contain indirect array access</li>

				</ul>

				<p>James Legg (1):</p>

				<ul>

				  <li>nir/opcodes: Fix constant-folding of bitfield_insert</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>i965: Switch over to fully external-or-not MOCS scheme</li>

				</ul>

				<p>Juan A. Suarez Romero (1):</p>

				<ul>

				  <li>travis: disable Meson build</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>meta: Initialize depth/clear values on declaration.</li>

				  <li>meta: Fix ClearTexture with GL_DEPTH_COMPONENT.</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>radeon/vce: move destroy command before feedback command</li>

				</ul>

				<p>Marek Olšák (4):</p>

				<ul>

				  <li>radeonsi: flush the context after resource_copy_region for buffer exports</li>

				  <li>radeonsi: allow DMABUF exports for local buffers</li>

				  <li>winsys/amdgpu: disable local BOs again due to worse performance</li>

				  <li>radeonsi: don't call force_dcc_off for buffers</li>

				</ul>

				<p>Matt Turner (2):</p>

				<ul>

				  <li>util: Assume little endian in the absence of platform-specific handling</li>

				  <li>util: Add a SHA1 unit test program</li>

				</ul>

				<p>Nicolai Hähnle (1):</p>

				<ul>

				  <li>radeonsi: fix the R600_RESOURCE_FLAG_UNMAPPABLE check</li>

				</ul>

				<p>Pierre Moreau (1):</p>

				<ul>

				  <li>nvc0/ir: Properly lower 64-bit shifts when the shift value is &gt;32</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>glsl: get correct member type when processing xfb ifc arrays</li>

				</ul>

				<p>Vadym Shovkoplias (2):</p>

				<ul>

				  <li>glx/dri3: Remove unused deviceName variable</li>

				  <li>util/disk_cache: Remove unneeded free() on always null string</li>

				</ul>

				</div>

				</body>

				</html>

									
										109

docs/relnotes/17.3.2.html
									
										Normal file
									
												
				@@ -0,0 +1,109 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.2 Release Notes / January 8, 2018</h1>

				<p>

				Mesa 17.3.2 is a bug fix release which fixes bugs found since the 17.3.1 release.

				</p>

				<p>

				Mesa 17.3.2 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				f997e80f14c385f9a2ba827c2b74aebf1b7426712ca4a81c631ef9f78e437bf4  mesa-17.3.2.tar.gz

				e2844a13f2d6f8f24bee65804a51c42d8dc6ae9c36cff7ee61d0940e796d64c6  mesa-17.3.2.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97852">Bug 97852</a> - Unreal Engine corrupted preview viewport</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103801">Bug 103801</a> - [i965] &gt;Observer_ issue</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104288">Bug 104288</a> - Steamroll needs allow_glsl_cross_stage_interpolation_mismatch=true</li>

				</ul>

				<h2>Changes</h2>

				<p>Bas Nieuwenhuizen (1):</p>

				<ul>

				  <li>radv: Fix DCC compatible formats.</li>

				</ul>

				<p>Brendan King (1):</p>

				<ul>

				  <li>egl: link libEGL against the dynamic version of libglapi</li>

				</ul>

				<p>Dave Airlie (6):</p>

				<ul>

				  <li>radv/gfx9: add support for 3d images to blit 2d paths</li>

				  <li>radv: handle depth/stencil image copy with layouts better. (v3.1)</li>

				  <li>radv/meta: fix blit paths for depth/stencil (v2.1)</li>

				  <li>radv: fix issue with multisample positions and interp_var_at_sample.</li>

				  <li>radv/gfx9: add 3d sampler image-&gt;buffer copy shader. (v3)</li>

				  <li>radv: don't do format replacement on tc compat htile surfaces.</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.3.1</li>

				  <li>Update version to 17.3.2</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>egl: let each platform decided how to handle LIBGL_ALWAYS_SOFTWARE</li>

				</ul>

				<p>Rob Herring (1):</p>

				<ul>

				  <li>egl/android: Fix build break with dri2_initialize_android _EGLDisplay parameter</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>radv/gfx9: fix primitive topology when adjacency is used</li>

				  <li>radv: use a faster version for nir_op_pack_half_2x16</li>

				</ul>

				<p>Tapani Pälli (2):</p>

				<ul>

				  <li>mesa: add AllowGLSLCrossStageInterpolationMismatch workaround</li>

				  <li>drirc: set allow_glsl_cross_stage_interpolation_mismatch for more games</li>

				</ul>

				</div>

				</body>

				</html>

									
										151

docs/relnotes/17.3.3.html
									
										Normal file
									
												
				@@ -0,0 +1,151 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.3 Release Notes / January 18, 2018</h1>

				<p>

				Mesa 17.3.3 is a bug fix release which fixes bugs found since the 17.3.2 release.

				</p>

				<p>

				Mesa 17.3.3 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				c733d37a161501cd81dc9b309ccb613753b98eafc6d35e0847548a6642749772  mesa-17.3.3.tar.gz

				41bac5de0ef6adc1f41a1ec0f80c19e361298ce02fa81b5f9ba4fdca33a9379b  mesa-17.3.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104214">Bug 104214</a> - Dota crashes when switching from game to desktop</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104492">Bug 104492</a> - Compute Shader: Wrong alignment when assigning struct value to structured SSBO</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104551">Bug 104551</a> - Check if Mako templates for Python are installed</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Smith (3):</p>

				<ul>

				  <li>anv: Add missing unlock in anv_scratch_pool_alloc</li>

				  <li>anv: Take write mask into account in has_color_buffer_write_enabled</li>

				  <li>anv: Make sure state on primary is correct after CmdExecuteCommands</li>

				</ul>

				<p>Andres Gomez (1):</p>

				<ul>

				  <li>anv: Import mako templates only during execution of anv_extensions</li>

				</ul>

				<p>Bas Nieuwenhuizen (11):</p>

				<ul>

				  <li>radv: Invert condition for all samples identical during resolve.</li>

				  <li>radv: Flush caches before subpass resolve.</li>

				  <li>radv: Fix fragment resolve destination offset.</li>

				  <li>radv: Use correct framebuffer size for partial FS resolves.</li>

				  <li>radv: Always use fragment resolve if dest uses DCC.</li>

				  <li>Revert "radv/gfx9: fix block compression texture views."</li>

				  <li>radv: Use correct HTILE expanded words.</li>

				  <li>radv: Allow writing 0 scissors.</li>

				  <li>ac/nir: Handle loading data from compact arrays.</li>

				  <li>radv: Invalidate L1 for VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT.</li>

				  <li>ac/nir: Sanitize location_frac for local variables.</li>

				</ul>

				<p>Dave Airlie (8):</p>

				<ul>

				  <li>radv: fix events on compute queues.</li>

				  <li>radv: fix pipeline statistics end query on compute queue</li>

				  <li>radv/gfx9: fix 3d image to image transfers on compute queues.</li>

				  <li>radv/gfx9: fix 3d image clears on compute queues</li>

				  <li>radv/gfx9: fix buffer to image for 3d images on compute queues</li>

				  <li>radv/gfx9: fix block compression texture views.</li>

				  <li>radv/gfx9: use a bigger hammer to flush cb/db caches.</li>

				  <li>radv/gfx9: use correct swizzle parameter to work out border swizzle.</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.3.2</li>

				</ul>

				<p>Florian Will (1):</p>

				<ul>

				  <li>glsl: Respect std430 layout in lower_buffer_access</li>

				</ul>

				<p>Juan A. Suarez Romero (6):</p>

				<ul>

				  <li>cherry-ignore: intel/fs: Use the original destination region for int MUL lowering</li>

				  <li>cherry-ignore: i965/fs: Use UW types when using V immediates</li>

				  <li>cherry-ignore: main: Clear shader program data whenever ProgramBinary is called</li>

				  <li>cherry-ignore: egl: pass the dri2_dpy to the $plat_teardown functions</li>

				  <li>cherry-ignore: vulkan/wsi: free cmd pools</li>

				  <li>Update version to 17.3.3</li>

				</ul>

				<p>Józef Kucia (1):</p>

				<ul>

				  <li>radeonsi: fix alpha-to-coverage if color writes are disabled</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>i965: Require space for MI_BATCHBUFFER_END.</li>

				  <li>i965: Torch public intel_batchbuffer_emit_dword/float helpers.</li>

				</ul>

				<p>Lucas Stach (1):</p>

				<ul>

				  <li>etnaviv: disable in-place resolve for non-supertiled surfaces</li>

				</ul>

				<p>Samuel Iglesias Gonsálvez (1):</p>

				<ul>

				  <li>anv: VkDescriptorSetLayoutBinding can have descriptorCount == 0</li>

				</ul>

				<p>Thomas Hellstrom (1):</p>

				<ul>

				  <li>loader/dri3: Avoid freeing renderbuffers in use</li>

				</ul>

				<p>Tim Rowley (1):</p>

				<ul>

				  <li>swr/rast: fix invalid sign masks in avx512 simdlib code</li>

				</ul>

				</div>

				</body>

				</html>

									
										275

docs/relnotes/17.3.4.html
									
										Normal file
									
												
				@@ -0,0 +1,275 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.4 Release Notes / February 15, 2018</h1>

				<p>

				Mesa 17.3.4 is a bug fix release which fixes bugs found since the 17.3.3 release.

				</p>

				<p>

				Mesa 17.3.4 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				2d3a4c3cbc995b3e192361dce710d8c749e046e7575aa1b7d8fc9e6b4df28f84  mesa-17.3.4.tar.gz

				71f995e233bc5df1a0dd46c980d1720106e7f82f02d61c1ca50854b5e02590d0  mesa-17.3.4.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU HANG: ecode 9:0:0x86df7cf9, in csgo_linux64 [4947], reason: Hang on rcs, action: reset</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - [SNB] ES3-CTS.functional.shaders.precision</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104163">Bug 104163</a> - [GEN9+] 2-3% perf drop in GfxBench Manhattan 3.1 from &quot;i965: Disable regular fast-clears (CCS_D) on gen9+&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104487">Bug 104487</a> - [KBL] portal2_linux GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104711">Bug 104711</a> - [skl CCS] Oxenfree (unity engine game) hangs GPU</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104745">Bug 104745</a> - HEVC VDPAU decoding broken on RX 460 with UVD Firmware v1.130</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104818">Bug 104818</a> - mesa fails to build on ia64</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (1):</p>

				<ul>

				  <li>i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen&lt;8</li>

				</ul>

				<p>Bas Nieuwenhuizen (10):</p>

				<ul>

				  <li>radv: Fix ordering issue in meta memory allocation failure path.</li>

				  <li>radv: Fix memory allocation failure path in compute resolve init.</li>

				  <li>radv: Fix freeing meta state if the device pipeline cache fails to allocate.</li>

				  <li>radv: Fix fragment resolve init memory allocation failure paths.</li>

				  <li>radv: Fix bufimage failure deallocation.</li>

				  <li>radv: Init variant entry with memset.</li>

				  <li>radv: Don't allow 3d or 1d depth/stencil textures.</li>

				  <li>ac/nir: Use instance_rate_inputs per attribute, not per variable.</li>

				  <li>ac/nir: Use correct 32-bit component writemask for 64-bit SSBO stores.</li>

				  <li>ac/nir: Fix vector extraction if source vector has &gt;4 elements.</li>

				</ul>

				<p>Boyuan Zhang (2):</p>

				<ul>

				  <li>radeon/vcn: add and manage render picture list</li>

				  <li>radeon/uvd: add and manage render picture list</li>

				</ul>

				<p>Chuck Atkins (1):</p>

				<ul>

				  <li>configure.ac: add missing llvm dependencies to .pc files</li>

				</ul>

				<p>Dave Airlie (10):</p>

				<ul>

				  <li>r600/sb: fix a bug emitting ar load from a constant.</li>

				  <li>ac/nir: account for view index in the user sgpr allocation.</li>

				  <li>radv: add fs_key meta format support to resolve passes.</li>

				  <li>radv: don't use hw resolve for integer image formats</li>

				  <li>radv: don't use hw resolves for r16g16 norm formats.</li>

				  <li>radv: move spi_baryc_cntl to pipeline</li>

				  <li>r600/sb: insert the else clause when we might depart from a loop</li>

				  <li>radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)</li>

				  <li>radv/gfx9: fix block compression texture views. (v2)</li>

				  <li>virgl: also remove dimension on indirect.</li>

				</ul>

				<p>Eleni Maria Stea (1):</p>

				<ul>

				  <li>mesa: Fix function pointers initialization in status tracker</li>

				</ul>

				<p>Emil Velikov (18):</p>

				<ul>

				  <li>cherry-ignore: i965: Accept CONTEXT_ATTRIB_PRIORITY for brwCreateContext</li>

				  <li>cherry-ignore: swr: refactor swr_create_screen to allow for proper cleanup on error</li>

				  <li>cherry-ignore: anv: add explicit 18.0 only nominations</li>

				  <li>cherry-ignore: radv: fix sample_mask_in loading. (v3.1)</li>

				  <li>cherry-ignore: meson: multiple fixes</li>

				  <li>cherry-ignore: swr/rast: support llvm 3.9 type declarations</li>

				  <li>Revert "cherry-ignore: intel/fs: Use the original destination region for int MUL lowering"</li>

				  <li>cherry-ignore: ac/nir: set amdgpu.uniform and invariant.load for UBOs</li>

				  <li>cherry-ignore: add gen10 fixes</li>

				  <li>cherry-ignore: add r600/amdgpu 18.0 nominations</li>

				  <li>cherry-ignore: add i965 shader cache fixes</li>

				  <li>cherry-ignore: nir: mark unused space in packed_tex_data</li>

				  <li>radv: Stop advertising VK_KHX_multiview</li>

				  <li>cherry-ignore: radv: Don't expose VK_KHX_multiview on android.</li>

				  <li>configure.ac: correct driglx-direct help text</li>

				  <li>cherry-ignore: add meson fix</li>

				  <li>cherry-ignore: add a few more meson fixes</li>

				  <li>Update version to 17.3.4</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>radeon: remove left over dead code</li>

				</ul>

				<p>Gert Wollny (1):</p>

				<ul>

				  <li>r600/shader: Initialize max_driver_temp_used correctly for the first time</li>

				</ul>

				<p>Grazvydas Ignotas (2):</p>

				<ul>

				  <li>st/va: release held locks in error paths</li>

				  <li>st/vdpau: release held lock in error path</li>

				</ul>

				<p>Igor Gnatenko (1):</p>

				<ul>

				  <li>link mesautil with pthreads</li>

				</ul>

				<p>Indrajit Das (4):</p>

				<ul>

				  <li>st/omx_bellagio: Update default intra matrix per MPEG2 spec</li>

				  <li>radeon/uvd: update quantiser matrices only when requested</li>

				  <li>radeon/vcn: update quantiser matrices only when requested</li>

				  <li>st/va: clear pointers for mpeg2 quantiser matrices</li>

				</ul>

				<p>Jason Ekstrand (19):</p>

				<ul>

				  <li>i965: Call brw_cache_flush_for_render in predraw_resolve_framebuffer</li>

				  <li>i965: Add more precise cache tracking helpers</li>

				  <li>i965/blorp: Add more destination flushing</li>

				  <li>i965: Track the depth and render caches separately</li>

				  <li>i965: Track format and aux usage in the render cache</li>

				  <li>Re-enable regular fast-clears (CCS_D) on gen9+</li>

				  <li>i965/miptree: Refactor CCS_E and CCS_D cases in render_aux_usage</li>

				  <li>i965/miptree: Add an explicit tiling parameter to create_for_bo</li>

				  <li>i965/miptree: Use the tiling from the modifier instead of the BO</li>

				  <li>i965/bufmgr: Add a create_from_prime_tiled function</li>

				  <li>i965: Set tiling on BOs imported with modifiers</li>

				  <li>i965/miptree: Take an aux_usage in prepare/finish_render</li>

				  <li>i965/miptree: Add an aux_disabled parameter to render_aux_usage</li>

				  <li>i965/surface_state: Drop brw_aux_surface_disabled</li>

				  <li>intel/fs: Use the original destination region for int MUL lowering</li>

				  <li>anv/pipeline: Don't look at blend state unless we have an attachment</li>

				  <li>anv/cmd_buffer: Re-emit the pipeline at every subpass</li>

				  <li>anv: Stop advertising VK_KHX_multiview</li>

				  <li>i965: Call prepare_external after implicit window-system MSAA resolves</li>

				</ul>

				<p>Jon Turney (3):</p>

				<ul>

				  <li>configure: Default to gbm=no on osx</li>

				  <li>glx/apple: include util/debug.h for env_var_as_boolean prototype</li>

				  <li>glx/apple: locate dispatch table functions to wrap by name</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>svga: Prevent use after free.</li>

				</ul>

				<p>Juan A. Suarez Romero (1):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.3.3</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>i965: Bind null render targets for shadow sampling + color.</li>

				  <li>i965: Bump official kernel requirement to Linux v3.9.</li>

				</ul>

				<p>Lucas Stach (2):</p>

				<ul>

				  <li>etnaviv: dirty TS state when framebuffer has changed</li>

				  <li>renderonly: fix dumb BO allocation for non 32bpp formats</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>radeonsi: don't ignore pitch for imported textures</li>

				</ul>

				<p>Matthew Nicholls (2):</p>

				<ul>

				  <li>radv: restore previous stencil reference after depth-stencil clear</li>

				  <li>radv: remove predication on cache flushes</li>

				</ul>

				<p>Maxin B. John (1):</p>

				<ul>

				  <li>anv_icd.py: improve reproducible builds</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>winsys/radeon: Compute is_displayable in surf_drm_to_winsys</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>r600: don't do stack workarounds for hemlock</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: create pipeline layout objects for all meta operations</li>

				</ul>

				<p>Samuel Thibault (1):</p>

				<ul>

				  <li>glx: fix non-dri build</li>

				</ul>

				<p>Timothy Arceri (2):</p>

				<ul>

				  <li>ac: fix buffer overflow bug in 64bit SSBO loads</li>

				  <li>ac: fix visit_ssa_undef() for doubles</li>

				</ul>

				</div>

				</body>

				</html>

									
										66

docs/relnotes/17.3.5.html
									
										Normal file
									
												
				@@ -0,0 +1,66 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.5 Release Notes / February 19, 2018</h1>

				<p>

				Mesa 17.3.5 is a bug fix release which fixes bugs found since the 17.3.4 release.

				</p>

				<p>

				Mesa 17.3.5 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				bc1ee20366aae2affc37c89228f871f438136f70252005e9f842169bde976788  mesa-17.3.5.tar.gz

				eb9228fc8aaa71e0205c1481c5b157752ebaec9b646b030d27478e25a6d7936a  mesa-17.3.5.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				</ul>

				<h2>Changes</h2>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.3.4</li>

				  <li>Update version to 17.3.5</li>

				</ul>

				<p>James Legg (1):</p>

				<ul>

				  <li>ac/nir: Fix conflict resolution typo in handle_vs_input_decl</li>

				</ul>

				</div>

				</body>

				</html>

									
										85

docs/relnotes/17.3.6.html
									
										Normal file
									
												
				@@ -0,0 +1,85 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.6 Release Notes / February 27, 2018</h1>

				<p>

				Mesa 17.3.6 is a bug fix release which fixes bugs found since the 17.3.5 release.

				</p>

				<p>

				Mesa 17.3.6 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				d5e10ea3f0d11b06d2b0b235bba372a04278c39bc0e712090bda1f61842db188  mesa-17.3.6.tar.gz

				e5915680d44ac9d05defdec529db7459ac9edd441c9845266eff2e2d3e57fbf8  mesa-17.3.6.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104546">Bug 104546</a> - Crash happens when running compute pipeline after calling glxMakeCurrent two times</li>

				</ul>

				<h2>Changes</h2>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.3.5</li>

				  <li>Update version to 17.3.6</li>

				</ul>

				<p>Jason Ekstrand (4):</p>

				<ul>

				  <li>i965/draw: Do resolves properly for textures used by TXF</li>

				  <li>i965: Replace draw_aux_buffer_disabled with draw_aux_usage</li>

				  <li>i965/draw: Set NEW_AUX_STATE when draw aux changes</li>

				  <li>i965: Stop disabling aux during texture preparation</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i965: Don't disable CCS for RT dependencies when dispatching compute.</li>

				</ul>

				<p>Topi Pohjolainen (1):</p>

				<ul>

				  <li>i965: Don't try to disable render aux buffers for compute</li>

				</ul>

				</div>

				</body>

				</html>

									
										312

docs/relnotes/17.3.7.html
									
										Normal file
									
				@@ -0,0 +1,312 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.7 Release Notes / March 21, 2018</h1>

				<p>

				Mesa 17.3.7 is a bug fix release which fixes bugs found since the 17.3.6 release.

				</p>

				<p>

				Mesa 17.3.7 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				f08de6d0ccb3dbca04b44790d85c3ff9e7b1cc4189d1b7c7167e5ba7d98736c0  mesa-17.3.7.tar.gz

				0595904a8fba65a8fe853a84ad3c940205503b94af41e8ceed245fada777ac1e  mesa-17.3.7.tar.xz

				</pre>
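				<p>
				A downloaded tarball can be checked against the checksums listed above.
				The following is a minimal sketch, not part of Mesa's tooling: the helper
				names <code>sha256_of_file</code> and <code>matches_checksum</code> are
				hypothetical, and the file path is whatever location the archive was saved to.
				</p>

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with a published checksum (hypothetical helper)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

				<p>
				For example, <code>matches_checksum("mesa-17.3.7.tar.xz", "0595904a8fba65a8fe853a84ad3c940205503b94af41e8ceed245fada777ac1e")</code>
				should return <code>True</code> for an intact copy of the archive.
				</p>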

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103007">Bug 103007</a> - [OpenGL CTS] [HSW] KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103988">Bug 103988</a> - Intermittent piglit failures with shader cache enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104381">Bug 104381</a> - swr fails to build since llvm-svn r321257</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert ‘_mm512_mask_blend_epi32((__mmask16)(ImmT), a, b)’ from ‘__m512i’ {aka ‘__vector(8) long long int’} to ‘SIMDImpl::SIMD512Impl::Float’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Smith (1):</p>

				<ul>

				  <li>radv: Fix CmdCopyImage between uncompressed and compressed images</li>

				</ul>

				<p>Andriy Khulap (1):</p>

				<ul>

				  <li>i965: Fix RELOC_WRITE typo in brw_store_data_imm64()</li>

				</ul>

				<p>Anuj Phogat (1):</p>

				<ul>

				  <li>isl: Don't use surface format R32_FLOAT for typed atomic integer operations</li>

				</ul>

				<p>Bas Nieuwenhuizen (6):</p>

				<ul>

				  <li>radv: Always lower indirect derefs after nir_lower_global_vars_to_local.</li>

				  <li>radeonsi: Export signalled sync file instead of -1.</li>

				  <li>radv: Implement WaitForFences with !waitAll.</li>

				  <li>radv: Implement waiting on non-submitted fences.</li>

				  <li>radv: Fix copying from 3D images starting at non-zero depth.</li>

				  <li>radv: Increase the number of dynamic uniform buffers.</li>

				</ul>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>mesa: add missing switch case for EXTRA_VERSION_40 in check_extra()</li>

				</ul>

				<p>Chuck Atkins (1):</p>

				<ul>

				  <li>glx: Properly handle cases where screen creation fails</li>

				</ul>

				<p>Daniel Stone (3):</p>

				<ul>

				  <li>i965: Fix bugs in intel_from_planar</li>

				  <li>egl/wayland: Fix ARGB/XRGB transposition in config map</li>

				  <li>egl/wayland: Always use in-tree wayland-egl-backend.h</li>

				</ul>

				<p>Dave Airlie (9):</p>

				<ul>

				  <li>r600: fix cubemap arrays</li>

				  <li>r600/sb/cayman: fix indirect ubo access on cayman</li>

				  <li>r600: fix xfb stream check.</li>

				  <li>ac/nir: to integer the args to bcsel.</li>

				  <li>r600/cayman: fix fragcood loading recip generation.</li>

				  <li>radv: don't support tc-compat on multisample d32s8 at all.</li>

				  <li>virgl: remap query types to hw support.</li>

				  <li>ac/nir: don't apply slice rounding on txf_ms</li>

				  <li>r600: implement callstack workaround for evergreen.</li>

				</ul>

				<p>Dylan Baker (2):</p>

				<ul>

				  <li>glapi/check_table: Remove 'extern "C"' block</li>

				  <li>glapi: remove APPLE extensions from test</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.3.6</li>

				</ul>

				<p>Eric Anholt (4):</p>

				<ul>

				  <li>mesa: Drop incorrect A4B4G4R4 _mesa_format_matches_format_and_type() cases.</li>

				  <li>ac/nir: Fix compiler warning about uninitialized dw_addr.</li>

				  <li>glsl/tests: Fix strict aliasing warning about int64/double.</li>

				  <li>glsl/tests: Fix a compiler warning about signed/unsigned loop comparison.</li>

				</ul>

				<p>Francisco Jerez (1):</p>

				<ul>

				  <li>i965: Fix KHR_blend_equation_advanced with some render targets.</li>

				</ul>

				<p>Frank Binns (1):</p>

				<ul>

				  <li>egl/dri2: fix segfault when display initialisation fails</li>

				</ul>

				<p>George Kyriazis (1):</p>

				<ul>

				  <li>swr/rast: blend_epi32() should return Integer, not Float</li>

				</ul>

				<p>Gert Wollny (1):</p>

				<ul>

				  <li>r600: Take ALU_EXTENDED into account when evaluating jump offsets</li>

				</ul>

				<p>Gurchetan Singh (1):</p>

				<ul>

				  <li>mesa: don't clamp just based on ARB_viewport_array extension</li>

				</ul>

				<p>Iago Toral Quiroga (2):</p>

				<ul>

				  <li>i965/sbe: fix number of inputs for active components</li>

				  <li>i965/vec4: use a temp register to compute offsets for pull loads</li>

				</ul>

				<p>James Legg (1):</p>

				<ul>

				  <li>radv: Really use correct HTILE expanded words.</li>

				</ul>

				<p>Jason Ekstrand (3):</p>

				<ul>

				  <li>intel/isl: Add an isl_color_value_is_zero helper</li>

				  <li>vulkan/wsi/x11: Set OUT_OF_DATE if wait_for_special_event fails</li>

				  <li>intel/fs: Set up sampler message headers in the visitor on gen7+</li>

				</ul>

				<p>Jonathan Gray (1):</p>

				<ul>

				  <li>configure.ac: pthread-stubs not present on OpenBSD</li>

				</ul>

				<p>Jordan Justen (3):</p>

				<ul>

				  <li>i965: Create new program cache bo when clearing the program cache</li>

				  <li>program: Don't reset SamplersValidated when restoring from shader cache</li>

				  <li>intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview</li>

				</ul>

				<p>Juan A. Suarez Romero (14):</p>

				<ul>

				  <li>cherry-ignore: Explicit 18.0 only nominations</li>

				  <li>cherry-ignore: r600/compute: only mark buffer/image state dirty for fragment shaders</li>

				  <li>cherry-ignore: anv: Move setting current_pipeline to cmd_state_init</li>

				  <li>cherry-ignore: anv: Be more careful about fast-clear colors</li>

				  <li>cherry-ignore: Add patches that has a specific version for 17.3</li>

				  <li>cherry-ignore: r600: Take ALU_EXTENDED into account when evaluating jump offsets</li>

				  <li>cherry-ignore: intel/compiler: Memory fence commit must always be enabled for gen10+</li>

				  <li>cherry-ignore: i965: Avoid problems from referencing orphaned BOs after growing.</li>

				  <li>cherry-ignore: include all Meson related fixes</li>

				  <li>cherry-ignore: ac/shader: fix vertex input with components.</li>

				  <li>cherry-ignore: i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+.</li>

				  <li>cherry-ignore: anv/image: Separate modifiers from legacy scanout</li>

				  <li>cherry-ignore: glsl: Fix memory leak with known glsl_type instances</li>

				  <li>Update version to 17.3.7</li>

				</ul>

				<p>Karol Herbst (1):</p>

				<ul>

				  <li>nvir/nvc0: fix legalizing of ld unlock c0[0x10000]</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i965: Emit CS stall before MEDIA_VFE_STATE.</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>i965: perf: ensure reading config IDs from sysfs isn't interrupted</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>radeonsi: align command buffer starting address to fix some Raven hangs</li>

				  <li>configure.ac: blacklist libdrm 2.4.90</li>

				</ul>

				<p>Michal Navratil (1):</p>

				<ul>

				  <li>winsys/amdgpu: allow non page-aligned size bo creation from pointer</li>

				</ul>

				<p>Samuel Iglesias Gonsálvez (1):</p>

				<ul>

				  <li>glsl/linker: fix bug when checking precision qualifier</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>ac/nir: use ordered float comparisons except for not equal</li>

				  <li>Revert "mesa: do not trigger _NEW_TEXTURE_STATE in glActiveTexture()"</li>

				</ul>

				<p>Stephan Gerhold (1):</p>

				<ul>

				  <li>util/build-id: Fix address comparison for binaries with LOAD vaddr &gt; 0</li>

				</ul>

				<p>Thomas Hellstrom (2):</p>

				<ul>

				  <li>svga: Fix a leftover debug hack</li>

				  <li>loader_dri3/glx/egl: Reinstate the loader_dri3_vtable get_dri_screen callback</li>

				</ul>

				<p>Tim Rowley (1):</p>

				<ul>

				  <li>swr/rast: fix MemoryBuffer build break for llvm-6</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>nir: fix interger divide by zero crash during constant folding</li>

				</ul>

				<p>Tobias Droste (1):</p>

				<ul>

				  <li>gallivm: Use new LLVM fast-math-flags API</li>

				</ul>

				<p>Vadym Shovkoplias (1):</p>

				<ul>

				  <li>mesa: add glsl version query (v4)</li>

				</ul>

				<p>Vinson Lee (1):</p>

				<ul>

				  <li>swr/rast: Fix macOS macro.</li>

				</ul>

				</div>

				</body>

				</html>

									
										147

docs/relnotes/17.3.8.html
									
										Normal file
									
				@@ -0,0 +1,147 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.8 Release Notes / April 03, 2018</h1>

				<p>

				Mesa 17.3.8 is a bug fix release which fixes bugs found since the 17.3.7 release.

				</p>

				<p>

				Mesa 17.3.8 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				175d2ca9be2af3a8db6cd603986096d75da70f59699528d7b6675d542a305e23  mesa-17.3.8.tar.gz

				8f9d9bf281c48e4a8f5228816577263b4c655248dc7666e75034ab422951a6b1  mesa-17.3.8.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - [BSW/HD400] SynMark OglCSDof GPU hangs when shaders come from cache</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - Reading per-patch outputs in Tessellation Control Shader returns undefined values</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - [regression][hang] Trine1EE hangs GPU after loading screen on Mesa3D-17.3 and later</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - compiler assertion hit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>

				</ul>

				<h2>Changes</h2>

				<p>Axel Davy (3):</p>

				<ul>

				  <li>st/nine: Fix bad tracking of vs textures for NINESBT_ALL</li>

				  <li>st/nine: Fixes warning about implicit conversion</li>

				  <li>st/nine: Fix non inversible matrix check</li>

				</ul>

				<p>Caio Marcelo de Oliveira Filho (1):</p>

				<ul>

				  <li>anv/pipeline: fail if TCS/TES compile fail</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>radv: get correct offset into LDS for indexed vars.</li>

				</ul>

				<p>Derek Foreman (1):</p>

				<ul>

				  <li>egl/wayland: Make swrast display_sync the correct queue</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>meson/configure: detect endian.h instead of trying to guess when it's available</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>mesa: Don't write to user buffer in glGetTexParameterIuiv on error</li>

				  <li>i965/vec4: Fix null destination register in 3-source instructions</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>i965: Emit texture cache invalidates around blorp_copy</li>

				</ul>

				<p>Jordan Justen (2):</p>

				<ul>

				  <li>i965: Calculate thread_count in brw_alloc_stage_scratch</li>

				  <li>i965: Hard code CS scratch_ids_per_subslice for Cherryview</li>

				</ul>

				<p>Juan A. Suarez Romero (6):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.3.7</li>

				  <li>cherry-ignore: ac/nir: pass the nir variable through tcs loading.</li>

				  <li>cherry-ignore: radv: handle exporting view index to fragment shader. (v1.1)</li>

				  <li>cherry-ignore: omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}</li>

				  <li>cherry-ignore: docs: fix 18.0 release note version</li>

				  <li>Update version to 17.3.8</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>radeon/vce: move feedback command inside of destroy function</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>st/dri: fix OpenGL-OpenCL interop for GL_TEXTURE_BUFFER</li>

				</ul>

				<p>Rob Clark (1):</p>

				<ul>

				  <li>nir: fix per_vertex_output intrinsic</li>

				</ul>

				<p>Timothy Arceri (2):</p>

				<ul>

				  <li>glsl: fix infinite loop caused by bug in loop unrolling pass</li>

				  <li>nir: fix crash in loop unroll corner case</li>

				</ul>

				</div>

				</body>

				</html>

									
										162

docs/relnotes/17.3.9.html
									
										Normal file
									
				@@ -0,0 +1,162 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 17.3.9 Release Notes / April 18, 2018</h1>

				<p>

				Mesa 17.3.9 is a bug fix release which fixes bugs found since the 17.3.8 release.

				</p>

				<p>

				Mesa 17.3.9 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				4d625f65a1ff4cd8cfeb39e38f047507c6dea047502a0d53113c96f54588f340  mesa-17.3.9.tar.gz

				c5beb5fc05f0e0c294fefe1a393ee118cb67e27a4dca417d77c297f7d4b6e479  mesa-17.3.9.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (2):</p>

				<ul>

				  <li>dri_util: when overriding, always reset the core version</li>

				  <li>mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage</li>

				</ul>

				<p>Axel Davy (2):</p>

				<ul>

				  <li>st/nine: Declare lighting consts for ff shaders</li>

				  <li>st/nine: Do not use scratch for face register</li>

				</ul>

				<p>Bas Nieuwenhuizen (1):</p>

				<ul>

				  <li>ac/nir: Add workaround for GFX9 buffer views.</li>

				</ul>

				<p>Daniel Stone (1):</p>

				<ul>

				  <li>st/dri: Initialise modifier to INVALID for DRI2</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>glsl: remove unreachable assert()</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>gbm: remove never-implemented function</li>

				</ul>

				<p>Henri Verbeet (1):</p>

				<ul>

				  <li>mesa: Inherit texture view multi-sample information from the original texture images.</li>

				</ul>

				<p>Iago Toral Quiroga (1):</p>

				<ul>

				  <li>compiler/spirv: set is_shadow for depth comparitor sampling opcodes</li>

				</ul>

				<p>Jason Ekstrand (4):</p>

				<ul>

				  <li>nir/vars_to_ssa: Remove copies from the correct set</li>

				  <li>nir/lower_indirect_derefs: Support interp_var_at intrinsics</li>

				  <li>intel/vec4: Set channel_sizes for MOV_INDIRECT sources</li>

				  <li>nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination</li>

				</ul>

				<p>Juan A. Suarez Romero (3):</p>

				<ul>

				  <li>docs: add sha256 checksums for 17.3.8</li>

				  <li>cherry-ignore: Explicit 18.0 only nominations</li>

				  <li>Update version to 17.3.9</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>anv: fix number of planes for depth &amp; stencil</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: fix picking the method for resolve subpass</li>

				</ul>

				<p>Sergii Romantsov (1):</p>

				<ul>

				  <li>i965: Extend the negative 32-bit deltas to 64-bits</li>

				</ul>

				<p>Timothy Arceri (6):</p>

				<ul>

				  <li>gallium/pipebuffer: fix parenthesis location</li>

				  <li>glsl: always call do_lower_jumps() after loop unrolling</li>

				  <li>ac: add if/loop build helpers</li>

				  <li>radeonsi: make use of if/loop build helpers in ac</li>

				  <li>ac: make use of if/loop build helpers</li>

				  <li>mesa: free debug messages when destroying the debug state</li>

				</ul>

				<p>Xiong, James (1):</p>

				<ul>

				  <li>i965: return the fourcc saved in __DRIimage when possible</li>

				</ul>

				</div>

				</body>

				</html>

									
										321

docs/relnotes/18.0.0.html
									
										Normal file
									
				@@ -0,0 +1,321 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.0.0 Release Notes / March 27, 2018</h1>

				<p>

				Mesa 18.0.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 18.0.1.

				</p>

				<p>

				Mesa 18.0.0 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				93c2d3504b2871ac2146603fb1270f341d36a39695e2950a469c5eac74f98457  mesa-18.0.0.tar.gz

				694e5c3d37717d23258c1f88bc134223c5d1aac70518d2f9134d6df3ee791eea  mesa-18.0.0.tar.xz

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>Disk shader cache support for i965 when MESA_GLSL_CACHE_DISABLE environment variable is set to "0" or "false"</li>

				<li>GL_ARB_shader_atomic_counters and GL_ARB_shader_atomic_counter_ops on r600/evergreen+</li>

				<li>GL_ARB_shader_image_load_store and GL_ARB_shader_image_size on r600/evergreen+</li>

				<li>GL_ARB_shader_storage_buffer_object on r600/evergreen+</li>

				<li>GL_ARB_compute_shader on r600/evergreen+</li>

				<li>GL_ARB_cull_distance on r600/evergreen+</li>

				<li>GL_ARB_enhanced_layouts on r600/evergreen+</li>

				<li>GL_ARB_bindless_texture on nvc0/kepler</li>

				<li>OpenGL 4.3 on r600/evergreen with hw fp64 support</li>

				<li>Support 1 binary format for GL_ARB_get_program_binary on i965.

				    (For the 18.0 release, 0 formats continue to be supported in

				    compatibility profiles.)</li>

				<li>Cannonlake support on i965 and anv</li>

				</ul>
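				<p>
				As noted in the list above, the i965 disk shader cache is used when the
				MESA_GLSL_CACHE_DISABLE environment variable is set to "0" or "false".
				The following is a minimal sketch of preparing such an environment for a
				child process. The helper name <code>env_with_shader_cache</code> is
				hypothetical, and the program to launch is left to the caller.
				</p>

```python
import os


def env_with_shader_cache(base_env=None):
    """Return a copy of the environment with MESA_GLSL_CACHE_DISABLE set to "false".

    `base_env` defaults to the current process environment; the input mapping
    is copied, never modified in place.
    """
    env = dict(os.environ if base_env is None else base_env)
    # "0" and "false" are the two values named in the release notes.
    env["MESA_GLSL_CACHE_DISABLE"] = "false"
    return env


# Usage (assumes an OpenGL program of your choice is installed):
#   import subprocess
#   subprocess.run(["your-gl-application"], env=env_with_shader_cache())
```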

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85564">Bug 85564</a> - Dead Island rendering issues</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92363">Bug 92363</a> - [BSW/BDW] ogles1conform Gets test fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94739">Bug 94739</a> - Mesa 11.1.2 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97532">Bug 97532</a> - Regression: GLB 2.7 &amp; Glmark-2 GLES versions segfault due to linker precision error (259fc505) on dead variable</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97852">Bug 97852</a> - Unreal Engine corrupted preview viewport</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100438">Bug 100438</a> - glsl/ir.cpp:1376: ir_dereference_variable::ir_dereference_variable(ir_variable*): Assertion `var != NULL' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101378">Bug 101378</a> - interpolateAtSample check for input parameter is too strict</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101560">Bug 101560</a> - SPIR-V OpSwitch with int64 not supported even though shaderInt64 is true</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101691">Bug 101691</a> - gfx corruption on windowed 3d-apps running on dGPU</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102177">Bug 102177</a> - [SKL] ES31-CTS.core.sepshaderobjs.StateInteraction fails sporadically</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102264">Bug 102264</a> - Missing MESA_FORMAT_{B8G8R8A8,B8G8R8X8}_SRGB formats</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102354">Bug 102354</a> - Mesa 17.2 no longer can give SRGB-capable framebuffer on i965, even though Mesa 17.1.x does.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102358">Bug 102358</a> - WarThunder freezes at start, with activated vsync (vblank_mode=2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102435">Bug 102435</a> - [skl,kbl] [drm] GPU hang in Valve games based on Source 1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102503">Bug 102503</a> - Report SRGB framebuffer to SuperTuxKart to workaround SuperTuxKart crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102665">Bug 102665</a> - test_glsl_to_tgsi_lifetime.cpp:53:67: error: ‘&gt;&gt;’ should be ‘&gt; &gt;’ within a nested template argument list</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102677">Bug 102677</a> - [OpenGL CTS] KHR-GL45.CommonBugs.CommonBug_PerVertexValidation fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102680">Bug 102680</a> - [OpenGL CTS] KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102710">Bug 102710</a> - vkCmdBlitImage with arrayLayers &gt; 1 fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102774">Bug 102774</a> - [BDW] [Bisected] Absolute constant buffers break VAAPI in mpv</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102809">Bug 102809</a> - Rust shadows(?) flash random colours</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102897">Bug 102897</a> - Separate bind points are not implemented correctly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102955">Bug 102955</a> - HyperZ related rendering issue in ARK: Survival Evolved</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103007">Bug 103007</a> - [OpenGL CTS] [HSW] KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103085">Bug 103085</a> - [ivb byt hsw] piglit.spec.arb_indirect_parameters.tf-count-arrays</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103098">Bug 103098</a> - [OpenGL CTS] KHR-GL45.enhanced_layouts.varying_structure_locations fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103101">Bug 103101</a> - [SKL][bisected] DiRT Rally GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103115">Bug 103115</a> - [BSW BXT GLK] dEQP-VK.spirv_assembly.instruction.compute.sconvert.int32_to_int64</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103128">Bug 103128</a> - [softpipe] piglit fs-ldexp regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103227">Bug 103227</a> - [G965 G45 ILK] ES2-CTS.gtf.GL2ExtensionTests.texture_float.texture_float regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103283">Bug 103283</a> - drm_get_device_name_for_fd is broken on FreeBSD</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103388">Bug 103388</a> - Linking libcltgsi.la (llvm/codegen/libclllvm_la-common.lo) fails with &quot;error: no match for 'operator-'&quot; with GCC-7, Mesa from Git and current LLVM revisions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103393">Bug 103393</a> - glDispatchComputeGroupSizeARB : gl_GlobalInvocationID.x != gl_WorkGroupID.x * gl_LocalGroupSizeARB.x + gl_LocalInvocationID.x</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103412">Bug 103412</a> - gallium/wgl: Another fix to context creation without prior SetPixelFormat()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103496">Bug 103496</a> - svga_screen.c:26:46: error: git_sha1.h: No such file or directory</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103513">Bug 103513</a> - [build failure] radv_shader.c:683:2: error: format not a string literal and no format arguments [-Werror=format-security]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103519">Bug 103519</a> - wayland egl apps crash on start with mesa 17.2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103529">Bug 103529</a> - [GM45] GPU hang with mpv fullscreen (bisected)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103537">Bug 103537</a> - i965: Shadow of Mordor broken since commit 379b24a40d3d34ffdaaeb1b328f50e28ecb01468 on Haswell</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103544">Bug 103544</a> - Graphical glitches r600 in game this war of mine linux native</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103579">Bug 103579</a> - Vertex shader causes compiler to crash in SPIRV-to-NIR</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103616">Bug 103616</a> - Increased difference from reference image in shaders</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - [SNB] ES3-CTS.functional.shaders.precision</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103628">Bug 103628</a> - [BXT, GLK, BSW] KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103653">Bug 103653</a> - Unreal segfault since gallium/u_threaded: avoid syncs for get_query_result</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103658">Bug 103658</a> - addrlib/gfx9/gfx9addrlib.cpp:727:50: error: expected expression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103674">Bug 103674</a> - u_queue.c:173:7: error: implicit declaration of function 'timespec_get' is invalid in C99</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103759">Bug 103759</a> - plasma desktop corrupted rendering</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103784">Bug 103784</a> - [bisected] Egl changes breaks all of EGL</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103787">Bug 103787</a> - [BDW,BSW] gpu hang on spec.arb_pipeline_statistics_query.arb_pipeline_statistics_query-comp</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103801">Bug 103801</a> - [i965] &gt;Observer_ issue</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103808">Bug 103808</a> - [radeonsi, bisected] World of Warcraft scribbling all over screen</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103902">Bug 103902</a> - Portal 2 game  hangs at startup   with latest mesa dev</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103904">Bug 103904</a> - Source engine-based games won't hang at start without R600_DEBUG=vs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103909">Bug 103909</a> - anv_allocator.c:113:1: error: static declaration of ‘memfd_create’ follows non-static declaration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103942">Bug 103942</a> - KHR-GL46.enhanced_layouts.varying* regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103955">Bug 103955</a> - Using array in structure results in wrong GLSL compilation output</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103966">Bug 103966</a> - Mesa 17.2.5 implementation error: bad format MESA_FORMAT_Z_FLOAT32 in _mesa_unpack_uint_24_8_depth_stencil_row</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103988">Bug 103988</a> - Intermittent piglit failures with shader cache enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104005">Bug 104005</a> - [sklgt4e] GPU hangs in Car_Chase</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104119">Bug 104119</a> - radv: OpBitFieldInsert produces 0 with a loop counter for Insert</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104141">Bug 104141</a> - include/c11/threads_posix.h:96: undefined reference to `pthread_once'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104143">Bug 104143</a> - r600/sb: clobbers gl_Position -&gt; gl_FragCoord</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104163">Bug 104163</a> - [GEN9+] 2-3% perf drop in GfxBench Manhattan 3.1 from &quot;i965: Disable regular fast-clears (CCS_D) on gen9+&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104183">Bug 104183</a> - mesa-17.3.0/src/broadcom/qpu/qpu_pack.c:171]: (error) Invalid memcmp() argument</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104199">Bug 104199</a> - [i965 bisected] BIO and EM Vision in &gt;Observer_ is broken since commit af2c320190f3c73180f1610c8df955a7fa2a4d09</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104213">Bug 104213</a> - NULL pointer access crashes on compiling Vulkan compute shaders after &quot;anv: Add support for the variablePointers feature&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104214">Bug 104214</a> - Dota crashes when switching from game to desktop</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104226">Bug 104226</a> - [bisected] Anvil accesses uninitialized memory while compiling shaders</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104231">Bug 104231</a> - DispatchSanity_test.GL30 regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104246">Bug 104246</a> - Talos Principle Vulkan version crash: spirv_to_nir() returns NULL entry_point</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104271">Bug 104271</a> - i965: Timeout in dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.5</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104288">Bug 104288</a> - Steamroll needs allow_glsl_cross_stage_interpolation_mismatch=true</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104331">Bug 104331</a> - [r600g] Ogre demo &quot;TutorialUAV01&quot; crash at r600_decompress_color_images</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104338">Bug 104338</a> - NULL pointer access crash on Sacha Willems' Vulkan raytracing demo after &quot;spirv: Add basic type validation for OpLoad, OpStore, and OpCopyMemory&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104359">Bug 104359</a> - Mesa freezes in &quot;vtn_cfg_walk_blocks&quot; with Sacha Willems' hdr, parallaxmapping and specializationconstants Vulkan demos</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104381">Bug 104381</a> - swr fails to build since llvm-svn r321257</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104383">Bug 104383</a> - [KBL] Intel GPU hang with firefox</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104411">Bug 104411</a> - [CCS] lemonbar-xft GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104424">Bug 104424</a> - DOOM 2016 broken by spirv OpStore validation</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104487">Bug 104487</a> - [KBL] portal2_linux GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104490">Bug 104490</a> - [radeonsi/290x] Dota2 fails to start (can't create opengl context)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104492">Bug 104492</a> - Compute Shader: Wrong alignment when assigning struct value to structured SSBO</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104546">Bug 104546</a> - Crash happens when running compute pipeline after calling glxMakeCurrent two times</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104551">Bug 104551</a> - Check if Mako templates for Python are installed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104668">Bug 104668</a> - dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104677">Bug 104677</a> - radv_generate_graphics_pipeline_key reads input rate from incorrect binding</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104690">Bug 104690</a> - [G33] regression: piglit.spec.!opengl 1_4.draw-batch and gl-1_4-dlist-multidrawarrays</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104711">Bug 104711</a> - [skl CCS] Oxenfree (unity engine game) hangs GPU</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104742">Bug 104742</a> - [swrast] piglit gl-1.4-dlist-multidrawarrays regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104746">Bug 104746</a> - [swrast] piglit attribs regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104749">Bug 104749</a> - rasterizer/jitter/JitManager.cpp:252:91: error: no matching function for call to ‘llvm::DIBuilder::createBasicType(const char [8], int, llvm::dwarf::TypeKind)’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104762">Bug 104762</a> - Various segfaults/problems in qt/plasma</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104777">Bug 104777</a> - Attaching multiple shader objects for the same stage to a GLSL program triggers a linker error</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104884">Bug 104884</a> - memory leak with intel i965 mesa when running android container in Ubuntu</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert ‘_mm512_mask_blend_epi32((__mmask16)(ImmT), a, b)’ from ‘__m512i’ {aka ‘__vector(8) long long int’} to ‘SIMDImpl::SIMD512Impl::Float’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105065">Bug 105065</a> - Qt Programs occasionally fail to render with new Mesa (glGetProgramBinary)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105120">Bug 105120</a> - meson build broken</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - [BSW/HD400] SynMark OglCSDof GPU hangs when shaders come from cache</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105292">Bug 105292</a> - vkGetQueryPoolResults returns incorrect query status for large query buffers (bisected)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - Reading per-patch outputs in Tessellation Control Shader returns undefined values</li>

				</ul>

				<h2>Changes</h2>

				<ul>

				<li>Remove incomplete GLX_MESA_set_3dfx_mode from the Xlib libGL</li>

				</ul>

				</div>

				</body>

				</html>

									
										225

docs/relnotes/18.0.1.html
									
										Normal file
									
												
				@@ -0,0 +1,225 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.0.1 Release Notes / April 18, 2018</h1>

				<p>

				Mesa 18.0.1 is a bug fix release which fixes bugs found since the 18.0.0 release.

				</p>

				<p>

				Mesa 18.0.1 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				0c93ba892c0610f5dd87f2e2673b9445187995c395b3ddb33fd4260bfb291e89  mesa-18.0.1.tar.gz

				b2d2f5b5dbaab13e15cb0dcb5ec81887467f55ebc9625945b303a3647cd87954  mesa-18.0.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105567">Bug 105567</a> - meson/ninja: 1. mesa/vdpau incorrect symlinks in DESTDIR and 2. Ddri-drivers-path Dvdpau-libs-path overrides DESTDIR</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - [regression][hang] Trine1EE hangs GPU after loading screen on Mesa3D-17.3 and later</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - compiler assertion hit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105942">Bug 105942</a> - Graphical artefacts after update to mesa 18.0.0-2</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (2):</p>

				<ul>

				  <li>dri_util: when overriding, always reset the core version</li>

				  <li>mesa: adds some comments regarding MESA_GLES_VERSION_OVERRIDE usage</li>

				</ul>

				<p>Axel Davy (5):</p>

				<ul>

				  <li>st/nine: Fix bad tracking of vs textures for NINESBT_ALL</li>

				  <li>st/nine: Fixes warning about implicit conversion</li>

				  <li>st/nine: Fix non inversible matrix check</li>

				  <li>st/nine: Declare lighting consts for ff shaders</li>

				  <li>st/nine: Do not use scratch for face register</li>

				</ul>

				<p>Bas Nieuwenhuizen (3):</p>

				<ul>

				  <li>ac/nir: Add workaround for GFX9 buffer views.</li>

				  <li>radv: Don't set instance count using predication.</li>

				  <li>radv: Always reset draw user SGPRs after secondary command buffer.</li>

				</ul>

				<p>Caio Marcelo de Oliveira Filho (1):</p>

				<ul>

				  <li>anv/pipeline: fail if TCS/TES compile fail</li>

				</ul>

				<p>Daniel Stone (1):</p>

				<ul>

				  <li>st/dri: Initialise modifier to INVALID for DRI2</li>

				</ul>

				<p>Derek Foreman (1):</p>

				<ul>

				  <li>egl/wayland: Make swrast display_sync the correct queue</li>

				</ul>

				<p>Dylan Baker (4):</p>

				<ul>

				  <li>meson: don't use compiler.has_header</li>

				  <li>autotools: include meson_get_version</li>

				  <li>meson: Set .so version for xa like autotools does</li>

				  <li>meson: fix megadriver symlinking</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.0.0</li>

				</ul>

				<p>Eric Engestrom (3):</p>

				<ul>

				  <li>meson/configure: detect endian.h instead of trying to guess when it's available</li>

				  <li>docs: fix 18.0 release note version</li>

				  <li>gbm: remove never-implemented function</li>

				</ul>

				<p>Henri Verbeet (1):</p>

				<ul>

				  <li>mesa: Inherit texture view multi-sample information from the original texture images.</li>

				</ul>

				<p>Iago Toral Quiroga (1):</p>

				<ul>

				  <li>compiler/spirv: set is_shadow for depth comparitor sampling opcodes</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>i965/vec4: Fix null destination register in 3-source instructions</li>

				</ul>

				<p>Jason Ekstrand (4):</p>

				<ul>

				  <li>nir/vars_to_ssa: Remove copies from the correct set</li>

				  <li>nir/lower_indirect_derefs: Support interp_var_at intrinsics</li>

				  <li>intel/vec4: Set channel_sizes for MOV_INDIRECT sources</li>

				  <li>nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destination</li>

				</ul>

				<p>Juan A. Suarez Romero (5):</p>

				<ul>

				  <li>cherry-ignore anv: Be more careful about fast-clear colors</li>

				  <li>cherry-ignore: ac/shader: fix vertex input with components.</li>

				  <li>cherry-ignore: radv: handle exporting view index to fragment shader. (v1.1)</li>

				  <li>cherry-ignore: omx: always define ENABLE_ST_OMX_{BELLAGIO,TIZONIA}</li>

				  <li>Update version to 18.0.1</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>radeon/vce: move feedback command inside of destroy function</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>i965/perf: fix config registration when uploading to kernel</li>

				</ul>

				<p>Marc Dietrich (1):</p>

				<ul>

				  <li>meson: fix HAVE_LLVM version define in meson build</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>mesa: simplify MESA_GL_VERSION_OVERRIDE behavior of API override</li>

				</ul>

				<p>Mark Thompson (1):</p>

				<ul>

				  <li>st/va: Enable vaExportSurfaceHandle()</li>

				</ul>

				<p>Rob Clark (3):</p>

				<ul>

				  <li>nir: fix per_vertex_output intrinsic</li>

				  <li>freedreno/a5xx: fix page faults on last level</li>

				  <li>freedreno/a5xx: don't align height for PIPE_BUFFER</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>radv: fix picking the method for resolve subpass</li>

				  <li>radv: fix radv_layout_dcc_compressed() when image doesn't have DCC</li>

				</ul>

				<p>Sergii Romantsov (1):</p>

				<ul>

				  <li>i965: Extend the negative 32-bit deltas to 64-bits</li>

				</ul>

				<p>Timothy Arceri (7):</p>

				<ul>

				  <li>ac: add if/loop build helpers</li>

				  <li>radeonsi: make use of if/loop build helpers in ac</li>

				  <li>ac: make use of if/loop build helpers</li>

				  <li>glsl: fix infinite loop caused by bug in loop unrolling pass</li>

				  <li>nir: fix crash in loop unroll corner case</li>

				  <li>gallium/pipebuffer: fix parenthesis location</li>

				  <li>glsl: always call do_lower_jumps() after loop unrolling</li>

				</ul>

				<p>Xiong, James (1):</p>

				<ul>

				  <li>i965: return the fourcc saved in __DRIimage when possible</li>

				</ul>

				</div>

				</body>

				</html>

									
										144

docs/relnotes/18.0.2.html
									
										Normal file
									
												
				@@ -0,0 +1,144 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.0.2 Release Notes / April 28, 2018</h1>

				<p>

				Mesa 18.0.2 is a bug fix release which fixes bugs found since the 18.0.1 release.

				</p>

				<p>

				Mesa 18.0.2 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				ffd8dfe3337b474a3baa085f0e7ef1a32c7cdc3bed1ad810b2633919a9324840  mesa-18.0.2.tar.gz

				98fa159768482dc568b9f8bf0f36c7acb823fa47428ffd650b40784f16b9e7b3  mesa-18.0.2.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95009">Bug 95009</a> - [SNB] amd_shader_trinary_minmax.execution.built-in-functions.gs-mid3-ivec2-ivec2-ivec2 intermittent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95012">Bug 95012</a> - [SNB] glsl-1_50.execution.built-in-functions.gs-op tests intermittent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105320">Bug 105320</a> - Storage texel buffer access produces wrong results (RX Vega)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105775">Bug 105775</a> - SI reaches the maximum IB size in dwords and fail to submit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106074">Bug 106074</a> - radv: si_scissor_from_viewport returns incorrect result when using half-pixel viewport offset</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106126">Bug 106126</a> - eglMakeCurrent does not always ensure dri_drawable-&gt;update_drawable_info has been called for a new EGLSurface if another has been created and destroyed first</li>

				</ul>

				<h2>Changes</h2>

				<p>Bas Nieuwenhuizen (2):</p>

				<ul>

				  <li>ac/nir: Make the GFX9 buffer size fix apply to image loads/atomics too.</li>

				  <li>radv: Mark GTT memory as device local for APUs.</li>

				</ul>

				<p>Dylan Baker (2):</p>

				<ul>

				  <li>bin/install_megadrivers: fix DESTDIR and -D*-path</li>

				  <li>meson: don't build classic mesa tests without dri_drivers</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>intel/compiler: Add scheduler deps for instructions that implicitly read g0</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>i965/fs: Return mlen * 8 for size_read() for INTERPOLATE_AT_*</li>

				</ul>

				<p>Johan Klokkhammer Helsing (1):</p>

				<ul>

				  <li>st/dri: Fix dangling pointer to a destroyed dri_drawable</li>

				</ul>

				<p>Juan A. Suarez Romero (4):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.0.1</li>

				  <li>travis: radv needs LLVM 4.0</li>

				  <li>cherry-ignore: add explicit 18.1 only nominations</li>

				  <li>Update version to 18.0.2</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i965: Fix shadow batches to be the same size as the real BO.</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>anv: fix number of planes for depth &amp; stencil</li>

				</ul>

				<p>Lucas Stach (1):</p>

				<ul>

				  <li>etnaviv: fix texture_format_needs_swiz</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>radeonsi/gfx9: fix a hang with an empty first IB</li>

				  <li>glsl_to_tgsi: try harder to lower unsupported ir_binop_vector_extract</li>

				  <li>Revert "st/dri: Fix dangling pointer to a destroyed dri_drawable"</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>radv: fix scissor computation when using half-pixel viewport offset</li>

				  <li>radv/winsys: allow to submit up to 4 IBs for chips without chaining</li>

				</ul>

				<p>Thomas Hellstrom (1):</p>

				<ul>

				  <li>svga: Fix incorrect advertizing of EGL_KHR_gl_colorspace</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>mesa: free debug messages when destroying the debug state</li>

				</ul>

				</div>

				</body>

				</html>

									
										107

docs/relnotes/18.0.3.html
									
										Normal file
									
												
				@@ -0,0 +1,107 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.0.3 Release Notes / May 7, 2018</h1>

				<p>

				Mesa 18.0.3 is a bug fix release which fixes bugs found since the 18.0.2 release.

				</p>

				<p>

				Mesa 18.0.3 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				58cc5c5b1ab2a44e6e47f18ef6c29836ad06f95450adce635ce3c317507a171b  mesa-18.0.3.tar.gz

				099d9667327a76a61741a533f95067d76ea71a656e66b91507b3c0caf1d49e30  mesa-18.0.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105374">Bug 105374</a> - texture3d, a SaschaWillems demo, assert fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106147">Bug 106147</a> - SIGBUS in write_reloc() when Sacha Willems' &quot;texture3d&quot; Vulkan demo starts</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Rodriguez (1):</p>

				<ul>

				  <li>radv/winsys: fix leaking resources from bo's imported by fd</li>

				</ul>

				<p>Boyuan Zhang (1):</p>

				<ul>

				  <li>radeon/vcn: fix mpeg4 msg buffer settings</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>gallium/util: Fix incorrect refcounting of separate stencil.</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>anv/allocator: Don't shrink either end of the block pool</li>

				</ul>

				<p>Juan A. Suarez Romero (3):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.0.2</li>

				  <li>cherry-ignore: add explicit 18.1 only nominations</li>

				  <li>Update version to 18.0.3</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>st/omx/enc: fix blit setup for YUV LoadImage</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>util/u_queue: fix a deadlock in util_queue_finish</li>

				  <li>radeonsi/gfx9: workaround for INTERP with indirect indexing</li>

				</ul>

				<p>Nanley Chery (1):</p>

				<ul>

				  <li>i965/tex_image: Avoid the ASTC LDR workaround on gen9lp</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: compute the number of subpass attachments correctly</li>

				</ul>

				</div>

				</body>

				</html>

									
										157

docs/relnotes/18.0.4.html
									
										Normal file
									
												
				@@ -0,0 +1,157 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.0.4 Release Notes / May 17, 2018</h1>

				<p>

				Mesa 18.0.4 is a bug fix release which fixes bugs found since the 18.0.3 release.

				</p>

				<p>

				Mesa 18.0.4 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				d1dc3469faccdd73439479426952d71a9e8f684e8d03b6687063c12b13430801  mesa-18.0.4.tar.gz

				1f3bcfe7cef0a5c20dae2b41df5d7e0a985e06be0183fa4d43b6068fcba2920f  mesa-18.0.4.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91808">Bug 91808</a> - trine1 misrender r600g</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100430">Bug 100430</a> - [radv] graphical glitches on dolphin emulator</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106243">Bug 106243</a> - [kbl] GPU HANG: 9:0:0x85dffffb, in Cinnamon</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106480">Bug 106480</a> - A2B10G10R10_SNORM vertex attribute doesn't work.</li>

				</ul>

				<h2>Changes</h2>

				<p>Bas Nieuwenhuizen (3):</p>

				<ul>

				  <li>radv: Translate logic ops.</li>

				  <li>radv: Fix up 2_10_10_10 alpha sign.</li>

				  <li>radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega.</li>

				</ul>

				<p>Dave Airlie (3):</p>

				<ul>

				  <li>r600: fix constant buffer bounds.</li>

				  <li>radv: resolve all layers in compute resolve path.</li>

				  <li>radv: use compute path for multi-layer images.</li>

				</ul>

				<p>Deepak Rawat (1):</p>

				<ul>

				  <li>egl/x11: Send invalidate to driver on copy_region path in swap_buffer</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>mesa: Add missing support for glFogiv(GL_FOG_DISTANCE_MODE_NV)</li>

				</ul>

				<p>Jan Vesely (8):</p>

				<ul>

				  <li>clover: Add explicit virtual destructor to argument class</li>

				  <li>eg/compute: Drop reference on code_bo in destructor.</li>

				  <li>r600: Cleanup constant buffers on context destruction</li>

				  <li>eg/compute: Drop reference to kernel_param bo in destructor</li>

				  <li>pipe-loader: Free driver_name in error path</li>

				  <li>gallium/auxiliary: Add helper function to count the number of entries in hash table</li>

				  <li>winsys/radeon: Destroy fd_hash table when the last winsys is removed.</li>

				  <li>winsys/amdgpu: Destroy dev_hash table when the last winsys is removed.</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>i965,anv: Set the CS stall bit on the ISP disable PIPE_CONTROL</li>

				</ul>

				<p>Jose Maria Casanova Crespo (2):</p>

				<ul>

				  <li>intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediate</li>

				  <li>intel/compiler: fix brw_imm_w for negative 16-bit integers</li>

				</ul>

				<p>Juan A. Suarez Romero (7):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.0.3</li>

				  <li>cherry-ignore: add explicit 18.1 only nominations</li>

				  <li>cherry-ignore: glsl: change ast_type_qualifier bitset size to work around GCC 5.4 bug</li>

				  <li>cherry-ignore: mesa: fix glGetInteger/Float/etc queries for vertex arrays attribs</li>

				  <li>cherry-ignore: mesa: revert GL_[SECONDARY_]COLOR_ARRAY_SIZE glGet type to TYPE_INT</li>

				  <li>cherry-ignore: radv/resolve: do fmask decompress on all layers.</li>

				  <li>Update version to 18.0.4</li>

				</ul>

				<p>Kai Wasserbäch (1):</p>

				<ul>

				  <li>opencl: autotools: Fix linking order for OpenCL target</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i965: Don't leak blorp on Gen4-5.</li>

				</ul>

				<p>Lionel Landwerlin (2):</p>

				<ul>

				  <li>i965: require pixel scoreboard stall prior to ISP disable</li>

				  <li>anv: emit pixel scoreboard stall before ISP disable</li>

				</ul>

				<p>Matthew Nicholls (1):</p>

				<ul>

				  <li>radv: fix multisample image copies</li>

				</ul>

				<p>Neil Roberts (1):</p>

				<ul>

				  <li>spirv: Apply OriginUpperLeft to FragCoord</li>

				</ul>

				<p>Rhys Perry (1):</p>

				<ul>

				  <li>mesa: fix error handling in get_framebuffer_parameteriv</li>

				</ul>

				<p>Ross Burton (1):</p>

				<ul>

				  <li>src/intel/Makefile.vulkan.am: add missing MKDIR_GEN</li>

				</ul>

				</div>

				</body>

				</html>

									
										162

docs/relnotes/18.0.5.html
									
										Normal file
									
												
				@@ -0,0 +1,162 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.0.5 Release Notes / June 3, 2018</h1>

				<p>

				Mesa 18.0.5 is a bug fix release which fixes bugs found since the 18.0.4 release.

				</p>

				<p>

				Mesa 18.0.5 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				ea3e00329cea899b1e32db812fd2f426832be37e4baa2e2fd9288a3480f30531  mesa-18.0.5.tar.gz

				5187bba8d72aea78f2062d134ec6079a508e8216062dce9ec9048b5eb2c4fc6b  mesa-18.0.5.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78097">Bug 78097</a> - glUniform1ui and friends not supported by display lists</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102390">Bug 102390</a> - centroid interpolation causes broken attribute values</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105351">Bug 105351</a> - [Gen6+] piglit's arb_shader_image_load_store-host-mem-barrier fails with a glGetTexSubImage fallback path</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106090">Bug 106090</a> - Compiling compute shader crashes RADV</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106315">Bug 106315</a> - The witness + dxvk suffers flickering garbage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106465">Bug 106465</a> - No test for Image Load/Store on format-incompatible texture buffer</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106479">Bug 106479</a> - NDEBUG not defined for libamdgpu_addrlib</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106481">Bug 106481</a> - No test for Image Load/Store on texture buffer sized greater than MAX_TEXTURE_BUFFER_SIZE_ARB</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106504">Bug 106504</a> - vulkan SPIR-V parsing failed at ../src/compiler/spirv/vtn_cfg.c:381</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106587">Bug 106587</a> - Dota2 is very dark when using vulkan render on a Intel &lt;&lt; AMD prime setup</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106629">Bug 106629</a> - [SNB,IVB,HSW,BDW] dEQP-EGL.functional.image.create.gles2_cubemap_negative_z_rgb_read_pixels</li>

				</ul>

				<h2>Changes</h2>

				<p>Anuj Phogat (1):</p>

				<ul>

				  <li>i965/glk: Add l3 banks count for 2x6 configuration</li>

				</ul>

				<p>Bas Nieuwenhuizen (2):</p>

				<ul>

				  <li>amd/addrlib: Use defines in autotools build.</li>

				  <li>radv: Fix SRGB compute copies.</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>tgsi/scan: add hw atomic to the list of memory accessing files</li>

				</ul>

				<p>Francisco Jerez (4):</p>

				<ul>

				  <li>Revert "mesa: simplify _mesa_is_image_unit_valid for buffers"</li>

				  <li>i965: Move buffer texture size calculation into a common helper function.</li>

				  <li>i965: Handle non-zero texture buffer offsets in buffer object range calculation.</li>

				  <li>i965: Use intel_bufferobj_buffer() wrapper in image surface state setup.</li>

				</ul>

				<p>Jan Vesely (1):</p>

				<ul>

				  <li>eg/compute: Use reference counting to handle compute memory pool.</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0</li>

				  <li>intel/blorp: Support blits and clears on surfaces with offsets</li>

				</ul>

				<p>Jose Dapena Paz (1):</p>

				<ul>

				  <li>mesa: do not leak ctx-&gt;Shader.ReferencedProgram references</li>

				</ul>

				<p>Juan A. Suarez Romero (8):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.0.4</li>

				  <li>cherry-ignore: i965/miptree: Fix handling of uninitialized MCS buffers</li>

				  <li>cherry-ignore: add explicit 18.1 only nominations</li>

				  <li>cherry-ignore: mesa/st: handle vert_attrib_mask in nir case too</li>

				  <li>cherry-ignore: Tegra is not supported</li>

				  <li>cherry-ignore: st/mesa: fix assertion failures with GL_UNSIGNED_INT64_ARB (v2)</li>

				  <li>cherry-ignore: nv30: ensure that displayable formats are marked accordingly</li>

				  <li>Update version to 18.0.5</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>st/mesa: simplify lastLevel determination in st_finalize_texture</li>

				  <li>radeonsi: fix incorrect parentheses around VS-PS varying elimination</li>

				  <li>mesa: handle GL_UNSIGNED_INT64_ARB properly (v2)</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>dri3: Stricter SBC wraparound handling</li>

				</ul>

				<p>Nanley Chery (1):</p>

				<ul>

				  <li>i965/miptree: Zero-initialize CCS_D buffers</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>spirv: fix visiting inner loops with same break/continue block</li>

				  <li>radv: fix centroid interpolation</li>

				</ul>

				<p>Stuart Young (1):</p>

				<ul>

				  <li>etnaviv: Fix missing rnndb file in tarballs</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>mesa: add glUniform*ui{v} support to display lists</li>

				</ul>

				</div>

				</body>

				</html>

									
										268

docs/relnotes/18.1.0.html
									
										Normal file
									
												
				@@ -0,0 +1,268 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.0 Release Notes / May 18, 2018</h1>

				<p>

				Mesa 18.1.0 is a new development release. People who are concerned

				with stability and reliability should stick with a previous release or

				wait for Mesa 18.1.1.

				</p>

				<p>

				Mesa 18.1.0 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				b1c1dbb42597190503d3abc518b12de880623f097c6cb6c293ecf69ae87e6fbf  mesa-18.1.0.tar.gz

				c855c5b67ef993b7621f76d8b120769ec0415f1c3616eaff44ef7f7f300aceba  mesa-18.1.0.tar.xz

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>OpenGL 3.1 with ARB_compatibility on nv50, nvc0, r600, radeonsi, softpipe, llvmpipe, svga</li>

				<li>GL_ARB_bindless_texture on nvc0/maxwell+</li>

				<li>GL_ARB_transform_feedback_overflow_query on nvc0</li>

				<li>GL_EXT_semaphore on radeonsi</li>

				<li>GL_EXT_semaphore_fd on radeonsi</li>

				<li>GL_EXT_shader_framebuffer_fetch on i965 on desktop GL (GLES was already supported)</li>

				<li>GL_EXT_shader_framebuffer_fetch_non_coherent on i965</li>

				<li>GL_KHR_blend_equation_advanced on radeonsi</li>

				<li>Disk shader cache support for i965 enabled by default</li>

				</ul>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90311">Bug 90311</a> - Fail to build libglx with clang at linking stage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91808">Bug 91808</a> - trine1 misrender r600g</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95009">Bug 95009</a> - [SNB] amd_shader_trinary_minmax.execution.built-in-functions.gs-mid3-ivec2-ivec2-ivec2 intermittent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95012">Bug 95012</a> - [SNB] glsl-1_50.execution.built-in-functions.gs-op tests intermittent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98281">Bug 98281</a> - 'message's in ctx-&gt;Debug.LogMessages[] seem to leak.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99549">Bug 99549</a> - pp: Failed to translate a shader</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100259">Bug 100259</a> - [EGL] [GBM] undefined reference to `gbm_bo_create_with_modifiers'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101408">Bug 101408</a> - [Gen8+] Xonotic fails to render one of the weapons</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101442">Bug 101442</a> - Piglit shaders&#64;ssa&#64;fs-if-def-else-break fails with sb but passes with R600_DEBUG=nosb</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102342">Bug 102342</a> - mesa-17.1.7/src/gallium/auxiliary/pipebuffer/pb_cache.c:169]: (style) Suspicious condition</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102542">Bug 102542</a> - mesa-17.2.0/src/gallium/state_trackers/nine/nine_ff.c:1938: bad assignment ?</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102905">Bug 102905</a> - [R600] Miscompilation of TGSI to VLIW causes artifacts in Gallium Nine with Crysis2 bump mapping</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103006">Bug 103006</a> - [OpenGL CTS] [HSW] KHR-GL45.vertex_attrib_binding.basic-inputL-case1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103142">Bug 103142</a> - R600g+sb: optimizer apparently stuck in an endless loop</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103626">Bug 103626</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103746">Bug 103746</a> - [BDW BSW SKL KBL] dEQP-GLES31.functional.copy_image regressions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104302">Bug 104302</a> - Wolfenstein 2 (2017) under wine graphical artifacting on RADV</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104335">Bug 104335</a> - [OpenGL CTS][SKL,KBL] KHR-GL45.vertex_attrib_64bit.limits_test occasionally fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104625">Bug 104625</a> - semicolon after if</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104636">Bug 104636</a> - [BSW/HD400] Aztec Ruins GL version GPU hangs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104642">Bug 104642</a> - Android: NULL pointer dereference with i965 mesa-dev, seems build_id_length related</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104654">Bug 104654</a> - r600/sb: Alien Isolation GPU lock</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104668">Bug 104668</a> - dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104717">Bug 104717</a> - Rocket League: grass rendering broken with nir</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104732">Bug 104732</a> - [radv] Binding descriptor sets disturbs other pipeline bindings</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104741">Bug 104741</a> - Graphic corruption for Android apps Telegram and KineMaster</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104762">Bug 104762</a> - Various segfaults/problems in qt/plasma</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104777">Bug 104777</a> - Attaching multiple shader objects for the same stage to a GLSL program triggers a linker error</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104794">Bug 104794</a> - piglit.spec.arb_internalformat_query2.samples and num_sample_counts pname checks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104803">Bug 104803</a> - SIGSEGV in state_tracker/st_glsl_to_tgsi_temprename.cpp</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104863">Bug 104863</a> - 186 assertions in piglit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104884">Bug 104884</a> - memory leak with intel i965 mesa when running android container in Ubuntu</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104905">Bug 104905</a> - SpvOpFOrdEqual doesn't return correct results for NaNs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104908">Bug 104908</a> - Texture Compression Hint not converted to enum16</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104915">Bug 104915</a> - Indexed SHADING_LANGUAGE_VERSION query not supported</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104923">Bug 104923</a> - anv: Dota2 rendering corruption</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104989">Bug 104989</a> - [r600] [bisected] OpenGL applications can't render anything at all</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105013">Bug 105013</a> - [regression] GLX+VA-API+clutter-gst video playback is corrupt with Mesa 17.3 (but is fine with 17.2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105026">Bug 105026</a> - glxgears asserts with pp_jimenezmlaa=1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105029">Bug 105029</a> - simdlib_512_avx512.inl:371:57: error: could not convert ‘_mm512_mask_blend_epi32((__mmask16)(ImmT), a, b)’ from ‘__m512i’ {aka ‘__vector(8) long long int’} to ‘SIMDImpl::SIMD512Impl::Float’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105052">Bug 105052</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105065">Bug 105065</a> - Qt Programs occasionally fail to render with new Mesa (glGetProgramBinary)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105067">Bug 105067</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105088">Bug 105088</a> - brw_nir_uniforms.cpp:256:10: error: non-constant-expression cannot be narrowed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105098">Bug 105098</a> - [RADV] GPU freeze with simple Vulkan App</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105103">Bug 105103</a> - Wayland master causes Mesa to fail to compile</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105120">Bug 105120</a> - meson build broken</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105161">Bug 105161</a> - KHR_blend_equation_advanced doesn't work in GLSL 1.10-1.40 shaders</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105183">Bug 105183</a> - Weird assertion in NIR linker</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105211">Bug 105211</a> - build failure after zwp_dmabuf commit if wayland-protocols is not installed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105224">Bug 105224</a> - Webgl Pointclouds flickers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105229">Bug 105229</a> - [KBL SKL BDW HSW] [Regression] KHR-GLES31.core.shader_image_load_store.advanced-sso-simple failures</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105238">Bug 105238</a> - ast.h:648:16: error: union member 'i' has a non-trivial constructor</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105255">Bug 105255</a> - Waiting for fences without waitAll is not implemented</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105262">Bug 105262</a> - [R600] [BISECTED] ttf fonts are invisible in many programs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105271">Bug 105271</a> - WebGL2 shader crashes i965_dri.so 17.3.3</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105274">Bug 105274</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105290">Bug 105290</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105292">Bug 105292</a> - vkGetQueryPoolResults returns incorrect query status for large query buffers (bisected)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105317">Bug 105317</a> - The GPU Vega 56 was hang while try to pass #GraphicsFuzz shader15 test</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105320">Bug 105320</a> - Storage texel buffer access produces wrong results (RX Vega)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105374">Bug 105374</a> - texture3d, a SaschaWillems demo, assert fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105436">Bug 105436</a> - Blinking textures in UT2004 [bisected]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105440">Bug 105440</a> - GEN7: rendering issue on citra</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105442">Bug 105442</a> - Hang when running nine ff lighting shader with radeonsi</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105444">Bug 105444</a> - Enable GL disk shader cache when transform feedback is enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105464">Bug 105464</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105471">Bug 105471</a> - [g33] [bisected] dEQP-GLES2.functional.shaders failures</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105497">Bug 105497</a> - shader-db crashes on 72 core system after ast_type_qualifier bitset change</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105529">Bug 105529</a> - u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105567">Bug 105567</a> - meson/ninja: 1. mesa/vdpau incorrect symlinks in DESTDIR and 2. Ddri-drivers-path Dvdpau-libs-path overrides DESTDIR</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105621">Bug 105621</a> - Build failure on GNOME Continuous</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105634">Bug 105634</a> - Android build test fails when building brw_oa_metrics.c</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105670">Bug 105670</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105704">Bug 105704</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105717">Bug 105717</a> - [bisected] Mesa build tests fails: BIGENDIAN_CPU or LITTLEENDIAN_CPU must be defined</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105737">Bug 105737</a> - st_tests_common.cpp:140:42: error: no matching function for call to 'tgsi_get_opcode_info'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105738">Bug 105738</a> - commit f7ffa504a065dc2631fd38cc5fe885b277f4e7e7 causes artifacting in radv</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105740">Bug 105740</a> - glsl_types.cpp(524): error: a dynamically-initialized local static variable is not allowed inside of a statement expression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105775">Bug 105775</a> - SI reaches the maximum IB size in dwords and fail to submit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105807">Bug 105807</a> - [Regression, bisected]: 3D Rendering not working correctly in Warhammer 40k: Dawn of War II</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105817">Bug 105817</a> - scons build broken by glSpecializeShaderARB</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105820">Bug 105820</a> - [m32] piglit regressions relinking program without shaders</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105942">Bug 105942</a> - Graphical artefacts after update to mesa 18.0.0-2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105952">Bug 105952</a> - radv causes GPU hang on SI</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105960">Bug 105960</a> - [bisected] meson build test fails with: undefined reference to `etna_pm_create_query'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105994">Bug 105994</a> - surface state leak when creating and destroying image views with aspectMask depth and stencil</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106074">Bug 106074</a> - radv: si_scissor_from_viewport returns incorrect result when using half-pixel viewport offset</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106126">Bug 106126</a> - eglMakeCurrent does not always ensure dri_drawable-&gt;update_drawable_info has been called for a new EGLSurface if another has been created and destroyed first</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106131">Bug 106131</a> - meson/ninja build missing file gtest.h</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106133">Bug 106133</a> - make check &quot;OSError: [Errno 24] Too many open files&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106147">Bug 106147</a> - SIGBUS in write_reloc() when Sacha Willems' &quot;texture3d&quot; Vulkan demo starts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106174">Bug 106174</a> - vulkan dota2 broken (segfaulting), found bug commit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106180">Bug 106180</a> - [bisected] radv vulkan smoke test black screen (Add support for DRI3 v1.2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106243">Bug 106243</a> - [kbl] GPU HANG: 9:0:0x85dffffb, in Cinnamon</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106450">Bug 106450</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106462">Bug 106462</a> - piglit.spec.arb_vertex_array_bgra.get regression</li>

				</ul>

				<h2>Changes</h2>

				<ul>

				<li>Remove incomplete GLX_SGIX_swap_barrier stubs from the Xlib libGL</li>

				<li>Remove incomplete GLX_SGIX_swap_group stubs from the Xlib libGL</li>

				</ul>

				</div>

				</body>

				</html>

									
										168

docs/relnotes/18.1.1.html
									
										Normal file
									
												
				@@ -0,0 +1,168 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.1 Release Notes / June 1, 2018</h1>

				<p>

				Mesa 18.1.1 is a bug fix release which fixes bugs found since the 18.1.0 release.

				</p>

				<p>

				Mesa 18.1.1 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				366a35f7530a016f2a8284fb0ee5759eeb216b4d6fa47f0e96b89ad2e43faf96  mesa-18.1.1.tar.gz

				d3312a2ede5aac14a47476b208b8e3a401367838330197c4588ab8ad420d7781  mesa-18.1.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>None</p>

				<h2>Changes</h2>

				<p>Anuj Phogat (1):</p>

				<ul>

				  <li>i965/glk: Add l3 banks count for 2x6 configuration</li>

				</ul>

				<p>Bas Nieuwenhuizen (7):</p>

				<ul>

				  <li>radv: Fix multiview queries.</li>

				  <li>radv: Translate logic ops.</li>

				  <li>radv: Fix up 2_10_10_10 alpha sign.</li>

				  <li>radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega.</li>

				  <li>amd/addrlib: Use defines in autotools build.</li>

				  <li>radv: Fix SRGB compute copies.</li>

				  <li>radv: Only expose subgroup shuffles on VI+.</li>

				</ul>

				<p>Christoph Haag (1):</p>

				<ul>

				  <li>radv: fix VK_EXT_descriptor_indexing</li>

				</ul>

				<p>Dave Airlie (5):</p>

				<ul>

				  <li>radv/resolve: do fmask decompress on all layers.</li>

				  <li>radv: resolve all layers in compute resolve path.</li>

				  <li>radv: use compute path for multi-layer images.</li>

				  <li>virgl: set texture buffer offset alignment to disable ARB_texture_buffer_range.</li>

				  <li>tgsi/scan: add hw atomic to the list of memory accessing files</li>

				</ul>

				<p>Dylan Baker (2):</p>

				<ul>

				  <li>docs: Add sha sums for release</li>

				  <li>VERSION: bump to 18.1.1 for next release</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>vulkan: don't free uninitialised memory</li>

				</ul>

				<p>Francisco Jerez (4):</p>

				<ul>

				  <li>Revert "mesa: simplify _mesa_is_image_unit_valid for buffers"</li>

				  <li>i965: Move buffer texture size calculation into a common helper function.</li>

				  <li>i965: Handle non-zero texture buffer offsets in buffer object range calculation.</li>

				  <li>i965: Use intel_bufferobj_buffer() wrapper in image surface state setup.</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>nv30: ensure that displayable formats are marked accordingly</li>

				</ul>

				<p>Jan Vesely (1):</p>

				<ul>

				  <li>eg/compute: Use reference counting to handle compute memory pool.</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0</li>

				  <li>intel/blorp: Support blits and clears on surfaces with offsets</li>

				</ul>

				<p>Jose Dapena Paz (1):</p>

				<ul>

				  <li>mesa: do not leak ctx-&gt;Shader.ReferencedProgram references</li>

				</ul>

				<p>Kai Wasserbäch (1):</p>

				<ul>

				  <li>opencl: autotools: Fix linking order for OpenCL target</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>st/mesa: simplify lastLevel determination in st_finalize_texture</li>

				  <li>radeonsi: fix incorrect parentheses around VS-PS varying elimination</li>

				  <li>mesa: handle GL_UNSIGNED_INT64_ARB properly (v2)</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>dri3: Stricter SBC wraparound handling</li>

				</ul>

				<p>Nanley Chery (4):</p>

				<ul>

				  <li>i965: Add and use a getter for the miptree aux buffer</li>

				  <li>i965: Add and use a single miptree aux_buf field</li>

				  <li>i965/miptree: Fix handling of uninitialized MCS buffers</li>

				  <li>i965/miptree: Zero-initialize CCS_D buffers</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>spirv: fix visiting inner loops with same break/continue block</li>

				  <li>radv: fix centroid interpolation</li>

				</ul>

				<p>Stuart Young (1):</p>

				<ul>

				  <li>etnaviv: Fix missing rnndb file in tarballs</li>

				</ul>

				<p>Thierry Reding (3):</p>

				<ul>

				  <li>tegra: Treat resources with modifiers as scanout</li>

				  <li>tegra: Fix scanout resources without modifiers</li>

				  <li>tegra: Remove usage of non-stable UAPI</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>mesa: add glUniform*ui{v} support to display lists</li>

				</ul>

				</div>

				</body>

				</html>

									
										170

docs/relnotes/18.1.2.html
									
										Normal file
									
												
				@@ -0,0 +1,170 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.2 Release Notes / June 15 2018</h1>

				<p>

				Mesa 18.1.2 is a bug fix release which fixes bugs found since the 18.1.1 release.

				</p>

				<p>

				Mesa 18.1.2 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				a644df23937f4078a2bd9a54349f6315c1955f5e3a4ac272832da51dea4d3c11  mesa-18.1.1.tar.gz

				070bf0648ba5b242d7303ceed32aed80842f4c0ba16e5acc1a650a46eadfb1f9  mesa-18.1.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>None</p>

				<h2>Changes</h2>

				<p>Alex Smith (4):</p>

				<ul>

				  <li>radv: Consolidate GFX9 merged shader lookup logic</li>

				  <li>radv: Handle GFX9 merged shaders in radv_flush_constants()</li>

				  <li>radeonsi: Fix crash on shaders using MSAA image load/store</li>

				  <li>radv: Set active_stages the same whether or not shaders were cached</li>

				</ul>

				<p>Andrew Galante (2):</p>

				<ul>

				  <li>meson: Test for __atomic_add_fetch in atomic checks</li>

				  <li>configure.ac: Test for __atomic_add_fetch in atomic checks</li>

				</ul>

				<p>Bas Nieuwenhuizen (1):</p>

				<ul>

				  <li>radv: Don't pass a TESS_EVAL shader when tesselation is not enabled.</li>

				</ul>

				<p>Cameron Kumar (1):</p>

				<ul>

				  <li>vulkan/wsi: Destroy swapchain images after terminating FIFO queues</li>

				</ul>

				<p>Dylan Baker (6):</p>

				<ul>

				  <li>docs/relnotes: Add sha256 sums for mesa 18.1.1</li>

				  <li>cherry-ignore: add commits not to pull</li>

				  <li>cherry-ignore: Add patches from Jason that he rebased on 18.1</li>

				  <li>meson: work around gentoo applying -m32 to host compiler in cross builds</li>

				  <li>cherry-ignore: Add another patch</li>

				  <li>version: bump version for 18.1.2 release</li>

				</ul>

				<p>Eric Engestrom (3):</p>

				<ul>

				  <li>autotools: add missing android file to package</li>

				  <li>configure: radv depends on mako</li>

				  <li>i965: fix resource leak</li>

				</ul>

				<p>Jason Ekstrand (10):</p>

				<ul>

				  <li>intel/eu: Add some brw_get_default_ helpers</li>

				  <li>intel/eu: Copy fields manually in brw_next_insn</li>

				  <li>intel/eu: Set flag [sub]register number differently for 3src</li>

				  <li>intel/blorp: Don't vertex fetch directly from clear values</li>

				  <li>intel/isl: Add bounds-checking assertions in isl_format_get_layout</li>

				  <li>intel/isl: Add bounds-checking assertions for the format_info table</li>

				  <li>i965/screen: Refactor query_dma_buf_formats</li>

				  <li>i965/screen: Use RGBA non-sRGB formats for images</li>

				  <li>anv: Set fence/semaphore types to NONE in impl_cleanup</li>

				  <li>i965/screen: Return false for unsupported formats in query_modifiers</li>

				</ul>

				<p>Jordan Justen (1):</p>

				<ul>

				  <li>mesa/program_binary: add implicit UseProgram after successful ProgramBinary</li>

				</ul>

				<p>Juan A. Suarez Romero (1):</p>

				<ul>

				  <li>glsl: Add ir_binop_vector_extract in NIR</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>i965: Fix batch-last mode to properly swap BOs.</li>

				  <li>anv: Disable __gen_validate_value if NDEBUG is set.</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>r300g/swtcl: make pipe_context uploaders use malloc'd memory as before</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>meson: Fix -latomic check</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>glx: Fix number of property values to read in glXImportContextEXT</li>

				</ul>

				<p>Nicolas Boichat (1):</p>

				<ul>

				  <li>configure.ac/meson.build: Fix -latomic test</li>

				</ul>

				<p>Philip Rebohle (1):</p>

				<ul>

				  <li>radv: Use correct color format for fast clears</li>

				</ul>

				<p>Samuel Pitoiset (3):</p>

				<ul>

				  <li>radv: fix a GPU hang when MRTs are sparse</li>

				  <li>radv: fix missing ZRANGE_PRECISION(1) for GFX9+</li>

				  <li>radv: add a workaround for DXVK hangs by setting amdgpu-skip-threshold</li>

				</ul>

				<p>Scott D Phillips (1):</p>

				<ul>

				  <li>intel/tools: add intel_sanitize_gpu to EXTRA_DIST</li>

				</ul>

				<p>Thomas Petazzoni (1):</p>

				<ul>

				  <li>configure.ac: rework -latomic check</li>

				</ul>

				<p>Timothy Arceri (2):</p>

				<ul>

				  <li>ac: fix possible truncation of intrinsic name</li>

				  <li>radeonsi: fix possible truncation on renderer string</li>

				</ul>

				</div>

				</body>

				</html>

									
										167

docs/relnotes/18.1.3.html
									
										Normal file
									
												
				@@ -0,0 +1,167 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.3 Release Notes / June 29 2018</h1>

				<p>

				Mesa 18.1.3 is a bug fix release which fixes bugs found since the 18.1.2 release.

				</p>

				<p>

				Mesa 18.1.3 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				2a1e36280d01ad18ba6d5b3fbd653ceaa109eaa031b78eb5dfaa4df452742b66  mesa-18.1.3.tar.gz

				54f08deeda0cd2f818e8d40140040ed013de7852573002453b7f50da9ea738ce  mesa-18.1.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105396">Bug 105396</a> - tc compatible htile sets depth of htiles of discarded fragments to 1.0</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105399">Bug 105399</a> - [snb] GPU hang: after geometry shader emits no geometry, the program hangs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106756">Bug 106756</a> - Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106774">Bug 106774</a> - GLSL IR copy propagates loads of SSBOs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106903">Bug 106903</a> - radv: Fragment shader output goes to wrong attachments when render targets are sparse</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106907">Bug 106907</a> - Correct Transform Feedback Varyings information is expected after using ProgramBinary</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106912">Bug 106912</a> - radv: 16-bit depth buffer causes artifacts in Shadow Warrior 2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106980">Bug 106980</a> - Basemark GPU vulkan benchmark fails.</li>

				</ul>

				<h2>Changes</h2>

				<p>Andrii Simiklit (1):</p>

				<ul>

				  <li>i965/gen6/gs: Handle case where a GS doesn't allocate VUE</li>

				</ul>

				<p>Bas Nieuwenhuizen (2):</p>

				<ul>

				  <li>radv: Fix output for sparse MRTs.</li>

				  <li>ac/surface: Set compressZ for stencil-only surfaces.</li>

				</ul>

				<p>Christian Gmeiner (1):</p>

				<ul>

				  <li>util/bitset: include util/macro.h</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>glsl: allow standalone semicolons outside main()</li>

				</ul>

				<p>Dylan Baker (8):</p>

				<ul>

				  <li>docs: Add release notes for 18.1.2</li>

				  <li>cherry-ignore: Add 587e712eda95c31d88ea9d20e59ad0ae59afef4f</li>

				  <li>meson: Fix auto option for va</li>

				  <li>meson: Fix auto option for xvmc</li>

				  <li>meson: Correct behavior of vdpau=auto</li>

				  <li>cherry-ignore: Ignore cac7ab1192eefdd8d8b3f25053fb006b5c330eb8</li>

				  <li>cherry-ignore: add a2f5292c82ad07731d633b36a663e46adc181db9</li>

				  <li>VERSION: bump version to 18.1.3</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>configure: use compliant grep regex checks</li>

				  <li>glsl/tests/glcpp: reinstate "error out if no tests found"</li>

				</ul>

				<p>Eric Engestrom (3):</p>

				<ul>

				  <li>radv: fix reported number of available VGPRs</li>

				  <li>radv: fix bitwise check</li>

				  <li>meson: fix i965/anv/isl genX static lib names</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>glsl: Don't copy propagate from SSBO or shared variables either</li>

				  <li>glsl: Don't copy propagate elements from SSBO or shared variables either</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>nir: Handle call instructions in foreach_src</li>

				  <li>nir/validate: Use the type from the tail of call parameter derefs</li>

				</ul>

				<p>Lukas Rusak (2):</p>

				<ul>

				  <li>meson: only build vl_winsys_dri.c when x11 platform is used</li>

				  <li>meson: fix private libs when building without glx</li>

				</ul>

				<p>Marek Olšák (5):</p>

				<ul>

				  <li>radeonsi/gfx9: fix si_get_buffer_from_descriptors for 48-bit pointers</li>

				  <li>ac/gpu_info: report real total memory sizes</li>

				  <li>ac/gpu_info: add kernel_flushes_hdp_before_ib</li>

				  <li>radeonsi: always put persistent buffers into GTT on radeon</li>

				  <li>mesa: fix glGetInteger64v for arrays of integers</li>

				</ul>

				<p>Rob Clark (1):</p>

				<ul>

				  <li>freedreno/ir3: fix base_vertex</li>

				</ul>

				<p>Samuel Pitoiset (6):</p>

				<ul>

				  <li>radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8</li>

				  <li>radv: update the ZRANGE_PRECISION value for the TC-compat bug</li>

				  <li>radv: fix emitting the TCS regs on GFX9</li>

				  <li>radv: fix HTILE metadata initialization in presence of subpass clears</li>

				  <li>radv: ignore pInheritanceInfo for primary command buffers</li>

				  <li>radv: use separate bind points for the dynamic buffers</li>

				</ul>

				<p>Tapani Pälli (1):</p>

				<ul>

				  <li>glsl: serialize data from glTransformFeedbackVaryings</li>

				</ul>

				<p>Tomeu Vizoso (1):</p>

				<ul>

				  <li>virgl: Remove debugging left-overs</li>

				</ul>

				</div>

				</body>

				</html>

									
										150

docs/relnotes/18.1.4.html
									
										Normal file
									
												
				@@ -0,0 +1,150 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.4 Release Notes / July 13 2018</h1>

				<p>

				Mesa 18.1.4 is a bug fix release which fixes bugs found since the 18.1.3 release.

				</p>

				<p>

				Mesa 18.1.4 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				8acd42e4ac4d1e96ed22344073b3d4fef03d10f225f4eaf3f88c001dfc10e2db  mesa-18.1.4.tar.gz

				3061488b5d85504092cf4343816cfb2d96f2ad9bc2edec31fc96933d184cf58b  mesa-18.1.4.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106906">Bug 106906</a> - Failed to recongnize keyword “sampler2DRect” and &quot;sampler2DRectShadow&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106928">Bug 106928</a> - When starting a match Rocket League crashes on &quot;Go&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107193">Bug 107193</a> - piglit.spec.arb_compute_shader.linker.bug-93840 fails</li>

				</ul>

				<h2>Changes</h2>

				<p>Adam Jackson (1):</p>

				<ul>

				  <li>glx: Don't allow glXMakeContextCurrent() with only one valid drawable</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>r600/sb: cleanup if_conversion iterator to be legal C++</li>

				</ul>

				<p>Dylan Baker (2):</p>

				<ul>

				  <li>docs: Add SHA256 sums to notes for 18.1.3</li>

				  <li>Bump version for release</li>

				</ul>

				<p>Iago Toral Quiroga (3):</p>

				<ul>

				  <li>anv/cmd_buffer: make descriptors dirty when emitting base state address</li>

				  <li>anv/cmd_buffer: clean dirty push constants flag after emitting push constants</li>

				  <li>anv/cmd_buffer: never shrink the push constant buffer size</li>

				</ul>

				<p>Ian Romanick (4):</p>

				<ul>

				  <li>i965/vec4: Don't cmod propagate from CMP to ADD if the writemask isn't compatible</li>

				  <li>intel/compiler: Relax mixed type restriction for saturating immediates</li>

				  <li>i965/vec4: Properly handle sign(-abs(x))</li>

				  <li>i965/fs: Properly handle sign(-abs(x))</li>

				</ul>

				<p>Jason Ekstrand (3):</p>

				<ul>

				  <li>intel/fs: Split instructions low to high in lower_simd_width</li>

				  <li>anv: Be more careful about hashing pipeline layouts</li>

				  <li>intel/fs: Mark LINTERP opcode as writing accumulator on platforms without PLN</li>

				</ul>

				<p>Jose Maria Casanova Crespo (3):</p>

				<ul>

				  <li>i965/fs: Register allocator shoudn't use grf127 for sends dest</li>

				  <li>intel/compiler: grf127 can not be dest when src and dest overlap in send</li>

				  <li>i965/fs: unspills shoudn't use grf127 as dest since Gen8+</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>i965: fix clear color bo address relocation</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>radeonsi: fix memory exhaustion issue with DCC statistics gathering with DRI2</li>

				  <li>glsl/cache: save and restore ExternalSamplersUsed</li>

				  <li>st/dri: fix a crash in server_wait_sync</li>

				</ul>

				<p>Neil Roberts (1):</p>

				<ul>

				  <li>i965: Fix output register sizes when variable ranges are interleaved</li>

				</ul>

				<p>Rhys Perry (1):</p>

				<ul>

				  <li>nvc0/ir: fix TargetNVC0::insnCanLoadOffset()</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>r600/sb: fix crash in fold_alu_op3</li>

				</ul>

				<p>Ross Burton (1):</p>

				<ul>

				  <li>egl: fix build race in automake</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: fix emitting the view index on GFX9</li>

				</ul>

				<p>Timothy Arceri (2):</p>

				<ul>

				  <li>glsl: skip comparison opt when adding vars of different size</li>

				  <li>nir: fix selection of loop terminator when two or more have the same limit</li>

				</ul>

				<p>zhaowei yuan (1):</p>

				<ul>

				  <li>glsl: Treat sampler2DRect and sampler2DRectShadow as reserved in ES2</li>

				</ul>

				</div>

				</body>

				</html>

									
										183

docs/relnotes/18.1.5.html
									
										Normal file
									
												
				@@ -0,0 +1,183 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.5 Release Notes / July 13 2018</h1>

				<p>

				Mesa 18.1.5 is a bug fix release which fixes bugs found since the 18.1.4 release.

				</p>

				<p>

				Mesa 18.1.5 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				f966d5d5d373a5b8a16ed5036c1e7f05d4ad46d130f793bf9782c3ac9133a02e  mesa-18.1.5.tar.gz

				69dbe6f1a6660386f5beb85d4fcf003ee23023ed7b9a603de84e9a37e8d98dea  mesa-18.1.5.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103274">Bug 103274</a> - BRW allocates too much heap memory</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107275">Bug 107275</a> - NIR segfaults after spirv-opt</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107295">Bug 107295</a> - Access violation on glDrawArrays with count &gt;= 2048</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107312">Bug 107312</a> - Mesa-git RPM build fails after commit 8cacf38f527d42e41441ef8c25d95d4b2f4e8602</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107366">Bug 107366</a> - NIR verification crashes on piglit tests</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Smith (1):</p>

				<ul>

				  <li>anv: Pay attention to VK_ACCESS_MEMORY_(READ|WRITE)_BIT</li>

				</ul>

				<p>Bas Nieuwenhuizen (7):</p>

				<ul>

				  <li>radv: Select correct entries for binning.</li>

				  <li>radv: Fix number of samples used for binning.</li>

				  <li>radv: Disable disabled color buffers in rbplus opts.</li>

				  <li>nir: Do not use continue block after removing it.</li>

				  <li>util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache.</li>

				  <li>nir: Fix end of function without return warning/error.</li>

				  <li>radv: Still enable inmemory &amp; API level caching if disk cache is not enabled.</li>

				</ul>

				<p>Chad Versace (2):</p>

				<ul>

				  <li>anv/android: Fix type error in call to vk_errorf()</li>

				  <li>anv/android: Fix Autotools build for VK_ANDROID_native_buffer</li>

				</ul>

				<p>Chih-Wei Huang (1):</p>

				<ul>

				  <li>Android: fix a missing nir_intrinsics.h error</li>

				</ul>

				<p>Danylo Piliaiev (1):</p>

				<ul>

				  <li>i965: Sweep NIR after linking phase to free held memory</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>r600: enable tess_input_info for TES</li>

				</ul>

				<p>Dylan Baker (5):</p>

				<ul>

				  <li>docs: Add sha256 sums for 18.1.4 tarballs</li>

				  <li>cherry-ignore: add 4a67ce886a7b3def5f66c1aedf9e5436d157a03c</li>

				  <li>cherry-ignore: Add 1f616a840eac02241c585d28e9dac8f19a297f39</li>

				  <li>cherry-ignore: add 11712b9ca17e4e1a819dcb7d020e19c6da77bc90</li>

				  <li>bump version to 18.1.5</li>

				</ul>

				<p>Eric Anholt (2):</p>

				<ul>

				  <li>vc4: Don't automatically reallocate a PERSISTENT-mapped buffer.</li>

				  <li>meson: Move xvmc test tools from unit tests to installed tools.</li>

				</ul>

				<p>Harish Krupo (1):</p>

				<ul>

				  <li>egl: Fix missing clamping in eglSetDamageRegionKHR</li>

				</ul>

				<p>Jan Vesely (3):</p>

				<ul>

				  <li>radeonsi: Refuse to accept code with unhandled relocations</li>

				  <li>clover: Report error when pipe driver fails to create compute state</li>

				  <li>clover: Catch errors from executing event action</li>

				</ul>

				<p>Jason Ekstrand (6):</p>

				<ul>

				  <li>anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAV</li>

				  <li>nir/serialize: Alloc constants off the variable</li>

				  <li>blorp: Handle the RGB workaround more like other workarounds</li>

				  <li>intel/blorp: Handle 3-component formats in clears</li>

				  <li>intel/compiler: Account for built-in uniforms in analyze_ubo_ranges</li>

				  <li>spirv: Fix a couple of image atomic load/store bugs</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>gallium/tests: Don't ignore S3TC errors.</li>

				</ul>

				<p>Karol Herbst (1):</p>

				<ul>

				  <li>nir: fix printing of vec16 type</li>

				</ul>

				<p>Lepton Wu (1):</p>

				<ul>

				  <li>virgl: Fix flush in virgl_encoder_inline_write.</li>

				</ul>

				<p>Lucas Stach (1):</p>

				<ul>

				  <li>st/mesa: call resource_changed when binding a EGLImage to a texture</li>

				</ul>

				<p>Mauro Rossi (2):</p>

				<ul>

				  <li>radv: winsys/amdgpu: include missing pthread.h header</li>

				  <li>android: util/disk_cache: fix building errors in gallium drivers</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>gallium: Check pipe_screen::resource_changed before dereferencing it</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>draw: force draw pipeline if there's more than 65535 vertices</li>

				</ul>

				<p>Samuel Iglesias Gonsálvez (1):</p>

				<ul>

				  <li>anv: fix assert in anv_CmdBindDescriptorSets()</li>

				</ul>

				<p>Samuel Pitoiset (3):</p>

				<ul>

				  <li>radv: make sure to wait for CP DMA when needed</li>

				  <li>radv: emit a dummy ZPASS_DONE to prevent GPU hangs on GFX9</li>

				  <li>radv: fix a memleak for merged shaders on GFX9</li>

				</ul>

				</div>

				</body>

				</html>

									
										188

docs/relnotes/18.1.6.html
									
										Normal file
									
												
				@@ -0,0 +1,188 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.6 Release Notes / August 13 2018</h1>

				<p>

				Mesa 18.1.6 is a bug fix release which fixes bugs found since the 18.1.5 release.

				</p>

				<p>

				Mesa 18.1.6 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				580e03328ffefe1fd43b19ab7669f20d931601a1c0a4c0f8b9c65d6e81a06df3  mesa-18.1.6.tar.gz

				bb7ce759069801804fcfb8152da3457f76cd7b4e0096e4870ff5adcb5c894289  mesa-18.1.6.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=13728">Bug 13728</a> - [G965] Some objects in Neverwinter Nights Linux version not displayed correctly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98699">Bug 98699</a> - &quot;float[a+++4 ? 1:1] f;&quot; crashes glsl_compiler</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99730">Bug 99730</a> - Metro Redux game(s) needs override for midshader extension declaration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106382">Bug 106382</a> - Shader cache breaks INTEL_DEBUG=shader_time</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107117">Bug 107117</a> - mesa-18.1: regression with TFP on intel with modesettings and glamor acceleration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107212">Bug 107212</a> - Dual-Core CPU E5500 / G45: RetroArch with reicast core results in corrupted graphics</li>

				</ul>

				<h2>Changes</h2>

				<p>Adam Jackson (1):</p>

				<ul>

				  <li>glx: GLX_MESA_multithread_makecurrent is direct-only</li>

				</ul>

				<p>Andres Gomez (3):</p>

				<ul>

				  <li>ddebug: use util_snprintf() in dd_get_debug_filename_and_mkdir</li>

				  <li>gallium/aux/util: use util_snprintf() in test_texture_barrier</li>

				  <li>glsl: use util_snprintf()</li>

				</ul>

				<p>Christian Gmeiner (1):</p>

				<ul>

				  <li>etnaviv: fix typo in query names</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>r600: reduce num compute threads to 1024.</li>

				</ul>

				<p>Dylan Baker (6):</p>

				<ul>

				  <li>docs: Add sha-256 sums for 18.1.5</li>

				  <li>nir/meson: fix c vs cpp args for nir test</li>

				  <li>gallium: fix ddebug on windows</li>

				  <li>cherry-ignore: add patches that get-pick-list is finding in error</li>

				  <li>cherry-ignore: Add some additional patches that are for 18.2</li>

				  <li>bump version to 18.1.6</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>swr: don't export swr_create_screen_internal</li>

				  <li>automake: require shared glapi when using DRI based libGL</li>

				  <li>autotools: error out when using the broken --with-{gl, osmesa}-lib-name</li>

				  <li>autotools: error out when building with mangling and glvnd</li>

				  <li>autotools: use correct gl.pc LIBS when using glvnd</li>

				</ul>

				<p>Eric Anholt (4):</p>

				<ul>

				  <li>vc4: Fix a leak of the no-vertex-elements workaround BO.</li>

				  <li>vc4: Respect a sampler view's first_layer field.</li>

				  <li>vc4: Ignore samplers for finding uniform offsets.</li>

				  <li>egl: Fix leak of X11 pixmaps backing pbuffers in DRI3.</li>

				</ul>

				<p>Gert Wollny (1):</p>

				<ul>

				  <li>meson, install_megadrivers: Also remove stale symlinks</li>

				</ul>

				<p>Jan Vesely (2):</p>

				<ul>

				  <li>clover: Reduce wait_count in abort path.</li>

				  <li>clover: Don't extend illegal integer types.</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>nir: Take if uses into account in ssa_def_components_read</li>

				  <li>i965/fs: Flag all slots of a flat input as flat</li>

				</ul>

				<p>Jon Turney (1):</p>

				<ul>

				  <li>meson: use correct keyword to fix a meson warning</li>

				</ul>

				<p>Jordan Justen (2):</p>

				<ul>

				  <li>i965, anv: Use INTEL_DEBUG for disk_cache driver flags</li>

				  <li>i965: Disable shader cache with INTEL_DEBUG=shader_time</li>

				</ul>

				<p>Juan A. Suarez Romero (2):</p>

				<ul>

				  <li>wayland/egl: update surface size on window resize</li>

				  <li>wayland/egl: initialize window surface size to window size</li>

				</ul>

				<p>Karol Herbst (2):</p>

				<ul>

				  <li>nir/lower_int64: mark all metadata as dirty</li>

				  <li>nvc0/ir: return 0 in imageLoad on incomplete textures</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>intel: Fix SIMD16 unaligned payload GRF reads on Gen4-5.</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>ac/surface: fix MSAA corruption on Vega due to FMASK tile swizzle</li>

				</ul>

				<p>Mauro Rossi (2):</p>

				<ul>

				  <li>radv: generate entrypoints for VK_ANDROID_native_buffer</li>

				  <li>radv: move vk_format_table.c to generated sources</li>

				</ul>

				<p>Olivier Fourdan (1):</p>

				<ul>

				  <li>dri3: For 1.2, use root window instead of pixmap drawable</li>

				</ul>

				<p>Tapani Pälli (1):</p>

				<ul>

				  <li>glsl: handle error case with ast_post_inc, ast_post_dec</li>

				</ul>

				<p>Vlad Golovkin (1):</p>

				<ul>

				  <li>swr: Remove unnecessary memset call</li>

				</ul>

				<p>vadym.shovkoplias (1):</p>

				<ul>

				  <li>drirc: Allow extension midshader for Metro Redux</li>

				</ul>

				</div>

				</body>

				</html>

									
										104

docs/relnotes/18.1.7.html
									
										Normal file
									
												
				@@ -0,0 +1,104 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.7 Release Notes / August 24 2018</h1>

				<p>

				Mesa 18.1.7 is a bug fix release which fixes bugs found since the 18.1.6 release.

				</p>

				<p>

				Mesa 18.1.7 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				0c3c240bcd1352d179e65993214f9d55a399beac852c3ab4433e8df9b6c51c83  mesa-18.1.7.tar.gz

				655e3b32ce3bdddd5e6e8768596e5d4bdef82d0dd37067c324cc4b2daa207306  mesa-18.1.7.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105975">Bug 105975</a> - i965 always reports 0 viewport subpixel bits</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107098">Bug 107098</a> - Segfault after munmap(kms_sw_dt-&gt;ro_mapped)</li>

				</ul>

				<h2>Changes</h2>

				<p>Alexander Tsoy (1):</p>

				<ul>

				  <li>meson: fix build for egl platform_x11 without dri3 and gbm</li>

				</ul>

				<p>Bas Nieuwenhuizen (1):</p>

				<ul>

				  <li>radv: Fix missing Android platform define.</li>

				</ul>

				<p>Danylo Piliaiev (1):</p>

				<ul>

				  <li>i965: Advertise 8 bits subpixel precision for viewport bounds on gen6+</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>r600/eg: rework atomic counter emission with flushes</li>

				</ul>

				<p>Dylan Baker (7):</p>

				<ul>

				  <li>docs: Add sha256 sums for 18.1.6</li>

				  <li>cherry-ignore: Add additional 18.2 only patches</li>

				  <li>cherry-ignore: Add more 18.2 patches</li>

				  <li>cherry-ignore: Add more 18.2 patches</li>

				  <li>cherry-ignore: Add a couple of patches with &gt; 1 fixes tags</li>

				  <li>cherry-ignore: more 18.2 patches</li>

				  <li>bump version for 18.1.7 release</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>intel: Switch the order of the 2x MSAA sample positions</li>

				  <li>anv/lower_ycbcr: Use the binding array size for bounds checks</li>

				</ul>

				<p>Ray Strode (1):</p>

				<ul>

				  <li>gallium/winsys/kms: don't unmap what wasn't mapped</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv/winsys: fix creating the BO list for virtual buffers</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>radv: add Doom workaround</li>

				</ul>

				</div>

				</body>

				</html>

									
										180

docs/relnotes/18.1.8.html
									
												
				@@ -0,0 +1,180 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.8 Release Notes / September 7 2018</h1>

				<p>

				Mesa 18.1.8 is a bug fix release which fixes bugs found since the 18.1.7 release.

				</p>

				<p>

				Mesa 18.1.8 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				8ec62f215dd1bb3910987f9941c6fc31632a0874e618815cf1e8e29445c86e0a  mesa-18.1.8.tar.gz

				bd1be67fe9c73b517765264ac28911c84144682d28dbff140e1c2deb2f44c21b  mesa-18.1.8.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93355">Bug 93355</a> - [BXT,SKLGT4e] intermittent ext_framebuffer_multisample.accuracy fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101247">Bug 101247</a> - Mesa fails to link GLSL programs with unused output blocks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104809">Bug 104809</a> - anv: DOOM 2016 and Wolfenstein II:The New Colossus crash due to not having depthBoundsTest</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105904">Bug 105904</a> - Needed to delete mesa shader cache after driver upgrade for 32 bit wine vulkan programs to work.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106738">Bug 106738</a> - No test for miptrees with DRI modifiers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106865">Bug 106865</a> - [GLK] piglit.spec.ext_framebuffer_multisample.accuracy stencil tests fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107359">Bug 107359</a> - [Regression] [bisected] [OpenGL CTS] [SKL,BDW] KHR-GL46.texture_barrier*-texels, GTF-GL46.gtf21.GL2FixedTests.buffer_corners.buffer_corners, and GTF-GL46.gtf21.GL2FixedTests.stencil_plane_corners.stencil_plane_corners fail with some configuration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107477">Bug 107477</a> - [DXVK] Setting high shader quality in GTA V results in LLVM error</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107579">Bug 107579</a> - [SNB] The graphic corruption when we reuse the GS compiled and used for TFB when statebuffer contain magic trash in the unused space</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107601">Bug 107601</a> - Rise of the Tomb Raider Segmentation Fault when the game starts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107760">Bug 107760</a> - GPU Hang when Playing DiRT 3 Complete Edition using Steam Play with DXVK</li>

				</ul>

				<h2>Changes</h2>

				<p>Andrii Simiklit (1):</p>

				<ul>

				  <li>i965/gen6/xfb: handle case where transform feedback is not active</li>

				</ul>

				<p>Bas Nieuwenhuizen (3):</p>

				<ul>

				  <li>radv: Add missing checks in radv_get_image_format_properties.</li>

				  <li>radv: Fix CMASK dimensions.</li>

				  <li>radv: Use a lower max offchip buffer count.</li>

				</ul>

				<p>Christian Gmeiner (1):</p>

				<ul>

				  <li>tegra: fix memory leak</li>

				</ul>

				<p>Daniel Stone (1):</p>

				<ul>

				  <li>st/dri: Don't expose sRGB formats to clients</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>ac/radeonsi: fix CIK copy max size</li>

				</ul>

				<p>Dylan Baker (10):</p>

				<ul>

				  <li>docs: Add mesa 18.1.7 notes</li>

				  <li>cherry-ignore: add a patch</li>

				  <li>cherry-ignore: Add more 18.2 only patches</li>

				  <li>meson: Actually load translation files</li>

				  <li>cherry-ignore: Add more 18.2 patches</li>

				  <li>cherry-ignore: Add additional patch</li>

				  <li>cherry-ignore: Add patch that doesn't apply to 18.1</li>

				  <li>cherry-ignore: Add a couple of two fixes warning patches</li>

				  <li>cherry-ignore: Add patch that needs more significant patches to function</li>

				  <li>Bump version to 18.1.8</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>docs: update required mako version</li>

				</ul>

				<p>Grazvydas Ignotas (1):</p>

				<ul>

				  <li>radv: place pointer length into cache uuid</li>

				</ul>

				<p>Gurchetan Singh (2):</p>

				<ul>

				  <li>meson: fix egl build for surfaceless</li>

				  <li>meson: fix egl build for android</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>i965/vec4: Clamp indirect tes input array reads with 0x0fffffff</li>

				  <li>i965/vec4: Correctly handle uniform sources in generate_tes_add_indirect_urb_offset</li>

				</ul>

				<p>Jason Ekstrand (5):</p>

				<ul>

				  <li>anv: Fill holes in the VF VUE to zero</li>

				  <li>nir/algebraic: Be more careful converting ushr to extract_u8/16</li>

				  <li>egl/dri2: Add a helper for the number of planes for a FOURCC format</li>

				  <li>egl/dri2: Guard against invalid fourcc formats</li>

				  <li>anv/blorp: Do more flushing around HiZ clears</li>

				</ul>

				<p>Juan A. Suarez Romero (1):</p>

				<ul>

				  <li>egl/wayland: do not leak wl_buffer when it is locked</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>anv: blorp: support multiple aspect blits</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>glapi: actually implement GL_EXT_robustness for GLES</li>

				</ul>

				<p>Nanley Chery (7):</p>

				<ul>

				  <li>intel/isl: Avoid tiling some 16K-wide render targets</li>

				  <li>i965: Make blt_pitch public</li>

				  <li>i965/miptree: Drop an if case from retile_as_linear</li>

				  <li>i965/miptree: Use the correct BLT pitch</li>

				  <li>i965/miptree: Use miptree_map in map_blit functions</li>

				  <li>i965/miptree: Fix can_blit_slice()</li>

				  <li>i965/gen7_urb: Re-emit PUSH_CONSTANT_ALLOC on some gen9</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: fix passing clip/cull distances from VS to PS</li>

				</ul>

				<p>vadym.shovkoplias (1):</p>

				<ul>

				  <li>glsl/linker: Allow unused in blocks which are not declated on previous stage</li>

				</ul>

				</div>

				</body>

				</html>

									
										178

docs/relnotes/18.1.9.html
									
												
				@@ -0,0 +1,178 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.1.9 Release Notes / September 24 2018</h1>

				<p>

				Mesa 18.1.9 is a bug fix release which fixes bugs found since the 18.1.8 release.

				</p>

				<p>

				Mesa 18.1.9 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				0f825dc834b1b3e3d9a6c3ce58b42977f0d9a248a7627a36dd3b313ffe41a499  mesa-18.1.9.tar.gz

				55f5778d58a710a63d6635f000535768faf7db9e8144dc0f4fd1989f936c1a83  mesa-18.1.9.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103241">Bug 103241</a> - Anv crashes when using 64-bit vertex inputs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104926">Bug 104926</a> - swrast: Mesa 17.3.3 produces:  HW cursor for format 875713089 not supported</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107280">Bug 107280</a> - [DXVK] Batman: Arkham City with tessellation enabled hangs on SKL GT4</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107772">Bug 107772</a> - Mesa preprocessor matches if(def)s &amp; endifs incorrectly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107779">Bug 107779</a> - Access violation with some games</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107810">Bug 107810</a> - The 'va_end' call is missed after 'va_copy' in 'util_vsnprintf' function under windows</li>

				</ul>

				<h2>Changes</h2>

				<p>Andrii Simiklit (4):</p>

				<ul>

				  <li>apple/glx/log: added missing va_end() after va_copy()</li>

				  <li>mesa/util: don't use the same 'va_list' instance twice</li>

				  <li>mesa/util: don't ignore NULL returned from 'malloc'</li>

				  <li>mesa/util: add missing va_end() after va_copy()</li>

				</ul>

				<p>Bas Nieuwenhuizen (4):</p>

				<ul>

				  <li>radv: Use build ID if available for cache UUID.</li>

				  <li>radv: Only allow 16 user SGPRs for compute on GFX9+.</li>

				  <li>radv: Set the user SGPR MSB for Vega.</li>

				  <li>radv: Fix driver UUID SHA1 init.</li>

				</ul>

				<p>Christopher Egert (1):</p>

				<ul>

				  <li>radeon: fix ColorMask</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>virgl: don't send a shader create with no data. (v2)</li>

				</ul>

				<p>Dylan Baker (10):</p>

				<ul>

				  <li>docs/relnotes: Add sha256 sums for mesa 18.1.8</li>

				  <li>cherry-ignore: Add additional 18.2 patch</li>

				  <li>meson: Print a message about why a libdrm version was selected</li>

				  <li>cherry-ignore: add another 18.2 patch</li>

				  <li>cherry-ignore: Add patches that don't apply cleanly and are for developer tools</li>

				  <li>cherry-ignore: Add more 18.2 patches</li>

				  <li>cherry-ignore: add 18.2 patchs</li>

				  <li>cherry-ignore: add a patch that was reverted on master</li>

				  <li>cherry-ignore: one final update</li>

				  <li>Bump version to 18.1.9</li>

				</ul>

				<p>Erik Faye-Lund (2):</p>

				<ul>

				  <li>winsys/virgl: avoid unintended behavior</li>

				  <li>virgl: adjust strides when mapping temp-resources</li>

				</ul>

				<p>Gert Wollny (1):</p>

				<ul>

				  <li>winsys/virgl: correct resource and handle allocation (v2)</li>

				</ul>

				<p>Jason Ekstrand (6):</p>

				<ul>

				  <li>anv/pipeline: Only consider double elements which actually exist</li>

				  <li>i965: Workaround the gen9 hw astc5x5 sampler bug</li>

				  <li>anv: Re-emit vertex buffers when the pipeline changes</li>

				  <li>anv: Disable the vertex cache when tessellating on SKL GT4</li>

				  <li>anv: Clamp scissors to the framebuffer boundary</li>

				  <li>anv/query: Write both dwords in emit_zero_queries</li>

				</ul>

				<p>Josh Pieper (1):</p>

				<ul>

				  <li>st/mesa: Validate the result of pipe_transfer_map in make_texture (v2)</li>

				</ul>

				<p>Kenneth Feng (1):</p>

				<ul>

				  <li>amd: Add Picasso device id</li>

				</ul>

				<p>Marek Olšák (4):</p>

				<ul>

				  <li>st/mesa: help fix stencil border color for GL_DEPTH_STENCIL textures</li>

				  <li>radeonsi: fix HTILE for NPOT textures with mipmapping on SI/CI</li>

				  <li>r600: fix HTILE for NPOT textures with mipmapping</li>

				  <li>radeonsi: fix printing a BO list into ddebug reports</li>

				</ul>

				<p>Mathias Fröhlich (1):</p>

				<ul>

				  <li>tnl: Fix green gun regression in xonotic.</li>

				</ul>

				<p>Mauro Rossi (3):</p>

				<ul>

				  <li>android: broadcom/genxml: fix collision with intel/genxml header-gen macro</li>

				  <li>android: broadcom/cle: add gallium include path</li>

				  <li>android: broadcom/cle: export the broadcom top level path headers</li>

				</ul>

				<p>Michal Srb (1):</p>

				<ul>

				  <li>st/dri: don't set queryDmaBufFormats/queryDmaBufModifiers if the driver does not implement it</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>loader/dri3: Only wait for back buffer fences in dri3_get_buffer</li>

				</ul>

				<p>Pierre Moreau (1):</p>

				<ul>

				  <li>nvir: Always split 64-bit IMAD/IMUL operations</li>

				</ul>

				<p>Sergii Romantsov (1):</p>

				<ul>

				  <li>intel: compiler option msse2 and mstackrealign</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>glsl: fixer lexer for unreachable defines</li>

				</ul>

				</div>

				</body>

				</html>

									
										284

docs/relnotes/18.2.0.html
									
												
				@@ -0,0 +1,284 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.2.0 Release Notes / September 7, 2018</h1>

				<p>

				Mesa 18.2.0 is a new development release. People who are concerned

				with stability and reliability should stick with a previous release or

				wait for Mesa 18.2.1.

				</p>

				<p>

				Mesa 18.2.0 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<p>

				libwayland-egl is now distributed by Wayland (since 1.15,

				<a href="https://lists.freedesktop.org/archives/wayland-devel/2018-April/037767.html">see announcement</a>),

				and has been removed from Mesa in this release. Make sure you're using

				an up-to-date version of Wayland to keep the functionality.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				b9e6bb3eb7660b0726ba28405ffa0cb77de619e925b910b72f4d7a85c0098596  mesa-18.2.0.tar.gz

				22452bdffff8e11bf4284278155a9f77cb28d6d73a12c507f1490732d0d9ddce  mesa-18.2.0.tar.xz

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>OpenGL 4.3 on virgl</li>

				<li>OpenGL 4.4 Compatibility profile on radeonsi</li>

				<li>OpenGL ES 3.2 on radeonsi and virgl</li>

				<li>GL_ARB_ES3_2_compatibility on radeonsi</li>

				<li>GL_ARB_fragment_shader_interlock on i965</li>

				<li>GL_ARB_sample_locations and GL_NV_sample_locations on nvc0 (GM200+)</li>

				<li>GL_ANDROID_extension_pack_es31a on radeonsi.</li>

				<li>GL_KHR_texture_compression_astc_ldr on radeonsi</li>

				<li>GL_NV_conservative_raster and GL_NV_conservative_raster_dilate on nvc0 (GM200+)</li>

				<li>GL_NV_conservative_raster_pre_snap_triangles on nvc0 (GP102+)</li>

				<li>multisampled images on nvc0 (GM107+) (now supported on GF100+)</li>

				</ul>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=13728">Bug 13728</a> - [G965] Some objects in Neverwinter Nights Linux version not displayed correctly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61761">Bug 61761</a> - glPolygonOffsetEXT, OFFSET_BIAS incorrectly set to a huge number</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=65422">Bug 65422</a> - Rename api_validate.[ch] to draw_validate.[ch]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78097">Bug 78097</a> - glUniform1ui and friends not supported by display lists</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91808">Bug 91808</a> - trine1 misrender r600g</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93355">Bug 93355</a> - [BXT,SKLGT4e] intermittent ext_framebuffer_multisample.accuracy fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95009">Bug 95009</a> - [SNB] amd_shader_trinary_minmax.execution.built-in-functions.gs-mid3-ivec2-ivec2-ivec2 intermittent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95012">Bug 95012</a> - [SNB] glsl-1_50.execution.built-in-functions.gs-op tests intermittent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98699">Bug 98699</a> - &quot;float[a+++4 ? 1:1] f;&quot; crashes glsl_compiler</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99116">Bug 99116</a> - Wine DirectDraw programs showing only a blackscreen when using Mesa Gallium drivers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99730">Bug 99730</a> - Metro Redux game(s) needs override for midshader extension declaration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100177">Bug 100177</a> - [GM206] Misrendering in XCOM Ennemy Within</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100430">Bug 100430</a> - [radv] graphical glitches on dolphin emulator</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101247">Bug 101247</a> - Mesa fails to link GLSL programs with unused output blocks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102390">Bug 102390</a> - centroid interpolation causes broken attribute values</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102678">Bug 102678</a> - gl_BaseVertex should always be zero when the draw command has no &lt;basevertex&gt; parameter</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103274">Bug 103274</a> - BRW allocates too much heap memory</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104388">Bug 104388</a> - [snb] GPU HANG: ecode 6:0:0x85fffff8 in fgfs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104626">Bug 104626</a> - broadcom/vc5: double compare</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104809">Bug 104809</a> - anv: DOOM 2016 and Wolfenstein II:The New Colossus crash due to not having depthBoundsTest</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105351">Bug 105351</a> - [Gen6+] piglit's arb_shader_image_load_store-host-mem-barrier fails with a glGetTexSubImage fallback path</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105374">Bug 105374</a> - texture3d, a SaschaWillems demo, assert fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105396">Bug 105396</a> - tc compatible htile sets depth of htiles of discarded fragments to 1.0</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105399">Bug 105399</a> - [snb] GPU hang: after geometry shader emits no geometry, the program hangs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105497">Bug 105497</a> - shader-db crashes on 72 core system after ast_type_qualifier bitset change</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105613">Bug 105613</a> - Compute shader locks up within nested &quot;for&quot; loop</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105731">Bug 105731</a> - linker error &quot;fragment shader input ... has no matching output in the previous stage&quot; when previous stage's output declaration in a separate shader object</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105904">Bug 105904</a> - Needed to delete mesa shader cache after driver upgrade for 32 bit wine vulkan programs to work.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105975">Bug 105975</a> - i965 always reports 0 viewport subpixel bits</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106090">Bug 106090</a> - Compiling compute shader crashes RADV</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106133">Bug 106133</a> - make check &quot;OSError: [Errno 24] Too many open files&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106163">Bug 106163</a> - r600/sb: optimizer tries to schedule access to different array elements in one instruction group</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106174">Bug 106174</a> - vulkan dota2 broken (segfaulting), found bug commit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106180">Bug 106180</a> - [bisected] radv vulkan smoke test black screen (Add support for DRI3 v1.2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106232">Bug 106232</a> - LLVM unit tests have error in random number handling</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106243">Bug 106243</a> - [kbl] GPU HANG: 9:0:0x85dffffb, in Cinnamon</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106315">Bug 106315</a> - The witness + dxvk suffers flickering garbage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106331">Bug 106331</a> - radv doesnt support VK_FORMAT_R32G32B32_SFLOAT</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106382">Bug 106382</a> - Shader cache breaks INTEL_DEBUG=shader_time</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106393">Bug 106393</a> - glsl-fs-shader-stencil-export hangs forever</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106450">Bug 106450</a> - glGetIntegerv return wrong value in some cases</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106462">Bug 106462</a> - piglit.spec.arb_vertex_array_bgra.get regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106479">Bug 106479</a> - NDEBUG not defined for libamdgpu_addrlib</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106480">Bug 106480</a> - A2B10G10R10_SNORM vertex attribute doesn't work.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106499">Bug 106499</a> - [regression, bisected] Several games crash on start</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106504">Bug 106504</a> - vulkan SPIR-V parsing failed at ../src/compiler/spirv/vtn_cfg.c:381</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106511">Bug 106511</a> - radv: MSAA broken on SI (assertion failure in vkCreateImage)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106587">Bug 106587</a> - Dota2 is very dark when using vulkan render on a Intel &lt;&lt; AMD prime setup</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106594">Bug 106594</a> - [regression,apitrace,bisected] Prison Architect rendered unplayable by multicoloured flickering triangles and overlayed triangles when performing certain actions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106619">Bug 106619</a> - [OpenCL][llvm-svn]build failure  addPassesToEmitFile candidate expects 6 arguments, 3 provided</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106629">Bug 106629</a> - [SNB,IVB,HSW,BDW] dEQP-EGL.functional.image.create.gles2_cubemap_negative_z_rgb_read_pixels</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106642">Bug 106642</a> - X server crashes in i965 on desktop startup when DRI3 v1.2 / modifier support is enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106643">Bug 106643</a> - double free when exporting a temporarily imported semaphore</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106673">Bug 106673</a> - [bisected] Steam is unusable since commit 5c33e8c7</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106687">Bug 106687</a> - radv: Fast color clears use incorrect format</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106708">Bug 106708</a> - [SKL/KBL/GLK] 2-3% performance drop in SynMark DrvState and 5-9% drop on SynMark Multithread</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106748">Bug 106748</a> - st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY broke qemu -display sdl,gl=on</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106756">Bug 106756</a> - Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106774">Bug 106774</a> - GLSL IR copy propagates loads of SSBOs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106776">Bug 106776</a> - vma_random unrecognized command line option &quot;-std=c++11&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106778">Bug 106778</a> - Files missing from tarball - intel_sanitize_gpu.*</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106779">Bug 106779</a> - Files missing from tarball - u_debug_stack_android.cpp</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106784">Bug 106784</a> - 18.1.1 autotools build fail without mako</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106801">Bug 106801</a> - vma_random_test.cpp:239:18: error: non-constant-expression cannot be narrowed from type 'unsigned long' to 'uint_fast32_t' (aka 'unsigned int') in initializer list [-Wc++11-narrowing]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106810">Bug 106810</a> - ProgramBinary does not switch program correctly when using transform feedback</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106823">Bug 106823</a> - Failed to recongnize keyword of shader code</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106830">Bug 106830</a> - [bisected] 32 bit tests (deqp, piglit, glcts, vulkancts) crashing on all platforms</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106861">Bug 106861</a> - fatal error: wayland-egl-backend.h: No such file or directory compilation terminated.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106865">Bug 106865</a> - [GLK] piglit.spec.ext_framebuffer_multisample.accuracy stencil tests fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106903">Bug 106903</a> - radv: Fragment shader output goes to wrong attachments when render targets are sparse</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106906">Bug 106906</a> - Failed to recongnize keyword “sampler2DRect” and &quot;sampler2DRectShadow&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106907">Bug 106907</a> - Correct Transform Feedback Varyings information is expected after using ProgramBinary</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106912">Bug 106912</a> - radv: 16-bit depth buffer causes artifacts in Shadow Warrior 2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106928">Bug 106928</a> - When starting a match Rocket League crashes on &quot;Go&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106941">Bug 106941</a> - Intel ANV vulkan driver exposing version 1.1.0 which is incorrect</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106986">Bug 106986</a> - glGetQueryiv error when querying number of result bits for GL_ANY_SAMPLES_PASSED_CONSERVATIVE</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106997">Bug 106997</a> - [Regression]. Dying light game is crashing on latest mesa</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107098">Bug 107098</a> - Segfault after munmap(kms_sw_dt-&gt;ro_mapped)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107117">Bug 107117</a> - mesa-18.1: regression with TFP on intel with modesettings and glamor acceleration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107190">Bug 107190</a> - Got seg fault on snb when use INTEL_DEBUG=bat</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107193">Bug 107193</a> - piglit.spec.arb_compute_shader.linker.bug-93840 fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107212">Bug 107212</a> - Dual-Core CPU E5500 / G45: RetroArch with reicast core results in corrupted graphics</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107223">Bug 107223</a> - [GEN9+] 50% perf drop in SynMark Fill* tests (E2E RBC gets disabled?)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107248">Bug 107248</a> - [G45 ILK G965] Texture handling broken</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107275">Bug 107275</a> - NIR segfaults after spirv-opt</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107276">Bug 107276</a> - radv: OpBitfieldUExtract returns incorrect result when count is zero</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107295">Bug 107295</a> - Access violation on glDrawArrays with count &gt;= 2048</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107305">Bug 107305</a> - glsl/opt_copy_propagation_elements.cpp:72:9: error: delegating constructors are permitted only in C++11</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107312">Bug 107312</a> - Mesa-git RPM build fails after commit 8cacf38f527d42e41441ef8c25d95d4b2f4e8602</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107359">Bug 107359</a> - [Regression] [bisected] [OpenGL CTS] [SKL,BDW] KHR-GL46.texture_barrier*-texels, GTF-GL46.gtf21.GL2FixedTests.buffer_corners.buffer_corners, and GTF-GL46.gtf21.GL2FixedTests.stencil_plane_corners.stencil_plane_corners fail with some configuration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107366">Bug 107366</a> - NIR verification crashes on piglit tests</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107423">Bug 107423</a> - vc4 build failure: &quot;v3d_decoder.c:893: undefined reference to `clif_lookup_bo'&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107443">Bug 107443</a> - Build error on arm64: v3d_decoder.c:837:17: error: format not a string literal and no format arguments [-Werror=format-security]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107460">Bug 107460</a> - radv: OpControlBarrier does not always work correctly (bisected)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107477">Bug 107477</a> - [DXVK] Setting high shader quality in GTA V results in LLVM error</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107510">Bug 107510</a> - [GEN8+] up to 10% perf drop on several 3D benchmarks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107544">Bug 107544</a> - intel/decoder: out of bounds group_iter</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107550">Bug 107550</a> - &quot;0[2]&quot; as function parameter hits assert</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107579">Bug 107579</a> - [SNB] The graphic corruption when we reuse the GS compiled and used for TFB when statebuffer contain magic trash in the unused space</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107601">Bug 107601</a> - Rise of the Tomb Raider Segmentation Fault when the game starts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107610">Bug 107610</a> - Dolphin emulator mis-renders shadow overlay in Super Mario Sunshine</li>

				</ul>

				<h2>Changes</h2>

				<ul>

				<li>Removed GL_EXT_polygon_offset; applications should use glPolygonOffset instead.</li>

				<li>Removed libwayland-egl, now part of Wayland</li>

				</ul>

				</div>

				</body>

				</html>

									
										227

docs/relnotes/18.2.1.html
									
										Normal file
									
												
				@@ -0,0 +1,227 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.2.1 Release Notes / September 21, 2018</h1>

				<p>

				Mesa 18.2.1 is a bug fix release which fixes bugs found since the 18.2.0 release.

				</p>

				<p>

				Mesa 18.2.1 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>
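				<p>
				The paragraph above says the reported version depends on the driver. A minimal
				sketch of how an application might check this, assuming it already holds the
				string returned by glGetString(GL_VERSION) for a desktop GL context. The helper
				names and the sample strings are hypothetical and not part of Mesa.
				</p>

```python
import re


def parse_gl_version(version_string):
    """Extract (major, minor) from a desktop GL_VERSION string.

    Assumes the string starts with "<major>.<minor>", followed by optional
    vendor-specific text.
    """
    match = re.match(r"\s*(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized GL_VERSION string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def supports(version_string, required=(4, 5)):
    """Return True if the reported version is at least `required`."""
    return parse_gl_version(version_string) >= required


# Illustrative strings only; real output varies by driver and context type.
core_context = "4.5 (Core Profile) Mesa 18.2.1"
compat_context = "3.0 Mesa 18.2.1"
```

				<p>
				Comparing tuples keeps the check correct for versions such as 3.10 versus 3.2,
				which a plain string or float comparison would get wrong.
				</p>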

				<h2>SHA256 checksums</h2>

				<pre>

				SHA256: 45419ccbe1bf9a2e15ffe71ced34615002e1b42c24b917fbe2b2f58ab1970562  mesa-18.2.1.tar.gz

				SHA256: 9636dc6f3d188abdcca02da97cedd73640d9035224efd5db724187d062c81056  mesa-18.2.1.tar.xz

				</pre>
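				<p>
				A downloaded tarball can be checked against the checksums above. The following
				is a minimal sketch using Python's standard hashlib module; the function names
				are hypothetical, and it assumes the tarball keeps its published file name.
				</p>

```python
import hashlib
from pathlib import Path

# Digests copied from the SHA256 checksums list for Mesa 18.2.1.
EXPECTED = {
    "mesa-18.2.1.tar.gz": "45419ccbe1bf9a2e15ffe71ced34615002e1b42c24b917fbe2b2f58ab1970562",
    "mesa-18.2.1.tar.xz": "9636dc6f3d188abdcca02da97cedd73640d9035224efd5db724187d062c81056",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in chunks so large tarballs are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Return True if the file's digest matches the published one."""
    name = Path(path).name
    expected = EXPECTED.get(name)
    if expected is None:
        raise KeyError(f"no published checksum for {name}")
    return sha256_of(path) == expected
```

				<p>
				Calling verify("mesa-18.2.1.tar.xz") on a downloaded file returns False if the
				file is corrupted or has been altered.
				</p>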

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103241">Bug 103241</a> - Anv crashes when using 64-bit vertex inputs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107280">Bug 107280</a> - [DXVK] Batman: Arkham City with tessellation enabled hangs on SKL GT4</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107772">Bug 107772</a> - Mesa preprocessor matches if(def)s &amp; endifs incorrectly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107779">Bug 107779</a> - Access violation with some games</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107810">Bug 107810</a> - The 'va_end' call is missed after 'va_copy' in 'util_vsnprintf' function under windows</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107832">Bug 107832</a> - Gallium picking A16L16 formats when emulating INTENSITY16 conflicts with mesa</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107843">Bug 107843</a> - 32bit Mesa build fails with meson.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107879">Bug 107879</a> - crash happens when link program</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107891">Bug 107891</a> - [wine, regression, bisected] RAGE, Wolfenstein The New Order hangs in menu</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (3):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.2.0</li>

				  <li>Revert "Revert "glsl: skip stringification in preprocessor if in unreachable branch""</li>

				  <li>cherry-ignore: i965/tools: 32bit compilation with meson</li>

				</ul>

				<p>Andrii Simiklit (4):</p>

				<ul>

				  <li>apple/glx/log: added missing va_end() after va_copy()</li>

				  <li>mesa/util: don't use the same 'va_list' instance twice</li>

				  <li>mesa/util: don't ignore NULL returned from 'malloc'</li>

				  <li>mesa/util: add missing va_end() after va_copy()</li>

				</ul>

				<p>Bas Nieuwenhuizen (5):</p>

				<ul>

				  <li>radv: Support v3 of VK_EXT_vertex_attribute_divisor.</li>

				  <li>radv: Set the user SGPR MSB for Vega.</li>

				  <li>radv: Only allow 16 user SGPRs for compute on GFX9+.</li>

				  <li>radv: Use build ID if available for cache UUID.</li>

				  <li>radv: Fix driver UUID SHA1 init.</li>

				</ul>

				<p>Christopher Egert (1):</p>

				<ul>

				  <li>radeon: fix ColorMask</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>virgl: don't send a shader create with no data. (v2)</li>

				</ul>

				<p>Dylan Baker (1):</p>

				<ul>

				  <li>meson: Print a message about why a libdrm version was selected</li>

				</ul>

				<p>Eric Anholt (2):</p>

				<ul>

				  <li>v3d: Fix SRC_ALPHA_SATURATE blending for RTs without alpha.</li>

				  <li>v3d: Fix setup of the VCM cache size.</li>

				</ul>

				<p>Erik Faye-Lund (2):</p>

				<ul>

				  <li>winsys/virgl: avoid unintended behavior</li>

				  <li>virgl: adjust strides when mapping temp-resources</li>

				</ul>

				<p>Fritz Koenig (2):</p>

				<ul>

				  <li>mesa: Additional FlipY applications</li>

				  <li>mesa: FramebufferParameteri parameter checking</li>

				</ul>

				<p>Gert Wollny (2):</p>

				<ul>

				  <li>winsys/virgl: correct resource and handle allocation (v2)</li>

				  <li>mesa/texture: Also check for LA texture when querying intensity component size</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>i965/fs: Don't propagate conditional modifiers from integer compares to adds</li>

				</ul>

				<p>Jason Ekstrand (11):</p>

				<ul>

				  <li>anv/pipeline: Only consider double elements which actually exist</li>

				  <li>i965: Workaround the gen9 hw astc5x5 sampler bug</li>

				  <li>anv: Re-emit vertex buffers when the pipeline changes</li>

				  <li>anv: Disable the vertex cache when tessellating on SKL GT4</li>

				  <li>anv: Clamp scissors to the framebuffer boundary</li>

				  <li>vulkan: Update the XML and headers to 1.1.84</li>

				  <li>anv: Support v3 of VK_EXT_vertex_attribute_divisor</li>

				  <li>anv/query: Write both dwords in emit_zero_queries</li>

				  <li>nir: Add a small pass to rematerialize derefs per-block</li>

				  <li>nir/loop_unroll: Re-materialize derefs in use blocks before unrolling</li>

				  <li>nir/opt_if: Re-materialize derefs in use blocks before peeling loops</li>

				</ul>

				<p>Josh Pieper (1):</p>

				<ul>

				  <li>st/mesa: Validate the result of pipe_transfer_map in make_texture (v2)</li>

				</ul>

				<p>Juan A. Suarez Romero (2):</p>

				<ul>

				  <li>cherry-ignore: radv: fix descriptor pool allocation size</li>

				  <li>Update version to 18.2.1</li>

				</ul>

				<p>Kenneth Feng (1):</p>

				<ul>

				  <li>amd: Add Picasso device id</li>

				</ul>

				<p>Marek Olšák (5):</p>

				<ul>

				  <li>radeonsi: fix HTILE for NPOT textures with mipmapping on SI/CI</li>

				  <li>winsys/radeon: fix CMASK fast clear for NPOT textures with mipmapping on SI/CI</li>

				  <li>r600: fix HTILE for NPOT textures with mipmapping</li>

				  <li>radeonsi: fix printing a BO list into ddebug reports</li>

				  <li>ac: revert new LLVM 7.0 behavior for fdiv</li>

				</ul>

				<p>Mathias Fröhlich (1):</p>

				<ul>

				  <li>tnl: Fix green gun regression in xonotic.</li>

				</ul>

				<p>Mauro Rossi (3):</p>

				<ul>

				  <li>android: broadcom/genxml: fix collision with intel/genxml header-gen macro</li>

				  <li>android: broadcom/cle: add gallium include path</li>

				  <li>android: broadcom/cle: export the broadcom top level path headers</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>loader/dri3: Only wait for back buffer fences in dri3_get_buffer</li>

				</ul>

				<p>Pierre Moreau (1):</p>

				<ul>

				  <li>nvir: Always split 64-bit IMAD/IMUL operations</li>

				</ul>

				<p>Samuel Pitoiset (7):</p>

				<ul>

				  <li>radv: fix function names for VK_EXT_conditional_rendering</li>

				  <li>radv: fix VK_EXT_conditional_rendering visibility</li>

				  <li>radv: bump the maximum number of arguments to 64</li>

				  <li>radv: handle loc-&gt;indirect correctly for the first descriptor</li>

				  <li>radv: fix GPU hangs with 32-bit indirect descriptors</li>

				  <li>radv: fix flushing indirect descriptors</li>

				  <li>radv: fix setting global locations for indirect descriptors</li>

				</ul>

				<p>Sergii Romantsov (3):</p>

				<ul>

				  <li>intel: compiler option msse2 and mstackrealign</li>

				  <li>i965/tools: 32bit compilation with meson</li>

				  <li>mesa/meson: 32bit xmlconfig linkage</li>

				</ul>

				<p>Timothy Arceri (2):</p>

				<ul>

				  <li>glsl: fixer lexer for unreachable defines</li>

				  <li>Revert "radeonsi: avoid syncing the driver thread in si_fence_finish"</li>

				</ul>

				</div>

				</body>

				</html>

									
										155

docs/relnotes/18.2.2.html
									
										Normal file
									
												
				@@ -0,0 +1,155 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.2.2 Release Notes / October 5, 2018</h1>

				<p>

				Mesa 18.2.2 is a bug fix release which fixes bugs found since the 18.2.1 release.

				</p>

				<p>

				Mesa 18.2.2 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				SHA256: c51711168971957037cc7e3e19e8abe1ec6eeab9cf236d419a1e7728a41cac8a  mesa-18.2.2.tar.gz

				SHA256: c3ba82b12a89d3d9fed2bdd96b4702dbb7ab675034650a8b1b718320daf073c4  mesa-18.2.2.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104602">Bug 104602</a> - [apitrace] Graphical artifacts in Civilization VI on RX Vega</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104926">Bug 104926</a> - swrast: Mesa 17.3.3 produces:  HW cursor for format 875713089 not supported</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107276">Bug 107276</a> - radv: OpBitfieldUExtract returns incorrect result when count is zero</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107786">Bug 107786</a> - [DXVK] MSAA reflections are broken in GTA V</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108024">Bug 108024</a> - [Debian Stretch]Fail to build because &quot;xcb_randr_lease_t&quot;</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Deucher (1):</p>

				<ul>

				  <li>pci_ids: add new polaris pci id</li>

				</ul>

				<p>Andres Rodriguez (1):</p>

				<ul>

				  <li>radv: only emit ZPASS_DONE for timestamp queries on gfx queues</li>

				</ul>

				<p>Axel Davy (3):</p>

				<ul>

				  <li>st/nine: Clamp RCP when 0*inf!=0</li>

				  <li>st/nine: Avoid redundant SetCursorPos calls</li>

				  <li>st/nine: Increase maximum number of temp registers</li>

				</ul>

				<p>Dylan Baker (1):</p>

				<ul>

				  <li>meson: Don't compile pipe loader with dri support when not using dri</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>vc4: Fix sin(0.0) and cos(0.0) accuracy to fix SDL rendering rotation.</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>vulkan/wsi/display: check if wsi_swapchain_init() succeeded</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>anv,radv: Implement vkAcquireNextImage2</li>

				</ul>

				<p>Juan A. Suarez Romero (2):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.2.1</li>

				  <li>Update version to 18.2.2</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>radeon/uvd: use bitstream coded number for symbols of Huffman tables</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>glsl_to_tgsi: invert gl_SamplePosition.y for the default framebuffer</li>

				  <li>radeonsi: NaN should pass kill_if</li>

				</ul>

				<p>Maxime (1):</p>

				<ul>

				  <li>vulkan: Disable randr lease for libxcb &lt; 1.13</li>

				</ul>

				<p>Michal Srb (1):</p>

				<ul>

				  <li>st/dri: don't set queryDmaBufFormats/queryDmaBufModifiers if the driver does not implement it</li>

				</ul>

				<p>Rhys Perry (2):</p>

				<ul>

				  <li>nvc0: Update counter reading shaders to new NVC0_CB_AUX_MP_INFO</li>

				  <li>nvc0: fix bindless multisampled images on Maxwell+</li>

				</ul>

				<p>Samuel Iglesias Gonsálvez (1):</p>

				<ul>

				  <li>anv: Add support for protected memory properties on anv_GetPhysicalDeviceProperties2()</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: use the resolve compute path if dest uses multiple layers</li>

				</ul>

				<p>Stuart Young (1):</p>

				<ul>

				  <li>docs: Update FAQ with respect to s3tc support</li>

				</ul>

				<p>Timothy Arceri (1):</p>

				<ul>

				  <li>radeonsi: add a workaround for bitfield_extract when count is 0</li>

				</ul>

				</div>

				</body>

				</html>

									
										167

docs/relnotes/18.2.3.html
									
										Normal file
									
												
				@@ -0,0 +1,167 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.2.3 Release Notes / October 19, 2018</h1>

				<p>

				Mesa 18.2.3 is a bug fix release which fixes bugs found since the 18.2.2 release.

				</p>

				<p>

				Mesa 18.2.3 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				0e13e2342eae74d8848df23595c4bb4b2f8874c9e1213b8466b1fbfa7ef99375  mesa-18.2.3.tar.gz

				e2bf83c17e1abdecb1ee81af22652e27e9aa38f963e95e60f34275cc0376304f  mesa-18.2.3.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99507">Bug 99507</a> - Corrupted frame contents with Vulkan version of DOTA2, Talos Principle and Sascha Willems' demos when they're run Vsynched in fullscreen</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107857">Bug 107857</a> - GPU hang - GS_EMIT without shader outputs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107926">Bug 107926</a> - [anv] Rise of the Tomb Raider always misrendering, segfault and gpu hang.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108012">Bug 108012</a> - Compiler crashes on access of non-existent member incremental operations</li>

				</ul>

				<h2>Changes</h2>

				<p>Boyuan Zhang (1):</p>

				<ul>

				  <li>st/va: use provided sizes and coords for vlVaGetImage</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>anv: add missing unlock in error path.</li>

				</ul>

				<p>Dylan Baker (1):</p>

				<ul>

				  <li>meson: Don't allow building EGL on Windows or MacOS</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>st/nine: do not double-close the fd on teardown</li>

				  <li>egl: make eglSwapInterval a no-op for !window surfaces</li>

				  <li>egl: make eglSwapBuffers* a no-op for !window surfaces</li>

				  <li>vl/dri3: do full teardown on screen_destroy</li>

				  <li>Revert "mesa: remove unnecessary 'sort by year' for the GL extensions"</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>radv: add missing meson c++ visibility arguments</li>

				</ul>

				<p>Fritz Koenig (1):</p>

				<ul>

				  <li>i965: Replace checks for rb-&gt;Name with FlipY (v2)</li>

				</ul>

				<p>Gert Wollny (1):</p>

				<ul>

				  <li>virgl, vtest: Correct the transfer size calculation</li>

				</ul>

				<p>Ilia Mirkin (4):</p>

				<ul>

				  <li>glsl: fix array assignments of a swizzled vector</li>

				  <li>nv50,nvc0: mark RGBX_UINT formats as renderable</li>

				  <li>nv50,nvc0: guard against zero-size blits</li>

				  <li>nvc0: fix blitting red to srgb8_alpha</li>

				</ul>

				<p>Jason Ekstrand (7):</p>

				<ul>

				  <li>nir/cf: Remove phi sources if needed in nir_handle_add_jump</li>

				  <li>anv: Use separate MOCS settings for external BOs</li>

				  <li>intel/fs: Fix a typo in need_matching_subreg_offset</li>

				  <li>nir/from_ssa: Don't rewrite derefs destinations to registers</li>

				  <li>anv/batch_chain: Don't start a new BO just for BATCH_BUFFER_START</li>

				  <li>nir/alu_to_scalar: Use ssa_for_alu_src in hand-rolled expansions</li>

				  <li>intel: Don't propagate conditional modifiers if a UD source is negated</li>

				</ul>

				<p>Juan A. Suarez Romero (2):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.2.2</li>

				  <li>Update version to 18.2.3</li>

				</ul>

				<p>Józef Kucia (1):</p>

				<ul>

				  <li>radeonsi: avoid sending GS_EMIT in shaders without outputs</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>drirc: add a workaround for ARMA 3</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: add a workaround for a VGT hang with prim restart and strips</li>

				</ul>

				<p>Tapani Pälli (1):</p>

				<ul>

				  <li>glsl: do not attempt assignment if operand type not parsed correctly</li>

				</ul>

				<p>Timothy Arceri (11):</p>

				<ul>

				  <li>glsl: ignore trailing whitespace when define redefined</li>

				  <li>util: disable cache if we have no build-id and timestamp is zero</li>

				  <li>util: rename timestamp param in disk_cache_create()</li>

				  <li>util: add disk_cache_get_function_identifier()</li>

				  <li>radeonsi: use build-id when available for disk cache</li>

				  <li>nouveau: use build-id when available for disk cache</li>

				  <li>r600: use build-id when available for disk cache</li>

				  <li>mesa/st: add force_compat_profile option to driconfig</li>

				  <li>util: use force_compat_profile for Wolfenstein The Old Blood</li>

				  <li>util: better handle program names from wine</li>

				  <li>util: add drirc workarounds for RAGE</li>

				</ul>

				<p>Vinson Lee (1):</p>

				<ul>

				  <li>r600/sb: Fix constant-logical-operand warning.</li>

				</ul>

				</div>

				</body>

				</html>

									
										154

docs/relnotes/18.2.4.html
									
										Normal file
									
												
				@@ -0,0 +1,154 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.2.4 Release Notes / October 31, 2018</h1>

				<p>

				Mesa 18.2.4 is a bug fix release which fixes bugs found since the 18.2.3 release.

				</p>

				<p>

				Mesa 18.2.4 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				968bfe78605e9397ddf244933b1fa62edb8429fc55aaec2ae7e20bb1c82abdea  mesa-18.2.4.tar.gz

				621d1aebb57876d5b6a5d2dcf4eb7e0620e650c6fe5cf3655c65e243adc9cb4e  mesa-18.2.4.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107865">Bug 107865</a> - swr fail to build with llvm-libs 6.0.1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108272">Bug 108272</a> - [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108524">Bug 108524</a> - [RADV]  GPU lockup on event synchronization</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Smith (2):</p>

				<ul>

				  <li>ac/nir: Use context-specific LLVM types</li>

				  <li>anv: Fix sanitization of stencil state when the depth test is disabled</li>

				</ul>

				<p>Alok Hota (2):</p>

				<ul>

				  <li>swr/rast: ignore CreateElementUnorderedAtomicMemCpy</li>

				  <li>swr/rast: fix intrinsic/function for LLVM 7 compatibility</li>

				</ul>

				<p>Andres Rodriguez (1):</p>

				<ul>

				  <li>radv: fix check for perftest options size</li>

				</ul>

				<p>Bas Nieuwenhuizen (1):</p>

				<ul>

				  <li>radv: Emit enqueued pipeline barriers on event write.</li>

				</ul>

				<p>Connor Abbott (2):</p>

				<ul>

				  <li>ac: Introduce ac_build_expand()</li>

				  <li>ac: Fix loading a dvec3 from an SSBO</li>

				</ul>

				<p>David McFarland (1):</p>

				<ul>

				  <li>util: Change remaining uint32 cache ids to sha1</li>

				</ul>

				<p>Dylan Baker (1):</p>

				<ul>

				  <li>meson: don't require libelf for r600 without LLVM</li>

				</ul>

				<p>Elie Tournier (1):</p>

				<ul>

				  <li>gallium: Correctly handle no config context creation</li>

				</ul>

				<p>Eric Engestrom (1):</p>

				<ul>

				  <li>radv: s/abs/fabsf/ for floats</li>

				</ul>

				<p>Jan Vesely (1):</p>

				<ul>

				  <li>radeonsi: Bump number of allowed global buffers to 32</li>

				</ul>

				<p>Jason Ekstrand (3):</p>

				<ul>

				  <li>spirv: Use the right bit-size for spec constant ops</li>

				  <li>blorp: Emit a dummy 3DSTATE_WM prior to 3DSTATE_WM_HZ_OP</li>

				  <li>anv: Flag semaphore BOs as external</li>

				</ul>

				<p>Juan A. Suarez Romero (3):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.2.3</li>

				  <li>cherry-ignore: Revert "anv/skylake: disable ForceThreadDispatchEnable"</li>

				  <li>Update version to 18.2.4</li>

				</ul>

				<p>Liviu Prodea (1):</p>

				<ul>

				  <li>scons: Put to rest zombie texture_float build option.</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>radeonsi: fix a VGT hang with primitive restart on Polaris10 and later</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>loader/dri3: Also wait for front buffer fence if we triggered it</li>

				</ul>

				<p>Nanley Chery (1):</p>

				<ul>

				  <li>intel/blorp: Define the clear value bounds for HiZ clears</li>

				</ul>

				<p>Rob Clark (2):</p>

				<ul>

				  <li>freedreno: fix inorder rendering case</li>

				  <li>freedreno: don't flush when new and old pfb is identical</li>

				</ul>

				</div>

				</body>

				</html>

									
										172

docs/relnotes/18.2.5.html
									
										Normal file
									
												
				@@ -0,0 +1,172 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.2.5 Release Notes / November 15, 2018</h1>

				<p>

				Mesa 18.2.5 is a bug fix release which fixes bugs found since the 18.2.4 release.

				</p>

				<p>

				Mesa 18.2.5 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				dddc28928b6f4083a0d5120b58c1c8e2dc189ab5c14299c08a386607fdbbdce7  mesa-18.2.5.tar.gz

				b12c32872832e5353155e1e8026e1f1ab75bba9dc5b178d712045684d26c2b73  mesa-18.2.5.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105731">Bug 105731</a> - linker error &quot;fragment shader input ... has no matching output in the previous stage&quot; when previous stage's output declaration in a separate shader object</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107511">Bug 107511</a> - KHR/khrplatform.h not always installed when needed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107626">Bug 107626</a> - [SNB] The graphical corruption and GPU hang occur sometimes on the piglit test &quot;arb_texture_multisample-large-float-texture&quot; with parameter --fp16</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108082">Bug 108082</a> - warning: unknown warning option '-Wno-format-truncation' [-Wunknown-warning-option]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108560">Bug 108560</a> - Mesa 32 is built without sse</li>

				</ul>

				<h2>Changes</h2>

				<p>Andre Heider (1):</p>

				<ul>

				  <li>st/nine: fix stack corruption due to ABI mismatch</li>

				</ul>

				<p>Andrii Simiklit (1):</p>

				<ul>

				  <li>i965/batch: don't ignore the 'brw_new_batch' call for a 'new batch'</li>

				</ul>

				<p>Dylan Baker (2):</p>

				<ul>

				  <li>meson: link gallium nine with pthreads</li>

				  <li>meson: fix libatomic tests</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>egl/glvnd: correctly report errors when vendor cannot be found</li>

				  <li>m4: add Werror when checking for compiler flags</li>

				</ul>

				<p>Eric Engestrom (6):</p>

				<ul>

				  <li>svga: add missing meson build dependency</li>

				  <li>clover: add missing meson build dependency</li>

				  <li>wsi/wayland: use proper VkResult type</li>

				  <li>wsi/wayland: only finish() a successfully init()ed display</li>

				  <li>configure: install KHR/khrplatform.h when needed</li>

				  <li>meson: install KHR/khrplatform.h when needed</li>

				</ul>

				<p>Gert Wollny (1):</p>

				<ul>

				  <li>virgl/vtest-winsys: Use virgl version of bind flags</li>

				</ul>

				<p>Jonathan Gray (1):</p>

				<ul>

				  <li>intel/tools: include stdarg.h in error2aub</li>

				</ul>

				<p>Juan A. Suarez Romero (4):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.2.4</li>

				  <li>cherry-ignore: add explicit 18.3 only nominations</li>

				  <li>cherry-ignore: i965/batch: avoid reverting batch buffer if saved state is an empty</li>

				  <li>Update version to 18.2.5</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>anv/android: mark gralloc allocated BOs as external</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>ac: fix ac_build_fdiv for f64</li>

				  <li>st/va: fix incorrect use of resource_destroy</li>

				  <li>include: update GL &amp; GLES headers (v2)</li>

				</ul>

				<p>Matt Turner (2):</p>

				<ul>

				  <li>util/ralloc: Switch from DEBUG to NDEBUG</li>

				  <li>util/ralloc: Make sizeof(linear_header) a multiple of 8</li>

				</ul>

				<p>Olivier Fourdan (1):</p>

				<ul>

				  <li>wayland/egl: Resize EGL surface on update buffer for swrast</li>

				</ul>

				<p>Rhys Perry (1):</p>

				<ul>

				  <li>glsl_to_tgsi: don't create 64-bit integer MAD/FMA</li>

				</ul>

				<p>Samuel Pitoiset (2):</p>

				<ul>

				  <li>radv: disable conditional rendering for vkCmdCopyQueryPoolResults()</li>

				  <li>radv: only expose VK_SUBGROUP_FEATURE_ARITHMETIC_BIT for VI+</li>

				</ul>

				<p>Sergii Romantsov (1):</p>

				<ul>

				  <li>autotools: library-dependency when no sse and 32-bit</li>

				</ul>

				<p>Timothy Arceri (4):</p>

				<ul>

				  <li>st/mesa: calculate buffer size correctly for packed uniforms</li>

				  <li>st/glsl_to_nir: fix next_stage gathering</li>

				  <li>nir: add glsl_type_is_integer() helper</li>

				  <li>nir: don't pack varyings ints with floats unless flat</li>

				</ul>

				<p>Vadym Shovkoplias (1):</p>

				<ul>

				  <li>glsl/linker: Fix out variables linking during single stage</li>

				</ul>

				<p>Vinson Lee (1):</p>

				<ul>

				  <li>r600/sb: Fix constant logical operand in assert.</li>

				</ul>

				</div>

				</body>

				</html>

									
										179

docs/relnotes/18.2.6.html
									
										Normal file
									
												
				@@ -0,0 +1,179 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.2.6 Release Notes / November 28, 2018</h1>

				<p>

				Mesa 18.2.6 is a bug fix release which fixes bugs found since the 18.2.5 release.

				</p>

				<p>

				Mesa 18.2.6 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				e0ea1236dbc6c412b02e1b5d7f838072525971a6630246fa82ae4466a6d8a587  mesa-18.2.6.tar.gz

				9ebafa4f8249df0c718e93b9ca155e3593a1239af303aa2a8b0f2056a7efdc12  mesa-18.2.6.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107626">Bug 107626</a> - [SNB] The graphical corruption and GPU hang occur sometimes on the piglit test &quot;arb_texture_multisample-large-float-texture&quot; with parameter --fp16</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107856">Bug 107856</a> - i965 incorrectly calculates the number of layers for texture views (assert)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108630">Bug 108630</a> - [G965] piglit.spec.!opengl 1_2.tex3d-maxsize spins forever</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108713">Bug 108713</a> - Gallium: use after free with transform feedback</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108829">Bug 108829</a> - [meson] libglapi exports internal API</li>

				</ul>

				<h2>Changes</h2>

				<p>Andrii Simiklit (1):</p>

				<ul>

				  <li>i965/batch: avoid reverting batch buffer if saved state is an empty</li>

				</ul>

				<p>Bas Nieuwenhuizen (1):</p>

				<ul>

				  <li>radv: Fix opaque metadata descriptor last layer.</li>

				</ul>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>scons/svga: remove opt from the list of valid build types</li>

				</ul>

				<p>Danylo Piliaiev (1):</p>

				<ul>

				  <li>i965: Fix calculation of layers array length for isl_view</li>

				</ul>

				<p>Dylan Baker (2):</p>

				<ul>

				  <li>meson: Don't set -Wall</li>

				  <li>meson: Don't force libva to required from auto</li>

				</ul>

				<p>Emil Velikov (13):</p>

				<ul>

				  <li>bin/get-pick-list.sh: simplify git oneline printing</li>

				  <li>bin/get-pick-list.sh: prefix output with "[stable] "</li>

				  <li>bin/get-pick-list.sh: handle "typod" usecase.</li>

				  <li>bin/get-pick-list.sh: handle the fixes tag</li>

				  <li>bin/get-pick-list.sh: tweak the commit sha matching pattern</li>

				  <li>bin/get-pick-list.sh: flesh out is_sha_nomination</li>

				  <li>bin/get-pick-list.sh: handle fixes tag with missing colon</li>

				  <li>bin/get-pick-list.sh: handle unofficial "broken by" tag</li>

				  <li>bin/get-pick-list.sh: use test instead of [ ]</li>

				  <li>bin/get-pick-list.sh: handle reverts prior to the branchpoint</li>

				  <li>travis: drop unneeded x11proto-xf86vidmode-dev</li>

				  <li>glx: make xf86vidmode mandatory for direct rendering</li>

				  <li>travis: adding missing x11-xcb for meson+vulkan</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>vc4: Make sure we make ro scanout resources for create_with_modifiers.</li>

				</ul>

				<p>Eric Engestrom (5):</p>

				<ul>

				  <li>meson: only run vulkan's meson.build when building vulkan</li>

				  <li>gbm: remove unnecessary meson include</li>

				  <li>meson: fix wayland-less builds</li>

				  <li>egl: add missing glvnd entrypoint for EGL_ANDROID_blob_cache</li>

				  <li>glapi: add missing visibility args</li>

				</ul>

				<p>Erik Faye-Lund (1):</p>

				<ul>

				  <li>mesa/main: remove bogus error for zero-sized images</li>

				</ul>

				<p>Gert Wollny (3):</p>

				<ul>

				  <li>mesa: Reference count shaders that are used by transform feedback objects</li>

				  <li>r600: clean up the GS ring buffers when the context is destroyed</li>

				  <li>glsl: free or reuse memory allocated for TF varying</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>nir/lower_alu_to_scalar: Don't try to lower unpack_32_2x16</li>

				  <li>anv: Put robust buffer access in the pipeline hash</li>

				</ul>

				<p>Juan A. Suarez Romero (6):</p>

				<ul>

				  <li>cherry-ignore: add explicit 18.3 only nominations</li>

				  <li>cherry-ignore: intel/aub_viewer: fix dynamic state printing</li>

				  <li>cherry-ignore: intel/aub_viewer: Print blend states properly</li>

				  <li>cherry-ignore: mesa/main: fix incorrect depth-error</li>

				  <li>docs: add sha256 checksums for 18.2.5</li>

				  <li>Update version to 18.2.6</li>

				</ul>

				<p>Karol Herbst (1):</p>

				<ul>

				  <li>nir/spirv: cast shift operand to u32</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i965: Add PCI IDs for new Amberlake parts that are Coffeelake based</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>egl/dri: fix error value with unknown drm format</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>winsys/amdgpu: fix a buffer leak in amdgpu_bo_from_handle</li>

				  <li>winsys/amdgpu: fix a device handle leak in amdgpu_winsys_create</li>

				</ul>

				<p>Rodrigo Vivi (4):</p>

				<ul>

				  <li>i965: Add a new CFL PCI ID.</li>

				  <li>intel: aubinator: Adding missed platforms to the error message.</li>

				  <li>intel: Introducing Amber Lake platform</li>

				  <li>intel: Introducing Whiskey Lake platform</li>

				</ul>

				</div>

				</body>

				</html>

									
										167

docs/relnotes/18.2.7.html
									
										Normal file
									
												
				@@ -0,0 +1,167 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.2.7 Release Notes / December 13, 2018</h1>

				<p>

				Mesa 18.2.7 is a bug fix release which fixes bugs found since the 18.2.6 release.

				</p>

				<p>

				Mesa 18.2.7 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				092351cfbcd430ec595fbd3a3d8d253fd62c29074e1740d7198b00289ab400f8  mesa-18.2.7.tar.gz

				9c7b02560d89d77ca279cd21f36ea9a49e9ffc5611f6fe35099357d744d07ae6  mesa-18.2.7.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106577">Bug 106577</a> - broken rendering with nine and nouveau (GM107)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108245">Bug 108245</a> - RADV/Vega: Low mip levels of large BCn textures get corrupted by vkCmdCopyBufferToImage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108311">Bug 108311</a> - Query buffer object support is broken on r600.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108894">Bug 108894</a> - [anv] vkCmdCopyBuffer() and vkCmdCopyQueryPoolResults() write-after-write hazard</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108909">Bug 108909</a> - Vkd3d test failure test_resolve_non_issued_query_data()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108914">Bug 108914</a> - blocky shadow artifacts in The Forest with DXVK, RADV_DEBUG=nohiz fixes this</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108925">Bug 108925</a> - vkCmdCopyQueryPoolResults(VK_QUERY_RESULT_WAIT_BIT) for timestamps with large query count hangs</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Smith (1):</p>

				<ul>

				  <li>radv: Flush before vkCmdWriteTimestamp() if needed</li>

				</ul>

				<p>Bas Nieuwenhuizen (4):</p>

				<ul>

				  <li>radv: Align large buffers to the fragment size.</li>

				  <li>radv: Clamp gfx9 image view extents to the allocated image extents.</li>

				  <li>radv/android: Mark android WSI image as shareable.</li>

				  <li>radv/android: Use buffer metadata to determine scanout compat.</li>

				</ul>

				<p>Dave Airlie (2):</p>

				<ul>

				  <li>r600: make suballocator 256-bytes align</li>

				  <li>radv: use 3d shader for gfx9 copies if dst is 3d</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>egl/wayland: bail out when drmGetMagic fails</li>

				  <li>egl/wayland: plug memory leak in drm_handle_device()</li>

				</ul>

				<p>Eric Anholt (3):</p>

				<ul>

				  <li>v3d: Fix a leak of the transfer helper on screen destroy.</li>

				  <li>vc4: Fix a leak of the transfer helper on screen destroy.</li>

				  <li>v3d: Fix a leak of the disassembled instruction string during debug dumps.</li>

				</ul>

				<p>Eric Engestrom (3):</p>

				<ul>

				  <li>anv: correctly use vulkan 1.0 by default</li>

				  <li>wsi/display: fix mem leak when freeing swapchains</li>

				  <li>vulkan/wsi: fix s/,/;/ typo</li>

				</ul>

				<p>Gurchetan Singh (3):</p>

				<ul>

				  <li>virgl: quadruple command buffer size</li>

				  <li>virgl: avoid large inline transfers</li>

				  <li>virgl: don't mark buffers as unclean after a write</li>

				</ul>

				<p>Juan A. Suarez Romero (4):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.2.6</li>

				  <li>cherry-ignore: freedreno: Fix autotools build.</li>

				  <li>cherry-ignore: mesa: Revert INTEL_fragment_shader_ordering support</li>

				  <li>Update version to 18.2.7</li>

				</ul>

				<p>Karol Herbst (1):</p>

				<ul>

				  <li>nv50,nvc0: Fix gallium nine regression regarding sampler bindings</li>

				</ul>

				<p>Lionel Landwerlin (2):</p>

				<ul>

				  <li>anv: flush pipeline before query result copies</li>

				  <li>anv/query: flush render target before copying results</li>

				</ul>

				<p>Michal Srb (2):</p>

				<ul>

				  <li>gallium: Constify drisw_loader_funcs struct</li>

				  <li>drisw: Use separate drisw_loader_funcs for shm</li>

				</ul>

				<p>Nicolai Hähnle (2):</p>

				<ul>

				  <li>egl/wayland: rather obvious build fix</li>

				  <li>meson: link LLVM 'native' component when LLVM is available</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: rework the TC-compat HTILE hardware bug with COND_EXEC</li>

				</ul>

				<p>Thomas Hellstrom (2):</p>

				<ul>

				  <li>st/xa: Fix a memory leak</li>

				  <li>winsys/svga: Fix a memory leak</li>

				</ul>

				<p>Tobias Klausmann (1):</p>

				<ul>

				  <li>amd/vulkan: meson build - use radv_deps for libvulkan_radeon</li>

				</ul>

				<p>Vinson Lee (1):</p>

				<ul>

				  <li>st/xvmc: Add X11 include path.</li>

				</ul>

				</div>

				</body>

				</html>

									
										183

docs/relnotes/18.2.8.html
									
										Normal file
									
												
				@@ -0,0 +1,183 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.2.8 Release Notes / December 27, 2018</h1>

				<p>

				Mesa 18.2.8 is a bug fix release which fixes bugs found since the 18.2.7 release.

				</p>

				<p>

				Mesa 18.2.8 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				77512edc0a84e19c7131a0e2e5ebf1beaf1494dc4b71508fcc92d06d65f9f4f5  mesa-18.2.8.tar.gz

				1d2ed9fd435d86d95b7215b287258d3e6b1180293a36f688e5a2efc18298d863  mesa-18.2.8.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108114">Bug 108114</a> - [vulkancts] new VK_KHR_16bit_storage tests fail.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108116">Bug 108116</a> - [vulkancts] stencil partial clear tests fail.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108910">Bug 108910</a> - Vkd3d test failure test_multisample_array_texture()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108911">Bug 108911</a> - Vkd3d test failure test_clear_render_target_view()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109081">Bug 109081</a> - [bisected] [HSW] Regression in clipping.user_defined.clip_* vulkancts tests</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Deucher (3):</p>

				<ul>

				  <li>pci_ids: add new vega10 pci ids</li>

				  <li>pci_ids: add new vega20 pci id</li>

				  <li>pci_ids: add new VegaM pci id</li>

				</ul>

				<p>Axel Davy (3):</p>

				<ul>

				  <li>st/nine: Fix volumetexture dtor on ctor failure</li>

				  <li>st/nine: Bind src not dst in nine_context_box_upload</li>

				  <li>st/nine: Add src reference to nine_context_range_upload</li>

				</ul>

				<p>Caio Marcelo de Oliveira Filho (1):</p>

				<ul>

				  <li>nir: properly clear the entry sources in copy_prop_vars</li>

				</ul>

				<p>Dylan Baker (1):</p>

				<ul>

				  <li>meson: Fix ppc64 little endian detection</li>

				</ul>

				<p>Emil Velikov (9):</p>

				<ul>

				  <li>glx: mandate xf86vidmode only for "drm" dri platforms</li>

				  <li>bin/get-pick-list.sh: rework handing of sha nominations</li>

				  <li>bin/get-pick-list.sh: warn when commit lists invalid sha</li>

				  <li>meson: don't require glx/egl/gbm with gallium drivers</li>

				  <li>pipe-loader: meson: reference correct library</li>

				  <li>TODO: glx: meson: build dri based glx tests, only with -Dglx=dri</li>

				  <li>glx: meson: drop includes from a link-only library</li>

				  <li>glx: meson: wire up the dispatch-index-check test</li>

				  <li>glx/test: meson: assorted include fixes</li>

				</ul>

				<p>Eric Anholt (2):</p>

				<ul>

				  <li>v3d: Make sure that a thrsw doesn't split a multop from its umul24.</li>

				  <li>v3d: Add missing flagging of SYNCB as a TSY op.</li>

				</ul>

				<p>Erik Faye-Lund (2):</p>

				<ul>

				  <li>virgl: wrap vertex element state in a struct</li>

				  <li>virgl: work around bad assumptions in virglrenderer</li>

				</ul>

				<p>Iago Toral Quiroga (1):</p>

				<ul>

				  <li>intel/compiler: do not copy-propagate strided regions to ddx/ddy arguments</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>i965/vec4/dce: Don't narrow the write mask if the flags are used</li>

				  <li>Revert "nir/lower_indirect: Bail early if modes == 0"</li>

				</ul>

				<p>Jan Vesely (1):</p>

				<ul>

				  <li>clover: Fix build after clang r348827</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>nir/constant_folding: Fix source bit size logic</li>

				</ul>

				<p>Jon Turney (1):</p>

				<ul>

				  <li>glx: Fix compilation with GLX_USE_WINDOWSGL</li>

				</ul>

				<p>Juan A. Suarez Romero (7):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.2.7</li>

				  <li>cherry-ignore: add explicit 18.3 only nominations</li>

				  <li>cherry-ignore: meson: libfreedreno depends upon libdrm (for fence support)</li>

				  <li>cherry-ignore: radv: Fix multiview depth clears</li>

				  <li>cherry-ignore: nir: properly find the entry to keep in copy_prop_vars</li>

				  <li>cherry-ignore: intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs</li>

				  <li>Update version to 18.2.8</li>

				</ul>

				<p>Kirill Burtsev (1):</p>

				<ul>

				  <li>loader: free error state, when checking the drawable type</li>

				</ul>

				<p>Lionel Landwerlin (1):</p>

				<ul>

				  <li>anv: don't do partial resolve on layer &gt; 0</li>

				</ul>

				<p>Rhys Perry (2):</p>

				<ul>

				  <li>radv: don't set surf_index for stencil-only images</li>

				  <li>ac: split 16-bit ssbo loads that may not be dword aligned</li>

				</ul>

				<p>Rob Clark (1):</p>

				<ul>

				  <li>mesa/st/nir: fix missing nir_compact_varyings</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: switch on EOP when primitive restart is enabled with triangle strips</li>

				</ul>

				<p>Vinson Lee (2):</p>

				<ul>

				  <li>meson: Fix typo.</li>

				  <li>meson: Fix libsensors detection.</li>

				</ul>

				</div>

				</body>

				</html>

									
										283

docs/relnotes/18.3.0.html
									
										Normal file
									
												
				@@ -0,0 +1,283 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.3.0 Release Notes / December 7, 2018</h1>

				<p>

				Mesa 18.3.0 is a new development release. People who are concerned

				with stability and reliability should stick with a previous release or

				wait for Mesa 18.3.1.

				</p>

				<p>

				Mesa 18.3.0 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<p>

				libwayland-egl is now distributed by Wayland (since 1.15,

				<a href="https://lists.freedesktop.org/archives/wayland-devel/2018-April/037767.html">see announcement</a>),

				and has been removed from Mesa in this release. Make sure you're using

				an up-to-date version of Wayland to keep the functionality.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				17a124d4dbc712505d22a7815c9b0cee22214c96c8abb91539a2b1351e38a000  mesa-18.3.0.tar.gz

				b63f947e735d6ef3dfaa30c789a9adfbae18aea671191eaacde95a18c17fc38a  mesa-18.3.0.tar.xz

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>GL_AMD_depth_clamp_separate on r600, radeonsi.</li>

				<li>GL_AMD_framebuffer_multisample_advanced on radeonsi.</li>

				<li>GL_AMD_gpu_shader_int64 on i965, nvc0, radeonsi.</li>

				<li>GL_AMD_multi_draw_indirect on all GL 4.x drivers.</li>

				<li>GL_AMD_query_buffer_object on i965, nvc0, r600, radeonsi.</li>

				<li>GL_EXT_disjoint_timer_query on radeonsi and most other Gallium drivers (ES extension)</li>

				<li>GL_EXT_texture_compression_s3tc on all drivers (ES extension)</li>

				<li>GL_EXT_vertex_attrib_64bit on i965, nvc0, radeonsi.</li>

				<li>GL_EXT_window_rectangles on radeonsi.</li>

				<li>GL_KHR_texture_compression_astc_sliced_3d on radeonsi.</li>

				<li>GL_NV_fragment_shader_interlock on i965.</li>

				<li>EGL_EXT_device_base for all drivers.</li>

				<li>EGL_EXT_device_drm for all drivers.</li>

				<li>EGL_MESA_device_software for all drivers.</li>

				</ul>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=13728">Bug 13728</a> - [G965] Some objects in Neverwinter Nights Linux version not displayed correctly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91433">Bug 91433</a> - piglit.spec.arb_depth_buffer_float.fbo-depth-gl_depth_component32f-copypixels fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93355">Bug 93355</a> - [BXT,SKLGT4e] intermittent ext_framebuffer_multisample.accuracy fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94957">Bug 94957</a> - dEQP failures on llvmpipe</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98699">Bug 98699</a> - &quot;float[a+++4 ? 1:1] f;&quot; crashes glsl_compiler</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99507">Bug 99507</a> - Corrupted frame contents with Vulkan version of DOTA2, Talos Principle and Sascha Willems' demos when they're run Vsynched in fullscreen</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99730">Bug 99730</a> - Metro Redux game(s) needs override for midshader extension declaration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100200">Bug 100200</a> - Default Unreal Engine 4 frag shader fails to compile</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101247">Bug 101247</a> - Mesa fails to link GLSL programs with unused output blocks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102597">Bug 102597</a> - [Regression] mpv, high rendering times (two to three times higher)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=103241">Bug 103241</a> - Anv crashes when using 64-bit vertex inputs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104602">Bug 104602</a> - [apitrace] Graphical artifacts in Civilization VI on RX Vega</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104809">Bug 104809</a> - anv: DOOM 2016 and Wolfenstein II:The New Colossus crash due to not having depthBoundsTest</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=104926">Bug 104926</a> - swrast: Mesa 17.3.3 produces:  HW cursor for format 875713089 not supported</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105333">Bug 105333</a> - [gallium-nine] missing geometry after commit ac: replace ac_build_kill with ac_build_kill_if_false</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105371">Bug 105371</a> - r600_shader_from_tgsi - GPR limit exceeded - shader requires 360 registers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105731">Bug 105731</a> - linker error &quot;fragment shader input ... has no matching output in the previous stage&quot; when previous stage's output declaration in a separate shader object</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105904">Bug 105904</a> - Needed to delete mesa shader cache after driver upgrade for 32 bit wine vulkan programs to work.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=105975">Bug 105975</a> - i965 always reports 0 viewport subpixel bits</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106231">Bug 106231</a> - llvmpipe blends produce bad code after llvm patch https://reviews.llvm.org/D44785</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106283">Bug 106283</a> - Shader replacements works only for limited use cases</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106577">Bug 106577</a> - broken rendering with nine and nouveau (GM107)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106833">Bug 106833</a> - glLinkProgram is expected to fail when vertex attribute aliasing happens on ES3.0 context or later</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106865">Bug 106865</a> - [GLK] piglit.spec.ext_framebuffer_multisample.accuracy stencil tests fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106980">Bug 106980</a> - Basemark GPU vulkan benchmark hangs on GFX9</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106997">Bug 106997</a> - [Regression]. Dying light game is crashing on latest mesa</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107088">Bug 107088</a> - [GEN8+] Hang when discarding a fragment if dual source blending is enabled but shader doesn't support it</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107098">Bug 107098</a> - Segfault after munmap(kms_sw_dt-&gt;ro_mapped)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107212">Bug 107212</a> - Dual-Core CPU E5500 / G45: RetroArch with reicast core results in corrupted graphics</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107223">Bug 107223</a> - [GEN9+] 50% perf drop in SynMark Fill* tests (E2E RBC gets disabled?)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107276">Bug 107276</a> - radv: OpBitfieldUExtract returns incorrect result when count is zero</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107280">Bug 107280</a> - [DXVK] Batman: Arkham City with tessellation enabled hangs on SKL GT4</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107313">Bug 107313</a> - Meson instructions on web site are non-optimal</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107359">Bug 107359</a> - [Regression] [bisected] [OpenGL CTS] [SKL,BDW] KHR-GL46.texture_barrier*-texels, GTF-GL46.gtf21.GL2FixedTests.buffer_corners.buffer_corners, and GTF-GL46.gtf21.GL2FixedTests.stencil_plane_corners.stencil_plane_corners fail with some configuration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107460">Bug 107460</a> - radv: OpControlBarrier does not always work correctly (bisected)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107477">Bug 107477</a> - [DXVK] Setting high shader quality in GTA V results in LLVM error</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107483">Bug 107483</a> - DispatchSanity_test.GL31_CORE regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107487">Bug 107487</a> - [intel] [tools] intel gpu tools don't honor -D tools=[]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107488">Bug 107488</a> - gl.h:2090: error: redefinition of typedef ‘GLeglImageOES’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107510">Bug 107510</a> - [GEN8+] up to 10% perf drop on several 3D benchmarks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107511">Bug 107511</a> - KHR/khrplatform.h not always installed when needed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107524">Bug 107524</a> - Broken packDouble2x32 at llvmpipe</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107544">Bug 107544</a> - intel/decoder: out of bounds group_iter</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107547">Bug 107547</a> - shader crashing glsl_compiler (uniform block assigned to vec2, then component substraced by 1)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107550">Bug 107550</a> - &quot;0[2]&quot; as function parameter hits assert</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107563">Bug 107563</a> - [RADV] Broken rendering in Unity demos</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107565">Bug 107565</a> - TypeError: __init__() got an unexpected keyword argument 'future_imports'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107579">Bug 107579</a> - [SNB] The graphic corruption when we reuse the GS compiled and used for TFB when statebuffer contain magic trash in the unused space</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107601">Bug 107601</a> - Rise of the Tomb Raider Segmentation Fault when the game starts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107610">Bug 107610</a> - Dolphin emulator mis-renders shadow overlay in Super Mario Sunshine</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107626">Bug 107626</a> - [SNB] The graphical corruption and GPU hang occur sometimes on the piglit test &quot;arb_texture_multisample-large-float-texture&quot; with parameter --fp16</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107658">Bug 107658</a> - [Regression] [bisected] [OpenGLES CTS] KHR-GLES3.packed_pixels.*rectangle.r*8_snorm</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107734">Bug 107734</a> - [GLSL] glsl-fface-invariant, glsl-fcoord-invariant and glsl-pcoord-invariant should fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107745">Bug 107745</a> - [bisected] [bdw bsw] piglit.spec.arb_fragment_shader_interlock.arb_fragment_shader_interlock-image-load-store failure</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107760">Bug 107760</a> - GPU Hang when Playing DiRT 3 Complete Edition using Steam Play with DXVK</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107765">Bug 107765</a> - [regression] Batman Arkham City crashes with DXVK under wine</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107772">Bug 107772</a> - Mesa preprocessor matches if(def)s &amp; endifs incorrectly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107779">Bug 107779</a> - Access violation with some games</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107786">Bug 107786</a> - [DXVK] MSAA reflections are broken in GTA V</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107806">Bug 107806</a> - glsl_get_natural_size_align_bytes() ABORT with GfxBench Vulkan AztecRuins</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107810">Bug 107810</a> - The 'va_end' call is missed after 'va_copy' in 'util_vsnprintf' function under windows</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107832">Bug 107832</a> - Gallium picking A16L16 formats when emulating INTENSITY16 conflicts with mesa</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107843">Bug 107843</a> - 32bit Mesa build failes with meson.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107856">Bug 107856</a> - i965 incorrectly calculates the number of layers for texture views (assert)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107857">Bug 107857</a> - GPU hang - GS_EMIT without shader outputs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107865">Bug 107865</a> - swr fail to build with llvm-libs 6.0.1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107869">Bug 107869</a> - u_thread.h:87:4: error: use of undeclared identifier 'cpu_set_t'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107870">Bug 107870</a> - Undefined symbols for architecture x86_64: &quot;_util_cpu_caps&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107879">Bug 107879</a> - crash happens when link program</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107891">Bug 107891</a> - [wine, regression, bisected] RAGE, Wolfenstein The New Order hangs in menu</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107923">Bug 107923</a> - build_id.c:126: multiple definition of `build_id_length'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107926">Bug 107926</a> - [anv] Rise of the Tomb Raider always misrendering, segfault and gpu hang.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107941">Bug 107941</a> - GPU hang and system crash with Dota 2 using Vulkan</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107971">Bug 107971</a> - SPV_GOOGLE_hlsl_functionality1 / SPV_GOOGLE_decorate_string</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108012">Bug 108012</a> - Compiler crashes on access of non-existent member incremental operations</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108024">Bug 108024</a> - [Debian Stretch]Fail to build because &quot;xcb_randr_lease_t&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108082">Bug 108082</a> - warning: unknown warning option '-Wno-format-truncation' [-Wunknown-warning-option]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108109">Bug 108109</a> - [GLSL] no-overloads.vert fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108112">Bug 108112</a> - [vulkancts] some of the coherent memory tests fail.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108113">Bug 108113</a> - [vulkancts] r32g32b32 transfer operations not implemented</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108115">Bug 108115</a> - [vulkancts] dEQP-VK.subgroups.vote.graphics.subgroupallequal.* fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108164">Bug 108164</a> - [radv] VM faults since 5d6a560a2986c9ab421b3c7904d29bb7bc35e36f</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108245">Bug 108245</a> - RADV/Vega: Low mip levels of large BCn textures get corrupted by vkCmdCopyBufferToImage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108272">Bug 108272</a> - [polaris10] opencl-mesa: Anything using OpenCL segfaults, XFX Radeon RX 580</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108311">Bug 108311</a> - Query buffer object support is broken on r600.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108319">Bug 108319</a> - [GLK BXT BSW] Assertion in piglit.spec.arb_gpu_shader_fp64.execution.built-in-functions.vs-sign-sat-neg-abs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108491">Bug 108491</a> - Commit baa38c14 causes output issues on my VEGA with RADV</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108524">Bug 108524</a> - [RADV]  GPU lockup on event synchronization</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108530">Bug 108530</a> - (mesa-18.3) [Tracker] Mesa 18.3 Release Tracker</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108532">Bug 108532</a> - make check nir_copy_prop_vars_test.store_store_load_different_components regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108560">Bug 108560</a> - Mesa 32 is built without sse</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108595">Bug 108595</a> - ir3_compiler valgrind build error</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108617">Bug 108617</a> - [deqp] Mesa fails conformance for egl_ext_device</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108630">Bug 108630</a> - [G965] piglit.spec.!opengl 1_2.tex3d-maxsize spins forever</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108635">Bug 108635</a> - Mesa master commit 68dc591af16ebb36814e4c187e4998948103c99c causes XWayland to segfault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108713">Bug 108713</a> - Gallium: use after free with transform feedback</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108829">Bug 108829</a> - [meson] libglapi exports internal API</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108894">Bug 108894</a> - [anv] vkCmdCopyBuffer() and vkCmdCopyQueryPoolResults() write-after-write hazard</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108909">Bug 108909</a> - Vkd3d test failure test_resolve_non_issued_query_data()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108914">Bug 108914</a> - blocky shadow artifacts in The Forest with DXVK, RADV_DEBUG=nohiz fixes this</li>

				<h2>Changes</h2>

				<ul>

				<li>TBD</li>

				</ul>

				</div>

				</body>

				</html>

									
										63

docs/relnotes/18.3.1.html
									
										Normal file
									
												
				@@ -0,0 +1,63 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.3.1 Release Notes / December 11, 2018</h1>

				<p>

				Mesa 18.3.1 is a bug fix release which fixes bugs found since the 18.3.0 release.

				</p>

				<p>

				Mesa 18.3.1 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				256d0c3d88e380c1b8e3fc5c6ac34001e3b7c30458b8b852407ec68b8ccd9fda  mesa-18.3.1.tar.gz

				5b1f827d28684a25f6657289f8b7d47ac56395988c7ac23e0ec9a62b644bdc63  mesa-18.3.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>None</p>

				<h2>Changes</h2>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.3.0</li>

				  <li>Update version to 18.3.1</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>anv,radv: Disable VK_EXT_pci_bus_info</li>

				</ul>

				</div>

				</body>

				</html>

									
										265

docs/relnotes/18.3.2.html
									
										Normal file
									
												
				@@ -0,0 +1,265 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 18.3.2 Release Notes / January 17, 2019</h1>

				<p>

				Mesa 18.3.2 is a bug fix release which fixes bugs found since the 18.3.1 release.

				</p>

				<p>

				Mesa 18.3.2 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				1cde4fafd40cd1ad4ee3a13b364b7a0175a08b7afdd127fb46f918c1e1dfd4b0  mesa-18.3.2.tar.gz

				f7ce7181c07b6d8e0132da879af1729523a6c8aa87f79a9d59dfd064024cfb35  mesa-18.3.2.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=106595">Bug 106595</a> - [RADV] Rendering distortions only when MSAA is enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=107728">Bug 107728</a> - Wrong background in Sascha Willem's Multisampling Demo</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108114">Bug 108114</a> - [vulkancts] new VK_KHR_16bit_storage tests fail.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108116">Bug 108116</a> - [vulkancts] stencil partial clear tests fail.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108624">Bug 108624</a> - [regression][bisected] &quot;nir: Copy propagation between blocks&quot; regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108910">Bug 108910</a> - Vkd3d test failure test_multisample_array_texture()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108911">Bug 108911</a> - Vkd3d test failure test_clear_render_target_view()</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=108943">Bug 108943</a> - Build fails on ppc64le with meson</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109072">Bug 109072</a> - GPU hang in blender 2.80</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109081">Bug 109081</a> - [bisected] [HSW] Regression in clipping.user_defined.clip_* vulkancts tests</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109151">Bug 109151</a> - [KBL-G][vulkan] dEQP-VK.texture.explicit_lod.2d.sizes.31x55_nearest_linear_mipmap_nearest_repeat failed verification.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109202">Bug 109202</a> - nv50_ir.cpp:749:19: error: cannot use typeid with -fno-rtti</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=109204">Bug 109204</a> - [regression, bisected] retroarch's crt-royale shader crash radv</li>

				</ul>

				<h2>Changes</h2>

				<p>Alex Deucher (3):</p>

				<ul>

				  <li>pci_ids: add new vega10 pci ids</li>

				  <li>pci_ids: add new vega20 pci id</li>

				  <li>pci_ids: add new VegaM pci id</li>

				</ul>

				<p>Alexander von Gluck IV (1):</p>

				<ul>

				  <li>egl/haiku: Fix reference to disp vs dpy</li>

				</ul>

				<p>Andres Gomez (2):</p>

				<ul>

				  <li>glsl: correct typo in GLSL compilation error message</li>

				  <li>glsl/linker: specify proper direction in location aliasing error</li>

				</ul>

				<p>Axel Davy (3):</p>

				<ul>

				  <li>st/nine: Fix volumetexture dtor on ctor failure</li>

				  <li>st/nine: Bind src not dst in nine_context_box_upload</li>

				  <li>st/nine: Add src reference to nine_context_range_upload</li>

				</ul>

				<p>Bas Nieuwenhuizen (5):</p>

				<ul>

				  <li>radv: Do a cache flush if needed before reading predicates.</li>

				  <li>radv: Implement buffer stores with less than 4 components.</li>

				  <li>anv/android: Do not reject storage images.</li>

				  <li>radv: Fix rasterization precision bits.</li>

				  <li>spirv: Fix matrix parameters in function calls.</li>

				</ul>

				<p>Caio Marcelo de Oliveira Filho (3):</p>

				<ul>

				  <li>nir: properly clear the entry sources in copy_prop_vars</li>

				  <li>nir: properly find the entry to keep in copy_prop_vars</li>

				  <li>nir: remove dead code from copy_prop_vars</li>

				</ul>

				<p>Dave Airlie (2):</p>

				<ul>

				  <li>radv/xfb: fix counter buffer bounds checks.</li>

				  <li>virgl/vtest: fix front buffer flush with protocol version 0.</li>

				</ul>

				<p>Dylan Baker (6):</p>

				<ul>

				  <li>meson: Fix ppc64 little endian detection</li>

				  <li>meson: Add support for gnu hurd</li>

				  <li>meson: Add toggle for glx-direct</li>

				  <li>meson: Override C++ standard to gnu++11 when building with altivec on ppc64</li>

				  <li>meson: Error out if building nouveau and using LLVM without rtti</li>

				  <li>autotools: Remove tegra vdpau driver</li>

				</ul>

				<p>Emil Velikov (12):</p>

				<ul>

				  <li>docs: add sha256 checksums for 18.3.1</li>

				  <li>bin/get-pick-list.sh: rework handing of sha nominations</li>

				  <li>bin/get-pick-list.sh: warn when commit lists invalid sha</li>

				  <li>cherry-ignore: meson: libfreedreno depends upon libdrm (for fence support)</li>

				  <li>glx: mandate xf86vidmode only for "drm" dri platforms</li>

				  <li>meson: don't require glx/egl/gbm with gallium drivers</li>

				  <li>pipe-loader: meson: reference correct library</li>

				  <li>TODO: glx: meson: build dri based glx tests, only with -Dglx=dri</li>

				  <li>glx: meson: drop includes from a link-only library</li>

				  <li>glx: meson: wire up the dispatch-index-check test</li>

				  <li>glx/test: meson: assorted include fixes</li>

				  <li>Update version to 18.3.2</li>

				</ul>

				<p>Eric Anholt (6):</p>

				<ul>

				  <li>v3d: Fix a leak of the transfer helper on screen destroy.</li>

				  <li>vc4: Fix a leak of the transfer helper on screen destroy.</li>

				  <li>v3d: Fix a leak of the disassembled instruction string during debug dumps.</li>

				  <li>v3d: Make sure that a thrsw doesn't split a multop from its umul24.</li>

				  <li>v3d: Add missing flagging of SYNCB as a TSY op.</li>

				  <li>gallium/ttn: Fix setup of outputs_written.</li>

				</ul>

				<p>Erik Faye-Lund (2):</p>

				<ul>

				  <li>virgl: wrap vertex element state in a struct</li>

				  <li>virgl: work around bad assumptions in virglrenderer</li>

				</ul>

				<p>Francisco Jerez (5):</p>

				<ul>

				  <li>intel/fs: Handle source modifiers in lower_integer_multiplication().</li>

				  <li>intel/fs: Implement quad swizzles on ICL+.</li>

				  <li>intel/fs: Fix bug in lower_simd_width while splitting an instruction which was already split.</li>

				  <li>intel/eu/gen7: Fix brw_MOV() with DF destination and strided source.</li>

				  <li>intel/fs: Respect CHV/BXT regioning restrictions in copy propagation pass.</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>i965/vec4/dce: Don't narrow the write mask if the flags are used</li>

				  <li>Revert "nir/lower_indirect: Bail early if modes == 0"</li>

				</ul>

				<p>Jan Vesely (1):</p>

				<ul>

				  <li>clover: Fix build after clang r348827</li>

				</ul>

				<p>Jason Ekstrand (6):</p>

				<ul>

				  <li>nir/constant_folding: Fix source bit size logic</li>

				  <li>intel/blorp: Be more conservative about copying clear colors</li>

				  <li>spirv: Handle any bit size in vector_insert/extract</li>

				  <li>anv/apply_pipeline_layout: Set the cursor in lower_res_reindex_intrinsic</li>

				  <li>spirv: Sign-extend array indices</li>

				  <li>intel/peephole_ffma: Fix swizzle propagation</li>

				</ul>

				<p>Karol Herbst (1):</p>

				<ul>

				  <li>nv50/ir: fix use-after-free in ConstantFolding::visit</li>

				</ul>

				<p>Kirill Burtsev (1):</p>

				<ul>

				  <li>loader: free error state, when checking the drawable type</li>

				</ul>

				<p>Lionel Landwerlin (5):</p>

				<ul>

				  <li>anv: don't do partial resolve on layer &gt; 0</li>

				  <li>i965: include draw_params/derived_draw_params for VF cache workaround</li>

				  <li>i965: add CS stall on VF invalidation workaround</li>

				  <li>anv: explictly specify format for blorp ccs/mcs op</li>

				  <li>anv: flush fast clear colors into compressed surfaces</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>st/mesa: don't leak pipe_surface if pipe_context is not current</li>

				</ul>

				<p>Mario Kleiner (1):</p>

				<ul>

				  <li>radeonsi: Fix use of 1- or 2- component GL_DOUBLE vbo's.</li>

				</ul>

				<p>Nicolai Hähnle (1):</p>

				<ul>

				  <li>meson: link LLVM 'native' component when LLVM is available</li>

				</ul>

				<p>Rhys Perry (3):</p>

				<ul>

				  <li>radv: don't set surf_index for stencil-only images</li>

				  <li>ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsics</li>

				  <li>ac: split 16-bit ssbo loads that may not be dword aligned</li>

				</ul>

				<p>Rob Clark (2):</p>

				<ul>

				  <li>freedreno/drm: fix memory leak</li>

				  <li>mesa/st/nir: fix missing nir_compact_varyings</li>

				</ul>

				<p>Samuel Pitoiset (1):</p>

				<ul>

				  <li>radv: switch on EOP when primitive restart is enabled with triangle strips</li>

				</ul>

				<p>Timothy Arceri (2):</p>

				<ul>

				  <li>tgsi/scan: fix loop exit point in tgsi_scan_tess_ctrl()</li>

				  <li>tgsi/scan: correctly walk instructions in tgsi_scan_tess_ctrl()</li>

				</ul>

				<p>Vinson Lee (2):</p>

				<ul>

				  <li>meson: Fix typo.</li>

				  <li>meson: Fix libsensors detection.</li>

				</ul>

				</div>

				</body>

				</html>

									
										74

docs/relnotes/19.0.0.html
									
										Normal file
									
												
				@@ -0,0 +1,74 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 19.0.0 Release Notes / TBD</h1>

				<p>

				Mesa 19.0.0 is a new development release. People who are concerned

				with stability and reliability should stick with a previous release or

				wait for Mesa 19.0.1.

				</p>

				<p>

				Mesa 19.0.0 implements the OpenGL 4.5 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.5.  OpenGL

				4.5 is <strong>only</strong> available if requested at context creation.

				Compatibility contexts may report a lower version depending on each driver.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				TBD.

				</pre>

				<h2>New features</h2>

				<ul>

				<li>GL_AMD_texture_texture4 on all GL 4.0 drivers.</li>

				<li>GL_EXT_shader_implicit_conversions on all drivers (ES extension).</li>

				<li>GL_EXT_texture_compression_bptc on all GL 4.0 drivers (ES extension).</li>

				<li>GL_EXT_texture_compression_rgtc on all GL 3.0 drivers (ES extension).</li>

				<li>GL_EXT_render_snorm on gallium drivers (ES extension).</li>

				<li>GL_EXT_texture_view on drivers supporting texture views (ES extension).</li>

				<li>GL_OES_texture_view on drivers supporting texture views (ES extension).</li>

				<li>GL_NV_shader_atomic_float on nvc0 (Fermi/Kepler only).</li>

				<li>Shader-based software implementations of GL_ARB_gpu_shader_fp64, GL_ARB_gpu_shader_int64, GL_ARB_vertex_attrib_64bit, and GL_ARB_shader_ballot on i965.</li>

				<li>VK_ANDROID_external_memory_android_hardware_buffer on Intel</li>

				<li>Fixed and re-exposed VK_EXT_pci_bus_info on Intel and RADV</li>

				<li>VK_EXT_scalar_block_layout on Intel and RADV</li>

				<li>VK_KHR_depth_stencil_resolve on Intel</li>

				<li>VK_KHR_draw_indirect_count on Intel</li>

				<li>VK_EXT_conditional_rendering on Intel</li>

				<li>VK_EXT_memory_budget on RADV</li>

				</ul>

				<h2>Bug fixes</h2>

				<ul>

				<li>TBD</li>

				</ul>

				<h2>Changes</h2>

				<ul>

				<li>TBD</li>

				</ul>

				</div>

				</body>

				</html>

									
										39

docs/repository.html
									
												
				@@ -35,9 +35,9 @@ You may access the repository either as an

				<p>

				You may also 

				<a href="https://cgit.freedesktop.org/mesa/mesa/"

				<a href="https://gitlab.freedesktop.org/mesa/mesa"

				>browse the main Mesa git repository</a> and the

				<a href="https://cgit.freedesktop.org/mesa/demos"

				<a href="https://gitlab.freedesktop.org/mesa/demos"

				>Mesa demos and tests git repository</a>.

				</p>

				@@ -52,7 +52,7 @@ To get the Mesa sources anonymously (read-only):

				<li>Install the git software on your computer if needed.<br><br>

				<li>Get an initial, local copy of the repository with:

				    <pre>

				    git clone git://anongit.freedesktop.org/git/mesa/mesa

				    git clone https://gitlab.freedesktop.org/mesa/mesa.git

				    </pre>

				<li>Later, you can update your tree from the master repository with:

				    <pre>

				@@ -60,7 +60,7 @@ To get the Mesa sources anonymously (read-only):

				    </pre>

				<li>If you also want the Mesa demos/tests repository:

				    <pre>

				    git clone git://anongit.freedesktop.org/git/mesa/demos

				    git clone https://gitlab.freedesktop.org/mesa/demos.git

				    </pre>

				</ol>

				@@ -98,24 +98,17 @@ on a particular driver, add a new extension, etc.) in the bugzilla record.

				</ol>

				<p>

				Once your account is established:

				</p>

				Once your account is established, you can update your push url to use SSH:

				<pre>

				git remote set-url --push <em>origin</em> git@gitlab.freedesktop.org:mesa/mesa.git

				</pre>

				<ol>

				<li>Get an initial, local copy of the repository with:

				    <pre>

				    git clone git+ssh://username@git.freedesktop.org/git/mesa/mesa

				    </pre>

				    Replace <em>username</em> with your actual login name.<br><br>

				<li>Later, you can update your tree from the master repository with:

				    <pre>

				    git pull origin

				    </pre>

				<li>If you also want the Mesa demos/tests repository:

				    <pre>

				    git clone git+ssh://username@git.freedesktop.org/git/mesa/demos

				    </pre>

				</ol>

				You can also use <a href="https://gitlab.freedesktop.org/profile/personal_access_tokens">personal access tokens</a>

				to push over HTTPS instead (useful for people behind strict proxies).

				In this case, create a token, and put it in the url as shown here:

				<pre>

				git remote set-url --push <em>origin</em> https://<em>USER</em>:<em>TOKEN</em>@gitlab.freedesktop.org/mesa/mesa.git

				</pre>

				<h2>Windows Users</h2>

				@@ -149,12 +142,12 @@ code while a branch has the latest stable code.

				</p>

				<p>

				The command <code>git-branch</code> will list all available branches.

				The command <code>git branch</code> will list all available branches.

				</p>

				<p>

				Questions about branch status/activity should be posted to the

				mesa3d-dev mailing list.

				mesa-dev mailing list.

				</p>

				<h2>Developer Git Tips</h2>

									
										2

docs/shading.html
									
												
				@@ -85,7 +85,7 @@ should match the filenames of the corresponding dumped shaders.

				<p>

				Setting <b>MESA_SHADER_CAPTURE_PATH</b> to a directory will cause the compiler

				to write <tt>.shader_test</tt> files for use with

				<a href="https://cgit.freedesktop.org/mesa/shader-db">shader-db</a>, a tool

				<a href="https://gitlab.freedesktop.org/mesa/shader-db">shader-db</a>, a tool

				which compiler developers can use to gather statistics about shaders

				(instructions, cycles, memory accesses, and so on).

				</p>

									
										2

docs/sourcedocs.html
									
												
				@@ -31,7 +31,7 @@ the <code>doxygen</code> directory and run <code>make</code>.

				<p>

				For an example of Doxygen usage in Mesa, see a recent source file

				such as <a href="https://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/main/bufferobj.c">bufferobj.c</a>.

				such as <a href="https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/mesa/main/bufferobj.c">bufferobj.c</a>.

				</p>

82

docs/specs/EGL_MESA_device_software.txt Normal file


@@ -0,0 +1,82 @@
 Name
     MESA_device_software
 Name Strings
     EGL_MESA_device_software
 Contributors
     Adam Jackson <ajax@redhat.com>
     Emil Velikov <emil.velikov@collabora.com>
 Contacts
     Adam Jackson <ajax@redhat.com>
 Status
     DRAFT
 Version
     Version 2, 2018-10-03
 Number
     EGL Extension #TODO
 Extension Type
     EGL device extension
 Dependencies
     Requires EGL_EXT_device_query.
     This extension is written against the EGL 1.5 Specification.
 Overview
     This extension defines a software EGL "device". The device is not backed by
     any actual device node and simply renders into client memory.
     By defining this as an extension, EGL_EXT_device_enumeration is able to
     sanely enumerate a software device.
 New Types
     None
 New Procedures and Functions
     None
 New Tokens
     None
 Additions to the EGL Specification
     None
 New Behavior
     The device list produced by eglQueryDevicesEXT will include a software
     device. This can be distinguished from other device classes in the usual
     way by calling eglQueryDeviceStringEXT(EGL_EXTENSIONS) and matching this
     extension's string in the result.
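
The matching step described under New Behavior can be sketched in C as follows. This is a minimal sketch, assuming the extension string is a space-separated list of names, as is conventional for EGL extension strings. `has_extension` is a hypothetical helper, not part of EGL or Mesa. In a real program the list would come from eglQueryDeviceStringEXT(device, EGL_EXTENSIONS) for each device returned by eglQueryDevicesEXT.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `name` appears as a whole
 * space-separated token in `list`, 0 otherwise. */
int has_extension(const char *list, const char *name)
{
    size_t name_len;

    if (list == NULL || name == NULL || *name == '\0')
        return 0;
    name_len = strlen(name);

    while (*list != '\0') {
        /* Length of the current token, up to the next space or the end. */
        size_t token_len = strcspn(list, " ");

        if (token_len == name_len && strncmp(list, name, name_len) == 0)
            return 1;

        list += token_len;
        /* Skip the separator(s) before the next token. */
        list += strspn(list, " ");
    }
    return 0;
}
```

Comparing whole tokens rather than using a plain substring search avoids a false match when another extension name merely begins with "EGL_MESA_device_software".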
 Issues
     None
 Revision History
     Version 2, 2018-10-03 (Emil Velikov)
         - Drop "fallback" from "software fallback device"
         - Add Emil Velikov as contributor
     Version 1, 2017-07-06 (Adam Jackson)
         - Initial version

95

docs/specs/EGL_MESA_query_driver.txt Normal file


@@ -0,0 +1,95 @@
 Name
     MESA_query_driver
 Name Strings
     EGL_MESA_query_driver
 Contact
     Rob Clark      <robdclark 'at' gmail.com>
     Nicolai Hähnle <Nicolai.Haehnle 'at' amd.com>
 Contributors
     Veluri Mithun <velurimithun38 'at' gmail.com>
 Status
     Complete
 Version
     Version 3, 2019-01-24
 Number
     EGL Extension 131
 Dependencies
     EGL 1.0 is required.
 Overview
     This extension provides functions that let an application query the
     name of the driver behind an EGL display and obtain that driver's
     option list as UTF-8 encoded XML.
     The XML formally describes all available options and also
     includes verbal descriptions in multiple languages. Its main purpose
     is to be automatically processed by configuration GUIs.
     The XML shall respect the following DTD:
     <!ELEMENT driinfo      (section*)>
     <!ELEMENT section      (description+, option+)>
     <!ELEMENT description  (enum*)>
     <!ATTLIST description  lang CDATA #REQUIRED
                            text CDATA #REQUIRED>
     <!ELEMENT option       (description+)>
     <!ATTLIST option       name CDATA #REQUIRED
                            type (bool|enum|int|float) #REQUIRED
                            default CDATA #REQUIRED
                            valid CDATA #IMPLIED>
     <!ELEMENT enum         EMPTY>
     <!ATTLIST enum         value CDATA #REQUIRED
                            text CDATA #REQUIRED>
 New Procedures and Functions
     char* eglGetDisplayDriverConfig(EGLDisplay dpy);
     const char* eglGetDisplayDriverName(EGLDisplay dpy);
 Description
     Passing an EGLDisplay to `eglGetDisplayDriverName` returns the name of
     the driver. Passing it to `eglGetDisplayDriverConfig` returns the
     driver's configuration options in XML format.
     The string returned by `eglGetDisplayDriverConfig` is heap-allocated and
     the caller is responsible for freeing it.
     EGL_BAD_DISPLAY is generated if `dpy` is not an EGL display connection.
     EGL_NOT_INITIALIZED is generated if `dpy` has not been initialized.
     If the implementation does not have enough resources to allocate the XML then an
     EGL_BAD_ALLOC error is generated.
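
The ownership rule for the configuration string can be illustrated with a short C sketch. Everything here other than the shape of the `eglGetDisplayDriverConfig` prototype is an assumption made for illustration: `display_t` stands in for EGLDisplay so the sketch builds without EGL headers, `driver_config_length` is a hypothetical helper, and `stub_get_config` is a fake driver entry point. In a real program the function pointer would presumably be obtained with eglGetProcAddress.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for EGLDisplay (assumption, to avoid needing EGL headers). */
typedef void *display_t;

/* Pointer type mirroring the eglGetDisplayDriverConfig prototype. */
typedef char *(*get_driver_config_fn)(display_t dpy);

/* Hypothetical helper: returns the length of the XML option list,
 * or -1 if the query failed. The caller owns the returned string,
 * so it is freed here once it is no longer needed. */
long driver_config_length(display_t dpy, get_driver_config_fn get_config)
{
    char *xml = get_config(dpy);
    long len;

    if (xml == NULL)
        return -1;

    len = (long)strlen(xml);
    free(xml);
    return len;
}

/* Fake entry point for illustration: returns a heap-allocated,
 * minimal document that satisfies the DTD (zero sections). */
char *stub_get_config(display_t dpy)
{
    static const char xml[] = "<driinfo></driinfo>";
    char *copy = malloc(sizeof xml);

    (void)dpy;
    if (copy != NULL)
        memcpy(copy, xml, sizeof xml);
    return copy;
}
```

The spec states only that the configuration string is heap-allocated. The driver name is returned as `const char*`, so this sketch assumes it must not be freed by the caller.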
 New Tokens
     No new tokens
 Issues
     None
 Revision History
     Version 1, 2018-11-05 - First draft (Veluri Mithun)
     Version 2, 2019-01-23 - Final version (Veluri Mithun)
     Version 3, 2019-01-24 - Mark as complete, add Khronos extension
                             number, fix parameter name in prototypes,
                             write revision history (Eric Engestrom)

200

docs/specs/INTEL_shader_atomic_float_minmax.txt Normal file


@@ -0,0 +1,200 @@
 Name
     INTEL_shader_atomic_float_minmax
 Name Strings
     GL_INTEL_shader_atomic_float_minmax
 Contact
     Ian Romanick (ian . d . romanick 'at' intel . com)
 Contributors
 Status
     In progress
 Version
     Last Modified Date: 06/22/2018
     Revision: 4
 Number
     TBD
 Dependencies
     OpenGL 4.2, OpenGL ES 3.1, ARB_shader_storage_buffer_object, or
     ARB_compute_shader is required.
     This extension is written against version 4.60 of the OpenGL Shading
     Language Specification.
 Overview
     This extension provides GLSL built-in functions allowing shaders to
     perform atomic read-modify-write operations to floating-point buffer
     variables and shared variables.  Minimum, maximum, exchange, and
     compare-and-swap are enabled.
 New Procedures and Functions
     None.
 New Tokens
     None.
 IP Status
     None.
 Modifications to the OpenGL Shading Language Specification, Version 4.60
     Including the following line in a shader can be used to control the
     language features described in this extension:
       #extension GL_INTEL_shader_atomic_float_minmax : <behavior>
     where <behavior> is as specified in section 3.3.
     New preprocessor #defines are added to the OpenGL Shading Language:
       #define GL_INTEL_shader_atomic_float_minmax   1
 Additions to Chapter 8 of the OpenGL Shading Language Specification
 (Built-in Functions)
     Modify Section 8.11, "Atomic Memory Functions"
     (add a new row after the existing "atomicMin" table row, p. 179)
         float atomicMin(inout float mem, float data)
         Computes a new value by taking the minimum of the value of data and
         the contents of mem.  If one of these is an IEEE signaling NaN (i.e.,
         a NaN with the most-significant bit of the mantissa cleared), it is
         always considered smaller.  If one of these is an IEEE quiet NaN
         (i.e., a NaN with the most-significant bit of the mantissa set), it is
         always considered larger.  If both are IEEE quiet NaNs or both are
         IEEE signaling NaNs, the result of the comparison is undefined.
     (add a new row after the existing "atomicMax" table row, p. 179)
         float atomicMax(inout float mem, float data)
         Computes a new value by taking the maximum of the value of data and
         the contents of mem.  If one of these is an IEEE signaling NaN (i.e.,
         a NaN with the most-significant bit of the mantissa cleared), it is
         always considered larger.  If one of these is an IEEE quiet NaN (i.e.,
         a NaN with the most-significant bit of the mantissa set), it is always
         considered smaller.  If both are IEEE quiet NaNs or both are IEEE
         signaling NaNs, the result of the comparison is undefined.
     (add to "atomicExchange" table cell, p. 180)
         float atomicExchange(inout float mem, float data)
     (add to "atomicCompSwap" table cell, p. 180)
         float atomicCompSwap(inout float mem, float compare, float data)
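     As an informal illustration of the two signatures above, the following
     C11 sketch models the float overloads on top of 32-bit integer atomics.
     The names model_atomic_exchange, model_atomic_comp_swap, float_to_bits,
     and bits_to_float are hypothetical and not part of this extension.  The
     use of floating-point equality for the comparison follows the NaN
     discussion in the Issues section.  This is a sketch under those
     assumptions, not a description of how any driver or GPU implements
     these functions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern, and back (no value conversion). */
static uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Hypothetical model of: float atomicExchange(inout float mem, float data).
 * Stores data and returns the previous contents of mem. */
static float model_atomic_exchange(_Atomic uint32_t *mem, float data)
{
    assert(mem != NULL);
    return bits_to_float(atomic_exchange(mem, float_to_bits(data)));
}

/* Hypothetical model of:
 *   float atomicCompSwap(inout float mem, float compare, float data).
 * The comparison uses floating-point ==, so a NaN never matches and
 * +0.0 matches -0.0.  Returns the previous contents of mem. */
static float model_atomic_comp_swap(_Atomic uint32_t *mem, float compare,
                                    float data)
{
    assert(mem != NULL);
    uint32_t old_bits = atomic_load(mem);
    for (;;) {
        float old = bits_to_float(old_bits);
        if (!(old == compare))
            return old;                 /* no match (includes any NaN) */
        if (atomic_compare_exchange_weak(mem, &old_bits, float_to_bits(data)))
            return old;                 /* swap succeeded */
        /* old_bits now holds the current value of *mem; try again. */
    }
}
```

     The integer compare-exchange is only used to publish the new value; the
     decision whether to swap is made with a float comparison, which is why a
     plain bitwise compare-and-swap would not be an equivalent model.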
 Interactions with OpenGL 4.6 and ARB_gl_spirv
     If OpenGL 4.6 or ARB_gl_spirv is supported, then
     SPV_INTEL_shader_atomic_float_minmax must also be supported.
     The AtomicFloatMinmaxINTEL capability is available whenever the OpenGL or
     OpenGL ES implementation supports INTEL_shader_atomic_float_minmax.
 Issues
 1) Why call this extension INTEL_shader_atomic_float_minmax?
     RESOLVED: Several other extensions already set the precedent of
     VENDOR_shader_atomic_float and VENDOR_shader_atomic_float64 for extensions
     that enable floating-point atomic operations.  Using that as a base for
     the name seems logical.
     There already exists NV_shader_atomic_float, but the two extensions have
     nearly zero overlap in functionality.  NV_shader_atomic_float adds
     atomicAdd and image atomic operations that currently shipping Intel GPUs
     do not support.  Calling this extension INTEL_shader_atomic_float would
     likely have been confusing.
     Adding something to describe the actual functions added by this extension
     seemed reasonable.  INTEL_shader_atomic_float_compare was considered, but
     that name was deemed to be not properly descriptive.  Calling this
     extension INTEL_shader_atomic_float_min_max_exchange_compswap is right
     out.
 2) What atomic operations should we support for floating-point targets?
     RESOLVED.  Exchange, min, max, and compare-swap make sense, and these are
     all supported by the hardware.  Future extensions may add other functions.
     For buffer variables and shared variables it is not possible to bit-cast
     the memory location in GLSL, so existing integer operations, such as
     atomicOr, cannot be used.  However, the underlying hardware implementation
     can do this by treating the memory as an integer.  It would be possible to
     implement atomicNegate using this technique with atomicXor.  It is unclear
     whether this provides any actual utility.
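     The atomicNegate idea mentioned above can be sketched in C11 as an
     integer XOR on the sign bit.  atomicNegate is not defined by this
     extension; the function name model_atomic_negate and the choice to
     return the previous value (as the other atomic functions do) are
     assumptions made for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical atomicNegate: treat the float's storage as a 32-bit integer
 * and flip only the sign bit with an atomic XOR.  Returns the value that
 * was in memory before the negation. */
static float model_atomic_negate(_Atomic uint32_t *mem)
{
    assert(mem != NULL);
    uint32_t old_bits = atomic_fetch_xor(mem, UINT32_C(0x80000000));
    float old;
    memcpy(&old, &old_bits, sizeof old);
    return old;
}
```

     Because only the sign bit changes, the operation needs no floating-point
     arithmetic and applies equally to zeros, infinities, and NaNs.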
 3) What should be said about the NaN behavior?
     RESOLVED.  There are several aspects of NaN behavior that should be
     documented in this extension.  However, some of this behavior varies based
     on NaN concepts that do not exist in the GLSL specification.
     * atomicCompSwap performs the comparison as the floating-point equality
       operator (==).  That is, if either 'mem' or 'compare' is NaN, the
       comparison result is always false.
     * atomicMin and atomicMax implement the IEEE specification with respect to
       NaN.  IEEE considers two different kinds of NaN: signaling NaN and quiet
       NaN.  A quiet NaN has the most significant bit of the mantissa set, and
       a signaling NaN does not.  This concept does not exist in SPIR-V,
       Vulkan, or OpenGL.  Let qNaN denote a quiet NaN and sNaN denote a
       signaling NaN.  atomicMin and atomicMax specifically implement
       - fmin(qNaN, x) = fmin(x, qNaN) = fmax(qNaN, x) = fmax(x, qNaN) = x
       - fmin(sNaN, x) = fmin(x, sNaN) = fmax(sNaN, x) = fmax(x, sNaN) = sNaN
       - fmin(sNaN, qNaN) = fmin(qNaN, sNaN) = fmax(sNaN, qNaN) =
         fmax(qNaN, sNaN) = sNaN
       - fmin(sNaN, sNaN) = sNaN.  This specification does not define which of
         the two arguments is stored.
       - fmax(sNaN, sNaN) = sNaN.  This specification does not define which of
         the two arguments is stored.
       - fmin(qNaN, qNaN) = qNaN.  This specification does not define which of
         the two arguments is stored.
       - fmax(qNaN, qNaN) = qNaN.  This specification does not define which of
         the two arguments is stored.
     Further details are available in the Skylake Programmer's Reference
     Manuals available at
     https://01.org/linuxgraphics/documentation/hardware-specification-prms.
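     The NaN rules listed above can be restated as a small reference model
     in C.  This is a sketch, not the hardware algorithm: it works on raw
     32-bit patterns so that signaling NaNs are never quieted by the host
     CPU, it returns the value that would be stored rather than the previous
     contents of mem, and where the specification leaves the stored argument
     undefined (both quiet or both signaling) it makes an arbitrary choice.
     Signed zeros get no special treatment here.  All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define QUIET_BIT UINT32_C(0x00400000)  /* most-significant mantissa bit */

/* NaN: exponent all ones and a non-zero mantissa. */
static bool nan_bits(uint32_t u)
{
    return (u & UINT32_C(0x7f800000)) == UINT32_C(0x7f800000) &&
           (u & UINT32_C(0x007fffff)) != 0;
}

static bool snan_bits(uint32_t u) { return nan_bits(u) && (u & QUIET_BIT) == 0; }
static bool qnan_bits(uint32_t u) { return nan_bits(u) && (u & QUIET_BIT) != 0; }

/* Only called on non-NaN patterns, so the conversion cannot alter them. */
static float as_float(uint32_t u)
{
    float f;
    assert(sizeof f == sizeof u);
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Value stored by the modeled atomicMin (want_max == false) or atomicMax
 * (want_max == true), given the bit patterns of mem and data. */
static uint32_t model_minmax(uint32_t mem, uint32_t data, bool want_max)
{
    if (snan_bits(mem))  return mem;   /* sNaN wins; for two sNaNs the choice
                                          of mem is arbitrary (undefined) */
    if (snan_bits(data)) return data;
    if (qnan_bits(mem))  return data;  /* qNaN loses; for two qNaNs the choice
                                          of data is arbitrary (undefined) */
    if (qnan_bits(data)) return mem;
    if (want_max)
        return as_float(data) > as_float(mem) ? data : mem;
    return as_float(data) < as_float(mem) ? data : mem;
}
```

     The order of the checks encodes the precedence in the list: a signaling
     NaN beats everything, a quiet NaN loses to everything, and only then is
     an ordinary comparison performed.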
 4) What about atomicMin and atomicMax with (+0.0, -0.0) or (-0.0, +0.0)
     arguments?
     RESOLVED.  atomicMin should store -0.0, and atomicMax should store +0.0.
     Due to a known issue in shipping Skylake GPUs, the incorrectly signed 0 is
     stored.  This behavior may change in later GPUs.
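     The resolved behavior for signed zeros can be written down as a short C
     sketch.  It models the intended result (atomicMin stores -0.0, atomicMax
     stores +0.0), not the Skylake behavior noted above.  The function names
     are hypothetical, NaN inputs are excluded by assumption, and the
     functions return the value that would be stored.

```c
#include <assert.h>
#include <math.h>

/* Minimum that orders -0.0 below +0.0.  NaN inputs are out of scope. */
static float model_min_signed_zero(float mem, float data)
{
    assert(!isnan(mem) && !isnan(data));
    if (mem == 0.0f && data == 0.0f)
        return (signbit(mem) || signbit(data)) ? -0.0f : 0.0f;
    return data < mem ? data : mem;
}

/* Maximum that orders +0.0 above -0.0.  NaN inputs are out of scope. */
static float model_max_signed_zero(float mem, float data)
{
    assert(!isnan(mem) && !isnan(data));
    if (mem == 0.0f && data == 0.0f)
        return (signbit(mem) && signbit(data)) ? -0.0f : 0.0f;
    return data > mem ? data : mem;
}
```

     The explicit zero check is needed because +0.0 and -0.0 compare equal,
     so an ordinary less-than or greater-than test cannot tell them apart.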
 Revision History
     Rev  Date        Author    Changes
     ---  ----------  --------  ---------------------------------------------
       1  04/19/2018  idr       Initial version
       2  05/05/2018  idr       Describe interactions with the capabilities
                                added by SPV_INTEL_shader_atomic_float_minmax.
       3  05/29/2018  idr       Remove mention of 64-bit float support.
       4  06/22/2018  idr       Resolve issue #2.
                                Add issue #3 (regarding NaN behavior).
                                Add issue #4 (regarding atomicMin(-0, +0)).

81

docs/specs/MESA_framebuffer_flip_y.txt Normal file

@@ -0,0 +1,81 @@
 Name
     MESA_framebuffer_flip_y
 Name Strings
     GL_MESA_framebuffer_flip_y
 Contact
     Fritz Koenig <frkoenig@google.com>
 Contributors
     Fritz Koenig, Google
     Kristian Høgsberg, Google
     Chad Versace, Google
 Status
     Proposal
 Version
     Version 1, June 7, 2018
 Number
 
 Dependencies
     OpenGL ES 3.1 is required, for FramebufferParameteri.
 Overview
     This extension defines a new framebuffer parameter,
     GL_FRAMEBUFFER_FLIP_Y_MESA, that changes the behavior of the reads and
     writes to the framebuffer attachment points. When GL_FRAMEBUFFER_FLIP_Y_MESA
     is GL_TRUE, render commands and pixel transfer operations access the
     backing store of each attachment point with a y-inverted coordinate
     system. This y-inversion is relative to the coordinate system set when
     GL_FRAMEBUFFER_FLIP_Y_MESA is GL_FALSE.
     Access through TexSubImage2D and similar calls will notice the effect of
     the flip when the accessed images are not attached to framebuffer
     objects, because GL_FRAMEBUFFER_FLIP_Y_MESA is associated with the
     framebuffer object and not the attachment points.
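     One way to picture the y-inversion is as a remapping of row indices
     between framebuffer coordinates and the backing store.  The sketch below
     assumes that "y-inverted" means row y maps to row (height - 1 - y); the
     text above does not spell out a formula, so this mapping, the helper
     name backing_store_row, and its parameters are illustrative assumptions
     rather than part of the extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: row of the backing store touched when a command
 * addresses framebuffer row y.  flip_y stands for the framebuffer's
 * GL_FRAMEBUFFER_FLIP_Y_MESA state, which an application would set through
 * FramebufferParameteri on the framebuffer object. */
static size_t backing_store_row(size_t y, size_t height, bool flip_y)
{
    assert(height > 0 && y < height);
    return flip_y ? (height - 1 - y) : y;
}
```

     Because the state belongs to the framebuffer object, the same image
     accessed outside of that framebuffer is addressed without the remapping.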
 IP Status
     None
 Issues
     None
 New Procedures and Functions
     None
 New Types
     None
 New Tokens
     Accepted by the <pname> argument of FramebufferParameteri and
     GetFramebufferParameteriv:
         GL_FRAMEBUFFER_FLIP_Y_MESA                      0x8BBB
 Errors
     An INVALID_OPERATION error is generated by GetFramebufferParameteriv if the
     default framebuffer is bound to <target> and <pname> is FRAMEBUFFER_FLIP_Y_MESA.
 Revision History
     Version 1, June, 2018
         Initial draft (Fritz Koenig)

Compare commits

10673 Commits chadv/cros ... arc-mesa-1

1 .editorconfig
7 .mailmap
744 .travis.yml
19 Android.common.mk
8 Android.mk
6 CleanSpec.mk
19 Makefile.am
79 README.rst Normal file
17 REVIEWERS
7 SConstruct
2 VERSION
30 appveyor.yml
3 bin/.cherry-ignore Normal file
2 bin/bugzilla_mesa.sh
81 bin/get-fixes-pick-list.sh
124 bin/get-pick-list.sh
42 bin/get-typod-pick-list.sh
23 bin/git_sha1_gen.py Executable file → Normal file
37 bin/install_megadrivers.py Executable file → Normal file
88 bin/meson-cmd-extract.py Executable file
21 bin/meson.build Normal file
35 bin/meson_get_version.py Normal file
3 build-support/conftest.dyn Normal file
6 build-support/conftest.map Normal file
8 common.py
704 configure.ac
13 docs/autoconf.html
4 docs/codingstyle.html
1 docs/contents.html
6 docs/download.html
1 docs/egl.html
72 docs/envvars.html
4 docs/extensions.html
20 docs/faq.html
BIN docs/favicon.ico Normal file
BIN docs/favicon.png Normal file
323 docs/features.txt
2 docs/helpwanted.html
289 docs/index.html
48 docs/install.html
34 docs/llvmpipe.html
3 docs/mesa.css
336 docs/meson.html Normal file
31 docs/patents.txt
2 docs/precompiled.html
167 docs/release-calendar.html
114 docs/releasing.html
44 docs/relnotes.html
181 docs/relnotes/17.2.3.html Normal file
132 docs/relnotes/17.2.4.html Normal file
156 docs/relnotes/17.2.5.html Normal file
187 docs/relnotes/17.2.6.html Normal file
247 docs/relnotes/17.2.7.html Normal file
112 docs/relnotes/17.2.8.html Normal file
188 docs/relnotes/17.3.0.html
191 docs/relnotes/17.3.1.html Normal file
109 docs/relnotes/17.3.2.html Normal file
151 docs/relnotes/17.3.3.html Normal file
275 docs/relnotes/17.3.4.html Normal file
66 docs/relnotes/17.3.5.html Normal file
85 docs/relnotes/17.3.6.html Normal file
312 docs/relnotes/17.3.7.html Normal file
147 docs/relnotes/17.3.8.html Normal file
162 docs/relnotes/17.3.9.html Normal file
321 docs/relnotes/18.0.0.html Normal file
225 docs/relnotes/18.0.1.html Normal file
144 docs/relnotes/18.0.2.html Normal file
107 docs/relnotes/18.0.3.html Normal file
157 docs/relnotes/18.0.4.html Normal file
162 docs/relnotes/18.0.5.html Normal file
268 docs/relnotes/18.1.0.html Normal file
168 docs/relnotes/18.1.1.html Normal file
170 docs/relnotes/18.1.2.html Normal file
167 docs/relnotes/18.1.3.html Normal file
150 docs/relnotes/18.1.4.html Normal file
183 docs/relnotes/18.1.5.html Normal file
188 docs/relnotes/18.1.6.html Normal file
104 docs/relnotes/18.1.7.html Normal file

180 docs/relnotes/18.1.8.html Normal file