fran/mesa - mesa - GNLUG git store

Author	SHA1	Message	Date
George Kyriazis	c5d7b37fe7	swr: add x86 lowering pass to fragment shader Needed because some FP paths (namely stipple) use gather intrinsics that now need to be lowered to x86. v2: fix typo in commit message Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	9161c40d14	swr/rast: Enable generalized fetch jit Enable generalized fetch jit with 8 or 16 wide SIMD target. Still some work needed to remove some simd8 double pumping for 16-wide target. Also removed unused non-gather load vertices path. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	d73082b98b	swr/rast: Add builder_gfx_mem.{h|cpp} Abstract usage scenarios for memory accesses into builder_gfx_mem. Builder_gfx_mem will convert gfxptr_t from 64-bit int to regular pointer types for use by builder_mem. v2: reworded commit message; renamed enum more appropriately Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	1eb72673fc	swr/rast: Lower VGATHERPS and VGATHERPS_16 to x86. Some more work to do before we can support simultaneous 8-wide and 16-wide and remove the VGATHERPS_16 version. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	b15fb78df5	swr/rast: Cleanup of JitManager convenience types Small cleanup. Remove convenience types from JitManager and standardize on the Builder's convenience types. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	d68694016c	swr/rast: Lower PERMD and PERMPS to x86. Add support for providing an emulation callback function for arch/width combinations that don't map cleanly to an x86 intrinsic. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	8f848ada8a	swr/rast: Start refactoring of builder/packetizer. Move x86 intrinsic lowering to a separate pass. Builder now instantiates generic intrinsics for features not supported by llvm. The separate x86 lowering pass is responsible for lowering to valid x86 for the target SIMD architecture. Currently it's a port of existing code to get it up and running quickly. Will eventually support optimized x86 for AVX, AVX2 and AVX512. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	ffc0aeb4ec	swr/rast: Simplify #define usage in gen source file Removed preprocessor defines from structures passed to LLVM jitted code. The python scripts do not understand the preprocessor defines and ignore them. So for fields that are compiled out due to a preprocessor define, the LLVM script accounts for them anyway because it doesn't know what the defines are set to. The sanitize defines for open source are fine in that they're safely used. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	f36026ce2e	swr/rast: Move CallPrint() to a separate file Needed work for jit code debug. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	67c8bb4db7	swr/rast: Fix name mangling for LLVM pow intrinsic Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	7a5054aa1c	swr/rast: Add some archrast counters Hook up archrast counters for shader stats: instructions executed. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	f52a501716	swr/rast: Code cleanup Removing some code that doesn't seem to do anything meaningful. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	093c1aee88	swr/rast: Add "Num Instructions Executed" stats intrinsic. Added a SWR_SHADER_STATS structure which is passed to each shader. The stats pass will instrument the shader to populate this. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	5fbee5e4ef	swr/rast: Add MEM_ADD helper function to Builder. mem[offset] += value This function will be heavily used by all stats intrinsics. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	9103119cb3	swr/rast: Permute work for simd16 Fix slow permutes in PA tri lists under SIMD16 emulation on AVX Added missing permute (interlane, immediate) to SIMDLIB Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	4c69823d15	swr/rast: WIP builder rewrite (2) Finish up the remaining explicit intrinsic uses. At this point all explicit Intrinsic::getDeclaration() usage has been replaced with auto generated macros generated with gen_llvm_ir_macros.py. Going forward, make sure to only use the intrinsics here, adding new ones as needed. Next step is to remove all references to x86 intrinsics to keep the builder target-independent. Any x86 lowering will be handled by a separate pass. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	c2163dc56a	swr/rast: Add autogen of helper llvm intrinsics. Replace sqrt, maskload, fp min/max, cttz, ctlz with llvm equivalent. Replace AVX maskedstore intrinsic with LLVM intrinsic. Add helper llvm macros for stacksave, stackrestore, popcnt. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	6427315e43	swr/rast: WIP builder rewrite. Start removing avx2 macros for functionality that exists in llvm. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	a16f8e0554	swr/rast: LLVM 6 fix for getting masked gather intrinsic (also compatible with LLVM 4) Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	a92cc09c7a	swr/rast: Changes to allow jitter to compile with LLVM5 Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	0f6fef9632	swr/rast: Add some archrast stats Add stats for degenerate and backfacing primitive counts Wire archrast stats for alpha blend and alpha test. pass value to jitter, upon return have archrast event increment a value Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	b488028854	swr/rast: Silence some unused variable warnings Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	e84bfec4ab	swr/rast: Add debug type info for i128 Help support debug info in 16 wide shaders. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	a3edcfe1fb	swr/rast: Use blend context struct to pass params Stuff parameters into a blend context struct before passing down through the PFN_BLEND_JIT_FUNC function pointer. Needed for stat changes. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	be6cf0fd7c	swr/rast: Introduce JIT_MEM_CLIENT Add assert for correct usage of memory accesses v2: reworded commit message; renamed enum more appropriately Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
George Kyriazis	d34edffe48	swr/rast: Add some instructions to jitter VPHADDD, PMAXUD, PMINUD Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-04-18 10:51:38 -05:00
Juan A. Suarez Romero	4aa03581b5	docs: update calendar, add news and link release notes to 18.0.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-04-18 15:29:12 +00:00
Juan A. Suarez Romero	ad51d8871e	docs: add sha256 checksums for 18.0.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `a1c421c638`)	2018-04-18 15:25:32 +00:00
Juan A. Suarez Romero	76cadaa1de	docs: add release notes for 18.0.1 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `8bd719e3fa`)	2018-04-18 15:25:30 +00:00
Juan A. Suarez Romero	193d615917	docs: update calendar, add news and link release notes to 17.3.9 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2018-04-18 09:45:11 +00:00
Juan A. Suarez Romero	6372227209	docs: add sha256 checksums for 17.3.9 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `cf0864dc63`)	2018-04-18 09:40:44 +00:00
Juan A. Suarez Romero	6a1261bd09	docs: add release notes for 17.3.9 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> (cherry picked from commit `6d88ea9dd4`)	2018-04-18 09:40:42 +00:00
Dylan Baker	b9ad5282ba	Revert "meson: add wrap for libdrm" This reverts commit `6217eedc9b`. I was using this for testing and accidentally put it on master Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-17 13:48:55 -07:00
Dylan Baker	efcbcfa7c8	Revert "Add subprojects directory and git ignore" This reverts commit `21e2e73f71`. I was using this for testing and accidentally put it on master Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>	2018-04-17 13:48:43 -07:00
Jan Alexander Steffens (heftig)	5cf752b18b	meson: Version libMesaOpenCL like autotools does This is for parity with autotools. It names the library libMesaOpenCL.so.1.0.0 and points mesa.icd to the .1 symlink. opencl_version now matches configure.ac's OPENCL_VERSION. Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com> Tested-By: Aaron Watry <awatry@gmail.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com>	2018-04-17 13:46:15 -07:00
Jan Alexander Steffens (heftig)	5bb98cfd92	meson: Add library versions to swr drivers This is for parity with autotools. Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com> Acked-by: Dylan Baker <dylan@pnwbakers.com>	2018-04-17 13:46:15 -07:00
Dylan Baker	6217eedc9b	meson: add wrap for libdrm Currently this requires libdrm from git, since the version reported by meson is wrong.	2018-04-17 13:46:15 -07:00
Dylan Baker	21e2e73f71	Add subprojects directory and git ignore For meson wraps.	2018-04-17 13:46:15 -07:00
Samuel Pitoiset	893e19efb7	radv: fix scissor computation when using half-pixel viewport offset 'scale[i]' can be non-integer. Original patch by Philip Rebohle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106074 Fixes: `0f3de89a56` ("radv: Use the guard band.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-17 22:12:14 +02:00
Neil Roberts	608d70bc02	spirv: Accept doubles in FaceForward, Reflect and Refract The SPIR-V spec doesn’t specify a size requirement for these and the equivalent functions in the GLSL spec have explicit alternatives for doubles. Refract is a little bit more complicated due to the fact that the final argument is always supposed to be a scalar 32- or 16- bit float regardless of the other operands. However in practice it seems there is a bug in glslang that makes it convert the argument to 64-bit if you actually try to pass it a 32-bit value while the other arguments are 64-bit. This adds an optional conversion of the final argument in order to support any type. These have been tested against the automatically generated tests of glsl-4.00/execution/built-in-functions using the ARB_gl_spirv branch which tests it with quite a large range of combinations. The issue with glslang has been filed here: https://github.com/KhronosGroup/glslang/issues/1279 v2: Convert the eta operand of Refract from any size in order to make it eventually cope with 16-bit floats. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-17 20:58:11 +02:00
Neil Roberts	6e499572b9	spirv: Add a 64-bit implementation of OpIsInf The only change necessary is to change the type of the constant used to compare against. This has been tested against the arb_gpu_shader_fp64/execution/ fs-isinf-dvec tests using the ARB_gl_spirv branch. v2: Use nir_imm_floatN_t for the constant. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-17 20:58:06 +02:00
Neil Roberts	696f4abcbc	spirv: Use nir_imm_floatN_t for constants for GLSL450 builtins There is an existing macro that is used to choose between either a float or a double immediate constant based on the bit size of the first operand to the builtin. This is now changed to use the new nir_imm_floatN_t helper function to reduce the number of places that make this decision. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-17 20:58:03 +02:00
Neil Roberts	e7b2c125c3	nir/builder: Add a nir_imm_floatN_t helper This lets you easily build float immediates just given the bit size. If we have this single place here to handle this then it will be easier to add support for 16-bit floats later. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-17 20:57:36 +02:00
Timothy Arceri	6e22ad6edc	nir: return early when lowering a return at the end of a function Otherwise we create unused conditional return flags and things get unnecessarily ugly fast when lowering nested functions. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-17 14:17:56 +10:00
Timothy Arceri	d3cafc18fc	mesa: merge the driver functions DrawBuffers and DrawBuffer The extra params were unused by the drivers that used DrawBuffers. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Brian Paul <brianp@vmware.com>	2018-04-17 14:17:48 +10:00
Marc Dietrich	268d8f244b	glsl: fix gcc 8 parenthesis warning fixes warnings like this: [184/1137] Compiling C++ object 'src/compiler/glsl/glsl@sta/lower_jumps.cpp.o'. In file included from ../src/mesa/main/mtypes.h:48, from ../src/compiler/glsl_types.h:149, from ../src/compiler/glsl/lower_jumps.cpp:59: ../src/compiler/glsl/lower_jumps.cpp: In member function '{anonymous}::block_record {anonymous}::ir_lower_jumps_visitor::visit_block(exec_list)': ../src/compiler/glsl/list.h:650:17: warning: unnecessary parentheses in declaration of 'node' [-Wparentheses] for (__type (__inst) = (__type *)(__list)->head_sentinel.next; \ ^ ../src/compiler/glsl/lower_jumps.cpp:510:7: note: in expansion of macro 'foreach_in_list' foreach_in_list(ir_instruction, node, list) { ^~~~~~~~~~~~~~~ Signed-off-by: Marc Dietrich <marvin24@gmx.de> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-17 11:53:59 +10:00
Rob Clark	2a55344e7d	compiler: int8/uint8 fixes A couple spots were missed for handling of the new INT8/UINT8 base type. Also de-duplicate get_base_type().. get_scalar_type() had nearly the same switch statement, with the exception that anything with base_type that was not scalar would return error_type. So just handle that one special case in get_scalar_type(). Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-04-16 20:41:18 -04:00
Marek Olšák	60299e9abe	radeonsi: don't emit partial flushes for internal CS flushes only Tested-by: Benedikt Schemmer <ben@besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-16 16:58:10 -04:00
Marek Olšák	692f550740	winsys/amdgpu: always set AMDGPU_IB_FLAG_TC_WB_NOT_INVALIDATE There is a kernel patch that adds the new flag. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Benedikt Schemmer <ben@besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-16 16:58:10 -04:00
Marek Olšák	1b3199d14d	radeonsi: implement mechanism for IBs without partial flushes at the end (v6) (This patch doesn't enable the behavior. It will be enabled in a later commit.) Draw calls from multiple IBs can be executed in parallel. v2: do emit partial flushes on SI v3: invalidate all shader caches at the beginning of IBs v4: don't call si_emit_cache_flush in si_flush_gfx_cs if not needed, only do this for flushes invoked internally v5: empty IBs should wait for idle if the flush requires it v6: split the commit If we artificially limit the number of draw calls per IB to 5, we'll get a lot more IBs, leading to a lot more partial flushes. Let's see how the removal of partial flushes changes GPU utilization in that scenario: With partial flushes (time busy): CP: 99% SPI: 86% CB: 73% Without partial flushes (time busy): CP: 99% SPI: 93% CB: 81% Tested-by: Benedikt Schemmer <ben@besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-16 16:58:10 -04:00

102190 Commits