sneedmca/suyu - sneedGit: git for sneed group

mirror of https://git.suyu.dev/suyu/suyu synced 2024-11-06 07:17:53 +00:00

Author	SHA1	Message	Date
bunnei	41682e0888	Merge pull request #3815 from FernandoS27/command-list-2 GPU: More optimizations to GPU Command List Processing and DMA Copy Optimizations	2020-05-05 17:12:42 -04:00
bunnei	2aff0b4733	Merge pull request #3808 from ReinUsesLisp/wait-for-idle {maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers	2020-05-03 02:43:18 -04:00
bunnei	bf3f030a0d	Merge pull request #3807 from ReinUsesLisp/fix-depth-clamp maxwell_3d: Fix depth clamping register	2020-04-30 13:07:31 -04:00
bunnei	c7b5a87c90	Merge pull request #3799 from ReinUsesLisp/iadd-cc shader: Implement P2R CC, IADD Rd.CC and IADD.X	2020-04-30 12:56:36 -04:00
Fernando Sahmkow	9df67b2095	Clang Format and Documentation.	2020-04-28 14:02:51 -04:00
Fernando Sahmkow	37c690576f	MaxwellDMA: Optimize micro copies.	2020-04-28 13:44:14 -04:00
ReinUsesLisp	fe931ac976	{maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers Drop MemoryBarrier from the buffer cache and use Maxwell3D's register WaitForIdle. To implement this on OpenGL we just call glMemoryBarrier with the necessary bits. Vulkan lacks this synchronization primitive, so we set an event and immediately wait for it. This is not a pretty solution, but it's what Vulkan can do without submitting the current command buffer to the queue (which ends up being more expensive on the CPU).	2020-04-28 02:18:12 -03:00
Fernando Sahmkow	90e5694230	VideoCore/Engines: Refactor Engines CallMethod.	2020-04-27 21:47:58 -04:00
ReinUsesLisp	bb1ed66d99	maxwell_3d: Fix depth clamping register Using deko3d as reference: `4e47ba0013/source/maxwell/gpu_3d_state.cpp (L42)` We were using bits 3 and 4 to determine depth clamping, but these are the same whether clamping is enabled or disabled: state->depthClampEnable ? 0x101A : 0x181D The same happens on Nvidia's OpenGL driver, where they do something like this (default capabilities, GL 4.5 compatibility): (state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c There's always a difference between the first bits in this register, but bit 11 is consistently clear on both deko3d/NVN and OpenGL when depth clamping is enabled. This commit changes yuzu's behaviour to use bit 11 to determine depth clamping. - Fixes depth issues on Super Mario Odyssey's intro.	2020-04-27 20:50:14 -03:00
bunnei	6c7d8073be	Merge pull request #3742 from FernandoS27/command-list Optimize GPU Command Lists and Introduce Fast GPU Time Option	2020-04-27 00:18:46 -04:00
Rodrigo Locatti	7e38dd580f	Merge pull request #3753 from ReinUsesLisp/ac-vulkan {gl,vk}_rasterizer: Add lazy default buffer maker and use it for empty buffers	2020-04-26 01:55:43 -03:00
ReinUsesLisp	c788f9c0bd	shader/arithmetic_integer: Implement IADD.X IADD.X takes the carry flag and adds it to the result. This is generally used to emulate 64-bit operations with 32-bit registers.	2020-04-25 22:56:11 -03:00
bunnei	4e37825dab	Merge pull request #3734 from ReinUsesLisp/half-float-mods decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits	2020-04-25 00:41:43 -04:00
Markus Wick	e717a1df20	Fix -Wdeprecated-copy warning.	2020-04-24 09:33:04 +02:00
ReinUsesLisp	dbaebd8582	decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits The encoding for negation and absolute value was wrong. Extracting is now done manually. Similar instructions having different encodings is the rule, not the exception. To keep sanity and readability I preferred to extract the desired bit manually. This is implemented against nxas: `8dbc389957/table.h (L68)` That is itself tested against nvdisasm (Nvidia's official disassembler).	2020-04-23 18:29:38 -03:00
Fernando Sahmkow	5c9feaebb6	Clang Format.	2020-04-23 08:52:58 -04:00
Fernando Sahmkow	18a88d19dc	Maxwell3D: Process Macros on MultiMethod.	2020-04-23 08:52:56 -04:00
Fernando Sahmkow	3fedcc2f6e	DMAPusher: Propagate multimethod writes into the engines.	2020-04-23 08:52:55 -04:00
bunnei	2409fedacf	Merge pull request #3697 from lioncash/declarations CMakeLists: Enable -Wmissing-declarations on Linux builds	2020-04-23 02:18:52 -04:00
Fernando Sahmkow	1b3be8a8f8	MaxwellDMA: Correct copying on accuracy level.	2020-04-22 11:36:25 -04:00
Fernando Sahmkow	b7bc3c2549	FenceManager: Manage syncpoints and rename fences to semaphores.	2020-04-22 11:36:16 -04:00
Fernando Sahmkow	4adfc9bb08	Rasterizer: Document SignalFence & ReleaseFences and setup skeletons on Vulkan.	2020-04-22 11:36:14 -04:00
Fernando Sahmkow	a081a7c855	GPU: Fix rebase errors.	2020-04-22 11:36:13 -04:00
Fernando Sahmkow	487379c593	OpenGL: Implement Fencing backend.	2020-04-22 11:36:10 -04:00
Fernando Sahmkow	339d0d9d6c	GPU: Delay Fences.	2020-04-22 11:36:08 -04:00
Fernando Sahmkow	da8f17715d	GPU: Refactor synchronization on Async GPU	2020-04-22 11:36:06 -04:00
Fernando Sahmkow	084ceb925a	UI: Replace accurate GPU option for GPU Accuracy Level	2020-04-22 11:36:04 -04:00
ReinUsesLisp	0bbae63300	gl_rasterizer: Fix buffers without size On NVN buffers can be enabled but have no size. According to deko3d and the behavior we see in Animal Crossing: New Horizons these buffers get the special address of 0x1000 and limit themselves to 0xfff. Implement buffers without a size by binding a null buffer to OpenGL without a size. `1d1930beea/source/maxwell/gpu_3d_vbo.cpp (L62-L63)`	2020-04-21 19:55:44 -03:00
Rodrigo Locatti	f293b15611	Merge pull request #3718 from ReinUsesLisp/better-pipeline-state fixed_pipeline_state: Pack structure, use memcmp and CityHash on it	2020-04-21 18:17:58 -03:00
bunnei	d3e0cefa60	Merge pull request #3695 from ReinUsesLisp/default-attributes maxwell_3d: Initialize format attributes constant as one	2020-04-20 21:40:18 -04:00
ReinUsesLisp	ab6704f20c	fixed_pipeline_state: Pack attribute state Reduce FixedPipelineState's size from 1384 to 664 bytes	2020-04-18 19:21:19 -03:00
Lioncash	e2d8be1ca2	General: Resolve warnings related to missing declarations	2020-04-16 23:43:34 -04:00
ReinUsesLisp	238c6016f9	maxwell_3d: Initialize format attributes constant as one nouveau expects this to be true but it doesn't set it.	2020-04-16 21:15:07 -03:00
Lioncash	1c340c6efa	CMakeLists: Specify -Wextra on linux builds Allows reporting more cases where logic errors may exist, such as implicit fallthrough cases, etc. We currently ignore unused parameters, since we have many cases where this is intentional (virtual interfaces). While we're at it, we can also tidy up any existing code that causes warnings. This also uncovered a few bugs.	2020-04-15 21:33:46 -04:00
Fernando Sahmkow	e33196d4e7	Merge pull request #3612 from ReinUsesLisp/red shader/memory: Implement RED.E.ADD and minor changes to ATOM	2020-04-15 15:03:49 -04:00
Mat M	64b5985f0a	Merge pull request #3662 from ReinUsesLisp/constant-attrs gl_rasterizer: Implement constant vertex attributes	2020-04-15 11:54:50 -04:00
ReinUsesLisp	fefe7f18f9	shader/arithmetic: Add FCMP_CR variant Adds another variant of FCMP.	2020-04-14 19:11:04 -03:00
ReinUsesLisp	6dfcabc800	gl_rasterizer: Implement constant vertex attributes Credits go to gdkchan from Ryujinx for finding that constant attributes are used in retail games.	2020-04-14 17:58:53 -03:00
ReinUsesLisp	76615b9f34	gl_rasterizer: Implement line widths and smooth lines Implements "legacy" features from OpenGL present on hardware such as smooth lines and line width.	2020-04-13 01:30:34 -03:00
Fernando Sahmkow	3d91dbb21d	Merge pull request #3578 from ReinUsesLisp/vmnmx shader/video: Partially implement VMNMX	2020-04-12 10:44:03 -04:00
ReinUsesLisp	76f178ba6e	shader/video: Partially implement VMNMX Implements the common usages for VMNMX. Inputs with a size other than 32 bits are not supported, and sign mismatches aren't supported either. VMNMX works as follows: It grabs Ra and Rb and applies a maximum/minimum on them (this is defined by .MX), taking the input sign into account. This result can then be saturated. After the intermediate result is calculated, it applies another operation on it using Rc. These operations are merges, accumulations or another min/max pass. This instruction provides a more flexible way to implement GCN's min3 and max3 instructions (for instance).	2020-04-12 00:34:42 -03:00
ReinUsesLisp	a7baf6fee4	video_core: Add MSAA registers in 3D engine and TIC This adds the registers used for multisampling. It doesn't implement anything for now.	2020-04-12 00:21:27 -03:00
bunnei	b96fd0bd0e	Merge pull request #3601 from ReinUsesLisp/some-shader-encodings video_core/shader: Add some instruction and S2R encodings	2020-04-09 00:17:39 -04:00
ReinUsesLisp	3185245845	shader/memory: Implement RED.E.ADD Implements a reduction operation. It's an atomic operation that doesn't return a value. This commit introduces another primitive because some shading languages might have a primitive for reduction operations.	2020-04-06 02:24:47 -03:00
ReinUsesLisp	8b719e9e1d	shader_bytecode: Rename MOV_SYS to S2R	2020-04-04 03:37:51 -03:00
ReinUsesLisp	9d15feb892	shader_bytecode: Add encoding for BAR	2020-04-04 03:36:21 -03:00
ReinUsesLisp	c02a2dc24a	shader_bytecode: Add encoding for VOTE.VTG	2020-04-04 03:28:11 -03:00
ReinUsesLisp	2339fe199f	shader_decompiler: Remove FragCoord.w hack and change IPA implementation Credits go to gdkchan and Ryujinx. The pull request used for this can be found here: https://github.com/Ryujinx/Ryujinx/pull/1082 yuzu was already using the header for interpolation, but it was missing the FragCoord.w multiplication described in the linked pull request. This commit finally removes the FragCoord.w == 1.0f hack from the shader decompiler. While we are at it, this commit renames some enumerations to match Nvidia's documentation (linked below) and fixes component declaration order in the shader program header (z and w were swapped). https://github.com/NVIDIA/open-gpu-doc/blob/master/Shader-Program-Header/Shader-Program-Header.html	2020-04-01 21:48:55 -03:00
namkazy	c8f6d9effd	shader_decode: merge GlobalAtomicOp to AtomicOp	2020-03-30 18:47:00 +07:00
ReinUsesLisp	08470d261d	shader_bytecode: Fix I2I_IMM encoding	2020-03-28 18:49:07 -03:00
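
The depth clamping fix in bb1ed66d99 above rests on a small piece of bit arithmetic that can be checked directly against the four register values quoted in the commit message. The following is a minimal sketch and not yuzu's actual code: the helper names `IsDepthClampEnabled` and `LegacyBits` are hypothetical, and the only data used are the constants from the commit message.

```cpp
#include <cstdint>

// Hypothetical helper (not yuzu's real API): treat depth clamping as enabled
// when bit 11 of the register value is clear, as the commit message describes.
constexpr bool IsDepthClampEnabled(std::uint32_t reg) {
    return (reg & (1u << 11)) == 0;
}

// Hypothetical helper: extracts bits 3 and 4, the bits the commit message
// says were checked before the fix.
constexpr std::uint32_t LegacyBits(std::uint32_t reg) {
    return (reg >> 3) & 0x3u;
}

// Values quoted from deko3d: depthClampEnable ? 0x101A : 0x181D
static_assert(IsDepthClampEnabled(0x101A), "clamp enabled, bit 11 clear");
static_assert(!IsDepthClampEnabled(0x181D), "clamp disabled, bit 11 set");

// Values quoted for Nvidia's OpenGL driver: clamp ? 0x201a : 0x281c
static_assert(IsDepthClampEnabled(0x201A), "clamp enabled, bit 11 clear");
static_assert(!IsDepthClampEnabled(0x281C), "clamp disabled, bit 11 set");

// Bits 3 and 4 are identical in the enabled and disabled values, so they
// cannot distinguish the two states.
static_assert(LegacyBits(0x101A) == LegacyBits(0x181D), "deko3d values");
static_assert(LegacyBits(0x201A) == LegacyBits(0x281C), "OpenGL values");
```

The `static_assert` lines compile only if bit 11 separates the enabled and disabled values in both pairs while bits 3 and 4 do not, which matches the reasoning given in the commit message.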


743 commits
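
Commit c788f9c0bd in the list above notes that IADD.X adds the carry flag to its result and is generally used to emulate 64-bit operations with 32-bit registers. The idea can be sketched in plain C++ as follows. This is an illustration under stated assumptions, not the shader decompiler's implementation: the `U32Pair` type and `Add64With32` function are hypothetical names, and pairing the low-half add with a carry-producing instruction such as IADD Rd.CC is an assumption drawn from the title of pull request #3799.

```cpp
#include <cstdint>

// Hypothetical type: a 64-bit value held in two 32-bit "registers".
struct U32Pair {
    std::uint32_t lo;
    std::uint32_t hi;
};

// Hypothetical function: adds two 64-bit values using only 32-bit arithmetic.
constexpr U32Pair Add64With32(U32Pair a, U32Pair b) {
    // Low halves: unsigned addition wraps modulo 2^32. The carry out is 1
    // exactly when the wrapped sum is smaller than an operand (the role
    // assumed here for an add that writes the carry flag).
    const std::uint32_t lo = a.lo + b.lo;
    const std::uint32_t carry = lo < a.lo ? 1u : 0u;

    // High halves: add the operands plus the carry from the low halves
    // (the role the commit message describes for IADD.X).
    const std::uint32_t hi = a.hi + b.hi + carry;
    return U32Pair{lo, hi};
}

// 0x00000000'FFFFFFFF + 0x00000000'00000001 = 0x00000001'00000000
static_assert(Add64With32({0xFFFFFFFFu, 0u}, {1u, 0u}).lo == 0u, "low half wraps");
static_assert(Add64With32({0xFFFFFFFFu, 0u}, {1u, 0u}).hi == 1u, "carry propagates");
```

Without the carry term the high half of the example would stay at 0, so the sum would be off by 2^32.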