sneedmca/suyu - sneedGit: git for sneed group

mirror of https://git.suyu.dev/suyu/suyu synced 2025-01-09 16:03:21 +00:00

Author	SHA1	Message	Date
ReinUsesLisp	2a9d17b7e7	maxwell_dma: Rename registers to match official docs and reorder Rename registers in the MaxwellDMA class to match Nvidia's official documentation. This one can be found here: https://github.com/NVIDIA/open-gpu-doc/blob/master/classes/dma-copy/clb0b5.h While we are at it, reorganize the code in MaxwellDMA to be separated in different functions.	2020-07-07 19:19:33 -03:00
bunnei	efef7b1517	Merge pull request #4147 from ReinUsesLisp/hset2-imm shader/half_set: Implement HSET2_IMM	2020-06-26 23:14:56 -04:00
David Marcec	fabdf5d385	Addressed issues	2020-06-24 12:09:03 +10:00
David Marcec	6ce5f3120b	Macro HLE support	2020-06-24 12:09:01 +10:00
ReinUsesLisp	39ab33ee1c	shader/half_set: Implement HSET2_IMM Add HSET2_IMM. Due to the complexity of the encoding avoid using BitField unions and read the relevant bits from the code itself. This is less error prone.	2020-06-22 20:51:18 -03:00
bunnei	c2ea1e1bcb	Merge pull request #4049 from ReinUsesLisp/separate-samplers shader/texture: Join separate image and sampler pairs offline	2020-06-13 13:48:27 -04:00
ReinUsesLisp	c95c254f3e	texture_cache: Implement rendering to 3D textures This allows rendering to 3D textures with more than one slice. Applications are allowed to render to more than one slice of a texture using gl_Layer from a VTG shader. This also requires reworking how 3D texture collisions are handled, for now, this commit allows rendering to slices but not to miplevels. When a render target attempts to write to a mipmap, we fallback to the previous implementation (copying or flushing as needed). - Fixes color correction 3D textures on UE4 games (rainbow effects). - Allows Xenoblade games to render to 3D textures directly.	2020-06-08 05:01:00 -03:00
ReinUsesLisp	5b2b6d594c	shader/texture: Join separate image and sampler pairs offline Games using D3D idioms can join images and samplers when a shader executes, instead of baking them into a combined sampler image. This is also possible on Vulkan. One approach to this solution would be to use separate samplers on Vulkan and leave this unimplemented on OpenGL, but we can't do this because there's no consistent way of determining which constant buffer holds a sampler and which one an image. We could in theory find the first bit and if it's in the TIC area, it's an image; but this falls apart when an image or sampler handle use an index of zero. The used approach is to track for a LOP.OR operation (this is done at an IR level, not at an ISA level), track again the constant buffers used as source and store this pair. Then, outside of shader execution, join the sample and image pair with a bitwise or operation. This approach won't work on games that truly use separate samplers in a meaningful way. For example, pooling textures in a 2D array and determining at runtime what sampler to use. This invalidates OpenGL's disk shader cache :) - Used mostly by D3D ports to Switch	2020-06-05 00:24:51 -03:00
bunnei	34d4abc4f9	Merge pull request #4009 from ogniK5377/macro-jit-prod video_core: Implement Macro JIT	2020-06-04 11:40:52 -04:00
David Marcec	eca3d16e54	Default init labels and use initializer list for macro engine	2020-06-04 22:23:07 +10:00
David Marcec	411f5527d4	Mark parameters as const	2020-06-03 16:33:38 +10:00
David Marcec	3a20e74f40	Pass by reference instead of copying parameters	2020-06-02 16:37:06 +10:00
bunnei	bb6d93630f	Merge pull request #3998 from ReinUsesLisp/init-3d maxwell_3d: Initialize more registers to their expected value	2020-06-01 16:11:56 -04:00
David Marcec	b032ebdfee	Implement macro JIT	2020-05-30 11:40:04 +10:00
ReinUsesLisp	9b06e823ee	maxwell_3d: Reduce severity of logs that can be spammed These logs were killing performance on some games when they were spammed. Reduce them to Debug severity.	2020-05-28 18:23:25 -03:00
ReinUsesLisp	f3f056c3b6	maxwell_3d: Initialize line widths Initialize line widths to avoid setting a line width of zero.	2020-05-27 16:53:43 -03:00
ReinUsesLisp	31eb658fea	maxwell_3d: Initialize polygon modes NVN expects this to be initialized as Fill, otherwise games that never bind a rasterizer state will log an invalid polygon mode.	2020-05-27 16:52:52 -03:00
bunnei	b1a1bd12ca	Merge pull request #3899 from ReinUsesLisp/float-comparisons shader_ir: Add separate instructions for ordered and unordered comparisons and fix NE on GLSL	2020-05-13 09:51:14 -04:00
ReinUsesLisp	4e57f9d5cf	shader_ir: Separate float-point comparisons in ordered and unordered This allows us to use native SPIR-V instructions without having to manually check for NAN.	2020-05-09 04:55:15 -03:00
bunnei	50c27d5ae1	Merge pull request #3885 from ReinUsesLisp/viewport-swizzles video_core: Implement viewport swizzles with NV_viewport_swizzle	2020-05-08 15:16:53 -04:00
bunnei	41682e0888	Merge pull request #3815 from FernandoS27/command-list-2 GPU: More optimizations to GPU Command List Processing and DMA Copy Optimizations	2020-05-05 17:12:42 -04:00
ReinUsesLisp	2dbf5290f2	vk_graphics_pipeline: Implement viewport swizzles with NV_viewport_swizzle	2020-05-04 18:31:17 -03:00
ReinUsesLisp	9b8e962368	maxwell_3d: Add viewport swizzles	2020-05-04 17:50:59 -03:00
bunnei	2aff0b4733	Merge pull request #3808 from ReinUsesLisp/wait-for-idle {maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers	2020-05-03 02:43:18 -04:00
bunnei	bf3f030a0d	Merge pull request #3807 from ReinUsesLisp/fix-depth-clamp maxwell_3d: Fix depth clamping register	2020-04-30 13:07:31 -04:00
bunnei	c7b5a87c90	Merge pull request #3799 from ReinUsesLisp/iadd-cc shader: Implement P2R CC, IADD Rd.CC and IADD.X	2020-04-30 12:56:36 -04:00
Fernando Sahmkow	9df67b2095	Clang Format and Documentation.	2020-04-28 14:02:51 -04:00
Fernando Sahmkow	37c690576f	MaxwellDMA: Optimize micro copies.	2020-04-28 13:44:14 -04:00
ReinUsesLisp	fe931ac976	{maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers Drop MemoryBarrier from the buffer cache and use Maxwell3D's register WaitForIdle. To implement this on OpenGL we just call glMemoryBarrier with the necessary bits. Vulkan lacks this synchronization primitive, so we set an event and immediately wait for it. This is not a pretty solution, but it's what Vulkan can do without submitting the current command buffer to the queue (which ends up being more expensive on the CPU).	2020-04-28 02:18:12 -03:00
Fernando Sahmkow	90e5694230	VideoCore/Engines: Refactor Engines CallMethod.	2020-04-27 21:47:58 -04:00
ReinUsesLisp	bb1ed66d99	maxwell_3d: Fix depth clamping register Using deko3d as reference: `4e47ba0013/source/maxwell/gpu_3d_state.cpp (L42)` We were using bits 3 and 4 to determine depth clamping, but these are the same both enabled and disabled: state->depthClampEnable ? 0x101A : 0x181D The same happens on Nvidia's OpenGL driver, where they do something like this (default capabilities, GL 4.5 compatibility): (state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c There's always a difference between the first bits in this register, but bit 11 is consistently disabled on both deko3d/NVN and OpenGL. This commit changes yuzu's behaviour to use bit 11 to determine depth clamping. - Fixes depth issues on Super Mario Odyssey's intro.	2020-04-27 20:50:14 -03:00
bunnei	6c7d8073be	Merge pull request #3742 from FernandoS27/command-list Optimize GPU Command Lists and Introduce Fast GPU Time Option	2020-04-27 00:18:46 -04:00
Rodrigo Locatti	7e38dd580f	Merge pull request #3753 from ReinUsesLisp/ac-vulkan {gl,vk}_rasterizer: Add lazy default buffer maker and use it for empty buffers	2020-04-26 01:55:43 -03:00
ReinUsesLisp	c788f9c0bd	shader/arithmetic_integer: Implement IADD.X IADD.X takes the carry flag and adds it to the result. This is generally used to emulate 64-bit operations with 32-bit registers.	2020-04-25 22:56:11 -03:00
bunnei	4e37825dab	Merge pull request #3734 from ReinUsesLisp/half-float-mods decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits	2020-04-25 00:41:43 -04:00
Markus Wick	e717a1df20	Fix -Wdeprecated-copy warning.	2020-04-24 09:33:04 +02:00
ReinUsesLisp	dbaebd8582	decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits The encoding for negation and absolute value was wrong. Extracting is now done manually. Similar instructions having different encodings is the rule, not the exception. To keep sanity and readability I preferred to extract the desired bit manually. This is implemented against nxas: `8dbc389957/table.h (L68)` That is itself tested against nvdisasm (Nvidia's official disassembler).	2020-04-23 18:29:38 -03:00
Fernando Sahmkow	5c9feaebb6	Clang Format.	2020-04-23 08:52:58 -04:00
Fernando Sahmkow	18a88d19dc	Maxwell3D: Process Macros on MultiMethod.	2020-04-23 08:52:56 -04:00
Fernando Sahmkow	3fedcc2f6e	DMAPusher: Propagate multimethod writes into the engines.	2020-04-23 08:52:55 -04:00
bunnei	2409fedacf	Merge pull request #3697 from lioncash/declarations CMakeLists: Enable -Wmissing-declarations on Linux builds	2020-04-23 02:18:52 -04:00
Fernando Sahmkow	1b3be8a8f8	MaxwellDMA: Correct copying on accuracy level.	2020-04-22 11:36:25 -04:00
Fernando Sahmkow	b7bc3c2549	FenceManager: Manage syncpoints and rename fences to semaphores.	2020-04-22 11:36:16 -04:00
Fernando Sahmkow	4adfc9bb08	Rasterizer: Document SignalFence & ReleaseFences and setup skeletons on Vulkan.	2020-04-22 11:36:14 -04:00
Fernando Sahmkow	a081a7c855	GPU: Fix rebase errors.	2020-04-22 11:36:13 -04:00
Fernando Sahmkow	487379c593	OpenGL: Implement Fencing backend.	2020-04-22 11:36:10 -04:00
Fernando Sahmkow	339d0d9d6c	GPU: Delay Fences.	2020-04-22 11:36:08 -04:00
Fernando Sahmkow	da8f17715d	GPU: Refactor synchronization on Async GPU	2020-04-22 11:36:06 -04:00
Fernando Sahmkow	084ceb925a	UI: Replasce accurate GPU option for GPU Accuracy Level	2020-04-22 11:36:04 -04:00
ReinUsesLisp	0bbae63300	gl_rasterizer: Fix buffers without size On NVN buffers can be enabled but have no size. According to deko3d and the behavior we see in Animal Crossing: New Horizons these buffers get the special address of 0x1000 and limit themselves to 0xfff. Implement buffers without a size by binding a null buffer to OpenGL without a side. `1d1930beea/source/maxwell/gpu_3d_vbo.cpp (L62-L63)`	2020-04-21 19:55:44 -03:00
Rodrigo Locatti	f293b15611	Merge pull request #3718 from ReinUsesLisp/better-pipeline-state fixed_pipeline_state: Pack structure, use memcmp and CityHash on it	2020-04-21 18:17:58 -03:00
bunnei	d3e0cefa60	Merge pull request #3695 from ReinUsesLisp/default-attributes maxwell_3d: Initialize format attributes constant as one	2020-04-20 21:40:18 -04:00
ReinUsesLisp	ab6704f20c	fixed_pipeline_state: Pack attribute state Reduce FixedPipelineState's size from 1384 to 664 bytes	2020-04-18 19:21:19 -03:00
Lioncash	e2d8be1ca2	General: Resolve warnings related to missing declarations	2020-04-16 23:43:34 -04:00
ReinUsesLisp	238c6016f9	maxwell_3d: Initialize format attributes constant as one nouveau expects this to be true but it doesn't set it.	2020-04-16 21:15:07 -03:00
Lioncash	1c340c6efa	CMakeLists: Specify -Wextra on linux builds Allows reporting more cases where logic errors may exist, such as implicit fallthrough cases, etc. We currently ignore unused parameters, since we currently have many cases where this is intentional (virtual interfaces). While we're at it, we can also tidy up any existing code that causes warnings. This also uncovered a few bugs as well.	2020-04-15 21:33:46 -04:00
Fernando Sahmkow	e33196d4e7	Merge pull request #3612 from ReinUsesLisp/red shader/memory: Implement RED.E.ADD and minor changes to ATOM	2020-04-15 15:03:49 -04:00
Mat M	64b5985f0a	Merge pull request #3662 from ReinUsesLisp/constant-attrs gl_rasterizer: Implement constant vertex attributes	2020-04-15 11:54:50 -04:00
ReinUsesLisp	fefe7f18f9	shader/arithmetic: Add FCMP_CR variant Adds another variant of FCMP.	2020-04-14 19:11:04 -03:00
ReinUsesLisp	6dfcabc800	gl_rasterizer: Implement constant vertex attributes Credits go to gdkchan from Ryujinx for finding constant attributes are used in retail games.	2020-04-14 17:58:53 -03:00
ReinUsesLisp	76615b9f34	gl_rasterizer: Implement line widths and smooth lines Implements "legacy" features from OpenGL present on hardware such as smooth lines and line width.	2020-04-13 01:30:34 -03:00
Fernando Sahmkow	3d91dbb21d	Merge pull request #3578 from ReinUsesLisp/vmnmx shader/video: Partially implement VMNMX	2020-04-12 10:44:03 -04:00
ReinUsesLisp	76f178ba6e	shader/video: Partially implement VMNMX Implements the common usages for VMNMX. Inputs with a different size than 32 bits are not supported and sign mismatches aren't supported either. VMNMX works as follows: It grabs Ra and Rb and applies a maximum/minimum on them (this is defined by .MX), having in mind the input sign. This result can then be saturated. After the intermediate result is calculated, it applies another operation on it using Rc. These operations are merges, accumulations or another min/max pass. This instruction allows to implement with a more flexible approach GCN's min3 and max3 instructions (for instance).	2020-04-12 00:34:42 -03:00
ReinUsesLisp	a7baf6fee4	video_core: Add MSAA registers in 3D engine and TIC This adds the registers used for multisampling. It doesn't implement anything for now.	2020-04-12 00:21:27 -03:00
bunnei	b96fd0bd0e	Merge pull request #3601 from ReinUsesLisp/some-shader-encodings video_core/shader: Add some instruction and S2R encodings	2020-04-09 00:17:39 -04:00
ReinUsesLisp	3185245845	shader/memory: Implement RED.E.ADD Implements a reduction operation. It's an atomic operation that doesn't return a value. This commit introduces another primitive because some shading languages might have a primitive for reduction operations.	2020-04-06 02:24:47 -03:00
ReinUsesLisp	8b719e9e1d	shader_bytecode: Rename MOV_SYS to S2R	2020-04-04 03:37:51 -03:00
ReinUsesLisp	9d15feb892	shader_bytecode: Add encoding for BAR	2020-04-04 03:36:21 -03:00
ReinUsesLisp	c02a2dc24a	shader_bytecode: Add encoding for VOTE.VTG	2020-04-04 03:28:11 -03:00
ReinUsesLisp	2339fe199f	shader_decompiler: Remove FragCoord.w hack and change IPA implementation Credits go to gdkchan and Ryujinx. The pull request used for this can be found here: https://github.com/Ryujinx/Ryujinx/pull/1082 yuzu was already using the header for interpolation, but it was missing the FragCoord.w multiplication described in the linked pull request. This commit finally removes the FragCoord.w == 1.0f hack from the shader decompiler. While we are at it, this commit renames some enumerations to match Nvidia's documentation (linked below) and fixes component declaration order in the shader program header (z and w were swapped). https://github.com/NVIDIA/open-gpu-doc/blob/master/Shader-Program-Header/Shader-Program-Header.html	2020-04-01 21:48:55 -03:00
namkazy	c8f6d9effd	shader_decode: merge GlobalAtomicOp to AtomicOp	2020-03-30 18:47:00 +07:00
ReinUsesLisp	08470d261d	shader_bytecode: Fix I2I_IMM encoding	2020-03-28 18:49:07 -03:00
ReinUsesLisp	cedbe925cd	engines/const_buffer_engine_interface: Store image format type This information is required to properly implement SULD.B. It might also be handy for all image operations, since it would allow us to implement them on devices that require the image format to be specified (on desktop, this would be AMD on OpenGL and Intel on OpenGL and Vulkan).	2020-03-27 00:36:22 -03:00
bunnei	e6aff11057	Merge pull request #3520 from ReinUsesLisp/legacy-varyings gl_shader_decompiler: Implement legacy varyings	2020-03-25 19:27:51 -04:00
namkazy	fc37672f26	apply replay logic to all writes. remove replay from MacroInterpreter::Send (@fincs)	2020-03-22 22:25:44 +07:00
namkazy	f66743cd0c	maxwell_3d: change declaration order	2020-03-22 13:41:16 +07:00
namkazy	d4e93cf38c	maxwell_3d: init shadow_state	2020-03-22 13:35:11 +07:00
namkazy	22f4268c2f	maxwell_3d: this seem more correct.	2020-03-22 12:02:54 +07:00
namkazy	7051dc1902	maxwell_3d: update comments for shadow ram usage	2020-03-22 11:35:26 +07:00
Nguyen Dac Nam	63c2635e6f	maxwell_3d: track shadow ram ctrl and hw reg value	2020-03-22 10:53:41 +07:00
Nguyen Dac Nam	dbfbe352e0	maxwell_3d: implement MME shadow RAM	2020-03-22 10:53:35 +07:00
ReinUsesLisp	9f46066bda	kepler_compute: Remove unused variables	2020-03-18 20:03:19 -03:00
Rodrigo Locatti	ddafc99776	Merge pull request #3502 from namkazt/patch-3 shader_decode: Reimplement BFE instructions	2020-03-15 21:23:04 -03:00
ReinUsesLisp	6442e02c5d	shader/shader_ir: Track usage in input attribute and of legacy varyings	2020-03-15 21:01:52 -03:00
ReinUsesLisp	afebdda203	maxwell_3d: Add padding words to XFB entries Use INSERT_UNION_PADDING_WORDS instead of alignas to ensure a size requirement.	2020-03-13 18:33:05 -03:00
ReinUsesLisp	8e9f23f393	gl_rasterizer: Implement transform feedback bindings	2020-03-13 18:33:04 -03:00
Rodrigo Locatti	244fe13219	Merge branch 'master' into shader-purge	2020-03-13 16:44:06 -03:00
Nguyen Dac Nam	93547cac68	shader_bytecode: update BFE instructions struct.	2020-03-13 12:52:16 +07:00
ReinUsesLisp	e4bc3c3342	gl_rasterizer: Implement polygon modes and fill rectangles	2020-03-09 20:39:58 -03:00
ReinUsesLisp	eb5861e0a2	engines/maxwell_3d: Add TFB registers and store them in shader registry	2020-03-09 18:40:53 -03:00
ReinUsesLisp	978172530e	const_buffer_engine_interface: Store component types This is required for Vulkan. Sampling integer textures with float handles is illegal.	2020-03-09 18:40:53 -03:00
ReinUsesLisp	042256c6bb	state_tracker: Remove type traits with named structures	2020-02-28 17:56:43 -03:00
ReinUsesLisp	15cadc3948	maxwell_3d: Use two tables instead of three for dirty flags	2020-02-28 17:56:42 -03:00
ReinUsesLisp	9b08698a0c	maxwell_3d: Change write dirty flags to a bitset	2020-02-28 17:56:42 -03:00
ReinUsesLisp	9e74e6988b	maxwell_3d: Flatten cull and front face registers	2020-02-28 17:56:41 -03:00
ReinUsesLisp	eed789d0d1	video_core: Reintroduce dirty flags infrastructure	2020-02-28 17:56:41 -03:00
ReinUsesLisp	1eee891f6e	gl_state: Remove clip distances tracking	2020-02-28 17:26:26 -03:00
ReinUsesLisp	d3e433a380	gl_state: Remove viewport and depth range tracking	2020-02-28 17:25:18 -03:00
ReinUsesLisp	96ac3d518a	gl_rasterizer: Remove dirty flags	2020-02-28 16:39:27 -03:00
bunnei	e22ad52cdb	Merge pull request #3425 from ReinUsesLisp/layered-framebuffer texture_cache: Implement layered framebuffer attachments	2020-02-24 10:14:50 -05:00

1 2 3 4 5 ...

815 commits