sneedmca/suyu - sneedGit: git for sneed group

mirror of https://git.suyu.dev/suyu/suyu synced 2024-11-09 00:37:53 +00:00

Author	SHA1	Message	Date
Fernando Sahmkow	06d1c5a991	Document unsafe versions and add BlockCopyUnsafe	2019-04-16 10:11:35 -04:00
Fernando Sahmkow	6fc562a9aa	Use ReadBlockUnsafe for Shader Cache	2019-04-15 23:34:03 -04:00
Fernando Sahmkow	ef381e6924	Use ReadBlockUnsafe on TIC and TSC reading Use ReadBlockUnsafe on TIC and TSC reading as memory is never flushed from host GPU there.	2019-04-15 23:10:24 -04:00
Fernando Sahmkow	367704aa82	GPU MemoryManager: Implement ReadBlockUnsafe and WriteBlockUnsafe	2019-04-15 23:01:35 -04:00
Fernando Sahmkow	3e96c367bd	Use WriteBlock and ReadBlock.	2019-04-15 22:42:34 -04:00
bunnei	9186f76b07	Merge pull request #2382 from lioncash/table service: Update service function tables	2019-04-15 21:46:15 -04:00
bunnei	fc64156533	Merge pull request #2393 from lioncash/svc kernel/svc: Implement svcMapProcessCodeMemory/svcUnmapProcessCodeMemory	2019-04-15 21:43:56 -04:00
bunnei	a7c3275b8b	Merge pull request #2398 from lioncash/boost kernel/thread: Remove BoostPriority()	2019-04-15 21:42:16 -04:00
bunnei	c1e35d117c	Merge pull request #2399 from FernandoS27/fermi-fix Correct Pitch in Fermi2D	2019-04-15 21:41:52 -04:00
Fernando Sahmkow	bec28d692d	Implement Block Linear copies in Kepler Memory.	2019-04-15 21:22:16 -04:00
ReinUsesLisp	ef8245bed2	vk_shader_decompiler: Add missing operations	2019-04-15 21:32:57 -03:00
ReinUsesLisp	f43995ec53	shader_ir/decode: Fix half float pre-operations and remove MetaHalfArithmetic Operations done before the main half float operation (like HAdd) were managing a packed value instead of the unpacked one. Adding an unpacked operation allows us to drop the per-operand MetaHalfArithmetic entry, simplifying the code overall.	2019-04-15 21:16:10 -03:00
ReinUsesLisp	abcbcb1b2a	gl_shader_decompiler: Fix MrgH0 decompilation GLSL decompilation for HMergeH0 was wrong. This addresses that issue.	2019-04-15 21:16:10 -03:00
ReinUsesLisp	64613db605	shader_ir/decode: Implement half float saturation	2019-04-15 21:16:10 -03:00
ReinUsesLisp	90cbf89303	shader_ir/decode: Reduce severity of unimplemented half-float FTZ	2019-04-15 21:16:09 -03:00
ReinUsesLisp	acf618afbc	renderer_opengl: Implement half float NaN comparisons	2019-04-15 21:13:26 -03:00
ReinUsesLisp	ae46ad48ed	shader_ir: Avoid using static on heap-allocated objects Using static here might be faster at runtime, but it adds a heap allocation called before main.	2019-04-15 21:12:43 -03:00
Fernando Sahmkow	aa471274d9	Do some corrections in conversion shader instructions. Corrects encodings for I2F, F2F, I2I and F2I Implements Immediate variants of all four conversion types. Add assertions to unimplemented stuffs.	2019-04-15 19:16:27 -04:00
Cameron Cawley	1f3cc036da	travis: Use Ninja for Travis builds	2019-04-16 01:06:34 +02:00
fearlessTobi	b67be7154d	GenerateSCMRev: fix Travis compilation on repo forks	2019-04-16 00:34:22 +02:00
Lioncash	d28bb56c91	CMakeLists: Define QT_USE_QSTRINGBUILDER for the Qt target This is a compile definition introduced in Qt 4.8 for reducing the total potential number of strings created when performing string concatenation. This allows for less memory churn. This can be read about here: https://blog.qt.io/blog/2011/06/13/string-concatenation-with-qstringbuilder/ For a change that isn't source-compatible, we only had one occurrence that actually need to have its type clarified, which is pretty good, as far as transitioning goes.	2019-04-15 17:59:41 -04:00
liushuyu	a9f58593d4	travis: use prebuilt image (#3839) * travis: use prebuilt image * travis: use prebuilt image (MinGW)	2019-04-15 22:22:09 +02:00
Lioncash	3283aa1e20	svc: Specify handle value in thread's name Allows the handle to be seen alongside the entry point.	2019-04-15 15:56:18 -04:00
Fernando Sahmkow	8a099ac99f	Correct Kepler Memory on Linear Pushes.	2019-04-15 14:51:36 -04:00
Fernando Sahmkow	773d955dfa	Support compressed formats on linear textures.	2019-04-15 13:56:09 -04:00
Lioncash	4620ed47a3	common/{lz4_compression, zstd_compression}: Add missing header guards These two files were missing the #pragma once directive.	2019-04-15 13:00:08 -04:00
Fernando Sahmkow	bf561e4340	Correct Pitch in Fermi2D	2019-04-15 12:24:29 -04:00
Lioncash	e3566e6c1d	kernel/thread: Remove BoostPriority() This is a holdover from Citra that currently remains unused, so it can be removed from the Thread interface.	2019-04-15 06:59:19 -04:00
Lioncash	09caf8a756	kernel/thread: Remove unused guest_handle member variable This member variable is entirely unused. It was only set but never actually utilized. Given that, we can remove it to get rid of noise in the thread interface.	2019-04-14 06:06:06 -04:00
ReinUsesLisp	f15c59a164	gl_shader_decompiler: Use variable AOFFI on supported hardware	2019-04-14 05:13:19 -03:00
ReinUsesLisp	5c280e6ff0	shader_ir: Implement STG, keep track of global memory usage and flush	2019-04-14 00:25:32 -03:00
bunnei	1f4dfb3998	Merge pull request #2378 from lioncash/ro ldr: Minor amendments to IPC-related parameters	2019-04-13 22:16:10 -04:00
bunnei	c9454c8422	Merge pull request #2373 from FernandoS27/z32 Set Pixel Format to Z32 if its R32F and depth compare enabled, and Implement format ZF32_X24S8	2019-04-13 22:14:51 -04:00
bunnei	6088898b02	Merge pull request #2357 from zarroboogs/force-30fps-mode Add a toggle to force 30FPS mode	2019-04-13 22:14:04 -04:00
bunnei	a788c861bd	Merge pull request #2381 from lioncash/fs fsp_srv: Minor cleanup related changes	2019-04-13 22:09:58 -04:00
bunnei	ee2206a1b7	Merge pull request #2386 from ReinUsesLisp/shader-manager gl_shader_manager: Move code to source file and minor clean up	2019-04-13 22:09:27 -04:00
bunnei	065f83c6c3	Merge pull request #2017 from jroweboy/glwidget Frontend: Migrate to QOpenGLWindow and support shared contexts	2019-04-13 22:08:40 -04:00
bunnei	ee3f576495	Merge pull request #2389 from FreddyFunk/rename-gamedir ui_settings: Rename game directory variables	2019-04-13 22:06:51 -04:00
Lioncash	4d293bb5cb	kernel/svc: Implement svcUnmapProcessCodeMemory Essentially performs the inverse of svcMapProcessCodeMemory. This unmaps the aliasing region first, then restores the general traits of the aliased memory. What this entails, is: - Restoring Read/Write permissions to the VMA. - Restoring its memory state to reflect it as a general heap memory region. - Clearing the memory attributes on the region.	2019-04-12 21:56:03 -04:00
Lioncash	76a2465655	kernel/svc: Implement svcMapProcessCodeMemory This is utilized for mapping code modules into memory. Notably, the ldr service would call this in order to map objects into memory.	2019-04-12 21:55:50 -04:00
bunnei	b42595fa6b	Merge pull request #2391 from lioncash/scope common/scope_exit: Replace std::move with std::forward in ScopeExit()	2019-04-12 21:52:35 -04:00
bunnei	0faf7b17a1	Merge pull request #2392 from lioncash/swap common/swap: Minor cleanup and improvements to byte swapping functions	2019-04-12 21:52:16 -04:00
FreddyFunk	382722b9c4	Fix Clang Format	2019-04-12 16:40:35 +02:00
Lioncash	0d8ef2d3b9	common/swap: Improve codegen of the default swap fallbacks Uses arithmetic that can be identified more trivially by compilers for optimizations. e.g. Rather than shifting the halves of the value and then swapping and combining them, we can swap them in place. e.g. for the original swap32 code on x86-64, clang 8.0 would generate: mov ecx, edi rol cx, 8 shl ecx, 16 shr edi, 16 rol di, 8 movzx eax, di or eax, ecx ret while GCC 8.3 would generate the ideal: mov eax, edi bswap eax ret now both generate the same optimal output. MSVC used to generate the following with the old code: mov eax, ecx rol cx, 8 shr eax, 16 rol ax, 8 movzx ecx, cx movzx eax, ax shl ecx, 16 or eax, ecx ret 0 Now MSVC also generates a similar, but equally optimal result as clang/GCC: bswap ecx mov eax, ecx ret 0 ==== In the swap64 case, for the original code, clang 8.0 would generate: mov eax, edi bswap eax shl rax, 32 shr rdi, 32 bswap edi or rax, rdi ret (almost there, but still missing the mark) while, again, GCC 8.3 would generate the more ideal: mov rax, rdi bswap rax ret now clang also generates the optimal sequence for this fallback as well. This is a case where MSVC unfortunately falls short, despite the new code, this one still generates a doozy of an output. mov r8, rcx mov r9, rcx mov rax, 71776119061217280 mov rdx, r8 and r9, rax and edx, 65280 mov rax, rcx shr rax, 16 or r9, rax mov rax, rcx shr r9, 16 mov rcx, 280375465082880 and rax, rcx mov rcx, 1095216660480 or r9, rax mov rax, r8 and rax, rcx shr r9, 16 or r9, rax mov rcx, r8 mov rax, r8 shr r9, 8 shl rax, 16 and ecx, 16711680 or rdx, rax mov eax, -16777216 and rax, r8 shl rdx, 16 or rdx, rcx shl rdx, 16 or rax, rdx shl rax, 8 or rax, r9 ret 0 which is pretty unfortunate.	2019-04-12 00:07:39 -04:00
Lioncash	612e1388df	core/core: Move process execution start to System's Load() This gives us significantly more control over where in the initialization process we start execution of the main process. Previously we were running the main process before the CPU or GPU threads were initialized (not good). This amends execution to start after all of our threads are properly set up.	2019-04-11 22:11:41 -04:00
Lioncash	32a6ceb4e5	core/process: Remove unideal page table setting from LoadFromMetadata() Initially required due to the split codepath with how the initial main process instance was initialized. We used to initialize the process like: Init() { main_process = Process::Create(...); kernel.MakeCurrentProcess(main_process.get()); } Load() { const auto load_result = loader.Load(*kernel.GetCurrentProcess()); if (load_result != Loader::ResultStatus::Success) { // Handle error here. } ... } which presented a problem. Setting a created process as the main process would set the page table for that process as the main page table. This is fine... until we get to the part that the page table can have its size changed in the Load() function via NPDM metadata, which can dictate either a 32-bit, 36-bit, or 39-bit usable address space. Now that we have full control over the process' creation in load, we can simply set the initial process as the main process after all the loading is done, reflecting the potential page table changes without any special-casing behavior. We can also remove the cache flushing within LoadModule(), as execution wouldn't have even begun yet during all usages of this function, now that we have the initialization order cleaned up.	2019-04-11 22:11:41 -04:00
Lioncash	a4b0a8559c	core/core: Move main process creation into Load() Now that we have dependencies on the initialization order, we can move the creation of the main process to a more sensible area: where we actually load in the executable data. This allows localizing the creation and loading of the process in one location, making the initialization of the process much nicer to trace.	2019-04-11 22:11:40 -04:00
Lioncash	6d0551196d	video_core/gpu: Create threads separately from initialization Like with CPU emulation, we generally don't want to fire off the threads immediately after the relevant classes are initialized, we want to do this after all necessary data is done loading first. This splits the thread creation into its own interface member function to allow controlling when these threads in particular get created.	2019-04-11 22:11:40 -04:00
Lioncash	f2331a804a	core/cpu_core_manager: Create threads separately from initialization. Our initialization process is a little wonkier than one would expect when it comes to code flow. We initialize the CPU last, as opposed to hardware, where the CPU obviously needs to be first, otherwise nothing else would work, and we have code that adds checks to get around this. For example, in the page table setting code, we check to see if the system is turned on before we even notify the CPU instances of a page table switch. This results in dead code (at the moment), because the only time a page table switch will occur is when the system is not running, preventing the emulated CPU instances from being notified of a page table switch in a convenient manner (technically the code path could be taken, but we don't emulate the process creation svc handlers yet). This moves the threads creation into its own member function of the core manager and restores a little order (and predictability) to our initialization process. Previously, in the multi-threaded cases, we'd kick off several threads before even the main kernel process was created and ready to execute (gross!). Now the initialization process is like so: Initialization: 1. Timers 2. CPU 3. Kernel 4. Filesystem stuff (kind of gross, but can be amended trivially) 5. Applet stuff (ditto in terms of being kind of gross) 6. Main process (will be moved into the loading step in a following change) 7. Telemetry (this should be initialized last in the future). 8. Services (4 and 5 should ideally be alongside this). 9. GDB (gross. Uses namespace scope state. Needs to be refactored into a class or booted altogether). 10. Renderer 11. GPU (will also have its threads created in a separate step in a following change). Which... isn't ideal per se, however getting rid of the wonky intertwining of CPU state initialization out of this mix gets rid of most of the footguns when it comes to our initialization process.
bunnei	ea80e2bc57	Merge pull request #2235 from ReinUsesLisp/spirv-decompiler vk_shader_decompiler: Implement a SPIR-V decompiler	2019-04-11 21:54:23 -04:00
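
Note on commit 0d8ef2d3b9 (common/swap): the message describes swapping bytes "in place" with arithmetic that compilers can recognize as a byte swap. The sketch below is one way such a fallback could look, not the repository's actual code. The function names swap32_sketch and swap64_sketch are hypothetical, and the mask-and-shift form is an assumption based on the constants visible in the quoted MSVC output. Whether a given compiler folds it into a single bswap depends on the compiler and its version.

```cpp
#include <cstdint>

// Hypothetical illustration only; names and exact form are assumptions,
// not taken from the suyu/yuzu source tree.

// Move each byte of a 32-bit value directly to its mirrored position.
constexpr std::uint32_t swap32_sketch(std::uint32_t v) {
    return ((v & 0xFF000000u) >> 24) | ((v & 0x00FF0000u) >> 8) |
           ((v & 0x0000FF00u) << 8) | ((v & 0x000000FFu) << 24);
}

// Same idea for 64 bits: eight masks, each shifted to the mirrored byte slot.
constexpr std::uint64_t swap64_sketch(std::uint64_t v) {
    return ((v & 0xFF00000000000000ull) >> 56) |
           ((v & 0x00FF000000000000ull) >> 40) |
           ((v & 0x0000FF0000000000ull) >> 24) |
           ((v & 0x000000FF00000000ull) >> 8) |
           ((v & 0x00000000FF000000ull) << 8) |
           ((v & 0x0000000000FF0000ull) << 24) |
           ((v & 0x000000000000FF00ull) << 40) |
           ((v & 0x00000000000000FFull) << 56);
}
```

Applying either function twice returns the original value, which is a quick sanity check for any byte-swap fallback.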

14923 commits