ReinUsesLisp
1f43e5296f
gl_shader_decompiler: Keep track of written images and mark them as modified
2019-09-05 23:26:05 -03:00
ReinUsesLisp
3a450c1395
kepler_compute: Implement texture queries
2019-09-05 20:35:51 -03:00
ReinUsesLisp
4de04eba39
shader_ir: Implement LD_S
...
Loads from shared memory.
2019-09-05 01:38:37 -03:00
ReinUsesLisp
f17415d431
shader_ir: Implement ST_S
...
This instruction writes to a memory buffer shared with threads within
the same work group. It is known as "shared" memory in GLSL.
2019-09-05 01:38:37 -03:00
ReinUsesLisp
77ef4fa907
shader/shift: Implement SHR wrapped and clamped variants
...
Nvidia defaults to wrapped shifts, but this is undefined behaviour on
OpenGL's spec. Explicitly mask/clamp according to what the guest shader
requires.
2019-09-04 01:55:24 -03:00
ReinUsesLisp
dfae2d141a
half_set_predicate: Fix predicate assignments
2019-09-04 01:54:23 -03:00
bunnei
81fbc5370d
Merge pull request #2812 from ReinUsesLisp/f2i-selector
...
shader_ir/conversion: Implement F2I and F2F F16 selector
2019-09-03 22:35:33 -04:00
bunnei
d4f33b822b
Merge pull request #2811 from ReinUsesLisp/fsetp-fix
...
float_set_predicate: Add missing negation bit for the second operand
2019-09-03 22:34:34 -04:00
Rodrigo Locatti
4d4f9cc104
video_core: Silent miscellaneous warnings ( #2820 )
...
* texture_cache/surface_params: Remove unused local variable
* rasterizer_interface: Add missing documentation commentary
* maxwell_dma: Remove unused rasterizer reference
* video_core/gpu: Sort member declaration order to silent -Wreorder warning
* fermi_2d: Remove unused MemoryManager reference
* video_core: Silent unused variable warnings
* buffer_cache: Silent -Wreorder warnings
* kepler_memory: Remove unused MemoryManager reference
* gl_texture_cache: Add missing override
* buffer_cache: Add missing include
* shader/decode: Remove unused variables
2019-08-30 14:08:00 -04:00
bunnei
f8cc5668f8
Merge pull request #2758 from ReinUsesLisp/packed-tid
...
shader/decode: Implement S2R Tic
2019-08-29 12:58:43 -04:00
ReinUsesLisp
e3534700d7
shader_ir/conversion: Split int and float selector and implement F2F H1
2019-08-28 16:09:33 -03:00
ReinUsesLisp
b13fbc25b8
shader_ir/conversion: Implement F2I F16 Ra.H1
2019-08-27 23:40:40 -03:00
ReinUsesLisp
6207751b00
float_set_predicate: Add missing negation bit for the second operand
2019-08-27 21:57:43 -03:00
ReinUsesLisp
4e35177e23
shader_ir: Implement VOTE
...
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics
Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.
To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:
* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true
ballotARB, also known as "uint64_t(activeThreadsNV())", emits
VOTE.ANY Rd, PT, PT;
on nouveau's compiler. This doesn't match exactly to Nvidia's code
VOTE.ALL Rd, PT, PT;
Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
2019-08-21 14:50:38 -03:00
bunnei
dfdd20142e
Merge pull request #2777 from ReinUsesLisp/hsetp2-fe3h-fix
...
half_set_predicate: Fix HSETP2_C constant buffer offset
2019-08-21 10:29:17 -04:00
bunnei
cedc1aab4a
Merge pull request #2753 from FernandoS27/float-convert
...
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
2019-08-21 10:27:57 -04:00
bunnei
ca61e298b3
Merge pull request #2778 from ReinUsesLisp/nop
...
shader_ir: Implement NOP
2019-08-18 08:51:34 -04:00
ReinUsesLisp
2ff8044806
shader_ir: Implement NOP
2019-08-04 03:02:55 -03:00
ReinUsesLisp
ec0da3ef64
half_set_predicate: Fix HSETP2_C constant buffer offset
2019-08-04 02:50:55 -03:00
ReinUsesLisp
77f1a676a1
decode/half_set_predicate: Fix predicates
2019-07-26 00:12:38 -03:00
bunnei
b0ff3179ef
Merge pull request #2739 from lioncash/cflow
...
video_core/control_flow: Minor changes/warning cleanup
2019-07-25 13:04:56 -04:00
bunnei
4d26550f5f
Merge pull request #2737 from FernandoS27/track-fix
...
Shader_Ir: Correct tracking to track from right to left
2019-07-25 12:41:52 -04:00
bunnei
31e8a61527
Merge pull request #2743 from FernandoS27/surpress-assert
...
Downgrade and suppress a series of GPU asserts and debug messages.
2019-07-25 12:34:36 -04:00
ReinUsesLisp
104641db07
shader/decode: Implement S2R Tic
2019-07-22 16:16:10 -03:00
Fernando Sahmkow
11f4e739bd
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
...
This commit takes care of implementing the F16 Variants of the
conversion instructions and makes sure conversions are done.
2019-07-20 17:38:25 -04:00
Fernando Sahmkow
1158777737
Shader_Ir: Change Debug Asserts for Log Warnings
2019-07-19 22:15:34 -04:00
ReinUsesLisp
45c162444d
shader/half_set_predicate: Fix HSETP2 implementation
2019-07-19 22:21:22 -03:00
ReinUsesLisp
6c4985edc9
shader/half_set_predicate: Implement missing HSETP2 variants
2019-07-19 22:20:47 -03:00
Lioncash
c1c89411da
video_core/control_flow: Provide operator!= for types with operator==
...
Provides operational symmetry for the respective structures.
2019-07-18 21:03:31 -04:00
Lioncash
1780e0e3d0
video_core/control_flow: Prevent sign conversion in TryGetBlock()
...
The return value is a u32, not an s32, so this would result in an
implicit signedness conversion.
2019-07-18 21:03:31 -04:00
Lioncash
a162a844d2
video_core/control_flow: Remove unnecessary BlockStack copy constructor
...
This is the default behavior of the copy constructor, so it doesn't need
to be specified.
While we're at it we can make the other non-default constructor
explicit.
2019-07-18 21:03:30 -04:00
Lioncash
56bc11d952
video_core/control_flow: Use std::move where applicable
...
Results in less work being done where avoidable.
2019-07-18 21:03:30 -04:00
Lioncash
e7b39f47f8
video_core/control_flow: Use the prefix variant of operator++ for iterators
...
Same thing, but potentially allows a standard library implementation to
pick a more efficient codepath.
2019-07-18 21:03:30 -04:00
Lioncash
6885e7e7ec
video_core/control_flow: Use empty() member function for checking emptiness
...
It's what it's there for.
2019-07-18 21:03:30 -04:00
Lioncash
45fa12a05c
video_core: Resolve -Wreorder warnings
...
Ensures that the constructor members are always initialized in the order
that they're declared in.
2019-07-18 21:03:30 -04:00
Lioncash
47df844338
video_core/control_flow: Make program_size for ScanFlow() a std::size_t
...
Prevents a truncation warning from occurring with MSVC. Also the
internal data structures already treat it as a size_t, so this is just a
discrepancy in the interface.
2019-07-18 21:03:29 -04:00
Lioncash
3df9558593
video_core/control_flow: Place all internally linked types/functions within an anonymous namespace
...
Previously, quite a few functions were being linked with external
linkage.
2019-07-18 21:03:29 -04:00
Lioncash
1109db86b7
video_core/shader/decode: Prevent sign-conversion warnings
...
Makes it explicit that the conversions here are intentional.
2019-07-18 21:03:29 -04:00
bunnei
63bda67a34
Merge pull request #2738 from lioncash/shader-ir
...
shader-ir: Minor cleanup-related changes
2019-07-18 13:52:01 -04:00
Fernando Sahmkow
5a06e33859
Shader_Ir: correct clang format
2019-07-18 10:09:26 -04:00
Fernando Sahmkow
0b65e9335e
Shader_Ir: Downgrade precision and rounding asserts to debug asserts.
...
This commit reduces the sevirity of asserts for FP precision and
rounding as this are well known and have little to no consequences in
gpu's accuracy.
2019-07-18 08:17:19 -04:00
Fernando Sahmkow
223a535f3f
Merge pull request #2740 from lioncash/bra
...
shader/decode/other: Correct branch indirect argument within BRA handling
2019-07-17 14:25:08 -04:00
Lioncash
bebbdc2067
shader_ir: std::move Node instance where applicable
...
These are std::shared_ptr instances underneath the hood, which means
copying them isn't as cheap as a regular pointer. Particularly so on
weakly-ordered systems.
This avoids atomic reference count increments and decrements where they
aren't necessary for the core set of operations.
2019-07-16 19:49:23 -04:00
Lioncash
60926ac16b
shader_ir: Rename Get/SetTemporal to Get/SetTemporary
...
This is more accurate in terms of describing what the functions are
actually doing. Temporal relates to time, not the setting of a temporary
itself.
2019-07-16 19:47:43 -04:00
Lioncash
44d87ff641
shader_ir: Remove unused includes
...
Removes unnecessary header dependencies.
2019-07-16 19:47:42 -04:00
Fernando Sahmkow
d614193e49
Shader_Ir: Correct tracking to track from right to left
2019-07-16 15:06:59 -04:00
Fernando Sahmkow
b56e7f870a
Merge pull request #2565 from ReinUsesLisp/track-indirect
...
shader/track: Track indirect buffers
2019-07-16 14:58:35 -04:00
Lioncash
e2d7dda166
shader/decode/other: Correct branch indirect argument within BRA handling
...
This appears to have been a copy/paste error introduced within
8a6fc529a9
2019-07-16 12:20:45 -04:00
Fernando Sahmkow
1bdb59fc6e
Merge pull request #2695 from ReinUsesLisp/layer-viewport
...
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
2019-07-15 16:28:07 -04:00
ReinUsesLisp
afa8096df5
shader: Allow tracking of indirect buffers without variable offset
...
While changing this code, simplify tracking code to allow returning
the base address node, this way callers don't have to manually rebuild
it on each invocation.
2019-07-14 22:36:44 -03:00
Fernando Sahmkow
0ec9da2f9f
Merge pull request #2692 from ReinUsesLisp/tlds-f16
...
shader/texture: Add F16 support for TLDS
2019-07-14 08:44:38 -04:00
Fernando Sahmkow
f2549739d1
shader_ir: Add comments on missing instruction.
...
Also shows Nvidia's address space on comments.
2019-07-09 17:15:45 -04:00
Fernando Sahmkow
2de7649311
shader_ir: limit explorastion to best known program size.
2019-07-09 08:14:43 -04:00
Fernando Sahmkow
e7c6045a03
control_flow: Correct block breaking algorithm.
2019-07-09 08:14:43 -04:00
Fernando Sahmkow
dc4a93594c
control_flow: Assert shaders bigger than limit.
2019-07-09 08:14:42 -04:00
Fernando Sahmkow
e7a88f0ab3
control_flow: Address feedback.
2019-07-09 08:14:42 -04:00
Fernando Sahmkow
34357b110c
shader_ir: Correct parsing of scheduling instructions and correct sizing
2019-07-09 08:14:41 -04:00
Fernando Sahmkow
cfb3db1a32
shader_ir: Correct max sizing
2019-07-09 08:14:40 -04:00
Fernando Sahmkow
d45fed3030
shader_ir: Remove unnecessary constructors and use optional for ScanFlow result
2019-07-09 08:14:40 -04:00
Fernando Sahmkow
01b21ee1e8
shader_ir: Corrections, documenting and asserting control_flow
2019-07-09 08:14:39 -04:00
Fernando Sahmkow
d5533b440c
shader_ir: Unify blocks in decompiled shaders.
2019-07-09 08:14:39 -04:00
Fernando Sahmkow
926b80102f
shader_ir: Decompile Flow Stack
2019-07-09 08:14:38 -04:00
Fernando Sahmkow
459fce3a8f
shader_ir: propagate shader size to the IR
2019-07-09 08:14:37 -04:00
Fernando Sahmkow
8a6fc529a9
shader_ir: Implement BRX & BRA.CC
2019-07-09 08:14:37 -04:00
Fernando Sahmkow
c218ae4b02
shader_ir: Remove the old scanner.
2019-07-09 08:14:36 -04:00
Fernando Sahmkow
8af6e6a052
shader_ir: Implement a new shader scanner
2019-07-09 08:14:36 -04:00
ReinUsesLisp
c9d886c84e
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
...
This commit implements gl_ViewportIndex and gl_Layer in vertex and
geometry shaders. In the case it's used in a vertex shader, it requires
ARB_shader_viewport_layer_array. This extension is available on AMD and
Nvidia devices (mesa and proprietary drivers), but not available on
Intel on any platform. At the moment of writing this description I don't
know if this is a hardware limitation or a driver limitation.
In the case that ARB_shader_viewport_layer_array is not available,
writes to these registers on a vertex shader are ignored, with the
appropriate logging.
2019-07-07 20:42:55 -03:00
Tobias
be020f7621
Delete decode_integer_set.cpp
2019-07-07 21:40:33 +02:00
ReinUsesLisp
d0966b9f7c
shader/texture: Add F16 support for TLDS
2019-07-07 16:05:56 -03:00
ReinUsesLisp
10a83653ee
decode/texture: Address feedback
2019-06-24 02:05:05 -03:00
Fernando Sahmkow
d1812316e1
texture_cache: Style and Corrections
2019-06-20 21:24:47 -04:00
Fernando Sahmkow
b7de31ac97
shader_ir: Fix image copy rebase issues
2019-06-20 21:38:34 -03:00
ReinUsesLisp
9097301d92
shader: Implement bindless images
2019-06-20 21:38:33 -03:00
ReinUsesLisp
06c4ce8645
shader: Decode SUST and implement backing image functionality
2019-06-20 21:38:33 -03:00
ReinUsesLisp
4e81fc8296
shader: Implement texture buffers
2019-06-20 21:36:12 -03:00
ReinUsesLisp
fe8e6618f2
shader: Split SSY and PBK stack
...
Hardware testing revealed that SSY and PBK push to a different stack,
allowing code like this:
SSY label1;
PBK label2;
SYNC;
label1: PBK;
label2: EXIT;
2019-06-07 02:18:27 -03:00
ReinUsesLisp
769a50661a
shader/node: Minor changes
...
Reflect std::shared_ptr nature of Node on initializers and remove
constant members in nodes.
Add some commentaries.
2019-06-06 20:03:33 -03:00
ReinUsesLisp
e1b3be7ced
shader: Move Node declarations out of the shader IR header
...
Analysis passes do not have a good reason to depend on shader_ir.h to
work on top of nodes. This splits node-related declarations to their own
file and leaves the IR in shader_ir.h
2019-06-06 20:02:37 -03:00
ReinUsesLisp
bf4dfb3ad4
shader: Use shared_ptr to store nodes and move initialization to file
...
Instead of having a vector of unique_ptr stored in a vector and
returning star pointers to this, use shared_ptr. While changing
initialization code, move it to a separate file when possible.
This is a first step to allow code analysis and node generation beyond
the ShaderIR class.
2019-06-05 20:41:52 -03:00
bunnei
e3608578e4
Merge pull request #2446 from ReinUsesLisp/tid
...
shader: Implement S2R Tid{XYZ} and CtaId{XYZ}
2019-05-29 12:21:17 -04:00
bunnei
1a2d90ab09
Merge pull request #2485 from ReinUsesLisp/generic-memory
...
shader/memory: Implement generic memory stores and loads (ST and LD)
2019-05-24 18:24:26 -04:00
Lioncash
b6dcb1ae4d
shader/shader_ir: Make Comment() take a std::string by value
...
This allows for forming comment nodes without making unnecessary copies
of the std::string instance.
e.g. previously:
Comment(fmt::format("Base address is c[0x{:x}][0x{:x}]",
cbuf->GetIndex(), cbuf_offset));
Would result in a copy of the string being created, as CommentNode()
takes a std::string by value (a const ref passed to a value parameter
results in a copy).
Now, only one instance of the string is ever moved around. (fmt::format
returns a std::string, and since it's returned from a function by value,
this is a prvalue (which can be treated like an rvalue), so it's moved
into Comment's string parameter), we then move it into the CommentNode
constructor, which then moves the string into its member variable).
2019-05-23 03:01:55 -03:00
Lioncash
228e58d0a5
shader/decode/*: Add missing newline to files lacking them
...
Keeps the shader code file endings consistent.
2019-05-23 02:55:52 -03:00
Lioncash
87b4c1ac5e
shader/decode/*: Eliminate indirect inclusions
...
Amends cases where we were using things that were indirectly being
satisfied through other headers. This way, if those headers change and
eliminate dependencies on other headers in the future, we don't have
cascading compilation errors.
2019-05-23 02:55:52 -03:00
Lioncash
195b54602f
shader/decode/memory: Remove left in debug pragma
2019-05-22 17:08:50 -04:00
ReinUsesLisp
75e7b45d69
shader/memory: Implement ST (generic memory)
2019-05-20 22:41:53 -03:00
ReinUsesLisp
f78ef617b6
shader/memory: Implement LD (generic memory)
2019-05-20 22:38:59 -03:00
ReinUsesLisp
9c3461604c
shader: Implement S2R Tid{XYZ} and CtaId{XYZ}
2019-05-20 16:36:49 -03:00
bunnei
d49efbfb4a
Merge pull request #2441 from ReinUsesLisp/al2p
...
shader: Implement AL2P and ALD.PHYS
2019-05-19 14:02:58 -04:00
Lioncash
e310d943b8
shader/shader_ir: Remove unnecessary inline specifiers
...
constexpr internally links by default, so the inline specifier is
unnecessary.
2019-05-19 08:23:15 -04:00
Lioncash
212b148923
shader/shader_ir: Simplify constructors for OperationNode
...
Many of these constructors don't even need to be templated. The only
ones that need to be templated are the ones that actually make use of
the parameter pack.
Even then, since std::vector accepts an initializer list, we can supply
the parameter pack directly to it instead of creating our own copy of
the list, then copying it again into the std::vector.
2019-05-19 08:23:14 -04:00
Lioncash
81e7e63080
shader/shader_ir: Remove unnecessary template parameter packs from Operation() overloads where applicable
...
These overloads don't actually make use of the parameter pack, so they
can be turned into regular non-template function overloads.
2019-05-19 08:23:14 -04:00
Lioncash
e09ee0ff23
shader/shader_ir: Mark tracking functions as const member functions
...
These don't actually modify instance state, so they can be marked as
const member functions
2019-05-19 08:23:09 -04:00
Lioncash
ce04ab38bb
shader/shader_ir: Place implementations of constructor and destructor in cpp file
...
Given the class contains quite a lot of non-trivial types, place the
constructor and destructor within the cpp file to avoid inlining
construction and destruction code everywhere the class is used.
2019-05-19 04:02:02 -04:00
Lioncash
e43ba3acd4
video_core/shader/decode/texture: Remove unused variable from GetTld4Code()
2019-05-09 18:49:56 -04:00
Lioncash
9e15193ef8
shader/decode/texture: Remove unused variable
...
This isn't used anywhere, so we can get rid of it.
2019-05-04 02:10:38 -04:00
ReinUsesLisp
d4df803b2b
shader_ir/other: Implement IPA.IDX
2019-05-02 21:46:37 -03:00
ReinUsesLisp
28bffb1ffa
shader_ir/memory: Assert on non-32 bits ALD.PHYS
2019-05-02 21:46:25 -03:00
ReinUsesLisp
fe700e1856
shader: Add physical attributes commentaries
2019-05-02 21:46:25 -03:00
ReinUsesLisp
c6f9e651b2
gl_shader_decompiler: Implement GLSL physical attributes
2019-05-02 21:46:25 -03:00
ReinUsesLisp
71aa9d0877
shader_ir/memory: Implement physical input attributes
2019-05-02 21:46:25 -03:00
ReinUsesLisp
06b363c9b5
shader: Remove unused AbufNode Ipa mode
2019-05-02 21:46:25 -03:00
ReinUsesLisp
002ecbea19
shader_ir/memory: Emit AL2P IR
2019-05-02 21:46:25 -03:00
bunnei
91e239d66f
Merge pull request #2435 from ReinUsesLisp/misc-vc
...
shader_ir: Miscellaneous fixes
2019-04-28 22:29:43 -04:00
bunnei
c52233ec8b
Merge pull request #2322 from ReinUsesLisp/wswitch
...
video_core: Silent -Wswitch warnings
2019-04-28 22:24:58 -04:00
bunnei
9a3737120d
Merge pull request #2423 from FernandoS27/half-correct
...
Corrections on Half Float operations: HADD2 HMUL2 and HFMA2
2019-04-28 22:24:22 -04:00
ReinUsesLisp
2156e52014
shader_ir: Move Sampler index entry in operand< to sort declarations
2019-04-26 01:13:05 -03:00
ReinUsesLisp
b77b4b76bb
shader_ir: Add missing entry to Sampler operand< comparison
2019-04-26 01:11:24 -03:00
ReinUsesLisp
0b91087a1e
shader_ir/texture: Fix sampler const buffer key shift
2019-04-26 01:09:29 -03:00
Fernando Sahmkow
623b2e4b8f
Corrections Half Float operations on const buffers and implement saturation.
2019-04-20 21:11:33 -04:00
bunnei
da0c3bc658
Merge pull request #2407 from FernandoS27/f2f
...
Do some corrections in conversion shader instructions.
2019-04-20 00:42:34 -04:00
bunnei
650d9b1044
Merge pull request #2409 from ReinUsesLisp/half-floats
...
shader_ir/decode: Miscellaneous fixes to half-float decompilation
2019-04-19 21:31:52 -04:00
ReinUsesLisp
fbe8d1ceaa
video_core: Silent -Wswitch warnings
2019-04-18 15:54:39 -03:00
bunnei
5bd5140bde
Merge pull request #2348 from FernandoS27/guest-bindless
...
Implement Bindless Textures on Shader Decompiler and GL backend
2019-04-17 20:59:49 -04:00
bunnei
0cfbd3325b
Merge pull request #2315 from ReinUsesLisp/severity-decompiler
...
shader_ir/decode: Reduce the severity of common assertions
2019-04-16 22:21:19 -04:00
ReinUsesLisp
f43995ec53
shader_ir/decode: Fix half float pre-operations and remove MetaHalfArithmetic
...
Operations done before the main half float operation (like HAdd) were
managing a packed value instead of the unpacked one. Adding an unpacked
operation allows us to drop the per-operand MetaHalfArithmetic entry,
simplifying the code overall.
2019-04-15 21:16:10 -03:00
ReinUsesLisp
64613db605
shader_ir/decode: Implement half float saturation
2019-04-15 21:16:10 -03:00
ReinUsesLisp
90cbf89303
shader_ir/decode: Reduce severity of unimplemented half-float FTZ
2019-04-15 21:16:09 -03:00
ReinUsesLisp
acf618afbc
renderer_opengl: Implement half float NaN comparisons
2019-04-15 21:13:26 -03:00
ReinUsesLisp
ae46ad48ed
shader_ir: Avoid using static on heap-allocated objects
...
Using static here might be faster at runtime, but it adds a heap
allocation called before main.
2019-04-15 21:12:43 -03:00
Fernando Sahmkow
aa471274d9
Do some corrections in conversion shader instructions.
...
Corrects encodings for I2F, F2F, I2I and F2I
Implements Immediate variants of all four conversion types.
Add assertions to unimplemented stuffs.
2019-04-15 19:16:27 -04:00
ReinUsesLisp
5c280e6ff0
shader_ir: Implement STG, keep track of global memory usage and flush
2019-04-14 00:25:32 -03:00
Fernando Sahmkow
16adc735a5
Correct XMAD mode, psl and high_b on different encodings.
2019-04-08 13:01:17 -04:00
Fernando Sahmkow
ef8be408d3
Adapt Bindless to work with AOFFI
2019-04-08 12:07:56 -04:00
Fernando Sahmkow
492040bd9c
Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format.
2019-04-08 11:36:11 -04:00
Fernando Sahmkow
c60b0b8432
Fix TMML
2019-04-08 11:35:22 -04:00
Fernando Sahmkow
fd4e994de3
Refactor GetTextureCode and GetTexCode to use an optional instead of optional parameters
2019-04-08 11:35:18 -04:00
Fernando Sahmkow
4841440382
Implement TXQ_B
2019-04-08 11:29:52 -04:00
Fernando Sahmkow
189bd1980c
Implement TMML_B
2019-04-08 11:29:49 -04:00
Fernando Sahmkow
ac3ba9a33e
Corrections to TEX_B
2019-04-08 11:28:44 -04:00
Fernando Sahmkow
7af82ca022
Implement Bindless Handling on SetupTexture
2019-04-08 11:23:46 -04:00
Fernando Sahmkow
fe392fff24
Unify both sampler types.
2019-04-08 11:23:45 -04:00
Fernando Sahmkow
e28fd3d0a5
Implement Bindless Samplers and TEX_B in the IR.
2019-04-08 11:23:42 -04:00
ReinUsesLisp
04979560fb
shader_ir/memory: Reduce severity of LD_L cache management and log it
2019-04-03 17:12:44 -03:00
ReinUsesLisp
24abeb9a67
shader_ir/memory: Reduce severity of ST_L cache management and log it
2019-04-03 17:12:44 -03:00
Mat M
da02946f4f
shader_ir/decode: Silent implicit sign conversion warning
...
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-03-31 00:12:54 -03:00
ReinUsesLisp
cb68ce7c2f
shader_ir/decode: Implement AOFFI for TEX and TLD4
2019-03-30 02:53:29 -03:00
ReinUsesLisp
cf4ecc1945
shader_ir: Implement immediate register tracking
2019-03-30 02:53:16 -03:00
ReinUsesLisp
5ca63d0675
shader/decode: Remove extras from MetaTexture
2019-02-26 00:11:30 -03:00
ReinUsesLisp
48e6f77c03
shader/decode: Split memory and texture instructions decoding
2019-02-26 00:11:30 -03:00
Lioncash
c1b2e35625
shader/track: Resolve variable shadowing warnings
2019-02-25 09:10:59 -05:00
bunnei
c07987dfab
Merge pull request #2118 from FernandoS27/ipa-improve
...
shader_decompiler: Improve Accuracy of Attribute Interpolation.
2019-02-24 23:04:22 -05:00
Fernando Sahmkow
10682ad7e0
shader_decompiler: Improve Accuracy of Attribute Interpolation.
2019-02-14 03:25:07 -04:00
ReinUsesLisp
e60d4d70bc
gl_shader_decompiler: Re-implement TLDS lod
2019-02-12 17:03:07 -03:00
bunnei
444231a83d
Merge pull request #2108 from FernandoS27/fix-cc
...
Fix incorrect value for CC bit in IADD
2019-02-12 10:39:03 -05:00
bunnei
c1accfefde
Merge pull request #2109 from FernandoS27/fix-f2i
...
Corrected F2I None mode to RoundEven.
2019-02-12 10:20:29 -05:00
Fernando Sahmkow
f5ec165e8c
Corrected F2I None mode to RoundEven.
2019-02-11 18:46:45 -04:00
Fernando Sahmkow
edd668047c
Fix incorrect value for CC bit in IADD
2019-02-11 16:44:43 -04:00
ReinUsesLisp
889c646ac0
shader_ir: Remove F4 prefix to texture operations
...
This was originally included because texture operations returned a vec4.
These operations now return a single float and the F4 prefix doesn't
mean anything.
2019-02-07 17:36:46 -03:00
ReinUsesLisp
d62b0a9e29
shader_ir: Clean texture management code
...
Previous code relied on GLSL parameter order (something that's always
ill-formed on an IR design). This approach passes spatial coordiantes
through operation nodes and array and depth compare values in the the
texture metadata. It still contains an "extra" vector containing generic
nodes for bias and component index (for example) which is still a bit
ill-formed but it should be better than the previous approach.
2019-02-07 00:46:13 -03:00
bunnei
f09d1dffd1
Merge pull request #2083 from ReinUsesLisp/shader-ir-cbuf-tracking
...
shader/track: Add a more permissive global memory tracking
2019-02-06 21:56:14 -05:00
ReinUsesLisp
cfb20c4c9d
gl_shader_disk_cache: Save GLSL and entries into the precompiled file
2019-02-06 22:23:39 -03:00
bunnei
72c70d6808
Merge pull request #2081 from ReinUsesLisp/lmem-64
...
shader_ir/memory: Add LD_L 64 bits loads
2019-02-05 09:17:48 -05:00
bunnei
bb4549a73d
Merge pull request #2082 from FernandoS27/txq-stl
...
Fix TXQ not using the component mask.
2019-02-04 20:22:32 -05:00
Fernando Sahmkow
0306c50339
Fix TXQ not using the component mask.
2019-02-03 18:17:18 -04:00
ReinUsesLisp
dfa7be5ddf
shader_ir/memory: Add ST_L 64 and 128 bits stores
2019-02-03 19:08:10 -03:00
ReinUsesLisp
0d1d755086
shader/track: Search inside of conditional nodes
...
Some games search conditionally use global memory instructions. This
allows the heuristic to search inside conditional nodes for the source
constant buffer.
2019-02-03 17:21:20 -03:00
ReinUsesLisp
42b75e8be8
shader_ir: Rename BasicBlock to NodeBlock
...
It's not always used as a basic block. Rename it for consistency.
2019-02-03 17:21:20 -03:00
ReinUsesLisp
6a6fabea58
shader_ir: Pass decoded nodes as a whole instead of per basic blocks
...
Some games call LDG at the top of a basic block, making the tracking
heuristic to fail. This commit lets the heuristic the decoded nodes as a
whole instead of per basic blocks.
This may lead to some false positives but allows it the heuristic to
track cases it previously couldn't.
2019-02-03 17:21:20 -03:00
ReinUsesLisp
f61c1ed246
shader_ir/memory: Add LD_L 128 bits loads
2019-02-03 00:35:34 -03:00
ReinUsesLisp
9feb68085d
shader_bytecode: Rename BytesN enums to BitsN
2019-02-03 00:25:40 -03:00
ReinUsesLisp
0be835132c
shader_ir/memory: Add LD_L 64 bits loads
2019-02-03 00:25:40 -03:00
ReinUsesLisp
477d616f7d
shader_ir: Unify constant buffer offset values
...
Constant buffer values on the shader IR were using different offsets if
the access direct or indirect. cbuf34 has a non-multiplied offset while
cbuf36 does. On shader decoding this commit multiplies it by four on
cbuf34 queries.
2019-01-30 02:45:50 -03:00
ReinUsesLisp
3b84e04af1
shader_decode: Implement LDG and basic cbuf tracking
2019-01-30 00:00:15 -03:00
Lioncash
b2b98b2f44
shader/shader_ir: Amend three comment typos
...
Given we're in the area, these are three trivial typos that can be
corrected.
2019-01-28 07:52:04 -05:00
Lioncash
62e08c30b7
shader/shader_ir: Amend constructor initializer ordering for AbufNode
...
Orders the class members in the same order that they would actually be
initialized in. Gets rid of two compiler warnings.
2019-01-28 07:50:34 -05:00
Lioncash
3e1a9a45a6
shader/decode: Avoid a pessimizing std::move within DecodeRange()
...
std::moveing a local variable in a return statement has the potential to
prevent copy elision from occurring, so this can just be converted into
a regular return.
2019-01-28 07:43:23 -05:00
ReinUsesLisp
a63d7c49fc
shader_ir: Fixup clang build
2019-01-15 21:06:05 -03:00
ReinUsesLisp
1c9c4eefeb
shader_decode: Fixup XMAD
2019-01-15 17:54:53 -03:00
ReinUsesLisp
170c8212bb
shader_ir: Pass to decoder functions basic block's code
2019-01-15 17:54:53 -03:00
ReinUsesLisp
2d6c064e66
shader_decode: Improve zero flag implementation
2019-01-15 17:54:53 -03:00
ReinUsesLisp
d911740e5d
shader_ir: Remove composite primitives and use temporals instead
2019-01-15 17:54:53 -03:00
ReinUsesLisp
50195b1704
shader_decode: Use proper primitive names
2019-01-15 17:54:53 -03:00
ReinUsesLisp
2faad9bf23
shader_decode: Use BitfieldExtract instead of shift + and
2019-01-15 17:54:53 -03:00
ReinUsesLisp
52223313b1
shader_ir: Remove Ipa primitive
2019-01-15 17:54:53 -03:00
ReinUsesLisp
af5d7e2c49
video_core: Rename glsl_decompiler to gl_shader_decompiler
2019-01-15 17:54:53 -03:00
ReinUsesLisp
d9118d324a
shader_ir: Remove RZ and use Register::ZeroIndex instead
2019-01-15 17:54:53 -03:00
ReinUsesLisp
5af82a8ed4
shader_decode: Implement TEXS.F16
2019-01-15 17:54:53 -03:00
ReinUsesLisp
c68c13e1aa
shader_decode: Fixup R2P
2019-01-15 17:54:53 -03:00
ReinUsesLisp
8b5588e776
glsl_decompiler: Fixup TLDS
2019-01-15 17:54:53 -03:00
ReinUsesLisp
dbed6c6485
glsl_decompiler: Fixup geometry shaders
2019-01-15 17:54:53 -03:00
ReinUsesLisp
ea78c78253
shader_decode: Fixup WriteLogicOperation zero comparison
2019-01-15 17:54:53 -03:00
ReinUsesLisp
ab7f52b279
glsl_decompiler: Fixup permissive member function declarations
2019-01-15 17:54:53 -03:00
ReinUsesLisp
55a10d02e5
shader_decode: Fixup PSET
2019-01-15 17:54:53 -03:00
ReinUsesLisp
a2e22b4359
shader_decode: Fixup clang-format
2019-01-15 17:54:53 -03:00
ReinUsesLisp
e1fea1e0c5
video_core: Implement IR based geometry shaders
2019-01-15 17:54:53 -03:00
ReinUsesLisp
a1b845b651
shader_decode: Implement VMAD and VSETP
2019-01-15 17:54:53 -03:00
ReinUsesLisp
b11e0b94c7
shader_decode: Implement HSET2
2019-01-15 17:54:53 -03:00
ReinUsesLisp
2df55985b6
shader_decode: Rework HSETP2
2019-01-15 17:54:53 -03:00
ReinUsesLisp
8332482c24
shader_decode: Implement R2P
2019-01-15 17:54:53 -03:00
ReinUsesLisp
3f1136ac6f
shader_decode: Implement CSETP
2019-01-15 17:54:52 -03:00
ReinUsesLisp
7e13e8bfcb
shader_decode: Implement PSET
2019-01-15 17:54:52 -03:00
ReinUsesLisp
dd91650aaf
shader_decode: Implement HFMA2
2019-01-15 17:54:52 -03:00
ReinUsesLisp
d6f76307fe
glsl_decompiler: Remove HNegate inlining
2019-01-15 17:54:52 -03:00
ReinUsesLisp
027f443e69
shader_decode: Implement POPC
2019-01-15 17:54:52 -03:00
ReinUsesLisp
55e6786254
shader_decode: Implement TLDS (untested)
2019-01-15 17:54:52 -03:00
ReinUsesLisp
ec98e4d842
shader_decode: Update TLD4 reflecting #1862 changes
2019-01-15 17:54:52 -03:00
ReinUsesLisp
03e088a4f4
shader_ir: Fixup TEX and TEXS and partially fix TLD4 decompiling
2019-01-15 17:54:52 -03:00
ReinUsesLisp
2d9136cec6
shader_decode: Fixup FSET
2019-01-15 17:54:52 -03:00
ReinUsesLisp
af5c6e4ccb
shader_decode: Implement IADD32I
2019-01-15 17:54:52 -03:00
ReinUsesLisp
fc46ecddb3
video_core: Return safe values after an assert hits
2019-01-15 17:54:52 -03:00
ReinUsesLisp
148a6418ed
shader_decode: Implement FFMA
2019-01-15 17:54:52 -03:00
ReinUsesLisp
21aff36459
video_core: Address feedback
2019-01-15 17:54:52 -03:00
ReinUsesLisp
59b34b1d76
shader_ir: Fixup file inclusions and clang-format
2019-01-15 17:54:52 -03:00
Mat M
57a900cc45
shader_ir: Move comment node string
...
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-01-15 17:54:52 -03:00
ReinUsesLisp
d4fae3a699
shader_ir: Address feedback to avoid UB in bit casting
2019-01-15 17:54:52 -03:00
ReinUsesLisp
946c86f0bb
shader_decode: Fixup clang-format
2019-01-15 17:54:52 -03:00
ReinUsesLisp
c9cf899d18
shader_decode: Implement LEA
2019-01-15 17:54:52 -03:00
ReinUsesLisp
4fd06efeb9
shader_decode: Implement IADD3
2019-01-15 17:54:52 -03:00
ReinUsesLisp
a40fd07516
shader_decode: Implement LOP3
2019-01-15 17:54:52 -03:00
ReinUsesLisp
b184ca9089
shader_decode: Implement ST_L
2019-01-15 17:54:52 -03:00
ReinUsesLisp
8d42feb09b
shader_decode: Implement LD_L
2019-01-15 17:54:52 -03:00
ReinUsesLisp
21f9e9da09
shader_decode: Implement HSETP2
2019-01-15 17:54:52 -03:00
ReinUsesLisp
68c99d2597
shader_decode: Implement HADD2 and HMUL2
2019-01-15 17:54:52 -03:00
ReinUsesLisp
cf4a08d950
shader_decode: Implement HADD2_IMM and HMUL2_IMM
2019-01-15 17:54:52 -03:00
ReinUsesLisp
376a837511
shader_decode: Implement MOV_SYS
2019-01-15 17:54:52 -03:00
ReinUsesLisp
518a2bd206
shader_decode: Implement IMNMX
2019-01-15 17:54:52 -03:00
ReinUsesLisp
07944a2345
shader_decode: Implement F2F_C
2019-01-15 17:54:52 -03:00
ReinUsesLisp
e8235c0215
shader_decode: Implement I2I
2019-01-15 17:54:52 -03:00
ReinUsesLisp
6ca31f544a
shader_decode: Implement BRA internal flag
2019-01-15 17:54:52 -03:00
ReinUsesLisp
210620ff31
shader_decode: Implement ISCADD
2019-01-15 17:54:52 -03:00
ReinUsesLisp
b0e7920838
shader_decode: Implement XMAD
2019-01-15 17:54:51 -03:00
ReinUsesLisp
becfdb8638
shader_decode: Implement PBK and BRK
2019-01-15 17:54:51 -03:00
ReinUsesLisp
8f37531f8e
shader_decode: Implement LOP
2019-01-15 17:54:51 -03:00
ReinUsesLisp
8486e7f8c8
shader_decode: Implement SEL
2019-01-15 17:54:51 -03:00
ReinUsesLisp
ccb71bece9
shader_decode: Implement IADD
2019-01-15 17:54:51 -03:00
ReinUsesLisp
faadae5814
shader_decode: Implement ISETP
2019-01-15 17:54:51 -03:00
ReinUsesLisp
80183de884
shader_decode: Implement BFI
2019-01-15 17:54:51 -03:00
ReinUsesLisp
078ba28e13
shader_decode: Implement ISET
2019-01-15 17:54:51 -03:00
ReinUsesLisp
acdbbb8885
shader_decode: Implement LD_C
2019-01-15 17:54:51 -03:00
ReinUsesLisp
d79c462af0
shader_decode: Implement SHL
2019-01-15 17:54:51 -03:00
ReinUsesLisp
a2819c204f
shader_decode: Implement SHR
2019-01-15 17:54:51 -03:00
ReinUsesLisp
39f1c6246a
shader_decode: Implement LOP32I
2019-01-15 17:54:51 -03:00
ReinUsesLisp
501284a81a
shader_decode: Implement BFE
2019-01-15 17:54:51 -03:00
ReinUsesLisp
e444a6553f
shader_decode: Implement FSET
2019-01-15 17:54:51 -03:00
ReinUsesLisp
3052eae25e
shader_decode: Implement F2I
2019-01-15 17:54:51 -03:00
ReinUsesLisp
8abe5ba2c8
shader_decode: Implement I2F
2019-01-15 17:54:51 -03:00
ReinUsesLisp
c849b5b320
shader_decode: Implement F2F
2019-01-15 17:54:50 -03:00
ReinUsesLisp
9118deb990
shader_decode: Stub DEPBAR
2019-01-15 17:54:50 -03:00
ReinUsesLisp
97f33f00cf
shader_decode: Implement SSY and SYNC
2019-01-15 17:54:50 -03:00
ReinUsesLisp
abdbafbc20
shader_decode: Implement PSETP
2019-01-15 17:54:50 -03:00
ReinUsesLisp
802c23b8a8
shader_decode: Implement TMML
2019-01-15 17:54:50 -03:00
ReinUsesLisp
2b90637f4b
shader_decode: Implement TEX and TXQ
2019-01-15 17:54:50 -03:00
ReinUsesLisp
878672f371
shader_decode: Implement TEXS (F32)
2019-01-15 17:54:50 -03:00
ReinUsesLisp
c703f0aee4
shader_decode: Implement FSETP
2019-01-15 17:54:50 -03:00
ReinUsesLisp
8215ae942c
shader_decode: Partially implement BRA
2019-01-15 17:54:50 -03:00
ReinUsesLisp
4f95dc950e
shader_decode: Implement IPA
2019-01-15 17:54:50 -03:00
ReinUsesLisp
cacb934f21
shader_decode: Implement EXIT
2019-01-15 17:54:50 -03:00
ReinUsesLisp
0c049e0a21
shader_decode: Implement ST_A
2019-01-15 17:54:50 -03:00
ReinUsesLisp
e3f1233ce1
shader_decode: Implement LD_A
2019-01-15 17:54:50 -03:00
ReinUsesLisp
ea358bd4bf
shader_decode: Implement FADD32I
2019-01-15 17:54:50 -03:00
ReinUsesLisp
c9b2a1b051
shader_decode: Implement FMUL32_IMM
2019-01-15 17:54:50 -03:00
ReinUsesLisp
2edee801ce
shader_decode: Implement MOV32_IMM
2019-01-15 17:54:50 -03:00
ReinUsesLisp
06cb910c6d
shader_decode: Stub RRO_C, RRO_R and RRO_IMM
2019-01-15 17:54:50 -03:00
ReinUsesLisp
5e6a0a08c1
shader_decode: Implement FMNMX_C, FMNMX_R and FMNMX_IMM
2019-01-15 17:54:50 -03:00
ReinUsesLisp
964ddeeb90
shader_decode: Implement MUFU
2019-01-15 17:54:50 -03:00
ReinUsesLisp
4ccaa1402d
shader_decode: Implement FADD_C, FADD_R and FADD_IMM
2019-01-15 17:54:50 -03:00
ReinUsesLisp
7c192ec43f
shader_decode: Implement FMUL_C, FMUL_R and FMUL_IMM
2019-01-15 17:54:50 -03:00
ReinUsesLisp
4c70d5b8eb
shader_decode: Implement MOV_C and MOV_R
2019-01-15 17:54:50 -03:00
ReinUsesLisp
0c6fb456e0
glsl_decompiler: Implementation
2019-01-15 17:54:50 -03:00
ReinUsesLisp
fbc67a0563
shader_ir: Add condition code helper
2019-01-15 17:54:50 -03:00
ReinUsesLisp
a58abbcfc4
shader_ir: Add predicate combiner helper
2019-01-15 17:54:49 -03:00
ReinUsesLisp
bf07272695
shader_ir: Add comparison helpers
2019-01-15 17:54:49 -03:00
ReinUsesLisp
60f044df56
shader_ir: Add half float helpers
2019-01-15 17:54:49 -03:00
ReinUsesLisp
e3c55e31d7
shader_ir: Add integer helpers
2019-01-15 17:54:49 -03:00
ReinUsesLisp
833d0806f9
shader_ir: Add float helpers
2019-01-15 17:54:49 -03:00
ReinUsesLisp
6b9eea3fe5
shader_ir: Add setters
2019-01-15 17:54:49 -03:00
ReinUsesLisp
12a95ff453
shader_ir: Add local memory getters
2019-01-15 17:54:49 -03:00
ReinUsesLisp
2f87fd060d
shader_ir: Add internal flag getters
2019-01-15 17:54:49 -03:00
ReinUsesLisp
15f431f0cb
shader_ir: Add attribute getters
2019-01-15 17:54:49 -03:00
ReinUsesLisp
864e8f55cf
shader_ir: Add constant buffer getters
2019-01-15 17:54:49 -03:00
ReinUsesLisp
5e639bfcf6
shader_ir: Add register getter
2019-01-15 17:54:49 -03:00
ReinUsesLisp
4aaa2192b9
shader_ir: Add immediate node constructors
2019-01-15 17:54:49 -03:00
ReinUsesLisp
15a0e1481d
shader_ir: Initial implementation
2019-01-15 17:54:49 -03:00
James Rowe
1d28b2e142
Remove references to PICA and rasterizers in video_core
2018-01-12 19:11:03 -07:00
Huw Pascoe
a234e4c200
Improved performance of FromAttributeBuffer
...
Ternary operator is optimized by the compiler
whereas std::min() is meant to return a value.
I've noticed a 5%-10% emulation speed increase.
2017-09-17 15:56:36 +01:00
wwylele
8285ca4ad8
pica/shader/jit: implement SETEMIT and EMIT
2017-08-19 10:13:20 +03:00
wwylele
bb63ae3052
correct constness
2017-08-19 10:13:20 +03:00
wwylele
28128348f2
pica/shader/interpreter: implement SETEMIT and EMIT
2017-08-19 10:13:20 +03:00
wwylele
46c6973d2b
pica/shader: extend UnitState for GS
...
Among four shader units in pica, a special unit can be configured to run both VS and GS program. GSUnitState represents this unit, which extends UnitState (which represents the other three normal units) with extra state for primitive emitting. It uses lots of raw pointers to represent internal structure in order to keep it standard layout type for JIT to access.
This unit doesn't handle triangle winding (inverting) itself; instead, it calls a WindingSetter handler. This will be explained in the following commits
2017-08-19 10:13:20 +03:00
wwylele
c89f804a01
pica/shader_interpreter: fix off-by-one in LOOP
2017-07-27 13:48:27 +03:00
Yuri Kunde Schlesner
f6715f98f5
Stop using reserved operator names (and/or/xor) with Xbyak
...
Also has the Dynarmic upgrade with the same change
2017-06-17 12:20:22 -07:00
Jannik Vogel
925724c990
Pica: Set program code / swizzle data limit to 4096
...
One of the later commits will enable writing to GS regs.
It turns out that on startup, most games will write 4096 GS program words.
The current limit of 1024 would hence result in 3072 (4096 - 1024) error messages:
```
HW.GPU <Error> video_core/shader/shader.cpp:WriteProgramCode:229: Invalid GS program offset 1024
```
New constants have been introduced to represent these limits.
The swizzle data size has also been raised. This matches the given field sizes of [GPUREG_SH_OPDESCS_INDEX](https://3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_OPDESCS_INDEX ) and [GPUREG_SH_CODETRANSFER_INDEX](https://www.3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_CODETRANSFER_INDEX ) (12 bit = [0; 4095]).
2017-05-11 15:01:27 +02:00
Mat M
0cb52ee74a
Doxygen: Amend minor issues ( #2593 )
...
Corrects a few issues with regards to Doxygen documentation, for example:
- Incorrect parameter referencing.
- Missing @param tags.
- Typos in @param tags.
and a few minor other issues.
2017-02-26 17:58:51 -08:00
Yuri Kunde Schlesner
e10b11a5d0
video_core/shader: Document sanitized MUL operation
2017-02-12 13:29:14 -08:00
Yuri Kunde Schlesner
443bb3d522
Merge pull request #2550 from yuriks/pica-refactor2
...
Small VideoCore cleanups
2017-02-12 12:33:26 -08:00
Yuri Kunde Schlesner
e2fa1ca5e1
video_core: Fix benign out-of-bounds indexing of array ( #2553 )
...
The resulting pointer wasn't written to unless the index was verified as
valid, but that's still UB and triggered debug checks in MSVC.
Reported by garrettboast on IRC
2017-02-10 20:51:09 -08:00
Yuri Kunde Schlesner
60fc0b086f
VideoCore: Split regs.h inclusions
2017-02-09 00:04:24 -08:00
Yuri Kunde Schlesner
5759d94b5c
VideoCore: Move Regs to its own file
2017-02-04 13:59:12 -08:00
Yuri Kunde Schlesner
f7c7f422c6
VideoCore: Split shader regs from Regs struct
2017-02-04 13:59:11 -08:00
Yuri Kunde Schlesner
000e78144c
VideoCore: Split rasterizer regs from Regs struct
2017-02-04 13:08:47 -08:00
Yuri Kunde Schlesner
97e06b0a0d
Merge pull request #2476 from yuriks/shader-refactor3
...
Oh No! More shader changes!
2017-02-04 13:02:48 -08:00
wwylele
6dc1d6e568
ShaderJIT: add 16 dummy bytes at the bottom of the stack
2017-02-03 14:53:38 +02:00
Weiyi Wang
0b9c59ff22
Common/x64: remove legacy emitter and abi ( #2504 )
...
These are not used any more since we moved shader JIT to xbyak.
2017-01-31 01:06:42 -08:00
Merry
f7e96dc068
shader_jit_x64_compiler: esi and edi should be persistent ( #2500 )
2017-01-31 00:38:31 -08:00
Yuri Kunde Schlesner
dcdffabfe6
VideoCore: Extract swrast-specific data from OutputVertex
2017-01-29 21:31:38 -08:00
Yuri Kunde Schlesner
8ed9f9d49f
VideoCore/Shader: Clean up OutputVertex::FromAttributeBuffer
...
This also fixes a long-standing but neverthless harmless memory
corruption bug, whech the padding of the OutputVertex struct would get
corrupted by unused attributes.
2017-01-29 21:31:38 -08:00
Yuri Kunde Schlesner
92bf5c88e6
VideoCore: Split shader output writing from semantic loading
2017-01-29 21:31:37 -08:00
Yuri Kunde Schlesner
335df895b9
VideoCore: Consistently use shader configuration to load attributes
2017-01-29 21:31:37 -08:00
Yuri Kunde Schlesner
ab6954e942
VideoCore: Rename some types to more accurate names
2017-01-29 21:31:36 -08:00
Yuri Kunde Schlesner
0e9081b973
VideoCore/Shader: Move entry_point to SetupBatch
2017-01-25 18:53:25 -08:00
Yuri Kunde Schlesner
0f64274145
VideoCore/Shader: Move per-batch ShaderEngine state into ShaderSetup
2017-01-25 18:53:25 -08:00
Yuri Kunde Schlesner
6fa3687afc
Shader: Remove OutputRegisters struct
2017-01-25 18:53:25 -08:00
Yuri Kunde Schlesner
9ea5eacf91
Shader: Initialize conditional_code in interpreter
...
This doesn't belong in LoadInputVertex because it also happens for
non-VS invocations. Since it's not used by the JIT it seems adequate to
initialize it in the interpreter which is the only thing that cares
about them.
2017-01-25 18:53:24 -08:00
Yuri Kunde Schlesner
1a2acc3baa
Shader: Don't read ShaderSetup from global state
2017-01-25 18:53:24 -08:00
Yuri Kunde Schlesner
fa4ac279a7
shader_jit_x64: Don't read program from global state
2017-01-25 18:53:24 -08:00
Yuri Kunde Schlesner
ade7ed7c5f
VideoCore/Shader: Move ProduceDebugInfo to InterpreterEngine
2017-01-25 18:53:24 -08:00
Yuri Kunde Schlesner
114d6b2f97
VideoCore/Shader: Split interpreter and JIT into separate ShaderEngines
2017-01-25 18:53:24 -08:00
Yuri Kunde Schlesner
8eefc62833
VideoCore/Shader: Rename shader_jit_x64{ => _compiler}.{cpp,h}
2017-01-25 18:53:23 -08:00
Yuri Kunde Schlesner
dd4a1672a7
VideoCore/Shader: Split shader uniform state and shader engine
...
Currently there's only a single dummy implementation, which will be
split in a following commit.
2017-01-25 18:53:23 -08:00
Yuri Kunde Schlesner
bd82cffd0b
VideoCore/Shader: Add constness to methods
2017-01-25 18:53:23 -08:00
Yuri Kunde Schlesner
1e1f939817
VideoCore/Shader: Use only entry_point as ShaderSetup param
...
This removes all implicit dependency of ShaderState on global PICA
state.
2017-01-25 18:53:23 -08:00
Yuri Kunde Schlesner
e3caf669b0
VideoCore/Shader: Use self instead of g_state.vs in ShaderSetup
2017-01-25 18:53:23 -08:00
Yuri Kunde Schlesner
34d581f2dc
VideoCore/Shader: Extract input vertex loading code into function
2017-01-25 18:53:20 -08:00
Kloen
5cc94c17f6
video_core: fix shader.cpp signed / unsigned warning
2017-01-23 16:53:31 +01:00
Jonathan Hao
c18cb1b192
Fix some warnings ( #2399 )
2017-01-04 13:48:29 -03:00
Yuri Kunde Schlesner
c135317de1
VideoCore/Shader: Extract DebugData out from UnitState
2016-12-16 00:16:25 -08:00
Yuri Kunde Schlesner
6e7e767645
Remove unnecessary cast
2016-12-16 00:15:55 -08:00
Yuri Kunde Schlesner
b5e3599704
VideoCore/Shader: Extract evaluate_condition lambda to function scope
2016-12-16 00:15:51 -08:00
Yuri Kunde Schlesner
960578f4e1
VideoCore/Shader: Extract call lambda up a scope and remove unused param
2016-12-15 23:08:05 -08:00
Yuri Kunde Schlesner
e4e962bc7c
VideoCore/Shader: Remove dynamic control flow in (Get)UniformOffset
2016-12-15 23:08:05 -08:00
Yuri Kunde Schlesner
d27cb1dedc
VideoCore/Shader: Move DebugData to a separate file
2016-12-15 23:08:05 -08:00
Yuri Kunde Schlesner
fb9e856b91
shader_jit_x64: Use LOOPCOUNT_REG as a 64-bit reg when indexing
2016-12-15 10:02:42 -08:00
Yuri Kunde Schlesner
f00ada3363
VideoCore: Eliminate an unnecessary copy in the drawcall loop
2016-12-14 21:00:29 -08:00
Yuri Kunde Schlesner
5ff3206207
shader_jit_x64: Use Reg32 for LOOP* registers, eliminating casts
2016-12-14 20:06:09 -08:00
Yuri Kunde Schlesner
f4e98ecf3f
VideoCore: Convert x64 shader JIT to use Xbyak for assembly
2016-12-14 20:06:08 -08:00
Jannik Vogel
2d8097eecc
shader_jit: Fix non-SSE4.1 path where FLR would not truncate
2016-12-04 04:26:33 +01:00
Jannik Vogel
e2cb7d7833
shader_jit: Load LOOPCOUNT_REG and LOOPINC 4 bit left-shifted
2016-12-02 04:33:15 +01:00
Yuri Kunde Schlesner
d9a904f9cb
VideoCore: Shader interpreter cleanups
2016-09-29 21:15:49 -07:00
Yuri Kunde Schlesner
26b68313b9
VideoCore: Fix out-of-bounds read in ShaderSetup::ProduceDebugInfo
...
As far as I can tell, memset was replaced by a fill without correcting
the parameter type, causing an out-of-bounds array read in the Vec4
constructor.
2016-09-29 21:11:36 -07:00
Yuri Kunde Schlesner
f120e78b56
Remove special rules for Windows.h and library includes
2016-09-21 00:16:33 -07:00
Yuri Kunde Schlesner
84fbbe2629
Use negative priorities to avoid special-casing the self-include
2016-09-21 00:15:56 -07:00
Emmanuel Gil Peyrot
ebdae19fd2
Remove empty newlines in #include blocks.
...
This makes clang-format useful on those.
Also add a bunch of forgotten transitive includes, which otherwise
prevented compilation.
2016-09-21 11:15:47 +09:00
Yuri Kunde Schlesner
396a8d91a4
Manually tweak source formatting and then re-run clang-format
2016-09-18 21:14:25 -07:00
Emmanuel Gil Peyrot
dc8479928c
Sources: Run clang-format on everything.
2016-09-18 09:38:01 +09:00
Yuri Kunde Schlesner
a3afeb4687
VideoCore: Fix dangling lambda context in shader interpreter
...
The static meant that after the first execution, these lambda context
would be pointing to a random location on the stack. Fixes a random
crash when using the interpreter.
2016-09-15 22:15:11 -07:00
Jannik Vogel
ff0fa86b17
Retrieve shader result from new OutputRegisters-type
2016-05-16 18:55:51 +02:00
Jannik Vogel
1308afe2c2
Use new shader-jit signature for interpreter
2016-05-13 09:41:55 +02:00
Jannik Vogel
4e01e9ffc5
Refactor access to state in shader-jit
2016-05-13 09:20:14 +02:00
Jannik Vogel
7e756faaba
Move program_counter and call_stack from UnitState to interpreter
2016-05-12 19:05:42 +02:00
Jannik Vogel
6c6d99ca51
Move default_attributes into Pica state
2016-05-12 19:05:41 +02:00
bunnei
f6eb62d062
Merge pull request #1690 from JayFoxRox/tex-type-3
...
Pica: Implement texture type 3 (Projection2D)
2016-05-11 21:47:08 -04:00
Jannik Vogel
ae7a82fa1c
Turn ShaderSetup into struct
2016-05-11 23:48:24 +02:00
Jannik Vogel
2f8e8e1455
Pica: Add tc0.w to OutputVertex
2016-05-11 08:07:36 +02:00
Jannik Vogel
696cb197a5
Pica: Replace logic in shader.cpp with loop
2016-05-03 01:40:47 +02:00
Emmanuel Gil Peyrot
691a42fe98
VideoCore: Run include-what-you-use and fix most includes.
2016-04-30 17:02:41 +01:00
bunnei
90243c56fb
Merge pull request #1730 from hrydgard/vertex-loader
...
* Remove late accesses to attribute_config
* Refactor: Extract VertexLoader from command_processor.cpp.
Preparation for a similar concept to Dolphin or PPSSPP. These can be JIT-ed and cached.
* Move "&" to their proper place, add missing includes and make some properly relative.
* Don't keep base_address in the loader, it doesn't belong there (with it, the loader can't be cached).
* Optimize the vertex loader, nearly doubling its speed.
* Debugger fix
* Move and rename the MemoryAccesses class to MemoryAccessTracker.
2016-04-29 09:42:47 -04:00
Yuri Kunde Schlesner
e3a8292495
Common: Remove section measurement from profiler ( #1731 )
...
This has been entirely superseded by MicroProfile. The rest of the code
can go when a simpler frametime/FPS meter is added to the GUI.
2016-04-29 00:07:10 -07:00
Henrik Rydgard
47ff008817
Refactor: Extract VertexLoader from command_processor.cpp.
...
Preparation for a similar concept to Dolphin or PPSSPP. These can be JIT-ed and cached.
2016-04-28 19:05:55 +02:00
Sam Spilsbury
656a442433
shader: Shader size is long uint, not uint.
2016-04-25 00:40:03 +08:00
Sam Spilsbury
c6709d97bc
shader: Handle non-CALL opcodes with a break
2016-04-25 00:39:54 +08:00
Sam Spilsbury
bbffa6ad69
shader: Format string must be provided inline and not as a variable
2016-04-24 23:40:52 +08:00
bunnei
d7fe2784cc
shader_jit_x64: Rename RuntimeAssert to Compile_Assert.
2016-04-13 23:04:53 -04:00
bunnei
3f623b2561
shader_jit_x64.cpp: Rename JitCompiler to JitShader.
2016-04-13 23:04:53 -04:00
bunnei
847fb951e2
shader_jit_x64: Free memory that's no longer needed after compilation.
2016-04-13 23:04:52 -04:00
bunnei
60aa72e117
shader_jit_x64: Use a sorted vector instead of a set for keeping track of return addresses.
2016-04-13 23:04:52 -04:00
bunnei
60749f2cda
shader_jit_x64: Use CALL/RET instead of JMP for subroutines.
2016-04-13 23:04:52 -04:00
bunnei
1d45b57939
shader_jit_x64: Separate initialization and code generation for readability.
2016-04-13 23:04:50 -04:00
bunnei
6e0319eec9
shader_jit_x64: Get rid of unnecessary last_program_counter variable.
2016-04-13 23:04:49 -04:00
bunnei
f3afe24594
shader_jit_x64: Execute certain asserts at runtime.
...
- This is because we compile the full shader code space, and therefore its common to compile malformed instructions.
2016-04-13 23:04:49 -04:00
bunnei
ffcf7ecee9
shader: Remove unused 'state' argument from 'Setup' function.
2016-04-13 23:04:48 -04:00
bunnei
a5a74eb121
shader_jit_x64: Specify shader main offset at runtime.
2016-04-13 23:04:47 -04:00
bunnei
c9d10de644
shader_jit_x64: Allocate each program independently and persist for emu session.
2016-04-13 23:04:47 -04:00
bunnei
4632791a40
shader_jit_x64: Rewrite flow control to support arbitrary CALL and JMP instructions.
2016-04-13 23:04:44 -04:00
bunnei
135aec7bea
shader_jit_x64: Fix strict memory aliasing issues.
2016-04-13 23:04:43 -04:00
Mathew Maidment
aa6380e5bc
Merge pull request #1643 from MerryMage/make_unique
...
Common: Remove Common::make_unique, use std::make_unique
2016-04-05 20:10:11 -04:00
MerryMage
a06dcfeb61
Common: Remove Common::make_unique, use std::make_unique
2016-04-05 13:31:17 +01:00
bunnei
ebbba0d381
Merge pull request #1508 from JayFoxRox/vs-output-map
...
Respect vs output map
2016-03-22 11:59:12 -04:00
bunnei
784c5539ea
Merge pull request #1538 from lioncash/dot
...
shader_interpreter: use std::inner_product for the dot product
2016-03-20 00:35:06 -04:00
Lioncash
63e956cc7a
video_core: Don't cast away const
2016-03-17 02:01:38 -04:00
Lioncash
4d89df8df2
shader_interpreter: use std::inner_product for the dot product
...
Same thing, less code.
2016-03-17 01:00:30 -04:00
bunnei
96cafbe4cc
Merge pull request #1503 from bunnei/clear-jit-cache
...
Clear JIT cache
2016-03-16 13:18:51 -04:00
Jannik Vogel
9aad2f29bb
PICA: Fix MAD/MADI encoding
2016-03-15 20:01:25 +01:00
Jannik Vogel
f746a00964
Respect vs output map
2016-03-14 13:03:34 +01:00
bunnei
6efb710b28
shader_jit_x64: Clear cache after code space fills up.
2016-03-12 12:15:49 -05:00
bunnei
c103759cdc
shader_jit_x64: Make assert outputs more useful & cleanup formatting.
2016-03-12 12:06:28 -05:00
bunnei
46f78b7f19
shader: Update log message to use proper log class.
2016-03-12 12:03:32 -05:00
Lioncash
88d604383e
Common: Get rid of alignment macros
...
The gl rasterizer already uses alignas,
so we may as well move everything over.
2016-03-09 01:31:14 -05:00
Dwayne Slater
6b775034dd
Add immediate mode vertex submission
2016-03-02 22:16:38 -05:00
bunnei
b003075570
pica: Implement decoding of basic fragment lighting components.
...
- Diffuse
- Distance attenuation
- float16/float20 types
- Vertex Shader 'view' output
2016-02-05 17:17:28 -05:00
bunnei
a43f8d2fb7
Merge pull request #1367 from yuriks/jit-jmp
...
Shader JIT: Fix off-by-one error when compiling JMPs
2016-01-27 09:19:28 -05:00
Yuri Kunde Schlesner
083d2d89a5
Shader: Implement "invert condition" feature of IFU instruction
...
If the bit 0 of the JMPU instruction is set, then the jump condition
will be inverted. That is, a jump will happen when the boolean is false
instead of when it is true.
2016-01-24 20:29:06 -08:00
Yuri Kunde Schlesner
c1071c1ff7
Shader JIT: Fix off-by-one error when compiling JMPs
...
There was a mistake in the JMP code which meant that one instruction at
the destination would be skipped when the jump was taken. This commit
also changes the meaning of the culprit parameter to make it less
confusing and avoid similar mistakes in the future.
2016-01-24 02:15:56 -08:00
Lioncash
aec28ed91e
video_core: Reorganize headers
2015-09-11 07:31:15 -04:00
Lioncash
1fa772393b
video_core: Remove unnecessary includes from headers
2015-09-11 00:10:03 -04:00
Lioncash
526eb33d1e
video_core: Remove unused variables
2015-09-10 10:26:21 -04:00
aroulin
1484a23530
Shader JIT: Use SCALE constant from emitter
2015-09-07 16:50:28 +02:00
aroulin
87e3b9ffc0
Shader: Fix size_t to int casts of register offsets
2015-09-07 16:50:28 +02:00
bunnei
918ca40c68
Merge pull request #1088 from aroulin/x64-emitter-abi-call
...
x64: Proper stack alignment in shader JIT function calls
2015-09-02 08:46:58 -04:00
aroulin
ba998b85a1
video_core: Fix format specifiers warnings
2015-09-02 08:20:00 +02:00
aroulin
179ad35c2e
x64: Proper stack alignment in shader JIT function calls
...
Import Dolphin stack handling and register saving routines
Also removes the x86 parts from abi files
2015-09-01 23:39:52 +02:00
aroulin
84959be150
Shader JIT: Fix SGE/SGEI NaN behavior
...
SGE was incorrectly emulated w.r.t. NaN behavior as the CMPSS SSE
instruction was used with NLT
2015-08-31 08:16:15 +02:00
Yuri Kunde Schlesner
c5a4025b65
Merge pull request #1065 from yuriks/shader-fp
...
Shader FP compliance fixes
2015-08-27 16:34:13 -07:00
aroulin
f52d8c1a9b
Shader JIT: Fix float to integer rounding in MOVA
...
MOVA converts new address register values from floats to integers using truncation
2015-08-27 15:26:41 +02:00
archshift
dd0e1061ef
Shader JIT: ifdef out reference to ifdef'd out shader_map
...
shader_map was only defined on x86 architectures, but was cleared on shutdown
with no ifdef protection. Ifdef this out so non-x86 architectures can be built.
2015-08-26 22:28:19 +00:00
Yuri Kunde Schlesner
0fcabd2b11
Integrate the MicroProfile profiling library
...
This brings goodies such as a configurable user interface and
multi-threaded timeline view.
2015-08-24 22:16:28 -03:00
Yuri Kunde Schlesner
d8ef20c856
Shader JIT: Tiny micro-optimization in DPH
2015-08-24 01:48:37 -03:00
Yuri Kunde Schlesner
630a850d4d
Shaders: Fix multiplications between 0.0 and inf
...
The PICA200 semantics for multiplication are so that when multiplying
inf by exactly 0.0, the result is 0.0, instead of NaN, as defined by
IEEE. This is relied upon by games.
Fixes #1024 (missing OoT interface items)
2015-08-24 01:48:15 -03:00
Yuri Kunde Schlesner
082b74fa24
Shaders: Explicitly conform to PICA semantics in MAX/MIN
2015-08-24 01:46:58 -03:00
Yuri Kunde Schlesner
76247170df
Shader JIT: Add name to second scratch register (XMM4)
2015-08-24 01:46:10 -03:00