bunnei
cd0a7dfdbc
Merge pull request #3258 from FernandoS27/shader-amend
...
Shader_IR: add the ability to amend code in the shader ir.
2020-01-04 14:05:17 -05:00
Fernando Sahmkow
3dd6b55851
Shader_IR: Address Feedback
2020-01-04 14:40:57 -04:00
Fernando Sahmkow
a1667a7b46
Shader_IR: Implement TXD Array.
...
This commit extends the compilation of TXD to support array samplers on
TXD.
2020-01-04 13:28:02 -04:00
bunnei
028b2718ed
Merge pull request #3239 from ReinUsesLisp/p2r
...
shader/p2r: Implement P2R Pr
2019-12-31 20:37:16 -05:00
Fernando Sahmkow
b3371ed09e
Shader_IR: add the ability to amend code in the shader ir.
...
This commit introduces a mechanism by which shader IR code can be
amended and extended. This useful for track algorithms where certain
information can derived from before the track such as indexes to array
samplers.
2019-12-30 15:31:48 -04:00
bunnei
8a76f816a4
Merge pull request #3228 from ReinUsesLisp/ptp
...
shader/texture: Implement AOFFI and PTP for TLD4 and TLD4S
2019-12-26 21:43:44 -05:00
bunnei
16dcfacbfc
Merge pull request #3235 from ReinUsesLisp/ldg-u8
...
shader/memory: Implement LDG.U8 and unaligned U8 loads
2019-12-21 22:50:28 -05:00
ReinUsesLisp
38d3a48873
shader/p2r: Implement P2R Pr
...
P2R dumps predicate or condition codes state to a register. This is
useful for unit testing.
2019-12-20 18:02:41 -03:00
ReinUsesLisp
cf27b59493
shader/r2p: Refactor P2R to support P2R
2019-12-20 17:55:42 -03:00
bunnei
7be65c6a68
Merge pull request #3234 from ReinUsesLisp/i2f-u8-selector
...
shader/conversion: Implement byte selector in I2F
2019-12-19 22:36:26 -05:00
ReinUsesLisp
ae8d4b6c0c
shader/memory: Implement LDG.U8 and unaligned U8 loads
...
LDG can load single bytes instead of full integers or packs of integers.
These have the advantage of loading bytes that are not aligned to 4
bytes.
To emulate these this commit gets the byte being referenced (by doing
"address & 3" and then using that to extract the byte from the loaded
integer:
result = bitfieldExtract(loaded_integer, (address % 4) * 8, 8)
2019-12-18 01:21:46 -03:00
ReinUsesLisp
a7d6bd1ef1
shader/conversion: Implement byte selector in I2F
...
I2F's byte selector is used to choose what bytes to convert to float.
e.g. if the input is 0xaabbccdd and the selector is ".B3" it will
convert 0xaa. The default (when it's not shown in nvdisasm) is ".B0", in
that example the default would convert 0xdd to float.
2019-12-18 00:41:22 -03:00
ReinUsesLisp
15a753b9a5
shader/texture: Properly shrink unused entries in size mismatches
...
When a image format mismatches we were inserting zeroes to the texture
itself. This was not handling cases were the mismatch uses less
coordinates than the guest shader code. Address that by resizing the
vector.
2019-12-17 23:38:10 -03:00
ReinUsesLisp
e09c1fbc1f
shader/texture: Implement TLD4.PTP
2019-12-16 04:09:24 -03:00
ReinUsesLisp
844e4a297b
shader/texture: Enable arrayed TLD4
2019-12-16 02:37:21 -03:00
ReinUsesLisp
3d2c44848b
shader/texture: Implement AOFFI for TLD4S
2019-12-16 02:06:42 -03:00
ReinUsesLisp
3d9fff82c0
shader/texture: Remove unnecesary parenthesis
2019-12-16 01:52:33 -03:00
Fernando Sahmkow
c0ee0aa1a8
Shader_IR: Correct TLD4S Depth Compare.
2019-12-11 19:53:17 -04:00
Fernando Sahmkow
af89723fa3
Shader_Ir: Correct TLD4S encoding and implement f16 flag.
2019-12-11 19:53:17 -04:00
Fernando Sahmkow
271a3264f3
Shader_Ir: default failed tracks on bindless samplers to null values.
2019-12-11 19:53:16 -04:00
ReinUsesLisp
425a254fa2
shader: Implement MEMBAR.GL
...
Implement using memoryBarrier in GLSL and OpMemoryBarrier on SPIR-V.
2019-12-10 16:45:03 -03:00
ReinUsesLisp
0b5b93053d
shader_ir/other: Implement S2R InvocationId
2019-12-09 23:52:28 -03:00
ReinUsesLisp
9ad6327fbd
shader: Keep track of shaders using warp instructions
2019-12-09 23:40:41 -03:00
ReinUsesLisp
6233b1db08
shader_ir/memory: Implement patch stores
2019-12-09 23:25:21 -03:00
bunnei
e36814d6d5
Merge pull request #3109 from FernandoS27/new-instr
...
Implement FLO & TXD Instructions on GPU Shaders
2019-12-06 18:18:16 -05:00
Lioncash
9403979c22
video_core/const_buffer_locker: Make use of std::tie in HasEqualKeys()
...
Tidies it up a little bit visually.
2019-11-27 05:53:43 -05:00
Lioncash
930e311526
video_core/const_buffer_locker: Remove unused includes
2019-11-27 05:51:13 -05:00
Lioncash
9341ca7979
video_core/const_buffer_locker: Remove #pragma once from cpp file
...
Silences a compiler warning.
2019-11-27 05:50:51 -05:00
ReinUsesLisp
c8a48aacc0
video_core: Unify ProgramType and ShaderStage into ShaderType
2019-11-22 21:28:48 -03:00
ReinUsesLisp
dc9961f341
shader/texture: Handle TLDS texture type mismatches
...
Some games like "Fire Emblem: Three Houses" bind 2D textures to offsets
used by instructions of 1D textures. To handle the discrepancy this
commit uses the the texture type from the binding and modifies the
emitted code IR to build a valid backend expression.
E.g.: Bound texture is 2D and instruction is 1D, the emitted IR samples
a 2D texture in the coordinate ivec2(X, 0).
2019-11-22 21:28:47 -03:00
ReinUsesLisp
32c1bc6a67
shader/texture: Deduce texture buffers from locker
...
Instead of specializing shaders to separate texture buffers from 1D
textures, use the locker to deduce them while they are being decoded.
2019-11-22 21:28:47 -03:00
ReinUsesLisp
24f4198cee
shader/other: Reduce DEPBAR log severity
...
While DEPBAR is stubbed it doesn't change anything from our end. Shading
languages handle what this instruction does implicitly. We are not
getting anything out fo this log except noise.
2019-11-19 21:26:40 -03:00
Fernando Sahmkow
c8473f399e
Shader_IR: Address Feedback
2019-11-18 07:34:34 -04:00
Fernando Sahmkow
cd0f5dfc17
Shader_IR: Implement TXD instruction.
2019-11-14 11:15:27 -04:00
Fernando Sahmkow
f3d1b370aa
Shader_IR: Implement FLO instruction.
2019-11-14 11:15:27 -04:00
Fernando Sahmkow
b6f6733131
Merge pull request #3081 from ReinUsesLisp/fswzadd-shuffles
...
shader: Implement FSWZADD and reimplement SHFL
2019-11-14 10:27:27 -04:00
Rodrigo Locatti
cf770a68a5
Merge pull request #3084 from ReinUsesLisp/cast-warnings
...
video_core: Treat implicit conversions as errors
2019-11-13 02:16:22 -03:00
ReinUsesLisp
096f339a2a
video_core: Silence implicit conversion warnings
2019-11-08 22:48:50 +00:00
ReinUsesLisp
56e237d1f9
shader_ir/warp: Implement FSWZADD
2019-11-07 20:08:41 -03:00
ReinUsesLisp
08b2b1080a
gl_shader_decompiler: Reimplement shuffles with platform agnostic intrinsics
2019-11-07 20:08:41 -03:00
bunnei
b6ae48966d
Merge pull request #3032 from ReinUsesLisp/simplify-control-flow-brx
...
shader/control_flow: Abstract repeated code chunks in BRX tracking
2019-11-07 01:30:01 -05:00
Rodrigo Locatti
ff5a0f370c
shader/control_flow: Specify constness on caller lambdas
...
Update src/video_core/shader/control_flow.cpp
Co-Authored-By: Mat M. <mathew1800@gmail.com>
Update src/video_core/shader/control_flow.cpp
Co-Authored-By: Mat M. <mathew1800@gmail.com>
Update src/video_core/shader/control_flow.cpp
Co-Authored-By: Mat M. <mathew1800@gmail.com>
Update src/video_core/shader/control_flow.cpp
Co-Authored-By: Mat M. <mathew1800@gmail.com>
Update src/video_core/shader/control_flow.cpp
Co-Authored-By: Mat M. <mathew1800@gmail.com>
Update src/video_core/shader/control_flow.cpp
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2019-11-07 01:44:09 -03:00
ReinUsesLisp
7b069252f8
shader/control_flow: Use callable template instead of std::function
2019-11-07 01:44:08 -03:00
ReinUsesLisp
46c3047283
shader/control_flow: Abstract repeated code chunks in BRX tracking
...
Remove copied and pasted for cycles into a common templated function.
2019-11-07 01:44:08 -03:00
ReinUsesLisp
ae7dfa93be
shader/control_flow: Silence Intellisense cast warnings
2019-11-07 01:44:08 -03:00
ReinUsesLisp
deb1b54eed
shader/control_flow: Remove brace initializer in std containers
...
These containers have a default constructor.
2019-11-07 01:44:08 -03:00
ReinUsesLisp
39c66abd91
shader/decode: Reduce severity of arithmetic rounding warnings
2019-11-07 01:43:38 -03:00
ReinUsesLisp
c4374d0d41
shader/arithmetic: Reduce RRO stub severity
2019-11-07 01:43:38 -03:00
ReinUsesLisp
35d40b74b3
shader/texture: Remove NODEP warnings
...
These warnings don't offer meaningful information while decoding
shaders. Remove them.
2019-11-07 01:43:38 -03:00
Rodrigo Locatti
654b77d2ec
Merge pull request #3039 from ReinUsesLisp/cleanup-samplers
...
shader/node: Unpack bindless texture encoding
2019-11-06 04:54:11 +00:00
Fernando Sahmkow
23cabc98db
Shader_IR: Fix regression on TLD4
...
Originally on the last commit I thought TLD4 acted the same as TLD4S and
didn't have a mask. It actually does have a component mask. This commit
corrects that.
2019-10-30 21:14:57 -04:00
Fernando Sahmkow
9293c3a0f2
Shader_IR: Fix TLD4 and add Bindless Variant.
...
This commit fixes an issue where not all 4 results of tld4 were being
written, the color component was defaulted to red, among other things.
It also implements the bindless variant.
2019-10-30 12:02:03 -04:00
ReinUsesLisp
a993df1ee2
shader/node: Unpack bindless texture encoding
...
Bindless textures were using u64 to pack the buffer and offset from
where they come from. Drop this in favor of separated entries in the
struct.
Remove the usage of std::set in favor of std::list (it's not std::vector
to avoid reference invalidations) for samplers and images.
2019-10-29 20:53:48 -03:00
Rodrigo Locatti
26f3e18c5c
Merge pull request #2976 from FernandoS27/cache-fast-brx-rebased
...
Implement Fast BRX, fix TXQ and addapt the Shader Cache for it
2019-10-26 16:56:13 -03:00
Fernando Sahmkow
be856a38d6
Shader_IR: Address Feedback.
2019-10-26 15:38:30 -04:00
Rodrigo Locatti
a0d79085c4
Merge pull request #3027 from lioncash/lookup
...
shader_ir: Use std::array with std::pair instead of std::unordered_map
2019-10-26 05:49:15 -03:00
Rodrigo Locatti
d52598173d
Merge pull request #3013 from FernandoS27/tld4s-fix
...
Shader_Ir: Fix TLD4S from using a component mask.
2019-10-25 20:06:26 -03:00
ReinUsesLisp
78f3e8a757
gl_shader_cache: Implement locker variants invalidation
2019-10-25 09:01:32 -04:00
ReinUsesLisp
ec85648af3
gl_shader_disk_cache: Store and load fast BRX
2019-10-25 09:01:31 -04:00
ReinUsesLisp
fa2c297f3e
const_buffer_locker: Minor style changes
2019-10-25 09:01:31 -04:00
ReinUsesLisp
7b81ba4d8a
gl_shader_decompiler: Move entries to a separate function
2019-10-25 09:01:31 -04:00
Fernando Sahmkow
1244f2d368
Shader_IR: Implement Fast BRX and allow multi-branches in the CFG.
2019-10-25 09:01:31 -04:00
Fernando Sahmkow
a05120ec0b
Shader_IR: Correct typo in Consistent method.
2019-10-25 09:01:30 -04:00
Fernando Sahmkow
33fcec3502
Shader_IR: allow lookup of texture samplers within the shader_ir for instructions that don't provide it
2019-10-25 09:01:30 -04:00
Fernando Sahmkow
8909f52166
Shader_IR: Implement Fast BRX and allow multi-branches in the CFG.
2019-10-25 09:01:30 -04:00
Fernando Sahmkow
acd6441134
Shader_Cache: setup connection of ConstBufferLocker
2019-10-25 09:01:29 -04:00
Fernando Sahmkow
1a58f45d76
VideoCore: Unify const buffer accessing along engines and provide ConstBufferLocker class to shaders.
2019-10-25 09:01:29 -04:00
Fernando Sahmkow
2ef696c85a
Shader_IR: Implement BRX tracking.
2019-10-25 09:01:29 -04:00
Lioncash
382717172e
shader_ir: Use std::array with pair instead of unordered_map
...
Given the overall size of the maps are very small, we can use arrays of
pairs here instead of always heap allocating a new map every time the
functions are called. Given the small size of the maps, the difference
in container lookups are negligible, especially given the entries are
already sorted.
2019-10-24 00:25:38 -04:00
Lioncash
1f5401c89c
video_core/shader: Resolve instances of variable shadowing
...
Silences a few -Wshadow warnings.
2019-10-23 23:00:31 -04:00
Fernando Sahmkow
1509d2ffbd
Shader_Ir: Fix TLD4S from using a component mask.
...
TLD4S always outputs 4 values, the previous code checked a component
mask and omitted those values that weren't part of it. This commit
corrects that and makes sure all 4 values are set.
2019-10-22 10:59:07 -04:00
ReinUsesLisp
1ea07954fb
shader_ir/memory: Ignore global memory when tracking fails
...
Ignore global memory operations instead of invoking undefined behaviour
when constant buffer tracking fails and we are blasting through asserts,
ignore the operation.
In the case of LDG this means filling the destination registers with
zeroes; for STG this means ignore the instruction as a whole.
The default behaviour is still to abort execution on failure.
2019-10-22 02:49:17 -03:00
Lioncash
074b38b7a9
video_core/shader/ast: Make ShowCurrentState() and SanityCheck() const member functions
...
These can also trivially be made const member functions, with the
addition of a few consts.
2019-10-17 20:59:48 -04:00
Lioncash
222f4b45eb
video_core/shader/ast: Make ASTManager::Print a const member function
...
Given all visiting functions never modify the nodes, we can trivially
make this a const member function.
2019-10-17 20:56:39 -04:00
Lioncash
7831e86c34
video_core/shader/ast: Make ExprPrinter members private
...
This member already has an accessor, so there's no need for it to be
public.
2019-10-17 20:39:36 -04:00
Lioncash
a2eccbf075
video_core/shader/ast: Make Indent() return a string_view
...
The returned string is simply a substring of our constexpr tabs
string_view, so we can just use a string_view here as well, since the
original string_view is guaranteed to always exist.
Now the function is fully non-allocating.
2019-10-17 20:29:00 -04:00
Lioncash
15d177a6ac
video_core/shader/ast: Make Indent() private
...
It's never used outside of this class, so we can narrow its scope down.
2019-10-17 20:26:13 -04:00
Lioncash
7f6a8a33d4
video_core/shader/ast: Rename Ident() to Indent()
...
This can be confusing, given "ident" is generally used as a shorthand
for "identifier".
2019-10-17 20:26:13 -04:00
Lioncash
081530686c
video_core/shader/ast: Make use of fmt where applicable
...
Makes a few strings nicer to read and also eliminates a bit of string
churn with operator+.
2019-10-17 20:26:10 -04:00
bunnei
9fe8072c67
Merge pull request #2980 from lioncash/warn
...
maxwell_3d: Silence truncation warnings
2019-10-17 14:02:16 -04:00
Lioncash
77b4916b33
control_flow: Silence truncation warnings
...
This can be trivially fixed by making the input size a size_t.
CFGRebuildState's constructor parameter is already a std::size_t, so
this just makes the size type fully conform with it.
2019-10-15 19:10:28 -04:00
Lioncash
67658dd6e8
shader/node: std::move Meta instance within OperationNode constructor
...
Allows usages of the constructor to avoid an unnecessary copy.
2019-10-15 18:21:59 -04:00
ReinUsesLisp
3d0f357307
shader/half_set_predicate: Fix HSETP2 for constant buffers
...
HSETP2 when used with a constant buffer parses the second operand type
as F32. This is not configurable.
2019-10-07 14:49:47 -03:00
ReinUsesLisp
632c9e4ee3
shader/half_set_predicate: Reduce DEBUG_ASSERT to LOG_DEBUG
2019-10-07 14:48:58 -03:00
Lioncash
f883cd4f0e
video_core/control_flow: Eliminate variable shadowing warnings
2019-10-05 09:14:27 -04:00
Lioncash
25702b6256
video_core/control_flow: Eliminate pessimizing moves
...
These can inhibit the ability of a compiler to perform RVO.
2019-10-05 09:14:27 -04:00
Lioncash
d82b181d44
video_core/ast: Unindent most of IsFullyDecompiled() by one level
2019-10-05 09:14:27 -04:00
Lioncash
6c41d1cd7e
video_core/ast: Make ShowCurrentState() take a string_view instead of std::string
...
Allows the function to be non-allocating in terms of the output string.
2019-10-05 09:14:27 -04:00
Lioncash
3c54edae24
video_core/ast: Eliminate variable shadowing warnings
2019-10-05 09:14:26 -04:00
Lioncash
5a0a9c7449
video_core/ast: Replace std::string with a constexpr std::string_view
...
Same behavior, but without the need to heap allocate
2019-10-05 09:14:26 -04:00
Lioncash
3a20d9734f
video_core/ast: Default the move constructor and assignment operator
...
This is behaviorally equivalent and also fixes a bug where some members
weren't being moved over.
2019-10-05 09:14:26 -04:00
Lioncash
43503a69bf
video_core/{ast, expr}: Organize forward declaration
...
Keeps them alphabetically sorted for readability.
2019-10-05 09:14:26 -04:00
Lioncash
50ad745585
video_core/expr: Supply operator!= along with operator==
...
Provides logical symmetry to the interface.
2019-10-05 09:14:26 -04:00
Lioncash
8eb1398f8d
video_core/{ast, expr}: Use std::move where applicable
...
Avoids unnecessary atomic reference count increments and decrements.
2019-10-05 09:14:23 -04:00
Lioncash
8e0c80f269
video_core/ast: Supply const accessors for data where applicable
...
Provides const equivalents of data accessors for use within const
contexts.
2019-10-05 08:22:03 -04:00
Fernando Sahmkow
e6eae4b815
Shader_ir: Address feedback
2019-10-04 18:52:57 -04:00
Fernando Sahmkow
3c09d9abe6
Shader_Ir: Address Feedback and clang format.
2019-10-04 18:52:57 -04:00
Fernando Sahmkow
7c756baa77
Shader_IR: clean up AST handling and add documentation.
2019-10-04 18:52:55 -04:00
Fernando Sahmkow
5ea740beb5
Shader_IR: Correct OutwardMoves for Ifs
2019-10-04 18:52:54 -04:00
Fernando Sahmkow
b3c46d6948
Shader_IR: corrections and clang-format
2019-10-04 18:52:53 -04:00